BatchVocabListGenerator/instructions.md
jonasgaudian eabe2e2969 welcome gitea
2026-02-19 17:18:23 +01:00


## Core Rules & App Mechanics
When generating the Python logic for the JSON, you must adhere to these rules because of how the app's Kotlin data models and Room database are structured:
1. **Export Structure:** The root object is a `CategoryExport`. It must contain exactly one category definition and an array of items belonging to it.
2. **Dummy IDs:** Do not attempt to guess or fetch the target database IDs. The app handles ID remapping (ConflictResolution) natively during import.
   * Always use `99999` for the category `id`.
   * Start item `id`s sequentially from `100000`.
3. **Hardcoded Dates:** The Kotlin parser uses `kotlinx.serialization` and strictly expects ISO-8601 timestamps for date fields. If they are missing or empty, the app will crash. However, the dates do not need to be accurate for new imports. Hardcode `"2024-01-01T00:00:00.000Z"` for the `exportDate` and `createdAt` fields.
4. **Minimal Data:**
   * `formatVersion` must be `1`.
   * The `states` array must be empty: `[]` (these are new words without learning history).
   * The `features` property on every item must be a JSON string containing an empty object, `"{}"`, not a nested object.
   * Omit the `zipfFrequencyFirst` and `zipfFrequencySecond` fields entirely.
5. **Stage Mappings:** Every generated item needs a corresponding entry in the `stageMappings` array, setting its learning stage to `"NEW"`.
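
The rules above can be sketched in Python roughly as follows. This is a minimal illustration, not the definitive generator: the function name `build_category_export`, its parameters, and the sample language IDs `1` and `2` are hypothetical, and setting `itemCount` to the number of generated items is an assumption. The field names follow the target schema in this document.

```python
import json

# Fixed values taken from the rules above.
CATEGORY_ID = 99999
FIRST_ITEM_ID = 100000
PLACEHOLDER_DATE = "2024-01-01T00:00:00.000Z"


def build_category_export(category_name, word_pairs, lang_first_id, lang_second_id):
    """Build a CategoryExport dict from (word_first, word_second) pairs.

    Hypothetical helper: name and signature are illustrative only.
    """
    items = []
    stage_mappings = []
    for offset, (word_first, word_second) in enumerate(word_pairs):
        item_id = FIRST_ITEM_ID + offset  # sequential dummy IDs
        items.append({
            "id": item_id,
            "languageFirstId": lang_first_id,
            "languageSecondId": lang_second_id,
            "wordFirst": word_first,
            "wordSecond": word_second,
            "createdAt": PLACEHOLDER_DATE,
            "features": "{}",  # a JSON string, not a nested object
            # zipfFrequencyFirst / zipfFrequencySecond are omitted on purpose
        })
        # Every item gets a matching stage mapping set to "NEW".
        stage_mappings.append({"vocabularyItemId": item_id, "stage": "NEW"})

    return {
        "type": "Category",
        "formatVersion": 1,
        "exportDate": PLACEHOLDER_DATE,
        "metadata": {
            "itemCount": len(items),  # assumption: count of generated items
            "categoryCount": 1,
            "exportScope": f"Category: {category_name}",
        },
        "category": {"type": "TagCategory", "id": CATEGORY_ID, "name": category_name},
        "items": items,
        "states": [],  # new words carry no learning history
        "stageMappings": stage_mappings,
    }


if __name__ == "__main__":
    # Language IDs 1 and 2 are placeholders, not real database IDs.
    export = build_category_export("Animals", [("Hund", "dog"), ("Katze", "cat")], 1, 2)
    print(json.dumps(export, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` keeps non-ASCII vocabulary readable in the output file; the result is valid JSON either way.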
## Target JSON Schema
Your Python script must output a JSON file that matches this structure exactly. The `items` and `stageMappings` arrays grow with the input: one entry in each per input word pair.
```json
{
  "type": "Category",
  "formatVersion": 1,
  "exportDate": "2024-01-01T00:00:00.000Z",
  "metadata": {
    "itemCount": 1,
    "categoryCount": 1,
    "exportScope": "Category: <CATEGORY_NAME>"
  },
  "category": {
    "type": "TagCategory",
    "id": 99999,
    "name": "<CATEGORY_NAME>"
  },
  "items": [
    {
      "id": 100000,
      "languageFirstId": <LANG_FIRST_ID>,
      "languageSecondId": <LANG_SECOND_ID>,
      "wordFirst": "<WORD_1>",
      "wordSecond": "<WORD_2>",
      "createdAt": "2024-01-01T00:00:00.000Z",
      "features": "{}"
    }
  ],
  "states": [],
  "stageMappings": [
    {
      "vocabularyItemId": 100000,
      "stage": "NEW"
    }
  ]
}