BatchVocabListGenerator/instructions.md

## Core Rules & App Mechanics
When writing the Python logic that generates the JSON, follow these rules. They stem from how the app's Kotlin data models and Room database are structured:

1. **Export Structure:** The root object is a `CategoryExport`. It must contain exactly one category definition and an array of items belonging to it.
2. **Dummy IDs:** Do not attempt to guess or fetch the target database IDs. The app handles ID remapping (ConflictResolution) natively during import.
   * Always use `99999` for the category `id`.
   * Start item `id`s sequentially from `100000`.
3. **Hardcoded Dates:** The Kotlin parser uses `kotlinx.serialization` and expects an ISO-8601 timestamp in every date field; a missing or empty value crashes the app. The dates do not need to be accurate for new imports, so hardcode `"2024-01-01T00:00:00.000Z"` for both the `exportDate` and `createdAt` fields.
4. **Minimal Data:**
   * `formatVersion` must be `1`.
   * The `states` array must be completely empty: `[]` (since these are new words without learning history).
   * The `features` property on every item must be the string `"{}"`, that is, an empty JSON object serialized as a string, not a nested object.
   * Completely omit the `zipfFrequencyFirst` and `zipfFrequencySecond` fields.
5. **Stage Mappings:** Every generated item needs a corresponding entry in the `stageMappings` array, setting its learning stage to `"NEW"`.
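
The rules above can be sketched in Python roughly as follows. This is a minimal illustration, not a definitive implementation: the function name `build_category_export`, its parameters, and the sample words and language IDs are hypothetical, and the field names are taken from the target schema in this document. Setting `itemCount` to the number of generated items is an assumption; the rules do not state it explicitly.

```python
import json

# Values fixed by the rules above.
PLACEHOLDER_DATE = "2024-01-01T00:00:00.000Z"  # hardcoded ISO-8601 timestamp
CATEGORY_ID = 99999                            # dummy category id
FIRST_ITEM_ID = 100000                         # first dummy item id


def build_category_export(category_name, word_pairs, lang_first_id, lang_second_id):
    """Build a CategoryExport dict from (word_first, word_second) pairs.

    Hypothetical helper: the name and signature are illustrative only.
    """
    items = []
    stage_mappings = []
    for offset, (word_first, word_second) in enumerate(word_pairs):
        item_id = FIRST_ITEM_ID + offset  # sequential dummy ids
        items.append({
            "id": item_id,
            "languageFirstId": lang_first_id,
            "languageSecondId": lang_second_id,
            "wordFirst": word_first,
            "wordSecond": word_second,
            "createdAt": PLACEHOLDER_DATE,
            "features": "{}",  # a string holding "{}", not a nested object
            # zipfFrequencyFirst / zipfFrequencySecond are deliberately omitted
        })
        # Every item gets a matching stage mapping set to "NEW".
        stage_mappings.append({"vocabularyItemId": item_id, "stage": "NEW"})

    return {
        "type": "Category",
        "formatVersion": 1,
        "exportDate": PLACEHOLDER_DATE,
        "metadata": {
            "itemCount": len(items),  # assumption: mirrors the number of items
            "categoryCount": 1,
            "exportScope": f"Category: {category_name}",
        },
        "category": {
            "type": "TagCategory",
            "id": CATEGORY_ID,
            "name": category_name,
        },
        "items": items,
        "states": [],  # new words carry no learning history
        "stageMappings": stage_mappings,
    }


if __name__ == "__main__":
    # Sample input; the words and language ids are made up for illustration.
    export = build_category_export(
        "Animals", [("Hund", "dog"), ("Katze", "cat")], 1, 2
    )
    print(json.dumps(export, ensure_ascii=False, indent=2))
```

Passing `ensure_ascii=False` to `json.dumps` keeps non-ASCII vocabulary readable in the output file; when writing to disk, open the file with `encoding="utf-8"`.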

## Target JSON Schema
Your Python script must output a JSON file that matches this structure exactly. The `items` and `stageMappings` arrays grow with the input: each input word adds one entry to both.

```json
{
  "type": "Category",
  "formatVersion": 1,
  "exportDate": "2024-01-01T00:00:00.000Z",
  "metadata": {
    "itemCount": 1,
    "categoryCount": 1,
    "exportScope": "Category: <CATEGORY_NAME>"
  },
  "category": {
    "type": "TagCategory",
    "id": 99999,
    "name": "<CATEGORY_NAME>"
  },
  "items": [
    {
      "id": 100000,
      "languageFirstId": <LANG_FIRST_ID>,
      "languageSecondId": <LANG_SECOND_ID>,
      "wordFirst": "<WORD_1>",
      "wordSecond": "<WORD_2>",
      "createdAt": "2024-01-01T00:00:00.000Z",
      "features": "{}"
    }
  ],
  "states": [],
  "stageMappings": [
    {
      "vocabularyItemId": 100000,
      "stage": "NEW"
    }
  ]
}