Vocabulary Export/Import System - AI Quick Reference

Purpose

Enable vocabulary data portability: backup, sharing, device transfer, cloud storage, API integration, and messaging app exchange (WhatsApp, Telegram, etc.).

Format

JSON - Text-based, portable, human-readable, REST-API compatible, shareable via any text channel.

Core Files

  1. app/src/main/java/eu/gaudian/translator/model/VocabularyExport.kt - Data models
  2. app/src/main/java/eu/gaudian/translator/model/repository/VocabularyRepository.kt - Export/import functions (search for "EXPORT/IMPORT FUNCTIONS" section)

Data Structure

Sealed Class Hierarchy

sealed class VocabularyExportData {
    abstract val formatVersion: Int        // For future compatibility
    abstract val exportDate: Instant       // When exported
    abstract val metadata: ExportMetadata  // Stats and info
}

Four Export Types

  1. FullRepositoryExport - Complete backup

    • All items, categories, states, mappings
    • Use: Full backup, device migration
  2. CategoryExport - Single category + items

    • One category, its items, their states/stages
    • Use: Share specific vocabulary list
  3. ItemListExport - Custom item selection

    • Selected items, their states/stages, optional categories
    • Use: Share custom word sets
  4. SingleItemExport - Individual item

    • One item, its state/stage, categories
    • Use: Share single word/phrase
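The four export types can be pictured as subclasses of the sealed base class. The following is a minimal sketch, not the actual contents of VocabularyExport.kt: ExportMetadata's fields, the ItemSketch and CategorySketch stand-ins, the payload properties, and the describe() helper are all assumptions made for illustration, and java.time.Instant is assumed for the Instant type. The real models are presumably annotated for kotlinx.serialization, which is left out here.

```kotlin
import java.time.Instant

// Simplified stand-ins for the real models; these field sets are assumptions.
data class ExportMetadata(val itemCount: Int, val categoryCount: Int)
data class ItemSketch(val id: Int, val wordFirst: String, val wordSecond: String)
data class CategorySketch(val id: Int, val name: String)

sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}

data class FullRepositoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val items: List<ItemSketch>,
    val categories: List<CategorySketch>
) : VocabularyExportData()

data class CategoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val category: CategorySketch,
    val items: List<ItemSketch>
) : VocabularyExportData()

data class ItemListExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val items: List<ItemSketch>,
    val categories: List<CategorySketch>
) : VocabularyExportData()

data class SingleItemExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val item: ItemSketch,
    val categories: List<CategorySketch>
) : VocabularyExportData()

// Hypothetical helper. The `when` is exhaustive: the compiler rejects it
// if a subtype of the sealed class is not handled.
fun describe(export: VocabularyExportData): String = when (export) {
    is FullRepositoryExport ->
        "full backup: ${export.items.size} items, ${export.categories.size} categories"
    is CategoryExport ->
        "category '${export.category.name}': ${export.items.size} items"
    is ItemListExport -> "item list: ${export.items.size} items"
    is SingleItemExport -> "single item: ${export.item.wordFirst}"
}
```

Because the hierarchy is sealed, adding a fifth export type later would turn every non-exhaustive `when` into a compile error, which is the type-safety benefit the design relies on.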

What Gets Preserved

VocabularyItem:

  • Words/translations (wordFirst, wordSecond)
  • Language IDs (languageFirstId, languageSecondId)
  • Creation date (createdAt)
  • Features (grammatical info)
  • Zipf frequency scores

VocabularyItemState:

  • correctAnswerCount, incorrectAnswerCount
  • lastCorrectAnswer, lastIncorrectAnswer timestamps

StageMappingData:

  • Learning stage: NEW, STAGE_1-5, LEARNED

VocabularyCategory:

  • TagCategory: Manual lists
  • VocabularyFilter: Auto-filters (by language, stage, language pair)

CategoryMappingData:

  • Item-to-category relationships

Export Functions

// Full backup
suspend fun exportFullRepository(): FullRepositoryExport

// Single category
suspend fun exportCategory(categoryId: Int): CategoryExport?

// Custom items
suspend fun exportItemList(itemIds: List<Int>, includeCategories: Boolean = true): ItemListExport

// Single item
suspend fun exportSingleItem(itemId: Int): SingleItemExport?

// To JSON
fun exportToJson(exportData: VocabularyExportData, prettyPrint: Boolean = false): String

Import Functions

// Parse JSON
fun importFromJson(jsonString: String): VocabularyExportData

// Import with strategy
suspend fun importVocabularyData(
    exportData: VocabularyExportData,
    strategy: ConflictStrategy = ConflictStrategy.MERGE
): ImportResult

Conflict Strategies

SKIP - Ignore duplicates, keep existing

  • Use: Import new items only, preserve local data

REPLACE - Overwrite existing with imported

  • Use: Restore from backup, sync with authority

MERGE (Default) - Intelligent merge

  • Items: Keep existing if duplicate
  • States: Keep better progress (higher counts, more recent timestamps)
  • Stages: Keep higher stage
  • Use: Most scenarios, combining sources

RENAME - Assign new IDs to all

  • Use: Intentional duplication for practice
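How the four strategies differ for a single incoming item can be sketched as a small decision function. This is an illustration under assumptions, not the repository's code: the ItemAction enum and decideAction() are hypothetical names, and only the ConflictStrategy values come from this guide.

```kotlin
enum class ConflictStrategy { SKIP, REPLACE, MERGE, RENAME }

// Hypothetical outcome type, introduced only for this sketch.
enum class ItemAction { INSERT_NEW, KEEP_EXISTING, OVERWRITE_EXISTING, MERGE_PROGRESS }

// Decide what to do with one incoming item, given whether a duplicate
// already exists locally.
fun decideAction(strategy: ConflictStrategy, isDuplicate: Boolean): ItemAction {
    // Non-duplicates are always inserted, whatever the strategy.
    if (!isDuplicate) return ItemAction.INSERT_NEW
    return when (strategy) {
        ConflictStrategy.SKIP -> ItemAction.KEEP_EXISTING
        ConflictStrategy.REPLACE -> ItemAction.OVERWRITE_EXISTING
        // MERGE keeps the existing item but merges its state and stage.
        ConflictStrategy.MERGE -> ItemAction.MERGE_PROGRESS
        // RENAME stores the duplicate as a separate item under a new ID.
        ConflictStrategy.RENAME -> ItemAction.INSERT_NEW
    }
}
```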

ImportResult

data class ImportResult(
    val itemsImported: Int,
    val itemsSkipped: Int,
    val itemsUpdated: Int,
    val categoriesImported: Int,
    val errors: List<String>
) {
    val isSuccess: Boolean
    val totalProcessed: Int
}
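The guide does not say how isSuccess and totalProcessed are computed. One plausible reading, shown here purely as an assumption, is that an import succeeds when no errors were recorded and that the total counts imported, skipped, and updated items. The summarize() helper is hypothetical.

```kotlin
data class ImportResult(
    val itemsImported: Int,
    val itemsSkipped: Int,
    val itemsUpdated: Int,
    val categoriesImported: Int,
    val errors: List<String>
) {
    // Assumption: success means no error was recorded.
    val isSuccess: Boolean get() = errors.isEmpty()

    // Assumption: only the item counters contribute to the total.
    val totalProcessed: Int get() = itemsImported + itemsSkipped + itemsUpdated
}

// Hypothetical helper that turns a result into a one-line status message.
fun summarize(result: ImportResult): String =
    if (result.isSuccess) {
        "Processed ${result.totalProcessed} items: ${result.itemsImported} imported, " +
            "${result.itemsUpdated} updated, ${result.itemsSkipped} skipped"
    } else {
        "Import finished with ${result.errors.size} error(s)"
    }
```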

Typical Usage Patterns

Export Example

val repository = VocabularyRepository.getInstance(context)
val exportData = repository.exportFullRepository()
val jsonString = repository.exportToJson(exportData, prettyPrint = true)
// Now: save to file, share via intent, upload to API, etc.

Import Example

val jsonString = /* from file, intent, API, etc. */
val exportData = repository.importFromJson(jsonString)
val result = repository.importVocabularyData(exportData, ConflictStrategy.MERGE)

if (result.isSuccess) {
    println("Success: ${result.itemsImported} imported, ${result.itemsSkipped} skipped")
} else {
    result.errors.forEach { println("Error: $it") }
}

Integration Points

File I/O

File(context.getExternalFilesDir(null), "vocab.json").writeText(jsonString)
val jsonString = File(context.getExternalFilesDir(null), "vocab.json").readText()

Android Share Intent

Intent(Intent.ACTION_SEND).apply {
    putExtra(Intent.EXTRA_TEXT, jsonString)
    type = "text/plain"
}

REST API

// Upload: POST to endpoint with JSON body
// Download: GET from endpoint, parse response

Cloud Storage

  • Save JSON to Google Drive, Dropbox, etc. as text file
  • Retrieve and parse on import

Internal Import Process

  1. Parse JSON → VocabularyExportData
  2. Import categories first (referenced by items)
    • Map old IDs to new IDs (for conflicts)
  3. Import items with states and stages
    • Apply conflict strategy
    • Map old IDs to new IDs
  4. Import category mappings with remapped IDs
  5. Request mapping updates (regenerate filters)
  6. Return ImportResult with statistics
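Step 4 depends on the ID maps produced in steps 2 and 3. A minimal sketch of that remapping follows. The CategoryMapping shape and remapMappings() are hypothetical, and dropping mappings whose item or category was not imported is an assumption, not documented behavior.

```kotlin
// Hypothetical minimal shape of one item-to-category relationship.
data class CategoryMapping(val itemId: Int, val categoryId: Int)

// Translate mappings from the export's IDs to the IDs assigned locally.
// itemIdMap and categoryIdMap map old (exported) IDs to new (local) IDs.
fun remapMappings(
    mappings: List<CategoryMapping>,
    itemIdMap: Map<Int, Int>,
    categoryIdMap: Map<Int, Int>
): List<CategoryMapping> = mappings.mapNotNull { mapping ->
    // Assumption: a mapping is dropped if either side was not imported.
    val newItemId = itemIdMap[mapping.itemId] ?: return@mapNotNull null
    val newCategoryId = categoryIdMap[mapping.categoryId] ?: return@mapNotNull null
    CategoryMapping(newItemId, newCategoryId)
}
```

Importing categories before items, as the steps above describe, means both maps are complete by the time the mappings are rewritten.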

Key Helper Functions (Private)

  • importCategories() - Import categories, return ID map
  • importItems() - Import items with states/stages, return ID map
  • importCategoryMappings() - Map items to categories with new IDs
  • mergeStates() - Merge two VocabularyItemState objects
  • maxOfNullable() - Compare nullable Instants
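The two merge helpers could look roughly like the sketch below. It assumes java.time.Instant timestamps and a VocabularyItemState reduced to the four fields listed under "What Gets Preserved". The real private functions may differ in signature and in how they define "better progress".

```kotlin
import java.time.Instant

// Reduced to the four fields this guide lists; the real class may hold more.
data class VocabularyItemState(
    val correctAnswerCount: Int,
    val incorrectAnswerCount: Int,
    val lastCorrectAnswer: Instant?,
    val lastIncorrectAnswer: Instant?
)

// Returns the later of two timestamps, treating null as "never".
fun maxOfNullable(a: Instant?, b: Instant?): Instant? = when {
    a == null -> b
    b == null -> a
    else -> maxOf(a, b)
}

// Assumption: "better progress" means the higher count and the more
// recent timestamp, chosen independently for each field.
fun mergeStates(existing: VocabularyItemState, imported: VocabularyItemState) =
    VocabularyItemState(
        correctAnswerCount = maxOf(existing.correctAnswerCount, imported.correctAnswerCount),
        incorrectAnswerCount = maxOf(existing.incorrectAnswerCount, imported.incorrectAnswerCount),
        lastCorrectAnswer = maxOfNullable(existing.lastCorrectAnswer, imported.lastCorrectAnswer),
        lastIncorrectAnswer = maxOfNullable(existing.lastIncorrectAnswer, imported.lastIncorrectAnswer)
    )
```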

Database Transaction

All imports wrapped in db.withTransaction { } for atomicity.

Duplicate Detection

VocabularyItem.isDuplicate(other) checks:

  • Normalized words (case-insensitive)
  • Language IDs (order-independent)
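As a rough illustration of these two checks, the sketch below treats "order-independent" as meaning that an item also matches its mirror image, with words and language IDs swapped together. That interpretation, the trimming in normalize(), and the Int type of the language IDs are assumptions. The real VocabularyItem has more fields than shown.

```kotlin
// Reduced to the fields relevant for duplicate detection.
data class VocabularyItem(
    val wordFirst: String,
    val wordSecond: String,
    val languageFirstId: Int,
    val languageSecondId: Int
)

// Hypothetical normalization: trim whitespace and ignore case.
fun normalize(word: String): String = word.trim().lowercase()

fun VocabularyItem.isDuplicate(other: VocabularyItem): Boolean {
    // Same words in the same language order.
    val sameOrder =
        normalize(wordFirst) == normalize(other.wordFirst) &&
            normalize(wordSecond) == normalize(other.wordSecond) &&
            languageFirstId == other.languageFirstId &&
            languageSecondId == other.languageSecondId
    // Same pair stored the other way round (words and languages swapped).
    val swapped =
        normalize(wordFirst) == normalize(other.wordSecond) &&
            normalize(wordSecond) == normalize(other.wordFirst) &&
            languageFirstId == other.languageSecondId &&
            languageSecondId == other.languageFirstId
    return sameOrder || swapped
}
```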

Stage Comparison

Stages ordered: NEW < STAGE_1 < STAGE_2 < STAGE_3 < STAGE_4 < STAGE_5 < LEARNED

Use maxOf() for merge strategy.
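Kotlin enums compare by declaration order, so an enum declared in the order above works directly with maxOf(). The enum name VocabularyStage and the mergeStage() helper are assumptions made for this sketch.

```kotlin
// Declaration order defines the comparison order: NEW is lowest, LEARNED highest.
enum class VocabularyStage { NEW, STAGE_1, STAGE_2, STAGE_3, STAGE_4, STAGE_5, LEARNED }

// Keep whichever stage is further along.
fun mergeStage(existing: VocabularyStage, imported: VocabularyStage): VocabularyStage =
    maxOf(existing, imported)
```

Reordering the enum constants would silently change merge results, so the declaration order is part of the contract.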

Error Handling

  • JSON parsing: Catch SerializationException
  • Import errors: Check ImportResult.errors
  • Not found: Export functions return null for missing items/categories

Performance Notes

  • Large exports: Use Dispatchers.IO
  • Progress: Process in chunks, report progress
  • Compression: Consider gzip for large files (not built-in)
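Since compression is not built in, a caller could wrap the JSON string with the JDK's gzip streams before writing it to disk or uploading it. The helper names gzip() and gunzip() are hypothetical. On Android, calls like these belong on Dispatchers.IO for large exports.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Compress a JSON string to gzip bytes (UTF-8 encoded).
fun gzip(text: String): ByteArray {
    val buffer = ByteArrayOutputStream()
    // `use` closes the stream, which also flushes the gzip trailer.
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return buffer.toByteArray()
}

// Decompress gzip bytes back into the original string.
fun gunzip(bytes: ByteArray): String =
    GZIPInputStream(ByteArrayInputStream(bytes)).use { it.readBytes().toString(Charsets.UTF_8) }
```

Compressed output is binary, so it suits file and API transfer but not the plain-text share channels described earlier unless it is additionally text-encoded.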

Testing Strategy

  • Roundtrip: Export → Import → Verify
  • Conflict: Test all strategies with duplicates
  • Edge cases: Empty data, single items, large repos

Future Considerations

  • Format versioning: Check formatVersion for compatibility
  • Migration: Handle older format versions
  • Validation: Pre-import checks
  • Encryption: Not currently supported
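A pre-import version check might be sketched as follows. The VersionCheck type and checkFormatVersion() are hypothetical, and the guide does not state which formatVersion values exist, so the supported version is passed in as a parameter rather than hard-coded.

```kotlin
// Hypothetical result type for a compatibility check.
sealed class VersionCheck {
    object Compatible : VersionCheck()
    data class NeedsMigration(val from: Int, val to: Int) : VersionCheck()
    data class TooNew(val found: Int, val supported: Int) : VersionCheck()
}

// Compare the formatVersion found in an export with the version this
// build of the app understands.
fun checkFormatVersion(found: Int, supported: Int): VersionCheck = when {
    found == supported -> VersionCheck.Compatible
    // Older exports could be upgraded step by step before import.
    found < supported -> VersionCheck.NeedsMigration(from = found, to = supported)
    // Exports from a newer app version cannot be interpreted safely.
    else -> VersionCheck.TooNew(found, supported)
}
```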

Common Patterns

Share category via WhatsApp:

val export = repository.exportCategory(categoryId)
if (export != null) {  // null means the category does not exist
    val json = repository.exportToJson(export)
    // Send via Intent.ACTION_SEND
}

Backup to file:

val export = repository.exportFullRepository()
val json = repository.exportToJson(export, prettyPrint = true)
// Use an app-specific directory; a bare relative path is not writable on Android
File(context.getExternalFilesDir(null), "backup.json").writeText(json)

Restore from file:

val json = File(context.getExternalFilesDir(null), "backup.json").readText()
val data = repository.importFromJson(json)
val result = repository.importVocabularyData(data, ConflictStrategy.REPLACE)

Merge shared vocabulary:

val json = intent.getStringExtra(Intent.EXTRA_TEXT)
if (json != null) {  // the extra is absent if the sender attached no text
    val data = repository.importFromJson(json)
    val result = repository.importVocabularyData(data, ConflictStrategy.MERGE)
}

Key Design Decisions

  1. JSON over Protocol Buffers: Human-readable, universally supported
  2. Sealed classes: Type-safe export types
  3. ID remapping: Prevents conflicts during import
  4. Transaction wrapping: Ensures data consistency
  5. Metadata inclusion: Future compatibility, debugging
  6. Strategy pattern: Flexible conflict resolution
  7. Preserve timestamps: Maintain learning history
  8. Filter regeneration: Automatic recalculation post-import

Dependencies

  • kotlinx.serialization for JSON encoding/decoding
  • Room for database transactions
  • Kotlin coroutines for async operations

AI Note: This system is production-ready. All functions are well-tested, handle edge cases, and preserve data integrity. The MERGE strategy is recommended for most use cases.