Vocabulary Export/Import System

Overview

The Polly app includes a comprehensive vocabulary export/import system that allows users to:

  • Backup their complete vocabulary repository
  • Share vocabulary lists with friends, teachers, or students
  • Transfer data between devices
  • Exchange vocabulary via messaging apps (WhatsApp, Telegram, etc.)
  • Store vocabulary in cloud services (Google Drive, Dropbox, etc.)
  • Integrate with external systems via REST APIs

Data Format

The export/import system uses JSON as the primary data format. JSON was chosen because it is:

  • Text-based: Can be shared via any text-based communication channel
  • Portable: Works across all platforms and devices
  • Human-readable: Can be inspected and edited manually if needed
  • Standard: Supported by virtually every programming language and API toolchain
  • Compact: Small enough for typical vocabularies, and compresses well with gzip when it is not

Architecture

Core Components

  1. VocabularyExport.kt: Defines data models for export/import
  2. VocabularyRepository.kt: Implements export/import functions
  3. ConflictStrategy: Defines how to handle data conflicts during import

Data Models

The system uses a sealed class hierarchy for different export scopes:

sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}

Export Types

  1. FullRepositoryExport: Complete backup of everything

    • All vocabulary items
    • All categories (tags and filters)
    • All learning states
    • All category mappings
    • All stage mappings
  2. CategoryExport: Single category with its items

    • One category definition
    • All items in that category
    • Learning states for those items
    • Stage mappings for those items
  3. ItemListExport: Custom selection of items

    • Selected vocabulary items
    • Learning states for those items
    • Stage mappings for those items
    • Optionally: associated categories
  4. SingleItemExport: Individual vocabulary item

    • One vocabulary item
    • Its learning state
    • Its current stage
    • Categories it belongs to
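
The four scopes above map naturally onto subclasses of VocabularyExportData. The following is a minimal sketch, not the contents of VocabularyExport.kt: payloads are reduced to lists of IDs, `java.time.Instant` stands in for whatever `Instant` type the project uses, and every field other than formatVersion, exportDate, and metadata is an assumption made for illustration.

```kotlin
import java.time.Instant

// Hypothetical, simplified stand-ins for the real models in VocabularyExport.kt.
data class ExportMetadata(val itemCount: Int, val categoryCount: Int, val exportScope: String)

sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}

data class FullRepositoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val itemIds: List<Int>,      // assumption: real exports carry full item objects
    val categoryIds: List<Int>
) : VocabularyExportData()

data class CategoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val categoryId: Int,
    val itemIds: List<Int>
) : VocabularyExportData()

data class ItemListExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val itemIds: List<Int>,
    val categoryIds: List<Int> = emptyList()  // filled only when categories are included
) : VocabularyExportData()

data class SingleItemExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val itemId: Int
) : VocabularyExportData()

// Because the hierarchy is sealed, this `when` must cover every scope; adding a
// fifth export type later becomes a compile error here instead of a silent gap.
fun describe(export: VocabularyExportData): String = when (export) {
    is FullRepositoryExport ->
        "full repository (${export.itemIds.size} items, ${export.categoryIds.size} categories)"
    is CategoryExport -> "category ${export.categoryId} (${export.itemIds.size} items)"
    is ItemListExport -> "item list (${export.itemIds.size} items)"
    is SingleItemExport -> "single item ${export.itemId}"
}
```

Import code can branch on the concrete type in the same way to decide which tables to touch.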

Usage Guide

Exporting Data

1. Export Full Repository

// In a coroutine scope
val repository = VocabularyRepository.getInstance(context)

// Create export data
val exportData = repository.exportFullRepository()

// Convert to JSON string
val jsonString = repository.exportToJson(exportData, prettyPrint = true)

// Save to file, share, or upload
saveToFile(jsonString, "vocabulary_backup.json")

2. Export Single Category

val categoryId = 123
val exportData = repository.exportCategory(categoryId)

if (exportData != null) {
    val jsonString = repository.exportToJson(exportData)
    shareViaIntent(jsonString)
} else {
    // Category not found
}

3. Export Custom Item List

val itemIds = listOf(1, 5, 10, 15, 20)
val exportData = repository.exportItemList(itemIds, includeCategories = true)
val jsonString = repository.exportToJson(exportData)

4. Export Single Item

val itemId = 42
val exportData = repository.exportSingleItem(itemId)

if (exportData != null) {
    val jsonString = repository.exportToJson(exportData)
    // Share via WhatsApp, email, etc.
}

Importing Data

1. Import from JSON String

// Receive JSON string (from file, intent, API, etc.)
val jsonString = readFromFile("vocabulary_backup.json")

// Parse JSON
val exportData = repository.importFromJson(jsonString)

// Import with conflict strategy
val result = repository.importVocabularyData(
    exportData = exportData,
    strategy = ConflictStrategy.MERGE
)

// Check result
if (result.isSuccess) {
    println("Imported: ${result.itemsImported} items")
    println("Skipped: ${result.itemsSkipped} items")
    println("Categories: ${result.categoriesImported}")
} else {
    println("Errors: ${result.errors}")
}
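
The result check above relies on a handful of fields. As a rough sketch of what such a result type could look like: the field names are taken from the usage above, but deriving `isSuccess` from an empty error list and the `summarize` helper are assumptions, not the actual definition in the repository.

```kotlin
// Hypothetical shape of ImportResult; the real class may differ.
data class ImportResult(
    val itemsImported: Int = 0,
    val itemsSkipped: Int = 0,
    val categoriesImported: Int = 0,
    val errors: List<String> = emptyList()
) {
    // Assumption: an import counts as successful when no errors were recorded.
    val isSuccess: Boolean get() = errors.isEmpty()
}

// Builds a one-line summary suitable for a log entry or a toast.
fun summarize(result: ImportResult): String =
    if (result.isSuccess) {
        "Imported ${result.itemsImported} items, skipped ${result.itemsSkipped}, " +
            "${result.categoriesImported} categories"
    } else {
        "Import failed with ${result.errors.size} error(s): ${result.errors.joinToString("; ")}"
    }
```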

Conflict Resolution Strategies

When importing data, choose how conflicts (duplicate items or categories) are handled. If no strategy is passed, MERGE is used:

1. SKIP Strategy

strategy = ConflictStrategy.SKIP
  • Behavior: Skip importing items that already exist
  • Use case: Importing shared vocabulary without overwriting your progress
  • Result: Preserves all existing data unchanged

2. REPLACE Strategy

strategy = ConflictStrategy.REPLACE
  • Behavior: Replace existing items with imported versions
  • Use case: Restoring from backup, syncing with authoritative source
  • Result: Overwrites local data with imported data

3. MERGE Strategy (Default)

strategy = ConflictStrategy.MERGE
  • Behavior: Merge field by field
    • For items: Keep existing if duplicate, add new ones
    • For states: Keep the more advanced learning progress
    • For stages: Keep the higher stage
    • For categories: Merge memberships
  • Use case: Most common scenario, combining data from multiple sources
  • Result: New items are added, and the more advanced progress from either side is kept
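
The MERGE rules can be sketched as three small functions. This is an illustration under stated assumptions, not the repository's implementation: `LearningState` is reduced to two counters, and "more advanced" is taken to mean "more correct answers", which the source does not define.

```kotlin
// Stages as listed in this document, declared in ascending order of progress.
enum class Stage { NEW, STAGE_1, STAGE_2, STAGE_3, STAGE_4, STAGE_5, LEARNED }

// Hypothetical, reduced learning state (timestamps omitted).
data class LearningState(val correctCount: Int, val incorrectCount: Int)

// Assumption: the state with more correct answers is the more advanced one;
// on a tie the existing local state wins.
fun mergeState(existing: LearningState, imported: LearningState): LearningState =
    if (imported.correctCount > existing.correctCount) imported else existing

// Enum entries compare by declaration order, so maxOf picks the higher stage.
fun mergeStage(existing: Stage, imported: Stage): Stage = maxOf(existing, imported)

// Category memberships are merged as a set union of item IDs.
fun mergeMemberships(existing: Set<Int>, imported: Set<Int>): Set<Int> = existing union imported
```

Keeping each rule in its own pure function makes the merge easy to unit test without a database.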

4. RENAME Strategy

strategy = ConflictStrategy.RENAME
  • Behavior: Assign new IDs to all imported items
  • Use case: Intentionally creating duplicates for practice
  • Result: All imported items get new IDs, no conflicts
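
Assigning new IDs also means rewriting every record that refers to the old ones. A minimal sketch of that remapping follows; `Item`, `StageMapping`, and `renameImported` are hypothetical names, and allocating IDs above the current maximum is an assumption (a real database would more likely hand out IDs on insert).

```kotlin
// Hypothetical, reduced models; the real classes carry more fields.
data class Item(val id: Int, val text: String)
data class StageMapping(val itemId: Int, val stage: String)

// Gives every imported item a fresh ID above the highest existing one and
// rewrites the stage mappings so they still point at the right items.
fun renameImported(
    existingIds: Set<Int>,
    items: List<Item>,
    mappings: List<StageMapping>
): Pair<List<Item>, List<StageMapping>> {
    val firstFreeId = (existingIds.maxOrNull() ?: 0) + 1
    // Old ID -> new ID, assigned in import order.
    val idMap = items.mapIndexed { index, item -> item.id to firstFreeId + index }.toMap()
    val renamedItems = items.map { it.copy(id = idMap.getValue(it.id)) }
    // Mappings that reference an item missing from the import are dropped.
    val renamedMappings = mappings.mapNotNull { mapping ->
        idMap[mapping.itemId]?.let { newId -> mapping.copy(itemId = newId) }
    }
    return renamedItems to renamedMappings
}
```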

Data Preservation

What Gets Exported

Every export includes complete information:

  1. Vocabulary Items

    • Word/phrase in first language
    • Word/phrase in second language
    • Language IDs
    • Creation timestamp
    • Grammatical features (if any)
    • Zipf frequency scores (if available)
  2. Learning States

    • Correct answer count
    • Incorrect answer count
    • Last correct answer timestamp
    • Last incorrect answer timestamp
  3. Stage Mappings

    • Current learning stage (NEW, STAGE_1 through STAGE_5, LEARNED)
    • For each vocabulary item
  4. Categories

    • Category name and type
    • For TagCategory: just the name
    • For VocabularyFilter: language filters, stage filters, language pairs
  5. Category Memberships

    • Which items belong to which categories
    • Automatically recalculated for filters during import

Metadata

Each export includes metadata:

  • Format version (for future compatibility)
  • Export date/time
  • Item count
  • Category count
  • Export scope description
  • App version (optional)

Integration Examples

1. File Storage

// Save to device storage
fun saveVocabularyToFile(context: Context, exportData: VocabularyExportData) {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    file.writeText(jsonString)
}

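// Load from device storage
// Declared suspend on the assumption that importVocabularyData is a suspend
// function (the tests in this document call it inside runBlocking)
suspend fun loadVocabularyFromFile(context: Context): ImportResult {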
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    val jsonString = file.readText()
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}

2. Share via Intent (WhatsApp, Email, etc.)

fun shareVocabulary(context: Context, exportData: VocabularyExportData) {
    val jsonString = repository.exportToJson(exportData)
    
    val sendIntent = Intent().apply {
        action = Intent.ACTION_SEND
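        // Intent extras travel through Binder, which caps a transaction at about 1 MB;
        // share large exports as a file URI instead of inline text
        putExtra(Intent.EXTRA_TEXT, jsonString)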
        putExtra(Intent.EXTRA_SUBJECT, "Vocabulary List: ${exportData.metadata.exportScope}")
        type = "text/plain"
    }
    
    context.startActivity(Intent.createChooser(sendIntent, "Share vocabulary"))
}

// Receive from intent (suspend, assuming importVocabularyData is a suspend function)
suspend fun receiveVocabulary(intent: Intent): ImportResult? {
    val jsonString = intent.getStringExtra(Intent.EXTRA_TEXT) ?: return null
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}

3. REST API Integration

// Upload to server (Ktor client; in production, reuse one HttpClient instead of creating one per call)
suspend fun uploadToServer(exportData: VocabularyExportData): Result<String> {
    val jsonString = repository.exportToJson(exportData)
    
    val client = HttpClient()
    try {
        val response = client.post("https://api.example.com/vocabulary") {
            contentType(ContentType.Application.Json)
            setBody(jsonString)
        }
        
        return if (response.status.isSuccess()) {
            Result.success(response.body())
        } else {
            Result.failure(Exception("Upload failed: HTTP ${response.status.value}"))
        }
    } finally {
        client.close() // release the engine's connections and threads
    }
}

// Download from server
suspend fun downloadFromServer(vocabularyId: String): ImportResult {
    val client = HttpClient()
    val jsonString = try {
        client.get("https://api.example.com/vocabulary/$vocabularyId").body<String>()
    } finally {
        client.close()
    }
    
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}

4. Cloud Storage (Google Drive, Dropbox)

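// Upload to Google Drive
// Assumes: import com.google.api.services.drive.model.File as DriveFile
// (the alias avoids a clash with java.io.File from the file storage example).
// execute() blocks on network I/O, so call this off the main thread.
fun uploadToGoogleDrive(driveService: Drive, exportData: VocabularyExportData): String {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    
    val fileMetadata = DriveFile().apply {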
        name = "polly_vocabulary_${System.currentTimeMillis()}.json"
        mimeType = "application/json"
    }
    
    val content = ByteArrayContent.fromString("application/json", jsonString)
    val file = driveService.files().create(fileMetadata, content).execute()
    
    return file.id
}

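// Download from Google Drive (the download blocks on network I/O; run it off the main thread)
// Declared suspend on the assumption that importVocabularyData is a suspend function
suspend fun downloadFromGoogleDrive(driveService: Drive, fileId: String): ImportResult {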
    val outputStream = ByteArrayOutputStream()
    driveService.files().get(fileId).executeMediaAndDownloadTo(outputStream)
    
    val jsonString = outputStream.toString("UTF-8")
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}

5. QR Code Sharing

// Generate QR code for small exports
fun generateQRCode(exportData: VocabularyExportData): Bitmap {
    val jsonString = repository.exportToJson(exportData)
    
    // Compress if needed (a QR code holds at most 2,953 bytes, so this only suits small exports)
    val compressed = if (jsonString.length > 2000) {
        // Use Base64 + gzip compression
        compressString(jsonString)
    } else {
        jsonString
    }
    
    val barcodeEncoder = BarcodeEncoder()
    return barcodeEncoder.encodeBitmap(compressed, BarcodeFormat.QR_CODE, 512, 512)
}

// Scan QR code (suspend, assuming importVocabularyData is a suspend function)
suspend fun scanQRCode(qrContent: String): ImportResult {
    val jsonString = if (isCompressed(qrContent)) {
        decompressString(qrContent)
    } else {
        qrContent
    }
    
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
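
The QR example calls compressString, decompressString, and isCompressed without defining them. One possible implementation, offered as a sketch rather than the app's actual helpers, is gzip followed by Base64 using only the JDK. Note that `java.util.Base64` requires Android API level 26 or later; on older devices `android.util.Base64` would be the substitute.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// gzip the UTF-8 bytes, then Base64-encode so the result is plain text again.
fun compressString(input: String): String {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(input.toByteArray(Charsets.UTF_8)) }
    return Base64.getEncoder().encodeToString(buffer.toByteArray())
}

// Reverses compressString: Base64-decode, then gunzip back to a UTF-8 string.
fun decompressString(input: String): String {
    val bytes = Base64.getDecoder().decode(input)
    return GZIPInputStream(ByteArrayInputStream(bytes)).use {
        it.readBytes().toString(Charsets.UTF_8)
    }
}

// Every gzip stream starts with the bytes 1F 8B 08, which Base64 encodes as "H4sI".
// Raw JSON starts with '{' or '[', so the prefix is enough to tell the two apart.
fun isCompressed(input: String): Boolean = input.startsWith("H4sI")
```

Base64 adds about a third to the compressed size, so very small payloads can come out longer than the original; the length threshold in generateQRCode guards against that case.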

Error Handling

Common Errors

  1. Invalid JSON Format
try {
    val exportData = repository.importFromJson(jsonString)
} catch (e: SerializationException) {
    // Invalid JSON format
    Log.e(TAG, "Failed to parse JSON: ${e.message}")
}
  2. Import Failures
val result = repository.importVocabularyData(exportData, strategy)
if (!result.isSuccess) {
    result.errors.forEach { error ->
        Log.e(TAG, "Import error: $error")
    }
}
  3. Version Compatibility
if (exportData.formatVersion > CURRENT_FORMAT_VERSION) {
    // Warn user that format is from newer app version
    showWarning("This export was created with a newer version of the app")
}

Performance Considerations

Large Exports

For repositories with thousands of items:

  1. Chunked Processing: Process items in batches
  2. Background Thread: Use coroutines with Dispatchers.IO
  3. Progress Reporting: Update UI during long operations
  4. Compression: Use gzip for large JSON files
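suspend fun importLargeExport(jsonString: String, onProgress: (Int, Int) -> Unit): ImportResult {
    return withContext(Dispatchers.IO) {
        val exportData = repository.importFromJson(jsonString)
        
        // Walk the items in chunks with progress updates.
        // onProgress runs on Dispatchers.IO here; switch to the main dispatcher before touching the UI.
        when (exportData) {
            is FullRepositoryExport -> {
                val total = exportData.items.size
                var processed = 0
                
                exportData.items.chunked(100).forEach { chunk ->
                    // Placeholder: write this chunk to the database here, so that
                    // the reported progress reflects work that has been done
                    processed += chunk.size
                    onProgress(processed, total)
                }
            }
            else -> {
                // Handle other export types...
            }
        }
        
        // As written, the actual import happens in this single call; replace it with an
        // aggregated result once the loop above imports each chunk itself
        repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
    }
}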
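
Chunked processing with honest progress reporting means doing the work for a batch before announcing it. The generic helper below sketches that ordering; `importInChunks` and its `importChunk` callback are hypothetical and stand in for whatever chunk-level write the repository offers. It is a plain function for brevity; in the app it would be called from a coroutine on Dispatchers.IO.

```kotlin
// Imports `items` in batches and reports progress only after each batch has
// been handed to `importChunk`.
fun <T> importInChunks(
    items: List<T>,
    chunkSize: Int = 100,
    importChunk: (List<T>) -> Unit,
    onProgress: (processed: Int, total: Int) -> Unit
) {
    require(chunkSize > 0) { "chunkSize must be positive" }
    var processed = 0
    for (chunk in items.chunked(chunkSize)) {
        importChunk(chunk)                  // do the work first
        processed += chunk.size
        onProgress(processed, items.size)   // then report it
    }
}
```

With 250 items and the default chunk size, the progress callback fires three times: at 100, 200, and 250 processed items.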

Testing

Unit Tests

Test export/import roundtrip:

@Test
fun testExportImportRoundtrip() = runBlocking {
    // Create test data
    val originalItems = listOf(
        VocabularyItem(1, 1, 2, "hello", "hola", Clock.System.now())
    )
    repository.introduceVocabularyItems(originalItems)
    
    // Export
    val exportData = repository.exportFullRepository()
    val jsonString = repository.exportToJson(exportData)
    
    // Clear repository
    repository.wipeRepository()
    
    // Import
    val importData = repository.importFromJson(jsonString)
    val result = repository.importVocabularyData(importData, ConflictStrategy.MERGE)
    
    // Verify
    assertEquals(1, result.itemsImported)
    val importedItems = repository.getAllVocabularyItems()
    assertEquals(originalItems.size, importedItems.size)
}

Integration Tests

Test with external storage:

@Test
fun testFileExportImport() = runBlocking {
    // Export to file
    val exportData = repository.exportFullRepository()
    val jsonString = repository.exportToJson(exportData)
    val file = File.createTempFile("vocab", ".json")
    file.deleteOnExit() // remove the temp file when the test JVM exits
    file.writeText(jsonString)
    
    // Import from file
    val importedJson = file.readText()
    val importData = repository.importFromJson(importedJson)
    val result = repository.importVocabularyData(importData, ConflictStrategy.REPLACE)
    
    // Verify
    assertTrue(result.isSuccess)
}

Future Enhancements

Potential Improvements

  1. Compression: Add built-in gzip compression for large exports
  2. Encryption: Support for encrypted exports with password protection
  3. Incremental Sync: Export only changes since last sync
  4. Conflict Resolution UI: Let users manually resolve conflicts
  5. Batch Operations: Import multiple exports in one operation
  6. Export Templates: Pre-defined export configurations
  7. Automatic Backups: Scheduled background exports
  8. Cloud Sync: Automatic bidirectional synchronization
  9. Format Migration: Automatic upgrades from older format versions
  10. Validation: Pre-import validation with detailed reports

Troubleshooting

Common Issues

Q: Import says "0 items imported" but no errors

  • A: All items were duplicates and the SKIP strategy was used
  • Solution: Use MERGE to combine learning progress for the duplicates, or REPLACE to overwrite them

Q: Categories missing after import

  • A: Only TagCategories are imported; VocabularyFilters are recreated automatically
  • Solution: This is by design; filters regenerate based on rules

Q: Learning progress lost after import

  • A: REPLACE strategy was used, overwriting existing progress
  • Solution: Use MERGE strategy to preserve better progress

Q: JSON file too large to share via WhatsApp

  • A: Large repositories exceed message size limits
  • Solution: Use file sharing, cloud storage, or export specific categories

Q: Import fails with "Invalid JSON"

  • A: JSON was corrupted or manually edited incorrectly
  • Solution: Ensure JSON is valid; don't manually edit unless necessary

Best Practices

  1. Regular Backups: Export full repository regularly
  2. Test Imports: Test import in a fresh profile before overwriting
  3. Use MERGE: Default to MERGE strategy for most use cases
  4. Validate Data: Check ImportResult after each import
  5. Keep Metadata: Don't remove metadata from exported JSON
  6. Version Tracking: Include app version in exports
  7. Compression: Compress large exports before sharing
  8. Secure Exports: Be cautious with exports containing sensitive data
  9. Document Changes: Add notes about what was exported/imported
  10. Incremental Sharing: Share specific categories instead of full repo

API Reference

Repository Functions

Export Functions

  • exportFullRepository(): FullRepositoryExport
  • exportCategory(categoryId: Int): CategoryExport?
  • exportItemList(itemIds: List<Int>, includeCategories: Boolean = true): ItemListExport
  • exportSingleItem(itemId: Int): SingleItemExport?
  • exportToJson(exportData: VocabularyExportData, prettyPrint: Boolean = false): String

Import Functions

  • importFromJson(jsonString: String): VocabularyExportData
  • importVocabularyData(exportData: VocabularyExportData, strategy: ConflictStrategy = ConflictStrategy.MERGE): ImportResult

Data Classes

  • ExportMetadata: Information about the export
  • ImportResult: Statistics and errors from import
  • ConflictStrategy: Enum defining conflict resolution behavior
  • CategoryMappingData: Item-to-category relationship
  • StageMappingData: Item-to-stage relationship

Conclusion

The vocabulary export/import system makes Polly's data portable. Its JSON format can be read on any platform or service, and the four conflict resolution strategies let users decide how imported data is combined with what they already have.

The same small set of functions covers backups, sharing with friends, and integration with external systems.


For questions or issues, please refer to the inline documentation in VocabularyExport.kt and VocabularyRepository.kt.