Vocabulary Export/Import System
Overview
The Polly app includes a comprehensive vocabulary export/import system that allows users to:
- Backup their complete vocabulary repository
- Share vocabulary lists with friends, teachers, or students
- Transfer data between devices
- Exchange vocabulary via messaging apps (WhatsApp, Telegram, etc.)
- Store vocabulary in cloud services (Google Drive, Dropbox, etc.)
- Integrate with external systems via REST APIs
Data Format
The export/import system uses JSON as the primary data format. JSON was chosen because it is:
- Text-based: Can be shared via any text-based communication channel
- Portable: Works across all platforms and devices
- Human-readable: Can be inspected and edited manually if needed
- Standard: Supported by virtually every programming language and web API
- Compact enough: Small for typical vocabulary lists, and compresses well with gzip for large ones
Architecture
Core Components
- VocabularyExport.kt: Defines data models for export/import
- VocabularyRepository.kt: Implements export/import functions
- ConflictStrategy: Defines how to handle data conflicts during import
Data Models
The system uses a sealed class hierarchy for different export scopes:
sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}
Export Types
- FullRepositoryExport: Complete backup of everything
  - All vocabulary items
  - All categories (tags and filters)
  - All learning states
  - All category mappings
  - All stage mappings
- CategoryExport: Single category with its items
  - One category definition
  - All items in that category
  - Learning states for those items
  - Stage mappings for those items
- ItemListExport: Custom selection of items
  - Selected vocabulary items
  - Learning states for those items
  - Stage mappings for those items
  - Optionally: associated categories
- SingleItemExport: Individual vocabulary item
  - One vocabulary item
  - Its learning state
  - Its current stage
  - Categories it belongs to
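The four export types could be modelled as subclasses of the sealed class shown earlier. The following is only a sketch under assumptions, not the contents of VocabularyExport.kt: `ItemSketch`, the simplified `ExportMetadata`, the subclass property names, the `itemCount` helper, and the use of `java.time.Instant` are all illustrative.

```kotlin
import java.time.Instant

// Simplified stand-ins; the real models in VocabularyExport.kt may differ.
data class ExportMetadata(val exportScope: String, val itemCount: Int)
data class ItemSketch(val id: Int, val firstLanguage: String, val secondLanguage: String)

sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}

data class FullRepositoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val items: List<ItemSketch>,
) : VocabularyExportData()

data class CategoryExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val categoryName: String,
    val items: List<ItemSketch>,
) : VocabularyExportData()

data class ItemListExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val items: List<ItemSketch>,
) : VocabularyExportData()

data class SingleItemExport(
    override val formatVersion: Int,
    override val exportDate: Instant,
    override val metadata: ExportMetadata,
    val item: ItemSketch,
) : VocabularyExportData()

// Because the hierarchy is sealed, this `when` is checked for exhaustiveness at compile time.
fun itemCount(export: VocabularyExportData): Int = when (export) {
    is FullRepositoryExport -> export.items.size
    is CategoryExport -> export.items.size
    is ItemListExport -> export.items.size
    is SingleItemExport -> 1
}
```

A sealed hierarchy means that adding a fifth export type later produces compile errors at every `when` that has not been updated, which is useful for import code that must handle each scope.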
Usage Guide
Exporting Data
1. Export Full Repository
// In a coroutine scope
val repository = VocabularyRepository.getInstance(context)
// Create export data
val exportData = repository.exportFullRepository()
// Convert to JSON string
val jsonString = repository.exportToJson(exportData, prettyPrint = true)
// Save to file, share, or upload
saveToFile(jsonString, "vocabulary_backup.json")
2. Export Single Category
val categoryId = 123
val exportData = repository.exportCategory(categoryId)
if (exportData != null) {
    val jsonString = repository.exportToJson(exportData)
    shareViaIntent(jsonString)
} else {
    // Category not found
}
3. Export Custom Item List
val itemIds = listOf(1, 5, 10, 15, 20)
val exportData = repository.exportItemList(itemIds, includeCategories = true)
val jsonString = repository.exportToJson(exportData)
4. Export Single Item
val itemId = 42
val exportData = repository.exportSingleItem(itemId)
if (exportData != null) {
    val jsonString = repository.exportToJson(exportData)
    // Share via WhatsApp, email, etc.
}
Importing Data
1. Import from JSON String
// Receive JSON string (from file, intent, API, etc.)
val jsonString = readFromFile("vocabulary_backup.json")
// Parse JSON
val exportData = repository.importFromJson(jsonString)
// Import with conflict strategy
val result = repository.importVocabularyData(
    exportData = exportData,
    strategy = ConflictStrategy.MERGE
)
// Check result
if (result.isSuccess) {
    println("Imported: ${result.itemsImported} items")
    println("Skipped: ${result.itemsSkipped} items")
    println("Categories: ${result.categoriesImported}")
} else {
    println("Errors: ${result.errors}")
}
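The result object used above could look roughly like this. The field names are taken from the example; the default values, the rule that `isSuccess` means "no errors were recorded", and the `summarize` helper are assumptions made for illustration.

```kotlin
// Hypothetical shape of ImportResult, inferred from the fields used in the example above.
data class ImportResult(
    val itemsImported: Int = 0,
    val itemsSkipped: Int = 0,
    val categoriesImported: Int = 0,
    val errors: List<String> = emptyList(),
) {
    // Assumption: an import counts as successful when no errors were recorded.
    val isSuccess: Boolean get() = errors.isEmpty()
}

// Builds a one-line summary suitable for a log entry or a short UI message.
fun summarize(result: ImportResult): String =
    if (result.isSuccess) {
        "Imported ${result.itemsImported}, skipped ${result.itemsSkipped}, " +
            "categories ${result.categoriesImported}"
    } else {
        "Import failed with ${result.errors.size} error(s): ${result.errors.joinToString("; ")}"
    }
```

Deriving `isSuccess` from the error list, rather than storing it separately, keeps the two from disagreeing.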
Conflict Resolution Strategies
When importing data, you choose how to handle conflicts (duplicate items or categories). If no strategy is passed, MERGE is used:
1. SKIP Strategy
strategy = ConflictStrategy.SKIP
- Behavior: Skip importing items that already exist
- Use case: Importing shared vocabulary without overwriting your progress
- Result: Existing data stays unchanged; only items that do not exist yet are added
2. REPLACE Strategy
strategy = ConflictStrategy.REPLACE
- Behavior: Replace existing items with imported versions
- Use case: Restoring from backup, syncing with authoritative source
- Result: Overwrites local data with imported data
3. MERGE Strategy (Default)
strategy = ConflictStrategy.MERGE
- Behavior: Merge imported data with existing data according to these rules
  - For items: Keep existing if duplicate, add new ones
  - For states: Keep the more advanced learning progress
  - For stages: Keep the higher stage
  - For categories: Merge memberships
- Use case: Most common scenario, combining data from multiple sources
- Result: No existing item is lost, new items are added, and the more advanced progress of the two sources is kept
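The state and stage rules of MERGE can be sketched as two small functions. This is a sketch under assumptions: `LearningStage`, `LearningState`, `mergeStage`, and `mergeState` are hypothetical names, and "more advanced" is interpreted here as "more correct answers", which the real implementation may define differently.

```kotlin
// Stage names follow the list used in this document; the declaration order is what matters.
enum class LearningStage { NEW, STAGE_1, STAGE_2, STAGE_3, STAGE_4, STAGE_5, LEARNED }

// Hypothetical learning-state record; timestamps are epoch milliseconds, null if never answered.
data class LearningState(
    val correctCount: Int,
    val incorrectCount: Int,
    val lastCorrect: Long? = null,
    val lastIncorrect: Long? = null,
)

// Enum entries compare by declaration order, so maxOf picks the later (higher) stage.
fun mergeStage(local: LearningStage, imported: LearningStage): LearningStage =
    maxOf(local, imported)

// Assumption: "more advanced" means more correct answers; ties keep the local state.
fun mergeState(local: LearningState, imported: LearningState): LearningState =
    if (imported.correctCount > local.correctCount) imported else local
```

For example, merging a local `STAGE_2` with an imported `LEARNED` yields `LEARNED`, while an imported state with fewer correct answers leaves the local state in place.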
4. RENAME Strategy
strategy = ConflictStrategy.RENAME
- Behavior: Assign new IDs to all imported items
- Use case: Intentionally creating duplicates for practice
- Result: All imported items get new IDs, no conflicts
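To make the difference between the four strategies concrete, here is a minimal in-memory sketch. It is not the repository's implementation: `Item`, `applyImport`, the numeric `stage` field, and the "highest ID plus one" allocation for RENAME are assumptions chosen to keep the example small.

```kotlin
enum class ConflictStrategy { SKIP, REPLACE, MERGE, RENAME }

// Hypothetical item; `stage` is a plain number standing in for the learning stage.
data class Item(val id: Int, val word: String, val translation: String, val stage: Int = 0)

// Applies `incoming` to `existing` (keyed by item ID) according to `strategy`.
fun applyImport(
    existing: MutableMap<Int, Item>,
    incoming: List<Item>,
    strategy: ConflictStrategy,
) {
    for (item in incoming) {
        val current = existing[item.id]
        when {
            // RENAME never conflicts: every imported item gets a fresh ID.
            strategy == ConflictStrategy.RENAME -> {
                val newId = (existing.keys.maxOrNull() ?: 0) + 1
                existing[newId] = item.copy(id = newId)
            }
            // No conflict: the remaining strategies all simply add the item.
            current == null -> existing[item.id] = item
            // Conflict + SKIP: keep the existing item untouched.
            strategy == ConflictStrategy.SKIP -> Unit
            // Conflict + REPLACE: the imported item wins.
            strategy == ConflictStrategy.REPLACE -> existing[item.id] = item
            // Conflict + MERGE: keep the existing item but take the higher stage.
            strategy == ConflictStrategy.MERGE ->
                existing[item.id] = current.copy(stage = maxOf(current.stage, item.stage))
        }
    }
}
```

With one existing item and two incoming items, one of which shares the existing ID, SKIP and MERGE end with two items, REPLACE ends with two items where the shared one is overwritten, and RENAME ends with three.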
Data Preservation
What Gets Exported
Every export includes complete information:
- Vocabulary Items
  - Word/phrase in first language
  - Word/phrase in second language
  - Language IDs
  - Creation timestamp
  - Grammatical features (if any)
  - Zipf frequency scores (if available)
- Learning States
  - Correct answer count
  - Incorrect answer count
  - Last correct answer timestamp
  - Last incorrect answer timestamp
- Stage Mappings
  - Current learning stage (NEW, STAGE_1-5, LEARNED)
  - For each vocabulary item
- Categories
  - Category name and type
  - For TagCategory: just the name
  - For VocabularyFilter: language filters, stage filters, language pairs
- Category Memberships
  - Which items belong to which categories
  - Automatically recalculated for filters during import
Metadata
Each export includes metadata:
- Format version (for future compatibility)
- Export date/time
- Item count
- Category count
- Export scope description
- App version (optional)
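The metadata fields listed above map naturally onto a small data class. In this sketch only `exportScope` is a name that appears elsewhere in this document; the other property names, the use of `java.time.Instant`, and the `validateMetadata` helper are assumptions.

```kotlin
import java.time.Instant

// Hypothetical field names; only `exportScope` is used elsewhere in this document.
data class ExportMetadata(
    val formatVersion: Int,
    val exportDate: Instant,
    val itemCount: Int,
    val categoryCount: Int,
    val exportScope: String,
    val appVersion: String? = null, // optional
)

// Basic sanity checks that could run before trusting a parsed export.
// Returns a list of problems; an empty list means nothing suspicious was found.
fun validateMetadata(meta: ExportMetadata, actualItemCount: Int): List<String> = buildList {
    if (meta.formatVersion < 1) add("formatVersion must be positive")
    if (meta.itemCount != actualItemCount) {
        add("metadata says ${meta.itemCount} items, export contains $actualItemCount")
    }
    if (meta.exportScope.isBlank()) add("exportScope is empty")
}
```

Comparing the declared item count with the actual payload is a cheap way to detect a truncated or hand-edited file.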
Integration Examples
1. File Storage
// Save to device storage
fun saveVocabularyToFile(context: Context, exportData: VocabularyExportData) {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    file.writeText(jsonString)
}
// Load from device storage
fun loadVocabularyFromFile(context: Context): ImportResult {
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    val jsonString = file.readText()
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
2. Share via Intent (WhatsApp, Email, etc.)
fun shareVocabulary(context: Context, exportData: VocabularyExportData) {
    val jsonString = repository.exportToJson(exportData)
    val sendIntent = Intent().apply {
        action = Intent.ACTION_SEND
        putExtra(Intent.EXTRA_TEXT, jsonString)
        putExtra(Intent.EXTRA_SUBJECT, "Vocabulary List: ${exportData.metadata.exportScope}")
        type = "text/plain"
    }
    context.startActivity(Intent.createChooser(sendIntent, "Share vocabulary"))
}
// Receive from intent
fun receiveVocabulary(intent: Intent): ImportResult? {
    val jsonString = intent.getStringExtra(Intent.EXTRA_TEXT) ?: return null
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
3. REST API Integration
// Upload to server
suspend fun uploadToServer(exportData: VocabularyExportData): Result<String> {
    val jsonString = repository.exportToJson(exportData)
    // use { } closes the client afterwards so its connections and threads are released
    return HttpClient().use { client ->
        val response = client.post("https://api.example.com/vocabulary") {
            contentType(ContentType.Application.Json)
            setBody(jsonString)
        }
        if (response.status.isSuccess()) {
            Result.success(response.body<String>())
        } else {
            Result.failure(Exception("Upload failed: HTTP ${response.status.value}"))
        }
    }
}
// Download from server
suspend fun downloadFromServer(vocabularyId: String): ImportResult {
    val jsonString = HttpClient().use { client ->
        client.get("https://api.example.com/vocabulary/$vocabularyId").body<String>()
    }
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
4. Cloud Storage (Google Drive, Dropbox)
// Upload to Google Drive
fun uploadToGoogleDrive(driveService: Drive, exportData: VocabularyExportData): String {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    // File here is com.google.api.services.drive.model.File, not java.io.File
    val fileMetadata = File().apply {
        name = "polly_vocabulary_${System.currentTimeMillis()}.json"
        mimeType = "application/json"
    }
    val content = ByteArrayContent.fromString("application/json", jsonString)
    // execute() is a blocking network call; run it off the main thread
    val file = driveService.files().create(fileMetadata, content).execute()
    return file.id
}
// Download from Google Drive
fun downloadFromGoogleDrive(driveService: Drive, fileId: String): ImportResult {
    val outputStream = ByteArrayOutputStream()
    driveService.files().get(fileId).executeMediaAndDownloadTo(outputStream)
    val jsonString = outputStream.toString("UTF-8")
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
5. QR Code Sharing
// Generate QR code for small exports
fun generateQRCode(exportData: VocabularyExportData): Bitmap {
    val jsonString = repository.exportToJson(exportData)
    // A QR code holds at most about 3 KB, so this only works for small exports
    val compressed = if (jsonString.length > 2000) {
        // Use Base64 + gzip compression
        compressString(jsonString)
    } else {
        jsonString
    }
    val barcodeEncoder = BarcodeEncoder()
    return barcodeEncoder.encodeBitmap(compressed, BarcodeFormat.QR_CODE, 512, 512)
}
// Scan QR code
fun scanQRCode(qrContent: String): ImportResult {
    val jsonString = if (isCompressed(qrContent)) {
        decompressString(qrContent)
    } else {
        qrContent
    }
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
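The QR example calls `compressString`, `isCompressed`, and `decompressString` without defining them. One possible way to write them, using gzip and Base64 from the JDK, is sketched below. The `gz:` prefix is an assumption introduced here so the receiver can distinguish compressed payloads from plain JSON; it is not part of any documented Polly format. On Android, `java.util.Base64` requires API level 26 or higher.

```kotlin
import java.io.ByteArrayInputStream
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical marker; a JSON document cannot start with these characters.
val compressedPrefix = "gz:"

// Gzips the UTF-8 bytes of `text` and returns them as prefixed Base64 text.
fun compressString(text: String): String {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return compressedPrefix + Base64.getEncoder().encodeToString(buffer.toByteArray())
}

fun isCompressed(content: String): Boolean = content.startsWith(compressedPrefix)

// Reverses compressString: strips the prefix, decodes Base64, and gunzips.
fun decompressString(content: String): String {
    val raw = Base64.getDecoder().decode(content.removePrefix(compressedPrefix))
    return GZIPInputStream(ByteArrayInputStream(raw)).use {
        it.readBytes().toString(Charsets.UTF_8)
    }
}
```

Base64 adds roughly a third to the size of the gzipped bytes, so for very short strings the "compressed" form can be longer than the original; the length threshold in the QR example guards against that case.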
Error Handling
Common Errors
- Invalid JSON Format
val exportData = try {
    repository.importFromJson(jsonString)
} catch (e: SerializationException) {
    // Invalid JSON format
    Log.e(TAG, "Failed to parse JSON: ${e.message}")
    null
}
- Import Failures
val result = repository.importVocabularyData(exportData, strategy)
if (!result.isSuccess) {
    result.errors.forEach { error ->
        Log.e(TAG, "Import error: $error")
    }
}
- Version Compatibility
if (exportData.formatVersion > CURRENT_FORMAT_VERSION) {
    // Warn user that format is from newer app version
    showWarning("This export was created with a newer version of the app")
}
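The version check can be pulled into a small function so that callers handle each outcome explicitly. This is a sketch: the value of `CURRENT_FORMAT_VERSION`, the `VersionCheck` enum, and the idea that older versions need a migration step are assumptions, not documented behavior.

```kotlin
// Assumed value for illustration; the real constant is defined in the app.
val CURRENT_FORMAT_VERSION = 2

enum class VersionCheck { SUPPORTED, OLDER, TOO_NEW, INVALID }

// Classifies an export's format version relative to the version this app understands.
fun checkFormatVersion(
    formatVersion: Int,
    current: Int = CURRENT_FORMAT_VERSION,
): VersionCheck = when {
    formatVersion < 1 -> VersionCheck.INVALID
    formatVersion > current -> VersionCheck.TOO_NEW   // created by a newer app version
    formatVersion < current -> VersionCheck.OLDER     // may need migration before import
    else -> VersionCheck.SUPPORTED
}
```

A caller could show the warning from the example above for `TOO_NEW` and refuse the import outright for `INVALID`.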
Performance Considerations
Large Exports
For repositories with thousands of items:
- Chunked Processing: Process items in batches
- Background Thread: Use coroutines with Dispatchers.IO
- Progress Reporting: Update UI during long operations
- Compression: Use gzip for large JSON files
suspend fun importLargeExport(jsonString: String, onProgress: (Int, Int) -> Unit): ImportResult {
    return withContext(Dispatchers.IO) {
        val exportData = repository.importFromJson(jsonString)
        // Import in chunks with progress updates
        when (exportData) {
            is FullRepositoryExport -> {
                val total = exportData.items.size
                var processed = 0
                exportData.items.chunked(100).forEach { chunk ->
                    // Process chunk (the per-chunk import step is not shown in this outline)
                    processed += chunk.size
                    // Called on Dispatchers.IO; switch to the main dispatcher before touching the UI
                    onProgress(processed, total)
                }
            }
            // Handle other types...
            else -> Unit
        }
        // As written, the actual import still happens here in one call
        repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
    }
}
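The chunking and progress-reporting part can be isolated into a generic helper that reports progress only after a batch has actually been handled. `importInChunks` and its `importBatch` callback are hypothetical names; the callback stands in for whatever per-chunk import the repository would provide.

```kotlin
// Processes `items` in batches and reports progress after each batch has been handled.
// `importBatch` is a hypothetical callback standing in for the per-chunk import step.
fun <T> importInChunks(
    items: List<T>,
    chunkSize: Int = 100,
    importBatch: (List<T>) -> Unit,
    onProgress: (processed: Int, total: Int) -> Unit,
) {
    require(chunkSize > 0) { "chunkSize must be positive" }
    var processed = 0
    for (chunk in items.chunked(chunkSize)) {
        importBatch(chunk)            // do the work first...
        processed += chunk.size
        onProgress(processed, items.size) // ...then report it
    }
}
```

With 250 items and the default chunk size, the progress callback fires three times, after 100, 200, and 250 items. In a coroutine-based app the same loop would typically run inside `withContext(Dispatchers.IO)`.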
Testing
Unit Tests
Test export/import roundtrip:
@Test
fun testExportImportRoundtrip() = runBlocking {
    // Create test data
    val originalItems = listOf(
        VocabularyItem(1, 1, 2, "hello", "hola", Clock.System.now())
    )
    repository.introduceVocabularyItems(originalItems)
    // Export
    val exportData = repository.exportFullRepository()
    val jsonString = repository.exportToJson(exportData)
    // Clear repository
    repository.wipeRepository()
    // Import
    val importData = repository.importFromJson(jsonString)
    val result = repository.importVocabularyData(importData, ConflictStrategy.MERGE)
    // Verify
    assertEquals(1, result.itemsImported)
    val importedItems = repository.getAllVocabularyItems()
    assertEquals(originalItems.size, importedItems.size)
}
Integration Tests
Test with external storage:
@Test
fun testFileExportImport() = runBlocking {
    // Export to file
    val exportData = repository.exportFullRepository()
    val jsonString = repository.exportToJson(exportData)
    val file = File.createTempFile("vocab", ".json")
    file.deleteOnExit() // do not leave temp files behind after the test run
    file.writeText(jsonString)
    // Import from file
    val importedJson = file.readText()
    val importData = repository.importFromJson(importedJson)
    val result = repository.importVocabularyData(importData, ConflictStrategy.REPLACE)
    // Verify
    assertTrue(result.isSuccess)
}
Future Enhancements
Potential Improvements
- Compression: Add built-in gzip compression for large exports
- Encryption: Support for encrypted exports with password protection
- Incremental Sync: Export only changes since last sync
- Conflict Resolution UI: Let users manually resolve conflicts
- Batch Operations: Import multiple exports in one operation
- Export Templates: Pre-defined export configurations
- Automatic Backups: Scheduled background exports
- Cloud Sync: Automatic bidirectional synchronization
- Format Migration: Automatic upgrades from older format versions
- Validation: Pre-import validation with detailed reports
Troubleshooting
Common Issues
Q: Import says "0 items imported" but no errors
- A: All items were duplicates and SKIP strategy was used
- Solution: Use MERGE or REPLACE strategy
Q: Categories missing after import
- A: Only TagCategories are imported; VocabularyFilters are recreated automatically
- Solution: This is by design; filters regenerate based on rules
Q: Learning progress lost after import
- A: REPLACE strategy was used, overwriting existing progress
- Solution: Use MERGE strategy to preserve better progress
Q: JSON file too large to share via WhatsApp
- A: Large repositories exceed message size limits
- Solution: Use file sharing, cloud storage, or export specific categories
Q: Import fails with "Invalid JSON"
- A: JSON was corrupted or manually edited incorrectly
- Solution: Ensure JSON is valid; don't manually edit unless necessary
Best Practices
- Regular Backups: Export full repository regularly
- Test Imports: Test import in a fresh profile before overwriting
- Use MERGE: Default to MERGE strategy for most use cases
- Validate Data: Check ImportResult after each import
- Keep Metadata: Don't remove metadata from exported JSON
- Version Tracking: Include app version in exports
- Compression: Compress large exports before sharing
- Secure Exports: Be cautious with exports containing sensitive data
- Document Changes: Add notes about what was exported/imported
- Incremental Sharing: Share specific categories instead of full repo
API Reference
Repository Functions
Export Functions
- exportFullRepository(): FullRepositoryExport
- exportCategory(categoryId: Int): CategoryExport?
- exportItemList(itemIds: List<Int>, includeCategories: Boolean = true): ItemListExport
- exportSingleItem(itemId: Int): SingleItemExport?
- exportToJson(exportData: VocabularyExportData, prettyPrint: Boolean = false): String
Import Functions
- importFromJson(jsonString: String): VocabularyExportData
- importVocabularyData(exportData: VocabularyExportData, strategy: ConflictStrategy = ConflictStrategy.MERGE): ImportResult
Data Classes
- ExportMetadata: Information about the export
- ImportResult: Statistics and errors from import
- ConflictStrategy: Enum defining conflict resolution behavior
- CategoryMappingData: Item-to-category relationship
- StageMappingData: Item-to-stage relationship
Conclusion
The vocabulary export/import system provides a robust, flexible solution for data portability in the Polly app. Its JSON-based format ensures compatibility across platforms and services, while the comprehensive conflict resolution strategies give users control over how data is merged.
The same export and import functions cover backups, sharing with friends, and integration with external systems; only the export scope and the conflict strategy change.
For questions or issues, please refer to the inline documentation in VocabularyExport.kt and VocabularyRepository.kt.