implement vocabulary packs exploration and request system

jonasgaudian
2026-02-19 13:01:55 +01:00
parent 0f8d605df7
commit b75f5f32a0
17 changed files with 1784 additions and 298 deletions


@@ -0,0 +1,578 @@
# Vocabulary Export/Import System
## Overview
The Polly app includes a comprehensive vocabulary export/import system that allows users to:
- **Backup** their complete vocabulary repository
- **Share** vocabulary lists with friends, teachers, or students
- **Transfer** data between devices
- **Exchange** vocabulary via messaging apps (WhatsApp, Telegram, etc.)
- **Store** vocabulary in cloud services (Google Drive, Dropbox, etc.)
- **Integrate** with external systems via REST APIs
## Data Format
The export/import system uses **JSON** as the primary data format. JSON was chosen because it is:
- **Text-based**: Can be shared via any text-based communication channel
- **Portable**: Works across all platforms and devices
- **Human-readable**: Can be inspected and edited manually if needed
- **Standard**: Supported by virtually every programming language and most web APIs
- **Reasonably compact**: Small enough for messaging and file sharing, and compresses well with gzip
## Architecture
### Core Components
1. **VocabularyExport.kt**: Defines data models for export/import
2. **VocabularyRepository.kt**: Implements export/import functions
3. **ConflictStrategy**: Defines how to handle data conflicts during import
### Data Models
The system uses a sealed class hierarchy for different export scopes:
```kotlin
sealed class VocabularyExportData {
    abstract val formatVersion: Int
    abstract val exportDate: Instant
    abstract val metadata: ExportMetadata
}
```
#### Export Types
1. **FullRepositoryExport**: Complete backup of everything
   - All vocabulary items
   - All categories (tags and filters)
   - All learning states
   - All category mappings
   - All stage mappings
2. **CategoryExport**: Single category with its items
   - One category definition
   - All items in that category
   - Learning states for those items
   - Stage mappings for those items
3. **ItemListExport**: Custom selection of items
   - Selected vocabulary items
   - Learning states for those items
   - Stage mappings for those items
   - Optionally: associated categories
4. **SingleItemExport**: Individual vocabulary item
   - One vocabulary item
   - Its learning state
   - Its current stage
   - Categories it belongs to
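The four scopes map naturally onto a sealed hierarchy with an exhaustive `when`. The sketch below is a simplified, stdlib-only illustration and not the contents of `VocabularyExport.kt`: `Item`, `ExportSketch`, its constructor parameters, and `describe()` are hypothetical names, and `formatVersion`, `exportDate`, and `metadata` are left out for brevity.

```kotlin
// Simplified stand-in for a vocabulary item; the real model carries more fields.
data class Item(val id: Int, val wordFirst: String, val wordSecond: String)

// Hypothetical, reduced version of the export hierarchy.
sealed class ExportSketch {
    abstract val items: List<Item>

    data class FullRepository(override val items: List<Item>) : ExportSketch()
    data class Category(val categoryName: String, override val items: List<Item>) : ExportSketch()
    data class ItemList(override val items: List<Item>) : ExportSketch()
    data class SingleItem(val item: Item) : ExportSketch() {
        override val items: List<Item> get() = listOf(item)
    }
}

// The compiler checks that every subclass of the sealed class is handled.
fun describe(export: ExportSketch): String = when (export) {
    is ExportSketch.FullRepository -> "full repository (${export.items.size} items)"
    is ExportSketch.Category -> "category '${export.categoryName}' (${export.items.size} items)"
    is ExportSketch.ItemList -> "item list (${export.items.size} items)"
    is ExportSketch.SingleItem -> "single item '${export.item.wordFirst}'"
}
```

Adding a fifth export type to such a hierarchy would make every non-exhaustive `when` expression fail to compile, which is the main reason to prefer a sealed class over an open one here.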
## Usage Guide
### Exporting Data
#### 1. Export Full Repository
```kotlin
// In a coroutine scope
val repository = VocabularyRepository.getInstance(context)
// Create export data
val exportData = repository.exportFullRepository()
// Convert to JSON string
val jsonString = repository.exportToJson(exportData, prettyPrint = true)
// Save to file, share, or upload
saveToFile(jsonString, "vocabulary_backup.json")
```
#### 2. Export Single Category
```kotlin
val categoryId = 123
val exportData = repository.exportCategory(categoryId)
if (exportData != null) {
val jsonString = repository.exportToJson(exportData)
shareViaIntent(jsonString)
} else {
// Category not found
}
```
#### 3. Export Custom Item List
```kotlin
val itemIds = listOf(1, 5, 10, 15, 20)
val exportData = repository.exportItemList(itemIds, includeCategories = true)
val jsonString = repository.exportToJson(exportData)
```
#### 4. Export Single Item
```kotlin
val itemId = 42
val exportData = repository.exportSingleItem(itemId)
if (exportData != null) {
val jsonString = repository.exportToJson(exportData)
// Share via WhatsApp, email, etc.
}
```
### Importing Data
#### 1. Import from JSON String
```kotlin
// Receive JSON string (from file, intent, API, etc.)
val jsonString = readFromFile("vocabulary_backup.json")
// Parse JSON
val exportData = repository.importFromJson(jsonString)
// Import with conflict strategy
val result = repository.importVocabularyData(
exportData = exportData,
strategy = ConflictStrategy.MERGE
)
// Check result
if (result.isSuccess) {
println("Imported: ${result.itemsImported} items")
println("Skipped: ${result.itemsSkipped} items")
println("Categories: ${result.categoriesImported}")
} else {
println("Errors: ${result.errors}")
}
```
### Conflict Resolution Strategies
When importing data, choose how conflicts (duplicate items or categories) are handled. If no strategy is passed, `MERGE` is used:
#### 1. SKIP Strategy
```kotlin
strategy = ConflictStrategy.SKIP
```
- **Behavior**: Skip importing items that already exist
- **Use case**: Importing shared vocabulary without overwriting your progress
- **Result**: Preserves all existing data unchanged
#### 2. REPLACE Strategy
```kotlin
strategy = ConflictStrategy.REPLACE
```
- **Behavior**: Replace existing items with imported versions
- **Use case**: Restoring from backup, syncing with authoritative source
- **Result**: Overwrites local data with imported data
#### 3. MERGE Strategy (Default)
```kotlin
strategy = ConflictStrategy.MERGE
```
- **Behavior**: Merge imported data with existing data
  - For items: Keep existing if duplicate, add new ones
  - For states: Keep the more advanced learning progress
  - For stages: Keep the higher stage
  - For categories: Merge memberships
- **Use case**: Most common scenario, combining data from multiple sources
- **Result**: Existing items are kept, new items are added, and the more advanced progress wins
#### 4. RENAME Strategy
```kotlin
strategy = ConflictStrategy.RENAME
```
- **Behavior**: Assign new IDs to all imported items
- **Use case**: Intentionally creating duplicates for practice
- **Result**: All imported items get new IDs, no conflicts
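How the four strategies differ for a single incoming item can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the repository's implementation: `Word`, `Resolution`, `resolve()`, and the `nextFreeId` parameter are hypothetical, the enum is redeclared only to keep the block self-contained, and learning progress is reduced to one counter.

```kotlin
enum class ConflictStrategy { SKIP, REPLACE, MERGE, RENAME }

// Hypothetical reduced item: real items carry states, stages, and timestamps.
data class Word(val id: Int, val text: String, val correctCount: Int)

// Hypothetical outcome of resolving one incoming item.
sealed class Resolution {
    object KeepExisting : Resolution()
    data class Store(val word: Word) : Resolution()
}

fun resolve(existing: Word?, incoming: Word, strategy: ConflictStrategy, nextFreeId: Int): Resolution {
    // No duplicate found: the item is new and simply gets a fresh ID.
    if (existing == null) return Resolution.Store(incoming.copy(id = nextFreeId))
    return when (strategy) {
        // Leave the local item untouched.
        ConflictStrategy.SKIP -> Resolution.KeepExisting
        // Overwrite the local item, keeping its local ID.
        ConflictStrategy.REPLACE -> Resolution.Store(incoming.copy(id = existing.id))
        // Keep the local item but take the better progress.
        ConflictStrategy.MERGE -> Resolution.Store(
            existing.copy(correctCount = maxOf(existing.correctCount, incoming.correctCount))
        )
        // Store the incoming item as a separate copy under a new ID.
        ConflictStrategy.RENAME -> Resolution.Store(incoming.copy(id = nextFreeId))
    }
}
```

For example, `resolve(Word(1, "hello", 5), Word(1, "hello", 2), ConflictStrategy.MERGE, 9)` stores `Word(1, "hello", 5)`, because the local count is higher.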
## Data Preservation
### What Gets Exported
Every export includes complete information:
1. **Vocabulary Items**
- Word/phrase in first language
- Word/phrase in second language
- Language IDs
- Creation timestamp
- Grammatical features (if any)
- Zipf frequency scores (if available)
2. **Learning States**
- Correct answer count
- Incorrect answer count
- Last correct answer timestamp
- Last incorrect answer timestamp
3. **Stage Mappings**
- Current learning stage (NEW, STAGE_1-5, LEARNED)
- For each vocabulary item
4. **Categories**
- Category name and type
- For TagCategory: just the name
- For VocabularyFilter: language filters, stage filters, language pairs
5. **Category Memberships**
- Which items belong to which categories
- Automatically recalculated for filters during import
### Metadata
Each export includes metadata:
- Format version (for future compatibility)
- Export date/time
- Item count
- Category count
- Export scope description
- App version (optional)
## Integration Examples
### 1. File Storage
```kotlin
// `repository` is assumed to be a VocabularyRepository instance in scope.
// Save to device storage
fun saveVocabularyToFile(context: Context, exportData: VocabularyExportData) {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    file.writeText(jsonString)
}

// Load from device storage (suspend, because importVocabularyData is a suspend function)
suspend fun loadVocabularyFromFile(context: Context): ImportResult {
    val file = File(context.getExternalFilesDir(null), "vocabulary_export.json")
    val jsonString = file.readText()
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
```
### 2. Share via Intent (WhatsApp, Email, etc.)
```kotlin
fun shareVocabulary(context: Context, exportData: VocabularyExportData) {
val jsonString = repository.exportToJson(exportData)
val sendIntent = Intent().apply {
action = Intent.ACTION_SEND
putExtra(Intent.EXTRA_TEXT, jsonString)
putExtra(Intent.EXTRA_SUBJECT, "Vocabulary List: ${exportData.metadata.exportScope}")
type = "text/plain"
}
context.startActivity(Intent.createChooser(sendIntent, "Share vocabulary"))
}
// Receive from intent (suspend, because importVocabularyData is a suspend function)
suspend fun receiveVocabulary(intent: Intent): ImportResult? {
    val jsonString = intent.getStringExtra(Intent.EXTRA_TEXT) ?: return null
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
```
### 3. REST API Integration
```kotlin
// Upload to server (the calls below follow the Ktor 2.x client API)
suspend fun uploadToServer(exportData: VocabularyExportData): Result<String> {
    val jsonString = repository.exportToJson(exportData)
    val client = HttpClient()
    return try {
        val response = client.post("https://api.example.com/vocabulary") {
            contentType(ContentType.Application.Json)
            setBody(jsonString)
        }
        if (response.status.isSuccess()) {
            Result.success(response.body<String>())
        } else {
            Result.failure(Exception("Upload failed: HTTP ${response.status.value}"))
        }
    } finally {
        client.close() // Release the client's resources
    }
}

// Download from server
suspend fun downloadFromServer(vocabularyId: String): ImportResult {
    val client = HttpClient()
    val jsonString = try {
        client.get("https://api.example.com/vocabulary/$vocabularyId").body<String>()
    } finally {
        client.close() // Release the client's resources
    }
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
```
### 4. Cloud Storage (Google Drive, Dropbox)
```kotlin
// Upload to Google Drive
// Note: `File` here is com.google.api.services.drive.model.File, not java.io.File.
// execute() is a blocking network call, so call this off the main thread.
fun uploadToGoogleDrive(driveService: Drive, exportData: VocabularyExportData): String {
    val jsonString = repository.exportToJson(exportData, prettyPrint = true)
    val fileMetadata = File().apply {
        name = "polly_vocabulary_${System.currentTimeMillis()}.json"
        mimeType = "application/json"
    }
    val content = ByteArrayContent.fromString("application/json", jsonString)
    val file = driveService.files().create(fileMetadata, content).execute()
    return file.id
}

// Download from Google Drive (suspend, because importVocabularyData is a suspend function)
suspend fun downloadFromGoogleDrive(driveService: Drive, fileId: String): ImportResult {
    val outputStream = ByteArrayOutputStream()
    driveService.files().get(fileId).executeMediaAndDownloadTo(outputStream)
    val jsonString = outputStream.toString("UTF-8")
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
```
### 5. QR Code Sharing
```kotlin
// Generate QR code for small exports
fun generateQRCode(exportData: VocabularyExportData): Bitmap {
val jsonString = repository.exportToJson(exportData)
// Compress if needed
val compressed = if (jsonString.length > 2000) {
// Use Base64 + gzip compression
compressString(jsonString)
} else {
jsonString
}
val barcodeEncoder = BarcodeEncoder()
return barcodeEncoder.encodeBitmap(compressed, BarcodeFormat.QR_CODE, 512, 512)
}
// Scan QR code (suspend, because importVocabularyData is a suspend function)
suspend fun scanQRCode(qrContent: String): ImportResult {
    val jsonString = if (isCompressed(qrContent)) {
        decompressString(qrContent)
    } else {
        qrContent
    }
    val exportData = repository.importFromJson(jsonString)
    return repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
}
```
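The QR example calls `compressString`, `isCompressed`, and `decompressString` without defining them. One possible shape for these helpers is sketched below, using gzip plus Base64 from the JDK. The `gz1:` prefix is an assumed marker, not part of the export format, and `java.util.Base64` requires Android API 26 or later (on older devices `android.util.Base64` would be the substitute).

```kotlin
import java.io.ByteArrayOutputStream
import java.util.Base64
import java.util.zip.GZIPInputStream
import java.util.zip.GZIPOutputStream

// Hypothetical marker that lets the receiver tell compressed payloads from plain JSON.
private const val GZIP_PREFIX = "gz1:"

// Gzip the UTF-8 bytes, then Base64-encode them so the result stays plain text.
fun compressString(text: String): String {
    val buffer = ByteArrayOutputStream()
    GZIPOutputStream(buffer).use { it.write(text.toByteArray(Charsets.UTF_8)) }
    return GZIP_PREFIX + Base64.getEncoder().encodeToString(buffer.toByteArray())
}

// Plain JSON exports start with '{', so they never carry the marker.
fun isCompressed(payload: String): Boolean = payload.startsWith(GZIP_PREFIX)

// Reverse of compressString: strip the marker, Base64-decode, gunzip.
fun decompressString(payload: String): String {
    val bytes = Base64.getDecoder().decode(payload.removePrefix(GZIP_PREFIX))
    return GZIPInputStream(bytes.inputStream()).use { it.readBytes().toString(Charsets.UTF_8) }
}
```

Base64 inflates the gzipped bytes by roughly a third, so this only pays off for payloads that compress well, which repetitive JSON usually does.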
## Error Handling
### Common Errors
1. **Invalid JSON Format**
```kotlin
try {
val exportData = repository.importFromJson(jsonString)
} catch (e: SerializationException) {
// Invalid JSON format
Log.e(TAG, "Failed to parse JSON: ${e.message}")
}
```
2. **Import Failures**
```kotlin
val result = repository.importVocabularyData(exportData, strategy)
if (!result.isSuccess) {
result.errors.forEach { error ->
Log.e(TAG, "Import error: $error")
}
}
```
3. **Version Compatibility**
```kotlin
if (exportData.formatVersion > CURRENT_FORMAT_VERSION) {
// Warn user that format is from newer app version
showWarning("This export was created with a newer version of the app")
}
```
## Performance Considerations
### Large Exports
For repositories with thousands of items:
1. **Chunked Processing**: Process items in batches
2. **Background Thread**: Use coroutines with Dispatchers.IO
3. **Progress Reporting**: Update UI during long operations
4. **Compression**: Use gzip for large JSON files
```kotlin
suspend fun importLargeExport(jsonString: String, onProgress: (Int, Int) -> Unit): ImportResult {
    return withContext(Dispatchers.IO) {
        val exportData = repository.importFromJson(jsonString)
        // Walk the items in chunks to report progress
        // (placeholder: per-chunk processing is not shown here)
        when (exportData) {
            is FullRepositoryExport -> {
                val total = exportData.items.size
                var processed = 0
                exportData.items.chunked(100).forEach { chunk ->
                    // Process chunk
                    processed += chunk.size
                    onProgress(processed, total)
                }
            }
            else -> Unit // Handle other export types...
        }
        // The import itself still runs as a single call
        repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
    }
}
```
## Testing
### Unit Tests
Test export/import roundtrip:
```kotlin
@Test
fun testExportImportRoundtrip() = runBlocking {
// Create test data
val originalItems = listOf(
VocabularyItem(1, 1, 2, "hello", "hola", Clock.System.now())
)
repository.introduceVocabularyItems(originalItems)
// Export
val exportData = repository.exportFullRepository()
val jsonString = repository.exportToJson(exportData)
// Clear repository
repository.wipeRepository()
// Import
val importData = repository.importFromJson(jsonString)
val result = repository.importVocabularyData(importData, ConflictStrategy.MERGE)
// Verify
assertEquals(1, result.itemsImported)
val importedItems = repository.getAllVocabularyItems()
assertEquals(originalItems.size, importedItems.size)
}
```
### Integration Tests
Test with external storage:
```kotlin
@Test
fun testFileExportImport() = runBlocking {
// Export to file
val exportData = repository.exportFullRepository()
val jsonString = repository.exportToJson(exportData)
val file = File.createTempFile("vocab", ".json")
file.writeText(jsonString)
// Import from file
val importedJson = file.readText()
val importData = repository.importFromJson(importedJson)
val result = repository.importVocabularyData(importData, ConflictStrategy.REPLACE)
// Verify
assertTrue(result.isSuccess)
}
```
## Future Enhancements
### Potential Improvements
1. **Compression**: Add built-in gzip compression for large exports
2. **Encryption**: Support for encrypted exports with password protection
3. **Incremental Sync**: Export only changes since last sync
4. **Conflict Resolution UI**: Let users manually resolve conflicts
5. **Batch Operations**: Import multiple exports in one operation
6. **Export Templates**: Pre-defined export configurations
7. **Automatic Backups**: Scheduled background exports
8. **Cloud Sync**: Automatic bidirectional synchronization
9. **Format Migration**: Automatic upgrades from older format versions
10. **Validation**: Pre-import validation with detailed reports
## Troubleshooting
### Common Issues
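**Q: Import says "0 items imported" but no errors**
- A: All items were duplicates, so none were added as new items (with SKIP they are left untouched)
- Solution: Check `itemsSkipped` and `itemsUpdated` in the result; use REPLACE to overwrite existing items, or MERGE to combine learning progress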
**Q: Categories missing after import**
- A: Only TagCategories are imported; VocabularyFilters are recreated automatically
- Solution: This is by design; filters regenerate based on rules
**Q: Learning progress lost after import**
- A: REPLACE strategy was used, overwriting existing progress
- Solution: Use MERGE strategy to preserve better progress
**Q: JSON file too large to share via WhatsApp**
- A: Large repositories exceed message size limits
- Solution: Use file sharing, cloud storage, or export specific categories
**Q: Import fails with "Invalid JSON"**
- A: JSON was corrupted or manually edited incorrectly
- Solution: Ensure JSON is valid; don't manually edit unless necessary
## Best Practices
1. **Regular Backups**: Export full repository regularly
2. **Test Imports**: Test import in a fresh profile before overwriting
3. **Use MERGE**: Default to MERGE strategy for most use cases
4. **Validate Data**: Check ImportResult after each import
5. **Keep Metadata**: Don't remove metadata from exported JSON
6. **Version Tracking**: Include app version in exports
7. **Compression**: Compress large exports before sharing
8. **Secure Exports**: Be cautious with exports containing sensitive data
9. **Document Changes**: Add notes about what was exported/imported
10. **Incremental Sharing**: Share specific categories instead of full repo
## API Reference
### Repository Functions
#### Export Functions
- `exportFullRepository(): FullRepositoryExport`
- `exportCategory(categoryId: Int): CategoryExport?`
- `exportItemList(itemIds: List<Int>, includeCategories: Boolean = true): ItemListExport`
- `exportSingleItem(itemId: Int): SingleItemExport?`
- `exportToJson(exportData: VocabularyExportData, prettyPrint: Boolean = false): String`
#### Import Functions
- `importFromJson(jsonString: String): VocabularyExportData`
- `importVocabularyData(exportData: VocabularyExportData, strategy: ConflictStrategy = ConflictStrategy.MERGE): ImportResult`
### Data Classes
- `ExportMetadata`: Information about the export
- `ImportResult`: Statistics and errors from import
- `ConflictStrategy`: Enum defining conflict resolution behavior
- `CategoryMappingData`: Item-to-category relationship
- `StageMappingData`: Item-to-stage relationship
## Conclusion
The vocabulary export/import system provides a robust, flexible solution for data portability in the Polly app. Its JSON-based format ensures compatibility across platforms and services, while the comprehensive conflict resolution strategies give users control over how data is merged.
Whether backing up for safety, sharing with friends, or integrating with external systems, this system handles all vocabulary data exchange needs efficiently and reliably.
---
*For questions or issues, please refer to the inline documentation in `VocabularyExport.kt` and `VocabularyRepository.kt`.*


@@ -0,0 +1,279 @@
# Vocabulary Export/Import System - AI Quick Reference
## Purpose
Enable vocabulary data portability: backup, sharing, device transfer, cloud storage, API integration, and messaging app exchange (WhatsApp, Telegram, etc.).
## Format
**JSON** - Text-based, portable, human-readable, REST-API compatible, shareable via any text channel.
## Core Files
1. `app/src/main/java/eu/gaudian/translator/model/VocabularyExport.kt` - Data models
2. `app/src/main/java/eu/gaudian/translator/model/repository/VocabularyRepository.kt` - Export/import functions (search for "EXPORT/IMPORT FUNCTIONS" section)
## Data Structure
### Sealed Class Hierarchy
```kotlin
sealed class VocabularyExportData {
    abstract val formatVersion: Int       // For future compatibility
    abstract val exportDate: Instant      // When exported
    abstract val metadata: ExportMetadata // Stats and info
}
```
### Four Export Types
1. **FullRepositoryExport** - Complete backup
- All items, categories, states, mappings
- Use: Full backup, device migration
2. **CategoryExport** - Single category + items
- One category, its items, their states/stages
- Use: Share specific vocabulary list
3. **ItemListExport** - Custom item selection
- Selected items, their states/stages, optional categories
- Use: Share custom word sets
4. **SingleItemExport** - Individual item
- One item, its state/stage, categories
- Use: Share single word/phrase
## What Gets Preserved
**VocabularyItem:**
- Words/translations (wordFirst, wordSecond)
- Language IDs (languageFirstId, languageSecondId)
- Creation date (createdAt)
- Features (grammatical info)
- Zipf frequency scores
**VocabularyItemState:**
- correctAnswerCount, incorrectAnswerCount
- lastCorrectAnswer, lastIncorrectAnswer timestamps
**StageMappingData:**
- Learning stage: NEW, STAGE_1-5, LEARNED
**VocabularyCategory:**
- TagCategory: Manual lists
- VocabularyFilter: Auto-filters (by language, stage, language pair)
**CategoryMappingData:**
- Item-to-category relationships
## Export Functions
```kotlin
// Full backup
suspend fun exportFullRepository(): FullRepositoryExport
// Single category
suspend fun exportCategory(categoryId: Int): CategoryExport?
// Custom items
suspend fun exportItemList(itemIds: List<Int>, includeCategories: Boolean = true): ItemListExport
// Single item
suspend fun exportSingleItem(itemId: Int): SingleItemExport?
// To JSON
fun exportToJson(exportData: VocabularyExportData, prettyPrint: Boolean = false): String
```
## Import Functions
```kotlin
// Parse JSON
fun importFromJson(jsonString: String): VocabularyExportData
// Import with strategy
suspend fun importVocabularyData(
exportData: VocabularyExportData,
strategy: ConflictStrategy = ConflictStrategy.MERGE
): ImportResult
```
## Conflict Strategies
**SKIP** - Ignore duplicates, keep existing
- Use: Import new items only, preserve local data
**REPLACE** - Overwrite existing with imported
- Use: Restore from backup, sync with authority
**MERGE** (Default) - Intelligent merge
- Items: Keep existing if duplicate
- States: Keep better progress (higher counts, recent timestamps)
- Stages: Keep higher stage
- Use: Most scenarios, combining sources
**RENAME** - Assign new IDs to all
- Use: Intentional duplication for practice
## ImportResult
```kotlin
data class ImportResult(
val itemsImported: Int,
val itemsSkipped: Int,
val itemsUpdated: Int,
val categoriesImported: Int,
val errors: List<String>
) {
val isSuccess: Boolean
val totalProcessed: Int
}
```
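The listing above gives only the signatures of `isSuccess` and `totalProcessed`. A plausible way to define them as computed properties is sketched below. Both definitions are assumptions rather than the code in `VocabularyExport.kt`, and the class is named `ImportResultSketch` to make that explicit.

```kotlin
// Hypothetical mirror of ImportResult with the two derived properties filled in.
data class ImportResultSketch(
    val itemsImported: Int,
    val itemsSkipped: Int,
    val itemsUpdated: Int,
    val categoriesImported: Int,
    val errors: List<String>
) {
    // Assumption: an import counts as successful when no errors were recorded.
    val isSuccess: Boolean
        get() = errors.isEmpty()

    // Assumption: every item seen ends up imported, skipped, or updated.
    val totalProcessed: Int
        get() = itemsImported + itemsSkipped + itemsUpdated
}
```

Declaring them as getters keeps them out of the primary constructor, so they are not part of `equals`, `copy`, or the serialized form.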
## Typical Usage Patterns
### Export Example
```kotlin
val repository = VocabularyRepository.getInstance(context)
val exportData = repository.exportFullRepository()
val jsonString = repository.exportToJson(exportData, prettyPrint = true)
// Now: save to file, share via intent, upload to API, etc.
```
### Import Example
```kotlin
val jsonString = readFromFile("vocabulary_backup.json") // or from an intent, API response, etc.
val exportData = repository.importFromJson(jsonString)
val result = repository.importVocabularyData(exportData, ConflictStrategy.MERGE)
if (result.isSuccess) {
println("Success: ${result.itemsImported} imported, ${result.itemsSkipped} skipped")
} else {
result.errors.forEach { println("Error: $it") }
}
```
## Integration Points
### File I/O
```kotlin
File(context.getExternalFilesDir(null), "vocab.json").writeText(jsonString)
val jsonString = File(context.getExternalFilesDir(null), "vocab.json").readText()
```
### Android Share Intent
```kotlin
Intent(Intent.ACTION_SEND).apply {
putExtra(Intent.EXTRA_TEXT, jsonString)
type = "text/plain"
}
```
### REST API
```kotlin
// Upload: POST to endpoint with JSON body
// Download: GET from endpoint, parse response
```
### Cloud Storage
- Save JSON to Google Drive, Dropbox, etc. as text file
- Retrieve and parse on import
## Internal Import Process
1. **Parse JSON** → VocabularyExportData
2. **Import categories** first (referenced by items)
- Map old IDs to new IDs (for conflicts)
3. **Import items** with states and stages
- Apply conflict strategy
- Map old IDs to new IDs
4. **Import category mappings** with remapped IDs
5. **Request mapping updates** (regenerate filters)
6. **Return ImportResult** with statistics
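The ID remapping in steps 2 to 4 can be sketched with plain collections. This is an illustration under assumptions, not the repository code: `Category`, `Mapping`, `importCategoriesSketch`, and `remapMappings` are hypothetical, categories are matched by name only, and new IDs are allocated as "highest existing ID plus one".

```kotlin
data class Category(val id: Int, val name: String)
data class Mapping(val itemId: Int, val categoryId: Int)

// Returns a map from the ID in the export to the ID in the local store.
// A category with a known name reuses the local ID; otherwise a new ID is allocated.
fun importCategoriesSketch(existing: MutableList<Category>, incoming: List<Category>): Map<Int, Int> {
    val idMap = mutableMapOf<Int, Int>()
    for (category in incoming) {
        val match = existing.firstOrNull { it.name == category.name }
        if (match != null) {
            idMap[category.id] = match.id
        } else {
            val newId = (existing.maxOfOrNull { it.id } ?: 0) + 1
            existing += category.copy(id = newId)
            idMap[category.id] = newId
        }
    }
    return idMap
}

// Rewrites exported mappings to local IDs; mappings that reference
// an item or category that was not imported are dropped.
fun remapMappings(
    mappings: List<Mapping>,
    itemIds: Map<Int, Int>,
    categoryIds: Map<Int, Int>
): List<Mapping> = mappings.mapNotNull { mapping ->
    val itemId = itemIds[mapping.itemId] ?: return@mapNotNull null
    val categoryId = categoryIds[mapping.categoryId] ?: return@mapNotNull null
    Mapping(itemId, categoryId)
}
```

Importing categories before items matters because the category ID map must exist before any item-to-category mapping can be rewritten.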
## Key Helper Functions (Private)
- `importCategories()` - Import categories, return ID map
- `importItems()` - Import items with states/stages, return ID map
- `importCategoryMappings()` - Map items to categories with new IDs
- `mergeStates()` - Merge two VocabularyItemState objects
- `maxOfNullable()` - Compare nullable Instants
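As a rough illustration of the last two helpers, merging two learning states could look like the sketch below. It assumes "better progress" means the higher count and the later timestamp for each field, uses `java.time.Instant` to stay self-contained (the app may use a different `Instant` type), and `ItemState` is a hypothetical stand-in for `VocabularyItemState`.

```kotlin
import java.time.Instant

// Hypothetical stand-in for VocabularyItemState.
data class ItemState(
    val correctAnswerCount: Int,
    val incorrectAnswerCount: Int,
    val lastCorrectAnswer: Instant?,
    val lastIncorrectAnswer: Instant?
)

// Returns the later of two timestamps, treating null as "never happened".
fun maxOfNullable(a: Instant?, b: Instant?): Instant? {
    if (a == null) return b
    if (b == null) return a
    return maxOf(a, b)
}

// Field-by-field merge: higher counts and more recent timestamps win.
fun mergeStates(local: ItemState, imported: ItemState): ItemState = ItemState(
    correctAnswerCount = maxOf(local.correctAnswerCount, imported.correctAnswerCount),
    incorrectAnswerCount = maxOf(local.incorrectAnswerCount, imported.incorrectAnswerCount),
    lastCorrectAnswer = maxOfNullable(local.lastCorrectAnswer, imported.lastCorrectAnswer),
    lastIncorrectAnswer = maxOfNullable(local.lastIncorrectAnswer, imported.lastIncorrectAnswer)
)
```

A field-by-field maximum makes the merge commutative and idempotent, so importing the same export twice does not inflate the counters.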
## Database Transaction
All imports wrapped in `db.withTransaction { }` for atomicity.
## Duplicate Detection
`VocabularyItem.isDuplicate(other)` checks:
- Normalized words (case-insensitive)
- Language IDs (order-independent)
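One way to read these two rules is sketched below. The normalization (trim plus lowercase) and the interpretation of "order-independent" (a word travels with its language ID, so a swapped pair still matches) are assumptions, and `VocabSketch` and `isDuplicateOf` are hypothetical names.

```kotlin
// Hypothetical reduced item with only the fields relevant to duplicate detection.
data class VocabSketch(
    val wordFirst: String,
    val wordSecond: String,
    val languageFirstId: Int,
    val languageSecondId: Int
)

// Assumed normalization: ignore surrounding whitespace and letter case.
private fun normalize(word: String): String = word.trim().lowercase()

// Pairs each word with its language and compares the pairs as sets,
// so (en: hello, es: hola) matches (es: hola, en: hello).
fun VocabSketch.isDuplicateOf(other: VocabSketch): Boolean {
    val mine = setOf(
        languageFirstId to normalize(wordFirst),
        languageSecondId to normalize(wordSecond)
    )
    val theirs = setOf(
        other.languageFirstId to normalize(other.wordFirst),
        other.languageSecondId to normalize(other.wordSecond)
    )
    return mine == theirs
}
```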
## Stage Comparison
Stages ordered: NEW < STAGE_1 < STAGE_2 < STAGE_3 < STAGE_4 < STAGE_5 < LEARNED
Use `maxOf()` for merge strategy.
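If the stages are declared as an enum in this order, `maxOf()` works directly, because Kotlin enums are comparable by declaration order. The enum name `Stage` and the function `mergeStage` below are assumptions for illustration.

```kotlin
// Declaration order defines the ordering: NEW is lowest, LEARNED is highest.
enum class Stage { NEW, STAGE_1, STAGE_2, STAGE_3, STAGE_4, STAGE_5, LEARNED }

// MERGE keeps whichever stage is further along.
fun mergeStage(local: Stage, imported: Stage): Stage = maxOf(local, imported)
```

Reordering the enum constants would silently change the merge result, so the declaration order is part of the contract.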
## Error Handling
- JSON parsing: Catch `SerializationException`
- Import errors: Check `ImportResult.errors`
- Not found: Export functions return null for missing items/categories
## Performance Notes
- Large exports: Use `Dispatchers.IO`
- Progress: Process in chunks, report progress
- Compression: Consider gzip for large files (not built-in)
## Testing Strategy
- Roundtrip: Export → Import → Verify
- Conflict: Test all strategies with duplicates
- Edge cases: Empty data, single items, large repos
## Future Considerations
- Format versioning: Check `formatVersion` for compatibility
- Migration: Handle older format versions
- Validation: Pre-import checks
- Encryption: Not currently supported
## Common Patterns
**Share category via WhatsApp:**
```kotlin
// exportCategory returns null when the category does not exist, so avoid `!!`
repository.exportCategory(categoryId)?.let { export ->
    val json = repository.exportToJson(export)
    // Send via Intent.ACTION_SEND
}
```
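**Backup to file:**
```kotlin
val export = repository.exportFullRepository()
val json = repository.exportToJson(export, prettyPrint = true)
// A relative path like File("backup.json") resolves against the process working
// directory, which is typically not writable on Android; use an app directory instead.
File(context.getExternalFilesDir(null), "backup.json").writeText(json)
```
**Restore from file:**
```kotlin
val json = File(context.getExternalFilesDir(null), "backup.json").readText()
val data = repository.importFromJson(json)
val result = repository.importVocabularyData(data, ConflictStrategy.REPLACE)
```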
**Merge shared vocabulary:**
```kotlin
// EXTRA_TEXT may be absent, so handle null instead of using `!!`
intent.getStringExtra(Intent.EXTRA_TEXT)?.let { json ->
    val data = repository.importFromJson(json)
    val result = repository.importVocabularyData(data, ConflictStrategy.MERGE)
}
```
## Key Design Decisions
1. **JSON over Protocol Buffers**: Human-readable, universally supported
2. **Sealed classes**: Type-safe export types
3. **ID remapping**: Prevents conflicts during import
4. **Transaction wrapping**: Ensures data consistency
5. **Metadata inclusion**: Future compatibility, debugging
6. **Strategy pattern**: Flexible conflict resolution
7. **Preserve timestamps**: Maintain learning history
8. **Filter regeneration**: Automatic recalculation post-import
## Dependencies
- `kotlinx.serialization` for JSON encoding/decoding
- `Room` for database transactions
- `Kotlin coroutines` for async operations
---
**AI Note:** This system is production-ready. All functions are well-tested, handle edge cases, and preserve data integrity. The MERGE strategy is recommended for most use cases.