Audio Add Annotation IDs
Starting in 2019, all SEEDLingS files have Annotation IDs.
We don't currently have a script to add annotation IDs*, so you'll need to ask Zhenya to add annotids. Remind him that it's easier to do this at the sparse_code_csv level and then propagate back into cha/opf.
First things first, run CLAN check on your file, using Esc+L. Fix any errors that CLAN check finds.
What's an annotation ID?
Each individual annotation in SEEDLingS now has a barcode. The barcode is a randomly generated, 6-character alphanumeric code, preceded by 0x.
Old audio format: chair &=d_y_MOT
New audio format: chair &=d_y_MOT_0x8a2f48
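To make the format concrete, here is a minimal Python sketch that splits a code into its original part and its annotation ID. It is not one of the lab's scripts. The name split_annotid and the pattern (0x followed by exactly six lowercase letters or digits at the end of the code) are assumptions inferred from the example above.

```python
import re

# Assumed pattern, inferred from the example above: an underscore, then
# "0x" followed by exactly six lowercase letters or digits, at the end.
ANNOTID_RE = re.compile(r"_(0x[0-9a-z]{6})$")


def split_annotid(code):
    """Return (code_without_id, annotid); annotid is None if absent."""
    match = ANNOTID_RE.search(code)
    if match is None:
        return code, None
    return code[:match.start()], match.group(1)
```

For the new-format example, split_annotid("&=d_y_MOT_0x8a2f48") gives the old-format code plus the ID. An old-format code comes back unchanged with None as its ID.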
What do I do with them?
Mostly ignore them!
Don't ever edit annotation IDs by hand. They are generated by a script.
When coding new words during annotation checks, FOR AUDIO FILES, you should annotate new words as you would have before:
chair &=d_y_MOT
After you finish the check, the new words will need annotation IDs added. Since we don't currently have a script for this, ask Zhenya to add them.
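When handing a file off for annotation IDs, it may help to list which codes still lack one. The following is a rough sketch only, not an existing lab tool. The function name codes_missing_ids is hypothetical, and the patterns assume codes start with &= and IDs look like the example above.

```python
import re

# Assumption: every code starts with "&=" and runs to the next whitespace.
CODE_RE = re.compile(r"&=\w+")
# Assumption: an annotation ID is "_0x" plus six lowercase alphanumerics.
HAS_ID_RE = re.compile(r"_0x[0-9a-z]{6}$")


def codes_missing_ids(lines):
    """Return (line_number, code) pairs for codes with no annotation ID."""
    missing = []
    for number, line in enumerate(lines, start=1):
        for code in CODE_RE.findall(line):
            if not HAS_ID_RE.search(code):
                missing.append((number, code))
    return missing
```

Pass it the lines of the file, for example the result of open(path).read().splitlines(). It only reads the text and never modifies it.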
When taking words out during annotation checks, FOR AUDIO FILES, you should be sure to delete the entire code, including the annotation ID.
Don't leave any characters in the file that were associated with the annotation ID!
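One way to double-check for leftovers is to strip out every complete code and then look for anything that still resembles an ID. This is a sketch under assumptions: the function name leftover_id_fragments and both patterns are hypothetical. It only catches fragments that still contain the 0x prefix, so a stray "f48" on its own would slip through.

```python
import re

# Assumption: a complete code is "&=" + fields + "_0x" + six alphanumerics.
COMPLETE_CODE_RE = re.compile(r"&=\w+_0x[0-9a-z]{6}\b")
# Anything starting with "0x" that remains after complete codes are removed.
LEFTOVER_RE = re.compile(r"_?0x[0-9a-z]{0,6}")


def leftover_id_fragments(line):
    """Return ID-like fragments that are not part of a complete code."""
    without_codes = COMPLETE_CODE_RE.sub("", line)
    return LEFTOVER_RE.findall(without_codes)
```

An empty list means the line has no obvious ID debris. Anything else is worth a look in CLAN before moving on.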
Parse Clan 2
Now you're ready to run parse_clan2.
*Why we don't have an annotid-adding script anymore
We used to have one. It relied on a database of annotids served from a virtual machine. For one reason or another, the script created duplicate annotids several times, which wasn't great. So, when that virtual machine was about to be quarantined in May 2023, Zhenya decided to kill it and live without a way to add annotids until we needed one again. That might be a while, since no new annotation was going on in the Seedlings dataset.