Proposal for Parallel Annotation of VIHI

Where annotation files can live.
GitHub repo/GitHub. VIHI_LENA repository on GitHub.
Individual folder. An individual annotator-recording folder, e.g., Fas-Phyc-PEB-Lab/VIHI/annotations-in-progress/LENA/George-Romero_AB_123_456. Inside that folder is a clone of VIHI_LENA with only the files for this one recording checked out, i.e., .../George-Romero_AB_123_456/AB/AB_123/AB_123_456/*.* are the only files in that folder. This is also the only folder that ELAN touches.
Individual branch. A branch checked out in the individual folder, named something like George-Romero_AB_123_456.
LENA folder. A folder on the BLab share at Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA to which the GitHub repo is cloned with the main branch checked out. There are three distinct ways that data are stored here:
Working tree. All the tracked files inside the LENA folder. There should never be any files in the modified, deleted, etc. state, i.e., git status should always say "nothing to commit, working tree clean". The only thing that should touch that folder is pull --ff-only. Other than that, this should be considered a read-only folder. I haven’t yet come up with a way to enforce this while still allowing pulls. I’ll think of something.
Ignored files. Files in the LENA folder that are not tracked by git - they are either large files (wav/its) or files we won’t miss if something happens to them. Ideally, the important files wouldn’t be here at all and would be stored separately and linked here. In any case, I’ll change them to read-only to avoid losing them.
BLab share repo. The Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA/.git folder. That’s where copies of the branches in annotations-in-progress are pushed after every commit. One staff member (by default, Zhenya) regularly pushes these branches to GitHub.
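The naming scheme above (one name shared by the individual folder and the individual branch, plus the nested recording path) could be captured in a few helpers. This is only a sketch: the function names are hypothetical, and it assumes that spaces in the annotator's name become hyphens and that the recording ID always has three underscore-separated parts.

```python
from pathlib import PurePosixPath


def branch_name(annotator: str, recording_id: str) -> str:
    """Build a name like 'George-Romero_AB_123_456' (folder and branch share it)."""
    # Assumption: spaces in the annotator's name are replaced with hyphens.
    return f"{annotator.strip().replace(' ', '-')}_{recording_id}"


def recording_subpath(recording_id: str) -> PurePosixPath:
    """Map 'AB_123_456' to 'AB/AB_123/AB_123_456', the only path checked out."""
    # Assumption: the ID has exactly three parts; anything else raises ValueError.
    first, second, _third = recording_id.split("_")
    return PurePosixPath(first, f"{first}_{second}", recording_id)


def individual_folder(root: str, annotator: str, recording_id: str) -> PurePosixPath:
    """Folder under annotations-in-progress/LENA that holds the partial clone."""
    return PurePosixPath(root) / branch_name(annotator, recording_id)
```

For example, branch_name("George Romero", "AB_123_456") gives George-Romero_AB_123_456, matching the folder name used in the example path above.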
Ensuring no data is lost
Ideally, we should achieve a state where we can delete the LENA folder and the annotations-in-progress folder at any given moment.
The working tree is just a mirror of the GitHub repo - recoverable.
Objects on branches saved in the BLab share repo are pushed to GitHub - recoverable too.
The ignored but important files (.wav, .its, etc.) are not even in the folders.
We don't care about other ignored files.
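The "working tree is just a mirror" claim only holds while the LENA working tree is clean, so it may be worth checking that before relying on it. Below is a minimal sketch that interprets the output of git status --porcelain, which prints nothing when there are no changes and does not list ignored files by default. The function names are hypothetical, and the sketch deliberately takes the output as a string instead of running git itself.

```python
from typing import List, Tuple


def dirty_entries(porcelain_output: str) -> List[Tuple[str, str]]:
    """Parse `git status --porcelain` output into (status, path) pairs."""
    entries = []
    for line in porcelain_output.splitlines():
        if not line.strip():
            continue
        # Porcelain v1 lines are two status characters, a space, then the path.
        entries.append((line[:2], line[3:]))
    return entries


def is_clean(porcelain_output: str) -> bool:
    """True when git reported no modified, deleted, or untracked files."""
    return not dirty_entries(porcelain_output)
```

Anything returned by dirty_entries would be a file that someone changed in the supposedly read-only folder and that would be lost if the folder were deleted.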
🤦🏻 And I’ve just realized that I am sort of re-inventing the child-project system.
Operations that change states
Annotating. Changes the .eaf and the .pfsx files.
Saving locally. In the individual folder,
git-add all changes (we are not affecting the main branch so it is OK in this situation),
git-commit them,
git-push them to the BLab share repo.
Pushing to GitHub. Push branches from the BLab share repo to GitHub. Not the main branch though, which we only ever pull --ff-only to.
Rebasing. Replaying commits on a super-checked branch onto the main branch on GitHub.
Updating BLab share repo. Run pull --ff-only in the LENA folder.
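The "saving locally" and "updating BLab share repo" operations map onto short git command sequences. The sketch below only builds the argument lists and does not run anything; the function names and the remote name are assumptions, since the proposal does not say what the BLab share repo remote is called.

```python
from typing import List


def save_locally_commands(message: str, remote: str, branch: str) -> List[List[str]]:
    """Commands to run inside the individual folder, in order."""
    return [
        ["git", "add", "--all"],          # stage every change in the folder
        ["git", "commit", "-m", message],
        ["git", "push", remote, branch],  # remote is assumed to point at the BLab share repo
    ]


def update_share_repo_commands() -> List[List[str]]:
    """Command to run inside the LENA folder; it refuses anything but a fast-forward."""
    return [["git", "pull", "--ff-only"]]
```

Each list could be passed to subprocess.run(command, cwd=folder, check=True), so that a failing step stops the sequence instead of pushing a half-saved state.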
Annotation process
(The level of detail decreases as we move down the list.)
Annotation
A recording is assigned to an annotator.
Annotator tells blabpy about that.
An individual folder is created and opens in Finder/Explorer.
Annotator opens ELAN and annotates as annotators do.
(optional) They tell blabpy that they are finished for the day and blabpy saves the annotations locally.
They tell blabpy that they are finished with the recording and blabpy saves the annotations locally and notifies Lilli.
Super-checking
Lilli tells blabpy that she wants to super-check.
An individual folder is created, Lilli makes edits, optionally finishes for the day, finishes fully, tells blabpy about that, and blabpy saves the changes locally.
Incorporating changes into GitHub and the local repos
blabpy rebases the branch in the BLab share repo onto the main branch,
merges the branch into the main branch without affecting the working tree,
pushes the merged branch to GitHub,
updates the BLab share repo.
(if blabpy fails) Zhenya gets a notification, tells blabpy that he needs to work on that one recording annotated by the annotator, and then finishes the steps in the previous list item.
(if Zhenya fails at resolving conflicts) Zhenya asks Lilli to resolve the conflicts that require thinking, not coding. Lilli tells blabpy, resolves the conflicts, tells blabpy about that, and it does the rest.
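The fallback order described here (blabpy first, then Zhenya, then Lilli) is essentially a chain of handlers tried in turn. A minimal sketch of that idea follows; the function name, the handler signature, and the stand-in lambdas are all hypothetical and only illustrate the control flow, not how blabpy would actually rebase or merge.

```python
from typing import Callable, List, Optional, Tuple

# A handler is a name plus a callable that returns True if it incorporated the branch.
Handler = Tuple[str, Callable[[str], bool]]


def incorporate(branch: str, handlers: List[Handler]) -> Optional[str]:
    """Try each handler in order; return the name of the first that succeeds."""
    for name, attempt in handlers:
        if attempt(branch):
            return name
    return None  # nobody could incorporate the branch


handlers = [
    ("blabpy", lambda branch: False),  # pretend the automatic rebase hit a conflict
    ("Zhenya", lambda branch: False),  # pretend the conflict needs an annotation decision
    ("Lilli", lambda branch: True),    # pretend Lilli resolves it in ELAN
]
```

With these stand-ins, incorporate("George-Romero_AB_123_456", handlers) returns "Lilli", i.e., the branch was escalated twice before it could be merged.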
UX
An annotator starting a new recording:
Opens Terminal.
$ vihi annotation start XX_NNN_MMM
> What is your name? (Last First): <type-in-John-space-Doe>
If the name hasn’t been used yet:
Prompt changes to:
Finder/Explorer opens on the folder with the EAF file.
Opens EAF in ELAN, annotates, saves, and closes ELAN.
If not yet finished with the recording:
Saves an in-progress version as a commit in (I forgot the alternative we came up with yesterday).
Next time, GOTO 1.
If done with the recording:
Saves the finished version.
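The vihi annotation start XX_NNN_MMM command could be parsed with nested argparse subcommands. This is a sketch under assumptions: the XX_NNN_MMM placeholder is read as two uppercase letters followed by two three-digit numbers, and the function names are hypothetical. The name prompt and everything that happens after parsing are left out.

```python
import argparse
import re

# Assumption: XX_NNN_MMM means two uppercase letters, then two 3-digit numbers.
RECORDING_ID = re.compile(r"[A-Z]{2}_\d{3}_\d{3}")


def recording_id(value: str) -> str:
    """argparse type that rejects anything not shaped like AB_123_456."""
    if not RECORDING_ID.fullmatch(value):
        raise argparse.ArgumentTypeError(
            f"expected an ID like AB_123_456, got {value!r}"
        )
    return value


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="vihi")
    commands = parser.add_subparsers(dest="command", required=True)
    annotation = commands.add_parser("annotation")
    actions = annotation.add_subparsers(dest="action", required=True)
    start = actions.add_parser("start")
    start.add_argument("recording", type=recording_id)
    return parser
```

Validating the ID at the command line means a typo is caught before any folder or branch is created for a recording that does not exist.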
Zhenya
Receives a notification about a version conflict.
vihi annotation resolve-conflicts --manual XX_NNN_MMM John Doe
Resolves conflicts if it is a technical thing and finishes.
Lilli
Super-checking.
vihi annotation super-check start XX_NNN_MMM
A Finder window opens with the folder that has the EAF.
Lilli does the super-checking.
If super-checking isn’t complete:
GOTO 1
If super-checking is complete:
If “Replaying changes…” or “Checking if…” reported any conflicts, go to “Zhenya” → “If there are conflicts…”
Conflict resolution
vihi annotation resolve-conflicts --elan XX_NNN_MMM John Doe
A Finder window opens with the folder that has the EAF.
Opens EAF in ELAN.
Resolves conflicts by editing conflicting annotations that are easy to find because the script did something helpful (no idea what that is yet :-).
vihi annotation conflicts-resolved
blabpy does the rest.