Proposal for Parallel Annotation of VIHI
Last updated
GitHub repo/GitHub. VIHI_LENA repository on GitHub.
Individual folder. An individual annotator-recording folder, e.g., Fas-Phyc-PEB-Lab/VIHI/annotations-in-progress/LENA/George-Romero_AB_123_456. Inside that folder is a clone of VIHI_LENA with only the files for this one recording checked out, i.e., .../George-Romero_AB_123_456/AB/AB_123/AB_123_456/*.* are the only files in that folder. This is also the only folder that ELAN touches.
Individual branch. A branch checked out in the individual folder. Named something like George-Romero_AB_123_456
LENA folder. A folder on the BLab share at Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA to which the GitHub repo is cloned with the main branch checked out. There are three distinct ways that data are stored here:
Working Tree. That’s all the tracked files inside the LENA folder. There should never be any files in the modified, deleted, etc. state, i.e., git status should always say working tree clean, nothing to see here. The only thing that should touch that folder is pull --ff-only. Other than that, this should be considered a read-only folder. I haven’t yet come up with a way to enforce this while allowing pulling at the same time. I’ll think of something.
Ignored files. Files in the LENA folder but not tracked by git - they are either large files (wav/its) or files we won’t miss if something happens to them. Ideally, the important files wouldn’t be here at all and would be stored separately and linked here. In any case, I’ll change them to read-only to avoid losing them.
BLab share repo. The Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA/.git folder. That’s where copies of the branches in annotations-in-progress are pushed to after every commit. One staff member (by default, Zhenya) regularly pushes these branches to GitHub.
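The naming convention for individual folders and branches described above can be sketched in Python as follows. This is only an illustration: the function name and the way the recording ID is split into three underscore-separated parts are assumptions based on the George-Romero_AB_123_456 example, not existing blabpy code.

```python
from pathlib import PurePosixPath


def individual_names(annotator, recording_id):
    """Return (folder_and_branch_name, relative_path_to_recording_files).

    Hypothetical helper: assumes the annotator is given as "First Last"
    and the recording ID has three underscore-separated parts.
    """
    first_part, second_part, _third_part = recording_id.split("_")
    # "George Romero" + "AB_123_456" -> "George-Romero_AB_123_456"
    name = f"{annotator.replace(' ', '-')}_{recording_id}"
    # "AB_123_456" -> AB/AB_123/AB_123_456
    relative_path = (
        PurePosixPath(first_part) / f"{first_part}_{second_part}" / recording_id
    )
    return name, relative_path
```

For example, individual_names("George Romero", "AB_123_456") gives the name George-Romero_AB_123_456 and the relative path AB/AB_123/AB_123_456.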
Ideally, we should achieve a state where we can delete the LENA folder and the annotations-in-progress folder at any given moment.
The working tree is just a mirror of the GitHub repo - recoverable.
Objects on branches saved in the BLab share repo are pushed to GitHub - recoverable too.
The ignored but important files (.wav, .its, etc.) are not even in the folders.
We don't care about other ignored files.
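One precondition for being able to delete the LENA folder is that its working tree is clean. A minimal sketch of such a check, assuming it is acceptable to call git through subprocess (the function names are made up for illustration). It relies on git status --porcelain printing nothing when there are no modified, deleted, or untracked non-ignored files.

```python
import subprocess


def is_working_tree_clean(porcelain_output):
    """Interpret the output of `git status --porcelain`: empty means clean."""
    return porcelain_output.strip() == ""


def lena_folder_is_clean(lena_folder):
    """Run `git status --porcelain` in the LENA folder and check the result.

    Ignored files (wav/its) are not listed by this command, so they do not
    count against cleanliness.
    """
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=lena_folder,
        capture_output=True,
        text=True,
        check=True,
    )
    return is_working_tree_clean(result.stdout)
```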
🤦🏻 And I’ve just realized that I am sort of re-inventing the child-project system.
Annotating. Changes the .eaf and the .pfsx files
Saving locally. In the individual folder,
git-add all changes (we are not affecting the main branch so it is OK in this situation),
git-commit them,
git-push them to the BLab share repo.
Pushing to GitHub. Push branches from the BLab share repo to GitHub. Not the main branch though, which we only ever pull --ff-only to.
Rebasing. Replaying commits on a super-checked branch onto the main branch on GitHub.
Updating BLab share repo. Run pull --ff-only in the LENA folder.
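The "saving locally" and "updating" operations above map onto a short sequence of git commands. Here is a rough sketch of how blabpy might build and run them. The remote name blab_share and all function names are assumptions made for the example.

```python
import subprocess


def save_locally_commands(branch, message, remote="blab_share"):
    """Build the git commands for "saving locally" in an individual folder.

    `remote` is a hypothetical name for the remote pointing at the BLab
    share repo.
    """
    return [
        ["git", "add", "--all"],           # safe here: we are not on main
        ["git", "commit", "-m", message],
        ["git", "push", remote, branch],   # copy the branch to the BLab share repo
    ]


def update_lena_commands():
    """Build the only command that should ever be run in the LENA folder."""
    return [["git", "pull", "--ff-only"]]


def run_all(commands, cwd):
    """Run the commands in order, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=cwd, check=True)
```

Keeping command construction separate from execution makes it possible to check what would be run without touching any repository.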
(The level of detail decreases as we move down the list.)
Annotation
A recording is assigned to an annotator.
Annotator tells blabpy about that.
An individual folder is created and opens in Finder/Explorer.
Annotator opens ELAN and annotates as annotators do.
(optional) They tell blabpy that they are finished for the day and blabpy saves the annotations locally.
They tell blabpy that they are finished with the recording and blabpy saves the annotations locally and notifies Lilli.
Super-checking
Lilli tells blabpy that she wants to super-check.
An individual folder is created, Lilli makes edits, optionally finishes for the day, then finishes fully and tells blabpy about that; blabpy saves the changes locally.
Incorporating changes into the GitHub repo and the local repos. blabpy:
rebases the branch in the BLab share repo onto the main branch,
merges the branch into the main branch without affecting the working tree,
pushes the merged branch to GitHub,
updates the BLab share repo.
(if blabpy fails) Zhenya gets a notification, tells blabpy that he needs to work on that one recording annotated by the annotator, and then finishes the steps in the previous list item.
(if Zhenya fails at resolving conflicts) Zhenya asks Lilli to resolve the conflicts that require thinking, not coding. Lilli tells blabpy, resolves conflicts, tells blabpy about that, it does the rest.
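The escalation described in the last three items (blabpy first, then Zhenya, then Lilli) could be modeled as a simple chain of attempts. The sketch below is hypothetical: the function name and the idea that each attempt is a callable returning True on success are assumptions, and the lambdas merely stand in for the real rebase and conflict-resolution work.

```python
def incorporate(branch, resolvers):
    """Try each (name, attempt) pair in order.

    Returns the name of the first resolver whose attempt succeeds, and
    raises if nobody manages to incorporate the branch.
    """
    for name, attempt in resolvers:
        if attempt(branch):
            return name
    raise RuntimeError(f"Nobody could incorporate {branch}")


# Placeholder attempts: the automatic rebase hits a conflict, Zhenya finds
# that it needs annotation judgment, and Lilli resolves it.
chain = [
    ("blabpy", lambda branch: False),
    ("Zhenya", lambda branch: False),
    ("Lilli", lambda branch: True),
]
resolved_by = incorporate("George-Romero_AB_123_456", chain)  # "Lilli"
```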
Opens Terminal.
$ vihi annotation start XX_NNN_MMM
> What is your name? (Last First): <type-in-John-space-Doe>
If the name hasn’t been used yet:
Prompt changes to:
Finder/Explorer opens on the folder with the EAF file
Opens EAF in ELAN, annotates, saves, and closes ELAN.
If not yet finished with the recording:
Saves an in-progress version as a commit in (I forgot the alternative) we came up with yesterday
Next time, GOTO 1.
If done with the recording:
Saves the finished version.
Receives a notification about a version conflict.
vihi annotation resolve-conflicts --manual XX_NNN_MMM John Doe
Resolves conflicts if it is a technical thing and finishes.
vihi annotation super-check start XX_NNN_MMM
A Finder window opens with the folder that has the EAF.
Lilli does the super-checking.
If super-checking isn’t complete:
GOTO 1
If super-checking is complete:
If “Replaying changes…” or “Checking if…” reported any conflicts, go to “Zhenya” → “If there are conflicts…”
vihi annotation resolve-conflicts --elan XX_NNN_MMM John Doe
A Finder window opens with the folder that has the EAF.
Opens EAF in ELAN.
Resolves conflicts by editing conflicting annotations that are easy to find because the script did something helpful (no idea what that is yet :-).
vihi annotation conflicts-resolved
blabpy does the rest.
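The commands used in the walkthroughs above suggest a command-line layout along the following lines. This is a sketch using Python's argparse, not the actual blabpy interface: the destination names (area, action, mode, etc.) are made up, and treating --manual and --elan as mutually exclusive is an assumption.

```python
import argparse


def build_parser():
    """Build a parser for the `vihi annotation ...` commands in this proposal."""
    parser = argparse.ArgumentParser(prog="vihi")
    areas = parser.add_subparsers(dest="area", required=True)
    annotation = areas.add_parser("annotation")
    actions = annotation.add_subparsers(dest="action", required=True)

    # vihi annotation start XX_NNN_MMM
    start = actions.add_parser("start")
    start.add_argument("recording_id")

    # vihi annotation resolve-conflicts (--manual | --elan) XX_NNN_MMM John Doe
    resolve = actions.add_parser("resolve-conflicts")
    mode = resolve.add_mutually_exclusive_group(required=True)
    mode.add_argument("--manual", dest="mode", action="store_const", const="manual")
    mode.add_argument("--elan", dest="mode", action="store_const", const="elan")
    resolve.add_argument("recording_id")
    resolve.add_argument("annotator", nargs=2, metavar=("FIRST", "LAST"))

    # vihi annotation conflicts-resolved
    actions.add_parser("conflicts-resolved")

    # vihi annotation super-check start XX_NNN_MMM
    super_check = actions.add_parser("super-check")
    super_check_actions = super_check.add_subparsers(
        dest="super_check_action", required=True
    )
    super_check_start = super_check_actions.add_parser("start")
    super_check_start.add_argument("recording_id")

    return parser
```

Parsing the arguments of "vihi annotation resolve-conflicts --elan XX_NNN_MMM John Doe" with this parser yields mode "elan", recording_id "XX_NNN_MMM", and annotator ["John", "Doe"].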