VIHI intercoder ACLEW Reliability

We are going to calculate inter-coder reliability by re-coding 10% of the intervals for the annotations that use a closed vocabulary (xds, vcm, lex, mwu). Do not code inq, utt, cds, or any other tiers for this round. We will NOT be re-segmenting utterances or re-transcribing them: not because we don't want to know, but because doing so would take excessively long, and there isn't a straightforward way to calculate percent agreement for segmentation or transcription.

We have run a script over the files that outputs a copy containing the annotations from only TWO intervals out of 20 (15 possible random intervals, 5 possible high-volubility intervals) and leaves blank annotations on the closed-vocabulary tiers.
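
Under the hood, the selection amounts to drawing one interval from each pool. A minimal sketch of that logic (the interval labels here are illustrative; the real implementation lives in blabpy, see the technician's notes below):

    import random

    # Illustrative pools: each recording has 15 random and 5
    # high-volubility intervals to choose from.
    random_pool = [f"random_{i}" for i in range(1, 16)]
    high_vol_pool = [f"high-volubility_{i}" for i in range(1, 6)]

    # One interval of each kind survives; everything else is deleted
    # from the reliability copy.
    kept = [random.choice(random_pool), random.choice(high_vol_pool)]
    print(f"Intervals kept for reliability coding: {kept}")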

Find your files:

sox4.university.harvard.edu/Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA/vihi_hi_td_reliability/spring25_hi_td[your_initials]

and navigate to the .eaf that corresponds to the number you have been assigned on Asana. Go through, listening to the two intervals, and re-code all the empty annotations. Save and export your file as a tab-delimited .txt file.
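
If you want to sanity-check your export, it can be read back with pandas. A minimal sketch (the filename is hypothetical, and the exact columns depend on the options you tick in ELAN's export dialog):

    import pandas as pd

    # Hypothetical filename; real files are named after the recording.
    # If your export has no header row, pass header=None and name the
    # columns (tier, begin time, end time, annotation value) yourself.
    df = pd.read_csv("VI_123_456.txt", sep="\t")
    print(df.head())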

The .log file tells you which intervals have been left for reliability and the timestamps they occur at, which will help you navigate through your file to find the segments you need to re-code.

Lab Technician's Technical Notes
  • blabpy now has blabpy.vihi.pipeline.create_reliability_test_file and blabpy.vihi.pipeline.prepare_eaf_for_reliability. They randomly pick one high-volubility and one random interval, wipe the closed-set annotations (vcm, xds, etc.), and delete all the other intervals and the annotations in them (a sketch of the wiping step follows this list).

  • If you need to do it again, consult 2023-11-14_vihi-reliability-files in one_time_scripts

  • The result was saved to the vihi_reliability repo and then cloned to Fas-Phyc-PEB-Lab/VIHI/lena/vihi_reliability.

  • Lilli will organize the files by annotator and update the repo.

  • I ended up processing all recordings, not only complete ones. But that is OK: Lilli won’t assign incomplete ones for coding.

  • Lilli will tell me when it is time to re-run the script. When that happens, I’ll copy the one-time script, clean it up somewhat, handle the different folder structure, maybe also clean up blabpy a bit, and then actually update the test files.
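
For the curious, the wiping step described in the first note boils down to blanking annotation values on the closed-vocabulary tiers while keeping their time alignment. A minimal sketch at the raw-EAF (XML) level (an illustration only, not the actual blabpy code; the filename and the tier-name matching are assumptions):

    import xml.etree.ElementTree as ET

    CLOSED_VOCAB = {"vcm", "xds", "lex", "mwu"}

    tree = ET.parse("VI_123_456.eaf")  # hypothetical filename
    for tier in tree.getroot().iter("TIER"):
        # Dependent tiers are typically named like "xds@CHI"; match on
        # the part before the "@".
        if tier.get("TIER_ID", "").split("@")[0] in CLOSED_VOCAB:
            # Blank each annotation's value but keep its alignment, so
            # the re-coder sees empty slots to fill in.
            for value in tier.iter("ANNOTATION_VALUE"):
                value.text = ""
    tree.write("VI_123_456_reliability.eaf",
               xml_declaration=True, encoding="UTF-8")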

2) Assign the file back to Lilli and Zhenya

Lilli and Zhenya calculate overall agreement between the two coders and keep tabs on agreement broken down by: sensory group (TD, HI, VI); original coder; recoder; tier type; and possibly mistake type.
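
As an illustration, percent agreement and its breakdowns can be computed with pandas. A sketch, assuming a hypothetical spreadsheet with one row per re-coded annotation and the column names shown (the real file may differ):

    import pandas as pd

    # Hypothetical file and column names.
    df = pd.read_csv("reliability_codes.csv")
    df["agrees"] = df["original_code"] == df["recoded_code"]

    # Overall percent agreement, then broken down by the dimensions
    # we track.
    print(f"Overall agreement: {df['agrees'].mean():.1%}")
    for dim in ["sensory_group", "original_coder", "recoder", "tier"]:
        print(df.groupby(dim)["agrees"].mean().round(3))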

3) Resolving disagreements

The file will be reassigned to you. Grab another ACLEW-trained coder (e.g., another RA). Your job is to find the codes where you disagreed with the original coder, talk through the rationale for each possibility, and jointly decide what the final code should be.

1) Open the disagreements key:

Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA/vihi_hi_td_reliability/spring25_hi_td/assess/resolving_disagreements_Feb_2025_HI.csv

This is a spreadsheet that lists the original coder's annotation and the recoded annotation from the reliability sample. In column N, we ran a formula to identify where the 'before' and 'after' codes agreed. Filter this spreadsheet to look only at codes from the file you're resolving, and at codes where column N is 'FALSE' (the columns don't match).
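
If you'd rather filter programmatically than in a spreadsheet app, a sketch (treating 'N' as the literal column header and 'file' as a hypothetical column name; check the actual headers first):

    import pandas as pd

    df = pd.read_csv("resolving_disagreements_Feb_2025_HI.csv")

    # Depending on how pandas parses the sheet, the match column may
    # hold booleans or the strings "TRUE"/"FALSE"; normalize first.
    matches = df["N"].astype(str).str.upper() == "TRUE"
    mine = df[(df["file"] == "VI_123_456") & ~matches]
    print(mine)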

2) Check out a copy of your file using the parallel annotation instructions

3) Open up the .eaf file in

sox4.university.harvard.edu/Fas-Phyc-PEB-Lab/VIHI/SubjectFiles/LENA/annotations-progress/[SubjID] Your Name

and fix the codes to match your final decision. Do not change anything other than these points of disagreement!

Save and close the file.

4) Reassign the task to Lilli so she can commit the changes.
