Correct timestamps

During the annotation process, a few timestamps were changed to wrong values. A Python script recorded the locations of these wrong timestamps in a CSV file; each of them now has to be corrected.

Finding the Git repository

To get the code that checks the timestamps, open a terminal (the Terminal app on Mac) and run the following command:

$ git clone git@github.com:BergelsonLab/timestamp_checker.git

You can skip this step if the code already exists on the computer you are working on.

Then, still in the terminal, go to the code directory:

$ cd timestamp_checker

Correcting the errors

Go to https://github.com/BergelsonLab/timestamp_checker/blob/master/checklist.md to see which files have already been processed.

Coordinate with anyone else working on this issue at the same time, so that you know who is processing which files.

Choose a set of files to process (e.g. all the files from one child) [EXCEPT MONTHS 11 AND 12] and open https://github.com/BergelsonLab/timestamp_checker/blob/master/timestamp_error_summary.csv, which lists the locations of the errors in each file.
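If you download the CSV, a short script can narrow it down to the files you chose. The sketch below is only an illustration: the column layout of timestamp_error_summary.csv is not described here, so the code assumes nothing about the columns and simply keeps every row in which some cell starts with a given prefix. The helper name rows_for_prefix and the sample rows are hypothetical.

```python
import csv
import io


def rows_for_prefix(csv_text, prefix):
    """Return the CSV rows in which at least one cell starts with `prefix`."""
    reader = csv.reader(io.StringIO(csv_text))
    return [
        row
        for row in reader
        if any(cell.strip().startswith(prefix) for cell in row)
    ]


# Hypothetical sample; the real file may use different columns.
sample = (
    "file,line,timestamp\n"
    "01_06_sparse_code.cha,120,12345_67890\n"
    "02_07_sparse_code.cha,45,100_200\n"
)

for row in rows_for_prefix(sample, "01_"):
    print(row)
```

To use it on the real file, read the downloaded CSV into a string (for example with open(path, newline="").read()) and pass the prefix of the child you are working on.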

For each file you process, using Clan, open side by side:

  • the sparse_code version Subject_Files/xx/xx_xx/Home_Visit/Coding/Audio_Annotation/xx_xx_sparse_code.cha

  • the lena version Subject_Files/xx/xx_xx/Home_Visit/Processing/Audio_Files/xx_xx.lena.cha

The lena version contains the right timestamps, while the sparse_code version contains potentially wrong timestamps. You have to correct the timestamps in the sparse_code version. DO NOT MODIFY THE LENA FILE.
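To avoid opening the wrong pair of files, the two paths can be built from the same identifiers. This is a minimal sketch that assumes the first xx in the paths above is the subject number and xx_xx is the subject number followed by the month; the function name annotation_paths and the example values are hypothetical.

```python
from pathlib import PurePosixPath


def annotation_paths(subject, month, root="Subject_Files"):
    """Build the sparse_code and lena paths for one visit.

    Assumption: `xx` is the subject number and `xx_xx` is subject_month.
    """
    visit = f"{subject}_{month}"
    base = PurePosixPath(root) / subject / visit / "Home_Visit"
    sparse = base / "Coding" / "Audio_Annotation" / f"{visit}_sparse_code.cha"
    lena = base / "Processing" / "Audio_Files" / f"{visit}.lena.cha"
    return sparse, lena


sparse_path, lena_path = annotation_paths("01", "06")
print(sparse_path)  # the file to correct
print(lena_path)    # the reference file, never to be modified
```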

For each timestamp reported to be problematic:

  • go to the corresponding region of the file (use the word search tool in your editor and paste the timestamp) in both the sparse and the lena versions

  • identify the error:

    • most of the time, a script has replaced the timestamp before a Subregion or Silence comment, and the value has to be reverted to the original lena.cha timestamp; in this case, replace the wrong digits with the correct value shown in the lena file.

    • sometimes digits have been deleted; in this case, correct the indicated line and check the following lines as well, since the error can carry over to them.

    • sometimes annotations have been added on a line with no timestamp; in this case, move the annotations to the line they belong to and delete the extra line.

  • correct the error: look at the corresponding line in the lena version, and change the timestamp in the sparse_code.cha.
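As a cross-check on the manual comparison above, the timestamps of the two versions can be compared programmatically. The sketch below is not the logic of timestamp_checker.py; it assumes that timestamps are stored as start_end in milliseconds between two \x15 delimiter characters (which Clan displays as bullets), and it only reports sparse_code timestamps that never occur in the lena version. The function names and sample lines are hypothetical.

```python
import re

# Assumption: a timestamp is "start_end" (milliseconds) enclosed in two
# \x15 delimiter characters, which Clan shows as bullets.
TIMESTAMP = re.compile(r"\x15(\d+)_(\d+)\x15")


def timestamps(text):
    """Return every (start, end) pair found in the text, in order."""
    return [(int(start), int(end)) for start, end in TIMESTAMP.findall(text)]


def not_in_lena(sparse_text, lena_text):
    """Return the sparse_code timestamps that do not occur in the lena text."""
    reference = set(timestamps(lena_text))
    return [ts for ts in timestamps(sparse_text) if ts not in reference]


# Hypothetical sample lines; the last digit of the second end time is missing.
lena_sample = "*FAN:\t0 . \x150_1000\x15\n*CHN:\t0 . \x151000_2500\x15\n"
sparse_sample = "*FAN:\t0 . \x150_1000\x15\n*CHN:\t0 . \x151000_250\x15\n"

print(not_in_lena(sparse_sample, lena_sample))
```

A timestamp flagged this way still has to be corrected by hand in the sparse_code file, using the lena file as the reference.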

When you are done with all the reported timestamps, run the check script:

$ python timestamp_checker.py path/to/file/that/you/corrected.cha

If all the issues have been solved, it should output:

finish checking xx_xx_sparse_code.cha with 0 errors across different lines and 0 errors on the same lines with 0 error exception
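When several files have been corrected, reading each summary line by eye gets tedious. The sketch below parses a summary line of the form shown above and reports whether every count is zero. Only the all-zero wording is documented here, so the wording for non-zero counts is an assumption, and the function name is_clean is hypothetical.

```python
import re

# Pattern modelled on the documented all-zero summary line; "errors?" is
# a guess at how non-zero counts might be worded.
SUMMARY = re.compile(
    r"finish checking (?P<name>\S+) with (?P<across>\d+) errors? across "
    r"different lines and (?P<same>\d+) errors? on the same lines "
    r"with (?P<exceptions>\d+) error exception"
)


def is_clean(output):
    """Return True if a summary line is found and all of its counts are zero."""
    match = SUMMARY.search(output)
    if match is None:
        return False
    return all(int(match[key]) == 0 for key in ("across", "same", "exceptions"))


line = (
    "finish checking 01_06_sparse_code.cha with 0 errors across different "
    "lines and 0 errors on the same lines with 0 error exception"
)
print(is_clean(line))
```

The output of the check script can be captured with subprocess.run([...], capture_output=True, text=True) and its stdout passed to is_clean.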

Once all the issues are solved, mark the file as checked in https://github.com/BergelsonLab/timestamp_checker/blob/master/checklist.md (click on the pencil, write an 'x' in the corresponding row and click 'Commit changes') and proceed to the next file!
