WordCompare

WordCompare is a small script (by Andrei Amatuni) that takes the audio and video counts and produces a list of potential stimuli.

It lives in our github repository here.

Make sure to keep your local version of the repo current. Instructions on how to do this are here.

To use the program, follow the directions below:

Necessary Files

These should all be placed in the data/input folder within the wordcompare directory.

  1. the most recent list of general words we have audio files for.

    a. the list is a Google spreadsheet, found here. Download it (File -> Download as .tsv) into the data/input folder every time you run the script so it stays up to date, and replace any older copy with the one you just downloaded.

  2. the month specific word list

    a. this list is also a google spreadsheet, found here.

    • Sheet 2, wordlist_for_wordcompare is the one you want for this program, also attached here

  3. the basiclevel checked audio csv file titled in this manner: 01_06_coderSK_audio_checkEB.csv (perfectly rectangular, no extra spaces anywhere, no questions/comments column, as a csv file)

  4. the basiclevel checked video csv file titled in this manner: 01_06_coderSK_video_checkEB.csv (same as above)
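
Since the checked csv files need to be perfectly rectangular with no extra spaces, it can help to test them before loading. The following is a minimal sketch and is not part of wordcompare itself. The function names are hypothetical, the filename pattern is only inferred from the example name above, and "extra spaces" is assumed to mean spaces at the start or end of a cell.

```python
import csv
import io
import re

# Pattern inferred from the example 01_06_coderSK_audio_checkEB.csv.
# The exact naming convention is an assumption; adjust it if yours differs.
NAME_PATTERN = re.compile(
    r"^\d{2}_\d{2}_coder[A-Za-z]+_(audio|video)_check[A-Za-z]+\.csv$"
)


def filename_looks_right(name):
    """Return True if the file name matches the assumed naming pattern."""
    return NAME_PATTERN.match(name) is not None


def find_csv_problems(text):
    """Return a list of problems found in the csv text (empty list if none)."""
    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]
    problems = []
    width = len(rows[0])  # every row should have as many cells as the first
    for number, row in enumerate(rows, start=1):
        if len(row) != width:
            problems.append(
                f"row {number}: {len(row)} columns, expected {width}"
            )
        for cell in row:
            if cell != cell.strip():  # leading or trailing whitespace
                problems.append(f"row {number}: extra space in {cell!r}")
    return problems
```

To use it, read a file from data/input as text and pass the contents to find_csv_problems; an empty list means the file passed both checks.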

Running the Program

  1. once you have the latest version of the code in your local repo, open the terminal or command prompt on your computer.

    a. Make sure you are in the right directory (type cd followed by the path of the folder containing the .py files)

    ~/:$ cd ~/Desktop/Github/SeedlingsBabylab/wordcompare

    b. You can check what directory you're currently in by running "pwd" (print working directory):

    ~/:$ pwd
    /Users/elikab
    ~/:$ cd ~/desktop/github
    ~/desktop/github:$ pwd
    /Users/elikab/desktop/github

    c. to start the GUI program, run this from the wordcompare folder:

    ~/Desktop/Github/SeedlingsBabylab/wordcompare:$ python wordcompare.py

  2. load each of the required files, remembering to select the month for the month-specific file from the drop down.

    • use 'load raw data' for audio and video, NOT 'load audio/video words'

  3. select '15' for the last two columns and click 'find'

  4. make sure there are at least 5 words in the unique audio and unique video columns that don't overlap (otherwise choose a higher number than 15)

  5. click 'export counts' for video and audio and save them in opus as, e.g. 01_06_audiocounts.txt

  6. click 'export' and save the file in the subject folder as, e.g. 01_06_unique.txt

  7. to look at the file, open Excel first and then open the file from within Excel (you may have to click 'enable all files' so Excel can see the .txt). When Excel opens it, it will ask whether the file is fixed width or delimited; click 'delimited' and then 'next', then check 'space' and click 'finish', and it should appear normally.
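
If you would rather not go through Excel, the exported .txt can also be inspected with a few lines of Python. This is a rough sketch under stated assumptions: the only thing known about the export is that it is space-delimited (per step 7), so the reader just splits each line on spaces and makes no claim about what the columns mean. The second helper illustrates the step 4 check, assuming "at least 5 words that don't overlap" means at least 5 in each of the two columns. All function names here are hypothetical.

```python
import csv
import io


def read_space_delimited(text):
    """Split each non-empty line on single spaces, like 'delimited' + 'space'."""
    reader = csv.reader(io.StringIO(text), delimiter=" ")
    return [row for row in reader if row]


def non_overlapping(audio_words, video_words):
    """Return (words only in audio, words only in video) as two sets."""
    audio, video = set(audio_words), set(video_words)
    return audio - video, video - audio


def enough_unique(audio_words, video_words, minimum=5):
    """True if each list has at least `minimum` words absent from the other."""
    audio_only, video_only = non_overlapping(audio_words, video_words)
    return len(audio_only) >= minimum and len(video_only) >= minimum
```

If enough_unique returns False for the words you see in the GUI, that corresponds to the case in step 4 where you should choose a number higher than 15.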

Let me and/or Andrei know if there are any issues!

ignore this part

  1. the audio count file for a visit, generated at the end of the Clan Annotation process

  2. the video count file for a visit, generated at the end of the Datavyu Annotation process
