Coding for different types of child-directed speech

Instructions for tagging different types of child-directed speech. These instructions should be used in tandem with the ACLEW tutorials.

Overview

The "child addressee" of an utterance is the child who is being talked to. We code the child addressee because we are interested in learning more about the properties of speech directed to the target child vs. another child. This classification scheme is part of a larger study that examines how child-directed speech and different types of overheard speech (i.e., adult-directed vs. other-child-directed speech) support early language development.

Every utterance marked as "C" on every xds@SPEAKER tier will be copied to a new tier, called cds@SPEAKER. The cds tier type has the following closed vocabulary:

T: speech exclusively directed to the Target child (CHI)
K: speech directed to Kin, or Kids other than the Target child
M: speech directed to both the Target child and other kid(s)
X: Uncertain child-directed speech
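
Because the vocabulary is closed, annotation values can be checked mechanically for typos such as lowercase letters or empty cells. The snippet below is a minimal sketch of such a check, not part of the lab's pipeline; the function name and the example values are hypothetical.

```python
# Closed vocabulary for the cds tier, as listed above.
CDS_CODES = {
    "T": "speech exclusively directed to the Target child (CHI)",
    "K": "Kin, or Kids other than the Target CHI",
    "M": "speech directed to both the Target child and other kid(s)",
    "X": "uncertain child-directed speech",
}


def invalid_cds_values(values):
    """Return the annotation values that are not in the closed vocabulary."""
    return [value for value in values if value not in CDS_CODES]


# Hypothetical values as they might be typed on a cds tier.
example_values = ["T", "K", "t", "M", "", "X"]
print(invalid_cds_values(example_values))
```

Here the lowercase "t" and the empty value would be flagged, since only the four uppercase codes are valid.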

Coders are encouraged to use two types of information in making a decision about who the child addressee is:

1) contextual information, such as what is said, the use of someone's name, the conversation topic, and whether the addressee responded to what was said.

2) recognizable acoustic or prosodic markers. The target infants in the SEEDLinGs corpus are between 6 and 17 months old. For this project, we are transcribing files for a subset of SEEDLinGs infants who have a sibling in their household. The siblings are, on average, 4.8 years older than the target child (range: 0-12 years of age). Thus, compared to speech directed to siblings, recognizable acoustic markers of speech directed to the target child can include higher-pitched voices, sing-song-like intonation, simpler speech, and diminutives.

Addressee coding

Target child only (T)

Code the cds tier as T when the target child is the exclusive addressee of the utterance. One type of information that you can use is the loudness of speech. If it sounds like the talker is speaking directly into the microphone, then the addressee is most likely the target child.

Other child (K)

Code the cds tier as K when an utterance is addressed to a non-target child. Sometimes there are clear clues that indicate the speech is addressed to a non-target child. For instance, utterances addressed to an older sibling are typically longer and have less variation in pitch. Other features of speech addressed to another child, particularly an older sibling, might include discussing past or future events, asking open-ended questions, or instructing the child to do something. Additionally, if there is a clear back-and-forth conversation between a speaker and the other child(ren), you can assume the intended recipient of the utterance is the other child.

Both target child and other child(ren) (M)

Code the cds tier as M when an utterance is addressed to both the target child and other child(ren). Sometimes there is a clear sign that both children are being addressed (e.g., "are you guys hungry"). Other times, it is harder to decide who an utterance is meant for when there is a mix of children. Ask yourself how many children are participating in an interaction. If both children are participating in an interaction, you can assume that the speech is being addressed to multiple children simultaneously. For example, if two children are bickering and a speaker asks "what happened?", you can assume that the speech is directed to both children.

Uncertain child directed speech (X)

Code the cds tier as X when no classification can be made. Use this category when the speech is not obviously directed at any child in particular. This category should be used sparingly.

Tips for choosing the intended child recipient of an utterance:

Consider who is in the room. You can use the "Notes and Cast of Characters" doc found in SEEDLinGs subject files to get an idea of who is in the room. (Remember: the eaf files are in OvS_subID_subMo format, in which the second code refers to the subject's SEEDLinGs subject ID.)
Use contextual cues to determine the topic of conversation and who the conversation might be relevant for.
Consider whether the speaker is imitating another speaker or having a back-and-forth conversation with another speaker.
Use the loudness of the speech. Sometimes (though not always), if the speaker sounds like they are talking directly into the microphone (e.g., the speech is easy to understand), then the intended addressee is the target child. This isn't always a reliable cue, so again, use context in addition to loudness to disambiguate who the intended addressee is.

CDS Training

Getting Started (Round 1)

Go to the folder found at: /Volumes/Fas-Phyc-PEB-Lab/OvSpeech/Training/Training_CDS
Inside this folder, create your "coder_XX" folder (replace XX with your initials)
Navigate to this folder: /Volumes/Fas-Phyc-PEB-Lab/OvSpeech/Training/Training_CDS/Training_Templates/Round 1
- Double click to open the eaf file >> File >> Save As >> Save with your initials at the end (e.g. CDS_001_Round1-MI.eaf)
When you open the eaf, you will notice that it has already been transcribed and annotated following the ACLEW guidelines. For this training, we are only focusing on correctly coding the type of child-directed utterance. Using the "Overview" and "Addressee coding" sections, you will annotate all utterances marked as "C" on every xds@Speaker tier.
1. First, check if the xds@Speaker tier contains any C's. If it does, we need to add the cds@Speaker tier
2. Go to Tier > Add New Tier...
  1. Tier Name: cds@Speaker (change Speaker to Speaker Tier Name that contains a C; ex: cds@FA1)
  2. Participant: Speaker (e.g., FA1)
  3. Parent Tier: xds@Speaker Tier (e.g., xds@FA1)
  4. Tier Type: cds
  5. Press Add
    Example of information to fill for Add Tier
3. Go to Tier > Copy Annotations from Tier to Tier…
  1. Select the Speaker-xds@Speaker tier >> click "Next"
  2. Select the cds tier as the destination tier
  3. Select "Annotations where the value is" + insert "C"
  4. Select "Treat as regular expression"
  5. Click Finish
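
Step 1 above (checking which xds@Speaker tiers contain a "C") can also be sketched in code, since an eaf file is XML in which each TIER element has a TIER_ID attribute and holds its annotation text in ANNOTATION_VALUE elements. The following is a minimal sketch under that assumption and is no substitute for the ELAN steps; the function name and the sample tiers are hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny stand-in for an .eaf file (hypothetical tier names and values).
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIER TIER_ID="xds@FA1">
    <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a1" ANNOTATION_REF="a0">
      <ANNOTATION_VALUE>C</ANNOTATION_VALUE></REF_ANNOTATION></ANNOTATION>
    <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a3" ANNOTATION_REF="a2">
      <ANNOTATION_VALUE>A</ANNOTATION_VALUE></REF_ANNOTATION></ANNOTATION>
    <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a5" ANNOTATION_REF="a4">
      <ANNOTATION_VALUE>C</ANNOTATION_VALUE></REF_ANNOTATION></ANNOTATION>
  </TIER>
  <TIER TIER_ID="xds@MA1">
    <ANNOTATION><REF_ANNOTATION ANNOTATION_ID="a7" ANNOTATION_REF="a6">
      <ANNOTATION_VALUE>A</ANNOTATION_VALUE></REF_ANNOTATION></ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>"""


def xds_tiers_with_c(root):
    """Map each xds@ tier ID to its number of annotations valued exactly 'C'."""
    counts = {}
    for tier in root.iter("TIER"):
        tier_id = tier.get("TIER_ID", "")
        if not tier_id.startswith("xds@"):
            continue
        n_c = sum(
            1
            for value in tier.iter("ANNOTATION_VALUE")
            if (value.text or "").strip() == "C"
        )
        if n_c:
            counts[tier_id] = n_c
    return counts


# For a real file, use ET.parse(path_to_eaf).getroot() instead of fromstring.
print(xds_tiers_with_c(ET.fromstring(SAMPLE_EAF)))
```

Each tier ID returned by the function is a speaker that needs a matching cds@Speaker tier; in the sample, only xds@FA1 qualifies.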

During CDS Training (Round 1)

Listen to the sound clip first. Then, listen to each child-directed (C) utterance and assign it a value of T, K, M, or X.
Save your file frequently as you work through the coding.

After CDS Training (Round 1)

Let Michika know on Slack when you are done
Michika will go through your Round 1 coding and provide feedback as appropriate. Once you get an ok (and have discussed with LMs/one of the senior researchers if needed), move on to Rounds 2 and 3 using the same steps as Round 1!

After CDS Training (Round 2 & 3)

Create a mastersheet combining the cds from both rounds:
1. Navigate to either the Round 2 or 3 eaf
2. Go to File >> Export Multiple Files As >> Tab-delimited Text...
3. Click New Domain...
4. Click Add File...
5. Click Add File...
6. Open your annotated eaf file for Round 2
7. Open your annotated eaf file for Round 3
8. Press Ok
9. In the export tier(s) as tab-delimited panel, select every cds tier
10. In output options, select the following options:
  1. Include file name column
  2. Include time column for:
    Begin Time
    End Time
    Duration
  3. Include time format: msec
11. Press OK and save as CDS_001_Rounds_2_3_coder.csv in your initialed folder under /Volumes/Fas-Phyc-PEB-Lab/OvSpeech/Training/Training_CDS

Let Michika know on Slack when you are done
Michika will go through your Round 2 & 3 coding and provide feedback as appropriate
1. Navigate to this spreadsheet: /Volumes/Fas-Phyc-PEB-Lab/OvSpeech/Training/Training_CDS/oCDS_GS_reliability_score.xlsx
2. Navigate to the "GS reliability overview" tab in which the following information will be provided:
  1. coder_GS column = gold standard cds annotations
  2. coder_YourInitial column = your cds annotations
  3. agreement_YourInitial column = whether your cds annotations match (equal) or differ (error)
3. After taking a look at the errors:
  1. Navigate to the "GS_coder_explanation" tab
  2. For each error, provide an explanation for your code of choice under the "explanation_YourInitial" column
4. Let Michika know on Slack when you are done
  1. Read through the "GS_explanation_answer_choice" tab and compare your explanations with the GS explanations. Discuss these differences with LMs/one of the senior researchers if the GS explanations do not make sense
5. If all "errors" are corrected, you are CDS trained!
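
The equal/error comparison described for the agreement column can be illustrated with a short sketch. It assumes that agreement is a simple exact match between the gold standard code and the coder's code for each utterance; the actual spreadsheet formula is not shown here, and the function name and example codes are hypothetical.

```python
def agreement_labels(gold_standard, coder):
    """Label each utterance 'equal' if the two codes match, else 'error'."""
    if len(gold_standard) != len(coder):
        raise ValueError("Both lists must cover the same utterances.")
    return ["equal" if gs == own else "error"
            for gs, own in zip(gold_standard, coder)]


# Hypothetical codes for five utterances.
coder_gs = ["T", "K", "M", "T", "X"]
coder_xx = ["T", "T", "M", "T", "K"]

labels = agreement_labels(coder_gs, coder_xx)
percent_agreement = 100 * labels.count("equal") / len(labels)
print(labels)
print(percent_agreement)
```

In this made-up example, the second and fifth utterances would be marked as errors and would each need an explanation in the "GS_coder_explanation" tab.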

Let me know if you have any questions about the instructions or the process before/during/after, as these will hopefully be shared with random internet strangers, and therefore all feedback is useful!

