These are the guidelines I follow for coding and transcribing my production studies.
For each participant, there are three important items that you need to have.
- The `production` file, found with the participant's `raw-data`.
- The `WAV` files, found in a folder with the `raw-data`.
- The data sheet, a hand-written sheet of paper from the experiment.
- Make a copy of the participant's `production` file and put it in the `processed-data/production` folder.
- Make a copy of the participant's `rating` file and put it in the `processed-data/rating` folder.
Here is an example of what the processed data folder should look like. This is the processed data folder for the `0104-inconinput-1day-pluralmorph-6733-training` experiment.
This ensures that the raw data itself is never touched. There is no need to make copies of the audio files, as you will not be altering them and they are not used for data analysis.
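The copy step can also be scripted. Below is a minimal sketch, assuming Python 3 and the folder names used above. The function name `copy_to_processed` and the file name in the usage example are hypothetical and not part of any existing lab tooling.

```python
import shutil
from pathlib import Path


def copy_to_processed(raw_file, experiment_dir, kind):
    """Copy one raw file into processed-data/<kind>/, leaving the original alone.

    kind is "production" or "rating" (the two folders named above).
    """
    if kind not in ("production", "rating"):
        raise ValueError(f"unknown kind: {kind!r}")
    raw_file = Path(raw_file)
    dest_dir = Path(experiment_dir) / "processed-data" / kind
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / raw_file.name
    if dest.exists():
        # Refuse to overwrite: an existing copy may already contain coding work.
        raise FileExistsError(f"{dest} already exists")
    shutil.copy2(raw_file, dest)  # copy2 also preserves file timestamps
    return dest
```

For example, `copy_to_processed("raw-data/p01_production.csv", ".", "production")` would place a copy under `processed-data/production/`. The raw file is only ever read, never modified.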
Open the COPY you just made of the `production` data file in the `processed-data` folder for the experiment.
Updated 2016-11-08
Add 10 columns: `transcription`, `corrVerb`, `corrNoun`, `actualDet`, `categoryDet`, `notes`, `exp.feedback`, `exp.prompt.verb`, `exp.prompt.noun`, and `exp.prompt.det`.
![](/assets/Screenshot 2016-11-08 13.06.02.png)
- 0167-empiricalyang-9noun-hfrule-adults-fastproduction
  - add 3 additional columns: `finished.verb`, `finished.noun`, `finished.det`
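If you would rather add the columns with a script than by hand, the sketch below shows one way to do it. It assumes the production file is a CSV, which this page does not state, so adapt it to the real format. The function name `add_coding_columns` is hypothetical.

```python
import csv
import io

CODING_COLUMNS = [
    "transcription", "corrVerb", "corrNoun", "actualDet", "categoryDet",
    "notes", "exp.feedback", "exp.prompt.verb", "exp.prompt.noun",
    "exp.prompt.det",
]
# Extra columns used only for the fastproduction experiment.
FAST_PRODUCTION_COLUMNS = ["finished.verb", "finished.noun", "finished.det"]


def add_coding_columns(csv_text, fast_production=False):
    """Return csv_text with empty coding columns appended to every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    new_cols = CODING_COLUMNS + (FAST_PRODUCTION_COLUMNS if fast_production else [])
    fieldnames = list(reader.fieldnames or [])
    # Skip columns that already exist so the function is safe to re-run.
    fieldnames += [c for c in new_cols if c not in fieldnames]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, restval="",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)  # missing new columns are filled with ""
    return out.getvalue()
```

Run this only on the copy in `processed-data`, never on the raw file.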
You will fill in each of these columns with the following for each trial:
- `transcription`: what the participant actually said
- `corrVerb`: did they produce the correct verb? (0 for no, 1 for yes)
- `corrNoun`: did they produce the correct noun? (0 for no, 1 for yes)
- `actualDet`: what marker (determiner) did they produce? (write it exactly)
- `categoryDet`: what category does the marker belong to?
  - for inconinput the options are `maj`, `min`, `other`, and `null`
  - for yang the options are `R`, `e`, `other`, and `null`
- `notes`: write any notes you feel would help
  - for example: the child produced `ka` as `ko`; transcriptions are missing; etc.
- `exp.feedback`: did the experimenter provide any feedback on the trial? (0 for no, 1 for yes)
  - for example: "great job!". Count any feedback no matter how small.
- `exp.prompt.verb`: did the experimenter prompt the participant for the verb? (0 for no, 1 for yes)
  - for example: "it starts gentif..."
- `exp.prompt.noun`: did the experimenter prompt the participant for the noun? (0 for no, 1 for yes)
  - for example: "it was gentif mawg"
- `exp.prompt.det`: did the experimenter prompt the participant for the determiner? (0 for no, 1 for yes)
  - for example: "do you want to add one of the endings you learned?"
- 0167-empiricalyang-9noun-hfrule-adults-fastproduction
  - `finished.verb`: did the participant finish saying the verb before the beep? (0 for no, 1 for yes)
  - `finished.noun`: did the participant finish saying the noun before the beep? (0 for no, 1 for yes)
  - `finished.det`: did the participant finish saying the determiner before the beep? (0 for no, 1 for yes)
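Because most of these columns accept only a small set of values, a quick automated check can catch typos before analysis. The following is a rough sketch, not an existing tool. It assumes each trial has been read into a dictionary keyed by column name, and the function name `check_row` is hypothetical.

```python
BINARY_COLUMNS = [
    "corrVerb", "corrNoun", "exp.feedback",
    "exp.prompt.verb", "exp.prompt.noun", "exp.prompt.det",
]
FINISHED_COLUMNS = ["finished.verb", "finished.noun", "finished.det"]
CATEGORY_OPTIONS = {
    "inconinput": {"maj", "min", "other", "null"},
    "yang": {"R", "e", "other", "null"},
}


def check_row(row, study, fast_production=False):
    """Return a list of problems found in one coded trial (empty list = OK)."""
    problems = []
    binary = BINARY_COLUMNS + (FINISHED_COLUMNS if fast_production else [])
    for col in binary:
        # Every yes/no column must hold exactly 0 or 1.
        if str(row.get(col, "")).strip() not in ("0", "1"):
            problems.append(f"{col} must be 0 or 1")
    allowed = CATEGORY_OPTIONS[study]
    if str(row.get("categoryDet", "")).strip() not in allowed:
        problems.append(f"categoryDet must be one of {sorted(allowed)}")
    return problems
```

An empty list means the row passed. Free-text columns such as `transcription`, `actualDet`, and `notes` are deliberately not checked here.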
Here is an example with some of the trials filled in.
![](/assets/Screenshot 2016-11-08 13.05.43.png)
Listen to the `WAV` file for the trial in Audacity. Write what the participant said, word-for-word, in quotes. For example, `"gentif mawg ka. [Did I say it right?]"`. If the participant goes off topic, transcribe that part in brackets `[ ]`. If the participant goes off topic for a really long time (this happens often with children), just write `[child talking...]` in the brackets, or `[teacher interrupted...]`. There is no need to write every single word the participant says when it is not related to the task.
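One practical benefit of the bracket convention is that off-topic speech can be separated from the task-related words later. As an illustration only (this helper is hypothetical and not part of the coding procedure), a short Python function can strip the bracketed parts:

```python
import re

# Matches one bracketed aside such as "[child talking...]".
ASIDE = re.compile(r"\[[^\]]*\]")


def task_speech(transcription):
    """Drop bracketed off-topic asides, keeping only the task-related words."""
    without_asides = ASIDE.sub(" ", transcription)
    # Collapse the whitespace left behind by the removed asides.
    return " ".join(without_asides.split())
```

This only works if every off-topic stretch is enclosed in a matching pair of brackets, so close each bracket you open.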
If the participant says the same word twice, such as `gentif gentif mawg ka`, there are two things you can do. First, check the participant's data-sheet to see how the experimenter recorded the trial. The data-sheet is considered to be the most accurate record of the trial. Second, use your best judgment during the transcription. For example, the participant may have just stuttered, as in `gentif, um, gentif mawg ka`. Interpret whether the participant intended to say the same word twice for the trial as best you can. Make a note of this in the `notes` column.
The data-sheet is considered to be the most accurate record of the trial. When there is a conflict, go with what is written on the data-sheet. Make a note of the conflict in the `notes` column.
- If the production file is missing, check inside the experiment folder on the local computer (in the `data` folder). If you cannot find it anywhere, make a note on the `subject tracking sheet` and recommend the subject for exclusion. No further transcription or coding is necessary.
- If any or all sound files are missing, check inside the experiment folder on the local computer (in the `data` folder). If you cannot find them anywhere, make a note on the `production` file. Use the `data-sheet` to fill in the `transcription` column.
- If the data sheet is missing, make a note on the `production` file. Use the `WAV` files to fill in the `transcription` column.
- If both the data sheet and the `WAV` files are missing, make a note on the `subject tracking sheet` and recommend the subject for exclusion. No further transcription or coding is necessary in this case.
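The missing-data rules above amount to a small decision table. The sketch below restates them in Python as a memory aid. The function name and the wording of the returned strings are my own shorthand, and it assumes you have already searched the `data` folder on the local computer.

```python
def missing_data_action(has_production, has_wav, has_datasheet):
    """Map which items exist for a participant to the next step.

    has_wav should be False if any or all sound files are missing.
    """
    if not has_production:
        return "note on subject tracking sheet; recommend exclusion"
    if not has_wav and not has_datasheet:
        return "note on subject tracking sheet; recommend exclusion"
    if not has_wav:
        return "note on production file; transcribe from data-sheet"
    if not has_datasheet:
        return "note on production file; transcribe from WAV files"
    # Nothing missing: transcribe as usual; the data-sheet wins any conflict.
    return "transcribe from WAV files; check against data-sheet"
```

A missing production file is checked first because nothing can be coded without it, whatever else is available.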
Here you will put the category of the marker that the participant used. There are typically four options. For inconinput, the options are:

- `maj`: the majority marker (depends on language)
- `min`: the minority marker (depends on language)
- `other`: some other word (e.g. English +s, any other word)
- `null`: no marker used

For yang, the options are:

- `R`: the regular form (`ka`)
- `e`: any exceptional form (depends on language)
- `other`: some other word (e.g. English +s, any other word)
- `null`: no marker used

If you are uncertain how to code something, just ask me in person.
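For clear-cut trials, the mapping from `actualDet` to `categoryDet` is mechanical, as the sketch below shows. It rests on several assumptions: an empty `actualDet` means no marker was produced, markers are compared after trimming and lower-casing, and the majority, minority, and exceptional markers are passed in by you, because they depend on the language. The function names are hypothetical. Ambiguous productions still need a human decision.

```python
def categorize_inconinput(actual_det, maj_marker, min_marker):
    """Return maj, min, other, or null for an inconinput trial."""
    det = actual_det.strip().lower()
    if det == "":
        return "null"  # assumption: empty actualDet means no marker used
    if det == maj_marker:
        return "maj"
    if det == min_marker:
        return "min"
    return "other"


def categorize_yang(actual_det, exceptional_forms, regular="ka"):
    """Return R, e, other, or null for a yang trial."""
    det = actual_det.strip().lower()
    if det == "":
        return "null"  # assumption: empty actualDet means no marker used
    if det == regular:
        return "R"
    if det in exceptional_forms:
        return "e"
    return "other"
```

For instance, with a made-up exceptional form `po`, `categorize_yang("ka", {"po"})` gives `R` and `categorize_yang("s", {"po"})` gives `other`.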
The participant must be finished with the entire word for it to count in the `finished.verb`, `finished.noun`, or `finished.det` categories.