The review from Cognitive Science is synthesized into 6 main points in the Cog. Sci. Review wiki page. That page also contains links to the issues tracking each of the 6 points, and reviewer-by-reviewer syntheses with more details.
Things that have changed
Here is the list of things that have changed, for use in the cover letter.
A number of values in the paper have changed (the tracking issue for that is Fix counts #12)
Clustering coefficient values, since FA link weights are now taken into account in their computation (so they are computed on the undirected weighted graph)
An update in the language detection module has changed the cluster filtering a little, giving us a few more clusters and quotes than before. As a result of this, the number of words coded by Word Frequency has also changed (since frequency of words is computed on the filtered data set). (Details in Fix counts #12.)
The discovery of three bugs, and the improvement of substitution filtering, led us to gain many more substitutions than before (again, details in Fix counts #12). All in all, the code is now far more reliable, as it has unit tests covering nearly all of it. Language, cluster, and substitution filtering are also validated by precision/recall analyses.
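As an aside on the first point above, computing a clustering coefficient on a weighted undirected graph can be sketched as follows. This is a toy illustration only: the words, weights, and the Onnela-style geometric-mean definition are assumptions for the sketch, not necessarily the exact definition used in the paper.

```python
from itertools import combinations

# Toy undirected weighted graph standing in for the free-association (FA)
# network; the words and weights are invented for illustration.
edges = {
    frozenset({"cat", "dog"}): 0.8,
    frozenset({"cat", "bone"}): 0.2,
    frozenset({"dog", "bone"}): 0.5,
    frozenset({"dog", "leash"}): 0.4,
}

def weighted_clustering(edges, node):
    """Onnela-style weighted clustering coefficient: geometric mean of
    max-normalised edge weights over each triangle around `node`."""
    w_max = max(edges.values())
    neighbours = {v for e in edges if node in e for v in e if v != node}
    k = len(neighbours)
    if k < 2:
        return 0.0
    total = 0.0
    for u, v in combinations(sorted(neighbours), 2):
        if frozenset({u, v}) in edges:  # u, v, node form a triangle
            product = (edges[frozenset({node, u})] / w_max
                       * edges[frozenset({node, v})] / w_max
                       * edges[frozenset({u, v})] / w_max)
            total += product ** (1 / 3)
    return 2 * total / (k * (k - 1))

print(round(weighted_clustering(edges, "cat"), 3))
```

Unlike the unweighted coefficient (which would be 1.0 for "cat", since its two neighbours are linked), the weighted version discounts triangles whose links are weak.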
Introduction has been rewritten from scratch to better explain our goals (→ synthesis point 1). It should be clearer and easier to follow thanks to more examples.
Related work has also been mostly rewritten to incorporate the literature we had missed in the first submission (→ synthesis point 2). In particular, the psycholinguistics work on lists of words has been thoroughly reviewed, and work on iterated learning experiments (Kirby) has been integrated into the discussion
The overall writing and phrasing in the whole paper has received a lot of attention (→ synthesis point 1, and criticisms of bad writing)
The initial parts of Methods have been expanded to better explain some choices which seemed arbitrary (→ synthesis points 4 and 6)
The set of features used has been expanded (with orthographic and phonological neighbourhood densities, and number of letters), and the way features are selected has been greatly improved and rationalized (→ synthesis point 3)
The demand for word-word metrics (rev. 2) is partially met by the addition (further down) of H00, and by a short discussion of the distances travelled by substitutions
The section on Substitution model has been expanded to better explain the work done, and show the robustness of results (→ synthesis point 4)
The possible bias from focusing on single-substitutions has been addressed by extending substitution models to the two-substitution case. The results are unchanged (and available in the code repository). (→ synthesis point 6)
Susceptibility has been much better defined, with respect to a null hypothesis. It is indeed not a probability of substitution (as questioned by rev. 1), and now reflects a bias w.r.t. random picking of targets. As a result, our conclusions for that measure have also been updated. A section was also added to analyse POS susceptibilities (→ synthesis point 3)
Variation is now compared to an additional null hypothesis, H00, based on random selection among the synonyms of the disappearing word. The interaction between features is also (partly) addressed with an all-feature regression. (→ synthesis point 3)
Both susceptibility and variation have been extended to analyse Sentence context (→ synthesis point 3)
A whole Discussion section has been added to recontextualise the results (→ synthesis points 1 and 2)
The (indeed exaggerated) claims about "convergence" have been revised (→ synthesis point 5)
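The H00 null hypothesis mentioned above can be sketched in a few lines. The synonym table and function name below are hypothetical stand-ins (e.g. for a WordNet-style lookup); the only assumption taken from the text is that the null picks uniformly among the disappearing word's synonyms.

```python
import random

# Hypothetical synonym lookup standing in for a WordNet-style resource.
SYNONYMS = {
    "big": ["large", "huge", "great"],
    "fast": ["quick", "rapid", "swift"],
}

def h00_substitution(disappearing_word, rng=random):
    """H00 null model: the appearing word is drawn uniformly at random
    from the synonyms of the disappearing word."""
    return rng.choice(SYNONYMS[disappearing_word])

print(h00_substitution("big", random.Random(0)))
```

Observed substitution behaviour can then be compared against samples from this null to quantify any bias.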
Things we did not do
Cross-feature interactions are combinatorially explosive, and not the goal of our work. We explored many directions to little avail, and what works is shown in the paper. In particular:
PCA (with or without imputation of missing values) gives hard-to-interpret results
ANOVA explodes combinatorially (between global feature values, sentence-relative feature values, and all their interactions), and there is no guiding question to reduce the dimensionality
Regression of susceptibility gives very unreliable results (because the constraints of the problem don't fit the model)
Regression of variation does give some insight, and is what we show in the paper
We didn't try to do word-based exact predictions (i.e. without features). This could have been (a) predicting which word is substituted, or (b) predicting which word appears instead. (a) follows from the association strengths between words in the initial sentence and the word predicted by (b), but (b) is a research program in itself:
Our data set is not well suited to computing LSA/LDA because it contains groups of very similar documents, so the extracted associations will most likely reflect this, i.e. they will hold between words in the same quotation families. That is not informative for substitutions (we want associations from other families to inform the family we look at).
Even in controlled settings and on lists of random words (i.e. lists not designed to trigger intrusions as in the Deese-Roediger-McDermott paradigm, but still with no syntax involved), the state of the art does not predict the new word (Zaromb et al. 2006); instead it predicts the list the new word comes from. Now (b) means predicting the new word in real-world sentences, so it is two big jumps beyond what exists.
The data is again badly structured for prediction, since there are only a few measurements on many varied cases (each case, i.e. each source sentence, has one prediction, and there are only a few measurements per source sentence), instead of many measurements on a few cases, making prediction prone to error. This is explained in the paper.