The review from Cognitive Science is synthesized into 6 main points in the Cog. Sci. Review wiki page. That page also contains links to the issues tracking each of the 6 points, and reviewer-by-reviewer syntheses with more details.
Things that have changed
Here is the list of things that have changed, for use in the cover letter.
A number of values in the paper have changed (the tracking issue for that is Fix counts #12)
Clustering coefficient values, since FA link weights are now taken into account in their computation (so they are computed on the undirected weighted graph)
An update in the language detection module has changed the cluster filtering a little, giving us a few more clusters and quotes than before. As a result of this, the number of words coded by Word Frequency has also changed (since frequency of words is computed on the filtered data set). (Details in Fix counts #12.)
The discovery of three bugs, and the improvement of substitution filtering, led us to gain many more substitutions than before (again, details in Fix counts #12). All in all, the code is now far more reliable, as it has unit tests covering nearly all of it. Language, cluster, and substitution filtering are also validated by precision/recall analyses.
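As an aside on the first point above, computing a clustering coefficient on a weighted undirected graph can be sketched as follows. This is a toy illustration only: the words, weights, and the Onnela-style geometric-mean definition are assumptions for the sketch, not necessarily the exact definition used in the paper.

```python
from itertools import combinations

# Toy undirected weighted graph standing in for the free-association (FA)
# network; the words and weights are invented for illustration.
edges = {
    frozenset({"cat", "dog"}): 0.8,
    frozenset({"cat", "bone"}): 0.2,
    frozenset({"dog", "bone"}): 0.5,
    frozenset({"dog", "leash"}): 0.4,
}

def weighted_clustering(edges, node):
    """Onnela-style weighted clustering coefficient: geometric mean of
    max-normalised edge weights over each triangle around `node`."""
    w_max = max(edges.values())
    neighbours = {v for e in edges if node in e for v in e if v != node}
    k = len(neighbours)
    if k < 2:
        return 0.0
    total = 0.0
    for u, v in combinations(sorted(neighbours), 2):
        if frozenset({u, v}) in edges:  # u, v, node form a triangle
            product = (edges[frozenset({node, u})] / w_max
                       * edges[frozenset({node, v})] / w_max
                       * edges[frozenset({u, v})] / w_max)
            total += product ** (1 / 3)
    return 2 * total / (k * (k - 1))

print(round(weighted_clustering(edges, "cat"), 3))
```

Unlike the unweighted coefficient (which would be 1.0 for "cat", since its two neighbours are linked), the weighted version discounts triangles whose links are weak.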
Introduction has been rewritten from scratch to better explain our goals (→ synthesis point 1). It should be clearer and easier to follow thanks to more examples.
Related work has also been mostly rewritten to incorporate the literature we had missed in the first submission (→ synthesis point 2). In particular, the psycholinguistics work on lists of words has been thoroughly reviewed, and work on iterated learning experiments (Kirby) has been integrated into the discussion
The overall writing and phrasing in the whole paper has received a lot of attention (→ synthesis point 1, and criticisms of bad writing)
The initial parts of Methods have been expanded to better explain some choices which seemed arbitrary (→ synthesis points 4 and 6)
The set of features used has been expanded (with orthographic and phonological neighbourhood densities, and number of letters), and the way features are selected has been greatly improved and rationalized (→ synthesis point 3)
The demand for word-word metrics (rev. 2) is partially met by the addition (further down) of H00, and by a short discussion of the distances travelled by substitutions
The section on Substitution model has been expanded to better explain the work done, and show the robustness of results (→ synthesis point 4)
The possible bias from focusing on single-substitutions has been addressed by extending substitution models to the two-substitution case. The results are unchanged (and available in the code repository). (→ synthesis point 6)
Susceptibility has been much better defined, with respect to a null hypothesis. It is indeed not a probability of substitution (as questioned by rev. 1), and now reflects a bias w.r.t. random picking of targets. As a result, our conclusions for that measure have also been updated. A section was also added to analyse POS susceptibilities (→ synthesis point 3)
Variation is now compared to an additional null hypothesis, H00, based on random selection among the synonyms of the disappearing word. The interaction between features is also (partly) addressed with an all-feature regression. (→ synthesis point 3)
Both susceptibility and variation have been extended to analyse Sentence context (→ synthesis point 3)
A whole Discussion section has been added to recontextualise the results (→ synthesis points 1 and 2)
The (indeed exaggerated) claims about "convergence" have been revised (→ synthesis point 5)
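The H00 null hypothesis mentioned above can be sketched in a few lines. The synonym table and function name below are hypothetical stand-ins (e.g. for a WordNet-style lookup); the only assumption taken from the text is that the null picks uniformly among the disappearing word's synonyms.

```python
import random

# Hypothetical synonym lookup standing in for a WordNet-style resource.
SYNONYMS = {
    "big": ["large", "huge", "great"],
    "fast": ["quick", "rapid", "swift"],
}

def h00_substitution(disappearing_word, rng=random):
    """H00 null model: the appearing word is drawn uniformly at random
    from the synonyms of the disappearing word."""
    return rng.choice(SYNONYMS[disappearing_word])

print(h00_substitution("big", random.Random(0)))
```

Observed substitution behaviour can then be compared against samples from this null to quantify any bias.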
Things we did not do
Cross-feature interactions are combinatorially explosive, and not the goal of our work. We explored many directions to little avail, and what works is shown in the paper. In particular:
PCA (with or without imputation of missing values) gives hard-to-interpret results
ANOVA explodes combinatorially (between global feature values, sentence-relative feature values, and all their interactions), and there is no guiding question to reduce the dimensionality
Regression of susceptibility gives very unreliable results (because the constraints of the problem don't fit the model)
Regression of variation does give some insight, and is what we show in the paper
We didn't try to do word-based exact predictions (i.e. without features). This could have been (a) predicting which word is substituted, or (b) predicting which word appears instead. (a) follows from the association strengths between words in the initial sentence and the word predicted by (b), but (b) is a research program in itself:
Our data set is not well suited to computing LSA/LDA because it contains groups of very similar documents, so the extracted associations will most likely reflect this, i.e. they will hold between words in the same quotation families. That is not informative for substitutions (we want associations from other families to inform the family we look at).
Even in controlled settings and on lists of random words (i.e. lists not designed to trigger intrusions as in the Deese-Roediger-McDermott paradigm, but still with no syntax involved), the state of the art does not predict the new word (Zaromb et al. 2006); instead it predicts the list the new word comes from. Now (b) means predicting the new word in real-world sentences, so it is two big jumps beyond what exists.
The data is again badly structured for prediction, since there are only a few measurements on many varied cases (each case, i.e. each source sentence, has one prediction, and there are only a few measurements per source sentence), instead of many measurements on a few cases, making prediction prone to error. This is explained in the paper.