Please contact me if you need the processed data or the models: xpeng62 at gatech dot edu.
This code accompanies the paper "Inferring the Reader: Guiding Automated Story Generation with Commonsense Reasoning".
- Install the following extra packages:
conda install -c pytorch pytorch
pip install ftfy
First, download the pretrained models from here.
Then untar the file:
tar -xvzf pretrained_models.tar.gz
- Then run the following script to interactively generate arbitrary ATOMIC event effects:
python scripts/interactive/ --model_file pretrained_models/atomic_pretrained_model.pickle
How to use COMeT2020?
| Relations | Human Readable Template |
| --- | --- |
| AtLocation | located or found at/in/on |
| CapableOf | is/are capable of |
| Causes | causes |
| CausesDesire | makes someone want |
| CreatedBy | is created by |
| Desires | desires |
| HasA | has, possesses or contains |
| HasFirstSubevent | BEGINS with the event/action |
| HasLastSubevent | ENDS with the event/action |
| HasPrerequisite | to do this, one requires |
| HasProperty | can be characterized by being/having |
| HasSubEvent | includes the event/action |
| HinderedBy | can be hindered by |
| InstanceOf | is an example/instance of |
| isAfter | happens after |
| isBefore | happens before |
| isFilledBy | blank can be filled by |
| MadeOf | is made of |
| MadeUpOf | made (up) of |
| MotivatedByGoal | is a step towards accomplishing the goal |
| NotDesires | do(es) NOT desire |
| ObjectUse, UsedFor | used for |
| oEffect | as a result, Y or others will |
| oReact | as a result, Y or others feels |
| oWant | as a result, Y or others want |
| PartOf | is a part of |
| ReceivesAction | can receive or be affected by the action |
| xAttr | X is seen as |
| xEffect | as a result, PersonX will |
| xIntent | because PersonX wanted |
| xNeed | but before, PersonX needed |
| xReact | as a result, PersonX feels |
| xReason | because |
| xWant | as a result, PersonX wants |
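
The templates above can be used to turn a (head, relation, tail) triple into a readable sentence. The following is a minimal sketch, not code from this repository: the `TEMPLATES` dictionary copies a few rows of the table, and the `verbalize` helper is a hypothetical name introduced only for illustration.

```python
# A few relation templates copied from the table above (subset for brevity).
TEMPLATES = {
    "xWant": "as a result, PersonX wants",
    "xNeed": "but before, PersonX needed",
    "xIntent": "because PersonX wanted",
    "oReact": "as a result, Y or others feels",
    "AtLocation": "located or found at/in/on",
}


def verbalize(head: str, relation: str, tail: str) -> str:
    """Join head, human-readable template, and tail into one string.

    Hypothetical helper; raises KeyError for relations not listed in TEMPLATES.
    """
    template = TEMPLATES[relation]
    return f"{head} {template} {tail}"


print(verbalize("PersonX buys a ticket", "xWant", "to watch the movie"))
```

Extending the dictionary with the remaining rows of the table would cover every relation.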
How to use GPT-2?
- Run the following command to generate text with GPT-2:
python ./examples/text-generation/ --model_type=gpt2 --length=20 --model_name_or_path=gpt2 --num_return_sequences 5
All related files are in the /cometFilter/ folder.
Remember to add path in
in your machine
- Run the baseline of COMeT filtering: no filtering, two-character interaction story
- Run the code with the new decoder and the new prompt system, single character. Add '-t' to use two chars. Remove '-d' to disable the new decoder.
- --model_name_or_path: str; the language model path.
- --prompt_file: str; the file of first story sentences (txt file), one prompt per line.
- --comet_use: store_true; whether to use our technique. Remove it to use the baseline.
- -f: str; filter level. We use strong_new in the INLG paper.
- --coref_use: store_true; use coreference resolution to remove the third character.
- --use_cond: store_true; use the new prompt style, e.g. * [Char_1] * [Char_1] loves ice cream.
- -l: int; story length. We use 5 or 10.
- -d: store_true; new decoding system.
- --history_use: store_true; use the history when generating a story with the LM.
- --diverse
python -W ignore --model_name_or_path finetune_language_model/finetuned_model/roc_char/21 --prompt_file data_story/test.txt --comet_use -f strong_new --use_cond -l 5 -d --history_use --diverse
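
For readers who want to see how the documented flags fit together, here is a hypothetical `argparse` sketch that mirrors them. It is not the repository's parser: the `dest` names (`filter_level`, `length`, `decoder`) and the treatment of `--diverse` as a boolean switch are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the documented flags; not the repo's code."""
    parser = argparse.ArgumentParser(description="Story generation options (sketch)")
    parser.add_argument("--model_name_or_path", type=str, help="language model path")
    parser.add_argument("--prompt_file", type=str, help="txt file, one prompt per line")
    parser.add_argument("--comet_use", action="store_true", help="use COMeT filtering")
    # Assumption: -f stores the filter level under the name 'filter_level'.
    parser.add_argument("-f", dest="filter_level", type=str, help="filter level")
    parser.add_argument("--coref_use", action="store_true", help="coreference filtering")
    parser.add_argument("--use_cond", action="store_true", help="new prompt style")
    parser.add_argument("-l", dest="length", type=int, help="story length")
    parser.add_argument("-d", dest="decoder", action="store_true", help="new decoder")
    parser.add_argument("--history_use", action="store_true", help="use story history")
    # Assumption: --diverse is a boolean switch, since the command passes no value.
    parser.add_argument("--diverse", action="store_true", help="diverse decoding")
    return parser


# Parse the same arguments as the command above (script name omitted).
args = build_parser().parse_args(
    ["--model_name_or_path", "finetune_language_model/finetuned_model/roc_char/21",
     "--prompt_file", "data_story/test.txt",
     "--comet_use", "-f", "strong_new", "--use_cond", "-l", "5",
     "-d", "--history_use", "--diverse"]
)
print(args.filter_level, args.length, args.decoder)
```

Dropping `--comet_use` from the argument list leaves `args.comet_use` as `False`, which corresponds to the baseline setting.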
- Run the code with the new decoder and the old prompt system. Add '-t' to use two chars.
python -W ignore --model_name_or_path finetune_language_model/finetuned_model/roc_pure --prompt_file data_story/test.txt --comet_use -f strong_new -l 5 -d --history_use --diverse
- Run the baseline for comparison:
python -W ignore --model_name_or_path finetune_language_model/finetuned_model/roc_char/21 --prompt_file data_story/test.txt --use_cond -l 5 --history_use --device_id 1
- Decide the matching criteria pairs with the following command:
python --check
- --check: print the max score of every matching criteria pair.
- Decide the matching criteria from files:
python --file_check
- --file_check: summarize all possible matching criteria pairs in one file.
: file path of the story used for deciding the matching criteria.
: file path for saving the matching criteria.
- Verify the matching criteria with the following command:
python --verify --char 2
- --verify: verify the matching criteria pairs.
- --char: int; number of characters.
| Prompt | Continuation |
| --- | --- |
| xWant | xIntent |
| sentence | xNeed |
| xEffect | sentence |
| CausesDesire | Desires |
| isBefore | isAfter |
| AtLocation | AtLocation |

| Prompt | Continuation |
| --- | --- |
| oReact | xAttr |
| oWant | xIntent |
| oEffect | sentence |
Filter level:
: T or F.
- --filter_level: str.
  - weak: As long as any of the 1st sentence's "effects on others" matches the 2nd sentence's "effects/attr/causes for PersonX", return True.
  - medium: As long as any of the 1st sentence's "effects on others" matches the 2nd sentence's "attr/causes for PersonX", return True.
  - strong: oWant -> xIntent; oEffect -> xNeed; oReact -> xAttr
  - want: oWant -> xIntent
  - need: oEffect -> xNeed
  - effect: oReact -> xAttr
- Relax the match condition to weak when there is no match after 50 rounds.
- --num_matching: int, default is 1; how many matches have to be reached.
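
The filter levels can be read as sets of relation pairs between the first sentence and the second sentence. The sketch below illustrates that reading under several assumptions: inferences are plain dictionaries from relation name to phrases, two phrases "match" only when they are identical strings (the repository uses a similarity score instead), and "weak" is taken to mean any o* relation against any of xIntent, xNeed, xAttr. The medium level is left out because its exact relation set is not spelled out. All function and variable names are hypothetical.

```python
from typing import Dict, List

Inferences = Dict[str, List[str]]  # relation name -> inferred phrases

# Relation pairs per filter level (1st sentence relation, 2nd sentence relation).
LEVEL_PAIRS = {
    "strong": [("oWant", "xIntent"), ("oEffect", "xNeed"), ("oReact", "xAttr")],
    "want": [("oWant", "xIntent")],
    "need": [("oEffect", "xNeed")],
    "effect": [("oReact", "xAttr")],
}
# Assumption: "weak" compares every o* relation with every x* relation above.
LEVEL_PAIRS["weak"] = [
    (o_rel, x_rel)
    for o_rel in ("oWant", "oEffect", "oReact")
    for x_rel in ("xIntent", "xNeed", "xAttr")
]


def count_matches(first: Inferences, second: Inferences, level: str) -> int:
    """Count relation pairs that share at least one identical phrase."""
    hits = 0
    for o_rel, x_rel in LEVEL_PAIRS[level]:
        if set(first.get(o_rel, [])) & set(second.get(x_rel, [])):
            hits += 1
    return hits


def passes_filter(first: Inferences, second: Inferences, level: str = "strong",
                  num_matching: int = 1, rounds: int = 0,
                  max_rounds: int = 50) -> bool:
    """Accept the pair if enough relation pairs match; relax to 'weak'
    once `rounds` reaches `max_rounds` without a match."""
    if rounds >= max_rounds:
        level = "weak"
    return count_matches(first, second, level) >= num_matching


first = {"oWant": ["to say thanks"], "oReact": ["happy"]}
second = {"xIntent": ["to say thanks"], "xAttr": ["kind"], "xNeed": ["happy"]}
print(passes_filter(first, second, level="strong"))
```

Raising `num_matching` makes the filter stricter, since more relation pairs must agree before a continuation is accepted.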
Parser for extracting character names
pip install neuralcoref
python -m spacy download en
- spacy.strings.StringStore size changed error: try installing from source
Similarity check for matching
pip install sentence_transformers
- The similarity file is located at
- It embeds the COMET outputs and calculates a similarity score for matching.
- The default similarity score threshold is 0.8.
Download this weight to the
folder:
curl -O ""
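
To illustrate how a similarity threshold of 0.8 can decide a match, here is a small self-contained sketch. It assumes cosine similarity as the score, which this README does not state, and uses toy vectors in place of real sentence embeddings; `cosine_similarity` and `is_match` are hypothetical names.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_match(a: Sequence[float], b: Sequence[float], threshold: float = 0.8) -> bool:
    """Treat two embedded phrases as matching when the score reaches the threshold."""
    return cosine_similarity(a, b) >= threshold


# Toy vectors standing in for embeddings of two COMET outputs.
print(is_match([0.9, 0.1, 0.0], [1.0, 0.0, 0.0]))
print(is_match([1.0, 1.0], [1.0, 0.0]))
```

In practice the vectors would come from an embedding model such as those in sentence_transformers; only the thresholding step is shown here.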
Run the code with diverse beam search (hamming)
: diverse beam search type, i.e. hamming (#TODO for more options)
- --beam_lamb: the coefficient of the penalty; a larger value means a stronger penalty
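
The role of the penalty coefficient can be illustrated with a generic Hamming diversity step: at a given time step, a token's score is lowered once for every earlier beam group that already picked that token. This is a sketch of the general technique, not this repository's implementation; the function name and the dictionary-based score format are assumptions.

```python
from collections import Counter
from typing import Dict, List


def hamming_penalized_scores(log_probs: Dict[str, float],
                             earlier_group_tokens: List[str],
                             beam_lamb: float) -> Dict[str, float]:
    """Subtract beam_lamb from a token's score for each earlier group
    that chose the same token at this time step."""
    counts = Counter(earlier_group_tokens)  # missing tokens count as 0
    return {tok: lp - beam_lamb * counts[tok] for tok, lp in log_probs.items()}


scores = {"dog": -1.0, "cat": -1.2}
penalized = hamming_penalized_scores(scores, earlier_group_tokens=["dog"], beam_lamb=0.5)
print(max(penalized, key=penalized.get))
```

With a small coefficient the later group may still repeat the earlier group's token; a larger coefficient pushes it toward a different one.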
Modify the word probabilities
- Do NOT decrease the probability of high-frequency words (TFIDF) - added
- Do NOT decrease the probability of character names
: bool; controls whether to exclude character names
Use similarity to penalize:
: bool; controls whether to use GloVe embeddings
Filter parser: add the parser to the filter process
- Use the parser to check whether a candidate sentence introduces a third character.
: T or F
Coreference resolution:
- We use Coreference Resolution in spaCy with Neural Networks here.
- If we find a third character, or a mention such as 'them' that does not refer to either of the two characters, we filter the sentence out.
: T or F
- Choose only one of coref and the filter parser.
❣️ Use the whole history to generate text instead of one sentence!
: T or F, default is T
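
The difference between the two settings can be sketched as a choice of context passed to the language model. The helper below is hypothetical and only illustrates the idea: with history enabled, every sentence generated so far is joined into the prompt; otherwise only the last sentence is used.

```python
from typing import List


def build_context(story: List[str], history_use: bool = True) -> str:
    """Return the text used to condition the next sentence.

    Hypothetical helper: the whole story so far, or only its last sentence.
    """
    if not story:
        return ""
    if history_use:
        return " ".join(story)
    return story[-1]


story = ["Jenny loves ice cream.", "She went to the store."]
print(build_context(story, history_use=True))
print(build_context(story, history_use=False))
```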
Baseline without character names.
- We replace the names of characters with [MALE], [FEMALE], [NEURAL] and train GPT-2 on the result; the model is located in
- --char_name_use: bool, default=True. Set it to F/False if you want to use this model.
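
The name replacement can be pictured with a short sketch. It assumes a name-to-tag mapping is already available; how the repository assigns a tag to each name is not described here, so the mapping and the `mask_names` helper are hypothetical.

```python
import re
from typing import Dict


def mask_names(text: str, name_to_tag: Dict[str, str]) -> str:
    """Replace whole-word character names with their tag."""
    for name, tag in name_to_tag.items():
        pattern = rf"\b{re.escape(name)}\b"
        # A function replacement keeps the tag text literal.
        text = re.sub(pattern, lambda _match, tag=tag: tag, text)
    return text


# Hypothetical mapping; tag strings follow the ones named above.
mapping = {"Jenny": "[FEMALE]", "Bob": "[MALE]"}
print(mask_names("Jenny gave Bob a gift.", mapping))
```

Word boundaries keep longer names that merely contain a listed name, such as "Bobby", untouched.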
Number of characters
- This option defines whether the model generates a 2-character or a 1-character story.
: bool, default is True; if you only want to consider one character, set it to False
- Whether to use backtracking to help find a match.
: bool. True enables backtracking.
- RocStories with
- Prompt-processed files:
- Fine-tune the LM on the ROC dataset:
python cometFilter/finetune_language_model/
- Fine-tune LM on prompt_char:
python cometFilter/finetune_language_model/
- To run the parser, go to
and then run
java -mx4g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 30000