how to test on our own data #15

Open
FazeleTavakoli opened this issue Mar 15, 2020 · 1 comment


FazeleTavakoli commented Mar 15, 2020

Hi,

We would appreciate code for testing the model on a different dataset, so that we can easily get a probable plan from the given triples and generate a verbalization from it.

Thanks in advance.

Owner

AmitMY commented Mar 20, 2020

The best solution would be to add a test_reader to the Config class and use it on this line: https://github.com/AmitMY/chimera/blob/master/process/pre_process.py#L9 (some tweaks are necessary, such as applying it only to the test set and not to train or dev).
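
A minimal sketch of what that first option might look like, using a stand-in `Config` class rather than chimera's actual one (its real constructor is not shown in this thread). The `test_reader` argument and the `reader_for` helper are hypothetical names; the only idea illustrated is that the test split falls back to the main reader unless a separate one is supplied.

```python
class Config:
    """Stand-in for illustration only; not chimera's real Config."""

    def __init__(self, reader, planner=None, reg=None, test_reader=None):
        self.reader = reader
        self.planner = planner
        self.reg = reg
        # Hypothetical extra field: a reader used only for the test split.
        self.test_reader = test_reader

    def reader_for(self, split):
        """Return the reader class to use for 'train', 'dev' or 'test'."""
        if split == "test" and self.test_reader is not None:
            return self.test_reader
        return self.reader


class TrainReader:   # placeholder for e.g. WebNLGDataReader
    pass


class CustomReader:  # placeholder for your own dataset reader
    pass


config = Config(reader=TrainReader, test_reader=CustomReader)
```

With this shape, train and dev keep using the original reader, and existing configurations that pass no `test_reader` behave as before.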


The simple solution is to run the training code and then swap in a different test set.

You can run this:

config = Config(reader=WebNLGDataReader,
                planner=neural_planner,
                reg=BertREG)
res = MainPipeline.mutate({"config": config}).execute("WebNLG", cache_name="WebNLG")

Once the model finishes training on the training dataset, you can instantiate a new TestCorpus:

config = Config(reader=YourCustomDatasetReader)
test = TestCorpusPreProcessPipeline.mutate({"config": config}).execute("CustomName", cache_name="CustomName")
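
`YourCustomDatasetReader` above is a placeholder, and this thread does not show the interface a reader has to implement, so the existing WebNLGDataReader in the repository is the place to check for that. The parsing half can be sketched independently, though. The following assumes a hypothetical plain-text format with one example per line, triples separated by `;` and fields by `|`; both the format and the function names are illustrative, not part of chimera.

```python
def parse_triples(line):
    """Parse 'subj | pred | obj ; subj | pred | obj' into a list of 3-tuples."""
    triples = []
    for chunk in line.split(";"):
        chunk = chunk.strip()
        if not chunk:
            continue  # tolerate trailing or doubled separators
        parts = [p.strip() for p in chunk.split("|")]
        if len(parts) != 3:
            raise ValueError(f"expected 3 fields, got {len(parts)}: {chunk!r}")
        triples.append(tuple(parts))
    return triples


def read_examples(lines):
    """Turn an iterable of text lines into one list of triples per example."""
    return [parse_triples(line) for line in lines if line.strip()]


examples = read_examples([
    "John | birthPlace | London ; John | occupation | Engineer",
    "",
    "London | country | England",
])
```

A custom reader would then wrap each parsed example in whatever data structure the pipeline expects.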

And finally, combine the two for translation:

# Assumes res supports ** unpacking (i.e. behaves like a dict).
translate = TranslatePipeline.mutate({**res, "test-corpus": test["test-corpus"]}).execute(...)
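
The intent of this step appears to be reusing everything produced by the training run and overriding only the test corpus. With plain dicts standing in for the pipeline results (an assumption; the actual result objects may not be dicts, and the keys and values below are made up), that looks like:

```python
# Plain dicts standing in for pipeline results (hypothetical keys and values).
res = {"train-planner": "trained planner", "test-corpus": "WebNLG test set"}
test = {"test-corpus": "custom test set"}

# Later keys win in a dict display, so only "test-corpus" is replaced.
combined = {**res, "test-corpus": test["test-corpus"]}
```

Note that a single `*` in front of `res` would not work in this position: Python does not allow mixing set-style unpacking with `key: value` pairs in one literal.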

The more advanced solution, which is also more extensible, is to create your own pipeline from the parts in the process directory.

Here is an example: https://github.com/AmitMY/chimera/blob/master/experiments.py

This file alone runs at least 16 experiments, as far as I can remember, varying parameters such as planners, regs, and decoding methods. It can easily be modified to load whatever train or dev set you like and to try any configuration.
