OPENNLP-1589 - Fix incorrect array check in CachedFeatureGenerator #635
Conversation
- removes deprecated getters in several records
- removes a long-deprecated constructor of TokenizerME
- removes a deprecated field in DefaultNameContextGenerator and adapts tests accordingly
- marks a constructor of ADNameSampleStream as "for removal" to indicate actual removal is likely in upcoming releases
- marks the deprecated 'NameFinderEventStream#generateOutcomes(..)' as "for removal" to indicate actual removal is likely in upcoming releases
- clears unused code from NameFinderME
- fixes a resource leak in TokenNameFinderFactory
… a dual caching strategy; relies on cache eviction as implemented by the Cache class of OpenNLP.
- Use Arrays.equals(...)
- Demonstrate that caching based on identity is broken.
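For context, the difference between identity comparison (`==`) and content comparison (`Arrays.equals`) on Java arrays can be shown in isolation. This is a minimal sketch; the class and variable names are illustrative and not taken from the OpenNLP code:

```java
import java.util.Arrays;

public class ArrayEqualityDemo {
    public static void main(String[] args) {
        String[] tokens1 = {"the", "quick", "fox"};
        String[] tokens2 = {"the", "quick", "fox"}; // same content, new instance

        // Identity check: false, because these are two distinct objects
        System.out.println(tokens1 == tokens2);
        // Content check: true, because the elements are pairwise equal
        System.out.println(Arrays.equals(tokens1, tokens2));
    }
}
```

A cache keyed via `==` therefore only hits when the caller passes the very same array instance, not merely an equal one.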
We found that the change from == to equals results in different f-measures. I guess we would need to compute a baseline f-measure for each test without a cache and compare those results against the f-measure computed with a cache in place. I don't know how the original values were computed.
Ok, update as found by @mawiesne: the f-values from main are the correct ones. Tests will output them if the cache is disabled. Changing them (as done in my PR) is therefore an invalid operation. Nevertheless, we need to investigate why == was used in the first place (and not equals), and why this change only affects Dutch.
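One possible reason `==` behaves differently from `equals` here: Java arrays inherit identity-based `hashCode` and `equals` from `Object`, so a `HashMap` keyed directly on a `String[]` only hits when the exact same instance is passed again. A minimal sketch (class and variable names are illustrative, not from OpenNLP):

```java
import java.util.HashMap;
import java.util.Map;

public class IdentityCacheDemo {
    public static void main(String[] args) {
        Map<String[], String> cache = new HashMap<>();
        String[] tokens = {"the", "quick", "fox"};
        cache.put(tokens, "cached-features");

        // Same instance: hit, because arrays use identity hashCode/equals
        System.out.println(cache.get(tokens));
        // Equal content, different instance: miss (prints null)
        System.out.println(cache.get(new String[]{"the", "quick", "fox"}));
    }
}
```

So an identity-based cache is effectively a per-instance fast path; switching to content-based lookups changes which entries are reused, which could plausibly interact with the eval results.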
I am going to close this PR for now. The f-measure should have matched the values from main; however, this is not the case, which implies that the changes are not valid. Still, there are open questions to better understand what actually happens here and how we can implement a more efficient way of caching without changing the f-measure in the eval test. The following questions remain:
Side-note: Played around with
This (draft) PR introduces a more aggressive caching strategy in the cached feature generator, which doesn't rely on ==. However, the eval results are a bit odd for the CoNLL-02 dataset:
Default (main) (Spanish):
1589-Caching (Spanish):
Default (main) (Dutch):
1589-Caching (Dutch):
As you can see, the aggressive mechanism results in better caching.
It has no impact on Spanish or on any other eval test, but the results for CoNLL-02 Dutch are odd (see the changes in the eval F1 scores).
They are sometimes slightly better, but decrease in other scenarios.
I am actually wondering why we don't see such changes in f-measure for Spanish. I am therefore opening this PR so you can also investigate what is going on here.
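For illustration, a content-based cache key could wrap the token array and delegate to `Arrays.hashCode`/`Arrays.equals`. This is only a sketch of the general technique, not the actual CachedFeatureGenerator implementation; `TokenKey` is a hypothetical name:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

public class TokenKeyDemo {
    // Hypothetical wrapper: keys the cache on array contents, not identity
    static final class TokenKey {
        private final String[] tokens;

        TokenKey(String[] tokens) {
            this.tokens = tokens.clone(); // defensive copy: keys must be immutable
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof TokenKey && Arrays.equals(tokens, ((TokenKey) o).tokens);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(tokens);
        }
    }

    public static void main(String[] args) {
        Map<TokenKey, String> cache = new HashMap<>();
        cache.put(new TokenKey(new String[]{"a", "b"}), "features-for-a-b");
        // A fresh array with the same content now hits the cache
        System.out.println(cache.get(new TokenKey(new String[]{"a", "b"})));
    }
}
```

The trade-off is the cost of hashing and comparing array contents on every lookup, plus the defensive copy, versus the extra cache hits gained over identity-based keying.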
@mawiesne is also having a look here, but we would appreciate some additional 👁️ 👁️