Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

OPENNLP-1589 - Fix incorrect array check in CachedFeatureGenerator #635

Closed
wants to merge 7 commits into from

Conversation

rzo1
Copy link
Contributor

@rzo1 rzo1 commented Jul 9, 2024

This (draft) PRs introduces a more aggressive caching strategy in the cached feature generator, which doesn't rely on ==.

However, the eval results are a bit odd for the Conll02 dataset:

Default (Spanish):

opennlp.tools.util.featuregen.CachedFeatureGenerator@2ddc9a9f: hits=100385 misses=52923 hit%0.6547929657943486
opennlp.tools.util.featuregen.CachedFeatureGenerator@76a4ebf2: hits=98677 misses=51533 hit%0.6569269689101924

1589-Caching (Spanish)

opennlp.tools.util.featuregen.CachedFeatureGenerator@2e385cce: hits=102229 misses=51079 hit%0.6668210399979126
opennlp.tools.util.featuregen.CachedFeatureGenerator@6e6f2380: hits=99197 misses=51013 hit%0.6603887890286931

Default (main) (Dutch)

opennlp.tools.util.featuregen.CachedFeatureGenerator@2ddc9a9f: hits=67301 misses=37687 hit%0.6410351659237247
opennlp.tools.util.featuregen.CachedFeatureGenerator@76a4ebf2: hits=123179 misses=68875 hit%0.6413769044123007

1589-Caching (Dutch)

opennlp.tools.util.featuregen.CachedFeatureGenerator@45d84a20: hits=68174 misses=36814 hit%0.6493504019506992
opennlp.tools.util.featuregen.CachedFeatureGenerator@52f27fbd: hits=124618 misses=67436 hit%0.6488695887614941

As you can see, the aggressive mechanism results in better caching.

It doesn't have an impact on Spanish and on any other eval test but the results for conll02 for dutch are odd (see changes in eval f1 scores).

They are sometimes slightly better but at the same time decrease in some scenarios.

I am actually wondering, why we don't see such changes in f-measure for Spanish. Therefore, I am opening this PR, so you can also investigate what is going on here.

@mawiesne is also having a look here, but we would appreciate some additional 👁️ 👁️

mawiesne and others added 5 commits July 4, 2024 11:25
- removes deprecated getters in several records
- removes a very long time deprecated constructor of TokenizerME
- removes deprecated field in DefaultNameContextGenerator and adapts tests accordingly
- marks a constructor of ADNameSampleStream as "for removal" to indicate actual removal is more likely in upcoming releases
- marks deprecated 'NameFinderEventStream#generateOutcomes(..)' as "for removal" to indicate actual removal is more likely in upcoming releases
- clears unused code from NameFinderME
- fixes resource leak in TokenNameFinderFactory
… a dual caching strategy; relies on cache evication as implemented by the Cache class of OpenNLP.

 - Use Arrays.equals(...)
 - Demonstrate that caching based on identity is broken.
@rzo1
Copy link
Contributor Author

rzo1 commented Jul 9, 2024

We found, that the == to equals results in different f-measures.

Guess, we would need to compute baseline f-measure for each test without a cache and compare those results against the f-measure computed with a cache in place.

Don't know how the original values we're computed.

@rzo1
Copy link
Contributor Author

rzo1 commented Jul 9, 2024

Ok, update as found by @mawiesne: the f-values from main are the correct ones. Tests will output them, if cache is disabled.

Changing them (as done in my PR) is therefore an invalid operation.

Nevertheless, we need to investigate why == was used in the first place (and not equals), and why this change only affects Dutch.

@rzo1
Copy link
Contributor Author

rzo1 commented Jul 9, 2024

I am going to close this PR for now. Given the fact, that the f-measure on main was created as a baseline without using an actual cache, this PR should create the same results with caching.

However, this is not the case, which implies, that the changes are not valid.

However, there are still some open questions to better understand what actually happens here and how we can implement a more efficient way of caching something without changing the f-measure in the eval test.

The following questions remain:

  • Why did we originally choose to use == instead of Arrays.equals(...) for caching the features.
  • Why does using Arrays.equals(...) only impact the eval tests for Dutch.

Side-note: Played around with IdentityHashMap to get a similar behaviour as on main. This will make the eval tests pass (as expected) because the cache numbers are identical, but doesn't make sense to impl it like that (because no gain)

@rzo1 rzo1 closed this Jul 9, 2024
@rzo1 rzo1 deleted the OPENNLP-1589 branch July 10, 2024 09:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants