Update lucene to version 8.11.2 #16

Open: 26 commits to merge into base: master

tuomas2 commented Jul 19, 2024

Replaces #15

This gives access to some newer Lucene features, such as regular expression search. It is a major refactor because Lucene is updated by 5 major versions.

I tested several languages (English, Czech, Chinese, Japanese, Thai), and search works in all of them. I am not able to judge whether the stemming is good for every language, so more testing by native speakers is needed.

tuomas2 changed the base branch from master to develop on July 19, 2024, 16:01
tuomas2 changed the base branch from develop to master on July 19, 2024, 16:02

tuomas2 commented Jul 19, 2024

So, to summarize, @JJK96, I would like us to try to:

  • Remove AbstractBookAnalyzer altogether, along with all custom analyzers that are based on it.
  • Use StopwordAnalyzer as the base class for our custom analyzers (KeyAnalyzer etc.)
  • Modify the properties file / factory accordingly to use classes from core and other libs.
  • Change the filter classes (used by some analyzers, such as KeyAnalyzer) so that they do not store the book (as it does not seem to be used)

(related to discussion started here: #15 (comment))

Also removed LuceneAnalyzer and moved its functionality into AnalyzerFactory.
AnalyzerFactory now returns a real subclass of Analyzer, instead of a wrapper.
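
To illustrate the factory change in the commit message above, here is a minimal sketch that runs without Lucene on the classpath. It is not the code from this PR: the nested `Analyzer` class is only a stand-in for Lucene's `Analyzer`, and the property keys (`Default`, `Czech`), the class names, and `exampleMapping` are hypothetical. It shows a properties-driven factory that returns the analyzer instance itself rather than a wrapper around it.

```java
import java.util.Properties;

// Hypothetical sketch: the nested classes are stand-ins so it compiles without Lucene.
final class AnalyzerFactorySketch {

    /** Stand-in for Lucene's Analyzer base class. */
    abstract static class Analyzer { }

    /** Stand-ins for analyzers that would come from Lucene's own libraries. */
    static final class DefaultAnalyzer extends Analyzer { }
    static final class CzechStandInAnalyzer extends Analyzer { }

    private final Properties mapping;

    AnalyzerFactorySketch(Properties mapping) {
        this.mapping = mapping;
    }

    /** Returns the analyzer instance itself rather than a wrapper object around it. */
    Analyzer createAnalyzer(String language) {
        // Fall back to the "Default" entry when the language has no mapping.
        String className = mapping.getProperty(language, mapping.getProperty("Default"));
        if (className == null) {
            throw new IllegalArgumentException("No analyzer configured for " + language);
        }
        try {
            Class<? extends Analyzer> type =
                    Class.forName(className).asSubclass(Analyzer.class);
            return type.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot create analyzer " + className, e);
        }
    }

    /** Builds the mapping in code; a real setup would load it from a properties file. */
    static Properties exampleMapping() {
        Properties p = new Properties();
        p.setProperty("Default", DefaultAnalyzer.class.getName());
        p.setProperty("Czech", CzechStandInAnalyzer.class.getName());
        return p;
    }

    public static void main(String[] args) {
        AnalyzerFactorySketch factory = new AnalyzerFactorySketch(exampleMapping());
        System.out.println(factory.createAnalyzer("Czech").getClass().getSimpleName());
        System.out.println(factory.createAnalyzer("Thai").getClass().getSimpleName());
    }
}
```

Because callers receive a plain `Analyzer`, they can pass it on without unwrapping; in this sketch a language with no entry falls back to the `Default` mapping.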

For all languages, language-specific analyzers are used instead of Snowball analyzers.
Removed the EnglishAnalyzer test from AnalyzerFactoryTest.

JJK96 commented Oct 14, 2024

  • Remove AbstractBookAnalyzer altogether, along with all custom analyzers that are based on it.
  • Use StopwordAnalyzer as the base class for our custom analyzers (KeyAnalyzer etc.)
    • I used Analyzer as the base, since stopwording was not used by these classes.
  • Modify the properties file / factory accordingly to use classes from core and other libs.
  • Change the filter classes (used by some analyzers, such as KeyAnalyzer) so that they do not store the book (as it does not seem to be used)

Added a check of the index version when getting the index status.
This ensures that the status correctly reflects whether the index is invalid.
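
The version check described above could look roughly like the following sketch. This is an assumption about the general shape, not the code from this PR: the `Status` values, `REQUIRED_INDEX_VERSION`, and `statusFor` are hypothetical names, and the version number is a placeholder.

```java
import java.util.OptionalInt;

// Hypothetical sketch: none of these names are taken from the pull request.
final class IndexStatusSketch {

    /** Possible states of a search index (illustrative only). */
    enum Status { DONE, UNDONE, INVALID }

    /** Index version the current code expects; the value is a placeholder. */
    static final int REQUIRED_INDEX_VERSION = 2;

    /**
     * @param indexExists   whether an index is present on disk
     * @param storedVersion the version recorded when the index was built, if any
     */
    static Status statusFor(boolean indexExists, OptionalInt storedVersion) {
        if (!indexExists) {
            return Status.UNDONE;
        }
        // An index with no recorded version, or one built for an older version,
        // is reported as invalid instead of being treated as usable.
        if (!storedVersion.isPresent()
                || storedVersion.getAsInt() < REQUIRED_INDEX_VERSION) {
            return Status.INVALID;
        }
        return Status.DONE;
    }

    public static void main(String[] args) {
        System.out.println(statusFor(true, OptionalInt.of(1)));   // prints INVALID
        System.out.println(statusFor(true, OptionalInt.of(2)));   // prints DONE
        System.out.println(statusFor(false, OptionalInt.empty())); // prints UNDONE
    }
}
```

Checking the version at status time means an index built before a major Lucene upgrade is flagged for rebuilding rather than failing later during a search.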