introduce fts_term() function for preparing tokens for FTS5 token matching #217
I noticed what appears to be a bug in the code, although it seems to have no negative effect on the quality of results.
There is a piece of code which replaces all occurrences of a space with an underscore (`.replace(/ /g, '_')`). I don't remember exactly why we do this, but I assume it's to allow us to keep multi-word tokens (such as 'new york') as a single token.
Recently we started using the FTS index more, so we should apply this space->underscore replacement in all parts of the code which use the FTS index.
Now... why it 'just works' anyway is a bit of a mystery 🤷
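
For illustration, the helper named in the title could look roughly like the sketch below. This is a minimal sketch, not the code from this PR: only the name `fts_term` and the `.replace(/ /g, '_')` call come from the description above, while the non-string guard and the example query are assumptions added for the example.

```javascript
// Hypothetical sketch of fts_term(); the actual implementation in this PR may differ.
// It normalises a token before it is used in an FTS5 MATCH expression by
// replacing every space with an underscore, so a multi-word token such as
// 'new york' is handled as the single token 'new_york'.
function fts_term(token) {
  // Assumption: non-string input is treated as an empty term.
  if (typeof token !== 'string') {
    return '';
  }
  return token.replace(/ /g, '_');
}

// Example: prepare the term once and reuse it wherever the FTS index is queried.
const term = fts_term('new york');
console.log(term); // new_york
```

Routing every FTS lookup through one function like this keeps the space-to-underscore rule in a single place, instead of repeating the `replace` call at each call site.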