Why do scorers return a different result than process.extract #165
This question was originally asked in #164
>>> from fuzzywuzzy import fuzz, process
>>> x = ['atk', 'atk%', 'atk!', 'atk$']
>>> process.extract('atk!', x)
[('atk', 100.0, 0), ('atk%', 100.0, 1), ('atk!', 100.0, 2), ('atk$', 100.0, 3)]
>>> fuzz.ratio('atk', 'atk%')
86
>>> process.extract('atk!', x, scorer=fuzz.ratio)
[('atk', 100.0, 0), ('atk%', 100.0, 1), ('atk!', 100.0, 2), ('atk$', 100.0, 3)]
@Dosx001 I moved this into a separate discussion, since it is an unrelated issue.
Replies: 2 comments 1 reply
You can disable this behavior by passing processor=None, e.g. process.extract('atk!', x, scorer=fuzz.ratio, processor=None), or enable it for scorers by passing processor=utils.default_process.
@maxbachmann Thanks! This was what I needed. Also, out of curiosity, why do RapidFuzz and FuzzyWuzzy work like this? Why do I need
process.extract preprocesses strings by default (see https://maxbachmann.github.io/RapidFuzz/process.html#extract). You can disable this behavior by passing processor=None, or enable it for scorers by passing processor=utils.default_process.
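To make the effect of preprocessing concrete, here is a rough standard-library sketch. It is not how RapidFuzz or FuzzyWuzzy are implemented: `simple_process`, `simple_ratio`, and `extract` are hypothetical stand-ins, with `difflib.SequenceMatcher` used as an approximate ratio scorer and `simple_process` assumed to do roughly what a default processor does (replace non-alphanumeric characters with spaces, lowercase, strip).

```python
from difflib import SequenceMatcher


def simple_process(s):
    # Hypothetical stand-in for a default processor (an assumption, not the
    # library's code): non-alphanumeric characters become spaces, then the
    # string is lowercased and stripped.
    return "".join(ch if ch.isalnum() else " " for ch in s).lower().strip()


def simple_ratio(a, b):
    # Approximate similarity on a 0-100 scale using difflib.
    return round(100 * SequenceMatcher(None, a, b).ratio())


def extract(query, choices, processor=None):
    # Score every choice against the query, optionally preprocessing both
    # sides first, and return (choice, score, index) tuples, best first.
    if processor is not None:
        query = processor(query)
    results = []
    for index, choice in enumerate(choices):
        candidate = processor(choice) if processor is not None else choice
        results.append((choice, simple_ratio(query, candidate), index))
    # sorted() is stable, so ties keep their original order.
    return sorted(results, key=lambda item: item[1], reverse=True)


choices = ['atk', 'atk%', 'atk!', 'atk$']

# Without preprocessing, the trailing symbols count as real differences.
print(extract('atk!', choices))

# With preprocessing, every string collapses to 'atk', so all scores tie.
print(extract('atk!', choices, processor=simple_process))
```

In this sketch the preprocessed call scores every choice 100 because all four strings reduce to 'atk', which mirrors the process.extract output in the question. The unprocessed call ranks the exact match 'atk!' first.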