Issue with fieldPotency caused by casesensitive comparison when counting $numberOfMatches #43

Open
rraadd opened this issue May 26, 2017 · 0 comments


rraadd commented May 26, 2017

Hi,

I had some issues with &fieldPotency. The SimpleSearch call I used was:

[[!SimpleSearch?
&searchStyle=partial
&docFields=pagetitle,longtitle,description,introtext
&perPage=50
&fieldPotency=pagetitle:10,longtitle:1,description:1,introtext:1
]]

No matter what value I set for pagetitle, the sorted results seemed to be randomly ordered. Resources that had the search term in their title were listed after those where the term appeared only in the description, even when the pagetitle fieldPotency value was many times higher than the one set for description.

The problem

When the search term is used to count the number of matches in each of the &docFields, the comparison seems to be case sensitive. Since the titles of my resources all start with a capital letter, comparing them with the search term produced no match, so the potency values for pagetitle were simply ignored. To understand this behavior better, see lines 126-132 of simplesearchdriver.php, located in model/simplesearch/driver/:

foreach ($this->search->searchArray as $term) {
    $queryTerm = preg_quote($term,'/');
    $regex = ($searchStyle == 'partial') ? "/{$queryTerm}/i" : "/\b{$queryTerm}\b/i";
    $numberOfMatches = preg_match_all($regex, $resource->{$field}, $matches);
    if (empty($this->searchScores[$resourceId])) $this->searchScores[$resourceId] = 0;
    $this->searchScores[$resourceId] += $numberOfMatches * $potency;
}
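
One thing worth checking: the pattern above already carries the `i` modifier, so for plain ASCII text the match should be case insensitive. A possible explanation, assuming the titles contain non-ASCII UTF-8 text (the Cyrillic strings below are made-up sample data, not taken from my site), is that without the `u` modifier PCRE compares the text byte by byte and does not fold the case of multibyte characters. A minimal sketch to test this outside of MODX:

```php
<?php
// Assumption: titles are UTF-8 and contain non-ASCII letters (Cyrillic here).
$title = 'Тест страница';   // sample title starting with a capital letter
$term  = 'тест';            // sample search term typed in lower case

$queryTerm = preg_quote($term, '/');

// Same pattern shape as in the driver: "i" only, no "u".
$withoutU = preg_match_all("/{$queryTerm}/i", $title, $m1);

// Hypothetical variant: "u" makes PCRE treat pattern and subject as UTF-8,
// so case folding also applies to multibyte characters.
$withU = preg_match_all("/{$queryTerm}/iu", $title, $m2);

// ASCII control case: here "i" alone is enough.
$ascii = preg_match_all('/test/i', 'Test page', $m3);

var_dump($withoutU, $withU, $ascii);
```

If the first count is 0 while the other two are 1, the problem is the missing `u` modifier rather than case sensitivity as such.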

In my particular case the problem was 'solved' by capitalizing the first letter of $term before checking for matches, which takes these three extra lines of code:

foreach ($this->search->searchArray as $term) {

    // Workaround: capitalize the first letter of the term (multibyte-safe).
    $first_letter = mb_strtoupper(mb_substr($term, 0, 1, "UTF-8"), "UTF-8");
    $term_end = mb_substr($term, 1, mb_strlen($term, "UTF-8"), "UTF-8");
    $term = $first_letter . $term_end;

    $queryTerm = preg_quote($term,'/');
    $regex = ($searchStyle == 'partial') ? "/{$queryTerm}/i" : "/\b{$queryTerm}\b/i";
    $numberOfMatches = preg_match_all($regex, $resource->{$field}, $matches);
    if (empty($this->searchScores[$resourceId])) $this->searchScores[$resourceId] = 0;
    $this->searchScores[$resourceId] += $numberOfMatches * $potency;
}
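
The workaround above only helps when the term appears in the text with exactly its first letter capitalized. A more general alternative might be to keep the term as typed and add the `u` modifier next to `i`, so that case folding also covers UTF-8 text. The sketch below is a standalone illustration only: `scoreFields()` and the sample data are hypothetical and not part of SimpleSearch, and it assumes PHP 7 or later.

```php
<?php
/**
 * Hypothetical helper, not part of SimpleSearch: scores one resource by
 * counting term matches per field and weighting each count by its potency.
 */
function scoreFields(array $fields, array $terms, array $potencies, string $searchStyle = 'partial'): int
{
    $score = 0;
    foreach ($potencies as $field => $potency) {
        $text = $fields[$field] ?? '';
        foreach ($terms as $term) {
            $queryTerm = preg_quote($term, '/');
            // "i" = case insensitive, "u" = treat pattern and subject as UTF-8.
            $regex = ($searchStyle == 'partial')
                ? "/{$queryTerm}/iu"
                : "/\\b{$queryTerm}\\b/iu";
            // preg_match_all() returns false on error (e.g. invalid UTF-8);
            // the cast turns that into 0 matches.
            $score += (int) preg_match_all($regex, $text) * $potency;
        }
    }
    return $score;
}

// Made-up sample data mirroring the &fieldPotency setting from the snippet call.
$terms = ['тест'];
$potencies = ['pagetitle' => 10, 'longtitle' => 1, 'description' => 1, 'introtext' => 1];

$titleHit = ['pagetitle' => 'Тест страница', 'description' => 'Описание'];
$descriptionHit = ['pagetitle' => 'Начало', 'description' => 'Това е тест'];

$titleScore = scoreFields($titleHit, $terms, $potencies);
$descriptionScore = scoreFields($descriptionHit, $terms, $potencies);

var_dump($titleScore, $descriptionScore);
```

With these weights the resource that has the term in its title scores 10 and the one that has it only in the description scores 1, which is the ordering I expected from &fieldPotency in the first place.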

I am not a programmer, so this 'solution' may turn out to be ineffective or even wrong. If you have better ideas on how this issue could be avoided, I would greatly appreciate it if you shared your knowledge. Thanks in advance.

@rraadd rraadd changed the title Issue with fieldPotency caused by casesensitive comaprison when counting $numberOfMatches Issue with fieldPotency caused by casesensitive comparison when counting $numberOfMatches May 26, 2017