Hi everyone, I'm using the /parser/search API to search for a location using free text. Could we have another option, besides lang, that restricts token matching to specific languages? Thanks in advance.
Use-cases
I searched for the keyword Holland, Michigan and it returned Holland, but in the War language (an Austroasiatic language spoken by a minority of people in Bangladesh and India). When mapped to English, it became Baraga, which confused users.
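I debugged this and traced the problem to this function:
https://github.com/pelias/placeholder/blob/master/lib/Queries.js#L83-L105
as well as the two queries match_subject_distinct_subject_ids and match_subject_autocomplete_distinct_subject_ids.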
Proposal
In my opinion, we should add an option that specifies which languages are searched. For example, we could pass a search_language parameter with a value like "eng,fra" to match only English and French tokens. If the option is omitted, tokens in all languages would be matched.
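To make the proposed parameter concrete, here is a minimal sketch of how a value such as "eng,fra" could be parsed, assuming a comma-separated list of three-letter language codes as in the example above. The helper name `parseSearchLanguages` is hypothetical and not part of placeholder; an empty result is taken to mean "no language filter".

```javascript
// Hypothetical helper (not existing placeholder code): turn a raw
// `search_language` value such as "eng,fra" into a de-duplicated list of
// lowercase three-letter codes. An empty array means "do not filter by language".
function parseSearchLanguages(raw) {
  if (typeof raw !== 'string') {
    return [];
  }
  const codes = raw
    .split(',')
    .map((code) => code.trim().toLowerCase())
    // Assumption: only three-letter alphabetic codes are accepted;
    // anything else is silently dropped.
    .filter((code) => /^[a-z]{3}$/.test(code));
  return Array.from(new Set(codes));
}

console.log(parseSearchLanguages('eng, FRA,eng')); // [ 'eng', 'fra' ]
console.log(parseSearchLanguages(undefined));      // []
```

Dropping malformed codes rather than rejecting the request is only one possible design; returning an error for unknown values may be preferable.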
The SQL queries after editing should be:
SELECT DISTINCT( t1.id ) AS subjectId
FROM tokens AS t1
JOIN fulltext AS f1 ON f1.rowid = t1.rowid
WHERE f1.fulltext MATCH $subject
AND t1.lang IN ('eng', 'fra')
-- AND t1.tag NOT IN ( 'colloquial' )
ORDER BY t1.id ASC
LIMIT $limit
SELECT DISTINCT( t1.id ) AS subjectId
FROM tokens AS t1
JOIN fulltext AS f1 ON f1.rowid = t1.rowid
WHERE f1.fulltext MATCH $subject
AND t1.lang IN ('eng', 'fra')
-- AND t1.tag NOT IN ( 'colloquial' )
ORDER BY t1.id ASC
LIMIT $limit
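The 'eng', 'fra' literals above are hard-coded for illustration. One way to generate the language filter from a caller-supplied list is to emit one named placeholder per language and bind the values, so that no user input is concatenated into the SQL text. The sketch below assumes this approach; `buildSubjectQuery` and the `$lang0`, `$lang1`, ... parameter names are hypothetical, not existing placeholder code.

```javascript
// Hypothetical builder: returns the SQL text plus the language bindings.
// With an empty list the language filter is left out, so all languages match.
function buildSubjectQuery(langs) {
  const params = {};
  const names = langs.map((lang, i) => {
    params[`lang${i}`] = lang; // value to bind for placeholder $lang<i>
    return `$lang${i}`;
  });

  const langFilter = names.length > 0
    ? `AND t1.lang IN (${names.join(', ')})`
    : '';

  const sql = [
    'SELECT DISTINCT( t1.id ) AS subjectId',
    'FROM tokens AS t1',
    'JOIN fulltext AS f1 ON f1.rowid = t1.rowid',
    'WHERE f1.fulltext MATCH $subject',
    langFilter,
    'ORDER BY t1.id ASC',
    'LIMIT $limit'
  ].filter((line) => line !== '').join('\n');

  return { sql, params };
}

const { sql, params } = buildSubjectQuery(['eng', 'fra']);
console.log(sql);
console.log(params);
```

The caller would still need to supply `$subject` and `$limit`. Whether binding keys carry the `$` prefix depends on the SQLite driver in use, so the key format here is an assumption.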
I don't understand the issue completely. Your description says:
"I searched for the keyword Holland, Michigan and it returned Holland but in the War language (Austroasiatic language used by the minority of people in Bangladesh and India). When mapping into English, it turned into Baraga, which caused confusion to the users."