resolve_once behavior #32

Open
sckott opened this issue Mar 3, 2014 · 6 comments
@sckott

sckott commented Mar 3, 2014

Curious whether the resolve_once behavior is correct. For example, this call http://resolver.globalnames.org/name_resolvers.json?names=Plantago+major&resolve_once=true returns more than one match (all from different sources, I think, but still more than one match).

The API docs suggest that setting resolve_once=TRUE should return only the first match.

@dimus
Member

dimus commented Mar 3, 2014

I guess the option name is a bit confusing. The idea behind it was to avoid name parsing and return only exact matches if possible. In your example only exact matches (from several data sources) are returned, so parsing and matching by canonical form did not happen. It is faster than running the name parser, but it also removes quite a few results. Because of that, resolve_once is disabled by default. Are you interested in getting one result only?

You will see the difference if you change resolve_once=true to resolve_once=false.

@sckott
Author

sckott commented Mar 3, 2014

Thanks for your quick response! Hmm, I guess just getting one result doesn't make too much sense, so no, I don't think that's needed, and none of the users of my software ask for it.

I'll change the documentation in my software so that the resolve_once parameter is described more accurately.

@sckott
Author

sckott commented Mar 4, 2014

Hi again @dimus - Actually, a user just asked about possibly returning just one match for each queried taxon name. Is that possible? It seems a bit tricky to do, and may require a few possible choices. For example, if the parameter is called return_one, then it could pick at random from a set of equivalent names (return_one=random), or pick from a preferred data source (return_one=12, 12 for EOL), or offer other options?

We could do this on our side in R, but of course it would make for faster data return times if it were done on your side.
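
For illustration, here is a rough sketch of what the client-side version might look like, written in Python rather than R. The `pick_one` function, its `return_one` argument, and the shape of each match (a dict with a `data_source_id` key) are all assumptions made up for this example, as are the sample name strings; none of it is taken from the resolver's API.

```python
import random


def pick_one(matches, return_one="random", rng=None):
    """Return a single match from `matches`, or None if the list is empty.

    `matches` is assumed to be a list of dicts that each carry a
    `data_source_id` key (a hypothetical shape, not the documented one).
    """
    if not matches:
        return None
    if return_one == "random":
        # Pick any one of the equivalent names.
        rng = rng or random.Random()
        return rng.choice(matches)
    # Otherwise treat return_one as a preferred data source ID.
    for match in matches:
        if match.get("data_source_id") == return_one:
            return match
    # Nothing from the preferred source: fall back to the first match.
    return matches[0]


# Made-up sample data, for illustration only.
matches = [
    {"data_source_id": 4, "name_string": "Plantago major"},
    {"data_source_id": 12, "name_string": "Plantago major L."},
]
print(pick_one(matches, return_one=12))
```

Falling back to the first match when the preferred source has nothing is just one possible choice; returning None there would be equally defensible.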

@dimus
Member

dimus commented Mar 10, 2014

Oops, I missed your new comment. There is a not-yet-documented way to do something like that:

http://resolver.globalnames.org/name_resolvers.json?names=Plantago+major&best_match_only=true&data_source_ids=12

If no data_source_ids are given, all of them will be used.

In addition, it is possible to add preferred_data_sources to best_match_only. If no data_source_ids are given, the 'best match' will come from any data source. If there is also a match in a preferred data source, it will be returned as well.

http://resolver.globalnames.org/name_resolvers.json?names=Plantago+major&best_match_only=true&preferred_data_sources=12|4

For now, only BHL uses this functionality. If you start to use it, I will give it 'official' status and document it on the API page.
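
As a sketch of how a client might assemble the two query URLs above, the following Python builds them from their parts. The helper name `best_match_url` is hypothetical, and joining several `data_source_ids` with a pipe is an assumption carried over from the `preferred_data_sources=12|4` example; the thread only shows a single ID for that parameter.

```python
from urllib.parse import urlencode

BASE_URL = "http://resolver.globalnames.org/name_resolvers.json"


def best_match_url(name, data_source_ids=None, preferred_data_sources=None):
    """Build a best_match_only query URL for a single name string."""
    params = {"names": name, "best_match_only": "true"}
    if data_source_ids:
        # Assumption: multiple IDs are pipe-separated, as for
        # preferred_data_sources in the example URL.
        params["data_source_ids"] = "|".join(str(i) for i in data_source_ids)
    if preferred_data_sources:
        params["preferred_data_sources"] = "|".join(
            str(i) for i in preferred_data_sources
        )
    # safe="|" keeps the pipe separator literal instead of encoding it as %7C.
    return BASE_URL + "?" + urlencode(params, safe="|")


print(best_match_url("Plantago major", data_source_ids=[12]))
print(best_match_url("Plantago major", preferred_data_sources=[12, 4]))
```

Leaving both optional arguments out yields a query with only `best_match_only=true`, which per the description above would draw the best match from any data source.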

@sckott
Author

sckott commented Mar 10, 2014

Great, thanks for this. I'll include these two parameters in my taxize R package. I don't know how much people will use them, but I'm sure at least some will.

@tucotuco

tucotuco commented Dec 5, 2014

Looking forward to official status. This will be immensely useful for data improvement workflows in general. VertNet gives a "+1".
