Some experiences with the clean method and surnames #17
Comments
These will be tricky ones. The [#<struct Namae::Name
family="Q.",
given="Groom",
suffix=nil,
particle=nil,
dropping_particle=nil,
nick=nil,
appellation=nil,
title=nil>] ...as does the underlying And so, with a namestring like [#<struct Namae::Name
family="A.",
given="A. Sagástegui",
suffix=nil,
particle=nil,
dropping_particle=nil,
nick=nil,
appellation=nil,
title=nil>] ...(as does the underlying
Well, I guess in the example The other issue is when namestrings like these are presented in a list such as:
...here's where it gets really complicated 😄
In the example [#<struct Namae::Name
family="Sagástegui A.",
given="A.",
suffix=nil,
particle=nil,
dropping_particle=nil,
nick=nil,
appellation=nil,
title=nil>] But, we'd be fighting with the underlying parsing expression grammar in the
The other alternative is to treat the Still, this syntax may be quite rare. I'll keep you updated when I find more.
Interesting idea. That's certainly possible via addition to https://github.com/bionomia/dwc_agent/blob/master/lib/dwc_agent/constants.rb#L358. But, if we were to add
I ran into this when trying to match parsed names to Wikidata labels.

Abundio Sagástegui Alva seems to have signed his collected specimens at times as `A. Sagástegui A.` As is still the custom, he had two surnames (one from each parent), but for a reason I do not know he sometimes abbreviated his maternal surname.

The gem parses this latter string with the given name equal to `A. Sagástegui`. The `clean` method reverses this and makes `A. Sagástegui` the family name. Both are in principle incorrect, but the gem currently treats the surname as a single entity in most cases, so the original uncleaned parsing (i.e. only the abbreviated Alva as the family name) would be the consistent behavior. The cleaned result also concatenates back into the original string. I don't know exactly why the `clean` method does this, but is there a way to stop the behavior without breaking something else?

Another parsing issue I encountered was with `Aznavour G. V.` (i.e. Georges Vincent Aznavour), which is parsed into `Aznavour G.` and `V.`, then reversed by `clean` as well, and concatenated in the end as `V. Aznavour G.` Is there a complication with treating all initials after a word as given names?
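To make the last question concrete, here is a minimal sketch in plain Ruby of the "initials after a word are given names" idea. It does not use dwc_agent or Namae and is not how either library works. `split_trailing_initials` and `INITIAL` are hypothetical names introduced only for illustration, and an "initial" is assumed to be a single uppercase letter followed by a period.

```ruby
# Hypothetical helper for illustration only; not part of dwc_agent or Namae.
# Assumption: an "initial" is one uppercase letter followed by a period.
INITIAL = /\A\p{Lu}\.\z/

# Splits a namestring of the form "Word I. I." into family and given parts.
# Returns nil when the pattern does not apply, for example when the string
# itself starts with an initial: a trailing initial may then be an
# abbreviated second surname rather than a given name.
def split_trailing_initials(namestring)
  tokens = namestring.strip.split(/\s+/)
  return nil if tokens.size < 2

  family, *rest = tokens
  return nil if family.match?(INITIAL)                   # leading initial: ambiguous
  return nil unless rest.all? { |t| t.match?(INITIAL) }  # only initials may follow

  { family: family, given: rest.join(" ") }
end

p split_trailing_initials("Aznavour G. V.")
p split_trailing_initials("A. Sagástegui A.")
```

Under these assumptions the first call yields family `Aznavour` with given `G. V.`, while the second returns `nil`. That second case is the complication: once a namestring already begins with an initial, a trailing initial cannot safely be read as a given name.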