Update for lookups.dat #416
Comments
Add 'unhewn_cobblestone' (#270)
|
Great work! As explained in #414, I think this update has first priority for me (by number of occurrences) and should be implemented. On the other hand, it is probably not possible to support all the requests related to lookups and new OSM tags. Many impacts should be considered, for example the workload, the impact on the size of the rd5 files, and the processing time of a script in which the number of checked tags keeps growing. For tags with low or medium occurrences (my old requests #241 and #276), I am no longer sure it was a good idea to open them. |
Nice start indeed!
traffic_sign:direction is the same as direction=forward/backward/both, but more specific. Ideally it could be combined with that, but AFAIK the current parser does not support that.
cyclestreet is the "Belgian/Dutch version" of bicycle_road. It could again be combined if the parser supported that.
See #104 and agricultural. I think size is important, but see #393: it seems to be only about 20% of the rd5 size. Still, I think it would be good to remove low-use tags that are more or less errors; it is not only about size but also about processing overhead. Examples:
Also dropping for example "living_street;0000000404 yes", "informal;0000002424 yes" etc. would make sense to me. And what about noexit, is that ever going to be used in routing context? |
Thanks for this important information! |
See #300, add support for just marking (like a boolean) if there is a *:conditional key with arbitrary value. Following the use as documented in #300, what about:
Both in way and node context. Is support for '*' already present? |
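The boolean marking proposed here could look like the minimal sketch below. The function name `has_conditional` and the dictionary representation of tags are assumptions made for illustration; this is not how BRouter's map creation handles tags.

```python
def has_conditional(tags):
    """Return True if any key ends in ':conditional', whatever its value.

    `tags` is assumed to be a plain dict of OSM key -> value strings.
    """
    return any(key.endswith(":conditional") for key in tags)
```

The value of the `*:conditional` key is never inspected, which matches the idea of recording only the presence of such a key.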
@polyscias culvert = 0 - removed
I'm not sure about this: 'living_street' has grown to 277,605 uses, but 'highway=living_street' should be preferred. 'informal' also has 103,522 entries. It seems to be popular, and 'living_street' is used in several profiles. 'noexit': we already have the way end here, but could this be useful to distinguish a correctly ending way from an incorrect/unconnected way?
That is heavy stuff. |
Yes, it is difficult to decide which tags should be added or deleted. Not only the occurrences should be considered; the impact is also important. Some tags are only for "tuning" the route and have a minor impact on the routing (for example #241, where I am no longer sure I should suggest adding the tags I proposed 2 years ago). |
Sure, with the current changes we have less data. We always have the two choices:
In second case 'access=service' will be negative |
Thanks for adding this tag (access=service)! About #241 now: But between 5 and 10 values are possible for each line... Regards |
Yes, living_street did grow substantially, I did not realize that; same for informal. So at least these usage numbers should be updated ;-) I made myself a Python script that parses the lookups.dat file, gets the numbers in way and node context from taginfo, and writes out lookups_new.dat, which has the same key:tags as the original file but with a comment added on the actual use. It also checks whether a key:tag in way context is also present in node context and vice versa. If so, it calculates the usage ratio between the two, and if that is smaller than 0.01 it adds a note about that to the comment. It also writes a lookups_update.log file that has all tag:keys covered with a usage >= 5. Doing so, I see there are quite a few key:tags that are (almost) completely unused:
I see also the route* keys having zero way usage, for example:
But I assume that these key:tags are used for route relations...
For key:tags in both node and way context I think it is better to look at the ratio.
I think everything in the order of 0.0xx% is a tagging mistake and can be removed for sure. I also had a look at what is missing and I see:
The maxspeed tags have a space, and I guess that is not supported. Not all should be added, but adding quite a few as aliases is, I think, a good idea. |
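The ratio check described in this comment can be sketched as follows. This is not the script mentioned above, only a minimal illustration. It assumes lookups.dat entries have the form `key;count value` (as in the "living_street;0000000404 yes" example) grouped under `---context:way` / `---context:node` headers, and it takes usage counts from a caller-supplied dictionary (`usage`, hypothetical) instead of querying taginfo.

```python
def parse_lookups(text):
    """Return {context: [(key, value), ...]} from lookups.dat-style text.

    Assumed format: '---context:<name>' starts a section and each entry
    looks like 'key;0000000404 value [alias ...]'.
    """
    entries = {}
    context = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("---context:"):
            context = line.split(":", 1)[1]
            entries[context] = []
        elif context is not None and ";" in line and not line.startswith("#"):
            key, rest = line.split(";", 1)
            fields = rest.split()
            if len(fields) >= 2:  # fields[0] is the count, fields[1] the value
                entries[context].append((key, fields[1]))
    return entries


def low_ratio_tags(entries, usage, threshold=0.01):
    """Find key=value pairs listed in both contexts with lopsided usage.

    `usage` maps (context, key, value) -> count. In practice these numbers
    would come from taginfo; here they are supplied by the caller.
    """
    flagged = []
    node_entries = set(entries.get("node", []))
    for key, value in entries.get("way", []):
        if (key, value) not in node_entries:
            continue
        way = usage.get(("way", key, value), 0)
        node = usage.get(("node", key, value), 0)
        if max(way, node) == 0:
            continue  # unused in both contexts, no ratio to compute
        ratio = min(way, node) / max(way, node)
        if ratio < threshold:
            minor = "way" if way < node else "node"
            flagged.append((key, value, minor, ratio))
    return flagged
```

Each flagged tuple names the context with the smaller usage, which is the candidate for removal under the 0.01 rule of thumb from the comment.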
Great investigation.
Could be. I thought it is, but I will keep an eye on it. (In EncodeDecodeTest it comes back as unknown.) |
Yes great work! Is there a real benefit? I found only 1 script with a check on such a tag, it is using this logic ==> or cycleway= cycleway=no|none |
Yes, that is a good idea. I might have done that myself, but if you can do it, perfect. I propose to keep the usage numbers as they are so it is easier to review the changes, and to update the usage numbers only after committing the change; I can provide them using the script I have. I will still review the list of tags once more for tags with a usage between 10 and 100 that can possibly be removed. I would prefer combining things like surface=artificial_turf with surface=earth and surface=ground; that will not only make the list shorter but also the profiles more compact. On access tags, I think yes/designated/official, destination/delivery/customers and private/permit can be combined for routing purposes. On cycleway*=no none: the parser assigns all values given in lookups.dat, but if there is a key:tag whose key is in lookups.dat but whose tag is not, it is automatically assigned to the unknown category. So if you leave out no/none/private, these will become part of unknown, and the question is whether you want that. With no/none/private kept separate, you can choose to handle them differently in a profile. |
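The fallback described above can be illustrated with a small sketch. It is not BRouter's parser, only a model of the behaviour as described in this comment; the table contents and the function name `resolve` are made up for illustration, and treating unlisted keys as ignored is an assumption.

```python
# Hypothetical excerpt of known values per key, standing in for lookups.dat.
KNOWN_VALUES = {
    "cycleway": {"lane", "track", "no", "none"},
}


def resolve(key, value, known=KNOWN_VALUES):
    """Model of the described lookup behaviour.

    Returns the value itself if it is listed for the key, "unknown" if the
    key is listed but the value is not, and None if the key is not listed
    at all (assumed here to mean the tag is ignored).
    """
    if key not in known:
        return None
    return value if value in known[key] else "unknown"
```

With "no" listed, `resolve("cycleway", "no")` returns "no", so a profile can tell it apart from an unlisted value. If "no" is dropped from the table, both collapse into "unknown".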
Here is an update on my repository. It doesn't contain conditional tags. maxspeed=30 mph or similar is escaped for ways. Please see |
Nice piece of work! I would like to take a bit more time to review things, likely in the weekend. |
Thanks for the explanation about "cycleway=no" and the "unknown" value! I checked my profile with lookup.dat V11, and had to change: My script is now error free, but to start some tests, I have to wait until rd5's for lookup version 11 become available |
@polyscias @EssBee59 |
No time to test in the next 2 weeks, so I prefer to wait for the rd5's when available |
The header says 131 additions and 102 deletions, so more added than deleted ;-) Let me start with three more requests: I have been running for a long time with a profile that has support for traffic signals, see #183, but analyzing some routes I saw that they were different than expected because of a highway=traffic_signals that is different, and for that there is traffic_signals=. I am interested in the difference between a "normal" traffic light and things like blinker/blink_mode/continuous_green and tram_priority/bus_priority. Looking at taginfo and grouping things in 4 groups:
One other key that can be used to detect how popular a cycle route is: monitoring:bicycle. A "boolean" (node context) is enough:
For tracktype=4, can we add "grade3-5" as alias? That is mapped 917 times. On the deletion of (almost) unused tags: I saw the commit also updates trekking.brf, as that had else oneway=yes|true|1 in it; that "true" has been deleted because the usage was 0, so good catch! I am updating my script to check what the impact of the deletion would be on other profiles I know. I would still like to consider removing highway tags with the wrong context:
Searching for examples, I found JOSM already warns about them: The blue way has the same problem as the problem node... |
All the other standard profiles. |
Thanks for incorporating what I wrote in my previous post in 17c9873! I have updated the script to parse the profiles I know, should have done that earlier ;-) Most "pain" is with:
i.e. all profiles I know of. less "pain" with:
So the "network update", oneway=true (zero usage) and sac_scale=T1-hiking (zero usage). On the "network update", with everything mapped to network=[inrl][cw]n:
Most other warnings I see for water related tag for @afischerdev (understood ;-) but also:
So profiles of @poutnikl, @utack and @ThomasTraber On further cleaning up: Instead of describing what I think can be done I took the last version of lookups.dat and made the changes in it:
The updated lookups.dat: https://gist.github.com/polyscias/358055cdad801b7b62b248310e09fddd |
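For reference, the pattern `network=[inrl][cw]n` from this comment, read as a regular-expression character class, covers eight values. The sketch below enumerates them and checks a value against the pattern; the names `NETWORK_VALUES` and `is_mapped_network` are made up for illustration and do not come from BRouter.

```python
import re
from itertools import product

# Scope letter (i/n/r/l), then c (cycling) or w (walking), then "n".
NETWORK_PATTERN = re.compile(r"[inrl][cw]n")

# Every value the pattern accepts, built from the same three positions.
NETWORK_VALUES = ["".join(parts) for parts in product("inrl", "cw", "n")]


def is_mapped_network(value):
    """Return True only if the whole value matches the pattern."""
    return NETWORK_PATTERN.fullmatch(value) is not None
```

`NETWORK_VALUES` evaluates to icn, iwn, ncn, nwn, rcn, rwn, lcn and lwn. Using `fullmatch` means multi-value strings such as "rcn;lcn" are rejected rather than partially matched.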
Great, this was the pointer I needed. My mistake, I hadn't expected hardcoded tags here. So we have only the smaller changes for 'oneway=true (zero usage) and sac_scale=T1-hiking (zero usage)' and more. |
Made the rewind and started with more cleaning:
|
I took an older version of netherlands.osm.pbf that I still had lying around and ran mapcreation on it without and with the latest changes documented in this issue. Then I took from the log of both runs the codec stats for E5_N50
Not sure what all these stats mean, but overall there is a small decrease in size. I saw that the mapcreation with a small patch:
dumps the way and node stats for the data processed, filtered through lookups.dat. After that the NodeFilter is still run. I want to see if I can run planet.osm.pbf and get stats on the nodes remaining after this filtering; maybe that gives more unused tags, as only nodes connected to the way network are filtered out. It would also be good to get some feedback from profile developers, especially @poutnikl I think. |
I found another strange situation using the rd5 from afischerdev: As explained, due to errors or weaknesses in the temporary rd5 files, tests are much more difficult. |
@EssBee59 |
I see that segment4_hessen.zip contains E5_N50.rd5, E5_N45.rd5 and E10_N50.rd5 but no lookups.dat, so likely there is a disconnect between the lookups.dat that was used to generate the .rd5's and the one @EssBee59 is using. If that is not the case, can it be because we have the number for new keys/vals set to 00000001? |
@polyscias |
Yes, a retest with the right lookups.dat looks much better! (of course "elevation" is not available in this test version) |
The discussion seems to be over. So it should become real. What do you think? |
Add a check for |
Hi, I just wanted to ask: Is this issue still being worked on? I’d love to see the Edit: Nevermind, I just found issue #458. |
Starting with #414, we would like to update lookups.dat and start the collection
Add 'cycleway:both' (#414)
Add 'cycleway:lane' (#398)
see wiki https://wiki.openstreetmap.org/wiki/Key%3Acycleway%3Alane
Change 'highway=no' (#402)
Change 'traffic_calming' removed * (1x)