-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Precompute Centroid for polygon areas #7418
Conversation
ed50a8b
to
1fec859
Compare
Review app https://teaching-vacancies-review-pr-7418.test.teacherservices.cloud was successfully deleted |
@@ -18,18 +18,27 @@ def call | |||
geometry = feature["geometry"].to_json |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have to run this by hand? I couldn't find anywhere where we're running this script. I'm also worried about the 99 polygons in current set vs 247 ish in production - it looks like old ones are never removed - although of course some are needed as they are linked to by landing pages
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also 4326 is everywhere (and scary and wrong) - is it worth at least creating a constant and documenting our usage of it as we currently understand it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The API name is out of date (for Cities) - it should be Major_Towns_and_Cities_Dec_2015_Boundaries_V2_2022 IMHO
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@starswan good points.
On the when is this running? It is scheduled to run every Sunday with the ImportPolygonDataJob
.
Also noticed the current set is way smaller and reached the same conclusion: Old ones are never removed (as the whole script only "find or creates".
I am thinking of adding a rake task to compute the centroid for those with an area and missing centroid, what should cover the legacy existing polygons.
Seems the 4326
gets added automatically by Rails when adding the migration into the schema. Will try to see if I find any option where this can be tweaked: add_column :location_polygons, :centroid, :st_point, geographic: true
I'm going to extend a bit the scope of this ticket and clarify:
- Need to add a task to compute the legacy location polygons area centroid.
- What do we want to do with legacy location polygons? (This may need product team input)
- Over the new 98 polygons or so on the API calls, there are over 20 without area data (what do we do with those? do we want them at all?).
Can we tweak theFound that it defines geographical data. We will explore migrating from geographical to geometrical data in a separate ticket.srid
in the DB fields migrations?- Update the out of date polygon API data sources.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weird the composite polygons have neither area nor centroid, so I'm not sure how that is going to work...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Description: This file contains the digital vector boundaries for the Major Towns and Cities in England and Wales, as at December 2015. Version 2 includes centroid data in the attributes table. [yummy!]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@starswan I've been studying migrating all the endpoints to the ones that provide LAT/LONG
data for their centroid.
While this is accurate (compared them with our computed ones) for the Major Towns and Cities endpoint, for counties and regions endpoints the LAT/LONG is rounded up to the closes non-fractional number, turning the provided centroid as heavily inaccurate (multiple counties get the same centroid info, for example).
So I'm going to discard the attempt to pull this info from the APIs and keep computing it each time we update the location polygons, as it happens in the background weekly and turned out to be more precise than the available data on the ONS.
1fec859
to
7d50e00
Compare
The centroid is used for calculating the distances on the vacancy searches. The centroid for a polygon area doesn't change unless the area changes too. As we dinamically calculate it for every location polygon area against every vacancy during the searches, if we pre-compute it we save the DB to execute again the centroid calculation on every search.
The task will calculate and populate the centroid of the location polygons with an area but not stored centroid info.
7d50e00
to
2f6b796
Compare
Trello ticket
https://trello.com/c/cf3f17rU/1528-use-precomputed-centroid-of-locationpolygons-areas
Description
The centroid is used to calculate the distances on the vacancy searches.
The centroid for a polygon area doesn't change unless the area changes too.
As we dynamically calculate it for every location polygon area against every vacancy during the searches, if we pre-compute it we save the DB to execute again the centroid calculation on every search.
This is the first step in pre-computing and storing it in DB.
Further work will follow: