Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add reverse geocoding implementation with PostGIS #1

Merged
merged 3 commits into from
Jan 16, 2025

Conversation

emirfabio
Copy link
Contributor

No description provided.

@github-actions github-actions bot added the enhancement New feature or request label Jan 15, 2025
@emirfabio emirfabio changed the title Add reverse geocoding implementation with PostGIS feat: Add reverse geocoding implementation with PostGIS Jan 15, 2025
@spwoodcock spwoodcock self-requested a review January 16, 2025 15:38
"""
self.conn_string = config.get_connection_string()

with importlib.resources.path(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting! I have never used importlib this way - it's a pretty neat approach.

Pathlib is another option, that I typically use:

from pathlib import Path
self.cities_file = Path(__file__).parent / "data" / "cities_1000_simple.txt.gz"

but your approach is probably more immune to project directory restructuring 👍

RuntimeError: If database query fails
"""
# Validate coordinate ranges
if not -90 <= lat <= 90:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice to have this validation 👍

def test_geocoder_initialization():
"""Test that geocoder can initialize and find its data files."""
config = get_test_config()
geocoder = ReverseGeocoder(config)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the initialize method needs to be called here potentially, unless we put the call inside the init method, or perhaps a enter method to be used via context manager

Copy link
Member

@spwoodcock spwoodcock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Amazing work!

This should do nicely for our needs 😄

sync / async

One key extra thing to add is async support (which I'm happy to look at if it doesn't interest you 👍).

I wonder if there is any neat way to have a sync and async version of the same class, without having to duplicate the code adding async/await (might be tricky).

One idea could be to use __enter__ for the sync class and __aenter__ to handle a sync and async initialization separately:

# sync
with ReverseGeocoder as reverse_geocode:
    ...

# async
async with ReverseGeocoder as reverse_geocode:
    ...

Closing the db connection

I think with the current implementation we have a psycopg.connect, but no .close() method called, meaning the connection may remain open?

An advantage of the context manager based approach with ReverseGeocoder as ..., is you can use the __exit__ magic method to do the database closing before the context is closed.

@spwoodcock spwoodcock merged commit 53a7718 into hotosm:main Jan 16, 2025
3 of 5 checks passed
@spwoodcock
Copy link
Member

spwoodcock commented Jan 16, 2025

Created some follow ups!

#2
#3

@emirfabio
Copy link
Contributor Author

Hellooo Sam
Thanks for the feedback!

  1. Pairing the comment on the Initialization and the issue on using an existing connection; is it a possible scenario for a service using this package to be attempting to connect to a db instance that may already have the voronoi table populated by another instance of this package?

I think it'd be nice for the Init function to be both left explicit and have it automatic via context manager. It could run needed checks on the existing data in the db, initialize accordingly and return gracefully.

  1. i'll definitely close the connection hehe
  2. i'll be glad to look into async support

@spwoodcock
Copy link
Member

That's a very good point!

Should we rename the table to something more unique, like pg_nearest_city_geocoding, and then during the initialisation, drop the table? DROP TABLE IF EXISTS pg_nearest_city_geocoding prior to creating?

@spwoodcock
Copy link
Member

Or probably even better is to do a conditional init depending on if the table already exists!

Skip the init if it's already present 😄

What do you think?

@emirfabio
Copy link
Contributor Author

Or probably even better is to do a conditional init depending on if the table already exists!

Yup! That's what I was thinking.

And also yes, I'll rename the table. In fact, the naming in general is all over the place.

We could lean into the NearestCity name everywhere.

Does that work for you ?

@spwoodcock
Copy link
Member

Or probably even better is to do a conditional init depending on if the table already exists!

Yup! That's what I was thinking.

And also yes, I'll rename the table. In fact, the naming in general is all over the place.

We could lean into the NearestCity name everywhere.

Does that work for you ?

Sounds like a plan!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants