-
Notifications
You must be signed in to change notification settings - Fork 4
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delegate details of connection handling #18
Comments
My first thought is to add a level of abstraction. We can grow a source that only provides this ability, namely to manage the connection information, and just pass that down to the actual source implementations. For example:
This gives us a layer where we can toy with threads or working with the connection pool for how we want to dole out the connections to each source, eg whether they all run in the same transaction, share a connection, or each get separate ones, and it prevents upstream from having to know about In our case this can replace the |
What bothers me about joining up the What do you think about keeping with ConnectionManager(all_connections_config) as connmgr:
# enumerate tiles, etc...
source(connmgr, tile) Then, within the def __call__(connmgr, tile):
with connmgr.database_cursor() as cur:
# do stuff with cur How does that sound to you? |
I agree here, and was only suggesting that we effectively get rid of the multisource, and just have a single databasesource, with each of the others being table sources, only reason being that we wouldn't need it any more. If we find ourselves having multiple other sources we can always introduce it and have both, a multisource and a separate databasesource. Or we could just keep it around and not use it. I'm not bothered by these having separate signatures, because nobody is going to be interfacing with the table sources directly from a higher level. And if that's ever required, it just needs to be wrapped in a databasesource, which isn't too cumbersome, and makes sense conceptually as well, ie to talk to a table you need to go through a database first. The existing sources are effectively mapped to tables anyway. Considering the future, if other sources are added, it means that they'll need to get a connmgr, even if they don't need one, which just feels like the details of one implementation have leaked into the api. Continuing with that reasoning, it could also imply that the connmgr might need to grow additional concerns for new implementations, and then we'd need to be aware of all the other implementation's usage of it any time changes are made to it.
This will certainly work in our case for managing connections, but I don't think it's as optimal as moving the concern in a layer above. In effect, it's pulling what it needs, instead of having it pushed to it, which means it knows about more than it might have to otherwise. Additionally, if we find that we want to run each source in a separate thread, we would still need something to coordinate that at a higher level anyway, and then the context manager as a whole needs to be thread safe because calls are being made into it from the sources. I want to point out that I'm playing devil's advocate and this is largely academic, because at this point we only have a single type of implementation that's single threaded. But it's fun to bike shed! So, if the conn mgr still makes more sense to you, go for it! :) |
This connection management is ugly, and forces the calling code to know about connections. This should be delegated and encapsulated in some other object which can handle connection setup and re-use, and can handle connection types other than
psycopg2
. This is slightly complicated because theSource
knows where it wants to get the data from, but this has to be pre-configured for connection sharing. Perhaps this would work best with aConnectionManager
class which collected the sources and introspected them? Or cached connections for re-use?The text was updated successfully, but these errors were encountered: