Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PgCat Gets Confused With Multiple Unnamed Prepared Statements #875

Open
tneely opened this issue Nov 21, 2024 · 2 comments
Open

PgCat Gets Confused With Multiple Unnamed Prepared Statements #875

tneely opened this issue Nov 21, 2024 · 2 comments

Comments

@tneely
Copy link

tneely commented Nov 21, 2024

Describe the bug
There's a bug in PgCat around parsing and binding of multiple unnamed prepared statements.

PgCat saves prepared statements in a HashMap<String, (Arc<Parse>, u64)>, where the key is the name of the prepared statement. It also parses and buffers all extended queries until it receives a Sync, at which point the queries are sent down to the server.

Because PgCat renames and dedupes the prepared statements it sends to the server, it first must make sure the server has all the prepared statements it needs. This is problematic in the case of unnamed prepared statements.

Take the following extended query:

  1. Parse("", "CREATE TABLE my_table")
  2. Bind("")
  3. Execute
  4. Parse("", "SELECT * FROM my_table")
  5. Bind("")
  6. Execute
  7. Sync

When PgCat receives this query it renames and buffers the first parse into its hashmap: { "": ("PGCAT_0", "CREATE TABLE my_table") }.

On the second parse it renames the parse, buffers it, and overwrites its map: { "": ("PGCAT_1", "SELECT * FROM my_table") }.

Now we get the sync and start working through the buffer.

On the first parse: we check to see if the server already has PGCAT_0. In our case it does not and buffers a Parse("PGCAT_0", "CREATE TABLE my_table") to send down.

On the first bind: we need to make sure the prepared statement we're using is already on the server. We look up "" in our hashmap and find ("PGCAT_1", "SELECT * FROM my_table"). PGCAT_1 is not on the server, so we first do a Parse("PGCAT_1", "SELECT * FROM my_table").

But wait! We just sent down the second prepared statement when we were still working on the first. Now the server will fail to parse PGCAT_1 (the table doesn't exist yet!), and the whole extended query will fail.

Now if the second query wasn't dependent on the first (or just didn't fail to parse in general), we should still get correct results. This is because we still buffer the correct Bind("PGCAT_1") and Bind("PGCAT_1") messages to send down to the server.

To Reproduce
Send an extended protocol query with the following:

  1. Parse("", "CREATE TABLE my_table")
  2. Bind("")
  3. Execute
  4. Parse("", "SELECT * FROM my_table")
  5. Bind("")
  6. Execute
  7. Sync

Read the response and see that it fails.

Expected behavior
PgCat handles unnamed prepared statements correctly and parses / binds them in the correct order.

Screenshots
N/A

Desktop (please complete the following information):

  • OS: aarch64 / AL2
  • Browser: N/A
  • Version: Postgres 16 / PgCat 1.2.0

Additional context
Add any other context about the problem here.

@zainkabani
Copy link
Contributor

Hi Taylor, this pattern of using multiple Parse statements before a single sync is not something I've seen in the wild. Is this something your ORM is doing? or are you specifically sending? are you able to send syncs in between the parses?

@tneely
Copy link
Author

tneely commented Dec 13, 2024

I've seen it used by JDBC specifically in the context of Liquibase update. This is with autocommit on:

PostgreSQL
    Type: Parse
    Length: 13
    Statement: 
    Query: BEGIN
    Parameters: 0
PostgreSQL
    Type: Bind
    Length: 12
    Portal: 
    Statement: 
    Parameter formats: 0
    Parameter values: 0
    Result formats: 0
PostgreSQL
    Type: Execute
    Length: 9
    Portal: 
    Returns: all rows
PostgreSQL
    Type: Parse
    Length: 4542
    Statement: 
    Query [truncated]: select string_agg(word, ',') from pg_catalog.pg_get_keywords() where word <> ALL ('{a,abs,absolute,action,ada,add,admin,after,all,allocate,alter,always,and,any,are,array,as,asc,asensitive,assertion,assignment,asymmetric
    Parameters: 0
PostgreSQL
    Type: Bind
    Length: 12
    Portal: 
    Statement: 
    Parameter formats: 0
    Parameter values: 0
    Result formats: 0
PostgreSQL
    Type: Describe
    Length: 6
    Portal: 
PostgreSQL
    Type: Execute
    Length: 9
    Portal: 
    Returns: all rows
PostgreSQL
    Type: Sync
    Length: 4

You can see it does an unnamed parse for BEGIN followed by the select.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants