Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make COPY statement options compliant to Postgres #247

Merged
merged 10 commits into from
Sep 17, 2024
Merged

Conversation

dey4ss
Copy link
Member

@dey4ss dey4ss commented Aug 8, 2024

While fiddling around with a Postgres instance, I noticed that our syntax for COPY statements is not 100% compliant to Postgres. In detail, we used COPY table_a FROM 'path/to/tbl' WITH FORMAT CSV;, whereas Postgres uses COPY table_a FROM 'path/to/tbl' WITH (FORMAT CSV); – there are brackets around the options. Furthermore, the WITH keyword is optional [1].

The incorrect syntax was introduced in #139 (my bad 😕) and it slipped through the review back then. This PR makes the COPY statement compliant to Postgres again. Furthermore, I used the opportunity to add the encoding as another option. This allows us to apply any specified encoding for imported and exported tables in Hyrise.

[1] https://www.postgresql.org/docs/16/sql-copy.html

@dey4ss dey4ss added the bug label Aug 8, 2024
@dey4ss dey4ss requested a review from Bouncner August 8, 2024 14:54
Copy link
Contributor

@Bouncner Bouncner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only tiny things.

Makefile Show resolved Hide resolved
Makefile Show resolved Hide resolved

namespace hsql {

// Name unchanged for compatibility.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't get this comment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dates back to when we added the COPY statement and, with it, the ability to export tables. Even though we use these file types for import and export, we kept the name so people with forks don't have to rename stuff in their code.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thinking is also the reson why I added the encoding as another member of the statement rather than just passing the options object with both file type and encoding.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What name would you have chosen?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the enum? In the grammar, we simply call it file type (so FileType as enum name):

file_type : IDENTIFIER {

I remember that it was a big discussion back then and some insisted on compatibility. Anyway, I think it's kinda late for that now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you briefly mention that we call it FileyType in the grammar, but won't adapt it here?
Not stating what the alternative is (nobody is aware of the probably old discussion) does not give away why the current name might not be perfect. :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to zünd afterwards.

@dey4ss dey4ss merged commit 57c763a into master Sep 17, 2024
4 checks passed
@dey4ss dey4ss deleted the dey4ss/fix_pg_copy branch September 17, 2024 12:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants