test: Test for columns in `dbQuoteIdentifier()` #372

krlmlr · 2021-12-05T06:54:01Z

Closes #263.

krlmlr · 2021-12-05T07:07:21Z

@dpprdan: Checks are failing now, could you please take a look?

dpprdan · 2021-12-24T09:21:17Z

👍 I'll look into it over the holidays.

This reverts commit 6bd835f.

dpprdan · 2021-12-28T13:26:09Z

The checks fail here

With the current CRAN release and an empty database, this looks like this:

library(RPostgres)
con <- postgresDefault()
objects <- dbListObjects(con)

(sql <- lapply(objects$table, dbQuoteIdentifier, conn = con))
#> [[1]]
#> <SQL> "information_schema".
#> 
#> [[2]]
#> <SQL> "pg_catalog".
(unquoted <- vapply(sql, dbUnquoteIdentifier, conn = con, list(1)))
#> [[1]]
#> <Id> schema = information_schema
#> 
#> [[2]]
#> <Id> schema = pg_catalog
testthat::expect_equal(unquoted, unclass(objects$table))

Note however the period (“.”) behind the quoted schemata.

sql[[1]] 
#> <SQL> "information_schema".

This isn’t a valid schema identifier in Postgres.

(qid <- dbQuoteIdentifier(con, Id(schema = "myschema")))
#> <SQL> "myschema".
create_schema_sql <- paste0("CREATE SCHEMA IF NOT EXISTS ", qid)
dbExecute(con, create_schema_sql)
#> Error: Failed to prepare query: ERROR:  syntax error at or near "."
#> LINE 1: CREATE SCHEMA IF NOT EXISTS "myschema".
#>                                               ^

So it seems to me that the test is broken. I don’t think that it can be fixed for schemata either, because the quote-unquote roundtrip cannot be made to work for just schemata. The heuristics employed in dbUnquoteIdentifier() to identify the database objects from a SQL() string can only work for one object class and we chose that to be tables.
The test can probably be fixed for tables if we write a table to the db and then test whether the quote-unquote roundtrip works for that (i.e. only for the objects[!objects$is_prefix] subset). But maybe this test exists already elsewhere?

dbDisconnect(con)

Session info

sessioninfo::session_info()
#> - Session info ---------------------------------------------------------------
#>  setting  value
#>  version  R version 4.1.2 (2021-11-01)
#>  os       Windows 10 x64 (build 19043)
#>  system   x86_64, mingw32
#>  ui       RTerm
#>  language en
#>  collate  German_Germany.1252
#>  ctype    German_Germany.1252
#>  tz       Europe/Berlin
#>  date     2021-12-28
#>  pandoc   2.16.2 @ C:/PROGRA~1/Pandoc/ (via rmarkdown)
#> 
#> - Packages -------------------------------------------------------------------
#>  package     * version date (UTC) lib source
#>  backports     1.4.1   2021-12-13 [1] CRAN (R 4.1.2)
#>  bit           4.0.4   2020-08-04 [1] CRAN (R 4.1.0)
#>  bit64         4.0.5   2020-08-30 [1] CRAN (R 4.1.0)
#>  blob          1.2.2   2021-07-23 [1] CRAN (R 4.1.0)
#>  cli           3.1.0   2021-10-27 [1] CRAN (R 4.1.1)
#>  crayon        1.4.2   2021-10-29 [1] CRAN (R 4.1.1)
#>  DBI           1.1.2   2021-12-20 [1] CRAN (R 4.1.2)
#>  desc          1.4.0   2021-09-28 [1] CRAN (R 4.1.1)
#>  digest        0.6.29  2021-12-01 [1] CRAN (R 4.1.2)
#>  ellipsis      0.3.2   2021-04-29 [1] CRAN (R 4.1.0)
#>  evaluate      0.14    2019-05-28 [1] CRAN (R 4.1.0)
#>  fansi         0.5.0   2021-05-25 [1] CRAN (R 4.1.0)
#>  fastmap       1.1.0   2021-01-25 [1] CRAN (R 4.1.0)
#>  fs            1.5.2   2021-12-08 [1] CRAN (R 4.1.2)
#>  generics      0.1.1   2021-10-25 [1] CRAN (R 4.1.1)
#>  glue          1.6.0   2021-12-17 [1] CRAN (R 4.1.2)
#>  highr         0.9     2021-04-16 [1] CRAN (R 4.1.0)
#>  hms           1.1.1   2021-09-26 [1] CRAN (R 4.1.1)
#>  htmltools     0.5.2   2021-08-25 [1] CRAN (R 4.1.1)
#>  knitr         1.37    2021-12-16 [1] CRAN (R 4.1.2)
#>  lifecycle     1.0.1   2021-09-24 [1] CRAN (R 4.1.1)
#>  lubridate     1.8.0   2021-10-07 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.1   2020-11-17 [1] CRAN (R 4.1.0)
#>  pillar        1.6.4   2021-10-18 [1] CRAN (R 4.1.1)
#>  pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.1.0)
#>  pkgload       1.2.4   2021-11-30 [1] CRAN (R 4.1.2)
#>  purrr         0.3.4   2020-04-17 [1] CRAN (R 4.1.0)
#>  R.cache       0.15.0  2021-04-30 [1] CRAN (R 4.1.0)
#>  R.methodsS3   1.8.1   2020-08-26 [1] CRAN (R 4.1.0)
#>  R.oo          1.24.0  2020-08-26 [1] CRAN (R 4.1.0)
#>  R.utils       2.11.0  2021-09-26 [1] CRAN (R 4.1.1)
#>  R6            2.5.1   2021-08-19 [1] CRAN (R 4.1.1)
#>  Rcpp          1.0.7   2021-07-07 [1] CRAN (R 4.1.0)
#>  reprex        2.0.1   2021-08-05 [1] CRAN (R 4.1.0)
#>  rlang         0.4.12  2021-10-18 [1] CRAN (R 4.1.1)
#>  rmarkdown     2.11    2021-09-14 [1] CRAN (R 4.1.1)
#>  RPostgres   * 1.4.3   2021-12-20 [1] CRAN (R 4.1.2)
#>  rprojroot     2.0.2   2020-11-15 [1] CRAN (R 4.1.0)
#>  rstudioapi    0.13    2020-11-12 [1] CRAN (R 4.1.0)
#>  sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.1.2)
#>  stringi       1.7.6   2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr       1.4.0   2019-02-10 [1] CRAN (R 4.1.0)
#>  styler        1.6.2   2021-09-23 [1] CRAN (R 4.1.1)
#>  testthat      3.1.1   2021-12-03 [1] CRAN (R 4.1.2)
#>  tibble        3.1.6   2021-11-07 [1] CRAN (R 4.1.2)
#>  utf8          1.2.2   2021-07-24 [1] CRAN (R 4.1.0)
#>  vctrs         0.3.8   2021-04-29 [1] CRAN (R 4.1.0)
#>  withr         2.4.3   2021-11-30 [1] CRAN (R 4.1.2)
#>  xfun          0.29    2021-12-14 [1] CRAN (R 4.1.2)
#>  yaml          2.2.1   2020-02-01 [1] CRAN (R 4.1.0)
#> 
#>  [1] C:/Users/Daniel.AK-HAMBURG/Documents/R/win-library/4.1
#>  [2] C:/Program Files/R/R-4.1.2/library
#> 
#> ------------------------------------------------------------------------------

krlmlr · 2021-12-28T18:17:56Z

Thanks. I need to take a look at the workflows.

krlmlr · 2021-12-28T19:23:41Z

Can you please bring in the latest changes from main to fix the workflows?

dpprdan · 2021-12-28T21:46:37Z

The test can probably be fixed for tables if we write a table to the db and then test whether the quote-unquote roundtrip works for that (i.e. only for the objects[!objects$is_prefix] subset). But maybe this test exists already elsewhere?

😬 Ha, I guess I should've just read the rest of the file. 🤦🏻‍♂️

BTW From reading the code, I'd say that RMariaDB has the same problem(s) (both the "cannot quote columns" and the "terminal dot" issue).

Current status: trying to get RMariaDB::mariadbDefault() to run with a MariaDB Docker container on Windows.

dpprdan · 2021-12-30T21:05:55Z

Sure enough I found a case where we need to unquote a schema only, namely in dbListObjects() with a prefix.
So this PR unfortunately breaks dbListObjects() with a SQL()-type prefix at the moment.

library(RPostgres)
con <- postgresDefault()
schema_qid <- dbQuoteIdentifier(con, Id(schema = 'Robert'))

The following doesn’t work with v1.4.3, due to the terminal dot

dbExecute(con, paste0('CREATE SCHEMA IF NOT EXISTS ', schema_qid))
#> [1] 0

We need to add a table for dbQuoteIdentifier() to find the schema, see #388

table_qid <- dbQuoteIdentifier(con, Id(schema = 'Robert', table = "mtcars"))
dbWriteTable(con, table_qid, mtcars)

The following works in v1.4.3 (but relies on the terminal dot to unquote the prefix as schema, not table)

dbListObjects(con, schema_qid)
#> Error in mapply(FUN = f, ..., SIMPLIFY = FALSE): zero-length inputs cannot be mixed with those of non-zero length

It works with an Id() object (which just passes through dbUnquoteIdentifier()).

dbListObjects(con, Id(schema = 'Robert'))
#>                                  table is_prefix
#> 1 <Id> schema = Robert, table = mtcars     FALSE

# clean-up
dbExecute(con, paste0('DROP TABLE ', table_qid))
#> [1] 0
dbExecute(con, paste0('DROP SCHEMA "Robert"'))
#> [1] 0
dbDisconnect(con)

I don’t know how to fix this ATM.

krlmlr · 2022-01-03T05:56:49Z

Thanks. It looks to me like dbListObjects() should always be called with an Id(), never with a SQL() . The behavior is misspecified. Should we adapt the list_objects_features test?

dpprdan · 2022-01-04T09:13:19Z

It looks to me like dbListObjects() should always be called with an Id(), never with a SQL().

That would be an easy fix, indeed, and way cleaner than all the hacks I came up with until now. I hope we're not missing something.

Can anything else than a schema be passed to prefix?

RPostgres/R/dbListObjects_PqConnection_ANY.R

Lines 22 to 26 in 5eced0c

    
           unquoted <- dbUnquoteIdentifier(conn, prefix) 
        
           is_prefix <- vlapply(unquoted, function(x) { 
        
             "schema" %in% names(x@name) && !("table" %in% names(x@name)) 
        
           }) 
        
           schemas <- vcapply(unquoted[is_prefix], function(x) x@name[["schema"]])

Because this looks like one could also specify (schema-qualified) table(s) ~~(and then only the schema is used), but I don't understand the use case that prompted this design.~~ Update: I misread the code. This is to only select schemas, not (possibly schema-qualified) tables. So I guess the answer is that schemata and tables can be passed to prefix, but only schemas are used.

It also looks like it is supposed to be vectorized (v*apply()), but DBI::dbQuoteIdentifier() isn't AFAIU (x must be SQL or Id), so dbListObjects() isn't, at least not with a prefix.

- we cannot get quote-unquote roundtrip to work for schemata, see r-dbi/RPostgres#372

krlmlr

Thanks. It looks like dbQuoteIdentifier() with incomplete Id() (e.g., only "catalog" and "schema") used to leave a trailing dot in the identifier string, but no longer does. As the tests don't fail, I assume this isn't tested anywhere 😱 . Is this even specified?

We could go for a breaking change, but a DBItest specification and test would be great. We could also explore the use of NA to omit (or include?) the trailing dot. If NA means omit trailing dot, we could be fully backward compatible, with a slightly (but not entirely) awkward interface.

dpprdan · 2022-12-06T13:07:16Z

It looks like dbQuoteIdentifier() with incomplete Id() (e.g., only "catalog" and "schema") used to leave a trailing dot in the identifier string, but no longer does.

It still does in RPostgres:

RPostgres/R/dbQuoteIdentifier_PqConnection_Id.R

Line 12 in fa3466f

ret <- paste0(ret, dbQuoteIdentifier(conn, x@name[["schema"]]), ".")

And in RMariaDB.

It's not present in DBI (and never was IIUC, at least not since r-dbi/DBI@6c111b7)

library(DBI)
dbQuoteIdentifier(ANSI(), Id(schema = "myschema"))
#> <SQL> "myschema"

This PR includes a potential fix for RPostgres

https://github.com/r-dbi/RPostgres/pull/372/files#diff-b87396211065ce77f8649cc9f6daadb09f90ee9f7d3cfe4cd4b8dfdc377f616fR8-R9

It is all a bit confusing because the "(can't) quote columns" and "trailing dot" issues and PRs are conflated here (as well as over at RMariaDB). Maybe it's best to separate those from each other?

As the tests don't fail, I assume this isn't tested anywhere 😱 .

My guess would be that the implicit assumption has always been that IDs contain/refer to a table name.

We could go for a breaking change

Not sure I follow. Breaking change with regard to the trailing dot? AFAICT we're not breaking something that used to work before in RPostgres or RMariaDB. So this would merely be a bugfix?

but a DBItest specification and test would be great.

I'll have a look.

dpprdan · 2022-12-06T20:30:50Z

In any case, adding support for columns as quoted identifiers is not a breaking change, but rather in accordance with the DBItest spec 👀

Quoted identifiers can be used as table and column names in SQL queries (source)

but a DBItest specification and test would be great.

There aren't any quote-identifier specs and tests at the moment and I am having a hard time coming up with meaningful ones for this problem.

We don't want to do (SQL) string comparisons, because quoting is implementation-specific and therefore belong in the particular backend packages.
We cannot use the quote-unquote Roundtrip for Id()s that do not reference tables.
The trailing dot is a particular bug, but not spec-relevant per se. Even though I am not aware of a DB that uses something other than the dot as a object name separator (cf. MariaDB, Postgres), but in principle it could be implementation-specific, I guess.

So I'd say that the tests belong in the backend packages rather than in DBItest. Am I missing something?

The MariaDB docs mention many more identifiers (and a similar list applies to Postgres as well - MariaDB is missing schema for example):

Databases, tables, indexes, columns, aliases, views, stored routines, triggers, events, variables, partitions, tablespaces, savepoints, labels, users, roles, are collectively known as identifiers

Shouldn't we support/allow those as well?

krlmlr · 2023-03-16T05:21:51Z

Can you please merge the latest changes here? Looks like I'm not permitted to do this in this PR.

aviator-app · 2023-11-09T11:27:10Z

Current Aviator status

Aviator will automatically update this comment as the status of the PR changes.
Comment /aviator refresh to force Aviator to re-examine your PR (or learn about other /aviator commands).

This PR was merged manually (without Aviator). Merging manually can negatively impact the performance of the queue. Consider using Aviator next time.

See the real-time status of this PR on the Aviator webapp.

Use the Aviator Chrome Extension to see the status of your PR within GitHub.

re r-dbi/DBI#417

dpprdan · 2023-12-21T13:45:58Z

Fixes #453 too. You may want to adjust the title and description, @krlmlr (since you're the PR owner).

krlmlr · 2024-04-01T13:56:29Z

Thanks. This now has been implemented independently, but the tests are good.

aviator-app · 2024-04-01T14:10:24Z

This pull request failed to merge: PR cannot be automatically rebased, please rebase manually to continue. Once the issues are resolved, remove the blocked label and re-queue the pull request. Note that the pull request will be automatically re-queued if it has the mergequeue label.

Additional debug info: Failed to rebase this PR onto the latest changes from the base branch. You will probably need to rebase this PR manually and resolve conflicts).

dbQuoteIdentifier: allow columns

6bd835f

krlmlr added this to the 1.4.1 milestone Dec 5, 2021

krlmlr modified the milestones: 1.4.1, 1.4.2, 1.4.3 Dec 5, 2021

dpprdan added 2 commits December 27, 2021 10:30

Revert "dbQuoteIdentifier: allow columns"

ecaa6cf

This reverts commit 6bd835f.

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

4e4139f

allow columns in dbQuoteIdentifier()

c6bb651

dpprdan added 2 commits December 28, 2021 22:17

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

ff54dd4

qualify na.omit()

c0aca74

dpprdan mentioned this pull request Dec 29, 2021

dbQuoteIdentifier() does not allow columns in Id() object. r-dbi/RMariaDB#254

Closed

dpprdan mentioned this pull request Jan 3, 2022

test: Test for quoting columns with dbQuoteIdentifier() r-dbi/RMariaDB#255

Merged

dpprdan added a commit to dpprdan/DBItest that referenced this pull request Jan 28, 2022

test dbQuoteIdentifier roundtrip for tables only

ae1b7a3

- we cannot get quote-unquote roundtrip to work for schemata, see r-dbi/RPostgres#372

This was referenced Jan 28, 2022

feat: dbQuoteIdentifier() roundtrip is tested for tables only r-dbi/DBItest#256

Merged

feat!: dbListObjects() only allows Id() objects as prefix argument #390

Merged

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

f3de133

krlmlr mentioned this pull request Dec 5, 2022

Specify dbQuoteIdentifier() for incomplete Id() r-dbi/DBItest#290

Open

krlmlr commented Dec 5, 2022

View reviewed changes

dpprdan added 2 commits March 16, 2023 09:30

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

b22b73a

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

dc98731

dpprdan added 5 commits December 20, 2023 17:28

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

a865888

Merge branch 'main' into fix/col_in_dbQuoteIdentifier

0f62fb0

allow quoting of unnamed Id() components

6daebc7

re r-dbi/DBI#417

bump DBI and DBItest versions

81eb472

add Remotes: for DBItest

26d9862

dpprdan force-pushed the fix/col_in_dbQuoteIdentifier branch from e58c405 to 26d9862 Compare December 21, 2023 11:47

dpprdan mentioned this pull request Dec 21, 2023

unnamed Id() breaks dbQuoteIdentifier() #453

Closed

remove Remotes: r-dbi/DBItest

153ec24

krlmlr changed the title ~~dbQuoteIdentifier: allow columns~~ feat: Allow columns in dbQuoteIdentifier() Apr 1, 2024

krlmlr changed the title ~~feat: Allow columns in dbQuoteIdentifier()~~ test: Test for columns in dbQuoteIdentifier() Apr 1, 2024

krlmlr added the mergequeue label Apr 1, 2024

aviator-app bot added the blocked label Apr 1, 2024

krlmlr merged commit 4c99316 into r-dbi:main Apr 1, 2024
16 of 17 checks passed

dpprdan deleted the fix/col_in_dbQuoteIdentifier branch April 3, 2024 16:38

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test: Test for columns in `dbQuoteIdentifier()` #372

test: Test for columns in `dbQuoteIdentifier()` #372

krlmlr commented Dec 5, 2021

krlmlr commented Dec 5, 2021

dpprdan commented Dec 24, 2021

dpprdan commented Dec 28, 2021

krlmlr commented Dec 28, 2021

krlmlr commented Dec 28, 2021

dpprdan commented Dec 28, 2021 •

edited

Loading

dpprdan commented Dec 30, 2021

krlmlr commented Jan 3, 2022

dpprdan commented Jan 4, 2022 •

edited

Loading

krlmlr left a comment

dpprdan commented Dec 6, 2022 •

edited

Loading

dpprdan commented Dec 6, 2022

krlmlr commented Mar 16, 2023

aviator-app bot commented Nov 9, 2023 •

edited

Loading

dpprdan commented Dec 21, 2023

krlmlr commented Apr 1, 2024

aviator-app bot commented Apr 1, 2024

test: Test for columns in dbQuoteIdentifier() #372

test: Test for columns in dbQuoteIdentifier() #372

Conversation

krlmlr commented Dec 5, 2021

krlmlr commented Dec 5, 2021

dpprdan commented Dec 24, 2021

dpprdan commented Dec 28, 2021

krlmlr commented Dec 28, 2021

krlmlr commented Dec 28, 2021

dpprdan commented Dec 28, 2021 • edited Loading

dpprdan commented Dec 30, 2021

krlmlr commented Jan 3, 2022

dpprdan commented Jan 4, 2022 • edited Loading

krlmlr left a comment

Choose a reason for hiding this comment

dpprdan commented Dec 6, 2022 • edited Loading

dpprdan commented Dec 6, 2022

krlmlr commented Mar 16, 2023

aviator-app bot commented Nov 9, 2023 • edited Loading

Current Aviator status

dpprdan commented Dec 21, 2023

krlmlr commented Apr 1, 2024

aviator-app bot commented Apr 1, 2024

test: Test for columns in `dbQuoteIdentifier()` #372

test: Test for columns in `dbQuoteIdentifier()` #372

dpprdan commented Dec 28, 2021 •

edited

Loading

dpprdan commented Jan 4, 2022 •

edited

Loading

dpprdan commented Dec 6, 2022 •

edited

Loading

aviator-app bot commented Nov 9, 2023 •

edited

Loading