Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ExtendedTask and db connections with pool - unexpected behavior - bad_weak_ptr #4145

Open
benhmin opened this issue Oct 15, 2024 · 3 comments

Comments

@benhmin
Copy link

benhmin commented Oct 15, 2024

System details

Browser Version:
using internal RStudio browser (run in viewer pane)

Output of sessionInfo():

R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] pool_1.0.3     promises_1.3.0 future_1.33.2  bslib_0.7.0    shiny_1.9.1   

loaded via a namespace (and not attached):
 [1] bit_4.0.5         jsonlite_1.8.8    compiler_4.4.1    Rcpp_1.0.12       blob_1.2.4        parallel_4.4.1    later_1.3.2      
 [8] jquerylib_0.1.4   globals_0.16.3    fastmap_1.2.0     mime_0.12         R6_2.5.1          tibble_3.2.1      DBI_1.2.2        
[15] pillar_1.9.0      rlang_1.1.3       utf8_1.2.4        cachem_1.1.0      httpuv_1.6.15     fs_1.6.4          sass_0.4.9       
[22] bit64_4.0.5       RSQLite_2.3.6     memoise_2.0.1     cli_3.6.2         withr_3.0.0       magrittr_2.0.3    digest_0.6.35    
[29] rstudioapi_0.16.0 xtable_1.8-4      lifecycle_1.0.4   vctrs_0.6.5       glue_1.7.0        listenv_0.9.1     codetools_0.2-20 
[36] rsconnect_1.2.2   fansi_1.0.6       parallelly_1.37.1 tools_4.4.1       pkgconfig_2.0.3   htmltools_0.5.8.1

Example application or steps to reproduce the problem

library(shiny)
library(bslib)
library(future)
library(promises)
library(pool)
future::plan(multisession)

pool <- pool::dbPool(RSQLite::SQLite(), db = 'test.sqlite')

DBI::dbWriteTable(pool, 'mtcars', mtcars, overwrite = T)

ui <- page_fluid(
  numericInput('cyl', label = 'Cylinder Filter', value = '4')
  , actionButton('btn', label = 'Go')
  , card(
    card_title('Test Data Output')
    , card_body(
      textOutput('outstat')
      ,tableOutput('out')
    )
  )
)

server <- function(input, output, session) {

  data_get <- shiny::ExtendedTask$new(
    function(param, con) {
      future_promise(
        {
          c <- con
          param <- param
          RSQLite::dbGetQuery(
            conn = c,
            statement = glue::glue_sql(
              "select * from mtcars where cyl = {param}",
              .con = c
            )
          )
        },
        seed = T
      )
    }
  )

  observeEvent(
    eventExpr = input$btn
    , handlerExpr = {
      p <- poolCheckout(pool) ## comment out this line to create the error
      data_get$invoke(param = input$cyl, con = pool)
    }
  )

  output$out <- renderTable({
    data_get$result()
  })

  output$outstat <- renderText({
    paste('Extended Task Status:', data_get$status())
  })
}

shinyApp(ui, server)

Describe the problem in detail

I have been working on developing several apps using the ExtendedTask framework. Typically the "extended" portion of tasks in my apps involves retrieving data from or doing operations on databases. I also typically use the Pool package to create a pool of connections to the databases I use.

This is an example app that details a problem I have recently encountered when attempting to pass a pool connection into a function via ExtendedTask.

You can see I create a local database with the mtcars data set in a sqlite file. I connect to that database with a connection pool.

In my extended task invocation I pass the input$cyl and pool as parameters to data_get. The problem: if I don't put this p <- poolCheckout(pool) line in the observeEvent() handler then I get the following output:

Warning: Error in : bad_weak_ptr
  1: runApp

This can be reproduced by commenting out that line as indicated in the code. I have no idea why checking out a connection from the pool would make any difference within the observer, especially since I don't even pass p to the data_get invocation (in fact, it doesn't work if i try that - same error is returned).

I only figured out I could make it work this way when I was trying desperate things to get rid of the bad_weak_ptr error, not because I have any idea why it makes it work.

Thanks!

@ismirsehregal
Copy link
Contributor

ismirsehregal commented Oct 16, 2024

I'm not sure why the example works through calling poolCheckout from the parent process, but a similar issue was discussed here.

The problem discussed over there is that we can't pass database connections from one process to another. You have to make sure the background process gets a new connection. This can be done by setting minSize = 0, idleTimeout = 0 in pool::dbPool (also works for your code with poolCheckout commented out).

However:

In this case, the pool is not actually pooling anything and performance is the same as not using a pool at all.

Also note:

pool database connections between futures (and hence multiple R processes) [...] is not possible due to limitations in the database packages themselves.

@benhmin
Copy link
Author

benhmin commented Oct 16, 2024

@ismirsehregal thank you so much! I can't really understand exactly why the poolcheckout helps either unless it somehow clears out something in the pool and forces a new connection to be provided to the extendedtask future process.

I think i'm going to be able to work within the constraints mentioned in the linked issue. One thing i guess I would recommend is that the documentation for extendedtask on the Shiny site do a little more to explicitly state that passing database connection objects into those functions is fraught and help with some best practices. This is important since probably there are a lot of these extendedtasks that are really just database operations.

@benhmin benhmin closed this as completed Oct 16, 2024
@ismirsehregal
Copy link
Contributor

@benhmin I would suggest to leave the issue opened as a reminder to update the documentation. I would also be interested to know why your above example works. Cheers

@benhmin benhmin reopened this Oct 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants