Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow inner subqueries with CTEs in postgres #720

Merged
merged 6 commits into from
Jan 11, 2025
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
7 changes: 6 additions & 1 deletion beam-core/Database/Beam/Query/Internal.hs
Original file line number Diff line number Diff line change
Expand Up @@ -41,9 +41,14 @@ data QF be (db :: (Type -> Type) -> Type) s next where

QAll :: Projectible be r
=> (TablePrefix -> T.Text -> BeamSqlBackendFromSyntax be)
-- ^ build the FROM syntax using the table prefix and the table name
-> (T.Text -> r)
-- ^ Given a table name, get the various Qs for all the expressions in that table
-> (r -> Maybe (WithExprContext (BeamSqlBackendExpressionSyntax be)))
-> ((T.Text, r) -> next) -> QF be db s next
-- ^ on clause, if any
-> ((T.Text, r) -> next)
-- ^ Generate the result from the table name and projectible result
-> QF be db s next

QArbitraryJoin :: Projectible be r
=> QM be db (QNested s) r
Expand Down
8 changes: 4 additions & 4 deletions beam-postgres/Database/Beam/Postgres/Connection.hs
Original file line number Diff line number Diff line change
Expand Up @@ -155,13 +155,13 @@ runPgRowReader conn rowIdx res fields (FromBackendRowM readRow) =
Pg.ConversionFailed { Pg.errSQLType = sql
, Pg.errHaskellType = hs
, Pg.errMessage = msg
, Pg.errSQLField = field } ->
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting rid of GHC warnings here.

pure (ColumnTypeMismatch hs sql ("Conversion failed for field'" <> field <> "': " <> msg))
, Pg.errSQLField = errField } ->
pure (ColumnTypeMismatch hs sql ("Conversion failed for field'" <> errField <> "': " <> msg))
Pg.Incompatible { Pg.errSQLType = sql
, Pg.errHaskellType = hs
, Pg.errMessage = msg
, Pg.errSQLField = field } ->
pure (ColumnTypeMismatch hs sql ("Incompatible field: '" <> field <> "': " <> msg))
, Pg.errSQLField = errField } ->
pure (ColumnTypeMismatch hs sql ("Incompatible field: '" <> errField <> "': " <> msg))
Pg.UnexpectedNull {} ->
pure ColumnUnexpectedNull
in pure (Left (BeamRowReadError (Just (fromIntegral curCol)) err))
Expand Down
43 changes: 42 additions & 1 deletion beam-postgres/Database/Beam/Postgres/Full.hs
Original file line number Diff line number Diff line change
Expand Up @@ -20,6 +20,9 @@ module Database.Beam.Postgres.Full

, locked_, lockAll_, withLocks_

-- ** Inner WITH queries
, pgSelectWith

-- ** Lateral joins
, lateral_

Expand Down Expand Up @@ -55,15 +58,18 @@ module Database.Beam.Postgres.Full
) where

import Database.Beam hiding (insert, insertValues)
import Database.Beam.Query.Internal
import Database.Beam.Backend.SQL
import Database.Beam.Backend.SQL.BeamExtensions
import qualified Database.Beam.Query.CTE as CTE
import Database.Beam.Query.Internal
import Database.Beam.Schema.Tables

import Database.Beam.Postgres.Types
import Database.Beam.Postgres.Syntax

import Control.Monad.Free.Church
import Control.Monad.State.Strict (evalState)
import Control.Monad.Writer (runWriterT)

import Data.Kind (Type)
import Data.Proxy (Proxy(..))
Expand Down Expand Up @@ -274,6 +280,41 @@ lateral_ using mkSubquery = do
(\_ -> Nothing)
(rewriteThread (Proxy @s))))

-- | The SQL standard only allows CTE expressions (WITH expressions)
-- at the top-level. Postgres allows you to embed these within a
-- subquery.
--
-- For example,
--
-- @
-- SELECT a.column1, b.column2 FROM (WITH RECURSIVE ... ) a JOIN b
-- @
--
-- @beam-core@ offers 'selectWith' to produce a top-level 'SqlSelect'
-- but these cannot be turned into 'Q' objects for use within joins.
--
-- The 'pgSelectWith' function is more flexible and indeed
-- 'selectWith' for @beam-postgres@ is equivalent to se
pgSelectWith :: forall db s res
. Projectible Postgres res
=> With Postgres db (Q Postgres db s res) -> Q Postgres db s res
pgSelectWith (CTE.With mkQ) =
let (q, (recursiveness, ctes)) = evalState (runWriterT mkQ) 0
fromSyntax tblPfx =
case recursiveness of
CTE.Nonrecursive -> withSyntax ctes (buildSqlQuery tblPfx q)
CTE.Recursive -> withRecursiveSyntax ctes (buildSqlQuery tblPfx q)
in Q (liftF (QAll (\tblPfx tName ->
let (_, names) = mkFieldNames @Postgres @res (qualifiedField tName)
in fromTable (PgTableSourceSyntax $
mconcat [ emit "(", fromPgSelect (fromSyntax tblPfx), emit ")" ])
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need the ( ) to surround the WITH clause here.

(Just (tName, Just names)))
(\tName ->
let (projection, _) = mkFieldNames @Postgres @res (qualifiedField tName)
in projection)
(\_ -> Nothing)
snd))

-- | By default, Postgres will throw an error when a conflict is detected. This
-- preserves that functionality.
onConflictDefault :: PgInsertOnConflict tbl
Expand Down
54 changes: 54 additions & 0 deletions docs/user-guide/backends/beam-postgres.md
Original file line number Diff line number Diff line change
Expand Up @@ -274,3 +274,57 @@ runInsert $
)
```

### Inner CTEs

Standard SQL only allows CTEs (`WITH` expressions) at the top-level SELECT. However, PostgreSQL
allows them anywhere, including in subqueries for joins.

For example, the following is valid Postgres, but not valid standard SQL.

```sql
SELECT a.column1, b.column2
FROM (WITH RECURSIVE ... SELECT ...) a
INNER JOIN b
```

`beam-core` enforces this by forcing `selectWith` to only return a `SqlSelect`, which represents a
top-level SQL `SELECT` statement that can be executed against a backend. However, if we want to
allow `WITH` expressions to appear within joins, then we will need a function similar to
`selectWith` but returning a `Q` value, which is a re-usable query. `beam-postgres` provides this
function for PostgreSQL, named `pgSelectWith`. For `beam-postgres`, `select (pgSelectWith x)` is
equivalent to `selectWith x`. But, with the new type, we can reuse CTEs (including recursive ones)
within other queries.

As an example using our Chinook schema, suppose we had an error with all orders in the month of
September 2024, and needed to send out employees to customer homes to correct the issue. We want to
find, for each order, an employee who lives in the same city as the customer, but we only want the
highest ranking employee for each customer.

First, we order the employees by org structure so that managers appear first, followed by direct reports. We use a recursive query for this, and then join it against the orders.

!beam-query
```haskell
!example chinook only:Postgres
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This actually works!

aggregate_ (\(cust, emp) -> (group_ cust, Pg.pgArrayAgg (employeeId emp)))
$ do inv <- filter_ (\i -> invoiceDate i >=. val_ (read "2024-09-01 00:00:00.000000") &&. invoiceDate i <=. val_ (read "2024-10-01 00:00:00.000000")) $ all_ (invoice chinookDb)
cust <- lookup_ (customer chinookDb) (invoiceCustomer inv)
-- Lookup all employees and their levels
(employee, _, _) <-
Pg.pgSelectWith $ do
let topLevelEmployees =
fmap (\e -> (e, val_ (via @Int32 0))) $
filter_ (\e -> isNull_ (employeeReportsTo e)) $ all_ (employee chinookDb)
rec employeeOrgChart <-
selecting (topLevelEmployees `unionAll_`
do { (manager, managerLevel) <- reuse employeeOrgChart
; report <- filter_ (\e -> employeeReportsTo e ==. manager) $ all_ (employee chinookDb)
; pure (report, managerLevel + val_ 1) })
pure $ filter_ (\(employee, level, minLevel) -> level ==. minLevel)
$ withWindow_ (\(employee, level) -> frame_ (partitionBy_ (addressCity (employeeAddress employee))) noOrder_ noBounds_)
(\(employee, level) cityFrame ->
(employee, level, coalesce_ [min_ level `over_` cityFrame] (val_ 0)))
(reuse employeeOrgChart)
-- Limit the search only to employees that live in the same city
guard_ (addressCity (employeeAddress employee) ==. addressCity (customerAddress cust))
pure (cust, employee)
```
Loading