Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for constraint schemas: This, that, or both keys in map #474

Open
brettrowberry opened this issue Jul 13, 2021 · 14 comments
Open
Labels
enhancement New feature or request

Comments

@brettrowberry
Copy link
Contributor

brettrowberry commented Jul 13, 2021

I'm just getting started with Malli, and I really like it!

There is something I can't figure out.

Consider this schema:

(def Schema
  [:map
   [:x boolean?]
   [:y {:optional true} int?]
   [:z {:optional true} string?]])

(m/validate Schema {:x true}) => true

Now imagine I want to require :y, :z, or both such that:

(m/validate Schema {:x true})              => false
(m/validate Schema {:x true :y 1})         => true
(m/validate Schema {:x true :z "hi"})      => true
(m/validate Schema {:x true :y 1 :z "hi"}) => true

Attempt 1

(def Schema
  [:map
   [:x boolean?]
   [:or
    [:y {:optional true} int?]
    [:z {:optional true} string?]]])

(m/validate Schema {:x true})

; Execution error (ExceptionInfo) at malli.impl.util/-fail! (util.cljc:17).
; :malli.core/invalid-schema {:schema :z}

Attempt 2

(def Schema
  [:or
   [:map
    [:x boolean?
     :y int?]]
   [:map
    [:x boolean?
     :z string?]]
   [:map
    [:x boolean?
     :y int?
     :z string?]]])

(m/validate Schema {:x true}) => true

That's not good!

Attempt 3

(def Schema
  [:and
   [:map [:x boolean?]]
   [:or
    [:map [:y int?]]
    [:map [:z string?]]
    [:map [:y int? :z string?]]]])

(m/validate Schema {:x true})              => false
(m/validate Schema {:x true :y 1})         => true
(m/validate Schema {:x true :z "hi"})      => true
(m/validate Schema {:x true :y 1 :z "hi"}) => true

This seems to work, but there's a bit of repetition.

The errors are pretty good, but perhaps they could be better if I did this the right way:

(me/humanize (m/explain Schema {:x true}))

=>
{:y
 ["missing required key"
  "missing required key"],
 :z ["missing required key"]}

(me/humanize (m/explain Schema {:x true :y "hi"}))

=> 
{:y
 ["should be an int"
  "should be an int"]
 :z ["missing required key"]}

What's the best way to do this?

@ikitommi
Copy link
Member

Current solution:

(def Schema
  [:and
   [:map
    [:x boolean?]
    [:y {:optional true} int?]
    [:z {:optional true} string?]]
   [:fn {:error/message "y & z are mutually exclusive"
         :error/path [:y]}
    (fn [{:keys [z y]}]
      (not (and z y)))]])

(-> Schema
    (m/explain {:x true :y 1 :z "hi"})
    (me/humanize))
; => {:y ["y & z are mutually exclusive"]}

Future: a declarative way to do this, something like https://github.com/bsless/malli-keys-relations

@ikitommi ikitommi added the enhancement New feature or request label Jul 15, 2021
@ikitommi
Copy link
Member

actually, this is duplicate of #438. As this has more info, will close the other one.

@ikitommi ikitommi changed the title This, that, or both keys in map Add support for constraint schemas: This, that, or both keys in map Jul 16, 2021
@bsless
Copy link
Contributor

bsless commented Jul 17, 2021

What do you think about a datalog-like syntax for describing the constraints? Something like a simplified :where clause
It's serializable and simple to compile

@ikitommi
Copy link
Member

I was thinking of using meander. I think both could be tested "on paper" for few sample cases to get more insight on how they would/not work. Code syntax samples into this thread?

@bsless
Copy link
Contributor

bsless commented Jul 17, 2021

Sure, I'll whip up a few examples

@bsless
Copy link
Contributor

bsless commented Jul 17, 2021

I considered two notations, one as property and one as a standalone schema:

[:map
 {:where
  '[[?e :x ?x]
    [?e :y ?y]
    [(< ?x ?y)]]}
 [:x int?]
 [:y int?]]

[:and
 [:map
  [:x int?]
  [:y int?]]
 [:where
  '[[?e :x ?x]
    [?e :y ?y]
    [(< ?x ?y)]]]]

I'll with the property notation for brevity.
I'm also missing a syntax for binding the original value being queried, let's go with :in for now
querying keys in a map:

[:map
 {:in '?e
  :where
  '[[?e :x ?x]
    [?e :y ?y]
    [(< ?x ?y)]]}
 [:x int?]
 [:y int?]]

Binding intermediate values:

[:map
 {:in '?e
  :where
  '[[?e :x ?x]
    [?e :y ?y]
    [(count ?x) ?n]
    [(< ?n ?y)]]}
 [:x [:vector int?]]
 [:y int?]]

Nested maps:

[:map
 {:in 'e?
  :where
  '[[?e :x ?x]
    [?x :z ?z]
    [?e :y ?y]
    [(== ?z ?y)]]}
 [:x [:map [:z int?]]]
 [:y int?]]

"every" syntax:

[:map
 {:in '?e
  :where
  '[[?e :x [?x :as ?xs]]
    [?x :z ?z]
    [?e :y ?y]
    [(count ?xs) ?n]
    [(< ?x ?n)]
    [(== ?z ?y)]]}
 [:x [:vector [:map [:z int?]]]]
 [:y int?]]

Implicit unification:

[:map
 {:in '?e
  :where
  '[[?e :x [?x :as ?xs]]
    [?x :z ?y]
    [?e :y ?y]
    [(count ?xs) ?n]
    [(< ?x ?n)]]}
 [:x [:vector [:map [:z int?]]]]
 [:y int?]]

Mutual exclusion (can't have a situation where both 2 and 3 succeed), this is also a "contains" syntax.

[:map
 {:in '?e
  :where
  '[[?e :x ?x] ;1
    (not
     [?e :y] ;2
     [?e :z])]} ;3
 [:x int?]
 [:y int?]
 [:z int?]]

I tried to introduce as few innovations as possible with this syntax. What do you think?

@ikitommi
Copy link
Member

Thanks for your thoughts @bsless! I think the separate :where type looks better of the two. About using datalog, that would be powerful as the nested sample shows. As it's generic, it could be used for sequences too, a generic matching on the data.

But, not ready on committing to an official malli relations-mapping solution yet, so if you would like to implement that, it could start as an add-on lib. Will try to resolve the global registry so that it's easy to plug-in custom schema types from 3rd party libraries.

cheers.

@bsless
Copy link
Contributor

bsless commented Jul 22, 2021

Sounds good. One issue I'd appreciate directions on is how to compile it to be performant. A naive approach would be building up a map of all the bindings, but where's the fun in that? Can it be done with functions? Do I need some CPS magic?

@ikitommi
Copy link
Member

not an datalog perf expert, there is #datalog chan in clojurians, maybe people there can share insights. I believe Asami has a fast imp (https://github.com/threatgrid/asami)

@bsless
Copy link
Contributor

bsless commented Jul 22, 2021

I figured I can use malli to parse the query :)

@bsless
Copy link
Contributor

bsless commented Jul 22, 2021

First not so successful attempt at translating the datalog spec directly to malli
Not so successful because ::clause can't be built into a schema.
Any tips @ikitommi ?
https://gist.github.com/bsless/632b4040a2b2ad7469369f52cd610c06

@bsless
Copy link
Contributor

bsless commented Jul 24, 2021

Update: the parser works. Data patterns are a bit too general so I'm considering adding specialized EA and EAV patterns but they're slightly ugly.
Regarding execution model: I'll start with some naive linear interpreter but eventually I'll want to add some sort of planner and compiler. Any suggestions regarding modeling are welcome

@ebunders
Copy link

ebunders commented Jan 22, 2024

I also ran into this problem, and we discussed it on slack.

I tried something like

[:union
   [:or
    [:map [:data map?]]
    [:map [:dependencies {:optional true} [:set [:ref "dependency"]]]]]
   [:map [:type {:optional true} string?]]]

but that didn't work out. I also tried a lot of stuff with merge but it didn't work as expected and I just didn't really understand the logic of something like this:

[:and
   [:map [:x boolean?]]
   [:or
    [:map [:y int?]]
    [:map [:z string?]]
    [:map [:y int? :z string?]]]]

This seems to say: a map with property :x, and a map with property :y, a map with property :z or a map with property :y and property :z.
What I would naively expect is more like the following (here I create a naive address schema that can either have a zip code and house number or a city, street name and house number):

[:map
  [:house-number number?]
  [:or 
    [:zip-code string?]
    [:and
      [:city string?]
      [:street-name string?]]]]

The top level map suggests that everything here happens in the context of a single map.
So:

  • house number should always be there.
  • further more: there should be either a :zip-code or a :city ánd a :street-name

This kind of syntax would honor the principle of least astonishment (dear to my heart) for me, but Tommi pointed out that te problem with this is that Malli can in this case not distinguish :or & :and from properties in the map. I guess a combination of :or and :map is used for that.
But as again Tommi pointed out this could be fixed with name-spacing all Malli keywords. That seems like a good idea all round.

so, for completeness:

[::m/map
  [:house-number number?]
  [::m/or 
    [:zip-code string?]
    [::m/and
      [:city string?]
      [:street-name string?]]]]

To me this kind of syntax feels most familiar, it is what I would expect to have to write.

@logseq-cldwalker
Copy link

I also hope something like @ebunders suggested works one day as the syntax is more explicit and easier to maintain than the current :fn workaround

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
Status: No status
Development

No branches or pull requests

5 participants