spec2 schema/select distinction #95

Open
xificurC opened this issue Oct 24, 2019 · 9 comments
Labels
documentation Documentation improvement

Comments

@xificurC
Contributor

Rich talks extensively about the distinction between a schema and the subset of its data that is required in a given context. A person can have many attributes, and different functions will require different parts of that data. That's what the new select is about. Is there similar functionality in malli? I see {:optional true} in an example schema, which is exactly what Rich found bad in his first design.

@jeans11
Contributor

jeans11 commented Oct 24, 2019

Hi!
The plan is to implement select like spec2 does. See issue #63.

@ikitommi
Member

ikitommi commented Oct 24, 2019

Excellent question! Have been thinking about that a lot and I don't think there is a correct answer for this. Some thoughts:

  • in Malli, it's all Schemas right now. Like @jeans11 pointed out, we should (and will) have good tools to program with schemas. One can easily run a complex transformation/select on a root schema and get a new "select" schema out. It might even be simpler to have just one concept (schema) instead of two (schema and select)? I would say select should be a function, not a concept. We have been programming with schemas in real-life projects since 2014 with Plumatic Schema & Schema-tools and have been really happy with that.

  • with Malli, it's easy to make all fields of root schemas required and thus be compliant with "Maybe Not". There could even be a top-level option to remove support for :optional keys altogether. Maybe malli should be promoted as a library for creating schema libraries ;)

  • in real life, many databases and dynamic schema definition systems already support optional keys (like JSON Schema). Not supporting optionality in Malli would mean that converting to/from those formats would be hard, if not impossible. With Spec2, per my understanding, one needs to create both a Schema and a Select for each JSON Schema Object definition: the Schema to define all the possible keys and the Select to define the required keys. Like Spec1 :opt, but the information is now copied in two places. Also, as the closing of specs seems to be moving to the call site, I'm not sure if there will be a "closed spec" concept, which JSON Schema (and most programming languages) support.

  • optional key + nillable is actually ternary: we don't know, we know it is, we know it is not. Depending on the context, this might be valuable. With Statically Typed FP langs, I lean on Option | Maybe | Either a lot.
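The three states of an optional, nillable key can be sketched in plain Clojure. This is only an illustration: `key-state` and the `:nick` key are hypothetical names, not part of Malli, and the state keywords are made up for this example.

```clojure
;; Hypothetical helper: classify what a map says about key k.
;; The three results mirror "we don't know", "we know it is not",
;; and "we know it is".
(defn key-state [m k]
  (cond
    (not (contains? m k)) :unknown        ; key absent: nothing is known
    (nil? (get m k))      :known-absent   ; key present with nil: known to have no value
    :else                 :known-present)) ; key present with a value

(key-state {} :nick)             ; => :unknown
(key-state {:nick nil} :nick)    ; => :known-absent
(key-state {:nick "bob"} :nick)  ; => :known-present
```

A schema that forbids optional keys collapses the first two cases into one, which is the loss of information the bullet above is pointing at.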

@xificurC
Contributor Author

I see you're thinking about this a lot, that's great.

So from your POV select creates a new, derived schema. I suppose that makes sense and makes things more general, everything's just a schema this way. Supporting the already existing standards is also important.

Once you finish pondering this, it would be nice to get it mentioned in the readme. People (like me) will be comparing this to spec2, and specifically to what led to the creation of spec2.

@ikitommi added the documentation label Nov 24, 2019
@xificurC
Contributor Author

Revisiting this I see you already implemented select-keys, required-keys and optional-keys, cool!

I'm pondering on one thing:

(defn foo [{:keys [x y z] :or {z 1}}]
  (if (pos-int? x) (/ y z) 0))

(select-keys M [:x :y]) is not OK in this case because input like {:x 1 :y 2 :z 0} throws. To generate reasonable tests one would need (optional-keys M [:z]). However if the model is huge we really want to narrow down the generation to just these three keys with :z being optional. That would be (-> M (select-keys [:x :y :z]) (optional-keys [:z])).
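To make the role of :z concrete, here is the same function evaluated on a few inputs. This is a small sketch assuming JVM Clojure, where integer division by zero throws an ArithmeticException.

```clojure
(defn foo [{:keys [x y z] :or {z 1}}]
  (if (pos-int? x) (/ y z) 0))

;; :z missing, so the :or default of 1 is used
(foo {:x 1 :y 2})        ; => 2

;; x is not a positive integer, so the division is never reached
(foo {:x 0 :y 2 :z 0})   ; => 0

;; x is positive and :z is 0: the division throws
(def zero-z-result
  (try
    (foo {:x 1 :y 2 :z 0})
    (catch ArithmeticException e
      (.getMessage e))))  ; => "Divide by zero"
```

Inputs generated from a schema without :z would only ever exercise the first two cases.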

I think for my own data I would model everything as required, that would describe the "shape" of the data and then chop things from that with select-keys and optional-keys to get to the correct subset.

Does this sound reasonable?

I'm also thinking whether there could be an API to capture the 2 operations in 1 swoop, like select-keys taking another argument for optionals? (select-keys M [:x :y] [:z]).
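A minimal sketch of both variants, assuming Malli is on the classpath. The model schema `M` below is a made-up stand-in, and `select-keys+` is a hypothetical helper written for this example; it is not part of malli.util.

```clojure
(require '[malli.core :as m]
         '[malli.util :as mu])

;; Hypothetical model schema standing in for M; every key is required.
(def M
  [:map
   [:x :int]
   [:y :int]
   [:z :int]
   [:name :string]])

;; The two-step version from the comment above:
;; narrow to the keys foo reads, then relax :z.
(def foo-input
  (-> M
      (mu/select-keys [:x :y :z])
      (mu/optional-keys [:z])))

;; Hypothetical one-swoop helper (an assumption, not a Malli API):
;; select required and optional keys together, then mark the optional ones.
(defn select-keys+ [schema required optional]
  (-> schema
      (mu/select-keys (into (vec required) optional))
      (mu/optional-keys (vec optional))))

(m/validate foo-input {:x 1 :y 2})       ; :z may be left out
(m/validate foo-input {:x 1 :y 2 :z 0})  ; :z may be present
(m/validate foo-input {:x 1})            ; :y is still required
```

Since the helper is just a composition of the two existing functions, a dedicated three-argument API would mostly be a convenience.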

What about nested maps or a collection of maps? Maybe one could devise a descriptive data-driven DSL to capture the requirements, like

{:x :! :y :! :z :?}                    ; :x :y and optionally :z
{:x :! :y :! :z {:? [{:a :! :b :!}]}}  ; optional :z where it is a collection of required :a and :b

Take this for what it is, a brain dump :)
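One way the flat form of that DSL could be interpreted, sketched in plain Clojure with no Malli dependency. Everything here is an assumption made for illustration: `base-entries`, `narrow`, and the reading of :! as "required" and :? as "optional" are not an existing API, and the nested form is not handled.

```clojure
;; Hypothetical base model as ordered [key schema] pairs, all required.
(def base-entries
  [[:x :int] [:y :int] [:z :int] [:name :string]])

;; Build Malli-style :map schema data from the base entries and a
;; requirement map: :! keeps the key as required, :? keeps it as
;; optional, and keys not mentioned are dropped.
(defn narrow [entries reqs]
  (into [:map]
        (keep (fn [[k schema]]
                (case (get reqs k)
                  :! [k schema]
                  :? [k {:optional true} schema]
                  nil)))
        entries))

(narrow base-entries {:x :! :y :! :z :?})
;; => [:map [:x :int] [:y :int] [:z {:optional true} :int]]
```

The result is plain data in Malli's vector syntax, so it could be handed to the usual validation and generation functions.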

@ikitommi
Member

a standard query format like EQL might be the way to go.

@xificurC
Contributor Author

I thought about pull syntax, which is close to EQL, before posting, but I don't see a way to describe the required/optional part

@ikitommi
Member

Good point. One thought I had in mind was to support transforming key optionality using the map-entry syntax.

(mu/assoc-in nil [:a [:b {:optional true}]] int?)
; => [:map [:a [:map [:b {:optional true} int?]]]]

This would also work with mu/select-keys and could be used for the nested select?

@xificurC
Contributor Author

A map is a good idea as it leaves space for future additions. One could want to e.g. define the probability of a key ({:p 0.9} to include this key with 90% probability) and other conditional logic.

@eval
Contributor

eval commented Dec 18, 2020

For a pet project I implemented schema-select, which lets you create a sub-schema using the spec2 select syntax.

It has the limitation that it only accepts a map-schema to select from.
Also, the traversal of nested composites is pretty simplistic: given [{:child [:att]}] it simply picks the first matching path out of [:child :att], [:child :malli.core/in :path], [:child 0 :path] and [:child 1 :path].
But it was Good Enough™️ for my use case :)


4 participants