Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate with, the one and only, ScalaJSON #232

Open
ktoso opened this issue Jun 29, 2017 · 24 comments
Open

Integrate with, the one and only, ScalaJSON #232

ktoso opened this issue Jun 29, 2017 · 24 comments
Labels

Comments

@ktoso
Copy link
Member

ktoso commented Jun 29, 2017

The common JSON AST is happening: mdedetrich/scalajson#13
We promised to support it and shall deliver on the promise.

@ktoso ktoso added the Feature label Jun 29, 2017
@jrudolph
Copy link
Member

jrudolph commented Jun 29, 2017

I think there are several ways of integrating with ScalaJson. There seems to be no guideline how it should be integrated. So, here are a few alternatives:

1. Shallow integration

We just add conversions from and to JValue, i.e. our JsValue gets a new toJValue method and we also provide an (extension) method to convert in the other direction.

This has a number of advantages:

  • Everything stays compatible. There won't be any deprecations or sweeping changes.
  • Integrating with other JS libraries is straightforward. Just call toJValue and use the result in with other compatible libraries (We could also provide implicit conversions for users that don't like manual conversion)
  • If there are problems with the new AST, we (and our users) are one layer away from them because our own AST is still there.

There are also a few disadvantages:

  • Integration with other libraries is not completely seamless as you always have to use one of the toXYZ methods
  • Conversion requires a deep copy of one AST structure into a very similar one. This can be slow. On the other hand, ScalaJson itself has the problem that to use the fast "unsafe" variant you have to copy the AST.

2. Eventual full conversion

The end result in that scenario should be that we can remove our own AST and only use the common one. For compatibility reasons we might want to support both at the same time. How could that work?

We need to identify the basic API that users use to work with spray-json:

  • parsing
  • rendering
  • conversion between JsValue and JValue
  • declaring JsonFormats
  • sugar methods added with extension methods

Parsing: add another entry-point that creates JValue instead of JsValue (and deprecate the old one)

Rendering: we need a renderer for JValue, rendering method needs to be added via extension methods to JsValue instances

Conversion: conversion must be implemented, JsValue can get a toJValue method, JValue could get an extension method (if required at all)

JsonFormat:

  • All JsonFormat should support both ASTs.
  • Given a reader/writer for one AST we can always derive one for the other AST using conversion.
  • For the predefined JsonFormats we should implement the reader/writers for the new AST manually
  • For user-defined custom JsonFormat we should recommend and guide people to only provide conversions for the new AST (and automatically provide implementations for the old one for the time being)

Sugar methods: Can simply be added for the new AST as well

Advantages:

  • eventually, there will be only one AST, no copying required for conversions

Disadvantages:

  • somewhat big changes needed in the short-term to do the migration in a compatible way
  • even bigger, incompatible changes needed to get rid of the old implementation at some point, that might fragment the user base into those that are using the current spray-json and those that are using a future spray-json based on ScalaJson
  • alternatively, we could keep the hybrid model forever, which would slow down further development of the library (though, to be fair, as no development has happened on this library for several years now, it would be a slow-down from an already very low level ;) )
  • if the common AST has some problems (e.g. performance problems, no way to instantiate it quickly, etc.), we cannot fix it ourselves

3. Engage in the discussion about ScalaJson to provide a more easily to adopt AST model

I previously suggested that the best outcome for the common AST would be that AST is based on abstract classes/ traits (I didn't follow up on that one, so maybe I didn't make my point clear at that point in time). The user-facing API could be almost exactly the same as it is now but the integration would be much simpler: We would just make sure that our AST implements the interface and we would be done. As long there are no name clashes with the exsting methods in our AST classes that would work compatibly and with minimal effort required on the side of project maintainers. An SPI should be provided to implement and register constructors for the AST to have a common way for users to create AST objects.

Needed changes in spray-json:

  • implement the abstract classes by extending our AST model with the (suggested) abstract classes from the common model
  • change all methods accepting AST to accept the more general common AST instead of our own one for maximum compatibility with other implementations

Advantages:

  • Minimal changes needed
  • No compatibility issues for our users
  • Ability to evolve the AST implementation directly in spray-json
  • No extension methods needed to add methods to the AST (though, by now, we might see adding methods to the AST itself as bad practice)

Disadvantages:

  • The ship might have sailed on that one
  • Someone needs to actually make these suggestions concrete and engage in discussions to make it work

Note: Despite its flaws spray-json has been considered somewhat complete. At this point (which has been reached several years ago), it isn't clear that doing any changes, especially incompatible ones,
are actually beneficial to its user-base. The incentives are somewhat low to make big changes with changes being very costly because they need to be compatible. I think for the bigger changes to become
true, we will 1) indication that significant parts of the user-base is actually interested in seeing them and 2) people who help with the transition.

Open questions

  • How to use the unsafe AST?
    • If conversion from the usual AST is required everywhere, nothing is won, so should implementors always have to support both ASTs?
    • Who decides which version of the AST should be produced? The user? The library? By which logic?
    • Does it need an unsafe AST at all? Isn't rather a SPI needed that allows more performant access to the common AST?
    • As the unsafe AST is concrete as well, how to evolve it if different libraries / applications have different speed requirements?

@mdedetrich
Copy link

mdedetrich commented Jun 29, 2017

So generally, here are the guidelines

  • Unsafe AST is designed for parsing/serializing (and corner cases like comparing JSON objects) but isn't what should be exposed to users.
  • Standard AST is what should be exposed to users.

If you implement parsing with unsafe.JValue, you automatically get conversion to standard AST for free (since the library provides this function). You can also provide your own function for this conversion. You can also just parse yourself and provide the standard ast.

Note that the idea is that frameworks/libraries implement this AST and since its a breaking change, only do this for a next major release. This of course depends on the framework/library. Circe for example, is likely to provide conversion modules but that is because their internal AST representation is completely different (and they are going to decouple this at one point, see circe/circe#690 (comment) under Future Integration) so eventually its going to be a non issue.

However the case for spray-json is, from what I can see, the AST in spray json is basically almost the same as ScalaJSON AST (barring naming/JNumber representation). The naming can actually be fixed with type/object alias, see https://github.com/mdedetrich/playframework/blob/b0934ce452d82312f37cd62af28524c6840a4c45/framework/src/play-json/src/main/scala/play/api/libs/json/package.scala#L47-L58 on how to do this). Making the AST an interface is redundant as naming/migration issues can be solved with type/object alias, and it just creates a lot of ceremony which defeats the purpose of the AST.

Regarding JNumber, its being talked about here mdedetrich/scalajson#16

Responding to some specific points

alternatively, we could keep the hybrid model forever, which would slow down further development of the library (though, to be fair, as no development has happened on this library for several years now, it would be a slow-down from an already very low level ;) )

I would emphasis this point about development, also this is the reason why ScalaJSON AST is made in the first place. The implementation of a JSON AST is fairly trivial, and all of the current implementations differ in subtle ways.

if the common AST has some problems (e.g. performance problems, no way to instantiate it quickly, etc.), we cannot fix it ourselves

This has already been extensively looked at. There is a benchmark suite already available, and afaik when doing an apples to apples comparison, ScalaJSON is either competitive or faster. Also I worked with @sirthias to make sure this is as performant as spray-json (also in specific reasons to how its being parsed)

I would ideally prefer option 2 in the next major release of spray/spray json, or a transition of option one to two (using the @deprecated annotation along with https://github.com/mdedetrich/playframework/blob/b0934ce452d82312f37cd62af28524c6840a4c45/framework/src/play-json/src/main/scala/play/api/libs/json/package.scala#L47-L58 seems a good approach for this)

@fommil
Copy link
Contributor

fommil commented Jun 30, 2017

Please don't. I use spray-json because it is zero dep. Let somebody contribute an independent module that they maintain and release separately.

@jrudolph
Copy link
Member

Thanks for chiming in @mdedetrich.

If you implement parsing with unsafe.JValue, you automatically get conversion to standard AST for free (since the library provides this function).

No, it's not free and that's exactly the point. A big part of the time spent parsing JSON is in constructing the AST nodes. If the AST needs to be constructed twice (parse -> unsafe -> convert to safe for the user) before it can be passed to the user it will be slower than doing it directly. So, I don't think we will add any special support for the unsafe variant unless someone asks for it.

Note that the idea is that frameworks/libraries implement this AST and since its a breaking change, only do this for a next major release.

I thought a bit more about it and I think that a major breaking release is not going to happen for spray-json. By now it's so deeply ingrained in so much code that doing a major release on the same artifact ID would lead to too many problems with binary compatibility.

The only alternative for a breaking change would be basically a fork under a new artifact ID. But that again would be opposing the whole initiative which tries to unify parts of the json landscape instead of introducing yet another variant. We don't want (library) users to decide which version of spray-json they should use. Also, we would probably have to introduce our own version of abstract interfaces to make both variants of the library interoperable. I don't think that makes sense.

The naming can actually be fixed with type/object alias

I understand. So, we might be able to do a source (but not binary) compatible update at some point. The best time to do that would be during the next Scala version bump where the whole universe is recompiling anyway. Might still be difficult to get it right.

The implementation of a JSON AST is fairly trivial, and all of the current implementations differ in subtle ways.

Yes, sometimes they differ without a reason and sometimes with reasonable explanations. During the reactive-streams standardization we learned the hard way that coming to any kind of consensus will mean that only very general rules and restrictions will be agreed on. You will have to go with the least common denominator for many parts even if that means that there will be compatibility problems because semantics are allowed to differ slightly, and, yes, performance might not be the best when objects are passed from one implementation to the other. But that's how it is.

This has already been extensively looked at.

I know that has been looked at but fixing the implementation of the model means that there's no way someone can still try to optimize for particular use cases. E.g. I might want to decide in the parser that a number "is a reasonable integer" and put it into a long field instead of keeping the string. Or, I want to represent empty objects or objects with few fields in a particular way. Or I want to have very simple string-based equality for numbers. In many cases that's totally fine.

What we need is a reasonably simple interface to share instances between json libraries. As the interoperability probably won't even be a common case, performance is not most important thing for that interface. Performance might be important for the libraries, though, so the interface shouldn't restrain implementations too much.

With the current proposal that would mean the "shallow" solution for us. It would mean that interaction with other libraries would be possible but not as simple as it could be.

@jrudolph
Copy link
Member

@fommil as long as the new dependency will be very small, this is not a blocker for us.

@fommil
Copy link
Contributor

fommil commented Jun 30, 2017

It is a blocker for me. I am not taking on any more scala dependencies, our build for new scala releases is already a nightmare and I don't want one more link added to the chain. I will have to fork spray-json under ensime-server. Which probably means I can strip out the ProductProtocol and replace it with spray-json-shapeless. It may make sense to change the API at that point to have Either, to avoid using exceptions which is the one thing that I would do to improve spray-json, an otherwise perfect library that I have learnt a lot from.

@mdedetrich
Copy link

No, it's not free and that's exactly the point. A big part of the time spent parsing JSON is in constructing the AST nodes. If the AST needs to be constructed twice (parse -> unsafe -> convert to safe for the user) before it can be passed to the user it will be slower than doing it directly. So, I don't think we will add any special support for the unsafe variant unless someone asks for it.

I didn't mean in terms of performance, I meant for free in the sense you don't have to implement the conversion yourself.

I understand. So, we might be able to do a source (but not binary) compatible update at some point. The best time to do that would be during the next Scala version bump where the whole universe is recompiling anyway. Might still be difficult to get it right.

Yes, this sounds like the best option

I know that has been looked at but fixing the implementation of the model means that there's no way someone can still try to optimize for particular use cases. E.g. I might want to decide in the parser that a number "is a reasonable integer" and put it into a long field instead of keeping the string. Or, I want to represent empty objects or objects with few fields in a particular way. Or I want to have very simple string-based equality for numbers. In many cases that's totally fine.

Well there is, it would just be done in the next release of ScalaJSON and if its in a binary compatible then its even better because its really easy to update the dependency

It is a blocker for me. I am not taking on any more scala dependencies, our build for new scala releases is already a nightmare and I don't want one more link added to the chain.

Its not very hard to write a parser specific for ScalaJSON AST (and this is something on the plan down the road, are likely to use Jawn for this)

@fommil
Copy link
Contributor

fommil commented Jun 30, 2017

@mdedetrich that isn't even remotely related to the problem. When a new version of scala comes out, all the base low-level libraries need to publish, which involves a very delicate list of working up the dependency chain and impacted by every single person's availability that week. And this is all done blind, because most tooling is not available to bootstrap the dev process. I'm simply not interested, I will fork.

@mdedetrich
Copy link

@fommil

Not sure if I understand the problem, but ScalaJSON is mandated to be published for Scala 2.10.x/2.11.x/2.12.x for JVM and ScalaJS (ScalaJSON also explicitly has no dependencies)

@fommil
Copy link
Contributor

fommil commented Jun 30, 2017

no deps?

      "org.specs2" %% "specs2-core" % specs2Version % Test,
      "org.specs2" %% "specs2-scalacheck" % specs2Version % Test,
      "org.scalacheck" %% "scalacheck" % scalaCheckVersion % Test

      "org.scalacheck" %%% "scalacheck" % scalaCheckVersion % Test,
      "com.lihaoyi" %%% "utest" % "0.4.4" % Test

      "com.storm-enroute" %% "scalameter" % "0.7" % Test

      addSbtPlugin("org.scala-js" % "sbt-scalajs" % "0.6.18")
      addSbtPlugin("org.scoverage" % "sbt-scoverage" % "1.5.0")

let's see how fast you can pump out a dotty or scala 2.13.0-M2 release and then you'll see what I mean.

@mdedetrich
Copy link

mdedetrich commented Jun 30, 2017

Those are all test dependencies, this is an issue if they are in pom.xml?

Likewise with scala.js, that is for building scala.js artifact.

If its helpful, I can make the tests a pure SBT subproject so there are no dependencies in the pom?

@fommil
Copy link
Contributor

fommil commented Jun 30, 2017

so your answer is to release untested software? That's not what I mean. I think you need to appreciate the limitations involved in being so close to the ground in the release cycle. Many libraries reduce themselves down to a junit dependency to avoid introducing cycles.

@mdedetrich
Copy link

so your answer is to release untested software?

No, travis will always run tests from the subproject, and I can also set up SBT to do this

@mdedetrich
Copy link

Also what about https://github.com/spray/spray-json/blob/master/build.sbt#L23-L27 ?

In fact these are the same test dependencies that ScalaJSON AST uses (see https://github.com/mdedetrich/scalajson/blob/master/build.sbt#L99-L101 and https://github.com/mdedetrich/scalajson/blob/master/build.sbt#L8-L9), Scala.js plugin also does not effect the pom.xml for JVM artifacts ( see https://search.maven.org/#artifactdetails%7Corg.scala-lang.platform%7Cscalajson_2.11%7C1.0.0-M2%7Cjar )

I don't see how spray-json is any different here....

@mdedetrich
Copy link

@jrudolph Also apart from the JsNumber BigDecimal constructor, both AST's are practically the same, so the migration from a purely technical PoV should be relatively painless (assuming you also do the type/object aliasing)

@gmethvin
Copy link

gmethvin commented Jul 1, 2017

let's see how fast you can pump out a dotty or scala 2.13.0-M2 release and then you'll see what I mean.

I would really like you to address this part of @fommil's comment in particular.

Because scalajson needs specs2, scalacheck, utest, scalameter, scoverage, and scalajs to actually build the project and run the tests, all of those dependencies must be released before you can release. spray-json's current dependencies are a subset of those.

The other issue is that, since you've chosen to force your own implementation, any bugs in your implementation would need to be fixed there, giving library authors less control if a bug shows up in the implementation. If you provided only an interface to implement (perhaps with a default implementation) this would be less of a problem.

@mdedetrich
Copy link

mdedetrich commented Jul 1, 2017

Because scalajson needs specs2, scalacheck, utest, scalameter, scoverage, and scalajs to actually build the project and run the tests, all of those dependencies must be released before you can release. spray-json's current dependencies are a subset of those.

scalameter is in a seperate project, its completely unaffected (alternately this can be changed to use JMH, its not that big of an issue). uTest is going to be removed when specs2 supports Scala.js (alternately there is an issue to move to ScalaTest if this won't happen anytime soon, see mdedetrich/scalajson#6). specs2/scalacheck are very common test dependencies (spray-json has the same), and on a new major release of Scala the community makes sure that these libraries can be released before the real release of Scala.

The thing is, these can be resolved later. They are just test dependencies, not dependencies of the actual project.

The other issue is that, since you've chosen to force your own implementation, any bugs in your implementation would need to be fixed there, giving library authors less control if a bug shows up in the implementation. If you provided only an interface to implement (perhaps with a default implementation) this would be less of a problem.

Actually the interface is going to be an even worse option if this is your concern. If the concern is about bugs or performance, this should be addressed in ScalaJSON (this is open source software, its being developed in the open and not behind closed doors). People have had literally over a year to voice such concerns.

I mean if you have issues with JNumber, then this is gong to be exposed in ScalaJSON, regardless if its in an interface or an actual implementation. You can parameterise the number type, but that is probably going to create more problems with bugs, not less.

@gmethvin
Copy link

gmethvin commented Jul 3, 2017

As I've stated on the play-json issue, my main concern right now is rushing into making a decision on an unstable library. This is not a criticism of the general ScalaJSON effort at all. If you're asking us to switch while key issues are "going to be fixed", that is worrying, especially if these are possibly binary incompatible changes. If we give up control over the AST to another library, we want to know it is stable.

@mdedetrich
Copy link

As I've stated on the play-json issue, my main concern right now is rushing into making a decision on an unstable library. This is not a criticism of the general ScalaJSON effort at all. If you're asking us to switch while key issues are "going to be fixed", that is worrying, especially if these are possibly binary incompatible changes. If we give up control over the AST to another library, we want to know it is stable.

Sure, don't disagree with this. My main point before was the complaints about the test dependencies weren't accurate (ScalaJSON has the same test dependencies as spray-json which are scalatest and quickcheck, scala-js doesn't add any test dependencies and uTest is just for scala-js)

@jrudolph
Copy link
Member

jrudolph commented Jul 4, 2017

@fommil I see how deep dependencies can be a problem for an application like ensime. In such a case maintaining a private fork might really be the best solution if you want to optimize the time to get a version out for a new Scala version.

I'm actually not so much concerned that adding the dependency will add a huge amount of time to our 2.13 release. Also I'm much less convinced that one week or two actually matters for most people these days. I understand that this might be different for ensime. I think we can still deal with the issue if it turns out to be one.

@mdedetrich:

@jrudolph Also apart from the JsNumber BigDecimal constructor, both AST's are practically the same, so the migration from a purely technical PoV should be relatively painless (assuming you also do the type/object aliasing)

The technical issue for spray-json is maintaining binary compatibility. We cannot just switch out the AST right now. The solution sketched under 2 above could help with an eventual migration but it's a lot of work for which it isn't clear yet if it is worthwhile. So, unless either user ask for better integration or someone contributes a compatible scheme, we will probably go with option 1. Maybe, we even mark the dependency as provided so users which don't need the integration feature, don't get an extra dependency (though, I realize that doesn't help with @fommil's concern).

@ktoso
Copy link
Member Author

ktoso commented Jul 4, 2017

I realised I did not reply here, fixing that:

Basically since the goal both for us, and the ecosystem here is to provide more compatibility, going for a big release with breaking compatibility is a no-go (at least right now).

Option 1 is the most interesting, and perhaps the only viable even, since it provides the most bang-for-buck. Spray-json has been somewhat seem "kind of complete" (look at commit activity, yet it still does the job which is pretty valuable) though of course others are free and invited to disagree and develop their libraries, so our goal really is to integrate with the common ecosystem, rather than spark a rewrite.

For us this means that if we base other libraries (Alpakka comes to mind) on the scalajson ASTs, we free people to use whichever library they want. And same as with Reactive Streams, it's not realistic to think the common one will be a perfect fit for everyone, it'll always be the middle-ground, and we'll simply accept that for what it is - the inter-op types.

So in one sentence, I'm also in favour of attempting Option 1 for spray-json.
Until then, we await the final polish on scalajson tracking those closely.

Sounds reasonable @jrudolph @sirthias @mdedetrich ?

@mdedetrich
Copy link

mdedetrich commented Jul 5, 2017

@jrudolph

The technical issue for spray-json is maintaining binary compatibility. We cannot just switch out the AST right now. The solution sketched under 2 above could help with an eventual migration but it's a lot of work for which it isn't clear yet if it is worthwhile. So, unless either user ask for better integration or someone contributes a compatible scheme, we will probably go with option 1. Maybe, we even mark the dependency as provided so users which don't need the integration feature, don't get an extra dependency (though, I realize that doesn't help with @fommil's concern).

Oh of course, when I am advocating for Option 2, its implied that it would be a new major release which breaks binary compatibility. My main point is that I would want spray-json to eventually get to Option 2/3 for many reasons (performance between libraries that use ScalaJSON is one of them)

I think the only contentious issue is binary compatibility. I am not (at all) advocating for making a release that is binary compatible, its just my preference that a release for spray-json is made (eventually) that is a new major (which indicates breakages with binary compatibility) and older versions of spray-json can use the conversion module.

It will still be a users decision whether to upgrade or not, but at least we get the best of both worlds

@jrudolph
Copy link
Member

jrudolph commented Jul 5, 2017

To manage expectations: as it stands right now no major breaking release of spray-json is planned. As I explained it would entail changing artifact and package names to preventing binary compatibility issues in downstream projects.

@mdedetrich
Copy link

To manage expectations: as it stands right now no major breaking release of spray-json is planned.

Okay I think we are clear then

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants