Serializability of coroutine classes #76
Comments
`Serializable`, but I suspect that this is a compiler/language issue.
It does, but, unfortunately, making it …
Note that you can easily serialize coroutine state with a 3rd-party serialization framework like Kryo. It is just that standard JVM serialization does not currently work, due to objects that do not implement `Serializable`.
+1. I was able to achieve serialization (with the help of @elizarov, thank you so much), but only with the use of reflection, to change the value of …
+1 @elizarov. I totally understand the concept and saw the great example for ES6 generators, but is there a sample project that demonstrates the serialization, that we can build upon, so we make sure we're doing it correctly? It seems pretty deep.
@wdroste What is your use-case for serialization? Can you explain it here, please?
We have an IoT-type server application with millions of clients. These clients are low-powered in most cases, and the protocol they're using requires many HTTP-based request/responses against per-enterprise-customer customizable business logic (a scriptable workflow). That process can take anywhere from 30 secs to 1 min for an entire HTTP session; it is mostly waiting on I/O from the clients, and the actual server processing is only 1-2 secs in total. The memory required to hold the session state is rather large, so we'd like to make a trade-off: incur the CPU cost of serialization and server-side I/O persistence so we can free up some memory to continue processing. Basically, the issue is that the JVM spends a lot of time looking for memory to free that it can't, because the session/workflow state must be maintained. If we were able to persist that state, it could free the memory and use it for the other requests, thereby increasing throughput; as a side effect we could be stateless as well, since any server could load the current state and continue the workflow, making load balancing much easier.

Our workflows are Python scripts built as generators, so that we can provide a synchronous programming model over an asynchronous protocol; the issue is that Python does not support pickling/serialization of generators. We're looking to move to something that does. I mentioned ES6 generators because that's the style Python uses and we need to convert our workflows to another language; the 'yield' statement must return an object as well as provide one. I would prefer Kotlin's excellent IDE support, type safety, and compiler-based null checks, but there are others who would recommend we just do this in JavaScript on Rhino, since there's support for both 'yield' and serialization of coroutines.
I wrote a Kotlin test of it using Kryo with @elizarov's help - https://gist.github.com/Restioson/fb5b92e16eaff3d9267024282cf1ed72. The only issue is that it uses Kryo, a Java library (and a bit of Java reflection, which I'm sure can be converted to Kotlin). Theoretically it would work if you subbed that out for another serialization library able to serialize private fields, I imagine, or if you did it yourself with reflection.
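A rough sketch of that approach (not the gist's actual code; Kryo 5 API and the Objenesis instantiator are assumed, and later comments in this thread report that recent Kotlin/Kryo versions break it):

```kotlin
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.io.Input
import com.esotericsoftware.kryo.io.Output
import com.esotericsoftware.kryo.util.DefaultInstantiatorStrategy
import org.objenesis.strategy.StdInstantiatorStrategy
import java.io.ByteArrayOutputStream

fun main() {
    val kryo = Kryo().apply {
        setRegistrationRequired(false)
        setReferences(true) // the generated state machine contains reference cycles
        // Objenesis lets Kryo create the compiler-generated continuation classes,
        // which have no no-arg constructors.
        setInstantiatorStrategy(DefaultInstantiatorStrategy(StdInstantiatorStrategy()))
    }

    val original = sequence { var i = 0; while (true) yield(i++) }.iterator()
    check(original.next() == 0) // advance once before taking the snapshot

    val bytes = ByteArrayOutputStream().also { baos ->
        Output(baos).use { kryo.writeClassAndObject(it, original) }
    }.toByteArray()

    @Suppress("UNCHECKED_CAST")
    val copy = Input(bytes).use { kryo.readClassAndObject(it) } as Iterator<Int>

    println(original.next()) // 1 -- the original keeps going
    println(copy.next())     // 1 -- the copy resumes from the snapshot
}
```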
@Restioson Appreciate it, I will take a look.
I'd like to add an additional use case, this one for Android. In Android in particular there are cases that require repeated invocations of dialogs or independent activities. Coding this is very well suited to coroutines, and it works "well" if you ignore the fact that Android activities can be closed (and saved to disk) at any time. Some form of serialization would solve this problem, as a continuation could then just be stored, allowing safe resumption of state.
I've managed to make something work on Android (including serializing across activity recreation). There is one giant hack, though. In a coroutine it is deceptively easy to capture an activity; this is not valid. The hack will actually serialize Service, Application and Activity (subtypes) as simple enum constants. On deserialization a context is passed which is used in its place (with a cast to the "type"). The Kryo part lives in KryoIO.kt. Its use for Android (including some stuff around wrapping startActivityForResult) is in CoroutineActivity.kt.
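A rough sketch of the shape of that hack (not the actual KryoIO.kt code; Kryo 5 `Serializer` signatures assumed): only a one-byte marker is written for Context-derived objects, and an injected `Context` is substituted on read.

```kotlin
import android.app.Activity
import android.app.Service
import android.content.Context
import com.esotericsoftware.kryo.Kryo
import com.esotericsoftware.kryo.Serializer
import com.esotericsoftware.kryo.io.Input
import com.esotericsoftware.kryo.io.Output

// Marker written instead of the real Context-derived object.
private enum class ContextKind { APPLICATION, ACTIVITY, SERVICE }

class ContextSerializer(private val replacement: Context) : Serializer<Context>() {
    override fun write(kryo: Kryo, output: Output, obj: Context) {
        val kind = when (obj) {
            is Activity -> ContextKind.ACTIVITY
            is Service -> ContextKind.SERVICE
            else -> ContextKind.APPLICATION
        }
        output.writeByte(kind.ordinal) // only the marker ends up in the stream
    }

    override fun read(kryo: Kryo, input: Input, type: Class<out Context>): Context {
        input.readByte() // which kind it was; the injected replacement is used either way
        return replacement
    }
}
```

It would then be installed with something like `kryo.addDefaultSerializer(Context::class.java, ContextSerializer(injectedContext))`.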
I've created a project to illustrate my statements above. The ES6 generator part works great; however, I'm unable to serialize the coroutine. If we can serialize the workflow, we can use it for long-running processes. @elizarov, I could really use some help with this.
Looking at AWS Step Functions, this would be so much nicer.
@wdroste There are some really nasty bits of internals to take care of. I've managed to make it work (https://github.com/pdvrieze/android-coroutines/tree/master/core/src/main/java/nl/adaptivity/android/kryo), although this is not for the latest coroutines library (and it touches internals). One of the problems is that the coroutines library uses sentinel objects that are just plain objects (rather than enum instances, in the experimental version; enums are used in the non-experimental 1.3 version) and as such don't handle serialization properly. Of course there are also other issues.
Any update on this? I think this would be super useful!
@mrussek What's your use-case for this?
I think it would be very useful to implement long-running business processes.
You can do it right now. It takes some hardship to set up, but I don't see how we could make it any simpler. If you are to serialize a coroutine, it automatically means you have to abide by certain restrictions in your code.
I can understand that. Tx.
Read the discussion in this issue. There are links to a bunch of working examples.
@joost-de-vries You may want to look at business process management systems for this. For now, that is your best bet (although indeed it doesn't support debugging across the process). I'm not sure that coroutines can actually fully do what you want, though. A key design criterion is still the ACID properties of the steps. Btw, serialization would probably be a lot easier if it were possible to have the compiler validate captures (or the lack of captures).
Yeah, my work is currently looking at Uber's Cadence for something like this, although their approach seems a bit more hacky IMO.
@mrussek I had a quick look at Uber Cadence. It seems a sensible system, although I'm not clear why you would use it over a BPMN2-based system. In any case, the key in those systems is that they are essentially distributed systems where the workflow system doesn't do the work. It may be interesting to use workflows in a single-process system, but this is something that is not quite ready for the bleeding edge yet. I have a system that should be able to do it reasonably easily, as I already have the testing of workflows implemented without actual actions attached. Linking them with a lambda for the behaviour (rather than workflow messaging) should be reasonably straightforward.
One more use-case: Telegram bots.

```kotlin
suspend fun talkTo(user: User, firstMessage: Message) {
    when (firstMessage.text) {
        "Buy some donuts" -> sellDonutsTo(user)
        …
    }
}

private suspend fun sellDonutsTo(user: User) {
    sendMessage(user, "How many?")
    val number = receiveNumberFrom(user, onError = "Why don't you send me a whole positive number?")
    …
}
```

Currently this can be done with sealed classes, which requires much more patience.
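For contrast, a sketch of the sealed-class version of the same dialog (the types below are hypothetical stand-ins, not a real bot API): every point where the coroutine version simply suspends becomes an explicit state that has to be stored and dispatched by hand.

```kotlin
// Stand-ins for the bot API used in the snippet above (hypothetical types).
data class User(val id: Long)
data class Message(val text: String)
suspend fun sendMessage(user: User, text: String) { /* call the bot API here */ }

// Every suspension point of the coroutine version becomes an explicit state.
sealed class DialogState {
    object AwaitingFirstMessage : DialogState()
    object AwaitingDonutCount : DialogState()
}

suspend fun onMessage(state: DialogState, user: User, message: Message): DialogState =
    when (state) {
        DialogState.AwaitingFirstMessage -> when (message.text) {
            "Buy some donuts" -> {
                sendMessage(user, "How many?")
                DialogState.AwaitingDonutCount
            }
            else -> state
        }
        DialogState.AwaitingDonutCount -> {
            val number = message.text.toIntOrNull()
            if (number == null || number <= 0) {
                sendMessage(user, "Why don't you send me a whole positive number?")
                state
            } else {
                // sell the donuts, then return to the initial state
                DialogState.AwaitingFirstMessage
            }
        }
    }
```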
Another use-case: support for scripting via coroutines in a game engine that needs to serialize the scripts on save/load, or, heaven forbid, in a "rollback" networking system, where suspended scripts are saved each frame and swapped between depending on network-delayed inputs from other clients. Save/load also encompasses "zoning", where, for instance, as the player runs around the game world, parts of the world are loaded and unloaded dynamically. This would include NPCs with "wander" scripts, enemies with "scan then attack" scripts, etc.
@Miha-x64 Actually, direct cloning is not that easy due to control inversion (both serialization and deserialization are driven from the serializer side), so the format cannot easily shuffle between the two. You'd have to use (actual) threads, or use a buffer.
Thanks for your feedback and the code pointer, @pdvrieze. I'll analyze it to see if I can adapt it for my use case.
@pdvrieze I've tried to work through the android-coroutines repo you mentioned, extracted the non-Android bits into https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo (to work out a more minimalistic POC example) and prepared an example along with it in https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo/SeqBuilderSerializationExample.kt. When doing so, I lifted the coroutines, Kotlin and Kryo dependencies to the current versions. Unfortunately, the example still throws an NPE when continuing the sequence. I suspect https://github.com/holgerbrandl/kryo-kotlin-sam/blob/eb3dd4ab23e6dd036fde095e6e3065202ac6d995/src/main/kotlin/kryo/AndroidKotlinResolver.kt#L42-L45 refers to no-longer-valid/used classes. Also, I know it is asking a lot, but is there any chance you could provide me with more guidance to make this (i.e. sequence-builder persistence & continuation using Kryo) work?
@holgerbrandl I didn't look at it, but I suspect you are correct. Basically, to serialize coroutines you are going into the guts of how coroutines work (and for Android into replacing old instances with new ones, so you have a valid context rather than the original one, which isn't valid). I haven't really touched that code in a while, so it is probably no longer working with the latest library versions, which have different internal state.
Since it seems to have worked once, I'd still have some hope that we could make it work again. I'm happy to help, but I would need some pointers/guidance to get started. I'd be interested in doing so just on the JVM, without any Android. And I'd also be primarily interested in kryotizing …
I've played with it, using a recent version of Kryo (5.2.1) and a recent version of Kotlin (1.6.10). I'm not sure how it worked in the past, but it does not seem to be capable of serializing classes that do not have no-arg constructors anymore, which includes anonymous classes, lambdas, and suspending functions. Even though you can ask Kotlin to make the lambdas themselves serializable (including …). There have been some changes in the way Kotlin JVM generates lambdas, which might somehow contribute, too. See also the discussion in KT-45375, "Generate all Kotlin lambdas via invokedynamic + LambdaMetafactory by default".
On a side note, serializing a state of …
This would be great as well. I guess it could also be helpful when using Kryo (which is more applicable if not all types implement `Serializable`).
@elizarov I would agree with you. I've done the Kryo thing and it is brittle/unstable by design. Nice for a proof of concept, but not suitable for long-term production projects. Actually making the things serializable (either using kotlinx.serialization or otherwise) would be a better approach.
Given the somewhat slow pace of adopting …
I have stumbled upon this ticket while dreaming about a programming language with support for durable workflows. I believe, though, that the workflows should have explicit schemas: without those they would be brittle to changes in the codebase. So I would like to propose an annotations API, which would make it possible to develop an IR plugin for the JVM as an unstable proof of concept, though this would be more stable with compiler support, like … There have to be at least two annotation types: …
Here is an example of an annotated workflow definition:
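As a hypothetical sketch (annotation names and workflow steps are invented for illustration), such a definition could look like:

```kotlin
// Hypothetical API: one annotation marks a suspension as a durable persistence point,
// the other selects which locals are saved and under which stable name.
@Target(AnnotationTarget.EXPRESSION)
@Retention(AnnotationRetention.SOURCE)
annotation class PersistencePoint(val name: String)

@Target(AnnotationTarget.LOCAL_VARIABLE)
@Retention(AnnotationRetention.SOURCE)
annotation class Persisted(val name: String)

// Stubbed-out workflow steps so the sketch is self-contained.
data class Order(val id: String)
suspend fun awaitPayment(order: Order): String = "receipt-for-${order.id}"
fun ship(order: Order, receipt: String) = println("shipping ${order.id} ($receipt)")

suspend fun orderWorkflow(orderId: String) {
    @Persisted("order") val order = Order(orderId)
    // Only `order` needs to survive this suspension; the compiler plugin would verify that.
    val receipt = @PersistencePoint("awaiting-payment") awaitPayment(order)
    ship(order, receipt)
}
```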
Those two annotations are enough to implement a simple serialization plugin, but there could be others, like ones … What do you think about this API?
My initial thought is that a coroutine cannot be persisted without serializing all needed fields anyway; the only benefits of the annotation are explicitness and providing customizability of the serialized data (e.g. XML names), at a cost to legibility, IMO. (Omitting the name and using (static) reflection to retrieve it could help here.) Marking certain suspensions as serializable or not is also strange to me: what should the API look like from outside the coroutine? E.g. suppose I'm shutting down the application and there are suspended continuations; should those in a "non-persistable" state be cancelled as usual, while others are persisted? Or is persistence at the discretion of the suspended continuation's "owner", the annotation merely indicating the possibility? Or perhaps we should inject "Serializers" as we currently do Dispatchers, in a global way? If so, how do owners get a handle back to their suspended continuations? It's really a question of "who does what" here. What about migrating data across version changes? Can annotations alone account for this? I think I'd prefer a Ruby on Rails-like approach, or at least the option. There are just a lot of open questions, almost certainly worthy of its own KEEP.
Continuations may have unserializable fields, invisible in between persistence points: sockets, processes, intermediate buffers, etc. I think it is reasonable to explicitly select which fields to save and how, while the compiler checks that no other fields are required at the persistence points.
For most variables this will probably be true, and …
Yes, I was thinking of it as a possibility of persistence. Using that, the user of the Kotlin compiler should be able to implement any liveness policy. The policy may be centralized in a single JVM, so the coroutines would be durable across JVM restarts, or it may be distributed across a cluster with state persisted into some distributed NoSQL storage, so the coroutines would also be durable to the death of single processing nodes.
When the user is responsible for the location where the coroutines are stored and understands the format they are stored in, then they will be able to write any migration code they want, if that's what you are talking about.
If I understood correctly what you mean by "persistence," it's the ability to stop the execution of the program, and then resume it from the same place. Did I get this right? If so, the problem is completely unrelated to coroutines. Sure, coroutines do represent a reified state of the execution, and it's, in some sense, easier to save the execution in the middle of the coroutine than in some arbitrary code. But!
Etc. The common approach to dealing with this (and there are some precedents!) is to just dump the whole state of RAM related to that program. For example, Emacs did do this, using the …
I hope this is what you meant by persistence, or I'd feel stupid for giving a lecture about an irrelevant thing!
Not the whole program; the granular persistence of single coroutines, one by one, would be enough. In this case there should be no problems with multithreading, provided the user avoids shared mutable state, such as shared objects or static variables.
Workflows (or business processes, my research area) are much more heavyweight than coroutines are. When you state that workflows should use explicit schemas you are looking at it with the correct attitude. While I agree that current languages have poor support for workflows, I would suggest this is partially due to the challenge: a workflow management system needs to deal with the management of the workflow instances in flight. They require a high amount of robustness, may have state associated with the instance, and most of all may fail in unclear ways that may need some sort of manual handling. Unfortunately a lot of the work in this area is stuck in the enterprise (software) world, giving rise to languages such as BPEL4WS, or (better) direct execution of BPMN models.

The high-requirement approach (which still needs plenty of streamlining) is probably good when you're dealing with high-frequency processes, but for simple processes a more hard-coded workflow is probably better. Serializable coroutines could be a way in which this is done, but not always the best approach. (My context for looking at this originally was to support multi-stage Android steps: download a library; once downloaded, install it; once installed, use it to get an account (indirectly through the account manager)..., a process that can be done in a few minutes but involves many resumption points. Writing it as a coroutine is much, much cleaner than using some sort of manual way to implement events and resumptions. In this case failure of storage just means it needs to start from scratch.)

In a situation that requires more robustness (or more explicit synchronization structure, such as multi-instance subprocesses) you probably want some sort of library that provides a degree of explicit process support (with schemas for the state). A library is needed as there is significant runtime support required. Serialization is part of that (look at kotlinx.serialization), but the inputs and outputs of each activity/part should probably be explicit (and marked serializable). I'm not sure about coroutines as the basis for such a library; in any case it would go quite deep into what coroutines are/do, and may very well just use the suspend infrastructure without any of the other coroutine bits (one thing needed is the ability to resume the coroutine halfway through on restart).
IMO, a combination of Kotlin coroutines and Kotlin serialization can serve as a great backbone for persistent, long-running workflows. The missing piece is some explicit way to indicate that a given function represents a serializable workflow. From the user perspective, this can be implemented simply as a … Just like we have optional support for …
What does it mean to serialize a variable that Kotlin wouldn't turn into a field by default? Isn't that strictly a variable that has no impact after the suspension point? Hence, what is the purpose? Conversely, what would it mean to not serialize a variable that Kotlin converts to a field by default? Isn't that strictly necessary? Or were you thinking more like Java's …?

I suppose my point is, I don't know that annotations for fields are strictly necessary, even if their existence would be nice for customization. You could enable a linter setting if you'd like to mandate that all fields be annotated by the programmer; otherwise IDE underscoring could be sufficient for many, and a 1-to-1 mapping of what must actually be serialized. (Correct me if I'm missing something.)

On your next points, I agree that it's good that annotations only represent the possibility of serializability. I'm not familiar with most of the following terminology, however, but it sounds like these may be important use cases for coroutine serialization, rather than features we'd want Kotlin to implement specifically? As long as we can get serialization into a custom format out of suspended coroutines, I think that would suit most purposes.

In terms of use cases, for a game engine it would be good to support serializability as a side effect of a suspension point, not its primary purpose. It's problematic, though, because designers aren't programmers, and they're not going to want to annotate a bunch of fields to avoid catastrophic serialization issues (nor should anyone want them to). But annotation isn't the only thing necessary to avoid disaster: adding, removing or converting fields would have to be automatically mapped to schema changes as well, to make the process sane for a non-coroutine expert. This is way out of left field, but could the IDE automatically detect changes to …

EDIT: example DSL to make this concrete:

```kotlin
// Example game script DSL: NPC hauling items for all eternity
suspend fun behavior(npc: Npc) {
    while (true) {
        val item = npc.pickUpNearestItem()
        npc.walkTo(0, 0, 100) // << suspend function, item needs serializability here
        npc.dropItem(item)
        npc.walkTo(0, 0, 0)
    }
}
```

In this case, if a designer were to modify this script, I don't want them to even have to think about persistence if I can help it. I'd accept a ridiculous git history of ten thousand schema transitions, so long as it's correct and designers don't have to squint their eyes too hard.
The first example that comes to mind is function interface parameters, which aren't used by the implementation at the moment, but the code could be updated to use them later. They also could be used only before the first suspension point, so there would be no need to generate continuation fields for them, though code using them after the first suspension point may appear later. I believe that at the moment the Kotlin compiler optimizes such unused variables away and does not save them into continuation fields.
If a network socket is opened and has suspendable usage between two nearby persistence points, then the compiler will generate a continuation field for it, though there is no use for it in the persisted state, because it never escapes out of those two persistence points:
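A sketch of that situation (the `persistencePoint` checkpoint and the helpers are hypothetical, just to make the point concrete):

```kotlin
import java.net.Socket

// Hypothetical helpers so the sketch is self-contained; `persistencePoint` stands in for
// whatever durable checkpoint mechanism the proposal would provide.
fun buildRequest(): String = "GET / HTTP/1.0\r\n\r\n"
suspend fun persistencePoint(name: String) { /* durable checkpoint would happen here */ }
suspend fun readAll(socket: Socket, request: String): String {
    socket.getOutputStream().write(request.toByteArray())
    return socket.getInputStream().readBytes().decodeToString()
}
fun store(request: String, response: String) =
    println("${request.length} request bytes, ${response.length} response bytes")

suspend fun fetchAndStore(host: String) {
    val request = buildRequest()       // used again after the gap: worth persisting
    persistencePoint("before-fetch")

    val socket = Socket(host, 80)      // not meaningfully persistable
    // A suspension happens while `socket` is in scope, so the compiler spills it into a
    // continuation field, yet it never needs to be part of the persisted state because it
    // cannot escape past the "after-fetch" persistence point.
    val response = try {
        readAll(socket, request)
    } finally {
        socket.close()
    }

    persistencePoint("after-fetch")
    store(request, response)
}
```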
Yes, that is what I had in mind.
Though extracting the schema from suspendable code into a separate source file is probably possible, I believe that keeping those two files matched would be far more cryptic than having a single source file with the suspendable code annotated.
That makes sense; persisting fields not currently used but possibly necessary in the future would indeed be valuable.
I see - in my mind, every suspension in the coroutine would be persistable, but I suppose this isn't strictly necessary. I suppose non-annotated suspension points would be "unsafe points", effectively rolling back to the last persisted point if interrupted? Outside my domain of experience, so hard to comment, sorry.
How would annotations cover the case of a field being renamed? What about multiple renames over time? How would data under the old schema be loaded into the new? If the programmer does not update annotations to account for a rename, what should happen? By "cryptic" did you mean "difficult to implement"? Because if you meant "difficult to understand", I don't think a generated schema and series of migrations (à la Ruby on Rails) would be hard to read; I think it would be clear when looked at, which would basically only be necessary upon commit or in advanced usage scenarios. It did occur to me that some migration code may be impossible to generate: when a new field is added, for instance, code to create it would have to be hand-written, barring a modal-dialogue-specified default value. And in the case where a meaningful value cannot be generated via migration, the coroutine code itself might have to interpret a "does not exist" value and modify its own behavior to handle it; this would violate the principle of coroutines not caring about the generated schema/migrations underneath. There may not be an ideal solution to this.
If there were no data persisted under the current field name, then the deserializer would try to get it from an alias.
Renaming …
Renaming … to …
I guess, then a runtime deserialization error would happen: …
Ruby on Rails migrations only work with explicitly named tables and fields, while the serialization of unannotated coroutines would have to work not only with variables, whose names could be inferred with reflection, but also with unnamed suspension points. @elizarov has suggested giving those sequential labels, as the Kotlin compiler does, so the migrations would have to be defined in terms of suspension points 0, 1, 2, ... I believe those would be harder to understand than explicit names.
@faucct The way I would see it working would be (using kotlinx.serialization) that the hidden object holding the function stack/continuation is serializable. Then all values would be required to be serializable (and could be annotated). If you want to specify the type for serialization, that would perhaps be done as an annotation on the suspend declaration (in line with the context feature), with local variables not existing except as provided by that context. (It would need some special rules to deal with setting "val" properties.)
I have hacked together a Kryo-based schemaless persistence solution. The …
@faucct I did at some point have something similar (without crashes), focused on the Android context. However, it is quite tricky and generally (esp. in the Android context) requires some fixups as well. For example, objects should not be deserialized, and Android Context objects should be injected (and not serialized/deserialized). And of course it is extremely dependent upon the specifics of the library. It can "work" but is fragile. In addition, the values captured in the closure are not explicit. This makes the code using such serialized coroutines quite error-prone.
I have tried to think about the serialization in terms of the kotlinx.serialization API, to see if there is enough of it. …
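As a hypothetical illustration of that direction (not an existing API), the persisted form of a single suspension point could be modelled as an ordinary `@Serializable` snapshot of the suspension-point label plus the named locals:

```kotlin
import kotlinx.serialization.Serializable
import kotlinx.serialization.json.Json

// Hypothetical shape only: a plugin would produce and consume these snapshots.
@Serializable
data class ContinuationSnapshot(
    val label: Int,                  // which suspension point to resume at
    val fields: Map<String, String>  // named locals, encoded by the (hypothetical) plugin
)

fun main() {
    val snapshot = ContinuationSnapshot(label = 1, fields = mapOf("order" to "42"))
    val json = Json.encodeToString(ContinuationSnapshot.serializer(), snapshot)
    println(json) // {"label":1,"fields":{"order":"42"}}
    println(Json.decodeFromString(ContinuationSnapshot.serializer(), json))
}
```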
I have implemented workflows using my JVM instrumentation-based continuations implementation (serializable continuations) and a JS transpiler. Durability/scalability is offered by Kafka Streams or, in fact, any stateful stream transformer. However, I'd prefer Kotlin coroutines if they were serializable.
Do you mean workflow versioning here? The versioning indeed must be part of the code, but workflow-as-code is much easier than using declarative workflow definitions, even for versioning.
@awto Part of the problem of workflows is versioning. It is required. I'd say that workflows would probably be better expressed declaratively (although this can be done in code, with the actual activity implementation being directly in code; BPEL is just bad in usability and too tied to an imperative execution model). What I was more concerned about is how you deal with workflows that have "problems" in the middle of their (sometimes weeks- or months-long) execution. Part of dealing with errors is to have robust specifications to start with. You want to be explicit about the inputs and outputs of steps. As such, coroutines that do this automatically/transparently may not be the best approach (I haven't tried it, so this is not a dismissal, just an "I don't know").
We can be very flexible with the versioning and code. As it's Kafka, we can have the whole history. We can travel back to some point where it's easy to update, or wait for some point in the future. We can simulate already-recorded inputs. Or we can throw an exception signalling the update, and it's the workflow which catches it and performs the necessary state handling. For this, the state doesn't need to be exposed at all. Though it can be (for example, using the local variable annotations described in this thread above). But I wouldn't do this.
It depends on how we define the word "declarative". The workflow languages I know don't look declarative at all. They are usually some visual diagrams or some lists in YAML, or something like this - but they are still instructions about how we do things, not what we do. So they are just yet another programming language, though heavily restricted.
These "problems" are just exceptions. The workflow developer is responsible to handle them gracefully in both approaches. An advantage of the continuations+Kafka approach is we have the log of the weeks/months etc. We can time travel to any position. No need for any spec or vendors to support this. Everything needed is already available.
I'm not sure I understand this. The spec is the Kafka topics schema. It's only the input/output spec, and the internal state isn't exposed; it can be exposed, but it shouldn't be. Everything the outside needs to know must be posted into some dedicated Kafka topic, and everything the workflow needs to know from the outside, it must read from some dedicated Kafka topic. Note, the internal state is still available for debugging (with any JVM debugger), and since we can travel to the past it's available at any point in time. The steps are just code - they look explicit enough. We can use any software development practices to make the code better, or we can write something ugly. But it's the same for BPEL, except BPEL restricts using the practices.
Currently, if you attempt to serialize coroutine state via standard Java serialization, the tree of references leads you to objects defined in `kotlinx.coroutines` like `StandaloneCoroutine` (which serves as a completion for launched coroutines) and its contexts (like `CommonPool`), which are not currently serializable. They should be made properly serializable.

See also https://github.com/Kotlin/kotlin-coroutines/issues/28
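A minimal way to observe the failure, using only the standard-library coroutine intrinsics (no `kotlinx.coroutines`, so the first non-serializable object hit is the generated continuation class itself rather than `StandaloneCoroutine`):

```kotlin
import java.io.ByteArrayOutputStream
import java.io.NotSerializableException
import java.io.ObjectOutputStream
import kotlin.coroutines.*

// Capture a suspended continuation and try to write it with standard JVM serialization.
fun main() {
    var captured: Continuation<Unit>? = null
    val body: suspend () -> Unit = {
        suspendCoroutine<Unit> { cont -> captured = cont } // suspend and keep the continuation
        println("resumed")
    }
    body.startCoroutine(Continuation(EmptyCoroutineContext) { /* completion */ })

    try {
        ObjectOutputStream(ByteArrayOutputStream()).use { it.writeObject(captured) }
    } catch (e: NotSerializableException) {
        // The generated continuation class (and everything it references, such as the
        // completion) does not implement java.io.Serializable, so this fails today.
        println("Cannot serialize suspended coroutine: ${e.message}")
    }
}
```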