Serializability of coroutine classes #76

elizarov · 2017-06-20T09:07:23Z

Currently, if you attempt to serialize coroutine state via standard Java Serialization the tree of reference leads you to the objects defined in kotlinx.coroutines like StandaloneCoroutine (which serves as a completion for launched coroutines) and its contexts (like CommonPool) which are not currently serializable. They should be made properly serializable.

See also https://github.com/Kotlin/kotlin-coroutines/issues/28

The text was updated successfully, but these errors were encountered:

fvasco · 2017-06-20T10:04:37Z

Serializable object should work out of the box, like

enum class CommonPool { INSTANCE }

but I suspect that this is compiler language issue.

elizarov · 2017-06-20T10:11:18Z

It does, but, unfortunately, making it enum pollutes its namespace with methods like valueOf and values which we don't need here. The correct solution is to provide private fun readResolve. Ideally, this should be done by Kotlin compiler, but it is not there yet: https://youtrack.jetbrains.com/issue/KT-14528

elizarov · 2017-06-20T10:12:39Z

Note, that you can easily serialize coroutine state with 3rd party serialization framework like Kryo. It is just that standard JVM serialization does not currently work due to objects that do not implement Serializable.

Restioson · 2017-06-21T15:15:49Z

+1. I was able to achieve serialization (with the help of @elizarov, thank you so much), but only with the use of reflection, to change the value of result to COROUTINE_SUSPENDED

wdroste · 2017-07-19T17:06:32Z

+1

@elizarov totally understand the concept and saw the great example for ES6 generators but is there a sample project that can demonstrate the serialization that we can build upon.. so we make sure we're doing it correctly seems pretty deep.

elizarov · 2017-07-20T14:28:23Z

@wdroste What is your use-case for serialization? Can you explain it here, please.

wdroste · 2017-07-20T16:53:26Z

We have a IoT type server application w/ millions of clients. These clients are low powered in most cases and the protocol they're using requires many HTTP based request/responses to a per enterprise customer customizable business logic (scriptable workflow). Now that process can take anywhere from 30 secs to 1 min for an entire HTTP Session.. that process is mostly waiting on I/O from the clients, the actual server processing is only a 1-2 secs total. The memory required to hold the session state is rather large so we like to make a trade off. Incur the CPU cost of serialization and server side I/O persistence so we can free up some memory to continue processing.

Basically the issue is the JVM spends a lot of time looking for memory to free that it can't because the session/workflow state must be maintained, if we were able to persist that state it could free the memory and use it for the other requests, thereby increasing throughput and as a side effect we could be stateless as well since any server could load the current state and continue the workflow making loading balancing much easier.

Our workflows are python scripts built as generators such that we can provide an synchronous programing model to an asynchronous protocol, the issue is python does not support pickling/serialization of the generator. We're looking to move to something that does.

I mentioned ES6 generator because that's the style python uses and we require to convert our workflows to another language. The 'yield' statement must return an object as well as provide one. I would prefer Kotlin excellent IDE support, type safety, and compiler based null checks, but there's others that would recommend we just do this in Javascript Rhino since there's support for both 'yield' and serialization of coroutines.

Restioson · 2017-07-20T18:49:42Z

I wrote a Kotlin test of it using Kryo with @elizarov's help - https://gist.github.com/Restioson/fb5b92e16eaff3d9267024282cf1ed72 . The only issue is that it uses Kryo, a Java library (and a bit of java reflection which I'm sure can be converted to Kotlin). Theoretically it would work if you subbed that out with another serialization library able to serialize private fields, I imagine, or if you did it yourself with reflection.

wdroste · 2017-07-20T19:21:13Z

@Restioson appreciate it, I will take a look.

pdvrieze · 2017-11-20T20:38:12Z

I'd like to add an additional use case. This is in case of Android. In Android in particular there are cases that require repeated invocations of dialogs or independent activities. The coding of this is very suited for coroutines and it works "well" if you ignore the fact that android activities can be closed (and saved to disk) at any time. Some form of serialization would solve this problem. as a continuation can then just be stored allowing safe resumption of state.

pdvrieze · 2017-11-27T12:03:47Z

I've managed to make something work on Android (including serializing across activity recreation). There is one giant hack though. In a coroutine it is deceptively easy to capture an activity. This is not valid. The hack will actually serialize Service, Application and Activity (subtypes) as simple enum constants. On deserialization a context is passed which is used in it's place (with a cast to the "type").

The Kryo part lives in KryoIO.kt. It's use for android (including some stuff around wrapping startActivityForResult is in CoroutineActivity.kt

wdroste · 2018-01-13T05:45:04Z

I've create a project to illustrate my statements above. The ES6 generator part works great, however I'm unable to serialize the coroutine. If we can serialize the workflow we can use it for long running processes.

@elizarov could really use some help w/ this.
@Restioson tried to follow your gist but it doesn't seem to work in 1.2.10
https://github.com/wdroste/kotlin-generator-serializer

unicomp21 · 2018-09-14T08:03:17Z

Looking at aws step functions, this would be so much nicer.

pdvrieze · 2018-09-20T15:04:55Z

@wdroste There are some really nasty bits of internals to take care of. I've managed to make it work (https://github.com/pdvrieze/android-coroutines/tree/master/core/src/main/java/nl/adaptivity/android/kryo) although this is not for the latest coroutines library (and it has internals).

One of the problems is that the coroutines library uses sentinel objects that are just objects (rather than enum instances - for the experimental version. Enums are used in 1.3 nonexperimental) and as such don't handle serialization properly. If course there are also other issues.

mrussek · 2019-04-14T01:45:52Z

Any update on this? I think this would be super useful!

elizarov · 2019-04-20T16:59:11Z

@mrussek What's your use-case for this?

joost-de-vries · 2019-05-02T13:17:10Z

I think it would be very useful to implement long running business processes.
The tools that are available for this are usually proprietary using some incomplete home brew expression language and flow control. And almost all of them use proprietary storage and run platform. As a result you can't use proper version control, proper debugging, proper module reuse, proper cloud deployment etc.
So I being able to use coroutines to implement this would be incredibly exciting.

elizarov · 2019-05-23T23:51:41Z

You can do it right now. It takes so hardship to setup, but I don't see how we could make it any more simple. If you are to serialize coroutine it automtically means you have to abide by certain restrictions in your code.

joost-de-vries · 2019-05-24T05:37:05Z

I can understand that. Tx.
Can you give some pointers? I'd like to make a small playground to see what's involved.

elizarov · 2019-05-24T12:50:33Z

Read the discussion in this issue. There are links to a bunch of working examples.

pdvrieze · 2019-06-07T07:48:18Z

@joost-de-vries You may want to look at business process managements systems for this. For now, that is your best bet (although indeed it doesn't support debugging across the process). I'm not sure that coroutines can actually fully do what you want though. A key design criterion is still in ACID properties of the steps.

Btw. serialization would probably be a lot easier if it was possible to have the compiler validate captures (or the lack of captures).

mrussek · 2019-06-13T17:19:47Z

Yeah, my work is currently looking at Uber's cadence for something like this, although their approach seems a bit more hacky imo.

pdvrieze · 2019-06-17T07:00:36Z

@mrussek I had a quick look at uber cadence. It seems a sensible system although I'm not clear why you would use it over a bpmn2 based system. In any case, key in those systems is that they are essentially distributed systems where the workflows system doesn't do the work. It may be interesting to use workflows in a single-process system, but this is something that is not quite ready for bleeding edge yet. I have a system that should be able to do it reasonably easily as I already have the testing of workflows implemented without actual actions attached. Linking them with a lambda for the behaviour (rather than workflow messaging) should be reasonably straightforward.

Miha-x64 · 2020-04-09T21:06:40Z

One more use-case: Telegram bots.
For each user, a bot stores a state of the dialogue, which is a finite state machine easy to express with a coroutine, e. g.

suspend fun talkTo(user: User, firstMessage: Message) {
    when (firstMessage.text) {
        "Buy some donuts" -> sellDonutsTo(user)
        …
    }
}
private suspend fun sellDonutsTo(user: User) {
    sendMessage(user, "How many?")
    val number = receiveNumberFrom(user, onError = "Why don't you send me a whole positive number?")
    …
}

Currently this can be done with sealed classes which requires much more patience.

philipguin · 2020-04-24T19:31:52Z

Another use-case: support for scripting via coroutines in a game engine that needs to serialize the scripts on save/load, or heaven forbid, in a "rollback" networking system, where suspended scripts are saved each frame and swapped between depending on network-delayed inputs from other clients.

Save/load also encompasses "zoning", where for instance, as the player runs around the game world, parts of the world are loaded and unloaded dynamically. This would include NPCs with "wander" scripts, enemies with "scan then attack" scripts, etc.

pdvrieze · 2021-11-15T16:57:26Z

@Miha-x64 Actually, direct cloning is not that easy due to control inversion (both serialization and deserialization are driven from the serializer side, so the format cannot easily shuffle between the two. You'd have to use (actual) threads, or use a buffer.

holgerbrandl · 2021-11-16T06:21:39Z

Thanks for your feedback and the codepointer @pdvrieze . I'll analyze it to see if I can adapt it for my use case.

holgerbrandl · 2022-02-01T07:54:55Z

@pdvrieze I've tried to work through your mentioned android-coroutines repo, extracted the non-android bits into https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo (to work out a more minimalistic POC example) and prepared an example along with it in https://github.com/holgerbrandl/kryo-kotlin-sam/blob/master/src/main/kotlin/kryo/SeqBuilderSerializationExample.kt.

When doing so, I had lifted coroutines, kotlin and kryo dependencies to the current versions.

Unfortunately, the example still throws an NPE when continuing the sequence. I suspect https://github.com/holgerbrandl/kryo-kotlin-sam/blob/eb3dd4ab23e6dd036fde095e6e3065202ac6d995/src/main/kotlin/kryo/AndroidKotlinResolver.kt#L42-L45 to to refer to no longer valid/used classes. Also CommonPool which is somehow used to configure the Kryo instance seems gone. But these aspects may not even relate to the root cause, which is still in the dark for me.

I know it is asking a lot, but is there any chance you could provide me with more guidance to make this - i.e. sequence-builder persistence & continuation using kryo - work?

pdvrieze · 2022-02-01T18:58:52Z

@holgerbrandl I didn't look at it, but I suspect you are correct. Basically to serialize coroutines you are going into the guts of how coroutines work (and for Android into replacing old instances with new ones - so you have a valid context rather than the original one - which isn't valid). I haven't really touched that code in a while so it is probably no longer working with the latest library versions that have different internal state.

holgerbrandl · 2022-02-01T19:22:12Z

Since it seem to have worked once, I'd still have some hope that we could make it work again. I'm happy to help, but I would need some pointer/guidance to get started.

I'd be interested to do so just on the JVM without any Android. And I'd also be primarily interested in kryotizing sequence{}.iterator() instances. Not sure if this makes things simpler or more complicated.

elizarov · 2022-02-02T10:04:41Z

I've played with it, using the recent version of Kryo (5.2.1) and the recent version of Kotlin (1.6.10) . I'm not sure how it worked in the past, but it does not seem to be capable of serializing classes that do not have no-arg constructors anymore, which includes anonymous classes, lambdas, and suspending functions. Even though you can ask Kotlin to make the lambdas themselves serializable (including suspend lambdas), but the state machines that Kotlin generates for the suspending lambda and for suspending functions are not serializable and do not have a default constructor, so Kryo, as of today, cannot serialize them.

There have been some changes in the way Kotlin JVM generates lambdas, which might somehow contribute, too. See also discussion here KT-45375 Generate all Kotlin lambdas via invokedynamic + LambdaMetafactory by default.

elizarov · 2022-02-02T10:17:35Z

On a side note, serializing a state of sequence does not work with Java Serialization either, because kotlin.sequences.SequenceBuilderIterator is not a serializable class. This might be easier to fix, than to figure it out with Kryo. See KT-51100 Consider making kotlin.sequences.SequenceBuilderIterator serializable.

holgerbrandl · 2022-02-02T20:11:03Z

This would be great as well. I guess it could also be helpful when using kryo (which is more applicable if not all types implement Serializable) via https://github.com/EsotericSoftware/kryo#javaserializer-and-externalizableserializer

pdvrieze · 2022-02-03T19:34:57Z

@elizarov I would agree with you. I've done the Kryo thing and it is brittle/unstable by design. Nice for a proof of concept, but not suitable for long-term production projects. Actually making the things serializable (either using kotlinx.serialization or otherwise) would be a better approach.

holgerbrandl · 2022-02-07T08:26:52Z

Given the somehow slow pace of adopting Serializable in the coroutines/stdlib (timeline in this thread, lack of target release & lack of assignees), I would not mind a "brittle/unstable" kryo setup to get started with a POC. Do you think it's a bug in kryo? If so, I could try to submit a ticket there.

faucct · 2023-01-15T11:00:18Z

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

Though I believe that the workflows should have explicit schemas: without those they would be brittle to changes of the codebase.

So I would like to propose an annotations API, which would make it possible to develop an IR-plugin for JVM as an unstable proof of concept, though this would be more stable with the compiler support, like kotlinx.serialization has.

There have to be at least two annotation types: @PersistencePoint and @PersistedField.
@PersistencePoint would be annotating the suspension-points where the coroutine could be persisted: either directly by calling some Persistor-interface element from coroutineContext, responsible for persistence, either indirectly by calling another suspendable function, which could at some point persist the coroutine, also directly or indirectly. The annotation instance should have an explicit label value, unique for a continuation, so it would be stable to the code changes.

@PersistedField would be annotating the persisted parameters/fields. This ones would also have unique label, also unique for a continuation.

Here is an example of an annotated workflow definition:

abstract class Persistor : CoroutineContext.Element, suspend () -> Unit {
    abstract suspend fun invoke()

    companion object Key : CoroutineContext.Key<Persistor>

    override val key: CoroutineContext.Key<*> = Key
}

suspend fun persist() = coroutineContext[Persistor.Key]()

suspend fun bookTrip(@PersistedField("name") name: String) {
    @PersistencePoint("reservingCar") @PersistedField("car") val car = reserveCar()
    @PersistencePoint("reservingHotel") @PersistedField("hotel") val hotel = reserveHotel()
    @PersistencePoint("reservingFlight") @PersistedField("flight") val flight = reserveFlight()
    return Trip(car, hotel, flight)
}

suspend fun reserveCar() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveCar(idempotencyKey)
}

suspend fun reserveHotel() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveHotel(idempotencyKey)
}

suspend fun reserveFlight() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.reserveFlight(idempotencyKey)
}

Those two annotations are enough to implement a simple serialization plugin, but there could be others, like ones kotlinx.serialization has to support polymorphism, custom serializers, etc...

What do you think about this API?

philipguin · 2023-01-15T23:13:55Z

My initial thought is that a coroutine cannot be persisted without serializing all needed fields anyway - the only benefits of the annotation are explicitness and providing customizability to the serialized data (e.g. XML names), at the cost to legibility imo. (Omitting the name and using (static) reflection to retrieve it could help here.)

Marking certain suspensions as serializable or not is also strange to me - what should the API look like from outside the coroutine? E.g. suppose I’m shutting down the application and there are suspended continuations - should those in a “non-persistable” state be canceled as usual, while others are persisted? Or is persistence at the discretion of the suspended continuation’s “owner,” the annotation merely indicating the possibility? Or perhaps we should inject “Serializers” as we currently do Dispatchers, in a global way? If so, how do owners get a handle back to their suspended continuations? It’s really a question of “who does what” here.

What about migrating data across version changes? Can annotations alone account for this? I think I’d prefer a Ruby on Rails-like approach, or at least the option.

There’s just a lot of open questions, almost certainly worthy of its own KEEP.

faucct · 2023-01-16T06:31:53Z

My initial thought is that a coroutine cannot be persisted without serializing all needed fields anyway

Continuations may have unserializable fields invisible inbetween persistence points: sockets, processes, intermediate buffers, etc. I think it is reasonable to explicitly select what fields to save and how, while the compiler would check if there are no other fields required by the persistence points.

suspend fun reserveCar() -> String {
    @PersistedField("idempotencyKey") val idempotencyKey = java.util.UUID.randomUUID().toString()
    @PersistencePoint("requesting") val _ = persist()
    return GRPC.newClient().using { grpc -> grpc.reserveCar(idempotencyKey) }
}

Omitting the name and using (static) reflection to retrieve it could help here.

For most variables this will probably be true and @PersistedField annotations may be skipped, but I don't think this approach would be able to also replace @PersistencePoint annotations.

Marking certain suspensions as serializable or not is also strange to me - what should the API look like from outside the coroutine? E.g. suppose I’m shutting down the application and there are suspended continuations - should those in a “non-persistable” state be canceled as usual, while others are persisted? Or is persistence at the discretion of the suspended continuation’s “owner,” the annotation merely indicating the possibility?

Yes, I was thinking of it as a possibility of persistence. Using that the Kotlin compiler user should be able to implement any policy of liveness. The policy may be centralizing in single JVM, so the coroutines would be durable to the JVM restarts, or it may be distributing across the cluster with state persisting into some distributed NoSQL storage, so the coroutines would be also durable to death of single processing nodes.
I imagine the distributed policy as something like akka-cluster sharding: where every shard is responsible for running a subset of persisted coroutines, whose PersistenceIds hashes to specific ranges of values. When someone in a cluster wants to start a new coroutine, he would have to give it a unique PersistenceId and persist the initial fields to it. After that the cluster would be responsible for liveness and consistency of this coroutine. Liveness would be achieved with cluster trying to eventually have a node responsible for the coroutine. Consistency would be achieved by persisting the state using distributed compare-and-swap requests at persistence points, so if there are multiple nodes running the same coroutine they would be able to notice that at persistence points and synchronize who is supposed to be running it before its execution forks unmergeably.
https://doc.akka.io/docs/akka/current/typed/cluster-sharding.html#persistence-example

class HelloWorldService(system: ActorSystem[_]) {
  import system.executionContext

  private val sharding = ClusterSharding(system)

  // registration at startup
  sharding.init(Entity(typeKey = HelloWorld.TypeKey) { entityContext =>
    HelloWorld(entityContext.entityId, PersistenceId(entityContext.entityTypeKey.name, entityContext.entityId))
  })

  private implicit val askTimeout: Timeout = Timeout(5.seconds)

  def greet(worldId: String, whom: String): Future[Int] = {
    val entityRef = sharding.entityRefFor(HelloWorld.TypeKey, worldId)
    val greeting = entityRef ? HelloWorld.Greet(whom)
    greeting.map(_.numberOfPeople)
  }
}

What about migrating data across version changes? Can annotations alone account for this? I think I’d prefer a Ruby on Rails-like approach, or at least the option.

When user is responsible for the location where the coroutines are being stored and understands the format they are being stored in, then he will be able to write any migration code he wants, if that's what you are talking about.
Though I was talking about versioning of code around the persistence points. The annotations would let you add/remove statements between the persistence points as long there are no required fields being added. You would also be able to add persistence points between existing ones. Removing would also be possible, though you would have to be sure that there are no persisted coroutines in those persistence points, or else they won't be able to resume and will halt until you will decide how to migrate them or bring back those persistence points.

dkhalanskyjb · 2023-01-16T12:19:00Z

If I understood correctly what you mean by "persistence," it's the ability to stop the execution of the program, and then resume it from the same place. Did I get this right?

If so, the problem is completely unrelated to coroutines. Sure, coroutines do represent a reified state of the execution, and it's, in some sense, easier to save the execution in the middle of the coroutine than in some arbitrary code. But!

What about static variables? These are not part of a coroutine.
If you have several fields pointing to the same (potentially mutable) object, how do you preserve this relation when serializing?
What about multithreading? One coroutine could execute in one thread and reach a "persistence point", but the other ones could still execute in parallel.

Etc. The common approach to dealing with this (and there are some precedents!) is to just dump the whole state of RAM related to that program. For example, emacs did do this, using the unexec glibc function that stored the memory state: https://lwn.net/Articles/673724/ This deals with all the problems I listed. However, even this approach has its limitations:

What about open files? If you have log.txt open for writing, then stop the process, and then start again, there may even be no log.txt, and the operating system will certainly not remember that you're supposed to be able to write to it.
Likewise for any per-process state that the operating system stores.

I hope this is what you meant by persistence, or I'd feel stupid for giving a lecture about an irrelevant thing!

faucct · 2023-01-16T13:12:15Z

If I understood correctly what you mean by "persistence," it's the ability to stop the execution of the program, and then resume it from the same place. Did I get this right?

Not the whole program, the granular persistence of single coroutines one by one would be enough. In this case there should be no problems with multithreading, considering the user avoids shared mutable state, such as shared objects or static variables.

pdvrieze · 2023-01-16T15:32:34Z

@faucct

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

Workflows (or business processes - my research area) are much more heavyweight than coroutines are. When you state that workflows should use explicit schemas you are looking at it with the correct attitude.

While I agree that current languages have poor support for workflows, I would suggest this is partially due to the challenge. A workflow management system needs to deal with the management of the workflow instances in flight. They require a high amount of robustness, may have state associated with the instance, and most of all may fail in unclear ways that may need some sort of manual handling.

Unfortunately a lot of the work in this area is stuck in the enterprise (software) world giving rise to languages such as BPEL4WS, or (better) direct execution of BPMN models. The high requirement approach (which still needs plenty of streamlining) is probably good when you're dealing with high-frequency processes, but for simple processes a more hard-coded workflow is probably better. Serializable coroutines could be a way in which this is done, but not always the best approach. (My context for looking at this originally was to support multi-stage Android steps: download a library, once downloaded install it, once installed use it to get an account (indirectly through the account manager)...., a process that can be done in a few minutes, but involves many resumption points. Writing it as a coroutine is much much cleaner than using some sort of manual way to implement events and resumptions. In this case failure of storage just means it need to start from scratch.

In a situation that requires more robustness (or more explicit synchronization structure such as multi-instance subprocesses) you probably want some sort of library that provides a degree of explicit process support (with schemas for the state). A library is needed as there is significant runtime support needed. Serialization is part of that (look at kotlinx.serialization), but the inputs and outputs of each activity/part should probably be explicit (and marked serializable). I'm not sure about coroutines as the basis for such a library, in any case it would go quite deep into what coroutines are/do, and may very well just use the suspend infrastructure without any of the other coroutine bits (one thing needed is the ability to resume the coroutine halfway through on restart).

elizarov · 2023-01-16T16:05:18Z

IMO, a combination of both Kotlin coroutines and kotlin serialization can serve as a great backbone for persistent, long-running workflows. The missing piece is some explicit way to indicate that a given function represents a serializable workflow.

From the user-perspective, this can be implemented simply as a @Serializable suspend function, where a compiler will ensure that function's state carried over each suspension point is serializable, and all the invoked suspending functions are serializable as well.

Just like we have an optional support for @SerialName in class serialization, we might need an optional mechanism for naming suspension points, so that the workflows can evolve while maintaining backwards compatibility with the previously serialized state. However, this is not a hard requirement in all domains, so a compiler can just number suspension points by default.

philipguin · 2023-01-16T22:02:33Z

Continuations may have unserializable fields invisible inbetween persistence points: sockets, processes, intermediate buffers, etc. I think it is reasonable to explicitly select what fields to save and how, while the compiler would check if there are no other fields required by the persistence points.

What does it mean to serialize a variable that Kotlin wouldn't turn into a field by default? Isn't this strictly a variable that has no impact after the suspension point? Hence, what is the purpose?

Conversely, what would it mean to not serialize a variable that Kotlin converts to a field by default? Isn't this strictly necessary? Or were you thinking more like Java's transient, where someone somewhere would be reconstructing the field during deserialization?

I suppose my point is, I don't know that annotations for fields are strictly necessary, even if their existence would be nice for customization. You could enable a linter setting if you'd like to mandate all fields be annotated by the programmer, otherwise IDE underscoring could be sufficient for many, and a 1-to-1 mapping of what must actually be serialized. (Correct me if I'm missing something.)

On your next points, I agree that it's good that annotation only represent the possibility of serializability. I'm not familiar with most of the following terminology however, but it sounds like things that may be important use cases for coroutine serialization, rather than features we'd want Kotlin to implement specifically? As long as we can get serialization into a custom format out of suspended coroutines, I think that would suit most purposes.

In terms of use cases, for a game engine it would be good to support serializability as a side-effect of a suspension point, not it's primary purpose. E.g. delay would be serializable by default, so the entire game or level in a game could be halted, saved to disk, then loaded and resumed later, delayed coroutines in-tact. The only effort necessary would be on the implementer of delay incorporating serialization, with minimal effort (if any) from the coroutine author.

It's problematic though, because designers aren't programmers, and they're not going to want to annotate a bunch of fields to avoid catastrophic serialization issues (nor should anyone want them to.) But annotation isn't the only thing necessary to avoid disaster - adding, removing or converting fields would have to be automatically mapped to schema changes as well, to make the process sane to a non-coroutine-expert.

This is way out of left field, but could the IDE automatically detect changes to @Serializable coroutine code, present modal dialogues if necessary to the programmer/DSL-user (e.g. possible rename detected), then generate or update the schema and migration code accordingly? I'm just trying to think of a way to make this process as human-friendly as possible -- this is in keeping with the purpose of DSLs in my opinion, which are also likely to be suspend routines. (E.g. "move the turtle" DSL, or game DSLs, etc.)

EDIT: example DSL to make this concrete:

// Example game script DSL: NPC hauling items for all eternity
suspend fun behavior(npc: Npc) {
    while (true) {
        val item = npc.pickUpNearestItem()
        npc.walkTo(0, 0, 100) // << suspend function, item needs serializability here
        npc.dropItem(item)
        npc.walkTo(0, 0, 0)
    }
}

In this case, if a designer were to modify this script, I don't want them to even have to think about persistence if I can help it. I'd accept a ridiculous git history of ten-thousand schema transitions, so long as it's correct and designers don't have to squint their eyes too hard.

faucct · 2023-01-17T06:15:19Z

What does it mean to serialize a variable that Kotlin wouldn't turn into a field by default? Isn't this strictly a variable that has no impact after the suspension point? Hence, what is the purpose?

The first example that comes into mind are function interface parameters, which aren't being used by the implementation at this moment, but the code could be updated to use them later. They also could be only used before the first suspension point, so there would be no need in generating continuation fields for them, though the code using them after the first suspension point may appear later. I believe that at the moment the Kotlin compiler optimizes such unused variables and does not save them into continuation fields.

Conversely, what would it mean to not serialize a variable that Kotlin converts to a field by default? Isn't this strictly necessary?

If a network socket is being opened which has suspendable usage between two near persistence points, then the compiler will generate a continuation field for it, though there could be no use for it in persisted state, because it never escapes out of those two persistence points:

suspend fn request() {
  @PersistedField("idempotencyKey") val idempotencyKey = UUID.randomUUID().toString()
  @PersistencePoint("requesting") val _ = persist()
  @PersistedField("response") val response = Socket("tcp://...").use { socket -> // there is no need to persist this variable
    socket.write(...) // these are suspension points, though not persistence points
    socket.read(...) //
  }
  @PersistencePoint("requested") val _ = persist()
  ...
}

I'm not familiar with most of the following terminology however, but it sounds like things that may be important use cases for coroutine serialization, rather than features we'd want Kotlin to implement specifically? As long as we can get serialization into a custom format out of suspended coroutines, I think that would suit most purposes.

Yes, that is what I have had in mind.

This is way out of left field, but could the IDE automatically detect changes to @serializable coroutine code, present modal dialogues if necessary to the programmer/DSL-user (e.g. possible rename detected), then generate or update the schema and migration code accordingly?

Though extracting schema from suspendable code into a separate source file is probably possible, I believe that making those two files matchable will be far more cryptic than having a single source file with suspendable code annotated.

philipguin · 2023-01-17T21:21:09Z

The first example that comes into mind are function interface parameters, which aren't being used by the implementation at this moment, but the code could be updated to use them later

That makes sense, persisting fields not currently used but possibly necessary in the future would indeed be valuable.

If a network socket is being opened which has suspendable usage between two near persistence points, then the compiler will generate a continuation field for it, though there could be no use for it in persisted state, because it never escapes out of those two persistence points:

I see - in my mind, every suspension in the coroutine would be persistable, but I suppose this isn't strictly necessary. I suppose non-annotated suspension points would be "unsafe points," rolling back to the last persisted if interrupted, effectively? Outside my domain of experience, so hard to comment, sorry.

Though extracting schema from suspendable code into a separate source file is probably possible, I believe that making those two files matchable will be far more cryptic than having a single source file with suspendable code annotated.

How would annotations cover the case of a field being renamed? What about multiple renames over time? How would data under the old schema be loaded into the new? If the programmer does not update annotations to account for a rename, what should happen?

By "cryptic" did you mean "difficult to implement?" Because if you meant "difficult to understand," I don't think a generated schema and series of migrations (a la Ruby on Rail) would be hard to read - I think it would be clear when looked at, which would basically only be necessary upon commit or in advanced usage scenarios.

It did occur to me that some migration code may be impossible to generate - when a new field is added, for instance, code to create it would have to be hand-written, barring a modal-dialogue-specified default value. And in the case a meaningful value cannot be generated via migration, the coroutine code itself might have to interpret a "does not exist" value and modify its own behavior to handle it - this would violate the principle of coroutines not caring about generated schema/migrations underneath. There may not be an ideal solution to this.

faucct · 2023-01-18T07:22:08Z

How would annotations cover the case of a field being renamed? What about multiple renames over time? How would data under the old schema be loaded into the new?

If there would no data persisted under the current field, then the deserializer would try to get it from an alias.

val npc;

Renaming npc to npc1:

@PersistedAliases("npc") val npc1;

Renaming to joshNpc:

@PersistedAliases("npc", "npc1") val joshNpc;

If the programmer does not update annotations to account for a rename, what should happen?

I guess, then a runtime deserialization error would happen – "joshNpc" field not found amongst available fields: npc.

By "cryptic" did you mean "difficult to implement?" Because if you meant "difficult to understand," I don't think a generated schema and series of migrations (a la Ruby on Rail) would be hard to read - I think it would be clear when looked at, which would basically only be necessary upon commit or in advanced usage scenarios.

Ruby on Rails migrations only work with explicitly named tables and fields, while the serialization of unannotated coroutines would have to work not only with variables, whose names could be inferred with reflection, but also with unnamed suspension-points. @elizarov has suggested to give those sequential labels, as Kotlin compiler does, so the migrations would have to be defined in terms of suspension points 0, 1, 2, ... I believe that those would be harder to understand than explicit names.

pdvrieze · 2023-01-19T09:13:19Z

@faucct The way I would see it to work would be (using kotlinx.serialization) that the hidden object holding the function stack/the continuation would be serializable. Then all values would be required to be serializable (and could be annotated). If you want to specify the type for serialization that would be something that maybe would be done as an annotation on the suspend declaration (in line with the context feature), and then local variables not existing, except as provided by that context. (It would need some special rules to deal with setting "val" properties)

faucct · 2023-01-21T13:47:44Z

I have hacked together a kryo-based schemaless persistence solution. The persist-calls crash the coroutine, but saving it to a file before that, so the subsequent runs would be able to resume from the same point.
https://github.com/faucct/wrapper-persisting-coroutines

pdvrieze · 2023-01-25T11:59:29Z

@faucct I did at some point have the similar (without crashes) - focused on the Android context. However it is quite tricky and generally (esp. in the Android context) require some fixups as well. For example objects should not be deserialized, android Context objects should be injected (and not serialized/deserialized). And of course it is extremely dependent upon the specifics of the library. It can "work" but is fragile. In addition, the values captured in the closure are not explicit. This makes for the code using such serialized coroutines to be quite error-prone.

faucct · 2023-01-29T08:32:03Z

I have tried to think about the serialization in terms of kotlin.serialization API to see that if there is enough of it.
It looks like not only the suspended continuations should be Serializable, which means having a Serializer. They also should be able to somehow alter the serialization context for the child continuations. This way the parent continuation serializer could be able to help resolving the concrete subclass of child continuation.
Probably their serializer should be provided by the intercepted continuation, which means the chain of ContinuationImpl#intercepted references should be inverted, so the deserialization routine would look like this:

fun deserialize(decoder: Decoder): T {
  val continuation = ... // decoder stores the `ContinuationImpl#intercepted`
  if (decoder.hasNext()) {
    return continuation.childSerializer().deserialize(decoder)
  } else {
    return continuation
  }
}

awto · 2023-04-11T21:30:41Z

@faucct

I have stumbled upon this ticket while dreaming about a programming language with a support for durable workflows.

I have implemented workflows using my JVM instrumenting-based continuations implementation serializable continuations and a JS transpiler. Durability/scalability is offered by Kafka Streams, or, in fact, any stateful stream transformer. However, I'd prefer Kotlin Coroutines if they are serializable.

@pdvrieze

While I agree that current languages have poor support for workflows, I would suggest this is partially due to the challenge. A workflow management system needs to deal with the management of the workflow instances in flight. They require a high amount of robustness, may have a state associated with the instance, and most of all may fail in unclear ways that may need some sort of manual handling.

Do you mean the workflow versioning here? The versioning indeed must be part of the code, but the workflow-as-code is much easier than using declarative workflow definitions, even for versioning.

pdvrieze · 2023-04-12T13:12:59Z

@awto Part of the problem of workflows is versioning. It is required. I'd say that workflows would probably be better declaratively (although this can be done in code - with the actual activity implementation being directly in code - BPEL is just bad in usability and too tight on an imperative execution model). What I was more concerned about is how you deal with workflows that have "problems" in the middle of their (sometimes weeks/months long) execution.

Part of dealing with errors is to have robust specifications to start with. You want to be explicit in the inputs and outputs of steps. As such coroutines that do this automatically/transparently may not be the best approach (I haven't tried it, so this is not a dismissal, just an "I don't know").

awto · 2023-04-12T14:05:31Z

@pdvrieze

Part of the problem of workflows is versioning. It is required.

We can be very flexible with the versioning and code. As it's Kafka we can have the whole history. We can travel back to some point where it's easy to update. Or wait for some point in the future. We can simulate already recorded inputs. Or we can throw an exception signalling the update, and it's the workflow which catches it and perform the necessary state handling. For this, the state doesn't need to be exposed at all. Though it can be (for example, using local variable annotations described in this thread above). But I wouldn't do this.

I'd say that workflows would probably be better declaratively (although this can be done in code - with the actual activity implementation being directly in code - BPEL is just bad in usability and too tight on an imperative execution model).

It depends on how we define the word declarative. The workflow languages I know don't look declarative at all. They are usually some visual diagrams or some lists in yaml, or something like this - but they are still instructions about how we do things, not what we do. So they are just yet another programming language, but heavily restricted though.

What I was more concerned about is how you deal with workflows that have "problems" in the middle of their (sometimes weeks/months long) execution.

These "problems" are just exceptions. The workflow developer is responsible to handle them gracefully in both approaches. An advantage of the continuations+Kafka approach is we have the log of the weeks/months etc. We can time travel to any position. No need for any spec or vendors to support this. Everything needed is already available.

Part of dealing with errors is to have robust specifications to start with. You want to be explicit in the inputs and outputs of steps. As such coroutines that do this automatically/transparently may not be the best approach (I haven't tried it, so this is not a dismissal, just an "I don't know").

I'm not sure I understand this. The spec is Kafka topics schema. It's only the input/output spec and the internal state isn't exposed, it can be exposed but it shouldn't be. Everything the outside needs to know must be posted into some dedicated Kafka topic. And everything it needs to know from the outside, the workflow must read from some dedicated Kafka topic. Note, the internal state is still available for debugging (with any JVM debugger), and since we can travel to the past it's available at any point in time. The steps are just the code - it looks explicit enough. We can use any software development practices to make the code better, or we can write something ugly. But it's the same for BPEL, except BPEL restricts using the practices.

elizarov added the enhancement label Nov 6, 2017

elizarov mentioned this issue Apr 3, 2018

Serialization of closures and continuations. Kotlin/kotlinx.serialization#44

Closed

qwwdfsad self-assigned this Apr 28, 2018

elizarov added the for 1.0 release label Sep 7, 2018

elizarov removed the for 1.0 release label Oct 16, 2018

ccjernigan mentioned this issue Jun 18, 2022

Coroutines Exceptions Violate Serializable Interface #3328

Closed

kyay10 mentioned this issue Jan 14, 2025

Serialization? kyay10/KonTinuity#9

Open

Serializability of coroutine classes #76

Serializability of coroutine classes #76

Comments

elizarov commented Jun 20, 2017 • edited Loading

fvasco commented Jun 20, 2017

elizarov commented Jun 20, 2017

elizarov commented Jun 20, 2017

Restioson commented Jun 21, 2017 • edited Loading

wdroste commented Jul 19, 2017

elizarov commented Jul 20, 2017

wdroste commented Jul 20, 2017

Restioson commented Jul 20, 2017 • edited Loading

wdroste commented Jul 20, 2017

pdvrieze commented Nov 20, 2017

pdvrieze commented Nov 27, 2017 • edited Loading

wdroste commented Jan 13, 2018 • edited Loading

unicomp21 commented Sep 14, 2018

pdvrieze commented Sep 20, 2018 • edited Loading

mrussek commented Apr 14, 2019

elizarov commented Apr 20, 2019

joost-de-vries commented May 2, 2019 • edited Loading

elizarov commented May 23, 2019

joost-de-vries commented May 24, 2019

elizarov commented May 24, 2019 • edited Loading

pdvrieze commented Jun 7, 2019

mrussek commented Jun 13, 2019

pdvrieze commented Jun 17, 2019

Miha-x64 commented Apr 9, 2020

philipguin commented Apr 24, 2020

pdvrieze commented Nov 15, 2021

holgerbrandl commented Nov 16, 2021

holgerbrandl commented Feb 1, 2022

pdvrieze commented Feb 1, 2022

holgerbrandl commented Feb 1, 2022

elizarov commented Feb 2, 2022 • edited Loading

elizarov commented Feb 2, 2022

holgerbrandl commented Feb 2, 2022

pdvrieze commented Feb 3, 2022

holgerbrandl commented Feb 7, 2022

faucct commented Jan 15, 2023 • edited Loading

philipguin commented Jan 15, 2023

faucct commented Jan 16, 2023

dkhalanskyjb commented Jan 16, 2023

faucct commented Jan 16, 2023 • edited Loading

pdvrieze commented Jan 16, 2023

elizarov commented Jan 16, 2023

philipguin commented Jan 16, 2023 • edited Loading

faucct commented Jan 17, 2023

philipguin commented Jan 17, 2023

faucct commented Jan 18, 2023

pdvrieze commented Jan 19, 2023

faucct commented Jan 21, 2023 • edited Loading

pdvrieze commented Jan 25, 2023

faucct commented Jan 29, 2023

awto commented Apr 11, 2023

pdvrieze commented Apr 12, 2023

awto commented Apr 12, 2023

elizarov commented Jun 20, 2017 •

edited

Loading

Restioson commented Jun 21, 2017 •

edited

Loading

Restioson commented Jul 20, 2017 •

edited

Loading

pdvrieze commented Nov 27, 2017 •

edited

Loading

wdroste commented Jan 13, 2018 •

edited

Loading

pdvrieze commented Sep 20, 2018 •

edited

Loading

joost-de-vries commented May 2, 2019 •

edited

Loading

elizarov commented May 24, 2019 •

edited

Loading

elizarov commented Feb 2, 2022 •

edited

Loading

faucct commented Jan 15, 2023 •

edited

Loading

faucct commented Jan 16, 2023 •

edited

Loading

philipguin commented Jan 16, 2023 •

edited

Loading

faucct commented Jan 21, 2023 •

edited

Loading