Proposal: Custom RTTs, Type-Associated Data, and WasmGC JS Interop #1552
This looks amazing! You can count on the Scala.js toolchain for early experiments and adoption. We've been needing the custom prototypes to complete the correctness of our semantics wrt. the JS backend. Avoiding an additional field for vtables is a nice bonus.
Great to see this! From a first read it seems to be along the lines of the ideas we discussed in the past. All makes sense, as far as I can tell. Paging back in my thoughts from back then, though, I would like to suggest a structural simplification that would make the design both more minimal and more expressive at the same time.

As currently described in the OP, RTTs duplicate a lot of the machinery for structs (syntax, instructions, type constructors, subtyping rules). And I highly suspect that the endgame would be that they end up duplicating almost all of it, e.g., mutability, casts, etc. And in terms of implementation, there really isn't much difference anyway. The natural conclusion then is to simply make RTTs into structs. Instead of declaring a list of RTT fields on a struct, we allow declaring one type index that denotes the type of the descriptor:

```wat
(type $t (struct (descriptor $rtt) (field ...)))
```

Inversely, to make this sound, the RTT itself must also declare which type it is identifying:

```wat
(type $rtt (struct (describes $t) (field ...)))
```

Only types with a `describes` clause can be used as descriptors. As you note, there necessarily is a mutual recursion here. Yet this design has several advantages:
There would be only a single new instruction necessary in this setup, which extracts the RTT from a struct:

```
struct.get_desc : (ref null $t) → (ref $rtt)
  where $t ≈ struct (descriptor $rtt) t*
```

In particular, all allocation instructions remain as they are. Subtyping would extend to descriptor/describes clauses as follows:
Here, a "describes-oblivious" subtype of a composite type is one that is a structural subtype up to its describes clause. [Edit: The "describes-oblivious" notion admittedly is ugly. A cleaner alternative might be to keep a version of the `rtt` type constructor.] In fact, since the only purpose of this extra type constructor is preventing subtyping, a cleaner and more forward-looking solution would be to introduce that as a generic feature. That is, have an optional `final` attribute on reference types:

```
struct.new : (ref null final $rtt)? t* → (ref final $s)
  where $s ≈ struct (descriptor $rtt)? (field t*)
```

All allocation instructions would return final references. This essentially is the same as the notion of exact types.

Wrt deduplicating field lists: why would we need a new form of declaration for field lists? Can't we simply have a form of `include` clause?

```wat
(rec
  (type $foo (struct (field (ref $foo) (ref $bar))))
  (type $bar (struct (include $foo) (field (ref $baz))))
  (type $baz (struct (include $bar) (field i32)))
)
```

The same constraints as you mention apply, i.e., an include clause can only refer to lexically preceding struct types.
Thanks, @rossberg. Reusing structs largely makes sense to me, but I'll have to think more about the consequences for the subtyping rules and soundness. I think you're right that adding a `final` attribute could subsume the extra type constructor. Adding an `include` form for deduplicating field lists also seems reasonable.
Hi, some initial thoughts: I really like this, and agree with @rossberg's suggestions of reducing duplication with structs; IMO first-class RTT instances are just like regular struct instances, and field accesses are basically the same machine operation. That also improves reuse if we have parametric polymorphism over GC types in the future.

For this we could add a new kind of identity-RTT cast that takes an RTT instance and compares for equality. Since equality of RTT instances implies a static type match, it would obviate the need for a second, redundant cast.

@rossberg I didn't quite understand the second point here:
I think we could allow this at the cost of a dynamic check at allocations.

Edit: Further, I think the above works if we actually require a subtype of a type with a description to declare a new description. This ensures that the hierarchy of description types (RTTs) is parallel to the hierarchy of struct types.
@jakobkummerow, from the engine perspective, what do you think about the "RTTs are just structs" idea? They would still be statically differentiated from normal structs by the presence of a `describes` clause in their definitions.
I can see that in terms of spec simplicity it's compelling. In implementations, they very likely won't be the same. The key point is that each GC'ed object already has a "shape descriptor" as its first field, and that's going to remain the case no matter what. For extending that basic object layout, we'll have some design choices to make. Off the top of my head I can think of three options; I don't know yet which one we might end up picking, or if there might be even more possible designs:
In terms of spec (and binary encoding), maybe so; in terms of implementations, there would likely be very little to reuse, see above.
Agreed. AFAICS that would effectively force us into option (1) above, which would be a rather unfortunate restriction.
At this time, I don't see a reason why we couldn't. The only way to be sure is to try it, though.

(nit: I find "describes" a misleading name, I'd say it's more of an "enriches" or "decorates" or "attaches-to". That's bikeshedding though. Ultimately only the binary encoding matters.)
Great to see this idea coming back! Using structs as other structs' rtts is an intriguing new wrinkle. For the "JS Prototypes" part of the design, could an alternative to custom sections be to allow configuring the rtt via the JS API of the rtt value (perhaps via post-creation initialization, so that wasm could still be the one to create the rtt using the type index space)? This would avoid giving runtime semantics to custom sections.
@jakobkummerow, thanks, implementation strategy 3b sounds fine to start, and hopefully we could move to implementation strategy 2 in the long run. In the original idea where RTTs are entirely separate from structs, I would have expected implementation strategy 3a to start and still hopefully implementation strategy 2 eventually.

@lukewagner, yes, it should be feasible to configure prototypes via an API instead of a custom section. That being said, our concern around programmatic configuration in JS is that it would have a significant cost at startup time when there are very many prototypes to configure, just like we saw with initial experiments with imported string constants. The proposed custom section design would let the engine implement fast paths to avoid this cost, and we would also design it to be fully specified in the JS embedding spec without changing or violating core Wasm's abstractions in any way.
Mainly because it's lifting the same concept to a different level, and because the term perhaps is more suggestive to many people; my impression was that folks found exact types a bit weird, presumably because they couldn't relate them to anything they knew. (I've never been a big fan of the keyword `exact`.)
In principle, you are right, we could add them later. Personally, I would dislike the possibility of a runtime check on such a basic operation if we can avoid it. Conceptually, final/exact ref types are rather straightforward and only a refinement to subtyping that shouldn't affect much of anything else.
In most of the implementation approaches you describe, accessing the fields of a custom RTT only seems to differ from a regular struct by offset, as if they had an additional field (which they do, and that would be immediately visible from their declaration as a struct). Is there a deeper difference than that? (Note that the actual getting-to-the-RTT-struct is encapsulated in the `struct.get_desc` instruction.)

The only additional complexity appears to occur with approach (2), where the offset could differ per type. Would that offset still be statically known?

FWIW, for a clean-slate Wasm engine I could envision yet another strategy, where the shape hangs off the custom RTT struct when both are present. So the shape is an extra indirection away, not the custom fields. That could potentially make sense for a pure Wasm engine, if we assume that casts are less frequent than accesses to meta fields in such cases.
Well, I'm not married to the keywords, but it is the exact dual role to the `descriptor` clause.
Yeah. Which means field offset computation isn't the same for RTT structs and regular structs. And subtyping will have to distinguish them. And, depending on spec design and implementation choices, so might other parts of the system.

And FWIW, regarding "field accesses only differ by their offset": that still holds for approach (2), the offset delta will just be bigger. The main complexity of approach (2) I can see so far is the fact that shape descriptors will become trickier to handle due to being variable-size.

That's an example of a more general principle in software design: having a single object serving two purposes (being a shape descriptor and being a Wasm struct) is generally more complex than having two objects focusing on one purpose each.

Overall, from an implementation perspective, I expect there to be no significant complexity difference between "let's just use structs" and "let's define customizable RTTs as a new kind of object". In cases where code reuse is possible (with appropriate parameterization), it'll be possible either way. And either way there'll be a need to distinguish RTT structs and regular structs. If anything, I'm slightly worried that "let's just use structs" might lead to cases that are difficult to implement and useless in practice, and only exist because they make sense for non-RTT structs. I don't have a specific example in mind; but the earlier observation that "let's just use structs" rules out implementation approach (3a) is illustrative of the general concern: while that case is no big deal because there's a viable alternative (3b), it demonstrates that "let's just use structs" doesn't exactly make engines' lives simpler; it can just as easily create complications and constraints. Anyway, all I'm saying is: use structs as RTTs in the spec if you like, just don't argue that this choice is motivated by engine simplicity.
I don't know; if I implemented a new engine from scratch I'd probably still want all GC'ed objects to have their shape descriptor in the same place... but that's a very hypothetical scenario for me, so I won't spend more time thinking about it.
Of course. I commented only on one half of the pair for the sake of brevity.
@tlively That's a reasonable concern. At some point, it'd be useful to see a realistic example of how large things get, to see what we're trying to optimize and whether it perhaps ends up looking different from JS string built-ins. Given that JS parsers have been hyper-optimized for bulk parsing (arrays of) object literals (that could serve as prototypes), hypothetically JS might have a perf advantage if rtts could be initialized in bulk, but it'll depend on the details.
@lukewagner We are assuming that configurable JS-side prototypes for Wasm structs will mostly contain methods backed by exported Wasm functions, so the hypothetical setup code would be many repetitions of:

```js
struct1_prototype.do_something = function(arg) {
  return wasm_instance.exports.struct1_do_something(this, arg);
};
```

or perhaps:

```js
Object.defineProperty(struct1_prototype, "foo", {
  get: function() { return wasm_instance.exports.struct1_get_foo(this); },
  set: function(value) { wasm_instance.exports.struct1_set_foo(this, value); },
});
```

so I don't think plain JS object literal parsing speed will be very helpful for that. Rather, since the things being installed on the prototypes are coming from the Wasm instance, bulk initialization (or, perhaps even better, lazy initialization) is enabled by having some declarative description of the desired prototypes somewhere in the Wasm module. A new (non-custom) section type in core Wasm would do the trick; using a "custom" section on the core Wasm level is a form of a more layered design that would allow specifying JS-related things in the JS API spec. Maybe there's a third option?

The scale we need to target is a good question. I don't think any hard data on that exists just yet, given the early state of this proposal and partner teams' limited time so far to seriously evaluate it. They can count how many types and/or functions they have in their app, but that doesn't necessarily indicate how many of those will need rich JS exposure (just a small "API" set? Or almost all, for generality?). I do expect that larger deployments may well annotate thousands, perhaps tens of thousands of types, with easily a dozen methods each. So having to parse and compile or interpret, say, 100K repetitions of the patterns sketched above would add up to quite some startup overhead.
Thanks for writing this up, this looks reasonable to me. A couple of quick thoughts:
@eqrion, to clarify, even as an extension of structs, custom RTTs would remain statically separated by type (per the presence of a `describes` clause).

Good point about atomic access! Can you elaborate on point 2? I don't quite understand what case you are referring to. AFAICS, canonical RTTs cannot have user-defined fields, or depend on RTTs with user-defined fields, by construction.
@jakobkummerow Thanks! That helps paint a clearer picture. Given the formulaic nature of the initialization logic, it still seems like there could be some sort of bulk-prototype-initialization JS API tailored to this purpose. A declarative alternative that is sort of symmetric to JS String Builtins would be to have JS API instantiation do extra prototype initialization of exports with a certain name prefix. I suppose there's a wide design space here, so my main suggestion in the short term is just that we not take the custom section approach for granted. But also agreed with @eqrion that before any of that, it's probably best to start with JS glue code that we can measure.
@eqrion @jakobkummerow Another implementation consideration is a space/time tradeoff for RTT instances in how to store the Cohen display (canonical supertype array). Previously, with only canonical RTTs, storing the supertype array inline in the RTT object could save an indirection, making a cast just two loads and a compare (load RTT, load supertype-at-depth D, compare). With customizable RTTs, it's possible that programs instantiate them many times, so to save space, it makes sense to move the array out of line, making casts three loads and a compare. I'm not sure if Web engines have been storing that array inline, but it would necessitate some refactoring.

It basically boils down to this: struct instances are essentially a record with a tag and their fields, while RTT instances are a record with their own tag, a description for structs that refer to them, and then their fields. The "description for structs that refer to them" is heavy if it's an inline supertype array. They are inherently heavier-weight than regular structs.
Sure, but that probably means we need an instruction to cast between a non-final type and a final type, because programs will likely want to use subtyping with RTTs in most places but would be required to supply final types for allocations. So it's moving a runtime check elsewhere and also introducing type-system complexity. It's maybe too early to tell, but my preference would be to keep things simpler, especially since the cost of an allocation likely dwarfs a dynamic check for the exactness of an RTT's descriptor.
It's possible of course, but is it likely? I would expect most (if not all) toolchains to generate exactly one instance of each custom RTT. If we take Java as an example, one does not need several instances of the RTT for any given class. AFAICT, toolchains would prefer to keep the performant Cohen encoding, rather than hypothetically duplicating a few fields.
The non-final version of a final type is its direct supertype. Hence, if you need both variants, passing the final one is always sufficient, as it subsumes the other. A final type, by its very nature, has no other final sub- or supertypes, so there are no interesting casts between two final types. It would already be paradoxical to have a cast involving a final source, because its whole point is to express that this already is the most precise type possible for the referenced value. The only cast involving final that makes any sense would be from a non-final down to a final target type. That would already be covered (conceptually, and if we chose to allow it) by the existing cast operators, as they cast between two reftypes. And the implementation would be straightforward: instead of going through the supertype vector, all it needs to perform is a direct pointer comparison against the value's type representation, so it's much simpler than a regular cast; it's the same check that the dynamic version would need to implement in every allocation instruction. That said, I cannot even think of a scenario where I'd need or want such a cast.
Sure, subsumption works here, no need for an explicit cast.
Sure, but in any case we do allow other impossible casts now, so maybe that's not important. Depending on how we choose to encode
Agreed that it's the same check as would be needed for the allocation, but it's not just a pointer comparison, because RTTs will now have instances, unless you mean loading a pointer from the RTT instance to compare. But even then, it's different from other casts that compare a tag at a known depth in the supertype vector; it'd be either that same check-at-depth plus a supertype-vector length check, or another, dedicated descriptor field. In the binary encoding of casts, which is already different from how reftypes are encoded as value types, we'd probably need to steal one bit from the flags to represent whether the target type is final.
Well the point of letting RTTs have a subtyping relation is exactly so they can be passed around like other objects, so it's reasonable to expect that programs would use the non-final type liberally, because the final type of course doesn't allow subtypes. Languages that have a meta-object protocol will make use of that feature, e.g. likely Scala. So they'll need a cast to go from the happy-go-lucky world of passing around meta-objects to a final type to allocate. (As an aside, I'm designing Virgil's meta-object protocol in the background for exactly the same reason, but one layer down: to implement Wasm in Wizard using the Virgil object model and combining the Wasm metadata with the Virgil metadata.) And so far it makes sense to me that the meta-object subtyping hierarchy mirrors the object hierarchy, so they'd naturally be non-final in generated Wasm-GC code.
Scala does have a meta-object protocol, but the RTTs themselves would never be first-class. The only place where I imagine it could come up would be array types of higher dimensions.

(This might change if we got read-only field types, because then we would probably have different struct types for different reference array types, as I outlined in WebAssembly/gc#558 (comment).)
Unrelated, but I would caution against the introduction of final types anyway. The problem with final types is that once they're in, you can never invent new subtypes of existing types. A simple example would be intersection types: in a future Wasm version, an intersection of existing types would naturally be a subtype of each of them, which finality on those types would rule out.
Yes, that's what I meant. (Yeah, when we discussed this idea some years back, we concluded that it is confusing to call these custom things RTTs, and something like "descriptor" would be more appropriate. They merely carry a proper RTT.) I don't follow why anything more than a pointer comparison on the proper RTT would be needed, but given that it has to happen one place or the other, that's probably getting OT.
Sure, subtype polymorphism is useful for consumers of objects and their descriptors alike. But any code site that produces and initialises an object naturally has to fix both its exact Wasm type and its descriptor's statically. So I'm on the same page as @sjrd and not sure where that kind of downcast would come in. For languages with primitives for truly generically creating an object via a meta-object protocol, I'd expect that they will — for many reasons — require a generic object representation, where the Wasm-level type is invariant across source types. But then, all their descriptors have the same Wasm-level type as well, and no Wasm-level casting between them is needed either. But I may well be missing some less obvious scenarios. That said, I don't mind allowing such casts.
Ah, interesting point, but I don't believe it is true as stated. New forms of subtyping are not incompatible with final types per se. What matters concretely are the introduction forms for final types, and whether they could change to produce values of subtypes in the future. In our case, the only introduction forms are allocation expressions, and they explicitly spell out the very type of the value they produce. By nature, those won't be able to produce more precise types in the future. Or if they did, those would be very different instructions. So while this can indeed be a problem in a more implicitly typed language with unannotated intro forms, I don't think it can become one in our setting. Or am I missing something? (Edit: This is assuming that we stick to the current principle of always requiring type annotations where otherwise lubs or glbs would be required. I have no reason to believe that will change, as it would certainly produce other problems.)
So IIUC, it's safe to introduce final/exact types iff we never further refine the types produced by allocation instructions. On the one hand, we're discussing refining the allocation types now by introducing final/exact types, but on the other hand, it's unclear why we would need to refine them again in the future, given the type annotations @rossberg mentioned. Personally, I would love to have final/exact types to help propagate more precise type information for optimization purposes in Binaryen, but that use case is entirely orthogonal to custom RTTs.
I recently learned that the stack-switching proposal is encountering a vaguely similar sub-problem (re-using existing type definitions for a new purpose), and currently appears to be leaning towards a different solution. I think it would probably be better if both proposals eventually settled for similarly-looking solutions; I don't care much which solution that is.

As a very quick summary: stack-switching introduces "continuations", which are a bit like funcrefs (they have a signature, they can be "called"/resumed with the right parameters, and will eventually return results of a certain type), but need to be kept separate from funcrefs as they aren't interchangeable: they're produced and consumed by a set of instructions that doesn't overlap with the set of instructions that deal with funcrefs. This reuse of function signature definitions for a strictly separate purpose reminds me of this discussion's idea to reuse struct type definitions for a new purpose, while keeping this new use case statically separate from non-RTT structs.

I'm aware of three different approaches to this:

(1) "Set a flag". That's what's been suggested above: a clause in the type definition itself marks it as serving the new purpose.

(2) "Wrapper". That's what the stack-switching proposal currently sketches out: a new kind of type definition that wraps an existing one (a continuation type wrapping a function type).

(3) "Separate kind of reference". That's what early versions of the GC proposal assumed, and what still somewhat echoes in the thread-starting post above: instead of having a single kind of reference to a type, there are distinct reference constructors for the two purposes.

I believe these three approaches are by and large interchangeable in the sense that they're each successfully solving the same problem of reusing existing type definitions for a new, statically-distinguishable purpose. Of course they come with different pros and cons in their details; in particular they keep different parts of the system simpler at the cost of making other parts more complex.
I don't think I have much of a preference at this time (in part because, within certain limits, implementations can pick their own favorite internal model and map the decoded wire bytes onto that); but I do think it would be weird if the custom-rtts proposal that's in the making here went for approach (1) because there's no precedent for wrapping types, whereas the stack-switching proposal went for approach (2) because there's no precedent for setting flags on types, and then once finalized they'll both be the odd one out that doesn't do things the way other proposals do them.
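For concreteness, the three approaches might be spelled roughly like this; all of the syntax below is hypothetical, intended only to contrast the three shapes:

```wat
;; (1) "Set a flag": the struct definition itself carries a clause.
(type $desc1 (struct (describes $t) (field i32)))

;; (2) "Wrapper": a new type constructor over an existing definition,
;;     analogous to stack-switching's (cont $ft) over a function type.
(type $fields (struct (field i32)))
(type $desc2 (descriptor $fields))

;; (3) "Separate kind of reference": a plain struct definition, but a
;;     distinct reference constructor for the new purpose,
;;     e.g. (ref (rtt $t)) vs (ref $t).
(type $desc3 (struct (field i32)))
```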
Reflecting on @sjrd's comments, I think it's probably true that the actual metaobject will need to be separate from descriptors anyway, because in the current design a descriptor can't be its own descriptor, whereas metaobjects eventually end up in a cul-de-sac, like the metaclass cycle in other languages.

But I think you still end up with the final/subtyping problem, because those metaobjects will have a field which is a reference to the descriptor. If we want to have one descriptor field in the metaobject, and have it read-only and co-variant, it can only be so if the type is non-final.
Thanks, that's a really good use case!
Ah, I suppose. I was mostly thinking about the discussed extension where a custom RTT could have a nested custom RTT. Although that's something we could syntactically limit.
This was responding to some of the proposed extensions from the original post.
So not something that's part of the core proposal here, just trying to point out a potential issue with a maximal design.
Yeah, it's a good question. We also have an optimization where we do an early comparison of the RTT of the cast value against the canonical RTT, for the common case where it's an exact match. In that case we can skip loading the supertype vector. My understanding is that this would also no longer be valid for RTTs that can be instantiated multiple times. Although I haven't measured how much that optimization is buying us in a while, so maybe it's not a big deal.
I thought of a reason we may want something besides structs to provide the RTT field lists: to optimize the representation of the repeated field sequences, a producer would have to rewrite existing type definitions, and that transformation is not always safe. In contrast, if we used a new kind of declaration for shared field lists, the factored representation could be expressed directly.
@jakobkummerow: The difference is that continuations have a completely disjoint set of instructions from functions. In contrast, RTTs/descriptors require pretty much the exact same set of instructions that ordinary structs do, so a unification is natural. (There are other technical reasons for the wrapping design of continuation types, which I explain here; these likewise do not apply to descriptors.)

@tlively: That's an interesting point. In what sense is it not "safe" to do this transformation? I see that it would have to be a whole-program transformation, but I was under the impression that Binaryen is doing that anyway? Do you know how often this particular case occurs? Introducing a whole new declaration sort and namespace to the language is not a small feat. Having them nestable into other definitions would add yet more complexity.

FWIW, if we already had generic type definitions, then the factorisation could be expressed with those.
No, I don't know how common this situation is, so it will require some investigation. I suspect it's not common, though, so planning to do the simpler thing for now makes sense to me. Even with closed-world optimization, Binaryen still has to be careful not to change the identities of "public" types that appear in the module interface, otherwise the optimized module may not be able to link with its environment. We're also moving toward supporting open-world optimization better due to demand from Dart and Kotlin.
Even for cases where parsing lots of JS prototypes is a concern, presumably languages could just import a minimal reflection library for JS and do the calls inside Wasm to create methods, e.g. something like:

```wat
;; define $counterPrototype like in the OP
(...)

;; function toMethod(wasmFunc) { return function(...args) { return wasmFunc(this, ...args) } }
(import "js-reflect" "toMethod"
  (func $toMethod (param funcref) (result externref)))

;; function definePropertyValue(target, name, value) { return Reflect.defineProperty(target, name, { value, writable: true, configurable: true }) }
(import "js-reflect" "definePropertyValue"
  (func $definePropertyValue
    (param $target externref) (param $name externref) (param $value externref)))

;; Connect prototype object to rtt

;; Some wasm func
(func $counter.get (param (ref $counterType)) (result i32))

;; Install methods as needed on any prototypes needed
(func $start
  (call $definePropertyValue
    (global.get $counterPrototype)
    (... js stringref for "get" ...)
    (call $toMethod (ref.func $counter.get))
  )
)
```

I have seen elsewhere that in V8 currently calls are slower wasm->js than js->wasm, so I do wonder if there could (or should) be a builtin module for reflection containing the necessary functions to interact with the JS object model from Wasm (i.e. the `Reflect` API).
Thinking more about exact/final reference types, I realized they would be most useful for optimizations if the finality/exactness was orthogonal to nullability:
In keeping with all of our other instructions accepting nullable operands, the rtt/descriptor operand to `struct.new $t` could then have a nullable exact reference type. Additionally, to keep each heap type hierarchy a lattice, we could special-case the bottom types.
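Under that scheme, exactness and nullability would compose freely, giving four reference types per heap type. This is an illustrative sketch; the keyword `exact` and its position are assumptions, not settled syntax:

```wat
(ref exact $t)        ;; non-null, pointee is exactly $t
(ref null exact $t)   ;; null, or exactly $t
(ref $t)              ;; non-null, $t or any subtype
(ref null $t)         ;; null, $t, or any subtype
```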
Yes, absolutely, that's an important clarification. The final attribute applies to the ht dimension (classifying the pointee), not the null dimension (classifying the pointer). And I think I mentioned above that it means no subtype other than bottom. |
I wasn't sure before whether you meant "bottom" as in none/nofunc/noextern/noexn/nocont or "bottom" as in validation-only-real-bottom. Glad we're on the same page 👍
WebAssembly structs and other heap values are currently associated at runtime with “runtime type” objects, or RTTs, which contain the information necessary to implement structural casts and type checks. In the MVP design of WebAssembly GC, these RTTs are left implicit and do not appear in the syntax of the language. Nevertheless, they are already part of WebAssembly’s semantics, and require an implicit extra reference field in every heap object.
Similarly, many source languages have some form of data associated with source types. Examples include vtables and itables for method dispatch as well as static fields. These features must currently be implemented with an extra reference field in every object representing a source object, just like the RTT reference field managed by the engine. Using just one field to access both the structural type information the engine needs to implement casts and the userspace type-associated information would reduce the in-memory size of every WebAssembly object that needs both. For Google Sheets calcworker, we estimate this to be a roughly 10% reduction in total memory use.
To enable that memory savings, this proposal extends struct definitions to allow arbitrary data to be attached to RTTs, exposes RTT references as first-class values, and introduces new instructions for allocating custom RTTs, allocating structs with custom RTTs, and retrieving data from objects’ RTTs. To avoid code size regressions in the type section, this proposal also includes a new mechanism for deduplicating common sequences of fields in type definitions.
Putting type-associated data on the RTT also lays the foundation for allowing embedders, especially JS, richer access to WebAssembly structs. In JS engines, WebAssembly RTTs are analogous to JS type descriptors that store information about the prototype and field layout of JS objects. With explicit RTTs holding arbitrary data, types can have externref RTT fields annotated as holding a JS prototype that can be used to call methods, including getters and setters, on objects of that type when they flow out to JS. Other embedders don't have JS prototypes specifically, but they can use the same annotation for whatever other host data they wish to use to mediate access to WebAssembly structs.
The rest of this writeup gives a detailed design for custom RTTs to serve as a basis for discussion.
## RTT Definitions
Struct definitions are augmented with `(rtt fieldlist)`. The fields must all be immutable. Struct definitions without RTT definitions are redefined to be shorthands for struct definitions with empty RTTs, i.e. RTTs with no fields.
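For concreteness, here is one way the proposed text format could look, sketched with hypothetical type and field names. A recursion group is needed because the vtable entry's function type and the struct type refer to each other:

```wat
(rec
  ;; The vtable entry's signature refers back to $shape.
  (type $draw-ty (func (param (ref $shape)) (result f64)))
  ;; $shape's RTT carries one immutable, type-associated field;
  ;; $x and $y remain ordinary per-object fields.
  (type $shape (struct
    (rtt (field $draw (ref $draw-ty)))
    (field $x f64)
    (field $y f64))))
```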
In addition to the existing rules about how the normal fields of struct subtypes may differ from those of their supertypes, we need new rules governing how the RTT fields of a struct subtype may differ from those of its supertype. These rules must ensure that access to RTT fields, as described later, is sound. The rules turn out to be the same as for normal struct fields; both width and depth subtyping are allowed, and they can be applied independently to the RTT fields and normal fields because they have different index spaces.
## RTT Types

RTTs form a new heap type hierarchy with a top type `rtt` and a bottom type `nortt`. `rttref` is a shorthand for `(ref null rtt)` and `nullrttref` is a shorthand for `(ref null nortt)`. In between these types are the defined RTT types for particular struct heap types: `(rtt typeidx)`. Defined RTT types are never in subtype relationships with one another, even if their associated struct types are subtypes of one another. Here's a program that would execute unsoundly if RTT subtyping mirrored the associated struct subtyping:
## RTT Instructions

Custom RTTs may be allocated by the `rtt.new` and `rtt.new_default` instructions. These are similar to `struct.new` and `struct.new_default`, but they take values for a type's RTT's fields rather than its own fields.

Structs can be allocated with explicit RTTs using `struct.new_with_rtt` and `struct.new_default_with_rtt`. These instructions trap if the provided RTT reference is null.

The semantics of the existing `struct.new` and `struct.new_default` instructions can be redefined in terms of `struct.new_with_rtt` and `struct.new_default_with_rtt` along with an `rtt.canon` administrative instruction that looks up the canonical RTT for a type. `rtt.canon` only makes sense for empty RTTs, so `struct.new` and `struct.new_default` can only be used to allocate types that have empty RTTs.

A new `rtt.get` instruction provides access to the data stored on a struct's RTT, along with `rtt.get_s` and `rtt.get_u` for packed RTT fields. `struct.get` is not repurposed for retrieving RTT data so that the RTT fields can have their own index space, separate from that of the struct's own fields. This is a requirement for supporting width subtyping on the struct and RTT fields independently.

It would not be sound to allow retrieving a reference to an arbitrary struct's RTT without some way to express or check for an exact type, so instead we offer an `rtt.eq` instruction that compares the identity of a struct's RTT with that of a given RTT. This can be used as a fast path in the userspace implementation of casts. For maximum generality, it can take any struct type and any RTT type.
## Casts and Custom RTTs

WebAssembly uses structural types because the only intended purpose of its type system is to guard against unsafe accesses that are out of bounds or use the wrong value type. As such, WebAssembly casts and type checks are intended to recover information about the structure of a heap object and nothing more. For lack of an efficient alternative, several toolchains currently use WebAssembly casts to implement source-level casts, but this only works in a closed-world scenario where the toolchain can ensure that all source types lower to unique WebAssembly types.
The structure of the RTT associated with a struct type is part of the struct type’s structure, so RTT declarations are included in the structural type canonicalization procedure. On the other hand, the particular RTT value associated with an object does not affect the object’s structure, so the outcome of a WebAssembly cast never depends on RTT identity; two objects of the same WebAssembly type always pass the same set of WebAssembly casts no matter what their RTT values are.
Engines currently depend on type canonicalization producing unique RTTs for each type to implement casts efficiently. Each RTT stores a vector of the canonical RTTs for its supertypes and the relative subtype depths between types can be used to index into that vector to perform casts in constant time. With custom RTTs, the RTT values for each type are no longer unique, but canonical values can still be produced and canonical supertype vectors can still be stored on all RTTs to preserve constant time casts. Cast fast paths that perform identity checks on RTTs will not work for types with customizable RTTs, though.
Whether RTT reference types themselves can be cast is another question that will require discussion. On the one hand, the consistency of being able to `ref.cast` to recover lost type information for any reference type is useful for optimizers; on the other hand, we are already considering making exceptions for, e.g., continuation references to simplify implementations.

## JS Prototypes
The JS embedding specification currently says that Wasm structs that flow into JS have immutable, null prototypes. To allow richer interop with Wasm structs from JS, we can annotate RTT allocations to tell the engine to duplicate a particular immutable field into the prototype slot in the RTT. This will allow JS to call methods, including getters and setters, on objects that use the allocated RTT. The annotation must be on the RTT allocation rather than the type definition because there would be no composable way to resolve conflicting annotations on different definitions of the same type.
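For example, the annotation might be spelled as an immediate on the allocation instruction. This spelling, and the names below, are invented for illustration:

```wat
(import "env" "PointProto" (global $imported-proto externref))

(type $point (struct
  ;; RTT field 0 holds the JS prototype for $point objects.
  (rtt (field $proto externref))
  (field $x f64) (field $y f64)))

;; Hypothetical "(prototype <rtt-field-idx>)" immediate telling the
;; engine to duplicate RTT field 0 into the prototype slot.
(global $point-rtt (ref (rtt $point))
  (rtt.new $point (prototype 0) (global.get $imported-proto)))
```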
## Declarative Population of Prototypes
One problem with the example given above is the large amount of initialization code that has to be run after instantiation to populate the prototypes. It is possible that there is a faster way to manually populate the prototypes than assigning one property at a time, but we still anticipate that this will be a measurable problem for startup performance. To allow the engine to optimize this part of startup, we propose adding a "magic import" mechanism similar to what we have for strings, paired with a custom section that will allow the engine to synthesize and populate the imported prototypes. Users would still be able to perform additional manual configuration of the prototypes after instantiation.
At its most basic, the custom section needs to annotate externref imports with a sequence of method names and associated exports. Note that it is important that the custom section refer to export names rather than e.g. function indices so that it can be specified in the JS embedding document and implemented without breaking the core Wasm abstraction that only exports are visible to the embedder.
The design for the custom section could also be extended to support other features, such as describing methods that take `this` as a parameter, although the precise details can be determined later.

Any features not supported by the declarative custom section can still be implemented in user space by exporting the prototypes and mutating them imperatively after instantiation. For example, if the prototypes need to contain more complex methods than can be mechanically generated from the custom section, those methods can simply be added after instantiation.
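As a sketch of that user-space fallback, the export names and accessor signatures here are hypothetical:

```javascript
// Sketch: imperatively populating an exported prototype after
// instantiation. `instance`, the export names "PointProto", "pointGetX",
// and "pointGetY", and the accessor signatures are all hypothetical.
function populatePrototypes(instance) {
  const { PointProto, pointGetX, pointGetY } = instance.exports;
  // Each exported accessor takes the struct as an explicit argument,
  // so wrap it into a getter that forwards `this`.
  Object.defineProperty(PointProto, "x", { get() { return pointGetX(this); } });
  Object.defineProperty(PointProto, "y", { get() { return pointGetY(this); } });
  return PointProto;
}
```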
## Deduplicating Field Lists
Just like normal struct fields, RTT fields need to be repeated when declaring subtypes. In cases where vtable types could currently be shared between multiple types in a source type hierarchy, moving the vtable fields into RTTs will cause those fields to be repeated more in the type section. To prevent code size regressions, we need a new mechanism to deduplicate lists of fields in type definitions. There is a large design space here, but this is one possible design:
A new kind of type section entry called `fieldlist` is introduced. Struct fields can become references to fieldlists, which are inlined into the struct definitions. In the text format, field names given in the field list are inlined into the type definition as well.

To prevent infinitely large types, fieldlists can use only previously defined fieldlists in their definitions. They can refer to any type defined in the current or previous recursion groups. Types can only use previously defined fieldlists.