-
Notifications
You must be signed in to change notification settings - Fork 2
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
lang: add the union
type
#45
Conversation
Also clarify that `Return` also allows expression with a type that is a subtype of the return type.
The new `fitExpr` procedure handles type mismatch errors as well as performing the necessary conversions. In addition, `Return` translation is refactored to better handle error propagation: the expression having an error no longer prevents the `Return` from being typed as `void`.
`typeToIL` already handles `tkError`, but `genProcType` unnecessarily aborted on encountering it. This is now fixed.
Edited the description to note that subtype allowance in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One possible issue with not collapsing unions, but otherwise looks good, hopefully I'm not misunderstanding something.
if typ.kind != tkError: | ||
c.error("union type operands must be unique") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I'm wrong, just looking at this but I think union(union(float, int), union(int, string))
should result in union(float, int, string)
.
This points to two things:
- maybe we shouldn't get a duplication error?
- we should flatten unions, I don't believe
evalType
alone does that
I haven't looked at the whole PR, so this, union inside a union, might not be possibility.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps I'm wrong, just looking at this but I think
union(union(float, int), union(int, string))
should result inunion(float, int, string)
.
I did think about this when writing the specification. My main motivation was to allow int or float or string
producing the union(int, float, string)
type instead of union(union(int, float), string)
without any special cases or rules.
I ultimately discarded the "flatten" idea, mostly for practical usage reasons, but also because it means that type formation, term elimination, and term introduction rules become more complex for union
.
There are two concrete situations where I think flattening is a problem.
Case 1: ergonomics and efficiency
Consider the following (using hypothetical syntax):
proc floatOrStr(): union(float, string) =
...
proc p(): union(int, union(float, string)) =
return floatOrStr()
With the rules as they currently are, this would work as one would expect (in my opinion). If the union type is flattened, then there's a problem, because union(A, B)
is currently not considered a subtype of union(A, B, C)
, meaning that the code above becomes illegal.
Of course, union(A, B)
could be made a subtype of union(A, B, C)
, and I also think it should. The subtype relationship would - in my opinion - need to be coercive, however, given that the necessary run-time conversion is not trivial (the input union needs to be unpacked and then repacked).
In the example showed above, requiring an explicit coercion would hurt ergonomics and also be somewhat counterintuitive.
Case 2: composition in generics
Consider the following:
proc f[A, B](a: A, b: B) =
let x: union(A, B) = select(..., a, b)
case x
of A as val: ...
of B as val: ...
My expectation is that a call such as a: int, b: union(float, string) |- f(a, b)
should be perfectly valid, but when union types are flattened, it wouldn't be, since the second of
branch would specify a type that's not part of the union.
Again, there are some possible remedies:
- the notion of strong and weak union types is introduced; strong unions cannot be "destructured", whereas weak ones can be. Union types bound to type variables would then always become strong ones
- in
of T as val
,T
is allowed to be a union type covering a subset of the types in the matched-against union
Number 1 introduces additional complexity to the type system. Number 2 would only work for non-lvalue conversion (because the repacking of unions cannot preserve lvalue-ness), thus having composition issues again.
Summing it up, I think that flattening unions (by default) is not a good idea. There could be some type operator, like flatten
, which constructs a "flat" union type from a nested one, though I think that's something for later.
I think there's also a consistency argument to be made (though maybe not a very strong one) with regards to tuple
, which doesn't collapse nested tuples, for reasons similar to those I stated above.
maybe we shouldn't get a duplication error?
This is also something I though about. I decided to go with reporting an error because I think it'll work better with (future) generic routines.
Consider:
proc f[A, B](a: A, b: B) =
let x: union(A, B) = ...
When making duplicating types in a union type constructor an error, it's possible to derive an A != B
constraint for the generic parameters when typing the generic body, making it possible to fail analysis as soon as the concrete types are known, which could have some beneficial interactions with overload resolution (should we decide to employ type-based overload resolution, that is).
However, I do have a less strong opinion on an error for duplication than I have on flattening.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed response. One thing I didn't consider was strong versus weak unions, that's a pretty cool idea; but you're correct I don't see the benefit to cost at present..
## Computes whether `a` is a subtype of `b`. | ||
if b.kind == tkError: | ||
true # the error type acts as a top type | ||
elif a.kind == tkVoid: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making void
a bottom means we're treating it as an all
, a sort of antonym to any
(top) which is also occupied by error, pretty sure that works for us long term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making
void
a bottom means we're treating it as anall
I somewhat disagree, but maybe it's only because of the name all
. When I first read all
, my first thought was that it's the sum type of all possible types, which would make it a top-like type. Even when interpreting it as a bottom type, my first thought was that it is an inhabited type, which void
is not.
For completeness' sake (though I believe you know this already), from the start, my intention with void
was for it to be the uninhabited bottom type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Making
void
a bottom means we're treating it as anall
I somewhat disagree, but maybe it's only because of the name
all
. When I first readall
, my first thought was that it's the sum type of all possible types, which would make it a top-like type. Even when interpreting it as a bottom type, my first thought was that it is an inhabited type, whichvoid
is not.For completeness' sake (though I believe you know this already), from the start, my intention with
void
was for it to be the uninhabited bottom type.
Yup, I know it's meant to be uninhabited. all
, or we could call it every
, is something that is all types all once, which can be thought of as either being inhabited by an impossible value (an opposite to unit
), or an uninhabitable type (void
), the latter is how I tend to think about it. For now all this is fine, until we hit certain cases the distinctions are typically not helpful.
Co-authored-by: Saem Ghani <[email protected]>
Summary
union
type, which is a concrete anonymous sum typeunion
type constructorReturn
also allows subtypes (covariant return)Return
Details
tkUnion
type kindcmp
procedure totypes
, used for sorting typesisSubtypeOf
procedure totypes
tkUnion
(type rendering,value rendering, etc.)
Return
Return
analysis and translation to be more linearIn memory
union
values are currently represented as an 8-byte integertag + a union of all allowed types, but this is subject to change.
Notes For Reviewers
They're not part of the core set of language features for version 0, but union types are useful, and also stress-test the lower-level passes a bit, so I figured I'd implement them right away.
There's no pattern matching facility yet, due to the current lack of local variables. Those are better implemented separately, which will then unblock pattern matching.