Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Type Classes #256

Merged
merged 6 commits into from
Feb 7, 2025
Merged

Type Classes #256

merged 6 commits into from
Feb 7, 2025

Conversation

FlandiaYingman
Copy link

@FlandiaYingman FlandiaYingman commented Dec 22, 2024

This (draft) PR introduces type classes by a minimal implementation of contextual parameters, including:

Parser rules for parsing use and using keywords.

abstract class Foo with
  fun foo(): Int

class FooImpl extends Foo with
  fun foo(): Int = 40

use Int as i = 2

use Foo = new FooImpl()
module M with
  fun f(using someInt: Int, someFoo: Foo): Int = someInt + someFoo.foo()

Elaborator rules for elaborating use and using keywords.

A use definition is elaborated into a TermDefinition of kind instance (Ins). If there is no explicitly specified identifier with the instance, a synthesized identifier is used.

If any parameter in the parameter list is modified by the using keyword, the whole parameter list becomes an "implicit parameter list".

Further elaboration on module methods

Module method applications (or implicit applications) are further elaborated with the type information.

In particular, if there is any missing contextual argument lists, the elaborator will unify the argument lists with the parameter lists, with some placeholder Elem. These Elems are populated later in the new implicit resolution pass.

module M with
  fun foo(using someInt: Int, someFoo: Foo): Int = someInt + someFoo.foo()
  fun bar()(using someInt: Int) = someInt

M.foo
// => M.foo(using Int ‹unpopulated›, using Foo ‹unpopulated›)

M.bar
// => elaboration error: mismatch

M.bar()
// => M.foo()(using Int ‹unpopulated›)

New pass: Implicit Resolution

A new pass called ImplicitResolution is introduced to resolve implicit arguments after elaboration.

It traverses the elaboration tree, find:

  • Instance definitions: add them into the context with their type signature as the key.
  • Contextual argument placeholder: resolve the contextual argument with the corresponding instance definition in the context.

TODO: Move rules of module methods to this pass.

@FlandiaYingman
Copy link
Author

FlandiaYingman commented Dec 22, 2024

Wow it works like magic 🫨🤓

// Monoid Example

abstract class Monoid[T] extends Semigroup[T] with
  fun combine(a: T, b: T): T
  fun empty: T

object IntAddMonoid extends Monoid[Int] with
  fun combine(a: Int, b: Int): Int = a + b
  fun empty: Int = 0

object IntMulMonoid extends Monoid[Int] with
  fun combine(a: Int, b: Int): Int = a * b
  fun empty: Int = 1

module M with
  fun foldInt(x1: Int, x2: Int, x3: Int)(using m: Monoid[Int]): Int =
    m.combine(x1, m.combine(x2, m.combine(x3, m.empty)))
  fun fold[T](x1: T, x2: T, x3: T)(using m: Monoid[T]): T =
    m.combine(x1, m.combine(x2, m.combine(x3, m.empty)))
    
use Monoid[Int] = IntAddMonoid
M.foldInt(2, 3, 4)
//│ = 9

use Monoid[Int] = IntMulMonoid
M.foldInt(2, 3, 4)
//│ = 24

use Monoid[Int] = IntAddMonoid
M.fold(1, 2, 3)
//│ FAILURE: Unexpected type error
//│ ═══[ERROR] Missing instance for context argument Param(‹›,m,Some(TyApp(SynthSel(Ref(globalThis:block#18),Ident(Monoid)),List(Ref(T)))))

Copy link
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, nice draft changes!

TODO: Instance identifiers should not be able to be referred directly in the code.

Edit: Wait. Do we really want to do that? I thought in Scala implicit instances are in another name scope, but seems that's not the case 🤓

Indeed, there need be no such restriction.

  • Instance definitions: add them into the context with their type signature as the key.

The key should be the type constructor of the type class being instantiated. It can be concrete or abstract (a type parameter or type member). As in fun foo[A](someA) = { use A = someA; ... }

BTW eventually we'll like to define things like:

module Foo[A](using Bar[A]) with
  ...
    [baz, Foo.baz] // equivalent to `[this.baz, Foo(using Bar[A].baz]`
    use Bar[Int] // possible to change the context here
    Foo.baz // equivalent to `Foo(using Bar[Int]).baz`
  ...

But for this we first need to add support for parameterized modules.

@FlandiaYingman
Copy link
Author

The key should be the type constructor of the type class being instantiated. It can be concrete or abstract (a type parameter or type member). As in fun foo[A](someA) = { use A = someA; ... }

If only the type constructors are specified as keys, wouldn't the following application fail?

use Monoid[Int] = IntAddMonoid
use Monoid[Str] = StrConcatMonoid

M.foldInt(2, 3, 4)
// resolves to M.foldInt(2, 3, 4)(using StrConcatMonoid)

@LPTK
Copy link
Contributor

LPTK commented Dec 23, 2024

It does fail, indeed. I guess we could make the rules a wee bit more fine-grained: we could look for the innermost instance in scope that could match the type query, where only those types explicitly specified (like the Int of Ord[Int]) are taken into account – so, as expected, Ord[A] for some unspecified A will unconditionally use StrConcatMonoid.

In any case, the map should still use keys that are type constructor symbols and not signatures. You can't just use syntactic types as keys as several type expressions could be equivalent and represent the same semantic type.

@FlandiaYingman
Copy link
Author

FlandiaYingman commented Jan 3, 2025

Most comments addressed, except:

Symbols as Context Keys

I tried, but it seems that it is not possible to use pure Symbol as the keys, because by doing so I couldn't get a previously defined Ord[Int] after defining a Ord[Str] (or maybe I didn't get your point 🙋‍♂️). Currently, a enum Type is used as the keys:

enum Type:
  case Any
  case Sym(sym: TypeSymbol)
  case App(t: Sym, typeArgs: Ls[Type])

Any is a placeholder for type parameters that are not explicitly provided. It could unconditionally match the innermost instance.

Map is not working here because many "query" keys can map to the same value, so instead a List is used, and it performs linear search to find an instance that matches the query.

I'm also aware that a TypeSymbol might be a TypeAliasSymbol. Should we resolve a type alias' definition (possibly perform a "lightweight" elaboration if absent) and substitute?

Partial Application Restrictions

It is currently not enforced because I find it easier to implement it when we move all module methods related checks to the new pass. I also find this PR is becoming large so let's have the checks in another PR

@FlandiaYingman
Copy link
Author

All tests should pass now, except for this one

fun use: [Rg, Br] -> Reg[Rg, Br] ->{Rg} Int

The ident use is now a keyword, so the parsing fails. I have no idea on bbml. How could I fix it 🫨

Copy link
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, looks very promising!

Though the light elaboration logic seems inappropriate. It appears that you're re-elaborating term definitions every time they are accessed, and in the wrong context! What you should do instead is to elaborate (in light mode) things that are imported from other files. This should be done in Importer.scala. Notice the old comment:

// * Note: we don't even need to elaborate the block!
// * Though doing so may be needed later for type checking,
// * so we should probably do it lazily in the future.
/*
given Elaborator.State = new Elaborator.State
given Elaborator.Ctx = Elaborator.Ctx.init.nest(N)
val elab = Elaborator(tl, file / os.up)
val (blk, newCtx) = elab.importFrom(resBlk)
*/

(Forget about the laziness for now!)

Then, if we still find that a definition is missing, it means either there was an error while elaborating the thing (and trying to re-elaborate it won't help) or there is a bug in the compiler.

I tried, but it seems that it is not possible to use pure Symbol as the keys, because by doing so I couldn't get a previously defined Ord[Int] after defining a Ord[Str] (or maybe I didn't get your point 🙋‍♂️).

Instead of a list of all instances and plain linear searches, we should at least index these instances on the type symbol. Thus "using symbols as keys".

I'm also aware that a TypeSymbol might be a TypeAliasSymbol. Should we resolve a type alias' definition and substitute?

Yes, if the type alias is not abstract, this should be done.

(possibly perform a "lightweight" elaboration if absent)

No (for the reason mentioned above).

[Partial Application Restrictions] currently not enforced because I find it easier to implement it when we move all module methods related checks to the new pass. I also find this PR is becoming large so let's have the checks in another PR

Sure, that's fine. Let's get small PRs merged regularly rather than big ones including many loosely related changes.


Don't forget to update the branch and mark the PR ready when it's ready for a thorough review.

@LPTK
Copy link
Contributor

LPTK commented Jan 6, 2025

PS:

The ident use is now a keyword, so the parsing fails. I have no idea on bbml. How could I fix it 🫨

Just rename the use function to something else! E.g., use_.

@FlandiaYingman
Copy link
Author

This PR is ready for review. The history is also clean. Main changes since last review:

Light Elaboration

I fixed the light elaboration according to the suggestion. There is a problem that there are two kinds of imports - one in the Importer.scala and the other in the MLsDiffMaker.scala. All sources will implicitly import Predef.mls. But this would not succeed without importing Prelude.mls because we will now elaborate all definitions and that needs the symbols from Prelude.mls.

I added a field prelude: Ctx to Elaborator so that instead of empty Ctx, the Importer will elaborate the imported files with prelude to bypass the problem. I think it might be not perfect, but I couldn't think of a cleaner way to do it.

@FlandiaYingman
Copy link
Author

Oh wait I forgot to resolve the conflicts

@FlandiaYingman FlandiaYingman marked this pull request as ready for review January 12, 2025 21:00
@LPTK
Copy link
Contributor

LPTK commented Jan 13, 2025

Thanks, I will try to review it soon. But I already notice some problems.

First, I just realized that I wanted to comment on this last time but somehow forgot: It seems you didn't understand the reason why we need to do resolution in a pass that's separate from elaboration. Indeed, what you implemented is not a separate pass: you do resolution on the fly, as elaboration progresses. So examples like this don't work:

:r
fun main() =
  use Bar = new Bar
  Test.test()
class Bar
module Test with
  fun test()(using bar: Bar): Bar = bar
//│ Resolved: { ‹› fun member:main() = { ‹› use member:instance$Ident_Bar_: globalThis:block#11#666(.)Bar‹member:Bar› = new SynthSel(Ref(globalThis:block#11),Ident(Bar))(); globalThis:block#11#666(.)Test‹member:Test›.test‹member:test›() }; Cls Bar { }; Mod Test { ‹module› fun member:test()ctx (Param(‹›,bar,Some(SynthSel(Ref(globalThis:block#11),Ident(Bar)))), ): globalThis:block#11#666(.)Bar‹member:Bar› = bar#666; }; }

main()
//│ = [Function (anonymous)]

Notice that the Test.test() call is not resolved (because we haven't even elaborated Bar or Test yet, at this point) and thus main() returns an incorrect result.

This test also exemplifies a critical flaw in your way of writing code: instead of reporting errors when things fail to elaborate, you silently generate wrong code! This sort of behavior is really to be avoided at all costs. Again, the default should be rejection, and you should only accept the code when it's 100% sure to be good. Even with a compiler crash is by far preferable to a miscompilation.

Notice that the following works and returns a different result. We don't what that changing the order of module and method definitions cause this sort of semantic discrepancies!

class Bar
module Test with
  fun test()(using bar: Bar): Bar = bar
fun main() =
  use Bar = new Bar
  Test.test()

main()
//│ = Bar {}

@FlandiaYingman
Copy link
Author

Thanks for the comments! You are right that this doesn't work, but I think the reason is not really related to the resolution pass itself. Prior to resolution, the elaborator identifies all module method applications and "marks" replaces any missing contextual parameters as some placeholders. What I was thinking is that it can be performed during elaboration to speed up the process, but it turns out that it indeed requires all symbols to be resolved, so I'll move it to the resolution pass.

@LPTK
Copy link
Contributor

LPTK commented Jan 13, 2025

Ah you're right, I read your diff too fast. You indeed have a properly separated phase, which is not the same as the moduleMethodApp method.

@FlandiaYingman
Copy link
Author

FlandiaYingman commented Jan 14, 2025

I tried to move the process of identifying module method applications later to the resolution stage, but it turned out to be a little bit difficult to do that directly. This is because during elaboration, we don't know whether some term represents a module method application or not. E.g., a Sel like M.f can become a module method application if f takes some implicit parameters!

If we want to completely move the process to the resolution stage, while keeping our performance goal (make arguments mutable instead of re-building the tree again), we will have to either change all Sel, App or TyApp to mutable, or create a mutable term wrapper that wraps over all Sel, App and TyApp for further modification by resolution. I think either is problematic because the former is error-prone and the latter breaks the semantic structure of our Term tree! I tried to implement in (60ad410), but many tests fail because of the wrapper, as I have to add one more case to all pattern matchings related to Sel, App and TyApp!

Now I'd suggest (and implemented in the latest push) to keep the current " identifying all module method applications" process during elaboration, but in addition we would have to "elaborate“ (in a very lightweight way) the module methods' parameter lists (because they might not be available at the time), so that we may construct a correct shape of nested Term.App to be populated later in the resolution pass.

@LPTK
Copy link
Contributor

LPTK commented Jan 15, 2025

we will have to either change all Sel, App or TyApp to mutable, or [...]

I think either is problematic because the former is error-prone

I don't follow. Why is this error-prone? I don't see a problem with it and this is more or less how I envisioned the feature. BTW you will also have to do this for Ref (I noticed that your PR currently mishandles references that should elaborate into applications to implicit arguments). All these ctors can be made to extend an additional ArgumentTaking trait that stores the actual state.

So, to be more specific: there is only one place where the trait is stored (the trait) and there can be only one place where this data is used (in lowering) to introduce the missing applications into the lowered tree. We just need to remember to do this change in lowering.

Now I'd suggest (and implemented in the latest push) to keep the current " identifying all module method applications" process during elaboration, but in addition we would have to "elaborate“ (in a very lightweight way) the module methods' parameter lists

This seems unnecessarily inefficient.

@FlandiaYingman
Copy link
Author

I don't follow. Why is this error-prone?

IMO a better design may only allow to modify things that really need to be modified. Notice that almost all Sels, TyApps and Apps are not eventually evaluated to implicit argument applications, only a few of them are. In contrast, the current implementation only allows modifying contextual arguments themselves, which, sounds more natural to me.

All these ctors can be made to extend an additional ArgumentTaking trait that stores the actual state.
So, to be more specific: there is only one place where the trait is stored (the trait) and there can be only one place where this data is used (in lowering) to introduce the missing applications into the lowered tree. We just need to remember to do this change in lowering.

I am not sure if I exactly understand this, but this sounds way better than what I was originally thinking 🥲.

Do you mean to introduce a trait that stores only the potential contextual arguments? So that other parts of the compiler may follow the original flow if they are not interested in the contextual arguments (or they may handle the arguments implicitly via subterms). E.g.,

  M.foo[T](using ...)()(using ...)  // has the following tree
  //│ App:
  //│   lhs = TyApp:
  //│     lhs = Sel:
  //│       prefix = SynthSel{member:M}:
  //│         prefix = Ref of globalThis:block#7
  //│         nme = Ident of "M"
  //│       nme = Ident of "foo"
  //│     targs = Nil
+ //|     ctxArgs = Tup of A :: Nil
  //│   rhs = Tup of Nil
+ //|   ctxArgs = Tup of B :: Nil
  // where ctxArgs are stored in the trait

BTW you will also have to do this for Ref

Isn't that only module method applications are resolved? And since module methods must somehow be Seled from some modules instances, I only did the resolution for Sel.

This seems unnecessarily inefficient.

Maybe we can do some caching? 🙈

@LPTK
Copy link
Contributor

LPTK commented Jan 23, 2025

Just a heads-up @FlandiaYingman: I've merged some pretty significant changes on the main branch, which affects code generation for local definitions (including the top-level ones of test worksheets). You may have to fix a few merge conflicts and possibly some bugs revealed by the change.

@LPTK
Copy link
Contributor

LPTK commented Jan 23, 2025

Oh, and I also think local functions should be allowed the same affordances as module methods (regarding type classes etc.) since we also always know their symbol when calling them.

Copy link
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In private conversation you told me you want to merge these changes as is. Which is a wise choice as I intend to make further changes to the elaboration system that would cause conflicts if this is not merged. So please merge with the main branch, address the current comments, and make sure this is ready for my final review.

@FlandiaYingman FlandiaYingman force-pushed the whatever-a-new-pass branch 3 times, most recently from 5f58799 to 5a5f837 Compare February 7, 2025 04:45
Copy link
Contributor

@LPTK LPTK left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! I only have minor comments, but overall the changes look good enough. I'll make the merge once you resolve everything.

@LPTK
Copy link
Contributor

LPTK commented Feb 7, 2025

I see you force-push to clean the history, which is laudable. But if you want me to merge this without squashing, make sure you run git rebase -i --exec "sbt hkmc2AllTests/test" hkmc2 first, to make sure the commits are not broken by rebasing.

@FlandiaYingman FlandiaYingman force-pushed the whatever-a-new-pass branch 2 times, most recently from 242198a to cff416b Compare February 7, 2025 10:12
@LPTK LPTK force-pushed the whatever-a-new-pass branch from cff416b to 3a9170d Compare February 7, 2025 10:16
@LPTK LPTK merged commit 0c89194 into hkust-taco:hkmc2 Feb 7, 2025
1 check passed
@LPTK
Copy link
Contributor

LPTK commented Feb 8, 2025

PS: Next time please pick a more informative branch name 😉

@FlandiaYingman FlandiaYingman deleted the whatever-a-new-pass branch February 8, 2025 11:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants