-
Notifications
You must be signed in to change notification settings - Fork 5
Parameter dependency
Parameter dependency is the name given to a tricky issue that comes up in many settings where one is dealing with higher-order abstract syntax. For
that frequently comes up in CLF encodings and
Consider the following nearly-standard SSOS encoding of the untyped lambda calculus, environment-semantics style. (It's only "nearly" standard because I was too lazy to define frames, instead using poor-type-theorist's frames, which are just linear functions (exp -o exp). There are a number of reasons you wouldn't want to do this in practice!)
[example1-good.clf]
exp: type.
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.
dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
bind: exp -> exp -> type. #mode bind - -.
ev-lam:
eval (lam \!x. E !x) D
-o {retn (lam \!x. E !x) D}.
ev-app:
eval (app E1 E2) D
-o {Exists d1.
eval E1 d1
* cont (\e1. app e1 E2) d1 D}.
ev-app1:
cont (\e1. app e1 E2) D1 D
-o retn (lam \!x. E !x) D1
-o {Exists d2.
eval E2 d2
* cont (\e2. app (lam \!x. E !x) e2) d2 D}.
ev-app2:
cont (\e2. app (lam \!x. E !x) e2) D2 D
-o retn V2 D2
-o {Exists x.
eval (E !x) D
* !bind x V2}.
ev-bind:
bind X V
-o eval X D'
-o {retn V D'}.
Here's the execution of (app (lam !x. x) (lam !y. y)), using a stylized notation for frames and expressions, and also using exponentials as a shorthand for judgments (we write "!bind x (\y.y)" instead of "bind x (\y.y) pers").
-- d, eval ((\x.x)(\y.y) d
-- d,d1 eval (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2,x eval x d, !bind x (\y.y)
-- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
Here's the execution of an only slightly more complicated program:
-- d, eval ((\x.xx)(\y.y) d
-- d,d1 eval (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2,x eval (xx) d, !bind x (\y.y)
-- d,d1,d2,x,d3 eval x d3, cont ([]x) d3 d, !bind x (\y.y)
-- d,d1,d2,x,d3 retn (\y.y) d3, cont ([]x) d3 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4 eval x d4, cont ((\y.y)[]) d4 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4 retn (\y.y) d4, cont ((\y.y)[]) d4 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4,y eval y d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
And a third example, demonstrating how environments can get into the final
answer (\z.x
with x
bound to \y.y
is actually \z.\y.y
).
-- d, eval ((\x.\z.x)(\y.y) d
-- d,d1 eval (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2,x eval (\z.x) d, !bind x (\y.y)
-- d,d1,d2,x retn (\z.x) d, !bind x (\y.y)
As Ian Zerny points out, the program above is highly "defunctionalized," as writing programs in a defunctionalized style has been the fashion at least since Frank's 2004 APLAS talk and our subsequent 2009 LICS paper.
We'll get at our example by re-functionalizing the program just a little bit: trying to eliminate the "bind" predicate and associated ev-bind rule by collapsing it into the ev-app2 rule: whenever we previously had a predicate "!bind x V" in the context, we now ought to have a premise "!eval x D' -o retn V D'".
Of course, because I'm explaining a problem, I'll do this in a way that is subtly wrong.
==== [begin example2-buggy.clf] exp: type. app: exp -o exp -o exp. lam: (exp -> exp) -o exp.
dest: type. eval: exp -> dest -> type. #mode eval - -. retn: exp -> dest -> type. #mode retn - -. cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
ev-lam: eval (lam !x. E !x) D -o {retn (lam !x. E !x) D}.
ev-app: eval (app E1 E2) D -o {Exists d1. eval E1 d1 * cont (\e1. app e1 E2) d1 D}.
ev-app1: cont (\e1. app e1 E2) D1 D -o retn (lam !x. E !x) D1 -o {Exists d2. eval E2 d2 * cont (\e2. app (lam !x. E !x) e2) d2 D}.
ev-app2: cont (\e2. app (lam !x. E !x) e2) D2 D -o retn V2 D2 -o {Exists x. eval (E !x) D * !(eval x D' -o {retn V D'})}. ==== [end example2-buggy.clf]
Our first example actually works! However, our first example should also make us aware that there's a potential problem.
- d, eval ((\x.x)(\y.y) d
- d,d1 eval (\x.x) d, cont () d1 d
- d,d1 retn (\x.x) d, cont () d1 d
- d,d1,d2 eval (\y.y) d, cont ((\x.x)[]) d2 d
- d,d1,d2 retn (\y.y) d, cont ((\x.x)[]) d2 d
- d,d1,d2,x eval x d, !(eval x D'[d/x1,d1/x2,d2/x3] -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]}) (in this step, D' becomes "x1", so D'[d,d1,d2] becomes "d")
- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
Our new ev-app2 rule has D' free in the premise but not the conclusion, so our operational semantics has to place an unbound existential variables in the context. Existential variables, following contextual modal type theory, always run with their own substitutions. For the purposes of this conversation, that explicit substitution represents that what an existential variable actually "is" is a representation of the fact that we have to instantiate a variable (like D') back where it's implicit Pi-binding lives, in the rule ev-app2, so D' can only mention variables that had been introduced when we encountered the Pi binding for D' (namely, d, d1, and d2).
This will be the source of our first problem, when we try to run the more complicated program:
- d, eval ((\x.xx)(\y.y) d
- d,d1 eval (\x.xx) d, cont () d1 d
- d,d1 retn (\x.xx) d, cont () d1 d
- d,d1,d2 eval (\y.y) d, cont ((\x.xx)[]) d2 d
- d,d1,d2 retn (\y.y) d, cont ((\x.xx)[]) d2 d
- d,d1,d2,x eval (xx) d, !(eval x D'[d/x1,d1/x2,d2/x3] -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
- d,d1,d2,x,d3 eval x d3, cont ([]x) d3 d, !(eval x D'[d/x1,d1/x2,d2/x3] -o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]}) STUCK!
Recall that, previously, the next step was an application of ev-bind that turned "eval x d3" into "retn (\y.y) d3". But d3 is not one of the variables mentioned in the explicit substitution for D' - it is not yet introduced when we fire ev-app2 - so we can't get to "retn (\y.y) d3," which is the logical next step.
==== [begin example2-fixed.clf] exp: type. app: exp -o exp -o exp. lam: (exp -> exp) -o exp.
dest: type. eval: exp -> dest -> type. #mode eval - -. retn: exp -> dest -> type. #mode retn - -. cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
ev-lam: eval (lam !x. E !x) D -o {retn (lam !x. E !x) D}.
ev-app: eval (app E1 E2) D -o {Exists d1. eval E1 d1 * cont (\e1. app e1 E2) d1 D}.
ev-app1: cont (\e1. app e1 E2) D1 D -o retn (lam !x. E !x) D1 -o {Exists d2. eval E2 d2 * cont (\e2. app (lam !x. E !x) e2) d2 D}.
ev-app2: cont (\e2. app (lam !x. E !x) e2) D2 D -o retn V2 D2 -o {Exists x. eval (E !x) D * !(Pi d'. eval x d' -o {retn V d'})}. ==== [end example2-fixed.clf]
There was a subtler bug, as well, one that I'm not sure how to demonstrate with a reasonable forward-chaining program. Because D' is existentially quantified, we can only instantiate it once, so if we'd somehow run into a situation where we needed to return "x" to both the destinations "d1" and "d", we would be as hosed for a different reason.
This more subtle bug is actually easier to illustrate with a Twelf program that tries to replace all variable occurrences with some natural number (which natural number isn't important, it's just important that it changes).
==== [begin param-bad.elf] nat: type. z: nat. s: nat -> nat.
exp: type. app: exp -> exp -> exp. lam: (exp -> exp) -> exp. n: nat -> exp.
gogo: exp -> nat -> exp -> type. %mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2') <- gogo E1 (s N) E1' <- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x) <- ({x} gogo x M (n M) -> gogo (E x) (s N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}. %worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E. %query 1 * gogo (lam [x] lam [y] app x (app y y)) z E. %query 0 * gogo (lam [x] lam [y] app x (app x y)) z E. ==== [end param-bad.elf]
There's no totality declaration, as the last %query shows. If we try to conclude %total T (gogo T _ _), Twelf will report the following:
Coverage error --- missing cases: {X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:exp} |- gogo #govar_x z X2, {X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:nat} {X3:exp} |- gogo #govar_x (s X2) X3.
The problem is not a parameter dependency issue, it's the more subtle bug that I was mentioning above: M appears as an existential variable - with two "lam"s to traverse, we end up with two existential variables, which I'll call "Mx" and "My" - once we get to the "app" our context contains, x, y, the fact "gogo x Mx[] (n Mx[])" and the fact "gogo y My[] (n My[])". In the second example, Mx gets unified with "s (s (s z))" - 3 - whereas My gets unified with "s (s (s (s z)))" - 4, and everything is fine. In the second example, we encounter the "x" in (app x y) and we need to return "4", but "x" has already been unified with 3 by the other occurrence of "x", so the query as a whole fails.
Conclusion: even if/when we can avoid parameter dependency issues, unification variables in the context act like very unpredictable linear resources. My argument during the meeting today was that the right solution is to always avoid programs that can leak existential variables out of the focusing phase and into the context. However, this requirement is neither necessary nor sufficient for the replacement of unification with matching in the moded semantics, which is why the consensus seemed to be that it wasn't the place of mode checking as it's understood in CLF currently.
The way to do the thing we presumably intended to do with the program above is to make the program more higher order: locally pi-quantifying M. This is essentially the same way we fixed the SSOS program in CLF, of course!
==== [begin param-good.elf] nat: type. z: nat. s: nat -> nat.
exp: type. app: exp -> exp -> exp. lam: (exp -> exp) -> exp. n: nat -> exp.
gogo: exp -> nat -> exp -> type. %mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2') <- gogo E1 (s N) E1' <- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x) <- ({x} ({M} gogo x M (n M)) -> gogo (E x) (s N) (E' x)).
-: gogo (n M) N (n M).
%block govar: block {x: exp} {d: {M} gogo x M (n M)}. %worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E. %query 1 * gogo (lam [x] lam [y] app x (app y y)) z E. %query 1 * gogo (lam [x] lam [y] app x (app x y)) z E. %total T (gogo T _ _). ==== [end param-good.elf]
A final point about parameter dependency: sometimes, we want to do something that seems bad because we know what we're doing. Take the program above; we could have just as well written the clause that deals with lam this way, because the whole point of the program is that it substitutes for all the variables:
-: gogo (lam [x] E x) N (lam [x] E') <- ({x} ({M} gogo x M (n M)) -> gogo (E x) (s N) E').
If we'd done so, however, the %total declaration would have flunked us, because it doesn't do any of the analysis that would be necessary to notice that the output can't depend on the parameter "x" - the only analysis it does do is based on type subordination.
We can, by making this example more degenerate (nats can depend on exps, which changes the subordination relation significantly), come up with an example that can fail because of parameter dependency, not because of the "linear-ish" behavior of unification. The fix is the same as before: quantify more locally.
==== [end param-goofy-bad1.elf] nat: type. z: nat. s: nat -> nat.
exp: type. app: exp -> exp -> exp. lam: (exp -> exp) -> exp. n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type. %mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2') <- gogo E1 (s N) E1' <- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x) <- ({x} gogo x M (n M) -> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}. %worlds (govar) (gogo _ _ _).
%query 0 * gogo (lam [x] app x x) z E. %query 0 * gogo (lam [x] lam [y] app x (app y y)) z E. %query 0 * gogo (lam [x] lam [y] app x (app x y)) z E. ==== [end param-goofy-bad1.elf] ==== [end param-goofy-bad2.elf] nat: type. z: nat. s: nat -> nat.
exp: type. app: exp -> exp -> exp. lam: (exp -> exp) -> exp. n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type. %mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2') <- gogo E1 (s N) E1' <- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x) <- ({x} gogo x (M x) (n (M x)) -> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: exp -> nat} block {x: exp} {d: gogo x (M x) (n (M x))}. %worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E. %query 0 * gogo (lam [x] lam [y] app x (app y y)) z E. %query 0 * gogo (lam [x] lam [y] app x (app x y)) z E. ==== [end param-goofy-bad2.elf] ==== [begin param-goofy-good.elf] nat: type. z: nat. s: nat -> nat.
exp: type. app: exp -> exp -> exp. lam: (exp -> exp) -> exp. n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type. %mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2') <- gogo E1 (s N) E1' <- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x) <- ({x} ({M} gogo x M (n M)) -> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: block {x: exp} {d: {M} gogo x M (n M)}. %worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E. %query 1 * gogo (lam [x] lam [y] app x (app y y)) z E. %query 1 * gogo (lam [x] lam [y] app x (app x y)) z E. %total T (gogo T _ _). ==== [end param-goofy-good.elf]
Another way this bug occurs is in queries. Given example1-good.clf or example2-fixed.clf, the programs corresponding to our two example programs will succeed:
#query * 1 * 1 Pi d. eval (app (lam !x. x) (lam !y. y)) d -o {retn V d}.
#query * 1 * 1 Pi d. eval (app (lam !x. app x x) (lam !y. y)) d -o {retn V d}.
However, with our third example program, we will fail:
#query * 1 * 1 Pi d. eval (app (lam !x. lam !z. x) (lam !y. y)) d -o {retn V d}.
For (\x.x)(\y.y) it was:
- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
For (\x.xx)(\y.y) it was:
- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
For our problematic example, (\x.\z.x)(\y.y), our final state was:
- d,d1,d2,x retn (\z.x) d, !bind x (\y.y)
Therein lies the problem: Celf represents "V" as the existential variable "V[]". It's not allowed to mention any of the variables (we could have allowed it to refer to the variable "d" by writing (V d) in the query, not that this would have helped anything in this particular case. It's not possible to unify V with anything that makes it equal to (\z.x), because x is free in that expression.