-
Notifications
You must be signed in to change notification settings - Fork 5
Parameter dependency
Parameter dependency is the name given to a tricky issue that comes up in many settings where one is dealing with higher-order abstract syntax. It can be particularly devious in CLF encodings.
Consider the following nearly-standard SSOS encoding of the untyped lambda
calculus, environment-semantics style. (It's only "nearly" standard because I
was too lazy to define frames, instead using poor-type-theorist's frames,
which are just linear functions exp -o exp
. There are a number of reasons
you wouldn't want to do this in practice!)
[example1-good.clf]
exp: type.
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.
dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
bind: exp -> exp -> type. #mode bind - -.
ev-lam:
eval (lam \!x. E !x) D
-o {retn (lam \!x. E !x) D}.
ev-app:
eval (app E1 E2) D
-o {Exists d1.
eval E1 d1
* cont (\e1. app e1 E2) d1 D}.
ev-app1:
cont (\e1. app e1 E2) D1 D
-o retn (lam \!x. E !x) D1
-o {Exists d2.
eval E2 d2
* cont (\e2. app (lam \!x. E !x) e2) d2 D}.
ev-app2:
cont (\e2. app (lam \!x. E !x) e2) D2 D
-o retn V2 D2
-o {Exists x.
eval (E !x) D
* !bind x V2}.
ev-bind:
bind X V
-o eval X D'
-o {retn V D'}.
Here's the execution trace of the evaluation of app (lam \!x. x) (lam \!y. y)
, using a stylized notation for frames and expressions, and also using exponentials as a shorthand for judgments (we write !bind x (\y.y)
instead of bind x (\y.y) pers
).
-- d, eval ((\x.x)(\y.y) d
-- d,d1 eval (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2,x eval x d, !bind x (\y.y)
-- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
Here's the execution of an only slightly more complicated program:
-- d, eval ((\x.xx)(\y.y) d
-- d,d1 eval (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2,x eval (xx) d, !bind x (\y.y)
-- d,d1,d2,x,d3 eval x d3, cont ([]x) d3 d, !bind x (\y.y)
-- d,d1,d2,x,d3 retn (\y.y) d3, cont ([]x) d3 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4 eval x d4, cont ((\y.y)[]) d4 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4 retn (\y.y) d4, cont ((\y.y)[]) d4 d, !bind x (\y.y)
-- d,d1,d2,x,d3,d4,y eval y d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
-- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
And a third example, demonstrating how environments can "leak" into the final answer (\z.x
with x
bound to \y.y
is actually \z.\y.y
).
-- d, eval ((\x.\z.x)(\y.y) d
-- d,d1 eval (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.\z.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.\z.x)[]) d2 d
-- d,d1,d2,x eval (\z.x) d, !bind x (\y.y)
-- d,d1,d2,x retn (\z.x) d, !bind x (\y.y)
As Ian Zerny points out, the program above is highly "defunctionalized," as writing programs in a defunctionalized style has been the fashion at least since Frank's 2004 APLAS talk and our subsequent 2009 LICS paper.
We'll get at our example by re-functionalizing the program just a little bit:
trying to eliminate the bind
predicate and associated ev-bind rule by
collapsing it into the ev-app2
rule: whenever we previously had a predicate
!bind x V
in the context, we now ought to have a premise
!eval x D' -o retn V D'
.
Of course, because I'm explaining a problem, I'll do this in a way that is subtly wrong.
[example2-buggy.clf]
exp: type.
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.
dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
ev-lam:
eval (lam \!x. E !x) D
-o {retn (lam \!x. E !x) D}.
ev-app:
eval (app E1 E2) D
-o {Exists d1.
eval E1 d1
* cont (\e1. app e1 E2) d1 D}.
ev-app1:
cont (\e1. app e1 E2) D1 D
-o retn (lam \!x. E !x) D1
-o {Exists d2.
eval E2 d2
* cont (\e2. app (lam \!x. E !x) e2) d2 D}.
ev-app2:
cont (\e2. app (lam \!x. E !x) e2) D2 D
-o retn V2 D2
-o {Exists x.
eval (E !x) D
* !(eval x D'
-o {retn V D'})}.
Our first example actually works! However, our first example should also make us aware that there's a potential problem.
-- d, eval ((\x.x)(\y.y) d
-- d,d1 eval (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.x) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.x)[]) d2 d
-- d,d1,d2,x eval x d, !(eval x D'[d/x1,d1/x2,d2/x3]
-o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
(in this step, D' becomes "x1", so D'[d,d1,d2] becomes "d")
-- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
Our new ev-app2
rule has D'
free in the premise but not the conclusion, so our
operational semantics has to place an unbound existential variables in the
context. Existential variables, following contextual modal type theory, always
run with their own substitutions. For the purposes of this conversation, that
explicit substitution represents that what an existential variable actually
"is" is a representation of the fact that we have to instantiate a variable
(like D'
) back where it's implicit Pi-binding lives, in the rule ev-app2
, so
D'
can only mention variables that had been introduced when we encountered the
Pi binding for D'
(namely, d
, d1
, and d2
).
This will be the source of our first problem, when we try to run the more complicated program:
-- d, eval ((\x.xx)(\y.y) d
-- d,d1 eval (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1 retn (\x.xx) d, cont ([](\y.y)) d1 d
-- d,d1,d2 eval (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2 retn (\y.y) d, cont ((\x.xx)[]) d2 d
-- d,d1,d2,x eval (xx) d, !(eval x D'[d/x1,d1/x2,d2/x3]
-o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
-- d,d1,d2,x,d3 eval x d3, cont ([]x) d3 d,
!(eval x D'[d/x1,d1/x2,d2/x3]
-o {retn (\y.y) D'[d/x1,d1/x2,d2/x3]})
STUCK!
Recall that, previously, the next step was an application of ev-bind
that
turned eval x d3
into retn (\y.y) d3
. But d3
is not one of the variables
mentioned in the explicit substitution for D'
- it is not yet introduced
when we fire ev-app2
- so we can't get to retn (\y.y) d3
, which is the
logical next step.
[example2-fixed.clf]
exp: type.
app: exp -o exp -o exp.
lam: (exp -> exp) -o exp.
dest: type.
eval: exp -> dest -> type. #mode eval - -.
retn: exp -> dest -> type. #mode retn - -.
cont: (exp -o exp) -> dest -> dest -> type. #mode cont - - -.
ev-lam:
eval (lam \!x. E !x) D
-o {retn (lam \!x. E !x) D}.
ev-app:
eval (app E1 E2) D
-o {Exists d1.
eval E1 d1
* cont (\e1. app e1 E2) d1 D}.
ev-app1:
cont (\e1. app e1 E2) D1 D
-o retn (lam \!x. E !x) D1
-o {Exists d2.
eval E2 d2
* cont (\e2. app (lam \!x. E !x) e2) d2 D}.
ev-app2:
cont (\e2. app (lam \!x. E !x) e2) D2 D
-o retn V2 D2
-o {Exists x.
eval (E !x) D
* !(Pi d'.
eval x d'
-o {retn V d'})}.
There was a subtler bug, as well, one that I'm not sure how to demonstrate with
a reasonable forward-chaining program. Because D'
is existentially quantified,
we can only instantiate it once, so if we'd somehow run into a situation
where we needed to return x
to both the destinations d1
and d
, we would
be as hosed for a different reason.
This more subtle bug is actually easier to illustrate with a Twelf program that tries to replace all variable occurrences with some natural number (which natural number isn't important, it's just important that it changes).
[param-bad.elf]
nat: type.
z: nat.
s: nat -> nat.
exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.
gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2')
<- gogo E1 (s N) E1'
<- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x)
<- ({x} gogo x M (n M)
-> gogo (E x) (s N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.
There's no totality declaration, as the last %query
shows. If we try to
conclude %total T (gogo T _ _)
, Twelf will report the following:
Coverage error --- missing cases:
{X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:exp} |- gogo #govar_x z X2,
{X1:nat} {#govar:{x:exp} {d:gogo x X1 (n X1)}} {X2:nat} {X3:exp}
|- gogo #govar_x (s X2) X3.
The problem is not a parameter dependency issue, it's the more subtle bug that
I was mentioning above: M
appears as an existential variable - with two lam
s
to traverse, we end up with two existential variables, which I'll call Mx
and My
- once we get to the app
our context contains, x
, y
,
the fact gogo x Mx[] (n Mx[])
and the fact gogo y My[] (n My[])
. In the
second example, Mx
gets unified with s (s (s z))
- 3 - whereas My
gets
unified with s (s (s (s z)))
- 4, and everything is fine. In the
second example, we encounter the x
in app x y
and we need to return 4,
but x
has already been unified with 3 by the other occurrence of x
, so the
query as a whole fails.
Conclusion: even if/when we can avoid parameter dependency issues, unification
variables in the context act like very unpredictable linear resources.
I think the right solution is to always
avoid programs that can leak existential variables out of the focusing phase
and into the context.
The way to do the thing we presumably intended to do with the program above
is to make the program more higher order: locally pi-quantifying M
. This is
essentially the same way we fixed the SSOS program in CLF, of course!
[param-good.elf]
nat: type.
z: nat.
s: nat -> nat.
exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.
gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2')
<- gogo E1 (s N) E1'
<- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x)
<- ({x} ({M} gogo x M (n M))
-> gogo (E x) (s N) (E' x)).
-: gogo (n M) N (n M).
%block govar: block {x: exp} {d: {M} gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 1 * gogo (lam [x] lam [y] app x (app x y)) z E.
%total T (gogo T _ _).
A final point about parameter dependency: sometimes, we want to do something
that seems bad because we know what we're doing. Take the program above; we
could have just as well written the clause that deals with lam
this way,
because the whole point of the program is that it substitutes for all the
variables:
-: gogo (lam [x] E x) N (lam [x] E')
<- ({x} ({M} gogo x M (n M))
-> gogo (E x) (s N) E').
If we'd done so, however, the %total
declaration would have flunked us, because
it doesn't do any of the analysis that would be necessary to notice that the
output can't depend on the parameter x
- the only analysis it does do is
based on type subordination.
We can, by making this example more degenerate (nats can depend on exps, which changes the subordination relation significantly), come up with an example that can fail because of parameter dependency, not because of the "linear-ish" behavior of unification. The fix is the same as before: quantify more locally.
[param-goofy-bad1.elf]
nat: type.
z: nat.
s: nat -> nat.
exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2')
<- gogo E1 (s N) E1'
<- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x)
<- ({x} gogo x M (n M)
-> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: nat} block {x: exp} {d: gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).
%query 0 * gogo (lam [x] app x x) z E.
%query 0 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.
[param-goofy-bad2.elf]
nat: type.
z: nat.
s: nat -> nat.
exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2')
<- gogo E1 (s N) E1'
<- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x)
<- ({x} gogo x (M x) (n (M x))
-> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: some {M: exp -> nat} block {x: exp} {d: gogo x (M x) (n (M x))}.
%worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E.
%query 0 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 0 * gogo (lam [x] lam [y] app x (app x y)) z E.
[param-goofy-good.elf]
nat: type.
z: nat.
s: nat -> nat.
exp: type.
app: exp -> exp -> exp.
lam: (exp -> exp) -> exp.
n: nat -> exp.
nx: exp -> nat -> nat.
gogo: exp -> nat -> exp -> type.
%mode gogo +A +B -C.
-: gogo (app E1 E2) N (app E1' E2')
<- gogo E1 (s N) E1'
<- gogo E2 (s N) E2'.
-: gogo (lam [x] E x) N (lam [x] E' x)
<- ({x} ({M} gogo x M (n M))
-> gogo (E x) (nx x N) (E' x)).
-: gogo (n M) N (n M).
%block govar: block {x: exp} {d: {M} gogo x M (n M)}.
%worlds (govar) (gogo _ _ _).
%query 1 * gogo (lam [x] app x x) z E.
%query 1 * gogo (lam [x] lam [y] app x (app y y)) z E.
%query 1 * gogo (lam [x] lam [y] app x (app x y)) z E.
%total T (gogo T _ _).
Another way this bug occurs is in queries. Given example1-good.clf or example2-fixed.clf, the programs corresponding to our two example programs will succeed:
#query * 1 * 1
Pi d.
eval (app (lam \!x. x) (lam \!y. y)) d
-o {retn V d}.
#query * 1 * 1
Pi d.
eval (app (lam \!x. app x x) (lam \!y. y)) d
-o {retn V d}.
However, with our third example program, we will fail:
#query * 1 * 1
Pi d.
eval (app (lam \!x. lam \!z. x) (lam \!y. y)) d
-o {retn V d}.
For (\x.x)(\y.y)
it was:
-- d,d1,d2,x retn (\y.y) d, !bind x (\y.y)
For (\x.xx)(\y.y)
it was:
-- d,d1,d2,x,d3,d4,y retn (\y.y) d, !bind x (\y.y), !bind y (\y.y)
For our problematic example, (\x.\z.x)(\y.y)
, our final state was:
-- d,d1,d2,x retn (\z.x) d, !bind x (\y.y)
Therein lies the problem: Celf represents V
as the existential variable
V[]
. It's not allowed to mention any of the variables we could have allowed
it to refer to the variable d
by writing V d
in the query, not that this
would have helped anything in this particular case. It's not possible to unify
V
with anything that makes it equal to \z.x
, because x
is free in that
expression.