Programming Language Technology, DAT151/DIT231
At the core of functional languages is the λ-calculus.
In pure λ-calculus, everything is a function.
Expressions e of pure λ-calculus are given by this
grammar:
e,f ::= x -- Variable
| f e -- Application of function f to argument e
| λx → e -- Function: abstraction of x in eThis is a subset of the Haskell expression syntax.
The abstract syntax corresponds to this LBNF grammar:
EId. Exp3 ::= Ident;
EApp. Exp2 ::= Exp2 Exp3;
EAbs. Exp ::= "λ" Ident "→" Exp;
coercions Exp 3;
Example: x y z should be read (x y) z.
We consider an extension by let, numerals, and primitive
operators:
... | let x = e in f
| let x₁ = e₁; ...; xₙ = eₙ in f
| n -- E.g. 0,1,2..
| e₁ op e₂ -- op could be +,-,... ELet. Exp ::= "let" [Bind] "in" Exp;
EInt. Exp3 ::= Integer;
EOp. Exp1 ::= Exp1 Op Exp2;
Bind. Bind ::= Ident "=" Exp;
separator Bind ";";
let x = e in f could be regarded just syntactic sugar
for (λ x → f) e,
and let x₁ = e₁; ...; xₙ = eₙ in f as sugar for
(λ x₁ → ... λ xₙ → f) e₁ ... eₙ.
But let is conceptually important.
Convenient syntactic sugar:
λ x₁ ... xₙ → e sugar for
λ x₁ → .... → λ xₙ → elet f x₁ ... xₙ = e in ... for
let f = λ x₁ ... xₙ → e in ...Running example:
let
double x = x + x
comp f g x = f (g x)
in let
twice f = comp f f
in
twice twice double 2Example (Church numerals):
never f = λ x → x
once f = λ x → f x
twice f = λ x → f (f x)
thrice f = λ x → f (f (f x))In λ-calculus and functional languages, functions are first
class.
They can be:
E.g. 2. in C:
int comp (int f(int), int g(int), int x) {
return f(g(x));
}Trying 3. let twice f = comp f f in twice double in
C:
Attempt 1: illegal type
int twice (int f(int)) (int x) {
return comp(f,f,x);
}
... twice(double) ... Attempt 2: Cannot partially apply
int twice (int f(int), int x) {
return comp(f,f,x);
}
... twice(double,??) ... The essential feature of functional languages is not anonymous functions (via λ) but partial application, forming a new function by giving less arguments than the function arity.
Example: The increment function:
let plus x y = x + y in plus 1Can be written as anonymous function λ y → 1 + y.
In λ x → x y, variable y is considered
free while x is bound by the
λ.
In let x = y in x, variable y is free and
x is bound by the let.
Both let and λ are binders.
Free variable computation:
FV(x) = {x}
FV(f e) = FV(f) ∪ FV(e)
FV(λ x → e) = FV(e) \ {x}
FV(let x₁=e₁;...;xₙ=eₙ in f) = FV(e₁,...,eₙ) ∪ (FV(f) \ {x₁,...,xₙ})
An expression without free variables is called closed,
otherwise open.
For closed expression, we are interested in their value.
Running example continued:
let ... in double 2 has value 4let ... in twice double 2 has value 8let ... in twice twice double 2 has value ??let ... in twice double has value λ x → e
where e computes 4*x.λ x → (x + x) + (x + x).)A value v (in our language) can be a numeral
(int value) or a λ (function value).
OBS! Some closed expressions do not have a value, e.g.
(λ x → x x) (λ x → x x)
How to compute the value of an expression?
Example (fix-point combinator)
Y f = (λ x → f (x x)) (λ x → f (x x))
Y f = f (Y f) = f (f (Y f)) = f (f (f (Y f))) ...
Example (faculty function)
f n = if n <= 1 then 1 else n * f (n-1)
Y f n = n!
Russel paradox: read application x y as
y ∈ x.
Define the Russel set x ∈ R if x ∉ x:
R x = not (x x)
Does R contain itself?
R R = (λ x → not (x x)) (λ x → not (x x))
= not (R R)
= not (not (R R))
= not (not (not ...))
Idea: compute the value by substituting function arguments for function parameters, and evaluating operations.
(λ x → 1 + x) 2
↦ 1 + 2
↦ 3
let x = 2 in 1 + x
↦ 1 + 2
↦ 3
This is formally a small-step semantics e ↦ e',
called reduction.
Reduction rules:
(λ x → f) e ↦ f[x=e] -- named β by Alonzo Church
let x = e in f ↦ f[x=e]
let x₁=e₁;...;xₙ=eₙ in f ↦ f[x₁=e₁;...;xₙ=eₙ]The lhs (left hand side) is called a redex (reducible expression) and the rhs (right hand side) its reduct.
Strategy question: where and under which conditions can these rules be applied?
Example:
(λ y f → (λ x → x) f (f y)) ((λ z → z) 42)
Substitution f[x=e] of course does not substitute
bound occurrences of x in f:
(λ x → (λ x → x)) 1
↦ (λ x → x)[x=1]
= (λ x → x)
Still substitution has some pitfalls related to shadowing:
Example: let x = 1 in (λ f x → f x) (λ y → x)
Reducing the let first:
let x = 1 in (λ f x → f x) (λ y → x)
↦ ((λ f x → f x) (λ y → x))[x=1]
= (λ f x → f x) (λ y → 1)
↦ (λ x → f x)[f = λ y → 1]
= λ x → (λ y → 1) x
↦ λ x → 1[y=x]
= λ x → 1Reducing the λ first:
let x = 1 in (λ f x → f x) (λ y → x)
↦ let x = 1 in (λ x → f x)[f = λ y → x]
= let x = 1 in λ x → (λ y → x) x
↦ let x = 1 in λ x → x[y=x]
= let x = 1 in λ x → x
↦ (λ x → x)[x=1]
= λ x → xWhat goes wrong here?
Variable capture problem in
(λ x → f x)[f = λ y → x]:
Naive substituting under a binder can produce meaningless results.
Here, the free variable x is captured by the binder
λ x.
Solutions:
Never substitute open expressions under a binder.
For evaluation of closed expressions, we can use strategies that
respect this imperative.
E.g., call-by-value, call-by-name.
Rename bound variables to avoid capture.
(λ x → f x)[f = λ y → x]
= (λ z → f z)[f = λ y → x]
= (λ z → (λ y → x) z)
↦ λ z → xConsistently renaming bound variables does not change the meaning of an expression.
There are terms that reduce to themselves!
(λ x → x x) (λ x → x x)
↦ (x x)[x = (λ x → x x)]
= (λ x → x x) (λ x → x x)
Such terms do not have a value.
We use the notation let γ in e where γ is a
list of bindings xᵢ = eᵢ.
This list may be called environment.
Just like for C--, we can give a big step semantics
γ ⊢ e ⇓ v for the lambda-calculus. In terms of reduction
this should mean:
(let γ in e) ↦* v
The big-step semantics corresponding to the call-by-value strategy
uses an environment of closed values, i.e., γ is
of the form x₁=v₁,...,xₙ=vₙ.
A value v is a closed expression which is
nlet δ in λ x → f.The latter is called a closure and may be written
⟨λx→f; δ⟩ or (λx→f){δ} (IPL book).
Variable:
------------
γ ⊢ x ⇓ γ(x)
Let:
γ ⊢ e₁ ⇓ v₁
γ,x=v₁ ⊢ e₂ ⇓ v₂
-------------------------
γ ⊢ let x = e₁ in e₂ ⇓ v₂
Lambda:
--------------------------------
γ ⊢ (λx → f) ⇓ (let γ in λx → f)
Application:
γ ⊢ e₁ ⇓ let δ in λx → f
γ ⊢ e₂ ⇓ v₂
δ,x=v₂ ⊢ f ⇓ v
-----------------------------
γ ⊢ e₁ e₂ ⇓ v
Exercise: Evaluate
(((λ x₁ → λ x₂ → λ x₃ → x₁ + x₃) 1) 2) 3
Rules for integer expressions:
---------
γ ⊢ n ⇓ n
γ ⊢ e₁ ⇓ n₁
γ ⊢ e₂ ⇓ n₂
---------------- n = n₁ + n₂
γ ⊢ e₁ + e₂ ⇓ n
Drawback of call-by-value: unused arguments are still evalutated.
Example:
(\ y -> 1) (twice twice double 2)
The result is 1, but call-by-value will first compute
the value of twice twice double 2.
Idea: only evaluate expressions when needed.
Call by name differs from call-by-value by not evaluating arguments
when calling functions, but to form a closure. An environment entry
c is now itself a closure let δ in e where
environment δ is of the form
x₁=c₁,...,xₙ=cₙ.
The evaluation judgement is still of the form
γ ⊢ e ⇓ v.
Variable:
δ ⊢ e ⇓ v
---------- γ(x) = let δ in e
γ ⊢ x ⇓ v
Let:
γ,x=c ⊢ e₂ ⇓ v₂
------------------------- c = let γ in e₁
γ ⊢ let x = e₁ in e₂ ⇓ v₂
Lambda (unchanged):
--------------------------------
γ ⊢ (λx → f) ⇓ (let γ in λx → f)
Application:
γ ⊢ e₁ ⇓ let δ in λx → f
δ,x=c ⊢ f ⇓ v
----------------------------- c = let γ in e₂
γ ⊢ e₁ e₂ ⇓ v
Rules for integer expressions: (unchanged).
Comparing cbn (call-by-name) with cbv (call-by-value):
Arguments are only evaluated when needed.
This is an advantage if an argument is unused or only used under rare
conditions.
E.g.:
e₁ && e₂ = if e₁ then e₂ else false
Arguments are every time evaluated when
needed.
This is a disadvantage if an argument is used twice or more.
E.g.:
double x = x + x
Synthesis of call-by-name and call-by-value: call-by-need (Haskell)
Call-by-need example:
twice twice double 2
double x = add x x
comp f = \ g x -> f (g x)
twice f = comp f fStep visualization with https://github.com/well-typed/visualize-cbn:
Literature to study call-by-need:
big-step semantics:
John Launchbury:
A Natural Semantics for Lazy Evaluation.
POPL 1993: 144-154
small-step semantics:
Peter Sestoft:
Deriving a Lazy Abstract Machine.
J. Funct. Program. 7(3): 231-264 (1997)