Programming Language Technology, DAT151/DIT231
We implement interpretation of expressions and statements via Haskell functions:
eval
: evaluate an expressionexec
: execute a statementexecs
: execute a statement sequenceThe interpreter need some structure for the following tasks:
readInt
,
printInt
),Env
),return
a value (type Val
) from the middle
of a computation, andSig
) that maps function names to their definition.We can implement these structures as follows:
For input/output, there is the Haskell IO
monad.
State threading: pass in the environment "before", return the environment "after".
Exception: the return value is an handled as exceptional
value,
propagated through the computation, caught at the function
call.
Supply in the function signature as separate parameter; no need
to return it, since it does not change.
(Alternatively, we could make the function signature part of the
environment and just never change it.)
The return
exception can be thrown only when
statements are executed.
The considerations above suggest the following type signatures
eval :: Exp -> Sig -> Env -> IO (Val, Env)
exec :: Stm -> Sig -> Env -> IO (Either Val Env)
execs :: [Stm] -> Sig -> Env -> IO (Either Val Env)
and these implementations fragments:
Binary operations
EBinOp op e1 e2) sig env = do
eval (<- eval e1 sig env
(v1, env1) <- eval e2 sig env1
(v2, env2) <- binOp op v1 v2 -- May throw runtime exception in IO (division by 0)
v return (v, env2)
Statement sequences
= return (Right env)
execs [] sig env :ss) sig env = do
execs (s<- exec s sig env
r case r of
Left v -> return (Left v)
Right env1 -> execs ss sig env1
Statement: while
loop
SWhile e s) sig env = do
exec (<- eval e sig env
(v, env1) if isFalse v then return (Right env1)
else execs [SBlock [s], SWhile e s] sig env1
Return statement
SReturn e) sig env = do
exec (<- eval e sig env
(v, _) return (Left v)
Function call
ECall f es) sig env = do
eval (<- evals es sig env
(vs, env') let (Def t _ params ss) = lookupFun f sig
<- execs ss sig (makeEnv params vs)
r case r of
Left v -> return (v, env')
Right _ -> runtimeError "Function did not return anything"
Value lists
evals :: [Exp] -> Sig -> Env -> IO ([Val], Env)
= return ([], env)
evals [] sig env :es) sig env = do
evals (e<- eval e sig env
(v , env1) <- evals es sig env1
(vs, env2) return (v:vs, env2)
Using evals
, we can hide the state threading in the case
of binary operators:
EBinOp op e1 e2) sig env = do
eval (<- evals [e1,e2] sig env
([v1,v2], env') <- binOp op v1 v2
v return (v, env')
Remark: The matching [v1,v2]
is total
because the output list of evals
has the same length
n
as the input list.
However, GHC does not see that and might complain.
We would need dependent types to communicate the invariant:
evals :: Vector Env n -> Sig -> Env -> IO (Vector Val n, Env)
This is possible in GHC but a bit tricky and clearer in languages like Agda.
The approach so far is a bit pedestrian and slightly error
prone.
In particular, we need to be very careful to pass the right environment
to the recursive call when several environments are in scope.
However, you may be content with this style of implementation, since everything is explicit and the code clearly says what is going on.
The code though exhibits common programming patterns that are used all the time in Haskell and can be automated using standard monads and monad transformers. Using monad transformers, we can remove some clutter from our code and hide it behind the standard monad combinators.
In the remainder of this text, we step-by-step identify the common programming patterns and how to encapsulate them by monad transformers.
A monad m
is a type constructor that implements
the Monad
type class.
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
For example, IO
is a monad, and we assume that the
reader is familiar with the IO
monad already.
The return
method embeds a pure computation of type
a
into the monadic computation.
The bind method ma >>= k
is supposed to run
computation ma
and feed its result of type a
to continuation k
.
The effects of ma >>= k
are the effects of
ma
and k a
combined.
We can think of bind in the terms of do
notation:
>>= k = do
m <- m
a k a
In reality, it is the other way round: do
is desugared
to uses of bind.
A monad transformer t
takes a monad
m
and makes a new monad t m
,
usually adding some effects on top of the ones of m
.
This can be explained best with instances of Monad
.
The library Control.Monad.Except
defines the exception
monad transformer. In the library, it is a newtype
, here we
just use a type
synonym to simplify things.
type ExceptT e m a = m (Either e a)
A computation in ExceptT e m a
will return a result of
Either e a
,
where Left
alternatives of type e
are
considered errors, and Right
alternatives of type
a
regular results. Further, the effects of the inner monad
m
can happen. For now, let us think of m = IO
,
but m
can be any monad.
The monad implementation embeds pure computations into
Right
and propagates Left
results---discarding
the continuation k
when ma
produced a
Left
.
instance Monad m => Monad (ExceptT e m) where
return a = return (Right a)
>>= k = do
ma <- ma
r case r of
Left e -> return (Left e)
Right a -> k a
We saw this pattern already: in our implementation of
execs
! Hence,
we can simply execs
, making use of the exception monad
transformer:
exec :: Stm -> Sig -> Env -> ExceptT Val IO Env
execs :: [Stm] -> Sig -> Env -> ExceptT Val IO Env
= return env
execs [] sig env :ss) sig env = do
execs (s<- exec s sig env
env1 execs ss sig env1
Raising and catching of errors is encapsulated in the
MonadError
class defined in
Control.Monad.Except
:
class MonadError e m where
throwError :: e -> m a
catchError :: m a -> (e -> m a) -> m a
instance Monad m => MonadError e (ExceptT e m) where
= return (Left e)
throwError e = do
catchError ma h <- ma
r case r of
Left e -> h e
Right a -> return (Right a)
The throwError
method returns a non-regular result
(Left
). The catchError
method takes a
computation ma
and looks at its result.
If it is an exception, the handler h
is invoked. Otherwise,
the result is passed on.
We use throwError
in our execution of
SReturn
:
SReturn e) sig env = do
exec (<- eval e sig env -- Problem here!
(v, _) throwError v
There is a slight problem remaining here: eval
returns
something in the IO
monad, but we are now computing in the
ExceptT Val IO
monad! We shall deal with this problem
shortly.
But first let us observe that we do not really need to catch and
handle errors. Rather, when we call a function, we want to extract the
exceptional value, and then continue in a monad without exceptions. This
is facilitated by the runExceptT
function defined in
Control.Monad.Except
.
runExceptT :: ExceptT e m a -> m (Either e a)
= id runExceptT
In our simplification, the run method is the identity, but in the
true implementation, we unwrap the newtype
:
newtype ExceptT e m a = ExceptT { runExceptT :: m (Either e a) }
Evaluation of function calls uses runExceptT
:
ECall f es) sig env = do
eval (<- evals es sig env
(vs, env') let (Def t _ params ss) = lookupFun f sig
<- runExceptT (execs ss sig (makeEnv params vs))
r case r of
Left v -> return (v, env')
Right _ -> runtimeError "Function did not return anything"
We have observed above the problem of accessing an IO
computation inside a ExceptT Val IO
computation.
Luckily, this problem has been solved by the MonadIO
class
that gives us access to the IO
in any monad m
that supports I/O:
class MonadIO m where
liftIO :: IO a -> m a
instance MonadIO (ExceptT e IO) where
= do
liftIO io <- io
a return (Right a)
The method liftIO
provided by MonadIO
can
be generalized to a lift
method that gives access to the
functionality of the inner monad m
in a transformed monad
t m
.
Module Control.Monad.Trans
defines a type class
MonadTrans
to this end,
which has an instances for ExceptT
:
class MonadTrans t where
lift :: Monad m => m a -> t m a
instance MonadTrans (ExceptT e) where
= do
lift ma <- ma
a return (Right a)
We can then think of MonadIO
being implemented simply
as:
instance MonadIO IO where
= id
liftIO
instance (MonadTrans t, MonadIO m) => MonadIO (t m) where
= lift . liftIO liftIO
We can fix our problem now by using liftIO
(or,
equivalently here, lift
):
SReturn e) sig env = do
exec (<- liftIO (eval e sig env)
(v, _) throwError v
In general, we may want to think about how to run expression evaluation inside the statement execution.
type Eval = IO
type Exec = ExceptT Val IO
liftEval :: Eval a -> Exec a
= liftIO
liftEval
SReturn e) sig env = do
exec (<- liftEval (eval e sig env)
(v, _) throwError v
While this may look like an over-generalization at the moment and
just boilerplate code, we will refine the definition of
liftEval
as we introduce more monad transformers.
The threading of state, passing in an environment and returning a
modified environment, is a pattern that is encapsulated in the state
monad transformer of Control.State.Monad
. Again,
dropping the newtype
wrapping, the definition would be:
type StateT s m a = s -> m (a, s)
This equips a monadic computation m a
with reading from
an extra parameter of type s
, the input state, and
returning an extra result of type s
, the output state.
The state threading is taken care of by the Monad
instance:
instance Monad m => Monad (StateT s m) where
return a = \ s -> return (a, s)
>>= k = \ s -> do
ma <- ma s
(a, s') k a s'
The state s'
returned from ma
(invoked with
the original state s
) is passed on to continuation
k
.
We hand-threaded the environment in a couple of places. All of these places can now benefit for the encapsulation into the state monad:
type Eval = StateT Env IO
evals :: [Exp] -> Sig -> Eval [Val]
= return []
evals [] sig :es) sig = do
evals (e<- eval e sig
v <- evals es sig
vs return (v:vs)
eval :: Exp -> Sig -> Eval Val
EBinOp op e1 e2) sig = do
eval (<- eval e1 sig
v1 <- eval e2 sig
v2 <- liftIO $ binOp op v1 v2
v return v
The use of liftIO
requires StateT s IO
to
implement MonadIO
, or more generally, StateT s
to be a monad transformer:
instance MonadTrans (StateT s) where
= \ s -> do
lift ma <- ma
a return (a, s)
Reading and writing to the state is encapsulated in the
MonadState
class.
class MonadState s m where
get :: m s
put :: s -> m ()
modify :: (s -> s) -> m ()
instance MonadState s (StateT s m) where
= \ s -> return (s, s)
get = \ _ -> return ((), s)
put s = \ s -> return ((), f s) modify f
These services of the state monad are needed when we access variables.
EId x) sig = do
eval (<- get
s return $ lookupVar x s
EAssign x e) sig = do
eval (<- eval e sig
v -> updateVar x v s)
modify (\ s return v
StateT
in
exec
The form of StateT
does not directly match the pattern
of the type of exec
:
exec :: Stm -> Sig -> Env -> ExceptT Val IO Env
We need to formally add a result to statement execution. The result
has no information, though; we use the unit type ()
.
exec :: Stm -> Sig -> Env -> ExceptT Val IO ((), Env)
Now it fits the form of StateT
:
type Exec = StateT Env (ExceptT Val IO)
exec :: Stm -> Sig -> Exec ()
execs :: [Stm] -> Sig -> Exec ()
We can use automatic state threading now also in statement execution:
= return ()
execs [] sig :ss) sig = do
execs (s
exec s sig
execs ss sig
SWhile e s) sig = do
exec (<- liftEval (eval e sig)
v if isFalse v then return ()
else execs [SBlock [s], SWhile e s] sig
The definition of liftEval
has to be updated:
liftEval :: Eval a -> Exec a
= \ s -> do
liftEval ev <- lift (ev s)
(a, s') return (a, s')
This is simply:
liftEval :: Eval a -> Exec a
= \ s -> lift (ev s) liftEval ev
However, this definition does not fly with the newtype
wrapping of StateT
in Control.Monad.State
:
newtype StateT s m a = StateT { runStateT :: s -> m (a, s) }
We need to wrap and unwarp StateT
in the right
places:
liftEval :: Eval a -> Exec a
= StateT (\ s -> lift (runStateT ev s)) liftEval ev
Point-free:
= StateT (lift . runStateT ev) liftEval ev
Finally, we want to get rid of passing the signature sig
around explicitly. One solution is to make it part of Env
.
Or, we can employ the reader monad transformer.
type ReaderT r m a = r -> m a
This is simpler than the state monad: we only pass a state
r
in,
but do not return one.
instance Monad m => Monad (ReaderT r m) where
return a = \ r -> return a
>>= k = \ r -> do
ma <- ma r
a
k a r
instance MonadTrans (ReaderT r) where
= \ r -> ma lift ma
Bind simply distributes the read-only state r
to both
ma
and k
.
The embeddings return
and lift
simply ignore
r
.
Accessing the state is facilitated through the
MonadReader
class:
class MonadReader r m where
ask :: m r
local :: (r -> r) -> m a -> m a
instance MonadReader r (ReaderT r m) where
= \ r -> return r
ask = \ r -> ma (f r) local f ma
The method ask
returns the state, and
local f ma
modifies if via f
for a
subcomputation ma
. Modifying a read-only state seems
paradoxical, but note that the modification only applies to the
continuation ma
. Any computation following a
local
in sequence will receive the original, unmodified
state. We shall not need local
in interpreter nor
type-checker.
The official definition of ReaderT
uses a
newtype
:
newtype ReaderT r m a = ReaderT { runReaderT :: r -> m a }
The projection
runReaderT :: ReaderT r m a -> r -> ma
allows us to
unwind the reader component by supplying an initial state
r
.
The final definitions of the monads for evaluation and execution are thus:
type Eval = ReaderT Sig (StateT Env IO)
type Exec = ReaderT Sig (StateT Env (ExceptT Val IO))
The lifting of Eval
computations into Exec
computations needs to unwind the reader and state layers and restore
them after the lift
for ExceptT
.
liftEval :: Eval a -> Exec a
= ReaderT (\ r -> StateT (\ s -> lift (runStateT (runReaderT ev r) s))) liftEval ev
Its "inverse" is needed to call execs
from the evaluator
of function calls:
runExec :: Exec a -> Eval (Either Val a)
The idea is that exceptions thrown in a computation
Exec a
should be transformed into a Left
result in the Eval
monad, whereas regular termination of
the computation manifests as Right
result. The problem is
that exceptions do not carry the state, because
StateT Env (ExceptT Val IO) a = Env -> IO (Either Val (a, Env)).
If we look at the relevant part of Eval (Either Val a)
,
this is
StateT Env IO (Either Val a) = Env -> IO (Either Val a, Env).
So runExec
needs a state to continue with after the
exception. In our case, we can even reset the state after a regular
termination of the Exec a
computation, since we do not need
the state of a function's locals after the function returns.
runExecAndReset :: Exec a -> Env -> Eval (Either Val a)
=
runExecAndReset ex env ReaderT (\ r ->
StateT (\ s -> do
<- runExceptT (runStateT (runReaderT ex r) s)
r case r of
Left v -> return (Left v, env)
Right (a, _s) -> return (Right a, env)))
This gives the following implementation of the function call:
ECall f es) = do
eval (<- evals es
vs <- ask
sig let (DFun _ _ params ss) = lookupFun f sig
<- get
env
put (makeEnv params vs)<- runExecAndReset (execs ss) env
r case r of
Left v -> return v
Right() -> runtimeError "Function did not return anything"
We developed Exec
and Eval
to capture the
patterns found in our initial solution that used only the
IO
monad.
Of course, it is possible to use the Exec
monad also for
the evaluator, even though the evaluator never raises an
exception.
In this case, there is no need for liftEval
nor
runExec
.
Instead, we use catchError
instead of runExec
in the definition of eval (ECall f es)
.
This simplification is recommended, as the resulting interpreter implementation is easier to understand.
Now that we have completely "monadified" our interpreter,
eval :: Exp -> Eval Val
exec :: Stm -> Exec ()
what have we gained?
Well, we have removed clutter from our implementation and the chance of
introducing bugs in the boilerplate code.
But further, the lifting of eval
and exec
to
lists are now completely generic!
evals :: [Exp] -> Eval [Val]
= mapM eval es
evals es
execs :: [Stm] -> Exec ()
= mapM_ exec ss execs ss
It is not even worthwhile anymore to define evals
and
execs
,
we can simply write mapM eval
and mapM_ exec
directly.