1. The following C code has around six errors (some of them depend on what you count as an error). Locate the errors and explain at which compiler phase they are (or should be) revealed.
/* this is a buggy hello world program int main () { int i ; int j = k + 1 ; int a[] = {1,2,3} j = a + 6 ; a[4] = 7 ; printf("hello world\n) ; }
2. Decode the following representation of a JVM program to show (a) the corresponding assembly code and (b) a Java expression from which it can be obtained.
0001 0000 1111 1111 0001 0000 0010 0000 0110 0100 0001 0000 0000 1001 0110 1000
Use the key in lecture 1. All codes are found in the on-line JVM specification
3. Consider the context-free grammar
SWhile. Stm ::= "while" "(" Exp ")" Stm ; SExp. Stm ::= Exp ";" ; EAss. Exp ::= Ident "=" Exp ; EGt. Exp1 ::= Exp2 ">" Exp2 ; EIdent. Exp2 ::= Ident ; EInt. Exp2 ::= Integer ; _. Exp ::= Exp1 ; _. Exp1 ::= Exp2 ; _. Exp2 ::= "(" Exp ")" ;
Show the parse tree and abstract syntax tree of the string
while (x > y) y = x = 3 ;
4. Write an unambiguous context-free grammar for nonnegative numbers in the decimal notation (0,1,2,...,2007,...). Show the syntax tree of the number 2007.
5. Write a grammar for a subset of unix shell scripts:
|
<
or >
followed by a file name)
-
followed by an identifier
You don't have to specify the structure of identifiers and file names,
but can use the basic categories Ident
and File
.
6. Implement a parser for Lisp in BNFC.
It should be able to parse the code example in
http://lib.store.yahoo.net/lib/paulgraham/jmc.lisp
- which is actually a complete Lisp interpreter!
7. Implement a parser for regular expressions in BNFC. The parser should be able to correctly recognize the expression
(letter | '_') (letter | digit | '_')*
and any other expression combining the same operators. As for precedences, Kleene star binds stronger than sequence, which binds stronger than union.
8. Write a regular expression for comments of the form /*
... */
,
where /*
inside double quotes does not start a comment.
9. Construct an NFA and DFA for comments as in the previous exercise.
10. Write a program that removes tags from an HTML document, using one of the standard lexer tools. Start by writing a regular expression for tags. Test your program with the course web page.
11. Trace the LR parsing of the statement
if (x + y > 4) if (true) {x = y = y - 1 ;} else 8 ;
in the grammar
Stm ::= "if" "(" Exp ")" Stm "else" Stm ; Stm ::= "if" "(" Exp ")" Stm ; Stm ::= "{" [Stm] "}" Stm ::= Exp ";" Exp2 ::= Integer Exp2 ::= Ident Exp1 ::= Exp1 "+" Exp2 Exp1 ::= Exp1 "-" Exp2 Exp ::= Exp1 ">" Exp1 Exp ::= Ident "=" Exp [Stm] ::= [Stm] ::= Stm [Stm] Exp2 ::= "(" Exp ")" Exp1 ::= Exp2 Exp ::= Exp1
Do this by showing the evolution of the stack and the input, and whether shift or reduce actions are taken.
12. Consider the language 'X'*
, i.e. sequences of
symbol X
. Write two context-free grammars for it:
one left-recursive and one right-recursive. With both
grammars, trace the LR parsing of the string XXXX
.
What can you say about the memory consumption of the
two processes?
13. Write a calculator for the four floating point
operations (+ - * /
) by using the semantic actions
in one of the standard parser tools (Happy, CUP, Bison).
You can start by writing an expression grammar in BNFC
and then modify the generated parser file.
To make the usage of the calculator smooth, also
accept input without a decimal point, e.g. 3 * 3.14
.
The usual operator precedences must of course be followed.
14. Semantic actions can be used to make the parser manage languages
that are not context-free. Write a Happy/JLex/Cup program that
implements a parser for a copy language, where valid expressions
are sequences of identifiers (Ident
), with any such sequence
followed by itself:
{x x | x : Ident*}
For instance, foo bar foo foo bar foo
is a valid expression, but
foo bar foo bar foo
isn't. What is the time complexity of the parser?
15. Using the grammar of Question 11,
write a syntax-directed thanslator
that returns a symbol table indicating the number of occurrences of each
identifier that occurs in the program (i.e. a Stm
). For instance, in
the program
{ x = 3 ; x = 2 - x - y ; if (x > y) {x = x-1 ;} }
the identifier x
occurs 6 times and y occurs 2 times.
16. Using the typing rules of Chapter 4, write a derivation of the validity of the statement
while (x > 4) {x = x-1 ;}
You will need a context and a couple of extra typing rules.
17. Write typing rules for the following constructs of C/C++, which are not covered by Lab 2.
e ? e : e
x+=e, x-=e
e[i] = e
18. Trace the evolution of the state (the values of variables) when executing the following program statement by statement.
int main () { int low,hi,mx ; low = 1 ; hi = low ; mx = 20 ; printInt(low) ; while (hi < mx) { printInt(hi) ; hi = low + hi ; low = hi - low ; } }
19. Write the big-step semantic rules for if
statements,
both with and without else
. Take care of all possible
side effects that expressions can have.
20. if
statements without else
are easy to translate into
if
statements with else
. Show how.
But the inverse is more tricky. Show a counterexample to the following translation
if (exp) stm1 else stm2 ==> if (exp) stm1 if (!exp) stm2
and give a one that always works.
21. Consider the subset of JVM consisting of the following instructions.
bipush n -- push byte constant n iadd -- add two integers; pop the operands and push the result imul -- multiply two integers; pop the operands and push the result istore x -- store value in stack address x and pop it iload x -- push value from stack address x dup -- duplicate the top of the stack iprint -- print the top-most integer and pop it (macro for an ``invokestatic/runtime/iprint(I)V``)
Trace the evolution of the stack with the following code.
bipush 6 istore 0 bipush 5 iload 0 dup iadd iadd dup iprint istore 0
Hint: if you do question 5 first, you will get this one for free!
22. Write an interpreter for the subset of JVM consisting of the instructions in the previous question. Notice that you only need integer values in the stack.
You can use pseudocode or real program code. Is your interpreter closer to big step or small step semantics?
23. Write compilation schemes for
invokestatic runtime/iprint(I)V
, which prints
the top of the stack)
Apply these schemes to compile the expression
printInt((1 + 3*5)/2)
24. Design and implement an environment that supports
Show how the environment changes when compiling the code
int i ; int j ; if (i < j) i++ ; else j++ ;
You don't need to show the code yet - this belongs to the later questions.
25. Write compilation schemes for expressions and statements with integer variables:
Be careful to distinguish between pre- and postincrement!
26. Write compilation schemes for
<
and >
&&
and ||
with lazy semantics
while
loops and if-else
statements.
You don't need to think about optimizations.
27. Put everything together and show the code generated from the two programs below. The easiest way to produce the code is certainly by running your compiler!
Once you have a program, called ex5
, you can use it to
generate a class
file by running the script
jass
on it. Then you can execute this file with
java Main
You need all files from the jvm
directory to run jass
(a header, a footer, and a runtime support class).
Here are the two test files. If you haven't written a compiler, you can answer this question by hand-written code.
// fibonacci.cc int main () { int low,hi,mx ; low = 1 ; hi = low ; mx = 500000 ; printInt(low) ; while (hi < mx) { printInt(hi) ; hi = low + hi ; low = hi - low ; } }
// lazy.cc int main () { int i = 1 ; bool b = (++i > 2) && (++i > 0) ; printInt(i) ; // 2 printInt(++i) ; // 3 printInt(i++) ; // 3 printInt(i) ; // 4 b = (i > 5) || (++i > 0) ; printInt(i) ; // 5 }