Tuesday, March 30, 2010

The DeadEnds Interpreter, Part Two

Structure of the DeadEnds Interpreter
The DeadEnds interpreter has two phases. The first parses DeadEnds program files and converts their contents into an internal format that can be interpreted by the second phase. A DeadEnds program file is a sequence of program modules (functions and procedures) and global variables written in the DeadEnds programming language. DeadEnds programs can have include statements that indicate other DeadEnds program files that must be processed. The included files are located and processed also. This phase is similar to a language compiler, except that a compiler converts programs into machine instructions that can run directly on a processor.

The second phase runs (executes, interprets) the programs that were prepared by the first phase. The internal form of programs consists of a global symbol table, and a list of program modules in internal form. The internal modules are composed of statement and expression objects. This phase runs DeadEnds programs by interpreting program modules by evaluating the expressions and interpreting the statements within a context provided by a chain of symbol tables that track the state of the running programs.

The DeadEnds language parser is generated by Bison (a public domain version of Yacc), a program that reads a file that defines both the grammar of the DeadEnds language and the semantic actions that convert DeadEnds programs into trees of internal objects that are interpreted later by the second phase. Bison generates the code that parses the DeadEnds language and performs the actions. Normally Yacc and Bison generates C code, but the version of Bison on Mac OS X systems can generate Objective-C code. DeadEnds uses this feature because the actions are written in Objective-C, and they create trees of Objective-C objects.

DeadEnds Interpreter Grammar
Here is the context free grammar of the DeadEnds programming language taken from the DeadEnds Yacc file. Words in lower case are nonterminals, and those in upper case are the terminals or tokens. Characters in single quotes are literal terminals.
  • definitions : definition | definitions definition
  • definition : declaration | module | INCLUDE SCONS
  • declaration : TYPE identifiers ';'
  • module : TYPE IDEN '(' parameterso ')' block
  • identifiers : IDEN initializero | identifiers ',' IDEN initializero
  • initializero: initializer | /* empty */
  • initializer : ASGNOP initialvalue
  • initialvalue: expression | '{' expressionso '}'
  • parameterso : parameters | /* empty */
  • parameters : parameter | parameters ',' parameter
  • parameter : TYPE IDEN
  • statementso : statements | /* empty */
  • statements : statement | statements statement
  • statement : expression ';' | declaration | IF '(' expression ')' statement | IF '(' expression ')' statement ELSE statement | WHILE '(' expression ')' statement | DO statement WHILE '(' expression ')' ';' | block | RETURN expressiono ';' | CONTINUE ';' | BREAK ';'
  • block : '{' statementso '}'
  • expression : IDEN | ICONS | FCONS | SCONS | boolconstant | '(' expression ')' | IDEN '(' expressionso ')' | expression ADDOP expression | expression MULOP expression | expression RELOP expression | expression EQOP expression | expression ANDOP expression | expression OROP expression | expression '=' expression | expression '.' member | ADDOP expression | NOTOP expression
  • member : IDEN | IDEN '(' expressionso ')'
  • expressiono : expression | /* empty */
  • expressionso: expressions | /* empty */
  • expressions : expression | expressions ',' expression
  • boolconstant : TRUECONS | FALSECONS
Those programmers who have used the LifeLines programming language will see that I haven't included any of the specialized for-statements. These will be included in an update soon.

No comments:

Post a Comment