Variables

Introduction

Over the last few chapters, we have created a compiler which is capable of parsing (and generating executable code for) various kinds of expressions. In the last chapter, we made our compiler capable of understanding statements, which are instructions that expect and act on such expressions. However, one important element is missing from our concept of expressions: the variable.

Variables

Simply put, a variable is a name that represents a value. Wherever a value can be used, a variable can be substituted. Here are some examples using our own expression syntax:

2 + x
"Hello " & Myname

The assumption in the above examples is that a variable called x represents a numeric value, and a variable called Myname represents a string.

How are variables associated with values? The most common way is to perform an operation called assignment, although there may be other methods. Most programming languages support a construct similar to the following:

  <variable> = <value>

Once a value has been assigned to a variable, the variable can be used in any expression where the value could have been used. The advantage is that the same expression can be re-used with varying values. Hence the name: variable.

Types

In practice, variables get a lot more complex. To begin with, variables have types. In our example, the variable x is of the numeric type, and Myname is of string type.

When is the type of a variable determined? There are two common approaches. In the first approach, we can determine the type of a variable at run time, when it is actually used. This is called dynamic typing, and is commonly used in interpreters and so-called "scripting" languages. Examples of languages that use dynamic types are Perl, Python, Ruby and Visual Basic.

In the second approach, the type of a variable is declared as part of writing a program, and is checked by the compiler as part of the compiling process. This is called +static typing_, and is the more commonly used approach in compiled languages. Examples of languages that use static typing are C, Pascal, ML, and Visual Basic.

A variable of a given type can be used in any expression of that type. But what if we try to mix types in expressions, for instance by using a string variable in a numeric expression? Consider the following example, which uses our own Print command and expression syntax:

 Myname = "Raj"
 Myage = 32
 Print "My name is " & Myname & " and my age is " & MyAge

Will this work?

Some languages do not allow expressions which combine different types. Such languages require an explicit conversion from one type to another if mixing of types is required. This kind of language is said to be strongly typed. Examples of strongly typed languages are Haskell, ML, Pascal, C++ (this is disputed by some), and Visual Basic.

Other languages are not so strict. They convert variables of different types implicitly at run time. Such languages are said to be weakly typed. Examples of weakly typed languages are JavaScript ~~(I guess one should say ECMAScript instead)~~, Perl and Visual Basic.

Finally (and this will be the topic of a future chapter), apart from the kind of data stored by a variable, its type also refers to how such data is stored. For example, a variable can store a single value (scalar), multiple values (array or vector), or just a reference to a value stored elsewhere (pointer or reference).

SIC, the language that we are designing, will be statically and strongly typed.

It is a common feature (though not a strict requirement) of statically typed languages that variables are declared before they are used. SIC will have this feature.

Scope

Variables also have scope, which means an enclosing context. Think of scope as being a part of a program, such as a function or a block within a function. The whole program is also a valid scope.

The easiest way to understand the use of scope is this: within the same scope, no two variables can have the same name. But the same name can be used in different scopes, such as in different functions.

Scopes have important connotations for variables, and as such will be the topic of a future chapter. For now, we will assume that all variables that we compile will have a single scope: the whole program. This means that we cannot repeat or re-declare variables.

Goal

Our goal in this chapter is threefold.

Recognize and translate a statement which declares variables, that is, specifies a variable's name and data type. This statement will be in one of the following two forms: Dim <variable> As <type> or <type> <variable>. Dim is a command. Possible types are Integer, String and Boolean.
Recognize and translate an assignment statement, which will associate values with variables. This statement will be in one of the following two forms: <variable> = <value> or <variable> := <value>.
Allow the use of variables in any expressions, in lieu of constants.

The Approach

In terms of scanning and parsing, there doesn't seem to be anything new in these goals. In fact, we already have most of the necessary scanners and recognizers in place. Types, variables, and the Dim command are names, for which we already have a scanner.

We do need some new parsers, though. And our ever-changing line parser needs to change again, because now a line can be any one of the following:

a declaration statement
an assignment statement
a command

But before we get to that, there's that all-important question: how do variables translate down to CIL?

First Step

Once again, the fact that we are compiling to the CLR helps us. The CLR natively understands the concept of variables.

The CLR actually understands three kinds of variables: locals, fields, and arguments. As of right now, we need to look only at local variables, which are variables that are declared and used within a single scope.

To work with variables, we will need to use a class called LocalBuilder in the Reflection Emit namespace, and two new Opcodes: ldloc and stloc.

Modifying CodeGen

Add the following code to CodeGen.vb.

Private m_LocalVariables As _
      New Dictionary(Of Integer, LocalBuilder)

Public Function DeclareVariable( _
                    Name As String, _
                    VariableType As System.Type _
                ) As Integer 

  Dim lb As LocalBuilder 

  lb = m_ILGen.DeclareLocal(VariableType)
    m_LocalVariables.Add(lb.LocalIndex, lb)

  Return lb.LocalIndex
End Function

So far, all our code generation has consisted of calling the Emit method of the field m_ILGen, which is an object of type System.Reflection.Emit.ILGenerator. This time, we are doing something different - we're declaring a variable. This does not generate any code immediately. Instead, the DeclareLocal method of m_ILGen returns a LocalBuilder object. The LocalBuilder keeps track of a local variable's type, and its index. The index of a variable is its position in the scope where it is declared. The first variable declared in a scope will have an index or 0, the second, 1 and so on.

So, we have created a function called DeclareVariable, which takes a name and a type. It uses the type to create a LocalBuilder, and stores it in a dictionary using the index as the key. It then returns the index. Notice that we are not doing anything with the name, yet.

The CLR, in fact, does not care about names of variables at all. It just needs index numbers and types, which the LocalBuilder objects keep track of for us.

Now, we need to actually emit code that uses these local variables. As mentioned earlier, this means using two new OpCodes.

OpCode	What it does
StLoc	Takes a variable index as a parameter. Pops the last value off the stack, and stores it in that variable. If the types do not match, there is a run-time error.
LdLoc	Takes a variable index as a parameter. Retrieves the value stored in that variable, and pushes it on to the stack.

Add the following to CodeGen.vb.

Public Sub EmitStoreInLocal( _
             Index As Integer _
           )

    Dim lb As LocalBuilder 

    lb = m_LocalVariables(Index)
    m_ILGen.Emit(OpCodes.Stloc, lb)

End Sub

Public Sub EmitLoadLocal( _
             Index As Integer _
           )

    Dim lb As LocalBuilder 

    lb = m_LocalVariables(Index)
    m_ILGen.Emit(OpCodes.Ldloc, lb)

End Sub

Pretty straightforward. Either method receives an Index, retrieves the corresponding LocalBuilder from the dictionary, and uses that with the relevant Opcode.

That's all CodeGen requires to deal with variables. Now, we move on to the parsing part, which is significantly more difficult.

The Symbol Table

To begin with, while variable names are not important to code generation, they are certainly important during parsing. We will encounter variable names at various places during the parsing stage. It is up to us to remember what the name represents, and generate appropriate code.

This is different from anything else we have parsed so far. Up until now, our approach has been to generate code for whatever we parse immediately. And this has not been a problem, because the meaning of anything we parse has been static. For instance, as long as we are parsing a numeric expression, if we meet the number 11 anywhere, all we need to do is call EmitNumber in CodeGen.

We can't do that with variable names. If we find the name in a declaration statement, we have to call DeclareVariable. If it appears in an assignment statement, we have to call EmitStoreInLocal. If we find it in an expression, EmitLoadLocal is to be called. The concept is not called variable for nothing.

Here's another twist. Variable names can appear within expressions, right? The question is: what type of expression? Or more correctly, what type of variable can appear in what type of expression? The answer, of course, is that the types of the variable and the expression should be the same. Who ensures this? Not CodeGen. Look at the implementation of EmitLoadLocal. Does it do any kind of type checking? It's up to the parser to ensure that the correct variable gets used in the correct type of expression.

If we do generate code that uses the wrong type of variable in a given expression, the results can be unpredictable. By and large, the fact that the CLR can verify code generated by us will protect us from crashing the whole machine. But, recall what we said in Chapter 1. The preferred way of dealing with errors is not producing them. We have to ensure type correctness while compiling. Fortunately, we made the decision to make SIC a strongly and statically typed language. This will make ensuring type correctness a bit easier.

As per our first goal, when we declare a variable, we also declare what type it is. We will need to keep track of the variable's name and type (among other things). We will do this by maintaining a table of variables. When we parse a declaration statement, we will add the variable's name and type to this table. When a variable appears in an expression, we will look it up in the table, and check whether its type matches the type of the expression. If not, we will produce a compile-time error. We will also produce an error if a variable name used in an expression is not found in the table, thus enforcing the rule "A variable has to be declared before it can be used".

Variables are one kind of names that can be used in lieu of constants; examples of other kinds are arrays and functions. Names that represent values are collectively referred to as symbols in compiler construction arcana. So, we need to build a symbol table into our parser.

Add the following to Utilities.vb.

Public Class Symbol
    Private m_Name As String = ""
    Private m_Type As Type = Nothing
    Private m_CodeGenHandle As Integer = -1 

    Public Sub New(newname as String, newtype As Type)
        m_Name = newname
        m_Type = newtype
    End Sub

    Public ReadOnly Property Name() As String
        Get
            Return m_Name
        End Get        
    End Property 

    Public ReadOnly Property Type() As Type
        Get
            Return m_Type
        End Get
    End Property 

    Public Property Handle() As Integer
        Get
            Return m_CodeGenHandle
        End Get

        Set(ByVal Value As Integer)
            m_CodeGenHandle = Value
        End Set
    End Property
End Class

Public Class SymbolTable
    Private m_symbolTable As New Dictionary(Of String, Symbol)

    Private Sub Add(symbol As Symbol)
        m_SymbolTable.Add( _
                        symbol.Name.ToLowerInvariant(), _
                        symbol
        )
    End Sub

    Public Function Fetch(name as String) As Symbol
        Return m_SymbolTable(name.ToLowerInvariant())
    End Function

    Public Function Exists(name as string) As Boolean
        Return m_SymbolTable.ContainsKey(name.ToLowerInvariant())
    End Function
End Class

The class called Symbol represents a single symbol, storing its name, type and handle. The handle is the CodeGen representation of a symbol, which is the index of the local variable.

The class called SymbolTable simply maintains a dictionary of symbols.

Next, add the following to Parser.vb, in the Fields section:

Private m_SymbolTable As New SymbolTable

Types

Symbols are the second kind of names that we have to work with (Commands were the first). In our goals above, we have defined a third kind: types.

Type names will appear in source code, specifically in declaration statements. We will need to validate them, and convert them to the compiler-internal representation, which is System.Type. Let's write some boilerplate code for doing this.

The approach we will take is similar to commands: have a table of valid type names and their System.Type equivalents.

Add the following to Commands.vb.

Private m_typeTable As Dictionary(Of String, Type)

Private Sub AddType(typeName As String, type As Type)
    m_typeTable.Add(typeName.ToLowerInvariant(), type)
End Sub

Private Function IsValidType(typeName As String) As Boolean
    Return m_typeTable.ContainsKey(typeName.ToLowerInvariant())
End Function

Private Function GetTypeForName(typeName as String) As Type
    Return m_typeTable(typeName.ToLowerInvariant())
End Function

Private Sub InitTypes()
    m_typeTable = new Dictionary(Of String, Type)

    ' Add types here
    AddType("integer", GetType(System.Int32))
    AddType("string", GetType(System.String))
    AddType("boolean", GetType(System.Boolean))
End Sub

Put the fields in the Fields region, and the methods in the Helper Functions region.

We need to call the InitTypes method that we just created before doing any parsing. Like in the case of commands, we will call it from the constructor of the Parser class. Make the change in Parser.vb.

Public Sub New( _
    ByVal newStream As TextReader, _
    ByVal newGen As CodeGen _
    )

    m_InputStream = newStream
    m_Gen = newGen

    InitTypes()
    InitCommands()
End Sub

With all that infrastructure in place, we are finally ready to do some parsing.

Parsing Declaration

Goal 1 above gives us two syntaxes for a declaration statement. Here's the BNF:

<declarationstatement>     ::= <dimcommand>|<typefirstdeclaration>
<dimcommand>               ::= "Dim" <name> ["As"] <type>
<type>                     ::= "integer"|"string"|"boolean"
<typefirstdeclaration>     ::= <type> <name>

I have made one small enhancement; the word "As" in optional in the <dimcommand> production.

Dim is a command, like others we have already parsed. So let's write a parser for it.

Add the following to Commands.vb, in the Commands region.

Public Function ParseDimCommand() As ParseStatus
    Dim result As ParseStatus

    ' Read a variable name
    SkipWhiteSpace()
    ScanName()

    If TokenLength = 0 Then
        result = CreateError(1, "a variable name.")
    Else
        Dim varname As String
        varname = CurrentToken

        If m_SymbolTable.Exists(varname) Then
            ' Variable name already declared
            result = CreateError(3, CurrentToken)
        Else
            ' Read either "as" or type
            SkipWhiteSpace()
            ScanName()

            ' Check and ignore "As"
            If CurrentToken.ToLowerInvariant() = "as" Then
                ' Read type
                SkipWhiteSpace()
                ScanName()
            End If

            Dim typename As String
            typename = CurrentToken

            If TokenLength = 0 OrElse _
                    Not IsValidType(typename) Then
                result = CreateError(1, "a valid type.")
            Else
                Dim symbol As New Symbol( _
                                    varname,
                                    GetTypeForName(typename)
                )

                symbol.Handle = m_Gen.DeclareVariable( _
                                            symbol.Name, _
                                            symbol.Type
                                )

                m_SymbolTable.Add(symbol)

                SkipWhiteSpace()

                If Not EndOfLine Then
                    result = CreateError(1, "end of statement.")
                Else
                    result = CreateError(0, "Ok")
                End If
            End If
        End If
    End If

    Return result
End Function

It reads pretty much like the BNF. Scan a variable name, then "As", and then a type name, with appropriate validations. If all is well, create a symbol, call DeclareLocal in CodeGen, and add the symbol to the symbol table.

Note the new error value of 3. If a variable name has already been declared, we do not allow re-declaration. We will have to modify CreateError to accommodate the new error. We will do that in a bit. For now, let us add the Dim command to the list of valid commands. In fact, let me make an enhancement here: we can use Var as a synonym for Dim.

Add the following to the InitCommands method in Commands.vb.

AddCommand("dim", AddressOf ParseDimCommand)
AddCommand("var", AddressOf ParseDimCommand)

And that's how easy it is to add a synonym to one of our commands; just use the same command parser.

Now, what about the second syntax, shown as <typefirstdeclaration> in the BNF above?

A parser is easy: just follow the BNF. Assume that a type name has been read, and proceed from there. Add the following to Parser.vb.

Private Function ParseTypeFirstDeclaration() _
                    As ParseStatus
    Dim result As ParseStatus

    Dim typename As String
    typename = CurrentToken

    If IsValidType(typename) Then
        ' Read a variable name
        SkipWhiteSpace()
        ScanName()

        If TokenLength = 0 Then
            result = CreateError(1, "a variable name.")
        Else
            Dim varname As String
            varname = CurrentToken

            If m_SymbolTable.Exists(varname) Then
                result = CreateError(3, varname)
            Else
                Dim symbol As New Symbol( _
                                    varname, _
                                    GetTypeOfName(typename)
                )

                symbol.Handle = m_Gen.DeclareVariable(
                                    symbol.Name,
                                    symbol.Type
                                )

                m_SymbolTable.Add(symbol)

                SkipWhiteSpace()

                If Not EndOfLine Then
                    result = CreateError(1, "end of statement.")
                Else
                    result = CreateError(0, "Ok")
                End If
            End If
        End If
    Else
        result = CreateError(1, "a valid type.")
    End If

    Return result
End Function

As things stand, this parser is not called from anywhere. We will deal with that, and the error value 3, shortly. For now, let us work on Goal 2.

Parsing Assignment

Goal 2 gives us the syntax for an assignment statement. Here's the BNF:

<assignmentstatement>         ::= <name> <assignmentoperator> <expression>
<assignmentoperator>          ::= "="|":="

We have to validate that the variable <name> exists, and that the type of <expression> matches the type of the variable. We already have most of the plumbing required, except a scanner for the assignment operator. Let's write that first.

Add the following to the appropriate regions of Parser.vb:

Private Function IsAssignmentCharacter(Byval c As Char) As Boolean
    Return ":=".IndexOf(c) > -1
End Function

Private Sub ScanAssignmentOperator()
    m_CurrentTokenBldr = New StringBuilder
    Do While IsAssignmentCharacter(LookAhead)
        m_CurrentTokenBldr.Append(LookAhead)
        m_CharPos += 1
    Loop

    Select Case CurrentToken
        Case "=", ":="
            ' Valid Assignment operator
        Case Else
            m_CurrentTokenBldr = New StringBuilder
    End Select
End Sub

With that in hand, let's write the parsers.

Add the following to Parser.vb:

Private Function ParseAssignment(variable As Symbol) _
                    As ParseStatus

    Dim result As ParseStatus

    SkipWhiteSpace()

    Select Case variable.Type.ToString()
        Case "System.Int32"
            result = ParseNumericExpression()
        Case "System.String"
            result = ParseStringExpression()
        Case "System.Boolean"
            result = ParseBooleanExpression()
        Case Else
            result = CreateError(1, " a valid type.")
    End Select 

    If result.Code = 0 Then
        ' Generate assignment code
        m_Gen.EmitStoreInLocal(variable.Handle)
    End If

    Return result
End Function

Private Function ParseAssignmentStatement() _
                    As ParseStatus

    Dim result As ParseStatus

    Dim varname As String
    varname = CurrentToken

    If m_SymbolTable.Exists(varname) Then
        ' Read assignment operator
        SkipWhiteSpace()
        ScanAssignmentOperator()

        If TokenLength = 0 Then
            result = CreateError(1, "= or :=")
        Else
            Dim variable As Symbol
            variable = m_SymbolTable.Get(varname)

            result = ParseAssignment(variable)
        End If
    Else
        result = CreateError(4, varname)
    End If

    Return result
End Function

We have split the job of assignment into two. ParseAssignmentStatement checks the validity of the statement itself, starting from the first token read, which should be a name that exists in the symbol table. This should be followed by an assignment operator. If this much is valid, ParseAssignment is invoked, passing it the symbol. Note the new error code, 4, which signals that a variable was not declared.

ParseAssignment checks the type of the symbol, and invokes the matching expression parser. Any type errors would be caught by code we have already written. If there are no errors, the last bit of code generated would have pushed the result of the expression on the stack. So all we need to do is emit the StLoc OpCode (via EmitStoreInLocal of CodeGen), which will pop the value and store it in the relevant variable.

A New Line (Again)

Now that we have parsers for declaration and assignment, we need someone to call them. That someone, inevitably, is ParseLine.

In the last chapter, we had defined a line to be one of our commands. Now, a line can either be a command (that includes the Dim command, which declares a variable), a type-first declaration, or an assignment statement. Here's the BNF:

<line>               ::= <command>|<typefirstdeclaration>|<assignmentstatement>

Lastly, a line can also be inside a comment block, in which case it can be anything.

So here's what ParseLine will do. We will scan a name. If the name is a valid command, we will invoke ParseCommand. If we are in a comment block, we will silently ignore the rest of the line. If it is a valid type name, we will invoke ParseTypeFirstDeclaration. In all other cases, we will invoke ParseAssignmentStatement.

Make the change in Parser.vb.

Private Function ParseLine() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()

    m_LastTypeProcessed = Nothing

    If EndOfLine() Then
        ' An empty line is valid
        result = CreateError(0, "Ok")
    Else
        ScanName()

        If TokenLength = 0 Then
            result = CreateError(1, "a statement.")
        Else
            Dim name As String
            name = CurrentToken

            If IsValidCommand(name) Then
                result = ParseCommand()
            ElseIf m_inCommentBlock Then
                result = Ok()
            ElseIf IsValidType(name) Then
                result = ParseTypeFirstDeclaration()
            Else
                result = ParseAssignmentStatement()
            End If
        End If
    End If

    Return result
End Function

Error, Error On The Wall

We have introduced two new error values: 3 for variable already declared and 4 for variable not declared. In fact, we will need one more shortly, for variable type mismatch. We have to modify CreateError accordingly. Make the change in Parser.vb.

Private Function CreateError( _
    ByVal errorcode As Integer, _
    ByVal errorDescription As String _
    ) As ParseStatus

    Dim result As ParseStatus
    Dim message As String

    Dim errorpos As Integer
    ' Most errors happen
    ' at the scan position
    errorpos = m_CharPos + 1

    Select Case errorcode
        Case -1 ' Block finished
            message = ""
        Case 0  ' All good
            message = "Ok."
        Case 1  ' Expected something
            message = String.Format( _
                        "Expected {0}", _
                        errorDescription _
            )

        Case 2  ' Not in block
            message = errorDescription

            ' Not in block error happens
            ' after End command has been
            ' scanned
            errorpos = errorpos - TokenLength
        Case 3  ' Variable already declared
            message = String.Format( _
                        "Cannot redeclare variable '{0}'.", _
                        errorDescription
            )

            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case 4  ' Variable not declared
            message = String.Format( _
                        "Variable '{0}' not declared.", _
                        errorDescription
            )
            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case 5  ' Variable type mismatch
            message = String.Format( _
                        "Type mismatch for Variable '{0}'.", _
                        errorDescription
            )
            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case Else
            message = "Unknown error."
    End Select

    result = New ParseStatus(errorcode, _
                message, _
                errorpos, _
                m_linePos)


    Return result
End Function

I took this opportunity to add an enhancement to CreateError. Note the use of the variable errorpos above. When we return an error via ParseStatus, we also return the column and line of source code where the error occurs. Most of our errors are detected before we scan a token. In these cases, the error column position is the current scan position (m_CharPos), plus 1 because m_CharPos counts from 0. But some, like "Not in block" and the variable related errors, can only be detected after a token has been read. In these cases, the error column position has to be moved back by the length of the token. That's what gets done in the code above.

Test It

Let's test what we have so far. You may want to review the instructions in the Development Environment chapter.

Compile with:

vbc /out:sicc.exe Compiler.vb Parser.vb Commands.vb CodeGen.vb Utilities.vb

Run using:

sicc.exe

Test it with the following code:

Dim x As Integer
x = 2*(3+56) /4
String y
y := "Hello world"
Var z As Boolean
Z := 1=2

Also try re-declaring a variable, or not declaring a variable before using it, or assigning a wrong type of expression. Our compiler will report the error accurately.

Using variables

Now that we are done with variable declaration and assignment, it's time for Goal 3: using variables in lieu of constants.

So far, SIC understands two types of constants: number and string. We parse them at the lowest levels of the relevant expression parsers. That is where we have to add the ability to use variables as well.

But first, we need a method to parse a variable, of any type. Add the following to Parser.vb:

Private Function ParseVariable(type As Type) _
                    As ParseStatus

    Dim result As ParseStatus

    ' Try to read variable name
    ScanName()

    If TokenLength = 0 Then
        result = CreateError(1, "a variable.")
    Else
        Dim varname As String
        varname = CurrentToken

        If Not m_SymbolTable.Exists(varname) Then
            result = CreateError(4, varname)
        Else
            Dim variable As Symbol
            variable = m_SymbolTable.Fetch(varname)

            If Not variable.Type.Equals(type) Then
                result = CreateError(5, variable.Name)
            Else
                ' Emit the variable
                m_Gen.EmitLoadLocal(variable.Handle)
                SkipWhiteSpace()

                result = CreateError(0, "Ok.")
            End If
        End If
    End If 

    Return result
End Function

This parser should be invoked at a point where we definitely expect a variable. One such point can be found at the 'lowest' level of numeric expression parsing: ParseFactor.

Using variables in numeric expressions

In ParseFactor, we currently check if the lookahead character is a "(", in which case we call ParseNumericExpression. Otherwise, we call ParseNumber. We need to add a third possibility: if the lookahead character is a name start character (a letter or an underscore), we need to process the next token as a number-type variable. Let's do this now.

Make the change in Parser.vb, and add the new recognizer in the appropriate region.

Private Function IsNameStartCharacter(Byval c As Char) As Boolean
    Return c.Equals("_"c) OrElse Char.IsLetter(c)
End Function

Private Function ParseFactor() As ParseStatus
    Dim result As ParseStatus

    If LookAhead.Equals("("c) Then
        SkipCharacter()

        result = ParseNumericExpression()

        If result.Code = 0 Then
            If Not LookAhead.Equals(")"c) Then
                result = CreateError(1, ")")
            Else
                SkipCharacter()
            End If
        End If
    ElseIf IsNameStartCharacter(LookAhead) Then
        result = ParseVariable(GetType(System.Int32))
    Else
        result = ParseNumber()
    End If

    SkipWhiteSpace()

    Return result
End Function

Test Numeric Variables

Let's test this. Compile and run. Test with the following:

Dim i As Integer
Integer j
i = 39
j := i + 2
Print 3+j

You can try arbitrarily complex numeric expressions, in the assignment as well as the print statements. The assignment statement will work in all cases. The Print command will work in all cases, except when the expression starts with a variable. More on that later.

Using variables in string expressions

We could deal with strings in a similar fashion, if we had a string "factor". Unfortunately, our string expression parser directly calls ParseString, with no possibility of a string being anything other than a literal, quoted string. Let's introduce a more flexible parsing structure, similar to numbers. This will also allow us to introduce brackets in a string expression. Here is the modified BNF for string expressions.

<stringexpression>        ::= <stringfactor><concatoperation>*
<concatoperation>         ::= <concatoperator><stringfactor>
<concatoperator>          ::= "+"|"&"
<stringfactor>            ::= <string>|<name>|<bracketstringexpression>
<bracketstringexpression> ::= "(" <stringexpression> ")"
<string>                  ::= <quote-symbol><stringcharacter>*<quote-symbol>
<quote-symbol>            ::= '"'
<stringcharacter>         ::= <character>|<quote-in-string>
<quote-in-string>         ::= <quote-symbol><quote-symbol>
<character>               ::= ? every character other than double-quote or newline ?

Make the changes in Parser.vb.

Private Function ParseString() As ParseStatus
    Dim result As ParseStatus

    ScanString()

    If TokenLength = 0 And Not m_EmptyStringFlag Then
        result = CreateError(1, "a valid string.")
    Else
        m_Gen.EmitString(CurrentToken)
        result = CreateError(0,"Ok")
    End If

    SkipWhiteSpace()

    Return result
End Function

Private Function ParseStringFactor() As ParseStatus
    Dim result As ParseStatus

    If LookAhead.Equals(""""c) Then
        result = ParseString()
    ElseIf IsNameStartCharacter(LookAhead) Then
        result = ParseVariable(GetType(System.String))
    ElseIf LookAhead.Equals("("c) Then
        SkipCharacter()

        result = ParseStringExpression()

        If result.Code = 0 Then
            If Not LookAhead.Equals(")"c) Then
                result = CreateError(1, ")")
            Else
                SkipCharacter()
            End If
        End If
    Else
        result = CreateError(1, "a string.")
    End If

    Return result
End Function

Private Function ParseConcatOperator() As ParseStatus
    Dim result As ParseStatus
    Dim currentoperator As String = CurrentToken

    SkipWhiteSpace()

    result = ParseStringFactor()

    If result.code = 0 Then
        m_Gen.EmitConcat()
    End If

    Return result
End Function    

Private Function ParseStringExpression() As ParseStatus

    Dim result As ParseStatus

    result = ParseStringFactor()

    Do While result.code=0 _
        AndAlso _
        IsConcatOperator(LookAhead) 

        ScanConcatOperator()

        If TokenLength=0 Then
            result = CreateError(1, "& or +")
        Else
            result = ParseConcatOperator()
            SkipWhiteSpace()
        End If
    Loop

    m_LastTypeProcessed = Type.GetType("System.String")

    Return result    
End Function

Reads just like the BNF. Note the call to ParseVariable in the appropriate place, passing the string type.

Test String Variables

Compile and run. Test with the following:

Dim myname As String 
myname = "Paula"
String greet
greet := myname & ", you brillant person."
print "Hello, " + greet

Again, you can try arbitrarily complex expressions. The assignment will work in all cases. The Print command will work in all cases, except when the expression starts with a variable.

If you try to mix types, such as using a string variable in a numeric expression, or trying to assign a number to a string variable, you will get a very accurate error message.

Using variables in Boolean expressions - not yet

Variables in boolean expressions are quite a challenge. Unlike numbers and strings, we cannot easily deal with them at the "factor" level. The Boolean "factor" is a condition, which in itself starts with an expression, which may start with a variable. We cannot, with any certainty, assume that the variable at that point must be a boolean.

So, boolean variables, like boolean expressions, require special treatment in ParseExpression itself. The good news is that this treatment may solve the pesky error that we encountered whenever an expression starts with a variable.

In the beginning, there was a name

Our ParseExpression method detects the type of the expression by the lookahead character. A digit or sign indicates a numeric expression, and a quote(") indicates a string expression. If the lookahead character is an underscore(_) or a letter, we can assume that the next token is a name. But is it the name of a variable?

There is a valid token which can appear at the beginning of an expression and looks like a name: the not operator. If it is, we are in a boolean expression. Fortunately, we know what happens at this point of a boolean. We can simply call ParseNotOperator.

If the expression genuinely begins with a variable, we have a problem. The type of variable tells us which expression parser to call. We can't figure out the type until we scan the full token. However, our string and numeric parsers insist on scanning the first token of the expression themselves. There's no convenient way to let them know that we have already scanned the first token, which happens to be a variable.

How do our assignment statements work in every case, then? Well, in the case of the assignment statement, by the time we get to the expression, we already know what type it is likely to be, from the variable on the left of the assignment operator. So, we call the appropriate expression parser, and everything works.

Backtracking

We have hit a roadblock. So far, we could reliably predict what was coming by looking at a single Lookahead character. We can't do that any longer. How do we solve this problem?

The correct way would be to revisit our expression parsers, and add a way for them to allow for a pre-parsed first token. It would involve passing the pre-parsed token, or some kind of flag, to the top of the expression parser hierarchy, and passing it down all the way to the "factor" level. Our boolean parser already has something like this.

There is a less correct, but shorter way. Once we have scanned a variable which starts an expression and determined its type, we can move the lookahead character back to the start of the variable, and call the relevant expression parser. This does cause the variable to be scanned twice, but ensures correct output without massive changes in our parsers. Moving the position of the lookahead back in this manner is known as backtracking.

Backtracking is frowned upon, and rightly so. Let's try and understand why, in simple terms, using our work so far as an example.

Why is backtracking necessary at all? Because of ambiguity. Consider our implementation of ParseExpression. An expression can be numeric, string or boolean, and we can usually tell which kind a particular one is by looking at the first character. Things are unambiguous as long as variables are not involved. The moment an expression starts with a name character, we can't tell what kind it is until we have read the variable and looked it up in the symbol table. Thus, we don't know how the rest of the expression has to be processed. The expression type is ambiguous. Only after resolving the ambiguity can we proceed, but by then it is too late. So we backtrack.

Luckily, in our compiler, there are very few situations where there can be ambiguity regarding what comes next, and therefore where backtracking must occur. The most obvious one is when we call ParseExpression, and the Lookahead character is a name character. We call ParseExpression in two places so far: in the Print command, and in ParseCondition. Consider, each time we use the print statement and start the expression with a variable, the variable will be scanned twice. Furthermore, each time we use a boolean condition which starts with a variable (as most boolean expressions will), the variable will be scanned twice. As the programs that our compiler compiles get larger, this can cause performance problems for the compiler (Note: for the compiler only. Compiled code will run just fine.)!

Also, in our situation, we are lucky. No code has been generated at the point where we need to backtrack. There might be situations where code was generated, and then backtracking became necessary. We would have to somehow un-generate the code!

Backtracking should be avoided, and can be avoided with a little extra work. However, for the moment, we will backtrack.

One little aside: one of the most common places where backtracking is required is for reporting compile-time errors. We have been sort-of backtracking, in our CreateError method. Check out the way we handle all error cases other than case 1. Because we "sort-of" backtrack, and more importantly because we only ever report one compile-time error,this is not likely to affect the performance of our compiler.

Implementing the backtrack

Add the following to Parser.vb, in the Scanner region:

Private Sub Backtrack()
    If TokenLength > 0 Then
        m_CharPos -= TokenLength
        m_CurrentTokenBldr = New StringBuilder()
    End If
End Sub

This should be self-explanatory.

ParseInitialName

Next, we will create a new parser to handle the case of a name appearing at the beginning of an expression. Add the following to Parser.vb:

Private Function ParseInitialName() As ParseStatus
    Dim result As ParseStatus

    ' Scan the name
    ScanName()

    Dim varname As String
    varname = CurrentToken 

    If CurrentToken.ToLowerInvariant() = "not" Then
        ' This is a boolean. Backtrack and call
        ' boolean parser
        Backtrack()
        result = ParseBooleanExpression()
    Else
        If Not m_SymbolTable.Exists(varname) Then
            result = CreateError(4, varname)
        Else
            Dim variable As Symbol
            variable = m_SymbolTable.Fetch(varname)

            Select Case variable.Type.ToString()
                Case "System.Int32"
                    ' Backtrack, and call numeric parser
                    Backtrack()
                    result = ParseNumericExpression()
                Case "System.String"
                    ' Backtrack, and call string parser
                    Backtrack()
                    result = ParseStringExpression()
                Case Else
                    result = CreateError(1, " an Integer or String variable.")
            End Select

        End If
    End If

    Return result
End Function

This scans a name, and checks to see if the token is not. If so, it could have called NotOperator as discussed. But since we have decided to use backtracking anyway, it simply backtracks and calls the boolean parser. If the scanned name is not not (Sorry), it checks to see if is a variable, and backtracks and calls the appropriate parser. Note that we have not yet handled boolean variables.

One more way to express yourself

Finally, we need to change ParseExpression to call ParseInitialName where appropriate. Modify it in Parser.vb as follows:

Private Function ParseExpression( _
                    Optional ByVal expressiontype _
                        As Type = Nothing) _
        As ParseStatus

    Dim result As ParseStatus

    ' Since we are doing the work of the scanner by using the
    ' lookahead character, we need to initialize the token
    ' builder
    m_CurrentTokenBldr = New StringBuilder

    If LookAhead.Equals(""""c) Then
        result = ParseStringExpression()
    ElseIf IsNumeric(LookAhead) Then
        result = ParseNumericExpression()
    ElseIf IsNameStartCharacter(LookAhead) Then
        result = ParseInitialName()
    ElseIf IsNotOperator(LookAhead) Then
        result = ParseBooleanExpression()
    ElseIf LookAhead.Equals("("c) Then
        ' For now, assuming only numeric expressions can use ()
        result = ParseNumericExpression()
    Else
        result = CreateError(1, "a boolean, numeric or string expression")
    End If

    If result.Code = 0 Then
        If expressiontype Is Nothing Then
            ' If a relational operator follows the expression just parsed
            ' we are in the middle of a Boolean expression. Our work is
            ' not yet done.
            If IsRelOperator(LookAhead) Then
                ' Parse a boolean expression, letting it know that the
                ' first part has already been parsed.
                result = ParseBooleanExpression(m_LastTypeProcessed)
            End If
        End If
    End If

    Return result
End Function

Notice how the IsNameStartCharacter check happens before IsNotOperator. This has to be so. Think about it for a minute.

Test Variable-first Expressions

Compile and run. Test with the following:

integer x
x = 12
print x+2
integer y
y = x * 2
print y
print y<x
print y>x
string z
z := "Raj"
print z
print z & " Chaudhuri"
print z=="Raj"
print z>="Raj"
Dim myname As String 
myname = "Paula"
String greet
greet := myname & ", you brillant person."
print greet + " Hello."

All of these should work, including variables in boolean expressions. Which leaves us with one last problem: boolean variables.

Using variables in Boolean expressions - finally

For integers and strings, we parse variable names in the "factor" level of the relevant expression parser. As discussed earlier, the boolean "factor" is basically the condition. The BNF goes like this:

<booleanfactor>              ::= <condition>|<bracketedbooleanexpression>
<bracketedbooleanexpression> ::= "("<booleanexpression>")"

<condition>                  ::= <expression><reloperator><expression>
<reloperator>                ::= "="|"=="|"==="|"<>"|"!="|"!=="|
                                 "<"|"<="|"=<"|">"|">=","=>"

As you can see, the condition itself is made up of expressions, which may well start with a variable. This variable could be integer or string, or boolean. This is the only place where we can detect a boolean variable with certainty.

So, the only place where we can parse a boolean variable (and decide what to do next) is in ParseInitialName itself. Modify ParseInitialName as follows, in Parser.vb:

Private Function ParseInitialName() As ParseStatus
    Dim result As ParseStatus

    ScanName()

    Dim varname As String
    varname = CurrentToken 

    If CurrentToken.ToLowerInvariant() = "not" Then
        ' This is a boolean. Backtrack and call
        ' boolean parser
        Backtrack()
        result = ParseBooleanExpression()
    Else
        If Not m_SymbolTable.Exists(varname) Then
            result = CreateError(4, varname)
        Else
            Dim variable As Symbol
            variable = m_SymbolTable.Fetch(varname)

            Select Case variable.Type.ToString()
                Case "System.Int32"
                    ' Backtrack, and call numeric parser
                    Backtrack()
                    result = ParseNumericExpression()
                Case "System.String"
                    ' Backtrack, and call string parser
                    Backtrack()
                    result = ParseStringExpression()
                Case "System.Boolean"
                    ' Backtrack, parse variable because boolean factor
                    ' can't, then continue as a boolean expression
                    Backtrack()
                    result = ParseVariable(GetType(Boolean))
                    If result.Code = 0 Then
                        result = ParseBooleanExpression(GetType(Boolean))
                    End If
                Case Else
                    result = CreateError(1, " an Integer or String variable.")
            End Select

        End If
    End If

    Return result
End Function

Note the added case for a boolean type variable. If the name just scanned is a boolean, we backtrack and call ParseVariable with the boolean type parameter, which will take care of emitting it on the stack. And then we call ParseBooleanExpression with a parameter, which lets it know that the first token in that expression has already been parsed.

Now, we need to add some more checks in the boolean parse tree. The first is that if a condition starts with a boolean variable, the relational operator is optional. A boolean variable by itself is a valid condition.

Change ParseCondition as follows, in Parser.vb:

Private Function ParseCondition( _
                    Optional lastexpressiontype As Type = Nothing _
    ) As ParseStatus

    Dim result As ParseStatus

    If lastexpressiontype Is Nothing Then
        result = ParseExpression(Type.GetType("System.Boolean"))
    Else
        m_LastTypeProcessed = lastexpressiontype
        result = CreateError(0,"")
    End If

    If result.Code = 0 Then

        ScanRelOperator()

        If TokenLength = 0 Then
            ' If the last type processed was a Boolean,
            ' a relational operator is not required
            If Not m_LastTypeProcessed.Equals( _
                                            GetType(Boolean) _ 
                                        ) Then
                result = CreateError(1, "a relational operator.")
            End If
        Else
            result = ParseRelOperator()
            SkipWhiteSpace()
        End If
    End If

    Return result
End Function

Next, the case where a relational operator does follow a boolean variable. The only relational operators allowed for booleans are equality and inequality. We can check for this in ParseRelOperator. Modify it in Parser.vb as follows:

Private Function ParseRelOperator() _
        As ParseStatus

    Dim result As ParseStatus
    Dim reloperator As String = CurrentToken
    Dim conditiontype As Type = m_LastTypeProcessed

    SkipWhiteSpace()

    ' The expression after a relational operator
    ' should match the type of the expression
    ' before it
    If conditiontype.Equals( _
            Type.GetType("System.Int32") _
        ) Then

        result = ParseNumericExpression()
    ElseIf conditiontype.Equals( _
            Type.GetType("System.String") _
        ) Then

        result = ParseStringExpression()
    ElseIf conditiontype.Equals( _
            Type.GetType("System.Boolean") _
        ) Then
        ' Boolean conditions support only equality or
        ' inequality comparisons
        Select Case reloperator
            Case "=","==","===", "<>", "!=", "!=="
                result = ParseBooleanExpression()
            Case Else
                result = CreateError(1, "and = or <> operator")
        End Select

    Else
        result = CreateError(1, "an expression of type " & _
                        conditiontype.ToString())
    End If

    If result.Code = 0 Then
        GenerateRelOperation(reloperator, conditiontype)
    End If

    Return result
End Function

Finally, if a relation operator has been used with a boolean variable, we need to emit the relevant CIL. Luckily, the opcodes are the same as Integers. Change GenerateRelOperation in Parser.vb as follows:

Private Sub GenerateRelOperation( _
        reloperator As String, _
        conditionType As Type)

    If conditionType.Equals( _
            Type.GetType("System.Int32")
        ) Then

        Select Case reloperator
            Case "=", "==", "==="
                m_Gen.EmitEqualityComparison()
            Case ">"
                m_Gen.EmitGreaterThanComparison()
            Case "<"
                m_Gen.EmitLessThanComparison()
            Case ">=", "=>"
                m_Gen.EmitGreaterThanOrEqualToComparison()
            Case "<=", "=<"
                m_Gen.EmitLessThanOrEqualToComparison()
            Case "<>", "!=", "!=="
                m_Gen.EmitInEqualityComparison()
        End Select

    ElseIf conditionType.Equals( _
            Type.GetType("System.String")
        ) Then

        Select Case reloperator
            Case "=", "==", "==="
                m_Gen.EmitStringEquality()
            Case ">"
                m_Gen.EmitStringGreaterThan()
            Case "<"
                m_Gen.EmitStringLessThan()
            Case ">=", "=>"
                m_Gen.EmitStringGreaterThanOrEqualTo()
            Case "<=", "=<"
                m_Gen.EmitStringLessThanOrEqualTo()
            Case "<>", "!=", "!=="
                m_Gen.EmitStringInequality()
        End Select
    ElseIf conditionType.Equals( _
            Type.GetType("System.Boolean")
        ) Then

        Select Case reloperator
            Case "=", "==", "==="
                m_Gen.EmitEqualityComparison()
            Case "<>", "!=", "!=="
                m_Gen.EmitInEqualityComparison()
        End Select
    End If
End Sub

And now boolean variables are ready to be tested.

Test Boolean Variables

Compile and run. Test with the following:

var xx as boolean
var yy as boolean
xx = 1=1
yy = 1=2
print "xx"
print xx
print "yy"
print yy
print "xx = yy"
print xx = yy
print "xx = xx"
print xx = xx
print "xx and yy"
print xx and yy
print "xx or yy"
print xx or yy
print "not xx"
print not xx

As usual, you can try arbitrarily complex expressions.

More ambiguity

As if things were not complicated enough already. There is one more place where there can be ambiguity based on the fact that we are parsing variables now: the NotOperation.

Recall the definition from Chapter 5. Here's the BNF for reference:

<notoperation>               ::= <booleanfactor>|<notoperator>
<notoperator>                ::= <notsymbol><booleanfactor>
<notsymbol>                  ::= "NOT"|"not"|"!"

As you can see, a not-operation is either a boolean "factor", or a NOT operator followed by a boolean factor. Now, a NOT operator is either the symbol ! (no problems here) or the word NOT, uppercase or lowercase(oops!).

See the problem? At the start of any boolean factor, any variable starting with N, O, or T can cause the parser to try to scan a NOT operator.

Try it. Try to compile the following with our compiler:

Integer n
n:=42
Boolean m
m = n=42

You should get an "Expected NOT" error.

The problem comes from the fact that we use a single lookahead character to predict what comes next. This can cause ambiguity in situations such as this one. The exact cause of our current problem is this:

A boolean expression may begin with a variable or a NOT operator
Both of which may begin with the letter n

This problem could have been avoided if we did predictive parsing using a lookahead token instead of a lookahead character. But we are too far in to bring in such a big change. So, we will cheat a bit. For the first time, we will look more than one character ahead in a recognizer method; specifically IsNotOperator.

Rename the existing ReadNotOperator method to ReadNotOperatorSymbol, and enter the new one shown below. Make the change in the Recognizers region in Parser.vb.

Private Function IsNotOperatorSymbol(ByVal c As Char) As Boolean
    Return "nNoOtT!".IndexOf(c) > -1
End Function

Private Function IsNotOperator(Byval c As Char) As Boolean
    Dim result As Boolean = False

    If c = "!"c Then
        result = True
    ElseIf Char.ToLowerInvariant(c) = "n"c AndAlso _
        m_LineLength >= m_CharPos+2 Then

        Dim peektoken As String
        peektoken = m_ThisLine.Substring(m_CharPos,3).ToLowerInvariant()
        If peektoken = "not" Then
            result = True
        End If
    End If

    Return result
End Function

Finally, modify ScanNotOperator to use ReadNotOperatorSymbol instead of ReadNotOperator. Make the changes in the Scanner region in Parser.vb.

Private Sub ScanNotOperator()
    m_CurrentTokenBldr = New StringBuilder

    Do While IsNotOperatorSymbol(LookAhead)
        m_CurrentTokenBldr.Append(LookAhead)
        m_CharPos += 1
    Loop

    Select Case CurrentToken.ToLowerInvariant()
        Case "not", "!"
            ' Valid NOT operator
        Case Else
            m_CurrentTokenBldr = New StringBuilder
    End Select
End Sub

Compile and run. This time, the code which failed earlier should compile properly.

Declare and assign

At this point, we have successfully implemented variables in our compiler. We can stop now, but I want to add a few extras.

Most modern languages allow us to declare a variable and assign it an initial value in the same statement. Thus, our language should be able to correctly process lines like the following:

Dim x As Integer = 42
String str := "Superstring"

Let's add this capability.

Options, options

We have a situation where one of our constructs can be parsed in two ways. A declaration statement can be just a declaration, or an assignment as well.

This is not as difficult as it seems. Once we finish our regular processing of declaration statements, we just need to check if another token exists, and is an assignment operator. In our case, this simply means checking to see if the lookahead character is an assignment operator. If it is, we need to process the assignment. Let's first process the assignment by adding the following new method, ParseDeclarationAssignment, to the Parser class. Add it in Parser.vb.

Private Function ParseDeclarationAssignment(ByVal variable As Symbol) _
                                                        As ParseStatus
    Dim result As ParseStatus

    If IsAssignmentCharacter(LookAhead) Then
        ScanAssignmentOperator()

        If TokenLength = 0 Then
            result = CreateError(1, "= or :=")
        Else
            result = ParseAssignment(variable)
        End If
    Else
        result = CreateError(0, "Ok")
    End If

    Return result
End Function

This scans an assignment operator if available. On a successful scan, it calls the ParseAssignment method to parse the right side of the operator.

Note that if an assignment operator is not found, ParseDeclarationAssignment returns a successful result. This is how we deal with the fact that the assignment part is optional.

Now all that remains is to add a call to this in our two declaration parsers: ParseTypeFirstDeclaration and ParseDimCommand. Make the changes in Parser.vb and Commands.vb respectively, as shown below:

Private Function ParseTypeFirstDeclaration() _
                    As ParseStatus
    Dim result As ParseStatus

    Dim typename As String
    typename = CurrentToken

    If IsValidType(typename) Then
        ' Read a variable name
        SkipWhiteSpace()
        ScanName()

        If TokenLength = 0 Then
            result = CreateError(1, "a variable name.")
        Else
            Dim varname As String
            varname = CurrentToken

            If m_SymbolTable.Exists(varname) Then
                result = CreateError(3, varname)
            Else
                Dim symbol As New Symbol( _
                                    varname, _
                                    GetTypeForName(typename)
                )

                symbol.Handle = m_Gen.DeclareVariable(
                                    symbol.Name,
                                    symbol.Type
                                )

                m_SymbolTable.Add(symbol)

                SkipWhiteSpace()

                result = ParseDeclarationAssignment(symbol)
                If result.Code = 0 Then
                    If Not EndOfLine Then
                        result = CreateError(1, "end of statement.")
                    End If
                End If
            End If
        End If
    Else
        result = CreateError(1, "a valid type.")
    End If

    Return result
End Function

Public Function ParseDimCommand() As ParseStatus
    Dim result As ParseStatus

    ' Read a variable name
    SkipWhiteSpace()
    ScanName()

    If TokenLength = 0 Then
        result = CreateError(1, "a variable name.")
    Else
        Dim varname As String
        varname = CurrentToken

        If m_SymbolTable.Exists(varname) Then
            ' Variable name already declared
            result = CreateError(3, CurrentToken)
        Else
            ' Read either "as" or type
            SkipWhiteSpace()
            ScanName()

            ' Check and ignore "As"
            If CurrentToken.ToLowerInvariant() = "as" Then
                ' Read type
                SkipWhiteSpace()
                ScanName()
            End If

            Dim typename As String
            typename = CurrentToken

            If TokenLength = 0 OrElse _
                    Not IsValidType(typename) Then
                result = CreateError(1, "a valid type.")
            Else
                Dim symbol As New Symbol( _
                                    varname,
                                    GetTypeForName(typename)
                )

                symbol.Handle = m_Gen.DeclareVariable( _
                                            symbol.Name, _
                                            symbol.Type
                                )

                m_SymbolTable.Add(symbol)

                SkipWhiteSpace()

                result = ParseDeclarationAssignment(symbol)

                If result.Code = 0 Then
                    If Not EndOfLine Then
                        result = CreateError(1, "end of statement.")
                    End If
                End If
            End If
        End If
    End If

    Return result
End Function

Compile and run. Test with the following:

integer y:=2
dim x as string = "Hello"
dim a as integer
a = y+2
String b = "World"
Dim greetfirst As String := x & " "
String fullgreeting = greetfirst + b
Print fullgreeting

All of this should work. Any errors will be pointed out by our compiler.

Finish the chapter already

We are done with variables. Just one more thing, and we can close the chapter on them.

There's nothing to learn here - just a bit of concession for people who have experience with C-derived languages. We will allow the names int and bool to be used interchangeably with Integer and Boolean. Which, thanks to the work we have already done, is very simple: they need to be added in the InitTypes method. Change the method as follows, in Commands.vb:

Private Sub InitTypes()
    m_typeTable = new Dictionary(Of String, Type)

    ' Add types here
    AddType("integer", GetType(System.Int32))
    AddType("string", GetType(System.String))
    AddType("boolean", GetType(System.Boolean))
    AddType("int", GetType(System.Int32))
    AddType("bool", GetType(System.Boolean))
End Sub

Compile and run. Test with the following:

int i = 40
Dim j As Integer := i + 1
bool doWeKnowWhatIsLife = j<42
Print i
Print j
Print doWeKnowWhatIsLife
Dim areWeHappy As Boolean = 2=2
Print doWeKnowWhatIsLife AND areWeHappy

I can think of a lot of little cosmetic enhancements like that. But we will keep those for later. Right now, important additions need to be made to the language.

Conclusion

In this chapter, we have added the capability of handling variables to our compiler. This immediately makes boolean conditions more interesting, and opens the way for some constructs that are absolutely necessary for any imperative language: branches and loops. We will cover these in the next chapter.

Chapter 7 - Variables

Variables

Introduction

Variables

Types

Scope

Goal

The Approach

First Step

Modifying CodeGen

The Symbol Table

Types

Parsing Declaration

Parsing Assignment

A New Line (Again)

Error, Error On The Wall

Test It

Using variables

Using variables in numeric expressions

Test Numeric Variables

Using variables in string expressions

Test String Variables

Using variables in Boolean expressions - not yet

In the beginning, there was a name

Backtracking

Implementing the backtrack

ParseInitialName

One more way to express yourself

Test Variable-first Expressions

Using variables in Boolean expressions - finally

Test Boolean Variables

More ambiguity

Declare and assign

Options, options

Finish the chapter already

Conclusion

results matching ""

No results matching ""