Branches and Loops Part I

Introduction

In the last chapter, we made our compiler (and language) capable of declaring and initializing variables, and using variables in expressions. It's time now to tackle the next important programming concept: control flow, specifically branches and loops.

Control Flow

Like most other programming languages, our SIC language expects instructions to be executed in sequence. Our compiler reads the source, and emits CIL in the same sequence as the source statements. In real life, there will be many situations where the resulting CIL will need to execute in a different sequence. The three main examples of such situations are:

  • Execution of an arbitrary set of statements under certain conditions. This is called selection. Every language allows some sort of selection construct, typically the ubiquitous if statement.
  • Repeated execution of an arbitrary set of statements. This is called iteration or looping. Again, every language has some form of a loop instruction. The most common example is the while statement.
  • Execution of a known sequence or group of instructions as a unit, in lieu of a single instruction. Such groups are known as sub-programs, subroutines or functions.

Functions and subroutines are a topic unto themselves, and as such will be covered a few chapters down the line. In this chapter, we concern ourselves with selection and loops.

Branches

The essential concept of selection can be expressed via this statement: "Carry out the following instructions if this condition is true".

How do we ensure that the instructions following the condition are not executed if the condition is false? In most machine/assembly languages including CIL, there are instructions which cause execution to jump or branch to another statement. Such an instruction can be used to jump over instructions that need to be carried out if the condition is true.

What about high-level languages? In the early days, most languages had some form of the infamous goto statement, which permitted so-called unconditional branching. This statement usually took the form

Goto <label>

where <label> was some way of marking a location in the program; perhaps a line number or a name. The goto statement has been discouraged for years, because it supposedly encourages sloppy, unreadable programming. It can be quite difficult (for humans) to read and understand a program littered with goto statements. Nevertheless, in many early programming languages, the goto statement was the only way to both conditionally execute (or not execute) statements, and execute statements repeatedly.

As programming languages developed, several control flow constructs were added to improve the clarity and quality of programs. Such constructs could be used instead of goto statements. The use of such constructs collectively came to known as structured programming.

The most common use of the goto statement was to handle conditions. Not coincidentally, the most common constructs in structured programming were those which dealt with conditional branching, the most common of these being the aforementioned if statement.

Back when I first learned about structured programming, the term "branch" was used as a loose synonym for conditional branching constructs. I will use that meaning in this and the next chapter, as we arm our compiler with a few conditional branching constructs, starting with the if statement.

Loops

There are many situations in programming where a sequence or block of statements have to be repeated, either a known number of times, or depending on some condition. A language construct that supports this functionality is called a loop. Loops can be very useful for writing most kinds of computer programs.

In this and the next chapter, we will add some loop constructs to our language and compiler, beginning with the while construct.

Goal

In this chapter, we will enhance our compiler and our language with one branch and one loop construct.

The branch construct we will implement is the If statement, as described below:

If <boolean expression> [Then]
    <block>
[ElseIf < boolean expression> [Then]
    <block>]*
[Else
    <block>]
End If

The If statement is followed by a boolean expression, which in turn may optionally (that's what the square brackets in the pseudocode mean) be followed by the keyword Then. The lines that follow will be a block of instructions, which will end, in the simplest case, with the End If statement. These instructions will be carried out only if the boolean expression evaluates to true.

There may optionally be one and only one Else statement, followed by another block. The instructions in the Else block are to be carried out only if the boolean expression following the If statement evaluates to false.

Finally, there may be any number (that's what the star in the pseudocode means) of ElseIf statements, each followed by a boolean expression and block. All of them must occur after the initial If statement, and before an Else statement if one exists. ElseIf statements allow for checking multiple conditions. If the boolean expression following the first If evaluates to false, the boolean expression following the first ElseIf is checked. If that one evaluates to false, the next one is checked, and so on.

We will also introduce a common loop construct in this chapter: the While statement. The syntax is simple:

While <boolean expression>
    <block>
    [Continue While]
    [Exit While]
End While

The block of statements between the While and End While statements will be repeated, as long as the boolean expression following the While statement evaluates to true. The optional Exit While statement will cause an immediate exit; i.e., execution will continue after the End While statement. The optional Continue While statement will cause execution to continue at the start of the While statement, which means that the condition will be re-evaluated.

The Approach

In this chapter, we return to the approach we took in chapter 6. The elements of the If and While statements are essentially commands, so we need to write parser methods as appropriate, and add them to the list of commands.

First Step

To begin with, let us parse the If statement. This is the point where we ask the usual question: how do we do branching in CIL?

Thus far, we have been emitting CIL instructions as soon as we parse anything. In the resulting executable, these instructions are executed in the order in which they were emitted. How do we cause instructions to be executed out of order?

Like any other "machine" language, CIL provides jump or branch instructions, which cause execution to, well, jump to another instruction. In CIL, the destination of such jump instructions is a number, which is a signed offset from the current instruction. This means that if we emit a branch instruction with the offset 4, the next instruction executed will be the one four places after the current instruction; if we use the offset -3, it will be the one three places before1.

I'm sure you can see a problem immediately. In a parser like ours, at the point when a jump instruction must be emitted, we don't yet know the exact offset to jump to. To understand this, let us examine how our If statement might translate down to CIL.

The boolean expression following the If statement can be parsed using our usual ParseBooleanExpression method. Now, if this condition evaluates to false, then we need to jump past the block of instructions that follow the If statement, because that block will execute if the boolean expression is True. Where do we need to jump to? In the simplest case, to the statement after the matching End If (we'll discuss the other cases later). Now, we cannot know the offset of that location. We have not parsed the block yet, and so don't know how many instructions we have to jump over!

Luckily, we are using the Reflection Emit library to generate CIL. The library provides a nice feature called a Label, which allows us to both mark a point to jump to in the CIL instruction stream, and emit a jump instruction that targets that point. We can actually emit a jump instruction to a label before using it to mark the destination point. The Label object keeps track of all CIL between the jump and the mark, and updates the correct offset in the jump instruction. This solves our problem.

So, at this point, we will enhance our CodeGen class with the ability to define and emit Labels, as well as emit some jump instructions. There are many jump instructions in CIL. We will add only the following instructions for now:

OpCode What it does
Brfalse target Expects a boolean or integer value (It can work with some other values, but let's stick to those two for now) on the stack. If the value is False or zero, jumps to the target location specified.
Brtrue target Expects a boolean or integer value on the stack. If the value is True or non-zero, jumps to the target location specified.
Br target Unconditionally jumps to the target location specified.

The target is a Label in all cases.

Modifying CodeGen

Add the following to the CodeGen class in CodeGen.vb:

Private m_Labels As New List(Of Label)

Public Function DeclareLabel() As Integer
    Dim lbl As Label = m_ILGen.DefineLabel()
    m_Labels.Add(lbl)
    Return m_Labels.Count - 1
End Function

Public Sub EmitLabel(ByVal labelNumber As Integer)
    Dim lbl As Label = m_Labels(labelNumber)
    m_ILGen.MarkLabel(lbl)
End Sub

Public Sub EmitBranchIfFalse(ByVal labelNumber As Integer)
    Dim lbl As Label = m_Labels(labelNumber)
    m_ILGen.Emit(OpCodes.Brfalse, lbl)
End Sub

Public Sub EmitBranchIfTrue(ByVal labelNumber As Integer)
    Dim lbl As Label = m_Labels(labelNumber)
    m_ILGen.Emit(OpCodes.Brtrue, lbl)
End Sub

Public Sub EmitBranch(ByVal labelNumber As Integer)
    Dim lbl As Label = m_Labels(labelNumber)
    m_ILGen.Emit(OpCodes.Br, lbl)
End Sub

The CodeGen class now maintains a list of labels. The DeclareLabel method uses the DefineLabel method of the ILGenerator object (that we use to emit all our code) to actually define a label, stores it in the list, and returns its position in the list as a handle. The handle can be used by the parser while calling EmitLabel, which uses the MarkLabel method of the ILGenerator object to actually mark an instruction point. It’s important to note that our EmitLabel method should be called before emitting the instructions that will be executed after the jump. The label must come before the jump target.

The EmitBranchIfTrue, EmitBranchIfFalse and EmitBranch methods emit the respective CIL branch instructions. Note that EmitLabel need not be called before calling any of these. As long as the label that these instructions jump to has been declared (using DeclareLabel), these will function properly. Of course, at some point, the EmitLabel method would have to be called; otherwise we will end up with jumps to labels that do not occur in the instruction stream, resulting in an invalid application! However, we will not check for this. The nature of our parser will prevent that particular error.

Parsing the If Command

Let's start by parsing the simplest form of the If statement, as follows:

If <boolean expression> [Then]
    <block>
End If

In BNF, that would be:

<ifcommand>  ::= "if" <booleanexpression> ["then"]
                    <block>
                 <endcommand>

We already know how to parse boolean expressions, blocks of statements and the End command, from chapters 4 through 6. Thus, parsing the If command itself is straightforward.

Add the following to the "Commands" region in Commands.vb:

Public Function ParseIfCommand() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()
    result = ParseBooleanExpression()

    If result.Code=0 Then
        SkipWhiteSpace()

        If Not EndOfLine Then
            ' Try to read "then"
            ScanName()
            If CurrentToken.ToLowerInvariant<>"then" Then
                result = CreateError(1, "then")
            Else
                ' There shouldn't be anything after "then"
                SkipWhiteSpace()
                If Not EndOfLine Then
                    result = CreateError(1, "end of statement")
                End If
            End If
        End If

        If result.Code=0 Then
            Dim endpoint As Integer = m_Gen.DeclareLabel()

            ' If the condition just emitted is false, emit jump
            ' to endpoint
            m_Gen.EmitBranchIfFalse(endpoint)

            ' Parse the "If" block
            Dim ifblock As Block = New Block( _
                                "if", _
                                endpoint, _
                                endpoint
            )

            result = ParseBlock(ifblock)

            ' If the block was successfully parsed, emit
            ' the endpoint label
            If result.Code = 0  Then
                m_Gen.EmitLabel(ifblock.EndPoint)
            End If
        End If
    End If

    Return result
End Function

Like any other command, the ParseIfCommand method assumes that the calling ParseCommand method has already scanned the command itself. Thus, we parse a boolean expression immediately. If this succeeds, we check to see if the next token is a Then, and just "eat" it. We then check if there is anything else on the line (there shouldn't be). If not, we declare a label (which will mark the end of the If statement) and emit a jump to that label if the boolean expression evaluated to false (if the condition is false, execution continues after the End If). Then, we call ParseBlock. If a block is parsed successfully, we emit the label that we had generated earlier.

Note how we declare a label, and then use it to construct a Block object. Let's take a short aside to discuss how we handle blocks.

When we introduced blocks in chapter six, we had very briefly mentioned their start and end points. We had not used these in the Comment block, covered in that chapter. We will use them now.

As can be seen from the code above, these properties of a Block object store "handles" of labels. In most cases, the start and end points of a block will store the locations where the code for the block starts and ends respectively.

In this particular case, the start and end point properties are both set to the label that marks the end of the If statement. This is because there is never a need to jump to the start of an If statement.

Note that when calling EmitLabel, we do not use the variable called endpoint. Instead, we use the EndPoint property of the Block object. They should be the same value, right? For now, they are. This will change shortly.

Finally, to make ParseCommand recognize and process If, we have to add it to the list of commands. Modify the InitCommands method in Commands.vb as follows:

Private Sub InitCommands()
    m_commandTable = New Dictionary(Of String, CommandParser)

    ' Add commands here
    AddCommand("print", AddressOf ParsePrintCommand)
    AddCommand("rem", AddressOf ParseRemCommand)
    AddCommand("comment", AddressOf ParseCommentCommand)
    AddCommand("end", AddressOf ParseEndCommand)
    AddCommand("dim", AddressOf ParseDimCommand)
    AddCommand("var", AddressOf ParseDimCommand)
    AddCommand("if", AddressOf ParseIfCommand)
End Sub

Adding a command simply means a call to AddCommand from the InitCommands method. From now on, this particular change will not be shown in the text again. We will mention that the command has been added.

Testing the If Block

Compile and run. Test with the following:

Dim myName As String = "Raj"

If myName = "Raj" Then
    print "Hello, " & myName
    print "Nice name."
End If

If myName<>"Raj"
    print "Not so nice."
End If

REM Nested if statements also work
int i:=30
if i>15
    print "More than 15"
    if i<50 then
        print "but less than 50"
    end if
end if

The block infrastructure that we have already written can take care of arbitrary levels of nesting. Also, note that we did not have to write any code to process the End If command. The ParseEndCommand method takes care of it.

Parsing the Else command

The Else command is slightly tricky. Let's examine how it changes things.

If the If expression evaluates to false, execution should continue at the first instruction after the Else statement. Which simply means that the jump after the boolean expression should now target the instruction after the Else statement.

What about the code that was executed if the If expression evaluated to true? For that code, execution should continue after the End If as soon as an Else statement is parsed. This means another jump, to a new end point label.

So to summarize, when we meet an Else statement, we need to generate a new label, emit a jump to that, then emit the old end point label, and then continue as usual.

Finally, an Else statement can appear only once in an If block. We will add a field, the Else flag, to keep track of this.

Add the following in Commands.vb, in the appropriate regions:

Private Function ParseElseCommand() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()

    ' There should be nothing after Else on a line
    If Not EndOfLine Then
        result = CreateError(1, "end of statement")
    Else
        Dim currBlock As Block = m_BlockStack.CurrentBlock

        ' We should be in an If block, and the Else flag should
        ' not be set
        If currBlock Is Nothing _
                OrElse _
            currBlock.BlockType <> "if" _
                OrElse _
            m_ElseFlag Then

            result = CreateError(6, "Else")
        Else
            ' The end point is the same as the start point
            ' This means that only an If statement has been
            ' parsed before. Generate a new end point
            If currBlock.EndPoint = currBlock.StartPoint Then
                currBlock.EndPoint = m_Gen.DeclareLabel()
            End If

            ' Emit a jump to the end point
            ' Because a True If condition should never cause
            ' code in the Else block to be executed
            m_Gen.EmitBranch(currBlock.EndPoint)

            ' Emit the "start" point (of the Else block now)
            ' because a False If condition should cause
            ' execution to continue after the Else command.
            m_Gen.EmitLabel(currBlock.StartPoint)

            ' Set the Else flag
            m_ElseFlag = True

            result = CreateError(0, "Ok")
        End If
    End If

    Return result
End Function

Exactly how we described it earlier. Just one tricky bit here. Notice where we check whether the block EndPoint is the same as the StartPoint? The way we have handled the If command, they will always be the same, right? So why this check? We'll answer this question in the section immediately following this one.

Also, note the new error value of 6.

Now, we need to modify ParseIfCommand such that the Else flag is properly updated. Each time an If block begins, we need to save the old value of the Else flag, and set it to False. Notice that a successful ParseElseCommand sets it to True. When the If block ends, we need to restore the old value. This approach is required because If statements can be nested.

Modify ParseIfCommand in Commands.vb as follows:

Public Function ParseIfCommand() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()
    result = ParseBooleanExpression()

    If result.Code=0 Then
        SkipWhiteSpace()

        If Not EndOfLine Then
            ' Try to read "then"
            ScanName()
            If CurrentToken.ToLowerInvariant<>"then" Then
                result = CreateError(1, "then")
            Else
                ' There shouldn't be anything after "then"
                SkipWhiteSpace()
                If Not EndOfLine Then
                    result = CreateError(1, "end of statement")
                End If
            End If
        End If

        If result.Code=0 Then
            ' Store old value of Else flag for nesting
            Dim oldelseflag As Boolean = m_ElseFlag
            ' ElseFlag should be false 
            ' at start of a new If block
            m_ElseFlag = False


            Dim endpoint As Integer = m_Gen.DeclareLabel()

            ' If the condition just emitted is false, emit jump
            ' to endpoint
            m_Gen.EmitBranchIfFalse(endpoint)

            ' Parse the "If" block
            Dim ifblock As Block = New Block( _
                                "if", _
                                endpoint, _
                                endpoint
            )

            result = ParseBlock(ifblock)

            ' If the block was successfully parsed, emit
            ' the endpoint label, and restore saved else
            ' flag for nesting
            If result.Code = 0  Then
                m_Gen.EmitLabel(ifblock.EndPoint)
                m_ElseFlag = oldelseflag
            End If
        End If
    End If

    Return result
End Function

Next, CreateError needs to be modified to handle the new error code, 6. Make the change im Parser.vb, as follows:

Private Function CreateError( _
    ByVal errorcode As Integer, _
    ByVal errorDescription As String _
    ) As ParseStatus

    Dim result As ParseStatus
    Dim message As String

    Dim errorpos As Integer
    ' Most errors happen
    ' at the scan position
    errorpos = m_CharPos + 1

    Select Case errorcode
        Case -1 ' Block finished
            message = ""
        Case 0  ' All good
            message = "Ok."
        Case 1  ' Expected something
            message = String.Format( _
                        "Expected {0}", _
                        errorDescription _
            )

        Case 2  ' Not in block
            message = errorDescription

            ' Not in block error happens
            ' after End command has been
            ' scanned
            errorpos = errorpos - TokenLength
        Case 3  ' Variable already declared
            message = String.Format( _
                        "Cannot redeclare variable '{0}'.", _
                        errorDescription
            )

            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case 4  ' Variable not declared
            message = String.Format( _
                        "Variable '{0}' not declared.", _
                        errorDescription
            )
            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case 5  ' Variable type mismatch
            message = String.Format( _
                        "Type mismatch for Variable '{0}'.", _
                        errorDescription
            )
            ' Error happens after scanning
            ' variable name
            errorpos = errorpos - TokenLength
        Case 6  ' Unexpected token
            message = String.Format( _
                        "'{0}' was unexpected at this time.", _
                        errorDescription
            )
            ' Error happens after scanning
            ' unexpected token
            errorpos = errorpos - TokenLength
        Case Else
            message = "Unknown error."
    End Select

    result = New ParseStatus(errorcode, _
                message, _
                errorpos, _
                m_linePos)


    Return result
End Function

CreateError is getting unwieldy. We should refactor it first chance we get. We will. For now, leave it as it is and add Else to the list of commands. You know how.

Compile and run. Test with the following:

Int i:=30

If i<30 Then
    Print "Less than 30"
Else
    If i>30
        Print "More than thirty"
    Else
        Print "Thirty!"
    End If
End If

Parsing the ElseIf Command

Take a look at the test code above. If the first condition is false, we perform another test. This is a common enough situation in programming to merit its own statement. The ElseIf statement allows us to check multiple conditions in a single logical If block.

How does ElseIf work? If the boolean expression after an If statement evaluates to false, a jump is emitted. In this case, that jump must target the boolean expression after the ElseIf. Also, the moment we hit an ElseIf, we must emit a jump which skips the entire ElseIf block. Very similar to what we did in the case of the Else statement.

The difference is this: there may be any number of ElseIf statements in a single If block. This will affect our jump logic as follows:

When an If block starts, the block's end point targets the end of the statements following the If - same as the start point. If we hit an End If without encountering an ElseIf or an Else, the targeting remains accurate. The first time we hit an ElseIf, we have to change the block's end point, because the end of the logical If block has just moved. For subsequent ElseIf statements, we cannot change the end point again. In fact, if there is as Else statement following the ElseIf statements, that cannot change the ending point either.

Finally, an ElseIf statement cannot follow an Else statement.

Fortunately, we can handle all of these cases with the code we have already written. Add the following to Commands.vb, in the appropriate region:

Private Function ParseElseIfCommand As ParseStatus
    Dim result As ParseStatus

    Dim currBlock As Block = m_BlockStack.CurrentBlock

    ' The ElseIf command can only be in an If block, and the
    ' Else flag should not be set
    If currBlock Is Nothing _
            OrElse _
        currBlock.BlockType <> "if" _
            OrElse _
        m_ElseFlag Then

        result = CreateError(6, "ElseIf")
    Else

        ' If the endpoint is the same as the startpoint, this
        ' is the first ElseIf. Generate new endpoint. This 
        ' will mark the end of the If block.
        If currBlock.EndPoint = currBlock.StartPoint Then
            currBlock.EndPoint = m_Gen.DeclareLabel()
        End If

        ' Emit jump to the endpoint, because the ElseIf condition 
        ' and block should not be processed if the If condition 
        ' was true.
        m_Gen.EmitBranch(currBlock.EndPoint)

        ' Emit the "start" point. This marks the start of the
        ' ElseIf block
        m_Gen.EmitLabel(currBlock.StartPoint)

        SkipWhiteSpace()
        ' Parse the ElseIf condition
        result = ParseBooleanExpression()

        ' If successful
        If result.Code = 0 Then

            ' "Eat" the optional "Then"
            SkipWhiteSpace()

            If Not EndOfLine Then
                ' Try to read "then"
                ScanName()
                If CurrentToken.ToLowerInvariant<>"then" Then
                    result = CreateError(1, "then")
                Else
                    ' There shouldn't be anything after "then"
                    SkipWhiteSpace()
                    If Not EndOfLine Then
                        result = CreateError(1, "end of statement")
                    End If
                End If
            End If

            If result.Code = 0 Then        
                ' Generate new "start" point. This will mark
                ' the next elseif statement, or an else statement
                currBlock.StartPoint = m_Gen.DeclareLabel()

                ' If the condition is FALSE, jump to the 
                ' "start" point
                m_Gen.EmitBranchIfFalse(currBlock.StartPoint)
            End If
        End If
    End If

    Return result
End Function

When we meet an ElseIf command for the first time, the StartPoint and EndPoint of the current block are the same (we set them that way in ParseIfCommand). In this case, we set a new EndPoint and emit a jump to it. This new EndPoint will, from now on, represent the point after the End If statement. We then emit the old StartPoint, which is the target of the BrFalse opcode generated after the If condition. Then, we proceed to parse and emit code for the ElseIf condition. The effect is that if the If condition evaluates to false, the ElseIf condition is checked immediately.

Then, we generate a new label, store it in the block's StartPoint property, and emit a BrFalse which targets this new StartPoint. This will be the location that will be targeted if the ElseIf condition evaluates to false. This may point to the next ElseIf condition, an Else block, or the instruction after the End If. We don't know yet.

What happens if we meet another ElseIf? At this point, the block's StartPoint and EndPoint are not the same, so we do not generate a new EndPoint. Instead, we emit a jump to the EndPoint that has already been set. Then, we emit the old StartPoint, and so on. This means that if the condition after the first ElseIf evaluates to false, the second ElseIf condition will be checked immediately. This may repeat any number of times.

What about when we finally meet an Else? Our ParseElseCommand method already takes care of the fact that if the block's StartPoint and EndPoint are different, a new EndPoint is not generated. It instead emits a jump to the pre-set EndPoint, then emits the StartPoint, then proceeds to parse and emit code for the Else block. As a result, if the last ElseIf condition evaluates to false, the statement immediately after the Else statement will be processed, thus completing our picture.

The code that "eats" the optional Then has been repeated from ParseIfCommand. We really need to do some refactoring, soon.

There is one remaining problem, though. Suppose we have an If, followed by an ElseIf, and then another ElseIf, and then an End If, but no Else? What will happen?

The first If will generate a StartPoint which will be the same as the EndPoint. The first ElseIf will generate a new EndPoint, use the old StartPoint, and generate a new StartPoint. The second ElseIf will use the old EndPoint, use the old StartPoint, and generate a new StartPoint. Then the If block will end, and while doing so, use the EndPoint. What happens to the StartPoint generated by the last ElseIf? If there was an Else, it would have been taken care of. Since the Else is optional, we need to take care of it.

To summarize, when an If block ends, if an ElseIf statement has been processed (StartPoint is different from EndPoint) and an Else statement has not been processed (m_ElseFlag is false), we need to emit the dangling StartPoint, before emitting the EndPoint. Let's do that now. Modify ParseIfCommand in Commands.vb, as follows:

Public Function ParseIfCommand() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()
    result = ParseBooleanExpression()

    If result.Code=0 Then
        SkipWhiteSpace()

        If Not EndOfLine Then
            ' Try to read "then"
            ScanName()
            If CurrentToken.ToLowerInvariant<>"then" Then
                result = CreateError(1, "then")
            Else
                ' There shouldn't be anything after "then"
                SkipWhiteSpace()
                If Not EndOfLine Then
                    result = CreateError(1, "end of statement")
                End If
            End If
        End If

        If result.Code=0 Then
            ' Store old value of Else flag for nesting
            Dim oldelseflag As Boolean = m_ElseFlag
            ' ElseFlag should be false 
            ' at start of a new If block
            m_ElseFlag = False


            Dim endpoint As Integer = m_Gen.DeclareLabel()

            ' If the condition just emitted is false, emit jump
            ' to endpoint
            m_Gen.EmitBranchIfFalse(endpoint)

            ' Parse the "If" block
            Dim ifblock As Block = New Block( _
                                "if", _
                                endpoint, _
                                endpoint
            )

            result = ParseBlock(ifblock)

            ' If the block was successfully parsed, emit
            ' the endpoint label, and restore saved else
            ' flag for nesting
            If result.Code = 0  Then

                ' If there is a dangling StartPoint, emit
                ' it first
                If ifblock.StartPoint<>ifblock.EndPoint _
                        AndAlso _
                    Not m_ElseFlag Then

                    m_Gen.EmitLabel(ifblock.StartPoint)
                End If

                m_Gen.EmitLabel(ifblock.EndPoint)
                m_ElseFlag = oldelseflag
            End If
        End If
    End If

    Return result
End Function

Why do this at all? Why not just ignore the dangling label that we generated? Well, the Reflection Emit library, in specific, the TypeBuilder class does not permit this. Every label defined using ILGenerator.DefineLabel must have a corresponding ILGenerator.MarkLabel. If there isn't, the TypeBuilder.CreateType method throws an ArgumentException. I did not find this fact in the official documentation of the Reflection.Emit namespace.2

Finally, add the ElseIf command to the list of valid commands. You know how.

Compile and run. Test with the following:

int i:=22

if i>30
    print "more than 30"
elseif i>15 then
    print "more than 15 but less than 31"
elseif i>10
    print "more than 10 but less than 16"
elseif i<5
    print "less than 5"
else
    print "between 6 and 10 inclusive"
end if

print "Done"

Try with any combination of If, ElseIf and Else. Valid ones will work as expected. Invalid combinations should produce an accurate error message.

Parsing the While Command

All right, let's proceed to the While loop. Here's the simplest form:

While <boolean expression>
    <block>
End While

Or in BNF:

<whilecommand>  ::= "while" <booleanexpression>
                        <block>
                    <endcommand>

All we have to do is generate and emit a label (the startpoint) before parsing the boolean expression, and emit a jump to that when the block ends. We also need to generate a label for the end of the loop (the endpoint). We will emit a jump to that if the boolean expression evaluates to false. We will emit the endpoint label itself when the block ends, after the jump to the startpoint.

Add the following to Commands.vb:

Private Function ParseWhileCommand() As ParseStatus
    Dim result As ParseStatus

    SkipWhiteSpace()
    If EndOfLine Then
        result = CreateError(1, "a boolean expression")
    Else
        ' Generate and emit the startpoint
        Dim startpoint As Integer = m_Gen.DeclareLabel()
        m_Gen.EmitLabel(startpoint)

        ' Parse the Boolean expression
        result = ParseBooleanExpression()

        If result.Code = 0 Then
            ' There should be nothing else on the line
            If Not EndOfLine Then
                result = CreateError(1, "end of statement")
            Else
                ' Generate endpoint, and emit a conditional
                ' jump to it
                Dim endpoint As Integer = m_Gen.DeclareLabel()
                m_Gen.EmitBranchIfFalse(endpoint)

                ' Parse the block
                Dim whileblock As New Block( _
                        "while", _
                        startpoint, _
                        endpoint _
                )
                result = ParseBlock(whileblock)

                If result.Code = 0 Then
                    ' Emit jump back to startpoint
                    m_Gen.EmitBranch(whileblock.StartPoint)

                    ' Emit endpoint
                    m_Gen.EmitLabel(whileblock.EndPoint)
                End If
            End If
        End If
    End If

    Return result
End Function

Easy, wasn't it? To test this, add While to the list of commands, compile and run. Test it with this famous program:

Int i:=99

While i>0
    Print i
    Print "bottles of beer on the wall,"
    Print i
    Print "bottles of beer."
    Print "Take one down and pass it around,"

    If i>1 Then
        Print i-1
        Print "bottles of beer on the wall."
    Else
        Print "No more bottles of beer on the wall."
    End If

    i = i - 1

    Print ""
End While

Print "No more bottles of beer on the wall."
Print "No more bottles of beer..."
Print "Go to the store and buy some more..."
Print "99 bottles of beer."

The SIC language could do with a way of combining strings and numbers in a single Print statement, but other than that, things should work fine.

Next, try this:

var i int

while i<5
    if i<2 then
        i = i + 2
    else
        i = i + 1
    end while
end if

You should get an accurate error message. Our block infrastructure takes care of the fact that blocks need to be nested properly.

Parsing the Exit While and Continue While Commands

As described in the Goal section of this chapter, the Exit While statement will cause an immediate exit from the While block, and the Continue While will cause a jump to the start of the While block, which means that the condition will be re-evaluated.

At first glance, they seem simple to implement. For Exit While, just emit a jump to the endpoint of the current block. For Continue While, emit a jump to the startpoint. Assuming that these statements are only valid inside a While block, that's quite simple to implement, right?

Well, there is one complication. In practical use, an Exit While or a Continue While will always be conditional (think about this for a minute). And conditional, in the SIC language thus far, means an If block. See the problem? When we meet an Exit While or a Continue While, the current block will be an If block, and thus, we do not have access to the While block's startpoint or endpoint.

This is why our BlockStack class has a method called GetClosestOuterBlock. Given a block type, it will walk the stack searching for a matching block, starting from the current block and going down. If one is found, it will be returned, and then we can use that block's start and end points.

Let's think ahead a little bit. If we create more loop constructs, we will need Exit and Continue structures for them too. And the logic would be the same. So let's build these commands in a generic fashion.

First, a little plumbing. Add the following to Commands.vb, in the "Fields" and "Helper Functions" regions as appropriate:

Private m_loopTable As List(Of String)

Private Sub AddLoop(loopName As String)
    m_loopTable.Add(loopName)
End Sub

Private Function IsValidLoop(loopName As String) As Boolean
    Return m_loopTable.Contains(loopName)
End Function

Private Sub InitLoops()
    m_loopTable = New List(Of String)

    ' Add loops here
    AddLoop("while")
End Sub

This sets up a lookup table for loops, same as commands and types. We need to call the InitLoops method from the constructor of the Parser class. Modify it in Parser.vb as follows:

Public Sub New( _
    ByVal newStream As TextReader, _
    ByVal newGen As CodeGen _
    )

    m_InputStream = newStream
    m_Gen = newGen

    InitTypes()
    InitLoops()
    InitCommands()
End Sub

With that support system in place, we can now write the commands. Add the following to Commands.vb:

Private Function ParseLoopControlCommand() As ParseStatus
    Dim result As ParseStatus

    ' The current token is either Exit or Continue
    Dim cmdName As String = CurrentToken.ToLowerInvariant()

    ' Read the Exit loop type
    SkipWhiteSpace()
    ScanName()

    Dim loopkind As String = CurrentToken.ToLowerInvariant()

    If Not IsValidLoop(loopkind) Then
        result = CreateError(1, "a valid loop type")
    Else
        ' Try to get the block from the stack
        Dim loopBlock As Block = _
                m_BlockStack.GetClosestOuterBlock(loopkind)

        If loopBlock Is Nothing Then
            result = CreateError(6, cmdName & " " & loopkind)
        Else
            result = CreateError(0, "Ok")

            If cmdName = "exit" Then
                ' Emit jump to EndPoint
                m_Gen.EmitBranch(loopBlock.EndPoint)
            ElseIf cmdName = "continue" Then
                ' Emit jump to StartPoint
                m_Gen.EmitBranch(loopBlock.StartPoint)
            Else
                result = CreateError(6, cmdName)
            End if
        End If
    End If

    Return result
End Function

This time, we have written a single parser that will cause slightly different results depending on the command being parsed. We do this because the only difference between Exit and Continue is the point to emit the jump to. The logic, though, is pretty straightforward. The checks made here will also prevent us from compiling a statement like Exit If!

We will hook up this common parser to both Exit and Continue commands by modifying InitCommands. Make the change in Commands.vb as follows:

Private Sub InitCommands()
    m_commandTable = New Dictionary(Of String, CommandParser)

    ' Add commands here
    AddCommand("print",     AddressOf ParsePrintCommand)
    AddCommand("rem",       AddressOf ParseRemCommand)
    AddCommand("comment",   AddressOf ParseCommentCommand)
    AddCommand("end",       AddressOf ParseEndCommand)
    AddCommand("dim",       AddressOf ParseDimCommand)
    AddCommand("var",       AddressOf ParseDimCommand)
    AddCommand("if",        AddressOf ParseIfCommand)
    AddCommand("else",      AddressOf ParseElseCommand)
    AddCommand("elseif",    AddressOf ParseElseIfCommand)
    AddCommand("while",     AddressOf ParseWhileCommand)
    AddCommand("exit",      AddressOf ParseLoopControlCommand)
    AddCommand("continue",  AddressOf ParseLoopControlCommand)
End Sub

Testing the While Loop

Compile and run. Test with this:

int i:=1

while i<=10

    if [i==3] then
        i:=i+2
        continue while
    elseif [i=8]
        exit while
    end if

    print i
    i := i+1
end while

print "Done"

and then with this:

int i:= 30

while i>1
    int j:=10

    print i
    print "-------"

    while j>0
        if j == 5
            j = j - 5
            continue while
        end if

        print j
        j = j - 1
    end while

    print "======="

    i = i / 15
end while

Try introducing errors, such as putting an Exit While or a Continue While outside a While loop. As usual, an accurate error message will be produced by the compiler.

Conclusion

To quote Dr. Jack Crenshaw, "We could stop right here, and have a language that works." Our two constructs, If and While, are enough to take care of any iteration and selection cases that are required in a language like ours. However, most languages provide some more constructs, and so will we. In the next chapter, we will look at some more loop and branch constructs. See you then.


1. This is a simplification. The offset is actually calculated in bytes, starting from the start of the instruction following the jump instruction itself. To properly calculate the offset, we need to know how many bytes are taken up by each instruction and its parameters. Fortunately, because we use the Reflection Emit library to generate CIL, we don't have to calculate the offset ourselves.
2. I first discovered this omission in 2004. In October 2019, I forked the .NET documentation repository on GitHub, added the missing information, and generated a pull request. It's been accepted and merged. The results can be seen here.

results matching ""

    No results matching ""