verification.tex

\section{Simulating and formally verifying programs}
\label{sec:Verification}

We have developed the types representing the IAM machine and built the facilities
to construct IAM programs. Now we will assign the semantics to every instruction
of the IAM instruction set in terms of how they alter the state of the machine.
That will enable us to simulate the execution of programs and verify their properties
by means of symbolic computation using the Haskell SBV library~\cite{SBV}.
Note that SBV's design allows us to reuse the same Haskell implementation for
both simulation and verification parts of the framework. We achieve this unified
implementation by using symbolic execution with~\emph{literal} (e.g. non-symbolic)
values to simulate programs.

In this section, we will continue to use the example of array summation.

\begin{figure}[H]
\begin{minted}{haskell}
sumArray :: Script
sumArray = do
    load r0 0       -- sum := 0, sum accumulator
    load r2 254     -- i := n,   loop counter
    loop <- label
    store r2 255    -- store the value of i to use it
                    -- for accessing the array
    loadMI r1 255   -- load a[i]
    store r1 254    -- put a[i] to cell 254
    add r0 254      -- sum := sum + a[i]
    add r2 253      -- i := i - 1
    jumpZero 1      -- if i == 0 then halt
    goto loop       -- else continue the loop
    halt
\end{minted}
\caption{Sum an array of numbers}
\label{arraySum}
\end{figure}

Firstly, we will simulate the execution of the program considering the concrete
values to retrieve the programs final state. Then, we will verify that the program
indeed calculates the sum and does it in a ahead-known amount of clock ticks.

\subsection{Program simulation}

The program execution simulator currently provides only a limited functionality
of supplying an initial state of the machine, executing the program up to a
desired amount of state transitions, and observing the final state.

To construct an initial state, the user needs to supply an assembly program and
a memory layout. Then the~\mintinline{haskell}{runModel} needs to be called
with two parameters: a maximum number a steps and an initial state.

\begin{figure}[H]
\begin{minted}{haskell}
let mem = initialiseMemory
    [(1, 1), (2, 2), (3, 3), (253, -1), (254, 3)]
    finalState = runModel 1000 $ templateState sumArray mem
\end{minted}
\caption{Sum an array of numbers}
\end{figure}

Since the memory contains only concrete values, the~\mintinline{haskell}{runModel}
function performs the simulation~--- a sequence of state transitions. To retrieve
the values we are interested in, we need to access the part of the specific components
of the final state. In this case, the program runs for 93 clock cycles, yielding
the result --- a number 6 --- in the register~\mintinline{haskell}{r0}.

The program simulator needs further development. The next goal is to build an
interactive turn-based simulator displacing the current state of the machine
on every step.

\subsection{Program verification}

At this point, we have an assembly language to write programs for IAM machine
and a model of machine execution. One of the main objectives of the whole
developments is to be able to use a SAT/SMT solver to formally verify properties
of IAM programs. Using the SBV framework, we can formulate theorems about IAM programs,
which then get translated into SMT-LIB input/output language~\cite{smtlib} for
SMT solvers and are automatically checked for satisfiability.

As an example, we would like to prove the following theorem of the earlier considered
summation program (Fig.~\ref{arraySum}): given a fixed sequence of any numbers located in memory,
the program indeed calculates their sum. Note that we do not track integer overflow.

We define the theorem as a Haskell function parametrised by the length of a sequence.
Obviously, the proposition does not make sense for all integer numbers, but that is not our
consideration at this point, because currently we do not quantify on the length of the
sequence.

\begin{minted}{haskell}
theoremSumArray :: Int -> SBool
theoremSumArray n = do ...
\end{minted}

Now, inside the theorem environment, we universally quantify on a
sequence of free variables.

\begin{minted}{haskell}
   summands <- mkForallVars n
\end{minted}

Then we are putting the newly generated symbolic variables into memory. We also
initialise the loop counter (cell 254) and a constant to decrement the counter.

\clearpage

\begin{minted}{haskell}
    let memory = initialiseMemory $
            [(0, 0)]
         ++ zip [1..] summands
         ++ [ (253, -1)
            , (254, fromIntegral n)
            ]
\end{minted}

Finally, we can run the model and formulate the theorem statement.

\begin{minted}{haskell}
    let finalState = runModel 1000 $ templateState sumArray memory
        result = readArray (registers finalState) 0
    pure $ result .== sum summands &&& clock finalState .< 10000
\end{minted}

In order to ask SBV to communicate with an SMT solver to check the property,
we run the~\mintinline{haskell}{prove} function with a concrete length of a sequence.

\begin{minted}{haskell}
prove (theoremSumArray 252)
\end{minted}

As a result, we would have a dump of SMT-LIB communication with the solver and
a theorem result. On a regular workstation this theorem got translated into about
750 SMT-LIB clauses and proved in 250 milliseconds.

This simple example shows that with the developed verification framework we can
check properties of sophisticated-enough programs. However, a more advanced
property to check would be one that the program finds the correct sum for every
sequence of numbers of length from zero to two hundred twenty two. Universally
quantifying on the length of the sequence would involve dealing with
the~\emph{symbolic termination} problem. In our case, this problem should be solvable,
because the memory is finite, thus the program will terminate in a finite number of
steps and the termination would be determined by an external non-symbolic step counter.
But, unfortunately, the current state of implementation does not allow to check such a
property. Support for theorems of the described kind is the subject of ongoing work.