Foreword. This is my first blog post about protocol specification since I have left Informal Systems. Having a bit of free time on my hands, I want to show everyone how to think about code and protocols with specification languages such as TLA+ and Quint as well as with tools such as Apalache and TLC. There are no patented methods behind. What I am explaining is common knowledge in computer-aided verification and model checking. As is the case with programming languages, you just need to practice and get a bit of expertise in using these tools. Obviously, I am biased towards Quint and Apalache. I believe both projects are great, but it’s not my job to sell them to you :cocktail:.

1. The Story

In this blog post, I am going to tell you a story about how I realized that the Quint simulator had a subtle bug in its pseudorandom number generator, or just PRNG. If you have not heard about Quint yet, check its landing page. In short, Quint is a language and a set of tools for specifying and checking distributed protocols, which is very much inspired by the logic of $\tla{}$. Personally, I am using Quint as replacement of pen and paper, whenever I have to think about tricky code and protocols. Before Quint, I was using $\tla{}$, Apalache, and TLC for the same purpose. The PRNG is an example of tricky code that did not fit into my brain. In the best tradition of computer science, I am applying Quint to reason about a small part of Quint itself.

By the way, Quint license is Apache 2.0 :mega:. This is why I feel free talking about its code, and you can do that too :open_hands:.

Going back to the story, Quint has a randomized simulator. This simulator lets you quickly try thousands of specification runs without thinking too much. This is quite convenient for busy developers. If the simulator finds a bug, you know that the specification is broken. If it does not find a bug, you may want to think about running more advanced tools like a model checker. Many people I know stop after making the simulator happy, introduce randomized tests in their Github actions and hope that the simulator finds a bug in the CI one day, if there is a bug. If this sounds a lot like Property-based testing to you, you are completely right. Popularity of tools like Proptest among engineers led us to the idea of introducing the randomized simulator in Quint. Moreover, Vanlightly and Kuppe’22 presented the idea of randomized simulation at the TLA+ Conference 2022. (They did even more than that by evaluating statistical properties.)

1.1. Why is the Quint Simulator Randomized?

Why do we say that the Quint simulator is randomized? After all, many expressions are completely deterministic and do not require any randomization. For instance, have a look at the following Quint expressions:

2 + 3
2.to(10).forall(i => i > 0)
Set(3, 4, 6).map(i => i * 2)

We can easily evaluate these expressions with Quint REPL:

$ quint
Quint REPL 0.17.1
Type ".exit" to exit, or ".help" for more information
>>> 2 + 3
5
>>> 2.to(10).forall(i => i > 0)
true
>>> Set(3, 4, 6).map(i => i * 2)
Set(12, 6, 8)
>>>

We should get the same output in REPL no matter how many times we are evaluating the expressions. Things are getting more interesting when we look at Quint actions, which may read from and write to state variables. For instance:

$ quint
Quint REPL 0.17.1
Type ".exit" to exit, or ".help" for more information
>>> var temperature: int
>>> // declare actions to increase or decrease the temperature
>>> action inc(i) = { temperature' = temperature + i }

>>> action dec(i) = { temperature' = temperature - i }

>>> temperature' = 0
true
>>> temperature
0
>>> inc(2)
true
>>> temperature
2
>>> inc(3)
true
>>> temperature
5

If you are having hard time when you are trying to understand the above REPL session, Quint has a detailed REPL Tutorial.

Our example above is still perfectly deterministic. Now have a look at this continuation of the REPL session:

>>> any { inc(2), dec(3) }
true
>>> temperature
2
>>> any { inc(2), dec(3) }
true
>>> temperature
4
>>> any { inc(2), dec(3) }
true
>>> temperature
6
>>> any { inc(2), dec(3) }
true
>>> temperature
3

As you can see, in some cases inc(2) has been applied, whereas in other cases, dec(3) has been applied. This is due to the semantics of any: It applies one of the actions that may return true (such actions are called enabled). In our example, any is free to choose between inc(2) and dec(3), and we do not have control over how it chooses inc or dec. In other words, any exhibits control non-determinism. The Quint simulator is a relatively simple tool that resolves non-determinism via random choice between the enabled actions. If you know $\tla{}$, the Quint action any { inc(2), dec(3) } would look like this in $\tla{}$:

$inc(2) \lor dec(3)$

Now let’s try something different:

>>> nondet j = 1.to(2^256).oneOf(); inc(j)
true
>>> temperature
104805969134431997431573760118265476950429701228807521497918427101053555018189
>>> nondet j = 1.to(2^256).oneOf(); inc(j)
true
>>> temperature
116861016918330452735960921129257808479973548559971540977669281888148973858288

This time, we have used the nondet-binding, which exhibits data non-determinism. As you have probably guessed, the simulator has simply picked two random numbers in the range from 1 to $2^{256}$. If you know $\tla{}$, the above nondet-binding could be written as follows in $\tla{}$:

$\exists j \in 1..2^{256}: inc(j)$

1.2. How to Pick Random Numbers?

The title of this section looks silly. Every programming language has a library for generating random numbers, right? Well, we have a few specific requirements:

  1. We want our experiments to be reproducible. Hence, we need not so random “random” numbers, that is, we need pseudo-random numbers as in PRNG. PNRGs are deterministic: If you know the seed, you should be able to reproduce an experiment. Random number generators that read mouse movements or something similar are not good fit for this job. They are great for cryptography algorithms, but not for simulations :game_die:.

  2. As the simulator is exercising multiple runs, it would be great, if we could read the current state of the PRNG and use it as a seed later.

  3. We need a PRNG that is able to generate big integers, that is, we can give a PRNG a bound for every integer to be generated, but we cannot give it an ultimate bound on all integers to be generated.

Luckily for us, Widynski’22 have published their paper on “Squares: A Fast Counter-Based RNG”. This generator ticks off our requirements (1) and (2). Unfortunately, it only produces 32-bit integers. So we are left with (3) as a take-home exercise. I thought that it was an easy exercise, and this led to accidental introduction of a bug :beetle:

1.3. PRNG for Big Integers

We have implemented the generator square64 from the paper by Widynski’22. It produces up to $2^{64}$ pseudo-random 32-bit integers. You can check the square64 code. This code is a simple adaptation of the original C code from the paper. Since Quint is written in TypeScript, our implementation does not seem to be very efficient. We could have implemented it in AssemblyScript or Rust instead. However, square64 did not seem to be a performance bottleneck to us.

Finally, here is my code in TypeScript that implements a PRNG for big integers by calling square64:

export const newRng = (initialState?: bigint): Rng => {
  let state: bigint = initialState ?? BigInt(Math.floor(Math.random() * Number.MAX_SAFE_INTEGER))
  // ...
  return {
    // 2^32 as bigint
    const U32: bigint = 0x100000000n
    // 2^64 as bigint
    const U64: bigint = 0x10000000000000000n
    // Given a bound and the current state, produce a big integer below the bound.
    next: (bound: bigint): bigint => {
      assert(bound > 0n)
      let input: bigint = bound
      let output: bigint = 0n
      let base: bigint = 1n
      while (input >= U32) {
        // produce pseudo-random least significant 32 bits,
        // while shifting the previous output to the left
        output = output * U32 + squares64(state)
        // advance the RNG state, while staying within 64 bits
        state = (state + 1n) % U64
        // forget the least significant 32 bits of the input
        input /= U32
        // shift the base by 32 bits to the left
        base *= U32
      }
      // Now we have to be careful to make `output` in the range [0, bound).
      // Produce 32 bits. Since we are using the modulo operator here,
      // the result is not exactly the uniform distribution.
      // It may be biased towards some values, especially for
      // the small values of `bound`. If this becomes a problem in the future,
      // we should figure out, how to make the distribution uniform.
      output = (squares64(state) % input) * base + output
      // advance the RNG state, while staying within 64 bits
      state = (state + 1n) % U64
      return output
    },
  }
}

As you can see, the code is heavily documented. When I was writing it, it looked obvious to me. Recently, I had a fresh look at it, and something seemed off. What worried me is that the variable bound was not taken into account inside the loop at all. My first hypothesis was that this could produce output larger than bound. Obviously, I was concerned about this property before, and this is why rng.test.ts has a number of tests that check exactly this property. Moreover, we could test the main property of next as an assertion just before the return statement:

    { // ...
      assert(output < bound)
      return output
    }

Such an assertion at the end of a function is usually called a postcondition. If you noticed the statement assert(bound > 0n) in the beginning of the function, this is our precondition. Typically, verification tools introduce domain-specific languages for writing pre- and postconditions.

We already have randomized tests. They periodically run in the CI, but they do not catch any bugs. However, I still do not trust that my code is doing what it is supposed to do. What shall we do here? Twenty years ago, my standard answer would be: Get a piece of paper and a cup of coffee, think about the code. Can we do better, having all these computers around? Of course, it is the year 2024. Yes, I have tried ChatGPT on this code. It did not help me much. Actually, it lied in my face :smile:.

At that point, I did not want to go into the code verification tools. If I wanted to do so, Dafny would be my usual suspect.

Of course, my first thought was: Can we use Quint to check the generator logic? :innocent:

To be continued…

Continue reading in Part 2 of this blog post.


Do you want to receive notifications when I write something new? Subscribe to the newsletter. New blog posts are going to be announced once per week (maybe twice, if I really have something!).

Powered by Buttondown.