A tutorial quantum interpreter in 150 lines of Lisp

By Robert Smith

Simulating a universal, gate-based quantum computer on a classical
computer has many uses and benefits. The top benefit is the ability to
inspect the amplitudes of the system’s state directly. However, while
the mathematics is very well understood, implementing a
general-purpose simulator has largely been folk knowledge. In this
tutorial, we show how to build an interpreter for a general-purpose
quantum programming language called $mathscr{L}$, capable of
executing most kinds of quantum circuits found in literature. It is
presented economically, allowing its implementation to take fewer than
150 lines of self-contained Common Lisp code. The language
$mathscr{L}$ is very simple to extend, making the interpreter ripe
for testing different kinds of behavior, such as noise models.

Introduction

Simulating the workings of an ideal quantum computer has many
important applications, such as algorithms research and quantum
program debugging. A variety of quantum computer simulators exist,
both free and commercial. However, while the concept of the simulation
of quantum computers is generally well understood at a high level, the
devil is in the details when it comes to implementation.

Quantum computer simulators found in the wild often have many
limitations. The most prevalent limitation is the number of qubits an
operator can act on. Usually, one-qubit gates and controlled
one-qubit¹ gates are allowed, but nothing more. While these
together are sufficient for universal quantum computation, it leaves
much to be desired when studying quantum algorithms.

In this post, we present an implementation of a fully general quantum
programming language interpreter, allowing measurement as well as
arbitrary unitary operators on an arbitrary number of arbitrarily
indexed qubits. The implementation weighs in at under 150 lines²
of code in Common Lisp, though the ideas make implementation simple in
other languages as well. All of the code from this tutorial can be
found on
GitHub.

This tutorial is aimed at a quantum computing beginner who has some
familiarity with the fundamentals of linear algebra and computer
programming. Beyond those subjects, this tutorial is relatively
self-contained. We also aim this tutorial at practitioners of quantum
computing, who are interested in the brass tacks of simulation, with
all of the details filled out. To such practitioners, the bulk of this
document will be easy to skim, since we recapitulate topics such as
qubits and unitary operators.

A note about Common Lisp

We use Common Lisp, because it is an excellent platform for both
exploratory and high-performance computing. One of the fastest and
most flexible quantum simulators out there, the Quantum Virtual
Machine, is written entirely in
Common Lisp.

We wrote this article so that it would be easy to follow along with a
Common Lisp implementation. The code has no dependencies, and should
work in any ANSI-compliant implementation (I hope).

With that said, this article was also written with portability in
mind. Since no especially Lisp-like features are used, the code should
be easy to port to Python or even C. At minimum, your language should
support complex numbers and arrays.

A note to experienced quantum computing practitioners

This section is written for experienced practitioners of quantum
computing who happened upon this post, and can be skipped.

In this post, we opt to simulate a quantum circuit the “Schrodinger”
way, that is, by evolving a wavefunction explicitly. For a circuit of
width $n$, we walk through the mathematics of how to interpret a
$k$-qubit gate $g in mathsf{SU}(2^k)$ for $kle n$, specified to act
on a $k$-tuple of numbered qubits—corresponding to each qubit’s
position in the tensor product which forms the Hilbert space of the
system—as a full operator $g’inmathsf{SU}(2^n)$. We do this by
providing an explicit construction of the matrix in the computational
basis of the system.

An alternative approach would have been to describe the action of a
$g$ on an $n$-qubit wavefunction by way of careful manipulation of
indexes, i.e., to effectively permute and partition our wavefunction
into $2^{n-k}$ groups of $2^k$-dimensional vectors corresponding to
the subsystem of qubits being operated on. The major benefit of this
approach is efficiency.

As a first introduction to a computer science graduate, I find this
explanation lacking in two ways:

It under-emphasizes that a gate like $mathsf{CNOT}$, typically
written as a $4times 4$ matrix $mathsf{I}oplusmathsf{X}$, in a
quantum circuit truly is a linear operator on the Hilbert space of
the entire system. “It’s just linear algebra; here’s the matrix and
here’s the vector” is a point I want to drive home.
It requires significant labor to both explain and prove the
correctness of the method, without having significant experience with
tensor algebra, contractions, Einstein notation, and so on.

The approach of this post can be used as a basis to follow up with
more efficient techniques, without relinquishing a strong mathematical
foundation. We are very careful to not be hand-wavy, and to not
conflate the different vector spaces at play. We hope that you’ll find
this approach agreeable, even if it sacrifices some efficiency.

The Language $mathscr{L}$

We wish to construct an interpreter for a small quantum programming
language named $mathscr{L}$. This language supports
both of the fundamental operations of a quantum computer: gates and
measurements.

A gate is an operation that modifies a quantum state. (What a
quantum state is exactly we will delve into later.) Because quantum
states are large compared to the physical resources used to construct
them, gates represent the “powerful” operations of a quantum
computer.

A measurement is an observation and collapse of the quantum state,
producing one bit (i.e., $0$ or $1$) of classical information per
qubit. Measurements represent the only way in which one can extract
information from our simulated quantum computer, and indeed, in most
programming models for real quantum computers.

In some sense, one might think of the language $mathscr{L}$ as the
simplest non-trivial quantum programming language. A program in
$mathscr{L}$ is just a sequence of gates and measurements. The syntax
is as follows:

Non-Terminal		Defintion
program	:=	`(` instruction* `)`
instruction	:=	`(` `GATE` matrix qubit+ `)`
	\|	`(` `MEASURE` `)`
matrix	:=	a complex matrix `#2A(` … `)`
qubit	:=	a non-negative integer

Spaces and newlines are ignored, except to delimit the tokens of our
language.

We borrow Common Lisp’s two-dimensional array syntax for the syntax of
matrices. In Common Lisp, the matrix $left(begin{smallmatrix}1 &
2\3 & 4end{smallmatrix}right)$ is written #2A((1 2) (3 4)). We
also borrow the syntax for complex numbers: $1-2i$ is written #C(1 -2).

An example program might be one to construct and subsequently measure
two qubits labeled 2 and 5 in a Bell state configuration:

(
 (GATE #2A((0.70710677 0.70710677) (0.70710677 -0.70710677)) 2)
 (GATE #2A((1 0 0 0) (0 1 0 0) (0 0 0 1) (0 0 1 0))          2 5)
 (MEASURE)
)

We will model the semantics of $mathscr{L}$ operationally, by way of an abstract machine. The abstract machine for $mathscr{L}$ is called $M_n$, where $n$ is a positive but fixed³ number of qubits. The state of the machine $M_n$ is the pair $(v, b)$ where $v$ is a quantum state, and $b$ is an $n$-bit measurement register.

The quantum state is an element of the set

$${Vert vVert=1mid vinmathbb{C}^{2^n}}.$$

In other words, $v$ is a unit vector of dimension $2^n$ over the
complex numbers. We will discuss this from first principles in the
next section.

The measurement register is an element of the set ${0,1}^n$, i.e.,
a sequence of $n$ bits, which we realize as a non-negative
integer. The $k$th least-significant bit of this integer represents
the last observation of the qubit numbered as $k$. We will discuss
this in detail as well.

In Common Lisp, it suffices to create a structure machine which holds these two pieces of state.

(defstruct machine
  quantum-state
  measurement-register)

Typically, the machine is initialized with each classical bit in the
measurement register $0$, and each qubit starting in the
zero-state. (However, for the purposes of algorithm study or
debugging, the machine may be initialized with any valid state.)

The precise way in which the language $mathscr{L}$ is interpreted on
$M_n$ is what we describe in this tutorial. Before that, however, we
find it most important to describe what exactly a quantum state is,
and how to represent it on a computer.

The Quantum State

Where does one qubit live?

Quantum computers are usually just a collection of interacting computational elements called qubits. A single qubit has two distinguished states: $ket{0}$ and $ket{1}$. If the qubit has a name like $q$, then we label the states $ket{0}_q$ and $ket{1}_q$.

The funny notation is called Dirac notation or braket notation. It happens to be a convenient notation for doing calculations in quantum mechanics, and we just use it for consistency with other texts. The ket $ket{cdots}$, as a physicist would call it, doesn’t add any special significance, except to denote that the quantity is a vector. One can actually put anything inside the brackets. In usual linear algebra, one often writes $mathbf{e}_i$ to denote a basis vector, where in quantum mechanics, one just writes the subscript in a ket $ket{i}$, dropping the $mathbf{e}$ entirely. If the notation throws you off, and you’d like to think in more traditional written linear algebra notation, you can always replace $ket{x}$ with $vec x$, and you’ll be safe.

These distinguished states $ket{0}$ and $ket{1}$ are understood to be orthonormal basis vectors in a vector space whose scalars are complex numbers $mathbb{C}$. As such, a qubit can be $ket{0}$, $ket{1}$, or a superposition $alphaket 0 + betaket 1$, where $alpha$ and $beta$ are complex numbers. The numbers $alpha$ and $beta$ are called probability amplitudes, because $vertalphavert^2$ (resp. $vertbetavert^2$) represent the probability of the qubit being observed in the $ket 0$ (resp. $ket 1$) state. Since they represent probabilities, there’s an additional constraint, namely that the probabilities add to one: $vertalphavert^2 + vertbetavert^2=1$.

To those unfamiliar, it may not be obvious why we’ve opted to use the
language of linear algebra. Why do we consider a qubit as being a
linear combination? Why do we suppose that the observable states are
orthonormal vectors? Why can’t we simply say that a qubit is just a
pair of complex numbers and move on?

The reason for this is scientific, and not mathematical. It turns out that the best theory of quantum mechanics we have is one which describes transformations between states as being linear. In fact, the evolution of a quantum mechanical system is not only described by an operation that is just linear, but also reversible. These conditions—linear, reversible, and length-preserving—give rise to a special class of transformations called unitary operators, which naturally lead us to the discussion of vector spaces over complex numbers⁴.

We will discuss the nature of these operations in more depth when we consider how to implement gates later on. For now, however, it’s sufficient to think of a qubit named $q$ as something that lives in a complex, two-dimensional vector space, which we will call $$B_q := operatorname{span}_{mathbb{C}}{ket 0_q, ket 1_q}.$$ (We will use this $B_q$ notation a few times throughout this tutorial. Remember it!) We also understand that this space is equipped⁵ with a way to calculate lengths of vectors—the usual norm

$$
leftVertalphaket{0}+betaket{1}rightVert = sqrt{vertalphavert^2+vertbetavert^2}.
$$

Many qubits

Roughly speaking, a single qubit can be described by two
probabilities. How do we deal with more?

Suppose we have two qubits named $X$ and $Y$. As a pair, quantum
mechanics tells us that they can interact. Practically, what
that means is that their states can be correlated in some way. If
they’ve interacted, knowing information about $X$ might give us a clue
about what $Y$ might be. One well-known example of this is the
Bell state, which can be summarized as follows:

Qubit $X$	Qubit $Y$	Prob. Amp.	Probability
$ket 0_X$	$ket 0_Y$	$1/sqrt{2}$	$50%$
$ket 0_X$	$ket 1_Y$	$0$	$0%$
$ket 1_X$	$ket 0_Y$	$0$	$0%$
$ket 1_X$	$ket 1_Y$	$1/sqrt{2}$	$50%$

Here, we have an example of a non-factorizable state; qubits $X$ and $Y$ are correlated to each other dependently. If we know $X$ is in the $ket 0_X$ state, then we necessarily know that $Y$ is in the $ket 0_Y$ state. Such a correlation means it’s not possible to express the probabilities independently. It might be tempting to think that one can simply think of $X$ having a $50%$ probability of being in either basis state, and $Y$ having a $50%$ probability of being in either state—facts which are certainly true—but considering those independently would give us a different distribution of probabilities of the system:

Qubit $X$	Qubit $Y$	Probability
$ket 0_X$	$ket 0_Y$	$P(X=ket 0_X)P(Y=ket 0_Y)=25%$
$ket 0_X$	$ket 1_Y$	$P(X=ket 0_X)P(Y=ket 1_Y)=25%$
$ket 1_X$	$ket 0_Y$	$P(X=ket 1_X)P(Y=ket 0_Y)=25%$
$ket 1_X$	$ket 1_Y$	$P(X=ket 1_X)P(Y=ket 1_Y)=25%$

This state is called factorizable because we can express each
probability as a product of probabilities pertaining to the original
qubits, i.e., each probability has a form that looks like
$P(X)P(Y)$. Note that here, knowing something about $X$ gives us no
information about $Y$, since they’re completely independent. With that
said, it should be emphasized that factorizable states are perfectly
valid states, but they don’t represent the entirety of possible
states.

If qubits $X$ and $Y$ live in the linear spaces $B_X$ and $B_Y$ respectively, then the composite space is written $B_Xotimes B_Y$. This is called a tensor product, which is a way to combine spaces with the above structure. Formally, if we have an $m$-dimensional vector spaces $V:=operatorname{span}{v_1,ldots,v_m}$ and an $n$-dimensional vector space $W:=operatorname{span}{w_1,ldots,w_n}$, then their tensor product $T:=Votimes W$ will be an $mn$-dimensional vector space $operatorname{span}{t_1,ldots,t_{mn}}$, where each $t_i$ is a formal combination of basis vectors from $V$ and $W$. (There are of course $mn$ different combinations of $v$’s and $w$’s.) To give an example without all the abstraction, consider $V$ with a basis ${vec x, vec y, vec z}$ and $W$ with a basis ${vec p, vec q}$. Then $Votimes W$ will have a basis

$$
left{
begin{array}{lll}
vec xotimesvec p, & vec yotimesvec p, & vec zotimesvec p, \
vec xotimesvec q, & vec yotimesvec q, & vec zotimesvec qhphantom{,}
end{array}
right}.
$$

An example vector in the space $Votimes W$ might be

$$
-i(vec xotimesvec p) – 2(vec yotimesvec p) + 3 (vec zotimesvec p) +
frac{1}{4}(vec xotimesvec q) – sqrt{5}(vec yotimesvec q) + e^{6pi}(vec zotimesvec q),
$$

assuming these vector spaces are over $mathbb{C}$.

Intuitively, a tensor product “just” gives us a way to associate a number with each possible combination of basis vector. In our case, we need to associate a probability amplitude with each combination of distinguished qubit basis states. We need this ability since—as we’ve established—we need to consider every possible holistic outcome of a collection of qubits, as opposed to the outcomes of the qubits independently. (The former constitute both factorizable and non-factorizable states, while the latter only include factorizable states.)

Bit-String notation and a general quantum state

If we have qubits $X$, $Y$, and $Z$, then they’ll live in the space $B_Xotimes B_Yotimes B_Z$, which we’ll call $Q_3$. It will be massively inconvenient to write the basis vectors as, for example, $ket 0_Xotimes ket 1_Yotimesket 1_Z$, so we instead use the shorthand $ket{011}$ when the space has been defined. This is called bit-string notation. A general element $ketpsi$ of $Q_3$ can be written $$psi_0ket{000}+psi_1ket{001}+psi_2ket{010}+psi_3ket{011}+psi_4ket{100}+psi_5ket{101}+psi_6ket{110}+psi_7ket{111}.$$ There are two substantial benefits from using bit-string notation. These benefits are much more thoroughly explained in this paper—which was a precursor to this very blog post.

The first benefit is that the names of the qubits—$X$, $Y$, and $Z$—have been abstracted away. They’re now just positions in a bit-string, and we can canonically name the qubits according to their position. We record positions from the right starting from zero, so $X$ is in position $2$, $Y$ is in position $1$, and $Z$ is in position $0$.

The second benefit is one relevant to how we implement quantum states on a computer. As written, the probability amplitude $psi_i$ has an index $i$ whose binary expansion matches the bit-string of the basis vector whose scalar component is $psi_i$. This is no accident. The main outcome of this is that we can use a non-negative integer as a way of specifying a bit-string, which also acts as an index into an array of probability amplitudes. So for instance, the above state can be written further compactly as $$ketpsi=sum_{i=0}^7psi_iket i.$$ Here, $ket i$ refers to the $i$th bit-string in lexicographic (“dictionary”) order, or equivalently, the binary expansion of $i$ as a bit-string.

Since qubits live in a two-dimensional space, then $n$ qubits will live in a $2^n$-dimensional space. With a great deal of work, we’ve come to our most general⁶ representation of an $n$-qubit system: $$sum_{i=0}^{2^n-1}psi_iket i,$$ where $vertpsi_ivert^2$ gives us the probability of observing the bit-string $ket i$, implying $$sum_{i=0}^{2^n-1}vertpsi_ivert^2=1.$$

On a computer, representing a quantum state for an $n$-qubit system is simple: It’s just an array of $2^n$ complex numbers. An index $i$ into the array represents the probability amplitude $psi_i$, which is the scalar component of $ket{i}$. So, for instance, the state $ket{000}$ in a 3-qubit system is represented by an array whose first element is $1$ and the rest $0$. Here is a function to allocate a new quantum state of $n$ qubits, initialized to be in the $ket{ldots 000}$ state:

(defun make-quantum-state (n)
  (let ((s (make-array (expt 2 n) :initial-element 0.0d0)))
    (setf (aref s 0) 1.0d0)
    s))

Sometimes, given a quantum state, or even an operator on a quantum
state, we will want to recover how many qubits the state represents,
or the operator acts on. In both cases, the question reduces to
determining the number of qubits that a dimension represents. Since
our dimensions are always powers of two, we need to compute the
equivalent of a binary logarithm. In Common Lisp, we can compute this
by computing the number of bits an integer takes to represent using
integer-length. The number $2^n$ is always a 1 followed by $n$
0’s, so the length of $2^n$ in binary is $n+1$.

(defun dimension-qubits (d)
  (1- (integer-length d)))

Evolving the quantum state

Since the quantum state is a vector, the principal way we change it is
through linear operators represented as matrices. As our quantum
program executes, we say that the quantum state
evolves. Matrix–vector multiplication is accomplished with
apply-operator and matrix–matrix multiplication is accomplished
with compose-operators. There is nothing special about these
functions; they are the standard textbook algorithms.

(defun apply-operator (matrix column)
  (let* ((matrix-size (array-dimension matrix 0))
         (result (make-array matrix-size :initial-element 0.0d0)))
    (dotimes (i matrix-size)
      (let ((element 0))
        (dotimes (j matrix-size)
          (incf element (* (aref matrix i j) (aref column j))))
        (setf (aref result i) element)))
    (replace column result)))

(defun compose-operators (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (let* ((l (array-dimension B 1))
           (result (make-array (list m l) :initial-element 0)))
      (dotimes (i m result)
        (dotimes (k l)
          (dotimes (j n)
            (incf (aref result i k)
                  (* (aref A i j)
                     (aref B j k)))))))))

These functions will sit at the heart of the interpreter, which will
be elaborated upon in the section about gates.

Measurement

Already, through the construction of our quantum state, we’ve
discussed the idea that the probability amplitudes imply a probability
of observing a state. Measurement then amounts to looking at a quantum
state as a discrete probability distribution and sampling from it.

Measurement in quantum mechanics is side-effectful; observation of a
quantum state also simultaneously collapses that state. This means
that when we measure a state to be a bit-string, then the state will
also become that bit-string, zeroing out every other component in the
process.

We thus implement the process of measurement in two steps: The
sampling of the state followed by its collapse.

(defun observe (machine)
  (let ((b (sample (machine-quantum-state machine))))
    (collapse (machine-quantum-state machine) b)
    (setf (machine-measurement-register machine) b)
    machine))

Note that we’ve recorded our observation into the measurement register. We now proceed to define what we mean by sample and collapse.

How shall we sample? This is a classic problem in computer science. If we have $N$ events ${0, 1,ldots,N-1}$, such that event $e$ has probability $P(e)$, then we can sample as follows. Consider the partial sums defined by the recurrence $S(0)=0$ and $S(k)=S(k-1) + P(k-1)$. If we draw a random number $r$ uniformly from $[0,1)$, then we wish to find the $k$ such that $S(k)leq r < S(k+1)$. Such a $k$ will be a sampling of our events according to the imposed probability distribution.

We can implement this simply by computing successive partial sums, until our condition is satisfied. In fact, we can be a little bit more resourceful. We can find when $r-S(k+1)<0$, which amounts to successive updates $rleftarrow r-P(k)$.

With a quantum system, we have $P(ket i) = vertpsi_ivert^2$, and the sampled $k$ is the bit-string $ket k$ we find.

Let’s do an example. Suppose we have a quantum state

$$
sqrt{0.2}ket{00} – sqrt{0.07}ket{01} + sqrt{0.6}ket{10} + sqrt{0.13}ket{11}.
$$

Then our discrete probability distribution is:

$$
P(ket{00}) = 0.2qquad P(ket{01}) = 0.07qquad P(ket{10}) = 0.6qquad P(ket{11}) = 0.13
$$

Next, suppose we draw a random number $r = 0.2436$. We first check if $r < 0.2$. It’s not, so $ket{00}$ is not our sample. Subtract it from $r$ to get $r = 0.0436$. Next check if $r < 0.07$. Yes, so our sample is $ket{01}$. Pictorially, this looks like the following:

A process of selecting a random sample.

The implementation is straightforward:

(defun sample (state)
  (let ((r (random 1.0d0)))
    (dotimes (i (length state))
      (decf r (expt (abs (aref state i)) 2))
      (when (minusp r) (return i)))))

Collapsing to $ket k$ is simply zeroing out the array and setting $psi_k$ to $1$.

(defun collapse (state basis-element)
  (fill state 0.0d0)
  (setf (aref state basis-element) 1.0d0))

Gates

Gates as matrices

Gates are the meat of most quantum algorithms. They represent the
“hard work” a quantum computer does. As previously described, a gate
$g$ is a transformation that is linear, invertible, and
length-preserving.

Linear: $g(aketpsi+bketphi)=ag(ketpsi)+bg(ketphi)$.
Invertible: There is always an operation $h$ that can cancel out the effect of $g$: $h(g(ketpsi))=g(h(ketpsi))=ketpsi$.
Length-Preserving: $Vert g(ketpsi)Vert = VertketpsiVert$.

These ideas are captured by an overarching idea called a linear isometry, which comes from the Greek word isometria, with isos meaning “equal” and metria meaning “measuring”. As with all linear transformations, we can write them out as a matrix with respect to a particular basis. Matrices representing linear isometries are called unitary matrices⁷.

The simplest gate must be identity, a gate which does nothing.

$$
mathsf{I} := begin{pmatrix}
1 & 0\
0 & 1
end{pmatrix}
$$

In Common Lisp, this would be defined as

(defparameter +I+ #2A((1 0)
                      (0 1)))

which we will make use of later. Just a notch higher in complexity
would be the quantum analog of a Boolean “NOT”. This is called the
$mathsf{X}$ gate:

$$
mathsf{X} := begin{pmatrix}
0 & 1\
1 & 0
end{pmatrix}.
$$

This has the effect of mapping $mathsf{X}ket 0=ket 1$, which means
directly that $mathsf{X}ket 1=ket 0$ and therefore it is its own
inverse: $mathsf{X}mathsf{X} = mathsf{I}$ so $mathsf{X}=mathsf{X}^{-1}$.

We suggest re-reviewing how one interprets a matrix as an explicit
mapping of each element of the basis, as it helps make sense of
gates. In this tutorial, gate matrices are always specified in terms
of the bit-string basis

$$
{ket{ldots000}, ket{ldots001}, ket{ldots010}, ket{ldots011}, ldots}.
$$

We again refer the reader to this
paper for an in-depth discussion
about this basis.

In the rest of this section, the whole goal is to be able to apply
gates to our quantum state. There are two cases of pedagogical and
operational interest: the one-qubit gate and the many-qubit gate. We
will write two functions to accomplish each of these, in order to
implement a general function called apply-gate for applying any kind
of gate on any collection of qubits for any quantum state.

(defun apply-gate (state U qubits)
  (assert (= (length qubits) (dimension-qubits (array-dimension U 0))))
  (if (= 1 (length qubits))
      (%apply-1Q-gate state U (first qubits))
      (%apply-nQ-gate state U qubits)))

Gates on multi-qubit machines

If we are working with the machine $M_n$, then our space is $2^n$-dimensional, and as such, our matrices would be written out as $2^ntimes 2^n$ arrays of numbers. If we can write out such a matrix, then applying it is as simple as a matrix–vector multiplication. For instance, for a $4$-qubit machine, an $mathsf{X}$ on qubit $0$ would be written

$$
begin{pmatrix}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
end{pmatrix},
$$

which could be readily applied to a $16$-element quantum state vector. It is easy to verify that this will swap the components of $ket{ldots 0}$ with the corresponding components of $ket{ldots 1}$.

But as should be plainly obvious from the obnoxious amount of paper wasted by writing out this matrix, it would be better if we could simply generate this matrix with just three pieces of information: the gate matrix $g=left(begin{smallmatrix}0 & 1\1 & 0end{smallmatrix}right)$, the qubit index $i=0$, and the size of the machine $n=4$. This is a process we will call lifting.

Lifting requires a fundamental tool for constructing operators on spaces that were formed out of tensor products. If we have two finite-dimensional vector spaces $U$ and $V$, and operators $f$ and $g$ on the spaces respectively, then it seems reasonable to consider how $f$ and $g$ transform $Uotimes V$. In some sense, applying $f$ and $g$ “in parallel” on $Uotimes V$ correspond to a new linear operator $h$. If $f$ and $g$ are matrices, then $h$ is defined by a block matrix

$$
begin{equation}
h_{i,j} = f_{i,j} g.
label{eq:kron}
end{equation}
$$

More specifically, let $0 leq i,j < dim U$. The matrix $h$ will be an array of $dim U times dim U$ copies of $g$, where the entries of the $(i,j)$th blocks are multiplied by the single scalar $f_{i,j}$. This will lead to a matrix with $(dim U)(dim V)$ rows and columns, which is exactly the dimension of $Uotimes V$. Incidentally, we write $h$ as $fotimes g$, and this combination of operators is called the Kronecker product⁸. As code:

(defun kronecker-multiply (A B)
  (destructuring-bind (m n) (array-dimensions A)
    (destructuring-bind (p q) (array-dimensions B)
      (let ((result (make-array (list (* m p) (* n q)))))
        (dotimes (i m result)
          (dotimes (j n)
            (let ((Aij (aref A i j))
                  (y (* i p))
                  (x (* j q)))
              (dotimes (u p)
                (dotimes (v q)
                  (setf (aref result (+ y u) (+ x v))
                        (* Aij (aref B u v))))))))))))

As a matter of terminology, remember that tensor products combine
vector spaces, and Kronecker products combine operator matrices.

Single-qubit gates and gates on adjacent qubits

From here, we can very easily lift one-qubit gates to machines with
any number of qubits. A gate $g$ on qubit $i$ in an $n$-qubit machine
is just $g$ applied to qubit $i$ and the identity $mathsf{I}$ on all
other qubits. Writing this out as a Kronecker product, we have

$$
begin{equation}
operatorname{lift}(g, i, n) :=
underbrace{mathsf{I} otimes mathsf{I} otimes cdots}_{n-i-1text{ factors}}
otimes g otimes
underbrace{cdots otimes mathsf{I}}_{itext{ factors}},
label{eq:liftone}
end{equation}
$$

where there are a total of $n$ factors, and $g$ is at positioned $i$ factors from the right.

This concept generalizes to higher-dimensional operators which act on index-adjacent qubits. In other words, if $g$ is a $k$-qubit operator specifically acting on qubits

$$
(i+k-1, i+k-2, ldots, i+2, i+1, i),
$$

then the lifting operator from eqref{eq:liftone} is much the same:

$$
begin{equation}
operatorname{lift}(g, i, n) := underbrace{mathsf{I} otimes mathsf{I} otimes cdots}_{n-i-ktext{ factors}}
otimes g otimes
underbrace{cdots otimes mathsf{I}}_{itext{ factors}}.
label{eq:liftmany}
end{equation}
$$

It must be emphasized one last time: This only works for multi-qubit operators that act on qubits that are index-adjacent. We will get to how to work with non-adjacent qubits shortly, but first we will turn this into code.

For simplicity, we create a way to iterate a Kronecker product
multiple times, that is, compute

$$
underbrace{gotimes cdots otimes g}_{ntext{ factors}},
$$

which is usually simply written $g^{otimes n}$. We must use care when
handling the case when we are “Kronecker exponentiating” by a
non-positive number, so that $fotimes g^{otimes 0} = f$.

(defun kronecker-expt (U n)
  (cond
    ((< n 1) #2A((1)))
    ((= n 1) U)
    (t (kronecker-multiply (kronecker-expt U (1- n)) U))))

With kroncker-expt, we can write lift following eqref{eq:liftmany}:

(defun lift (U i n)
  (let ((left  (kronecker-expt +I+ (- n i (dimension-qubits
                                           (array-dimension U 0)))))
        (right (kronecker-expt +I+ i)))
    (kronecker-multiply left (kronecker-multiply U right))))

Multi-qubit gates on non-adjacent qubits

In this section, we assume we are working on a multi-qubit machine
$M_n$ with $nge 2$.

The general idea

So far, we’ve managed to get away with lifting operators that act on
either a single qubit, or a collection of index-adjacent qubits. This
has been more-or-less trivial, because we can tack on a series of
identity operators by way of Kronecker products to simulate “doing
nothing” to the other qubits. However, if we want to apply a
multi-qubit gate to a collection of qubits that aren’t index-adjacent,
we have to be a little more clever.

The way we accomplish this is by swapping qubits around so that we can
move in and out of index-adjacency. In fact, for a given gate acting
on a given collection of qubits, we aim to compute an operator $Pi$
which moves these qubits into index-adjacency, so that we can compute

$$
begin{equation}
Pi^{-1} operatorname{lift}(g, 0, n) Pi.
label{eq:upq}
end{equation}
$$

This recipe requires many ingredients, each of which we describe in
detail.

Swapping two qubits

To start, we need some way to swap the state of two qubits. We can do
this with the $mathsf{SWAP}$ operator:

$$
mathsf{SWAP} := begin{pmatrix}
1 & 0 & 0 & 0\
0 & 0 & 1 & 0\
0 & 1 & 0 & 0\
0 & 0 & 0 & 1
end{pmatrix}.
$$

In Common Lisp, we define this in the same way we defined +I+.

(defparameter +SWAP+ #2A((1 0 0 0)
                         (0 0 1 0)
                         (0 1 0 0)
                         (0 0 0 1)))

The $mathsf{SWAP}$ operator takes two qubits and swaps their
state. What does this mean in a system of correlations, where qubit
state isn’t strictly compartmentalized (i.e., factorized)? Swapping is
equivalent to swapping the component of $ket{01}$ with the component
of $ket{10}$, which are the only two distinguishable
correlations⁹. Still, in a multi-qubit system, we can’t
immediately arbitrarily swap two qubits with the tools we’ve
developed. What we can do is swap index-adjacent qubits. In
particular, we can define the transpositions

$$
tau_i := operatorname{lift}(mathsf{SWAP}, i, n),qquad text{with }0leq i < n - 1. $$

The transposition $tau_i$ swaps qubit $i$ with qubit $i+1$. This is
our first ingredient.

Re-arranging qubits to be index-adjacent

The second ingredient is a way to re-arrange our qubits so that they
are index-adjacent. Suppose we have a three-qubit operator $g$ which
acts on qubits $(2, 4, 3)$ in a machine of $n=5$ qubits. The space in
which the quantum state of $M_5$ lives is

$$
B_4 otimes B_3 otimes B_2 otimes B_1 otimes B_0,
$$

but we need to re-arrange our state vector as if we’ve moved $B_2to
B_0$, $B_4to B_1$, and $B_3to B_2$ so that our sub-state sits
index-adjacent. In combinatorics, this permutation is written in
two-line notation

$$
begin{pmatrix}
0 & 1 & 2 & 3 & 4\
3 & 4 & 0 & 2 & 1
end{pmatrix}.
$$

Here, we’ve made a few arbitrary decisions. First, we’ve decided to
re-map a $k$-qubit operator to the $B_{k-1}otimescdotsotimes
B_1otimes B_0$ subspace. Any other index-adjacent subspace would
work, but this simplifies the code. Second, we see that $0mapsto 3$
and $1mapsto 4$, but it doesn’t matter so much where they map to, as
long as $2$, $4$, and $3$ are mapped correctly.

There’s no sense in writing the first line in two-line notation, so we
just write the permutation compactly as $34021$. As a quantum
operator, we write this as $Pi_{34021}$.

The question is: How can we write $Pi_{34021}$ as familiar operators?
It is a well-known fact in combinatorics that any permutation can be
decomposed into a composition of swaps, and every swap can be
decomposed into a series of adjacent transpositions. We leave this as
an exercise¹⁰, but we will show the code to our implementation.

We start with a function which takes a permutation written as a list,
like (3 4 0 2 1), and converts it to a list of (possibly
non-adjacent) transpositions to be applied left-to-right, represented
as cons cells ((0 . 3) (1 . 4) (2 . 3)).

(defun permutation-to-transpositions (permutation)
  (let ((swaps nil))
    (dotimes (dest (length permutation) (nreverse swaps))
      (let ((src (elt permutation dest)))
        (loop :while (< src dest) :do
          (setf src (elt permutation src)))
        (cond
          ((< src dest) (push (cons src dest) swaps))
          ((> src dest) (push (cons dest src) swaps)))))))

Next, we convert these transpositions as cons cells to adjacent
transposition indexes. This is straightforward. If we are swapping
$(a,b)$ with $a(0 1 2 1 0 1 2 3 2 1 2).

(defun transpositions-to-adjacent-transpositions (transpositions)
  (flet ((expand-cons (c)
           (if (= 1 (- (cdr c) (car c)))
               (list (car c))
               (let ((trans (loop :for i :from (car c) :below (cdr c)
                                  :collect i)))
                 (append trans (reverse (butlast trans)))))))
    (mapcan #'expand-cons transpositions)))

These are indexes $i_1, i_2, ldots$ such that $Pi = cdots
tau_{i_2}tau_{i_1}$

The last ingredient we need is inverting $Pi$. If we have $Pi$
represented as a sequence of $tau$, then we simply reverse the list
of $tau$.

Using transpositions to implement multi-qubit gates

With all of these, we write what is perhaps the most important
function of our interpreter.

(defun %apply-nQ-gate (state U qubits)
  (let ((n (dimension-qubits (length state))))
    (labels ((swap (i)
               (lift +swap+ i n))
             (transpositions-to-operator (trans)
               (reduce #'compose-operators trans :key #'swap)))
      (let* ((U01 (lift U 0 n))
             (from-space (append (reverse qubits)
                                 (loop :for i :below n
                                       :when (not (member i qubits))
                                         :collect i)))
             (trans (transpositions-to-adjacent-transpositions
                     (permutation-to-transpositions
                      from-space)))
             (to->from (transpositions-to-operator trans))
             (from->to (transpositions-to-operator (reverse trans)))
             (Upq (compose-operators to->from
                                     (compose-operators U01
                                                        from->to))))
        (apply-operator Upq state)))))

A few quick notes for comprehension:

The value of (swap i) is $tau_i$ fully lifted.
The one-line zinger that defines transpositions-to-operator takes
a list of transposition indexes and converts it into a unitary
operator. It does so by doing what’s known in functional programming
as a map-reduce, by first mapping $imapstotau_i$ and reducing by
operator composition.
The variable from-space contains the permutation $p$ that encodes
the space in which we’d like to act. This permutation is calculated
based off of the qubits argument.
The variables from->to and to->from represent $Pi_p$ and
$Pi^{-1}_p$ respectively.
The variable Upq is our fully lifted operator, exactly by way of
eqref{eq:upq}.

The function %apply-nQ-gate is what allows our interpreter to be so
general. Making the interpreter more efficient ultimately is an
exercise in making this function more efficient.

The only thing left to do is integrate all of the topics discussed
hitherto into an interpreter!

An interpreter

The driver loop

The bulk of the interpreter has been written. We’ve described the
semantics of the two instructions of interest: MEASURE and
GATE. Now we create the interpreter itself, which is just a driver
loop to read and execute these instructions, causing state transitions
of our abstract machine. If we see a GATE, we call apply-gate. If
we see a MEASURE, we call observe.

(defun run-quantum-program (qprog machine)
  (loop :for (instruction . payload) :in qprog
        :do (ecase instruction
              ((GATE)
               (destructuring-bind (gate &rest qubits) payload
                 (apply-gate (machine-quantum-state machine) gate qubits)))
              ((MEASURE)
               (observe machine)))
        :finally (return machine)))

Efficiency

Performance-focused individuals will have noticed that this
interpreter is pretty costly, in many ways. The biggest cost is also
unavoidable: The fact that our state grows exponentially with the
number of qubits. Real, physical quantum computers avoid this cost,
which makes them alluring machines to both study and construct.

However, even with this unavoidable cost, this interpreter has been
implemented for ease of understanding and not machine
efficiency. Writing a faster interpreter amounts to avoiding the
construction of the lifted operator matrices. This can be done with
very careful index wrangling and sensitivity to data types and
allocation. This is how the high-performance Quantum Virtual
Machine is implemented.

Examples

What good is writing an interpreter if we don’t write any programs
worth interpreting? Here are a few examples of programs.

Bell state

The Bell state is one which we’ve explored earlier. It is a
two-qubit state $$frac{1}{sqrt{2}}(ket {00} + ket {11}).$$ Here’s
a program to generate one, using two new gates, the controlled-not
gate $mathsf{CNOT}$ and the Hadamard gate $mathsf{H}$.

(defparameter +H+ (make-array '(2 2) :initial-contents (let ((s (/ (sqrt 2))))
                                                         (list (list s s)
                                                               (list s (- s))))))

(defparameter +CNOT+ #2A((1 0 0 0)
                         (0 1 0 0)
                         (0 0 0 1)
                         (0 0 1 0))))

(defun bell (p q)
  `((GATE ,+H+ ,p)
    (GATE ,+CNOT+ ,p ,q)))

Greenberger–Horne–Zeilinger state

The Greenberger–Horne–Zeilinger state, or GHZ state, is a
generalization of the Bell state on more than two qubits, namely
$$frac{1}{sqrt{2}}(ket{0ldots 000} + ket{1ldots 111}).$$ This is
accomplished by executing a chain of controlled-not gates:

(defun ghz (n)
  (cons `(GATE ,+H+ 0)
        (loop :for q :below (1- n)
              :collect `(GATE ,+CNOT+ ,q ,(1+ q)))))

The quantum Fourier transform

The ordinary discrete Fourier transform of a complex vector is a
unitary operator, and as such, it can be encoded as a quantum
program. We will write a program which computes the Fourier transform
of the probability amplitudes of an input quantum state (a time-domain
signal), producing a new quantum state whose amplitudes represent
components in the frequency domain. This is the central subroutine to
Shor’s algorithm, which is a quantum algorithm which factors integers
faster than any known classical method.

First, we will need a gate called the controlled-phase gate $mathsf{CPHASE}(theta)$:

(defun cphase (angle)
  (make-array '(4 4) :initial-contents `((1 0 0 0)
                                         (0 1 0 0)
                                         (0 0 1 0)
                                         (0 0 0 ,(cis angle)))))

Now, we can generate the quantum Fourier transform recursively.

(defun qft (qubits)
  (labels ((bit-reversal (qubits)
             (let ((n (length qubits)))
               (if (< n 2)
                   nil
                   (loop :repeat (floor n 2)
                         :for qs :in qubits
                         :for qe :in (reverse qubits)
                         :collect `(GATE ,+swap+ ,qs ,qe)))))
           (%qft (qubits)
             (destructuring-bind (q . qs) qubits
               (if (null qs)
                   (list `(GATE ,+H+ ,q))
                   (let ((cR (loop :with n := (1+ (length qs))
                                   :for i :from 1
                                   :for qi :in qs
                                   :for angle := (/ pi (expt 2 (- n i)))
                                   :collect `(GATE ,(cphase angle) ,q ,qi))))
                     (append
                      (qft qs)
                      cR
                      (list `(GATE ,+H+ ,q))))))))
    (append (%qft qubits) (bit-reversal qubits))))

The program for a three-qubit quantum Fourier transform (qft '(0 1 2)) looks like this:

(
  (GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 2) 
  (GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.0d0 1.0d0))) 1 2) 
  (GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 1) 
  (GATE #2A((1 0 0 0) (0 0 1 0) (0 1 0 0) (0 0 0 1)) 1 2) 
  (GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.7071067811865476d0 0.7071067811865475d0))) 0 1) 
  (GATE #2A((1 0 0 0) (0 1 0 0) (0 0 1 0) (0 0 0 #C(0.0d0 1.0d0))) 0 2) 
  (GATE #2A((0.7071067811865475d0 0.7071067811865475d0) (0.7071067811865475d0 -0.7071067811865475d0)) 0) 
  (GATE #2A((1 0 0 0) (0 0 1 0) (0 1 0 0) (0 0 0 1)) 0 2) 
)

(Recall that #C(0 1) represents the complex number $i$.)

We can see the quantum Fourier transform in action by computing the
Fourier transform of $ket{000}$. Here is a transcript of this
calculation:

CL-USER> (run-quantum-program
          (qft '(0 1 2))
          (make-machine :quantum-state (make-quantum-state 3)
                        :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(#C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0))
   :MEASUREMENT-REGISTER 0)

Indeed, one can verify that the classical Fourier transform of the
vector $[1,0,0,0,0,0,0,0]$ is a vector with eight components equal to
about $0.35355$.

$ python
Python 2.7.16 (default, May 23 2023, 14:13:27) 
[GCC 8.3.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import numpy as np
>>> np.fft.fft([1,0,0,0,0,0,0,0], norm="ortho")
array([0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j,
       0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j, 0.35355339+0.j])

Example transcript

Here is an example transcript downloading and using this software,
using Steel Bank Common Lisp.

$ git clone https://github.com/stylewarning/quantum-interpreter.git
Cloning into 'quantum-interpreter'...
remote: Counting objects: 10, done.
remote: Compressing objects: 100% (10/10), done.
Unpacking objects: 100% (10/10), done.
remote: Total 10 (delta 2), reused 5 (delta 0), pack-reused 0

$ cd quantum-interpreter/

$ sbcl --noinform
* (load "qsim.lisp")
T

* (load "examples.lisp")
T

* (run-quantum-program (bell 0 1)
                       (make-machine :quantum-state (make-quantum-state 2)
                                     :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(0.7071067690849304d0 0.0d0 0.0d0 0.7071067690849304d0)
   :MEASUREMENT-REGISTER 0)

* (run-quantum-program (qft '(0 1 2))
                       (make-machine :quantum-state (make-quantum-state 3)
                                     :measurement-register 0))
#S(MACHINE
   :QUANTUM-STATE #(#C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0)
                    #C(0.3535533724408484d0 0.0d0) #C(0.3535533724408484d0 0.0d0))
   :MEASUREMENT-REGISTER 0)

* (defun flip-coin ()
    (machine-measurement-register
     (run-quantum-program
      `((GATE ,+H+ 0) (MEASURE))
      (make-machine :quantum-state (make-quantum-state 1)
                    :measurement-register 0))))
FLIP-COIN

* (loop :repeat 10 :collect (flip-coin))
(1 1 0 1 1 0 0 1 0 1)

* (quit)

Source code

The source code in this tutorial are published under the BSD 3-clause
license. The complete listing and most up-to-date source code can be
found on
GitHub.