Matrix-vector multiplication with microwave signals

An RF signal at a fixed frequency has an amplitude and a phase, so it can represent a complex number. Furthermore, microwave networks can be thought of as linear operations applied to those signals. When those networks are designed so that some of their parameters can be dynamically configured, we end up with a configurable multiplication of a complex vector by a unitary matrix. In this post I describe my implementation of the multiplication of a complex vector by a 2×2 special unitary matrix, done (mostly) with microstrip and phase shifters.


The scattering matrix is commonly used to characterise microwave networks. Each element describes the relation between the incident voltage wave $V^+_k$ at port $k$ and the reflected wave $V^-_j$ measured at port $j$.

$$S_{jk} = \frac{V^-_j}{V^+_k}$$

We can think of a 4-port microwave network as a linear operator that is applied to the input signals at ports 1 and 4, with the outputs appearing on ports 2 and 3.

$$\begin{bmatrix} Y_0 \\ Y_1 \end{bmatrix} = \begin{bmatrix} S_{21} & S_{24} \\ S_{31} & S_{34} \end{bmatrix} \begin{bmatrix} X_0 \\ X_1 \end{bmatrix}$$

Our goal is to design a microwave network that implements $SU(2)$. This is a very interesting group of unitary matrices. It plays an important role in quantum computing[1], is isomorphic to the quaternions of unit norm (sounds like rotations), and describes rotations of the Bloch sphere (definitely rotations).

$$SU(2) = \begin{bmatrix} \alpha & -\overline{\beta} \\ \beta & \overline{\alpha} \end{bmatrix}, \quad \alpha, \beta \in \mathbb{C}, \quad \left|\alpha\right|^2+\left|\beta\right|^2=1$$

What’s going to be the most interesting for us is the fact that any 2×2 special unitary matrix can be decomposed into rotations around the Z and Y axes:

$$SU(2) = R_z(\theta)R_y(\phi)R_z(\lambda)$$
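To make the decomposition a bit more tangible, here is a minimal sketch that composes the three rotations from the usual definitions of $R_z$ and $R_y$ and checks that the result has the $SU(2)$ form given above. The function names and the example angles are arbitrary choices made for illustration, and only the Python standard library is used.

```python
import cmath
import math


def rz(theta):
    """Rotation around Z: diag(e^{-i*theta/2}, e^{+i*theta/2})."""
    return [[cmath.exp(-0.5j * theta), 0], [0, cmath.exp(0.5j * theta)]]


def ry(phi):
    """Rotation around Y: a real 2x2 rotation by phi/2."""
    c, s = math.cos(phi / 2), math.sin(phi / 2)
    return [[c, -s], [s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def zyz(theta, phi, lam):
    """Compose Rz(theta) Ry(phi) Rz(lam)."""
    return matmul(rz(theta), matmul(ry(phi), rz(lam)))


# Arbitrary example angles (radians), chosen only for illustration.
u = zyz(0.7, 1.9, -2.4)
alpha, beta = u[0][0], u[1][0]

# The first column determines the second one, and it has unit norm.
print(abs(u[0][1] + beta.conjugate()))    # should be ~0
print(abs(u[1][1] - alpha.conjugate()))   # should be ~0
print(abs(alpha) ** 2 + abs(beta) ** 2)   # should be ~1
```

Going the other way, from a given matrix to the three angles, is what the board ultimately needs, but that step is not shown here.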

Phase shift

Any transmission line is going to introduce some time delay that depends on its type, geometry, the materials involved, and the signal frequency. This delay causes a negative phase shift. Moreover, the final circuit is inevitably going to introduce some global phase shift, common to all signals, that will need to be corrected. What is much more interesting, though, is the relative phase difference between signals. First, let’s have a look at an SU(2) matrix representing a rotation around the Z axis:

$$R_z(\theta) = \begin{bmatrix} e^{-i\frac{\theta}{2}} & 0 \\ 0 & e^{i\frac{\theta}{2}} \end{bmatrix}$$

That’s a negative phase shift on one signal and a positive phase shift on the other. If we continue to ignore the global phase shift, the same can be done by changing the phase of only one signal, or by two negative phase shifts. We can implement this rotation with just two microstrips of different lengths, though this alone is not good enough as, obviously, it needs to be dynamically configurable. The simplest way to achieve that is to use a switched-line phase shifter: a series of stages, each with a pair of SPDT switches that select between two transmission lines of different lengths. The first stage implements an optional -180° shift, the next one -90°, etc. The number of stages determines the minimal step size. One of the problems with this design is that the effective length of the transmission line depends on the selected phase shift, and so does the insertion loss. There’s also a trade-off between the resolution and the insertion loss, as adding more stages further attenuates the signal.
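The relation between the switch settings and the resulting phase shift can be sketched in a few lines. This assumes an idealised shifter in which every stage contributes exactly half the shift of the previous one and nothing else; the function names are made up for this example.

```python
def switched_line_phase(bits):
    """Total phase shift in degrees of an ideal switched-line phase shifter.

    bits[0] enables the -180 degree stage, bits[1] the -90 degree stage,
    and so on, each stage halving the shift of the previous one.
    """
    return sum(-180.0 / (2 ** i) for i, bit in enumerate(bits) if bit)


def nearest_setting(target_deg, stages):
    """Bit pattern whose shift is closest to target_deg (modulo 360)."""
    step = 360.0 / (2 ** stages)  # minimal step size for this many stages
    code = round((-target_deg % 360.0) / step) % (2 ** stages)
    # Most significant bit first, matching the stage order above.
    return [(code >> (stages - 1 - i)) & 1 for i in range(stages)]


bits = nearest_setting(-100.0, stages=4)  # 4 stages give a 22.5 degree step
print(bits, switched_line_phase(bits))
```

With four stages a request for -100° can only be approximated, which is the resolution side of the trade-off described above.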

Branch-line coupler

There’s only so much that can be done with independent operations on RF signals. What we need now is a circuit in which each output depends on both inputs. That sounds like a microwave coupler. In particular, let’s have a look at the branch-line coupler.

Figure: branch-line coupler

An incoming signal on port 1 appears at half power on ports 2 and 3, with a 90° phase difference between them, while port 4 is isolated. It is easy to see that this is a symmetrical network, so we get analogous behaviour when the input signal is on port 4. The S-parameters of an ideal branch-line coupler are:

$$S = \frac{-1}{\sqrt{2}} \begin{bmatrix} 0 & i & 1 & 0 \\ i & 0 & 0 & 1 \\ 1 & 0 & 0 & i \\ 0 & 1 & i & 0 \end{bmatrix}$$

This can be shown by doing the even-odd mode analysis. The general idea is to exploit the symmetry of the coupler and horizontally split it into two two-port networks. Then the even mode, when the signals at ports 1 and 4 are in phase, and the odd mode, when the signals at those ports are out of phase, are considered separately. The full analysis is, however, a bit too much for this post.

From the scattering parameters we can get the linear operation that the branch-line coupler implements:

$$\mathcal{C} = \begin{bmatrix} S_{21} & S_{24} \\ S_{31} & S_{34} \end{bmatrix} = \frac{-1}{\sqrt{2}} \begin{bmatrix} i & 1 \\ 1 & i \end{bmatrix}$$

This is a unitary matrix, which means it is a combination of a global phase shift and a rotation. As it turns out, it is a 90° rotation around the X axis combined with a global phase of $-i$.

$$
\begin{aligned}
R_x\left(\theta\right) &= \begin{bmatrix} \cos\frac{\theta}{2} & -i\sin\frac{\theta}{2} \\ -i\sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix} \\
-iR_x\left(\frac{\pi}{2}\right) &= \frac{-1}{\sqrt{2}} \begin{bmatrix} i & 1 \\ 1 & i \end{bmatrix}
\end{aligned}
$$
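The properties of the ideal coupler mentioned so far (half power on each output, a 90° phase difference between them, and unitarity of the 2×2 operator) are easy to check numerically. The snippet below is only a sanity check of the ideal matrix, not a model of a real coupler; the helper names are invented for the example.

```python
import cmath
import math

# 2x2 operator of the ideal branch-line coupler: [[S21, S24], [S31, S34]].
K = -1 / math.sqrt(2)
C = [[K * 1j, K], [K, K * 1j]]


def apply(m, x):
    """Multiply a 2x2 matrix by a length-2 complex vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]


def gram(m):
    """Return m^H m, which is the identity matrix when m is unitary."""
    return [[sum(m[k][i].conjugate() * m[k][j] for k in range(2))
             for j in range(2)] for i in range(2)]


# Unit-amplitude signal on port 1, nothing on port 4.
y = apply(C, [1, 0])
power_split = [abs(v) ** 2 for v in y]               # power at ports 2 and 3
phase_diff = math.degrees(cmath.phase(y[0] / y[1]))  # port 2 relative to port 3

print(power_split, phase_diff)
print(gram(C))
```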

Mach–Zehnder interferometer

It’s easy to see that a rotation around the X axis by 90°, followed by an arbitrary rotation around Z, followed by a -90° rotation around X is actually an arbitrary rotation around Y. This is not exactly what we have, but the expectation is that a branch-line coupler, a rotation around Z, and another branch-line coupler will result in something not too different:

$$
\begin{aligned}
R_y(\theta) &= \begin{bmatrix} \cos\frac{\theta}{2} & -\sin\frac{\theta}{2} \\ \sin\frac{\theta}{2} & \cos\frac{\theta}{2} \end{bmatrix} \\
\mathcal{C} R_z(\theta) \mathcal{C} &= i\begin{bmatrix} \sin \frac{\theta}{2} & \cos \frac{\theta}{2} \\ \cos \frac{\theta}{2} & -\sin \frac{\theta}{2} \end{bmatrix} \\
&= i\begin{bmatrix} \cos \frac{\pi - \theta}{2} & -\sin \frac{\pi - \theta}{2} \\ \sin \frac{\pi - \theta}{2} & \cos \frac{\pi - \theta}{2} \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \\
&= R_y(\pi-\theta)R_z(-\pi)
\end{aligned}
$$

A coupler, a phase shift, and a coupler is basically what a Mach-Zehnder interferometer is: a device used to determine the relative phase shift of light beams and, apparently, also to compute matrix-vector multiplications.

Let’s run a few simulations of an MZI with a phase shift applied to only one signal. The expectations are as follows:

$$
\begin{aligned}
\mathcal{M}\left(\theta\right) &= \mathcal{C} \begin{bmatrix} e^{i\theta} & 0 \\ 0 & 1 \end{bmatrix} \mathcal{C} \\
\mathcal{M}\left(0\right) &= \begin{bmatrix} 0 & i \\ i & 0 \end{bmatrix} \\
\mathcal{M}\left(-\pi\right) &= \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix} \\
\mathcal{M}\left(\frac{-\pi}{2}\right) &= \begin{bmatrix} \frac{1+i}{2} & \frac{1+i}{2} \\ \frac{1+i}{2} & \frac{-1-i}{2} \end{bmatrix}
\end{aligned}
$$
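These expected values follow directly from the ideal coupler matrix, which a short script can confirm. It is a sketch of the ideal, lossless case only, and `mzi` and `matmul` are names picked for this example.

```python
import cmath
import math

# Ideal branch-line coupler operator [[S21, S24], [S31, S34]].
K = -1 / math.sqrt(2)
C = [[K * 1j, K], [K, K * 1j]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def mzi(theta):
    """Coupler, phase shift theta on the first signal only, coupler."""
    shifter = [[cmath.exp(1j * theta), 0], [0, 1]]
    return matmul(C, matmul(shifter, C))


for theta in (0.0, -math.pi, -math.pi / 2):
    print(theta, mzi(theta))
```

Up to floating-point rounding, the three printed matrices should agree with the three cases listed above.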

I used openEMS, an open source electromagnetic field solver. Below are visualisations of the electric field magnitude as the relative phase shift is changed (admittedly, not that useful) and computed S-parameters (much more useful).

Figure: MZI S-parameters

When a strong signal is expected on a given port we get quite good results, even though there are some unavoidable transmission line losses. The strength of the signal when we expect to receive nothing is not great and could definitely be improved by adjusting the geometry of the couplers, but then again, this is not what the final MZI is going to look like, and not how the phase shift is going to be implemented.


The main part of the board is, obviously, going to be the MZI. I decided to cheat a little and not add the phase shifters that should be there before and after the MZI to implement the Z rotations. The main reasons are their large insertion loss and the fact that they weren’t really the most interesting part.

We need to start with a way to create RF signals that represent complex numbers of our choosing and the ability to read them. This is a problem that RF communication people have, generously, already solved for us in the form of IQ mixers. They are in fact two mixers: one multiplies the I (in-phase) input with the fixed-frequency signal from the local oscillator, the other multiplies the Q (quadrature) input with the local oscillator output shifted by 90°. That sounds very much like the real and imaginary parts of a complex number.
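The correspondence between IQ mixing and complex numbers can be illustrated with an idealised, sampled model. Everything here is an assumption made for the sake of the example: the sign convention (I·cos − Q·sin), the 64 samples per LO period, and the averaging that stands in for the low-pass filter. None of it describes the actual ADL5375 or ADL5380 behaviour.

```python
import math


def modulate(z, n=64):
    """Samples of one LO period of I*cos(wt) - Q*sin(wt) for z = I + jQ."""
    return [z.real * math.cos(2 * math.pi * k / n)
            - z.imag * math.sin(2 * math.pi * k / n) for k in range(n)]


def demodulate(samples):
    """Mix with the LO and its 90-degree shifted copy, then average."""
    n = len(samples)
    i = 2 / n * sum(s * math.cos(2 * math.pi * k / n)
                    for k, s in enumerate(samples))
    q = -2 / n * sum(s * math.sin(2 * math.pi * k / n)
                     for k, s in enumerate(samples))
    return complex(i, q)


z = 0.3 + 0.6j
rf = modulate(z)
recovered = demodulate(rf)

# Delaying the waveform by a quarter of a period is a -90 degree phase
# shift, which shows up as a multiplication of the complex value by -i.
quarter = len(rf) // 4
delayed = demodulate(rf[-quarter:] + rf[:-quarter])

print(recovered, delayed)
```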

Figure: block diagram

An important decision to make at this point is the RF frequency which the circuit should use. I have chosen 5.8GHz as it makes the sizes of branch-line couplers and Wilkinson splitters quite convenient and since it is in the range of the frequencies used by WiFi there should be a larger selection of components.

Another early design decision was that I am not going to try to make the computations fast. This let me keep the digital section simple. As a DAC I chose the AD5684R: 12-bit, 4-channel, with an SPI interface. Its outputs are single-ended and were configured for the 0-2.5V range. They cannot be connected directly to the IQ modulator (ADL5375), as it requires a 1Vpp differential signal with a 1.5V bias, so I needed to add fully differential opamps to do the voltage level conversion.

At 5.8GHz the output power of the mixer is around 0.16dBm. That signal goes into a low-pass filter (-0.6dB) and then into a 7-bit variable attenuator with 0.25dB resolution. The insertion loss of the attenuator is no more than 2.8dB. It is followed by a fixed +11dB amplifier. In total the attenuator and amplifier pair gives a -23.55dB to +8.2dB range, which should be enough to get the signal power at the ADC to the appropriate level without having to worry too much about the exact losses along the way.

Next is the first branch-line coupler. Its insertion loss is 3dB plus microstrip transmission line losses. Let’s be conservative and estimate that the total loss is going to be no more than 4dB. It is followed by the phase shifters. While initially I was considering implementing them with multiple switches and different lengths of microstrip, the losses, imprecision, and total area would be too high. Instead, I decided to use the MAPS-010145, a 4-bit (22.5° step) phase shifter IC. It’s still far from ideal, though. The insertion loss is <6.5dB, which is significantly more than any other individual component in the circuit, and the attenuation variation is ±1dB. Just to make things more annoying, it also requires both a positive and a negative (-5V) power supply.

From the phase shifters the signal goes into another branch-line coupler, which means at least another 3dB of loss. After that, there are baluns (-1dB), since the demodulator expects a differential input. At this point our very conservative estimate is that the signal reaching the mixer is going to be around -7.75dBm (with the attenuator set to 0). The mixer is an ADL5380 and its conversion gain at 5.8GHz is around +5.8dB. The output is differential and at -2dBm it is going to be ~500mVpp.
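The power budget above is just a sum of gains and losses in dB, so it is easy to reproduce. In the sketch below, two things are assumptions rather than statements from the design: the second coupler is budgeted at the same conservative 4dB as the first one (which is what reproduces the estimate quoted above), and the dBm to Vpp conversion assumes a sine wave into a 50Ω load.

```python
import math

ATTEN_BITS = 7
ATTEN_STEP_DB = 0.25
ATTEN_INSERTION_LOSS_DB = 2.8
AMP_GAIN_DB = 11.0

# Gain range of the attenuator and amplifier pair.
max_atten_db = (2 ** ATTEN_BITS - 1) * ATTEN_STEP_DB
max_gain_db = AMP_GAIN_DB - ATTEN_INSERTION_LOSS_DB
min_gain_db = max_gain_db - max_atten_db

# Worst-case RF chain with the attenuator set to 0.
stages = [
    ("modulator output [dBm]", 0.16),
    ("low-pass filter", -0.6),
    ("attenuator (at 0) + amplifier", max_gain_db),
    ("first branch-line coupler", -4.0),
    ("phase shifter", -6.5),
    ("second branch-line coupler (assumed 4 dB)", -4.0),
    ("balun", -1.0),
]
power_at_demod_dbm = sum(value for _, value in stages)
demod_output_dbm = power_at_demod_dbm + 5.8  # demodulator conversion gain


def dbm_to_vpp(dbm, load_ohm=50.0):
    """Peak-to-peak voltage of a sine wave of the given power (assumed load)."""
    power_w = 1e-3 * 10 ** (dbm / 10)
    return 2 * math.sqrt(2) * math.sqrt(power_w * load_ohm)


print(min_gain_db, max_gain_db)
print(power_at_demod_dbm, demod_output_dbm)
print(dbm_to_vpp(-2.0))
```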

Differential signals from the demodulators go into opamps that convert them into single-ended 0-2.5V signals. Then, there is a simple RC low-pass filter and the analog-to-digital converter: the 12-bit, 4-channel AD7923.

Both modulators and demodulators need a 5.8GHz signal from the local oscillator. The relative phase between the LO signals that the modulators and demodulators receive is not very important: it only contributes to the global phase shift observed on the outputs, something that we have already accepted we need to deal with in the calibration. It is, however, going to be much easier if the modulators for both channels get the LO signal with no phase difference.

I used the MAX2871 frequency synthesiser as the local oscillator. It has two differential outputs, each capable of providing up to 5dBm. Each output has a balun to convert to a single-ended signal, a Wilkinson splitter, and a pair of baluns to convert back to a differential pair. A rough estimate of the RF power at the LO inputs of the IQ mixers is around 0dBm, well within the required -6dBm to 6dBm range.

The last piece of the puzzle is the microcontroller, which is going to facilitate communication between the computer and the SPI-connected devices. I have chosen the STM32F401RC: a single 84MHz Cortex-M4 core in a 64-pin LQFP package. It is definitely much more powerful than what’s needed, but there wasn’t really much point in trying to find the smallest MCU good enough for the job.

The required power rails are 5V, 3.3V (multiple), and -5V (low power). The input is 12V DC, which goes into 5V, 3.6V, and -5V switching mode power supplies. Linear regulators then convert 3.6V into multiple 3.3V rails.

PCB layout

The PCB was fabricated using the OSH Park four-layer service. The substrate is FR408HR, whose relative permittivity at 5GHz is 3.64. The stackup is a rather usual one (starting from the top layer):

The layout of the PCB roughly follows the diagram shown earlier. Most of the signals are routed on the top layer, while the bottom layer is used for the long-range SPI bus shared between the attenuators, phase shifters, and the frequency synthesiser, as well as for the low-frequency IQ modulator inputs and the demodulator outputs.

I started placement and routing by first dealing separately with each IC and its auxiliary components, just to make sure that related things are kept close together. At this point datasheet recommendations and layouts of the evaluation boards were a great source of inspiration. I was also trying to avoid using the bottom layer at this point.

The next step was to arrange these blocks, starting with the RF path, in a way that makes sense and also reduces the amount of wasted space. The bottom layer was mostly unused until now, so it was quite easy to route the long-range (but rather slow) signals. The final stage was to create the power planes, which was also rather straightforward because already at component selection I tried to ensure that devices close to each other use the same power supply voltage.


The microcontroller firmware is quite simple. First, it initialises all the components connected over SPI: DAC, ADC, attenuators, phase shifters, and the frequency synthesiser. Then it listens on the UART for the requests sent from the computer and translates them to the appropriate SPI commands. All UART communication is fixed-length synchronous request-reply just to keep things as simple as possible.
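A fixed-length request-reply exchange of this kind could look roughly like the sketch below on the computer side. The frame layout (one opcode byte, one device index byte, a 16-bit value), the opcode numbers, and the `LoopbackPort` class that stands in for the serial port and the firmware are all hypothetical; the post does not describe the actual format.

```python
import struct

# Hypothetical 4-byte frames: opcode (or status), device index, 16-bit value.
REQUEST = struct.Struct("<BBH")
REPLY = struct.Struct("<BBH")

OP_WRITE_DAC = 0x01  # hypothetical opcode
OP_READ_ADC = 0x02   # hypothetical opcode


def transact(port, opcode, device, value=0):
    """Send one request and block until the fixed-length reply arrives."""
    port.write(REQUEST.pack(opcode, device, value))
    raw = port.read(REPLY.size)
    if len(raw) != REPLY.size:
        raise IOError("short reply")
    status, reply_device, result = REPLY.unpack(raw)
    if status != 0 or reply_device != device:
        raise IOError("unexpected reply")
    return result


class LoopbackPort:
    """Stand-in for a serial port: ADC reads return the last DAC write."""

    def __init__(self):
        self._reply = b""
        self._dac = {}

    def write(self, data):
        opcode, device, value = REQUEST.unpack(data)
        if opcode == OP_WRITE_DAC:
            self._dac[device] = value
        self._reply = REPLY.pack(0, device, self._dac.get(device, 0))

    def read(self, size):
        data, self._reply = self._reply[:size], self._reply[size:]
        return data


port = LoopbackPort()
written = transact(port, OP_WRITE_DAC, device=2, value=1234)
read_back = transact(port, OP_READ_ADC, device=2)
print(written, read_back)
```

Because both frames have a known size, the host never has to parse a stream or resynchronise, which is the simplicity the fixed-length scheme buys.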

Most of the work is done by Python scripts on the computer. In order to get good results there are several corrections that need to be applied. Moreover, since the insertion loss of the phase shifters depends on the configured phase shift, those corrections are specific to the multiplier matrix.

Firstly, the attenuators need to be set so that there’s no clipping at the ADC. It is desirable to choose the minimal attenuation in order to use as much of the ADC dynamic range as possible and maximise the signal-to-noise ratio. This is done by first setting the phase shifters to -90° and -270°, so that the relative phase difference is -180° and the expected amplitude at each output is the same as the amplitude at the corresponding input. Then, for each channel, the script tries multiple values, all with amplitude 1 but with different phases, and adjusts the attenuation so that there’s no clipping. The reason for trying multiple values is that we don’t yet know the global phase shift of the circuit, so we cannot easily tell which input values are going to produce the maximum output. After this procedure we have initial attenuation settings that allow us to proceed with the calibration.
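The attenuation search can be sketched as a loop over attenuator codes that stops at the first one for which none of the probe inputs clips. The `measure` callback, the fake channel used in place of the hardware, and the clipping criterion (real or imaginary part reaching full scale) are all assumptions made for this example, not the actual scripts.

```python
import cmath
import math

ATTEN_STEP_DB = 0.25
ATTEN_CODES = 2 ** 7  # 7-bit attenuator


def make_fake_channel(gain_db, global_phase):
    """Hypothetical stand-in for the hardware: a gain and a phase rotation."""
    def measure(atten_code, x):
        gain = 10 ** ((gain_db - atten_code * ATTEN_STEP_DB) / 20)
        return x * gain * cmath.exp(1j * global_phase)
    return measure


def clips(y, full_scale=1.0):
    """True if either the real or the imaginary part reaches full scale."""
    return abs(y.real) >= full_scale or abs(y.imag) >= full_scale


def minimal_attenuation(measure, n_phases=16):
    """Smallest attenuator code for which no unit-amplitude probe clips."""
    probes = [cmath.rect(1.0, 2 * math.pi * k / n_phases)
              for k in range(n_phases)]
    for code in range(ATTEN_CODES):
        if not any(clips(measure(code, x)) for x in probes):
            return code
    raise RuntimeError("signal clips even at maximum attenuation")


measure = make_fake_channel(gain_db=6.0, global_phase=0.3)
code = minimal_attenuation(measure)
print(code, code * ATTEN_STEP_DB)
```

Probing several phases matters because, with an unknown global phase, any one of the probes could be the one that ends up closest to an axis and therefore produces the largest real or imaginary part.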

The next step is to correct the bias at the outputs. That’s mostly a consequence of the imprecisions of the ADC driver opamp circuit. The procedure is to set the phase shifts to the values required for the desired matrix multiplication, set both inputs to 0, and measure the outputs.

With the bias corrected, we can now figure out the global phase shift of the circuit. Each input is set to a range of values with different amplitudes and phases, and the phase difference between the actual output and the expected one is calculated. The average of those differences is assumed to be the global phase shift that needs to be corrected. It is also worth noting that I generally got better results when the script rejected outputs with too small an amplitude and restricted the amplitudes of the inputs to avoid extreme values.
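A sketch of the bias and global phase estimation is shown below. It deviates from the description in one respect that should be stated plainly: instead of averaging the phase differences directly, it averages unit phasors (a circular mean), which avoids problems when the differences wrap around ±180°. The function name, the rejection threshold, and the synthetic data are assumptions made for the example.

```python
import cmath


def estimate_global_phase(expected, measured, bias=0j, min_amplitude=0.1):
    """Circular mean of the phase differences between measured and expected.

    The previously measured bias is removed first, and pairs in which either
    value is too small to have a reliable phase are rejected.
    """
    acc = 0j
    for e, m in zip(expected, measured):
        m = m - bias
        if abs(e) < min_amplitude or abs(m) < min_amplitude:
            continue
        acc += cmath.rect(1.0, cmath.phase(m / e))
    if acc == 0:
        raise ValueError("no usable measurements")
    return cmath.phase(acc)


# Synthetic data: attenuated, rotated by 0.4 rad, with a constant bias.
true_phase, true_bias = 0.4, 0.05 - 0.02j
expected = [cmath.rect(a, p)
            for a in (0.3, 0.6, 0.9) for p in (0.0, 1.5, 3.0, 4.5)]
measured = [0.8 * e * cmath.exp(1j * true_phase) + true_bias
            for e in expected]

estimate = estimate_global_phase(expected, measured, bias=true_bias)
corrected = [(m - true_bias) * cmath.exp(-1j * estimate) for m in measured]
print(estimate)
```

After this step the corrected outputs have the right phase but still the wrong amplitude, which is left for the gain correction.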

At this point we get outputs with the correct phase and no bias. However, because of the low precision of the attenuators and the fact that their values were chosen for a different circuit configuration, additional gain needs to be applied to the outputs. The way to do this is to measure the outputs, at a fixed phase, as the input amplitude changes. We expect a linear relationship between the input amplitude and the real part of the output, as well as between the input amplitude and the imaginary part of the output. However, there are always going to be some nonlinearities in the circuit, so linear regression is used to get a model that best fits the actual outputs (still done separately for the real and imaginary parts). Now, the ratio between the expected slope and the slope of the model based on the actual measurements tells us how much gain needs to be applied. Moreover, the intercept is the additional bias correction needed to further minimise the overall error. For best results, this is repeated for different phases and the average of the obtained correction values is applied.
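The gain correction can be sketched with an ordinary least-squares line fit, done separately for the real and imaginary parts. The sketch assumes that the expected outputs pass through zero when the input amplitude is zero, and that the chosen phase makes both parts actually vary with amplitude (otherwise a slope is close to zero and the ratio is meaningless). The function names and the synthetic data are invented for the example.

```python
import cmath


def fit_line(xs, ys):
    """Least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def gain_and_bias(amplitudes, expected, measured):
    """Return {"real": (gain, bias), "imag": (gain, bias)} corrections."""
    corrections = {}
    parts = (("real", lambda z: z.real), ("imag", lambda z: z.imag))
    for name, part in parts:
        expected_slope, _ = fit_line(amplitudes, [part(z) for z in expected])
        slope, intercept = fit_line(amplitudes, [part(z) for z in measured])
        corrections[name] = (expected_slope / slope, intercept)
    return corrections


# Synthetic data at one fixed phase: 20% too little gain and a small bias.
amplitudes = [0.2, 0.4, 0.6, 0.8]
expected = [cmath.rect(a, 0.7) for a in amplitudes]
measured = [0.8 * e + (0.02 - 0.01j) for e in expected]

corr = gain_and_bias(amplitudes, expected, measured)
print(corr)
```

A corrected output is then `gain * (measured_part - bias)` for each part, and repeating the fit at several phases and averaging the corrections follows the procedure described above.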


It is time to show some results. The whole system has a few dimensions too many to fit everything on one plot that would still be useful to humans, and there isn’t really much point in doing any proper characterisation. Fortunately, the circuit is sufficiently symmetric that looking at just a few values should give a good idea of how it performs. I picked a non-trivial matrix, fixed one of the inputs, and varied the other, $X_1$. The measurements are of only one of the outputs: $Y_1$.

$$\begin{bmatrix} Y_0 \\ Y_1 \end{bmatrix} = \begin{bmatrix} \frac{1+i}{2} & \frac{1+i}{2} \\ \frac{1+i}{2} & \frac{-1-i}{2} \end{bmatrix} \begin{bmatrix} 0.3+0.6i \\ X_1 \end{bmatrix}$$
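For reference, the expected outputs for this test case are straightforward to compute. The snippet below only evaluates the ideal matrix product; the variable names and the sample values of $X_1$ are chosen for illustration.

```python
H = (1 + 1j) / 2
M = [[H, H], [H, -H]]  # the matrix used for the measurements
X0 = 0.3 + 0.6j        # the fixed input


def multiply(m, x):
    """Multiply a 2x2 matrix by a length-2 complex vector."""
    return [m[0][0] * x[0] + m[0][1] * x[1],
            m[1][0] * x[0] + m[1][1] * x[1]]


# Expected Y1 for a few values of the varied input X1.
for x1 in (0j, 0.5 + 0j, 0.5j, X0):
    y0, y1 = multiply(M, [X0, x1])
    print(x1, y1, abs(y1))
```

Since $Y_1 = \frac{1+i}{2}(X_0 - X_1)$, the expected output vanishes when $X_1 = X_0$, so any relative error measured at that point is dominated by whatever signal is left over.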


The top two plots show the differences between expected and actual amplitudes and phases for inputs at two phases and varying amplitudes. There’s no averaging, those are all single measurements, and we can see that noise isn’t a big problem. What may be less obvious from these plots, because of the chosen matrix, is that the nonlinearities also seem to be acceptable, and straight lines are straight. This gets a bit worse at higher signal amplitudes, and one way of improving that would be to sacrifice some of the resolution and use a slightly smaller range at the DAC.

The bottom left plot shows the relative error for given input values. We can clearly see that there’s a problem at $X_1 = 0.3+0.6i$. The expected output is $Y_1=0$, but clearly, that’s not happening. After comparing those results with measurements for different matrices it becomes obvious that some input power leaks into the outputs. This should also explain why the phase error plot looks better than the amplitude one. The leak is caused by the imperfect coupler geometry, something that came up earlier when I was showing the openEMS simulations, but which I haven’t really addressed in the design. This seems to be the biggest problem with the board that cannot be easily corrected in the calibration.

The bottom right plot shows actual and expected values for several different $X_0$. It doesn’t really tell us anything that the previous ones didn’t. The output phase is mostly fine, amplitude less so.

  1. This should explain the quantum computing book in the bibliography: it contains a lot of mathematical background regarding unitary matrices that I use in this post.