					Software Radio for Digital Satellite

                   Edmund Tse

               Supervisor: Prof. Branka Vucetic
        Associate Supervisor: Dr. Zhendong Kyle Zhou
        This thesis is submitted in partial fulfillment of
               the requirements for the degree of
    Bachelor of Electrical Engineering (Telecommunications)

       School of Electrical and Information Engineering
                  The University of Sydney

                      17 November 2010

   Satellite television broadcasting is used around the world to deliver both free-to-air and
subscription-based media services direct to consumer homes. Currently, in order to receive these
services, a dedicated hardware device known as the set-top box must be purchased and
installed. One drawback of these devices is that whenever a new broadcasting standard
is released, it is very difficult to migrate existing customers to the new standard because
the dedicated hardware devices must be replaced.
   We investigate the use of a software radio approach to satellite television transmission and
reception. With this approach, standards and specific radio characteristics are completely
defined in software, while the hardware radio components remain entirely generic. All of the
functionality performed by a software radio device can be completely reconfigured through a
software download.
   We implement a DVB-S compatible satellite television receiver using GNU Radio (a free
software radio framework), a low-cost Universal Software Radio Peripheral and a general purpose
computer. We exploit the increasing computational capacity of today’s computers to provide
the basic needs for a purely software based decoder. We conducted extensive experiments to
show the validity of our approach as well as to provide an indication of how close we are to
realtime processing using general purpose hardware.


   I would like to thank my supervisor, Kyle Zhou, for his patience, and for the continual
support and academic guidance that were indispensable to this honours project throughout my final
year of undergraduate studies.

                                                             Contents

Abstract

Acknowledgements

List of Figures

List of Tables

Chapter 1  Introduction
  1.1  Software Radio
  1.2  Digital Television
  1.3  Contributions

Chapter 2  Background
  2.1  The DVB-S Standard
  2.2  Software Radio
  2.3  Challenges of the software radio
  2.4  The Universal Software Radio Peripheral
  2.5  GNU Radio as a Software Radio Platform
  2.6  Related Work

Chapter 3  System Design
  3.1  Transmit path
  3.2  Receive path
  3.3  Testing and quality assurance
  3.4  Performance tuning

Chapter 4  Experimental Results
  4.1  Experimental Setup
  4.2  Validation using a real satellite data stream
  4.3  Performance measurements
  4.4  Utilisation of Parallel Computation Cores

Chapter 5  Conclusion
  5.1  Summary
  5.2  Future Work
  5.3  Conclusion

Appendix A  USRP Daughterboards

Appendix B  List of implemented GNU Radio blocks

Appendix C  List of unit tests
  C++ unit tests
  Python unit tests

Appendix D  GNU Radio Installation

Appendix E  Test computer specifications
  Hardware Components
  Software Configuration

Appendix F  Complete receive path block structure and interfaces

Bibliography

                                         List of Figures

2.1   Functional block diagram of the DVB-S system
2.2   Randomiser linear shift feedback register configuration
2.3   Randomised MPEG-2 transport multiplex packets
2.4   Forney convolutional interleaver with interleaving depth I = 12
2.5   QPSK Constellation
2.6   The ideal software radio
2.7   Photograph of a Universal Software Radio Peripheral
2.8   Simplified USRP Block Diagram
2.9   USRP Internals
2.10  The DBSRX daughterboard

3.1   DVB-S software radio receive chain
3.2   DVB-S Demodulator
3.3   Depuncture-Viterbi hierarchical block structure
3.4   Illustration of complex symbol ambiguity
3.5   Sync search algorithm - state transitions

4.1   Computer to satellite dish connection
4.2   Screen capture of MPEG-2 TS playback using MPlayer
4.3   Throughput of individual receiver blocks, normalised against requirement for realtime
      signal processing on a Core 2 Quad computer
4.4   Relative processing duration for individual signal processing blocks as a proportion of
      all processing required for realtime signal reception

                                      List of Tables

2.1   Punctured convolutional code definition

3.1   Reed-Solomon parameters
3.2   Automatic gain control parameters

4.1   Satellite equipment and transponder characteristics
4.2   Minimum throughput for receiver blocks
4.3   Receiver blocks throughput

E.1   Test computer specifications
E.2   Computer operating parameters

                                   Chapter 1  Introduction


1.1 Software Radio

Software radio promises to liberate radio transceivers from the constraint of adhering to a single
radio communication standard by enabling reconfiguration simply by reprogramming the signal
processing software. Its goal is to shift the boundary between hardware and software as close
as possible towards the antenna, thereby transforming traditionally hardware problems into
software problems that are potentially easier to solve. This allows the design of flexible and
cost efficient implementations of multi-standard terminals.

The term “software radio” was introduced by Joseph Mitola III in 1992 when he first published
the idea [1], but it has since also been known as software defined radio [2], reconfigurable
radio, and cognitive radio [3]. These terms share the common definition that the radio
interface is redefinable via software; they differ only in how they qualify the level of
flexibility that the radio system offers. As defined by the Wireless Innovation Forum [4],
software defined radio is “radio in which some or all of the physical layer functions are
software defined.” Only a software download is required for existing hardware to adopt a new
radio communication standard.

Despite expectations for widespread adoption by the year 2006 [5], a number of challenges [6]
prevented this from being realised. For example, software radio needs an open software
architecture that supports portability across domains, transport mechanisms and programming
languages. It also demands vast computational capacity at low power consumption to
accommodate the signal processing requirements. Nonetheless, active development of the software
architecture from the military domain [7] and increasingly faster processors are bringing us
ever closer to realising the generic software radio.

A free software radio toolkit project called “GNU Radio” [8] was founded by Eric Blossom
to provide signal processing blocks for implementing software radios using low cost external
radio frequency (RF) hardware and general purpose commodity computers. It also includes
a software framework on which custom signal processing blocks can be easily developed
and integrated with existing ones. As it is free software and requires nothing but a general
purpose computer to run, GNU Radio is much more accessible than other specialised
software radio platforms.

Even though it is called “software radio”, there is still some hardware involved, whose role is
to provide an interface between analog radio signals and digital samples usable by the
computer. One example of such a device, called the “Universal Software Radio Peripheral”,
was designed by Matt Ettus to be a bidirectional analog-digital converter that connects to the
computer via standard Universal Serial Bus (USB) revision 2.0. Once the radio spectrum has
been sampled for use by the computer, the data stream can be passed to GNU Radio for purely
software based signal processing.

1.2 Digital Television

It was widely recognised that with maturing digital video technology [9], the future of television
services would be digital [10]. There are at least three different digital High Definition
Television (HDTV) standards in use around the world [11]: the Japanese MUSE [12], the European
DVB [13] and the North American ATSC [14; 15]. The abundance of different standards for digital
television makes it an excellent candidate for a software radio implementation, because a
receiver can switch to another standard simply by loading the corresponding software packages.

An even more compelling reason to explore the use of general purpose computers for software
radio based digital television is that there is already a consumer market for digital television
receiver peripherals for the personal computer. Some of the leading brands in this space today
are Hauppauge, AVerMedia, Dvico, Compro Technology and Leadtek, all of whom take the
traditional approach of using inflexible, application specific hardware that cannot be
reconfigured for use with other standards.

We place our focus on the Digital Video Broadcasting (DVB) family of international standards
[13]. The technology areas covered by DVB include specialisations DVB-T [16] for terrestrial
broadcasting, DVB-H [17] for battery-powered hand-held devices, DVB-S [18] and DVB-S2
[19] for satellite systems and DVB-IP [20] for media over IP networks. In Australia, DVB-T is
used for terrestrial digital television services, while for satellite transmission DVB-S and DVB-
S2 are used. While all of the DVB specialisations have much in common, we shall concentrate
on the dominant Australian satellite television standard DVB-S, and implement this using a
software radio approach.

1.3 Contributions

The contributions of this treatise are as follows:

       • We present an extensive survey of software radio in research and report on its recent
         development.
       • We provide an implementation of the DVB-S coding and modulation standard on the
         GNU Radio software radio platform.
       • Using this implementation, we analyse the computational demand for each component
         signal processing block to provide empirical evidence to support our evaluations.

The remainder of this treatise is organised as follows: In Chapter 2, we present an overview
of the DVB-S standard, software radio as a research field and explore our software radio plat-
form in detail. We also survey related work on implementing media broadcast standards using
software radio. Chapter 3 describes our approach to the engineering problem and provides
specifications and details about our implementation. In Chapter 4 we present our experimental
results and in particular discuss the performance measurements of our DVB-S implementation.
Finally, we conclude in Chapter 5 summarising our work.

                                   Chapter 2  Background


2.1 The DVB-S Standard

DVB-S is a general system for transmission and broadcast of digital television. It can be used to
provide television services direct to home (DTH), to convey media streams between studios, or
even to uplink satellite connections in digital satellite news gathering (DSNG) applications [18].

The DVB technical module (DVB-TM) with the cooperation of a large number of European
organisations agreed upon a system definition that led to its standardisation by the European
Telecommunications Standards Institute (ETSI) at the end of 1993.

DVB-S was designed to operate efficiently on non-linear, wideband and power limited satellite
channels affected by noise, interference and distortion. At its heart is a flexible error
protection technique based on the concatenation of Reed-Solomon and convolutional codes.

Conceptually, the DVB-S transmission system has several stages, as shown in Figure 2.1. It
is defined as the functional block of equipment that adapts the output of an MPEG-2 transport
multiplexer to the satellite channel characteristics [21]. The source video, audio and data
streams are encoded according to the MPEG-2 [22] standard source codec, then combined into
a single transport stream using the MPEG-2 transport multiplex. This transport stream (MPEG-2
TS) is structured such that each packet has a fixed length of 188 bytes, beginning with 1 sync
byte and 3 header bytes containing packet identifiers. MPEG-2 TS does not include any error
protection on the packet headers, so it is unsuitable for direct transmission over the satellite
channel. DVB-S adapts MPEG-2 TS to the satellite channel through powerful channel coding and
modulation techniques that minimise the effect of channel noise, interference and distortion.

                Figure 2.1: Functional block diagram of the DVB-S system

            Figure 2.2: Randomiser linear shift feedback register configuration
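
As a small illustration of the packet structure described above, the following Python sketch reads the four leading bytes of one transport stream packet. The field layout (sync byte 0x47, error and payload-start flags, 13-bit packet identifier, 4-bit continuity counter) follows the MPEG-2 systems specification rather than this treatise, and the function name `parse_ts_header` is hypothetical.

```python
TS_PACKET_LEN = 188   # fixed MPEG-2 TS packet length in bytes
TS_SYNC_BYTE = 0x47   # value of the sync byte that starts every packet


def parse_ts_header(packet):
    """Return the main header fields of one 188-byte MPEG-2 TS packet."""
    if len(packet) != TS_PACKET_LEN:
        raise ValueError("a transport stream packet must be 188 bytes long")
    if packet[0] != TS_SYNC_BYTE:
        raise ValueError("packet does not start with the sync byte 0x47")
    b1, b2, b3 = packet[1], packet[2], packet[3]
    return {
        "transport_error": bool(b1 & 0x80),     # set by a decoder on failure
        "payload_unit_start": bool(b1 & 0x40),  # a new payload unit begins
        "pid": ((b1 & 0x1F) << 8) | b2,         # 13-bit packet identifier
        "continuity_counter": b3 & 0x0F,        # 4-bit per-PID counter
    }
```

None of these fields carries its own error protection, which is why the channel coding stages described next are needed.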

The first function block is the data randomiser, which scrambles the input MPEG-2 TS data in
order to comply with radio regulations for spectrum occupancy. Scrambling is based on a
pseudo-random binary sequence (PRBS) generated by a linear feedback shift register (LFSR),
synchronised to a frame of 8 MPEG-2 packets. The framing structure is illustrated in Figure
2.3. The LFSR has a generator polynomial of $1 + X^{14} + X^{15}$, and its registers are
initialised with the sequence “100101010000000”. This sequence is also loaded into the registers
at the start of every frame of eight packets. To enable the derandomiser to synchronise its PRBS,
the first sync byte in every frame is bitwise inverted from 0x47 to 0xB8.

The randomiser applies the PRBS starting at the first bit after the first inverted sync byte, i.e.
at the 9th bit of each frame of 8 packets. The PRBS is XORed with the subsequent 1503 input
bytes until the end of the frame. To aid synchronisation, the second to the eighth sync bytes
remain unrandomised, but the PRBS continues to clock during those bytes.
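
The randomiser described above can be sketched in a few lines of Python. This is an illustrative model only, not the GNU Radio block developed later in this treatise. It assumes that the leftmost character of the initialisation sequence sits in the first register stage, that the PRBS bit is the XOR of stages 14 and 15, and that PRBS bits are packed into bytes most significant bit first. The names `prbs_bytes` and `randomise_frame` are hypothetical.

```python
PACKET_LEN = 188
FRAME_LEN = 8 * PACKET_LEN        # one randomiser frame is 8 packets
INIT_STATE = 0b100101010000000    # "100101010000000", stage 1 in the top bit


def prbs_bytes(count):
    """Generate `count` PRBS bytes for the polynomial 1 + X^14 + X^15."""
    state = INIT_STATE
    out = bytearray()
    for _ in range(count):
        byte = 0
        for _bit in range(8):
            bit = (state ^ (state >> 1)) & 1       # stage 15 XOR stage 14
            state = (state >> 1) | (bit << 14)     # feed back into stage 1
            byte = (byte << 1) | bit               # assumed MSB-first packing
        out.append(byte)
    return bytes(out)


def randomise_frame(frame):
    """Randomise (or derandomise) one frame of eight 188-byte packets."""
    if len(frame) != FRAME_LEN:
        raise ValueError("a frame must contain exactly 8 packets")
    out = bytearray(frame)
    out[0] ^= 0xFF                                 # 0x47 <-> 0xB8
    # The PRBS starts after the first sync byte and covers 1503 bytes.
    for index, mask in enumerate(prbs_bytes(FRAME_LEN - 1), start=1):
        if index % PACKET_LEN != 0:                # leave sync bytes 2..8 alone
            out[index] ^= mask                     # PRBS still advances on them
    return bytes(out)
```

Because XOR is its own inverse and the sync inversion toggles between 0x47 and 0xB8, applying `randomise_frame` twice returns the original frame, so in this sketch the same function doubles as the derandomiser once frame alignment is known.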

The randomised packets are then encoded with the Reed-Solomon code RS(204,188), shortened from
the original RS(255,239) code. This protects the data by appending 16 redundant bytes,
allowing correction of up to T = 8 erroneous bytes in each packet. The shortened RS
code may be implemented by padding the 188-byte packet with 51 zero bytes into a block of
length 239 bytes. After encoding, the zero bytes are simply discarded, leaving only the original
packet and the 16 redundant bytes. The Reed-Solomon code has the code generator polynomial
$g(x) = (x + \lambda^0)(x + \lambda^1)(x + \lambda^2) \cdots (x + \lambda^{15})$, where
$\lambda = \mathrm{0x02}$, and the field generator polynomial
$p(x) = x^8 + x^4 + x^3 + x^2 + 1$.

               Figure 2.3: Randomised MPEG-2 transport multiplex packets

        Figure 2.4: Forney convolutional interleaver with interleaving depth I = 12
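
To make the encoding step concrete, the sketch below builds the field GF(2^8) from p(x), forms g(x) from the roots λ^0 to λ^15 and computes the 16 parity bytes by polynomial division. It is an unoptimised illustration that assumes a systematic encoder with the first packet byte as the highest-order coefficient; it covers encoding only, and all function names are hypothetical.

```python
PRIM_POLY = 0x11D   # p(x) = x^8 + x^4 + x^3 + x^2 + 1
NROOTS = 16         # number of redundant bytes

# Exponent and logarithm tables for GF(2^8) with primitive element 0x02.
EXP = [0] * 510
LOG = [0] * 256
_value = 1
for _i in range(255):
    EXP[_i] = _value
    LOG[_value] = _i
    _value <<= 1
    if _value & 0x100:
        _value ^= PRIM_POLY
for _i in range(255, 510):
    EXP[_i] = EXP[_i - 255]


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly():
    """Coefficients of g(x), highest degree first, roots 0x02^0 .. 0x02^15."""
    g = [1]
    for i in range(NROOTS):
        shifted = g + [0]                       # g(x) * x
        for j, coeff in enumerate(g):
            shifted[j + 1] ^= gf_mul(coeff, EXP[i])   # + g(x) * root
        g = shifted
    return g


def rs_parity(message):
    """Remainder of message(x) * x^16 divided by g(x)."""
    g = generator_poly()
    rem = [0] * NROOTS
    for byte in message:
        feedback = byte ^ rem[0]
        rem = rem[1:] + [0]
        if feedback:
            for j in range(NROOTS):
                rem[j] ^= gf_mul(g[j + 1], feedback)
    return bytes(rem)


def rs_encode(packet):
    """Append 16 parity bytes to a 188-byte packet, giving 204 bytes."""
    return bytes(packet) + rs_parity(packet)


def syndromes(codeword):
    """Evaluate the codeword at each root of g(x); all zero if error free."""
    result = []
    for i in range(NROOTS):
        acc = 0
        for byte in codeword:
            acc = gf_mul(acc, EXP[i]) ^ byte
        result.append(acc)
    return result
```

Leading zero bytes leave the division remainder at zero, so in this sketch the parity of the 188-byte packet equals the parity of the same packet padded to 239 bytes, which is why the padding can be skipped in practice.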

After Reed-Solomon encoding, the packets are convolutionally interleaved based on the Forney
approach (see Figure 2.4), with interleaving depth of I = 12. Conceptually, it consists of 12
parallel first-in-first-out (FIFO) shift registers of memory lengths from 0 to 11 × 17 bytes. The
bytes are routed to each shift register in cyclic order, with the input branch index synchronised
to the output branch index. Sync bytes are always routed to branch “0”, which corresponds
to zero delay and thus remain in the same position every 204 bytes. The deinterleaver has
a similar structure to the interleaver, with the difference being the branches have decreasing
memory lengths with branch index. That is, the shift register on branch “0” of the deinterleaver
has length 11 × 17, and null delay for branch “11”. Sync bytes remain routed via branch “0”.

   Code rate   1/2   2/3        3/4     5/6        7/8
   d_free      10    6          5       4          3
   X           1     1010       101     10101      1000101
   Y           1     1111       110     11010      1111010
   I           X1    X1 Y2 Y3   X1 Y2   X1 Y2 Y4   X1 Y2 Y4 Y6
   Q           Y1    Y1 X3 Y4   Y1 X3   Y1 X3 X5   Y1 Y3 X5 X7

                      Table 2.1: Punctured convolutional code definition
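
The interleaver and deinterleaver pair described above can be modelled with one FIFO per branch. The Python sketch below is a minimal illustration, assuming the FIFOs start filled with zero bytes and that the first input byte is aligned to branch 0; the class name `ConvolutionalInterleaver` is hypothetical.

```python
from collections import deque

DEPTH = 12   # interleaving depth I
UNIT = 17    # FIFO cell size M in bytes, so that I * M = 204


class ConvolutionalInterleaver:
    """Forney-style interleaver with one FIFO per branch."""

    def __init__(self, delays):
        # Each branch starts with `delay` zero bytes already queued.
        self.fifos = [deque([0] * delay) for delay in delays]
        self.branch = 0

    def process(self, data):
        out = bytearray()
        for byte in data:
            fifo = self.fifos[self.branch]
            fifo.append(byte)
            out.append(fifo.popleft())   # a zero-length FIFO passes through
            self.branch = (self.branch + 1) % len(self.fifos)
        return bytes(out)


def make_interleaver():
    # Branch j delays its bytes by j * 17 branch visits.
    return ConvolutionalInterleaver([j * UNIT for j in range(DEPTH)])


def make_deinterleaver():
    # Branch j delays its bytes by (11 - j) * 17 branch visits.
    return ConvolutionalInterleaver(
        [(DEPTH - 1 - j) * UNIT for j in range(DEPTH)])
```

Every byte passes through 11 × 17 FIFO cells in total across the two devices, and each cell advances once per 12 input bytes, so the end-to-end delay in this model is 12 × 11 × 17 = 2244 bytes.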

DVB-S allows for a range of code rates, to accommodate varying amounts of error correction
for given service or source data rate requirements. The interleaved frame is convolutionally
encoded with constraint length K = 7 to obtain a mother code of rate 1/2, which is then punctured
into a resultant code rate of 1/2, 2/3, 3/4, 5/6 or 7/8. The generator polynomials for the
convolutional code are $G = (171_{\mathrm{OCT}}, 133_{\mathrm{OCT}})$. The puncturing matrices
for each of the code rates are given in Table 2.1.
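
The mother code and the puncturing patterns of Table 2.1 can be sketched as follows. This is an illustration under stated assumptions: the most significant bit of each octal generator taps the current input bit, the X output uses 171 and the Y output uses 133, and surviving bits are sent in the serial order X, Y per input bit and paired consecutively into I and Q. The function names are hypothetical.

```python
G_X = 0o171   # generator for the X output
G_Y = 0o133   # generator for the Y output

# Puncturing patterns (X row, Y row) copied from Table 2.1; "1" means kept.
PUNCTURE = {
    "1/2": ("1", "1"),
    "2/3": ("1010", "1111"),
    "3/4": ("101", "110"),
    "5/6": ("10101", "11010"),
    "7/8": ("1000101", "1111010"),
}


def parity(value):
    """Return 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def conv_encode(bits):
    """Rate 1/2, K = 7 encoder returning one (X, Y) pair per input bit."""
    register = 0
    pairs = []
    for bit in bits:
        register = (bit << 6) | (register >> 1)   # newest bit in the top tap
        pairs.append((parity(register & G_X), parity(register & G_Y)))
    return pairs


def puncture(pairs, rate):
    """Drop the outputs marked "0" in the pattern for `rate`."""
    x_pattern, y_pattern = PUNCTURE[rate]
    sent = []
    for k, (x, y) in enumerate(pairs):
        if x_pattern[k % len(x_pattern)] == "1":
            sent.append(x)
        if y_pattern[k % len(y_pattern)] == "1":
            sent.append(y)
    return sent
```

For example, `puncture(conv_encode(bits), "3/4")` keeps four of every six mother code bits. Taking the even positions of the result as I and the odd positions as Q reproduces the I and Q rows of Table 2.1.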

The concatenation of Reed-Solomon and convolutional coding schemes is a feature of the DVB
transmission systems for strong error correction. Errors at the output of the Viterbi decoder
often occur as bursts, which are spread out over different packets by the deinterleaver to
improve the error correction capability of the Reed-Solomon code.

Finally, the encoded bits are mapped to the QPSK signal constellation shown in Figure 2.5 using
absolute mapping. The I and Q signals, each represented as a train of Dirac delta functions
spaced by the symbol duration $T_S = 1/R_S$, are square root raised cosine filtered to obtain
the baseband signal shape. This root raised cosine filter is defined by:

$$
H(f) =
\begin{cases}
1 & \text{for } |f| < f_N (1 - \alpha); \\[4pt]
\left\{ \dfrac{1}{2} + \dfrac{1}{2} \sin\!\left[ \dfrac{\pi}{2 f_N} \cdot \dfrac{f_N - |f|}{\alpha} \right] \right\}^{1/2}
  & \text{for } f_N (1 - \alpha) \le |f| \le f_N (1 + \alpha); \text{ and} \\[4pt]
0 & \text{for } |f| > f_N (1 + \alpha),
\end{cases}
$$

where $f_N = \dfrac{1}{2 T_S} = \dfrac{R_S}{2}$ is the Nyquist frequency, and $\alpha = 0.35$ is
the roll-off factor.

                              Figure 2.5: QPSK Constellation
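
The piecewise definition of H(f) translates directly into code. The short Python sketch below evaluates the frequency response for a given symbol rate; the function name `rrc_response` is hypothetical, and the symbol rate used in the usage note is an arbitrary value chosen for illustration only.

```python
import math


def rrc_response(f, symbol_rate, alpha=0.35):
    """Square root raised cosine frequency response H(f)."""
    f_n = symbol_rate / 2.0            # Nyquist frequency f_N = R_S / 2
    f = abs(f)
    if f < f_n * (1 - alpha):
        return 1.0                     # flat passband
    if f > f_n * (1 + alpha):
        return 0.0                     # stopband
    # Transition band: square root of the raised cosine roll-off.
    inner = 0.5 + 0.5 * math.sin(math.pi / (2 * f_n) * (f_n - f) / alpha)
    return math.sqrt(max(inner, 0.0))  # guard against rounding below zero
```

With an illustrative symbol rate of 27.5 Msymbol/s, `rrc_response(13.75e6, 27.5e6)` evaluates the response at the Nyquist frequency, where the squared magnitude is one half. When the same filter is used at both transmitter and receiver, the cascade is a full raised cosine response.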

2.2 Software Radio

The ideal software radio enables RF signal processing, including filtering and conversion from
RF to lower frequencies, to be done digitally by placing the analog-to-digital converter (ADC)
as close as possible to the antenna [23]. This gives full software programmability because all
but the initial signal amplification is done in software. In the ideal software radio as shown in
Figure 2.6, hardware is extremely simple and all of the radio functions are software defined.
However, this is a rather simplistic view of the necessary hardware technologies for software
radio because there are a number of technical challenges, which we shall review in a later
section.

                             Figure 2.6: The ideal software radio

The concept of the software radio brings many opportunities. The idea was conceived
during the development phase of third-generation (3G) cellular telephony systems, when the value
of having a universal handset compatible with the different next generation systems defined
in different regions was well recognised [24]. Mobile services subscribers would find
international roaming much easier, and manufacturers would have greater economies of scale and
better flexibility in updating any existing hardware devices for emerging standards [25].

A generic hardware device where the application specific complexities are defined in software
has the potential to reduce cost, circuit complexity, power consumption and the size of the phys-
ical radio [26]. Software radios can also reduce cost and risk and provide enhanced capabilities
for NASA deep space missions [27].

2.2.1 Evolution of the software radio

Early software radio concepts date from the 1970s and can be found in a Defense Advanced
Research Projects Agency (DARPA) technology pathfinder project called SPEAKeasy [28], a
multi-phase research and development program for the next-generation military radio. Its
motivation was to extend the service lifetime of radio devices by developing a multi-band
multi-mode radio. With a programmable digital signal processing capacity on the order of one
billion 16-bit integer operations per second and 200 million 32-bit floating-point operations
per second (FLOPS), and an operating frequency range from 2 to 2000 MHz, it was designed to
emulate any existing or future military radio through the use of a suite of software modules
with custom sets of parameters. SPEAKeasy was a proof of concept for low rate data and voice
communications, and was not a widely deployed product.

Traditional radio systems are implemented in hardware as application specific integrated
circuits (ASICs), and are designed to support one specific waveform by tailoring the hardware
components and the construction of the system to that waveform. Waveform in this context
refers to the entire set of specifications to convert a message into an RF signal, including
frequency, modulation, coding, message format and the transmission system. As a result, two
parties in possession of different radio systems are unable to communicate with each other due
to this incompatibility. To implement a radio that is able to reconfigure itself,
several different approaches can be used [5].

One approach is to use digital signal processors (DSPs), which are essentially relatively slow
general purpose microprocessors with architectural optimisations for extremely fast signal
processing. DSPs have become significantly more powerful in recent years [29] with higher
degrees of parallelism, incorporating multiple data paths and execution units. While DSPs
typically have high power dissipation, innovations in processor fabrication technology continue
to drive DSP performance with increasing clock speeds and transistor counts while at the same
time improving their power efficiency. This makes DSPs prime candidates for implementing
software radio systems.

Another approach is to use field programmable gate arrays (FPGAs), which are integrated
circuits whose logic gate interconnections and logic functions can be redefined after
manufacturing [30]. FPGAs are designed using hardware description languages such as VHDL or
Verilog, and are often used as a low-cost alternative to the design of ASICs, especially for
low-volume applications. It is possible to design the hardware layout for pipelined or even
parallel data flows. Some FPGAs even allow dynamic reconfiguration of a portion of the chip while
other parts of the chip are still operational. With the added benefit of compilers being created
to generate FPGA code directly from Matlab, Simulink or C, FPGAs offer complete hardware
reconfigurability, with their hardware logic reorganisable by software reprogramming, even though
they are expensive, lack power management features found in ASICs and are physically rather

A third approach for implementing software radio is the use of a completely general purpose
microprocessor or microcontroller. Even though the use of a small embedded microcontroller
does allow full software reconfigurability, it is generally not considered to be a feasible option
because of its slow performance. There are, however, innovative platforms known as
heterogeneous multiprocessor [31] solutions that use the microcontroller to parameterise high
speed interconnected DSPs or dedicated hardware.

At the other end of the spectrum, high powered general purpose processors could be used for
signal processing. While this approach is unlikely to become the solution for the universal
mobile handset in the near future due to computational speed, power and heat dissipation
constraints, everyday desktop computers are becoming ever faster and thus more capable of signal
processing in general. One of the key developments for the general purpose computing based
software radio is the software communications architecture (SCA) [32; 33], designed to allow
standardised development of hardware and software components for seamless integration [34].

2.3 Challenges of the software radio

There are many technical challenges that prevent such a device from being widely adopted [35].
We begin by examining the hardware limitations starting from the most essential part of any
radio system: the antenna.

Ideally, we would have a wideband antenna capable of operating from around 100 megahertz
up to several gigahertz or higher. However, physics dictates that the operating frequency and
bandwidth of an antenna are determined by its size and shape. One way to achieve a
wider operating range is through the use of smart materials like the micro electro-mechanical
system (MEMS), which could be used to change an antenna’s operating frequency [36].

Another issue we encounter is designing the radio’s power amplifiers to have a linear response
across a broad range of frequencies [23]. Although there are signal modulation formats that
are immune to non-linear distortions, many other communication schemes involving the
manipulation of amplitude, such as quadrature amplitude modulation (QAM), would be adversely
affected.

The next challenge concerns the analog to digital converters (ADCs) and digital to analog
converters (DACs), which must sample at a rate of at least twice that of the highest frequency
component of the signal (Nyquist rate) to avoid the effects of signal aliasing, giving an upper
bound on the signal bandwidth. As well as sampling speed, they also need to have sufficient
resolution and dynamic range to ensure acceptable signal-to-noise ratio (SNR) and adequate
sensitivity to weaker signals.

Once the signal is converted from analog to digital, all of the subsequent signal processing is
done according to some software definition, either as a hardware description for programmable
logic, or as a sequence of processor instructions for the DSP or the general purpose processor.
Thus it is a fundamental challenge to achieve sufficient computational capacity to accommodate
processing of the radio frequency signals at acceptable cost and power consumption. The
ultimate flexibility of such processors, combined with their remarkable performance increases
year after year, shows great promise in delivering the required computational capacity for
software radio.

With the advent of hardware based high performance computing accelerators such as general
purpose graphical processing units (GPGPUs) [37] and the Cell Broadband Engine [38], mul-
tiple core architectures have shown a significant improvement in performance. The GPGPU in
particular, may be attractive in the future for use as an inexpensive software radio solution for
general purpose computers due to its consumer market penetration as graphics hardware and
its relatively low price point. Computational capacity will always be in demand with modula-
tion and coding systems ever increasing in complexity due to higher baseband bandwidth and
spectral efficiency requirements [39].

There are studies of the software architectural challenges [40; 41; 6], although these are beyond
the scope of this treatise. We shall focus on the use of general purpose computers for software
radio.

FIGURE 2.7: Photograph of a Universal Software Radio Peripheral

2.4 The Universal Software Radio Peripheral

The Universal Software Radio Peripheral (USRP) is a hardware device developed by Ettus
Research LLC [42] that enables general purpose computers to be used as a platform for software
radio. It serves as a generic radio frontend by providing a digital baseband and IF section for
the radio system [43]. The USRP product family consists of the motherboards that perform
IF digitising, digital up and down conversion, decimation and interpolation; as well as various
daughterboards to cover different frequency ranges. The USRP motherboard hosts several key
components: ADCs and DACs for signal conversion, an FPGA and a universal serial bus (USB)
interface to connect to the computer.

The USRP features 4 high speed ADCs with a sampling rate of 64 million samples per second
(MS/s) at a resolution of 12 bits. This means it can theoretically sample an IF signal with
bandwidth of up to 32 MHz. For the transmit path, there are 4 high speed DACs with a clock
frequency of 128 MS/s at a resolution of 14 bits, giving a Nyquist frequency of 64 MHz.

The FPGA used in the USRP is the Altera Cyclone EP1C12, and is responsible for digital up
and down conversion between baseband and IF band, spectral shaping and out-of-band signal
rejection. It also handles decimation of the signal so that the data rates are within the limits of
the USB controller and the computer.

FIGURE 2.8: Simplified USRP Block Diagram

The high speed USB 2.0 controller chip used is the Cypress FX2, and can sustain a data rate
of 32 MB/s. This equates to 8 million 32-bit complex samples per second, and a Nyquist
frequency of 8 MHz since complex samples were used.
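
The arithmetic behind these figures can be checked with a short calculation. The sketch below assumes that a 32-bit complex sample consists of a 16-bit I value and a 16-bit Q value; the function names are ours and exist only for illustration.

```python
USB_BYTES_PER_SECOND = 32 * 10**6    # sustained USB 2.0 rate quoted above
ADC_SAMPLES_PER_SECOND = 64 * 10**6  # ADC sampling rate quoted above
BYTES_PER_COMPLEX_SAMPLE = 4         # assumed: 16-bit I + 16-bit Q


def host_sample_rate(usb_bytes_per_second, bytes_per_sample):
    """Complex samples per second that fit through the USB link."""
    return usb_bytes_per_second // bytes_per_sample


def minimum_decimation(adc_rate, host_rate):
    """Smallest integer decimation that keeps the stream within the host rate."""
    return -(-adc_rate // host_rate)  # ceiling division


rate = host_sample_rate(USB_BYTES_PER_SECOND, BYTES_PER_COMPLEX_SAMPLE)
decimation = minimum_decimation(ADC_SAMPLES_PER_SECOND, rate)
print(rate, decimation)
```

Because each complex sample carries both an in-phase and a quadrature value, a stream of 8 million complex samples per second can represent 8 MHz of spectrum, and the FPGA must decimate the 64 MS/s ADC stream by at least a factor of 8 to stay within the USB limit.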

The USRP motherboard features four expansion slots (two transmit ports and two receive ports)
for connecting to a variety of daughterboards that contain RF to IF conversion circuitry. A list of
daughterboards available for purchase via the Ettus Research website [42] is given in Appendix
A. For DVB-S reception, we used the DBSRX daughterboard revision 2.2 (2007), which
was superseded on 20 Oct 2010 by the DBSRX2 daughterboard that uses the universal
hardware driver (UHD).

FIGURE 2.9: USRP Internals. (a) USRP Motherboard; (b) With daughterboards attached

FIGURE 2.10: The DBSRX daughterboard

In 2009, Ettus Research began to ship the second generation of the USRP family, the USRP2.
It features a faster FPGA (Xilinx Spartan 3), faster and higher resolution ADCs and DACs, and
a higher bandwidth connection to the computer via Gigabit Ethernet. For our project, the first
generation USRP is adequate for testing purposes.

2.5 GNU Radio as a Software Radio Platform

Once the radio signal has been down-converted and digitised, the desktop computer is used for
signal waveform processing. There are several software radio solutions available, but we shall
focus on GNU Radio [8], the free software radio framework for which the USRP was designed.
GNU Radio is licensed under the GNU General Public License (GPL) version 3.

GNU Radio was designed as a software radio platform both for experimenting with digital
communications and for performing signal processing using low cost RF hardware and commodity
computer hardware. The toolkit ships with a number of signal processing blocks,
including signal sources and sinks, FIR and IIR filters, modulation and
demodulation, interpolation and decimation, gain control, Fourier and wavelet transforms, and
even an implementation of the North American digital television Advanced Television Systems
Committee (ATSC) standard.

The project started as early as 1998, with code based on PSpectra, an MIT SpectrumWare
virtual radio project written in the C++ programming language, which has a low scheduling
overhead and a potential for near linear speedup with symmetric multiprocessing (SMP), a common
computer architecture found in most modern computers. It was also designed so that
adding a new signal processing block is very easy.

GNU Radio in conjunction with the USRP has proven to be a popular choice among hobbyists
and research groups as a testbed platform [44], mainly due to its accessibility and low cost
compared with commercial platforms. Its performance even exceeded that of an alternative
software platform (OSSIE) that utilises the SCA infrastructure [45].

2.5.1 Concepts and Architecture

GNU Radio is primarily developed using the GNU/Linux operating system, but BSD, Mac OS
X and Windows are also supported. In GNU Radio, a radio system is represented as a directed
signal flow graph where graph vertices are known as signal processing blocks and edges indicate
a connection between the two blocks. Data flows in one direction from a signal source to one
or more signal sinks. This construction of software radio is similar to development of hardware
radios, but with an additional restriction that the signal flow in a flow graph cannot form a
feedback cycle, so implementation of any feedback mechanisms must be contained within one
signal processing block.

In GNU Radio, the signal processing blocks are defined in C++ for performance, while the
connections between the blocks for a given application are declared in Python. Using a high
level language like Python allows users to quickly create different applications by constructing
a signal flow graph simply by making connections between smaller building blocks. This
approach means that the agility of software development in a high level language can be maximised
while at the same time sidestepping its drawback of slow performance: Python acts only
as ‘glue’ code and the heavy lifting is offloaded to compiled C++ code. Interoperation and data
marshalling between Python and C++ is done by employing the Simplified Wrapper and Interface
Generator (SWIG).

GNU Radio uses a number of data types to represent the signal at the interfaces of each of the
signal processing blocks. The data type used by a particular block can usually be identified
through the naming convention that each block should be suffixed with a code to represent its
interface. For example, the block gr_rms_cf has a suffix of _cf, which indicates that the
block takes input as 8 byte complex values, and produces an output in 4 byte floating point
values. Similarly, the block gr_multiply_const_vss would take a vector of 2 byte short
integers and produce an output in the same format. Other possible data types include b for 1
byte integers, and i for 4 byte integer values.
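
The naming convention can be illustrated with a small helper that decodes a block name into its interface types. This is only a sketch of the convention as described above: the helper describe_block and the table TYPE_CODES are hypothetical names of our own, not part of GNU Radio, and the sketch only looks at the first two type codes of the suffix.

```python
# Type codes and item sizes (in bytes) as described in the text.
TYPE_CODES = {
    "b": ("byte", 1),
    "s": ("short", 2),
    "i": ("int", 4),
    "f": ("float", 4),
    "c": ("complex", 8),
}


def describe_block(name):
    """Decode the interface suffix of a block name such as 'gr_rms_cf'."""
    suffix = name.rsplit("_", 1)[-1]
    is_vector = suffix.startswith("v")       # leading 'v' marks vector items
    codes = suffix[1:] if is_vector else suffix
    if len(codes) < 2 or any(code not in TYPE_CODES for code in codes):
        raise ValueError("unrecognised suffix: %r" % suffix)
    return {
        "vector": is_vector,
        "input": TYPE_CODES[codes[0]],
        "output": TYPE_CODES[codes[1]],
    }


print(describe_block("gr_rms_cf"))
print(describe_block("gr_multiply_const_vss"))
```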

A new GNU Radio signal processing block is defined by deriving from the base class gr_block
or one of its subclasses gr_sync_decimator, gr_sync_interpolator, gr_sync_block,
or gr_hier_block2 in C++. Then, a SWIG interface is defined for this block, which en-
ables it to be constructed and connected from Python.

At the core of the signal processing block are two member functions: forecast() and
work(). forecast() returns an estimate of how many units of input data are required for
this module to produce a given number of output units. work() is the function that does the
actual computation on the input data and produces an output. This signal processing block
framework abstracts away the complexity of how one might schedule the work of multiple
signal processing blocks on the computer.
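
To make this division of labour concrete, the following is a toy model in plain Python. It is not the GNU Radio API: the class DecimateBy, the function run_once and their signatures are hypothetical, and a real scheduler is considerably more involved. The sketch only shows how a scheduler could use forecast() to decide how much work to request from work().

```python
class DecimateBy(object):
    """Toy block that keeps one sample out of every `factor` inputs."""

    def __init__(self, factor):
        self.factor = factor

    def forecast(self, noutput_items):
        # Inputs needed to produce the requested number of outputs.
        return noutput_items * self.factor

    def work(self, input_items, noutput_items):
        # The actual computation: pick every `factor`-th sample.
        return [input_items[i * self.factor] for i in range(noutput_items)]


def run_once(block, available, wanted):
    """Toy scheduler step: shrink the request until the forecast fits."""
    n = wanted
    while n > 0 and block.forecast(n) > len(available):
        n -= 1
    if n == 0:
        return [], available
    consumed = block.forecast(n)
    return block.work(available[:consumed], n), available[consumed:]


produced, leftover = run_once(DecimateBy(4), list(range(10)), 8)
print(produced, leftover)
```

In this model the block never needs to know how much data is buffered; it only answers the question of how much input a given amount of output would need.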

2.5.2 Example

Consider the following example, where we take signal input from a microphone connected to
the computer’s audio card, amplify the signal and then write the raw sample data to disk. We
can do so as given in Listing 2.1, after which we simply need to instantiate this class and call its
run() method. Of course, the data written to disk here are simply raw samples without
any header information, so the file is not compatible with common audio file formats recognisable
by the computer.

from gnuradio import gr, audio

class Audio_Sampler(gr.top_block):
    def __init__(self):
        gr.top_block.__init__(self)
        self.sample_rate = 48000  # Hz

        # Microphone input -> gain -> raw float samples on disk
        self.src = audio.source(self.sample_rate)
        self.amp = gr.multiply_const_ff(1000)
        self.dst = gr.file_sink(gr.sizeof_float, "audio_samples.dat")

        self.connect(self.src, self.amp, self.dst)

LISTING 2.1: Simple GNU Radio example

For further extended examples on building GNU Radio applications and writing new signal
processing blocks, there are a number of existing software radio implementations available on
the Comprehensive GNU Radio Archive Network (CGRAN) [46]. There are 23 projects hosted
on CGRAN at the time of writing, including an IEEE 802.11 receiver, ZigBee PHY, and RFID.
A detailed step-by-step tutorial on how to write a signal processing block is also available on
the GNU Radio website [47].

2.6 Related Work

There are a variety of implementations of media broadcasting standards using a software radio
approach. For example, the physical layer of the Digital Audio Broadcasting standard was
implemented by Elsner in 2007 [48], and later reimplemented by Müller in 2008 [49], both
using the GNU Radio framework.

For Digital Video Broadcasting, there is a partial implementation for DVB-T and DVB-H [50]
using a specialised software radio platform based on a single-instruction multiple-data (SIMD)
parallel DSP platform from Infineon called MuSIC [51], and a fully digital transmitter was
implemented using FPGAs [52].

More recently, a DVB-T transmitter named Soft-DVB was implemented [53] by Pellegrini
et al. Initially, they managed a processing time of 6.8 times realtime on the computer, but
the observation that the transmitter didn’t consume very much memory prompted them to employ
time/space trade-off techniques for optimisation. In particular, they improved the Reed-Solomon
encoder by pre-calculating all of the values in GF(2^8) into an array, and transformed
the required computation into simple lookup operations. They applied the same technique for
the Pseudo Random Binary Sequence (PRBS) generation, Transmission Parameter Signalling
(TPS) data and redundancy calculation, constellation mapping, interleaving and unpacking
bytes into bits for bit-level operations. Another optimisation they used was converting bit-level
operations to byte-level operations. Their implementation results are promising, because
they were able to achieve realtime transmission of a DVB-T signal using a computer that
has a 3.0 GHz Intel Pentium 4 single core processor after optimisations. Details of the work
can be found in Rose’s masters thesis [54].

Later, the group implemented a software radio based DVB-T receiver [55], detailed in Di Dio’s
masters thesis [56]. Even though they continued to use the USRP as the hardware platform, they
chose to use a custom software radio framework called “newRADIO” [57] written entirely in
C++ for efficiency gains and finer control over multi-threading. Similar to the transmitter, they
also employed memory and time trade-off techniques to improve the performance. Despite
the optimisations in place, the processing required for DVB-T reception was significantly
higher: they achieved realtime performance only on a considerably more modern computer
with an Intel Core 2 Quad Q9400 quad-core processor. This work provides a good benchmark
with which we can compare performance, while an overview of their memory trade-off
optimisations can be found in a follow-up paper [58].

                                         CHAPTER 3

                                      System Design

To design our DVB-S software radio application, we first looked at the functional blocks (see
Figure 2.1) we needed to implement the complete system. Where possible, we made use of
existing signal processing blocks that are part of the GNU Radio framework to avoid code
duplication. However, even its most recent release version 3.3 did not contain all of the com-
ponents required to assemble a complete DVB-S transmitter and receiver, thus it was necessary
to develop and write those missing blocks in order to implement a complete DVB-S signal
processing chain.

We decided to extend GNU Radio in a way similar to how existing modules are structured
within GNU Radio. Therefore, our extension takes the form of a new Python package called dvb
that contains blocks specific to the DVB family of standards, which is to be installed system-wide. This
way, end-user GNU Radio Python applications can access the new blocks by simply importing
this module with the statement import dvb in Python. The module name dvb was chosen
because the DVB family of video broadcasting standards share many common signal processing
blocks, particularly the forward error correction coding blocks. What differs between, say,
DVB-S and DVB-T is the baseband shaping and modulation: DVB-T uses OFDM while
DVB-S uses QPSK, appropriate for the satellite channel characteristics.

3.1 Transmit path

Looking at the DVB-S functional blocks for the transmit path, we see that the block structure
is relatively simple. It was mostly a one-to-one mapping between the functional blocks specified
in the standard and the GNU Radio blocks we implemented.

3.1.1 Data randomiser

In DVB-S, the input media data stream is randomised for energy dispersal by taking the XOR
of the data stream against a PRBS generated by an LFSR. The nature of this block means that
it could be implemented so that the data is randomised bit by bit, or so that multiple bits are
processed at a time. In designing the data interface for this block, we decided to perform
randomisation a byte at a time for computational efficiency, and the block should input and
output one MPEG-2 TS packet at a time. Processing inputs one packet at a time also allows us
to perform inspections on the packet structure and contents without keeping unnecessary state.

We introduce a new data type mpeg_ts_packet to represent a complete 188 byte MPEG-2
TS packet. Its memory layout consists of 188 bytes of the packet data, plus 68 bytes of
memory padding to fill the data type up to 256 bytes (a 2^n boundary) for optimal buffering by
GNU Radio.

The class dvb_randomizer_pp randomises its input sequence of MPEG-TS packets, and
has the following input and output signatures:

 Input:    mpeg_ts_packet
 Output: mpeg_ts_packet

The input and output signatures indicate that this block is designed to process a complete
MPEG-2 TS packet at a time. Internally, the block contains a complete software implementation
of the LFSR that produces the PRBS with generator polynomial 1 + X^14 + X^15, as well
as an internal counter to reset itself after each period of 8 packets. For every byte in the input
packet, the block first clocks the LFSR 8 times to generate the next 8 bits of the sequence, then
calculates the XOR result between these bits and the input byte.
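
The byte-wise randomisation can be sketched in plain Python as follows. This is an illustration rather than our C++ block. Several details are assumptions taken from our reading of the DVB-S specification and are not restated in this section: the LFSR initialisation word 100101010000000, the inversion of the first SYNC byte of each group of 8 packets, and the fact that the remaining SYNC bytes are left unrandomised while the PRBS keeps running. The names prbs_bytes and randomise_frame are hypothetical.

```python
SYNC = 0x47
PACKET_LEN = 188
FRAME_PACKETS = 8
# Assumed initialisation word of the 15-stage LFSR (stage 1 first).
INIT_STATE = (1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0)


def prbs_bytes(count):
    """Generate `count` PRBS bytes from the polynomial 1 + X^14 + X^15."""
    state = list(INIT_STATE)
    out = []
    for _ in range(count):
        byte = 0
        for _clock in range(8):              # clock the LFSR 8 times per byte
            bit = state[13] ^ state[14]      # taps at stages 14 and 15
            state = [bit] + state[:-1]       # feed back into stage 1
            byte = (byte << 1) | bit
        out.append(byte)
    return out


def randomise_frame(packets):
    """Randomise (or derandomise) one group of 8 transport stream packets."""
    if len(packets) != FRAME_PACKETS or any(len(p) != PACKET_LEN for p in packets):
        raise ValueError("expected 8 packets of 188 bytes")
    # The PRBS starts after the first (inverted) SYNC byte of the group.
    prbs = prbs_bytes(FRAME_PACKETS * PACKET_LEN - 1)
    out = []
    for p, packet in enumerate(packets):
        data = bytearray(packet)
        if p == 0:
            data[0] ^= 0xFF                  # invert the first SYNC byte
        for i in range(1, PACKET_LEN):       # SYNC bytes are not randomised
            data[i] ^= prbs[p * PACKET_LEN + i - 1]
        out.append(bytes(data))
    return out


frame = [bytes(bytearray([SYNC] + [0] * 187)) for _ in range(FRAME_PACKETS)]
scrambled = randomise_frame(frame)
restored = randomise_frame(scrambled)
```

Because the operation is an XOR against a fixed sequence, applying it twice restores the original data, so the same routine serves as a sketch of the derandomiser.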

3.1.2 Reed-Solomon Encoder

Once the packet data has been randomised, it is encoded using the Reed-Solomon code RS(204,188),
which is shortened from the original RS(255,239) code. We designed this block so that it connects
directly to the randomiser, therefore taking an input of mpeg_ts_packet.

                    Parameter                              Value
                    Symbol size                            8 bits per symbol
                    Field generator polynomial             0x11d
                    First consecutive root (index form)    0
                    Primitive element                      0x01
                    Number of roots                        16
                            TABLE 3.1: Reed-Solomon parameters

We introduce a new data type dvb_packet_rs_encoded to represent the output error pro-
tected packet from the Reed-Solomon encoder. Its memory layout consists of 204 bytes of
the packet data (188 bytes of original packet data plus the 16 additional Reed-Solomon parity
bytes) as well as 52 bytes of memory padding to fill the structure up to 256 bytes. The block
dvb_rs_encoder_pp has the following input and output signatures:

 Input:    mpeg_ts_packet
 Output: dvb_packet_rs_encoded

This block was implemented using library functions that are available in the GNU Radio core
source code. Specifically, we make use of general purpose Reed-Solomon encoding and decoding
functions written by Phil Karn in 2002, using the parameters given in Table 3.1.

It is worth noting that the standard [21] specifies that the primitive element for the code generator
polynomial should be λ = 02 (hexadecimal). From testing against captured satellite signals, it was
found that using a primitive element of 1 yielded correct behaviour.
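
To show how the parameters of Table 3.1 fit together, the following is a minimal systematic encoder in plain Python. It is a sketch and not the library code used in our block. It assumes that the generator polynomial has the 16 consecutive roots α^0 to α^15 with α = 0x02, which is our reading of a primitive element of 1 expressed in index form. All names are hypothetical.

```python
PRIM_POLY = 0x11D   # field generator polynomial from Table 3.1
NROOTS = 16         # number of roots, i.e. parity bytes
FCR = 0             # first consecutive root, index form

# Exponential and logarithm tables for GF(2^8), assuming alpha = 0x02.
EXP = [0] * 510
LOG = [0] * 256
value = 1
for power in range(255):
    EXP[power] = value
    EXP[power + 255] = value
    LOG[value] = power
    value <<= 1
    if value & 0x100:
        value ^= PRIM_POLY


def gf_mul(a, b):
    """Multiply two field elements using the log tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly():
    """Product of (x + alpha^(FCR + i)), highest degree coefficient first."""
    g = [1]
    for i in range(NROOTS):
        root = EXP[FCR + i]
        nxt = g + [0]
        for j in range(len(g)):
            nxt[j + 1] ^= gf_mul(g[j], root)
        g = nxt
    return g


def rs_encode(packet):
    """Append 16 parity bytes: the remainder of packet * x^16 divided by g(x)."""
    g = generator_poly()
    rem = list(bytearray(packet)) + [0] * NROOTS
    for i in range(len(packet)):
        coef = rem[i]
        if coef:
            for j in range(1, len(g)):
                rem[i + j] ^= gf_mul(g[j], coef)
    return bytes(bytearray(packet) + bytearray(rem[-NROOTS:]))


def poly_eval(poly, x):
    """Evaluate a polynomial (highest degree first) at x using Horner's rule."""
    y = 0
    for c in bytearray(poly):
        y = gf_mul(y, x) ^ c
    return y


data = bytes(bytearray(range(188)))
codeword = rs_encode(data)
syndromes = [poly_eval(codeword, EXP[FCR + i]) for i in range(NROOTS)]
```

Shortening from RS(255,239) needs no special handling here, since the 51 leading zero bytes of the full-length code word do not change the remainder. A valid code word evaluates to zero at every root of the generator polynomial, which is what the syndromes list checks.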

3.1.3 Convolutional Interleaver

We have chosen to implement the convolutional interleaver by constructing the conceptual
structure exactly as it is defined: a cyclic input switch onto 12 shift registers,
followed by a cyclic output switch.

Rather than combining this functionality into a monolithic block, we have decided to preserve
the conceptual structure of the Forney interleaver and implement this as a hierarchical block,
consisting of a gr_deinterleave for the cyclic input switch, gr_interleave for the
cyclic output switch, and dvb_fifo_shift_register_bb for the shift registers.

At the interleaver stage, it no longer made sense to continue processing the source data as a
packet, because the interleaver disperses the bytes of each incoming packet in time. Therefore,
dvb_interleaver_bb takes inputs and produces output byte by byte:

 Input:    unsigned char
 Output: unsigned char

It should be pointed out that this block will initially output the uninitialised contents of its shift
registers, interleaved with real data, for the first 2244 bytes. It takes 2 × 17 × (0 + 1 + . . . + 11) = 2244 bytes of
input to completely fill the shift registers, and by then 1122 bytes of data as well as 1122 bytes
of interleaved uninitialised values will have been output. Because they are interleaved,
those values cannot be removed without losing data. In general, this behaviour is unimportant
because after the first 2244 bytes have passed, the uninitialised values will all have been
replaced with the actual data.
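
The start-up behaviour described above can be reproduced with a small model of the interleaver. The sketch below is plain Python rather than our hierarchical GNU Radio block, and the class name ForneyInterleaver and its parameters are hypothetical. It assumes that branch j holds a FIFO of 17 × j bytes, and that the matching deinterleaver uses the same structure with the delays reversed.

```python
from collections import deque


class ForneyInterleaver(object):
    """Convolutional (de)interleaver with 12 branches and unit depth 17."""

    def __init__(self, branches=12, depth=17, deinterleave=False, fill=0):
        delays = [depth * j for j in range(branches)]
        if deinterleave:
            delays.reverse()
        # `fill` stands in for the uninitialised register contents.
        self.fifos = [deque([fill] * d) for d in delays]
        self.branch = 0

    def push(self, byte):
        """Feed one byte through the current branch, then advance the switch."""
        fifo = self.fifos[self.branch]
        self.branch = (self.branch + 1) % len(self.fifos)
        if not fifo:                 # zero-delay branch passes straight through
            return byte
        fifo.append(byte)
        return fifo.popleft()


# Non-zero test data, so that fill values (0) can be told apart from data.
data = [(n % 255) + 1 for n in range(5000)]
interleaver = ForneyInterleaver()
deinterleaver = ForneyInterleaver(deinterleave=True)
sent = [interleaver.push(b) for b in data]
received = [deinterleaver.push(b) for b in sent]

fill_count = sent[:2244].count(0)
print(fill_count, received[2244:2250] == data[:6])
```

In this model, the interleaver and deinterleaver together delay the stream by 12 × 17 × 11 = 2244 bytes, after which the original byte order is restored.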

3.1.4 Convolutional encoder

Even though the GNU Radio core framework does not contain signal processing blocks for
convolutional coding, among its extension packages is one named trellis, which provides
the blocks we need for trellis based coding.

Using this existing package, we created a hierarchical block to encapsulate trellis.encoder_bb
from the GNU Radio trellis package. We applied the following parameters specific to
DVB-S:

       • Number of input bits k = 1
       • Number of output bits n = 2
       • The k × n generator matrix G = [0171, 0133] (octal)
       • Symbol dimensionality: 2

The dvb_convolutional_encoder_bb block produces the mother code with rate 1/2.
That is, it produces two bits of output for every input bit. Its input and output signatures are
slightly misleading, because the convolutional encoder inputs and outputs are in the form of
unpacked bits rather than a full byte. That is, each byte is either 0 or 1.

 Input:    unsigned char
 Output: unsigned char
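
The effect of the generator matrix can be seen in a direct shift register model of the encoder. This sketch is plain Python and does not use the trellis package. It assumes a constraint length of 7, with the newest input bit held in the most significant position of the register. The function names are hypothetical.

```python
G_X = 0o171  # generator for the X output (octal, from the list above)
G_Y = 0o133  # generator for the Y output (octal, from the list above)


def parity(value):
    """Return 1 if `value` has an odd number of set bits, else 0."""
    return bin(value).count("1") & 1


def conv_encode(bits):
    """Rate 1/2 encoder: emits X then Y for every unpacked input bit."""
    register = 0
    out = []
    for bit in bits:
        # Shift the new bit in at the top of the 7-bit register.
        register = ((bit & 1) << 6) | (register >> 1)
        out.append(parity(register & G_X))
        out.append(parity(register & G_Y))
    return out


# A single 1 followed by zeros reads out the two generator polynomials.
impulse = conv_encode([1, 0, 0, 0, 0, 0, 0])
x_stream = impulse[0::2]
y_stream = impulse[1::2]
print(x_stream, y_stream)
```

The X stream reproduces the bits of 171 (octal) and the Y stream those of 133 (octal), which is a convenient sanity check for any encoder configuration.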

3.1.5 Puncturing

After convolutional encoding, the bitstream is then punctured according to a puncturing matrix
depending on the desired punctured convolutional code rate. The block dvb_puncture_bb
reads in unpacked bits, and outputs unpacked bits according to an internal counter over the
puncturing matrix:

 Input:    unsigned char
 Output: unsigned char

The puncturing matrices are defined in the standard, and are referenced in Table 2.1. The resultant
code rate is selected by specifying the puncturing matrix as the parameter to this block’s
constructor. The puncturing matrix is to be represented in the form of a 1-dimensional list,
interleaving the I and Q components of the matrix.

For example, the puncturing matrix for the rate 3/4 code is:

                                          X: 1 0 1
                                          Y: 1 1 0

We represent this matrix for our puncturing block by interpreting how we apply this to a sequence
of interleaved I and Q symbols. Thus, the puncturing sequence to be used in our
puncturing block for the rate 3/4 code is (1, 1, 0, 1, 1, 0).
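
A minimal sketch of this operation in plain Python is given below. The function names are hypothetical and the code illustrates the interleaved puncturing sequence rather than reproducing our C++ block.

```python
from fractions import Fraction


def puncture(bits, pattern):
    """Keep only the bits whose position in the cyclic pattern is marked 1."""
    return [b for i, b in enumerate(bits) if pattern[i % len(pattern)]]


def punctured_rate(pattern):
    """Code rate after puncturing a rate 1/2 mother code."""
    # len(pattern) coded bits are produced by len(pattern) / 2 input bits.
    return Fraction(len(pattern) // 2, sum(pattern))


pattern = (1, 1, 0, 1, 1, 0)
kept = puncture(list("abcdefghijkl"), pattern)
print("".join(kept), punctured_rate(pattern))
```

Of every six coded bits (X1, Y1, X2, Y2, X3, Y3), the sequence removes X2 and Y3, so three input bits are sent as four coded bits.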
3.1.6 Baseband shaping and modulation

The QPSK modulation block consists of two parts: bit to symbol mapping and a finite impulse
response (FIR) filter for baseband shaping. Because both of these blocks already exist in
GNU Radio, we simply created a hierarchical block that is composed of the two sub-blocks
using parameters appropriate for DVB-S. The input and output signatures for our modulator
block dvb_s_modulator_bc are:

 Input:   unsigned char
 Output: gr_complex

The bit to symbol mapper adapts the standard QAM Gray coded mapping from the GNU Radio
library by rescaling the amplitude to 1 from the original √2 to obtain the DVB-S constellation.
The input bits are then mapped onto this constellation.

After mapping the input bits into complex symbols, they are then passed through an interpo-
lating FIR filter to obtain the final pulse shape. The filter taps were designed to have a roll-off
factor of 0.35 as per the specifications using the filter design library that is part of the GNU
Radio framework.

This block takes in two parameters: the desired sample rate and the symbol rate. These two
parameters are used by the interpolating FIR filter to determine how many samples it should
output for every input symbol.
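
The two steps can be sketched as follows in plain Python. The assignment of bit values to quadrants (a 0 bit mapping to a positive component) is an assumption on our part and should be checked against the constellation diagram in the standard. The function names are hypothetical and the pulse shaping filter itself is not shown.

```python
import math


def qpsk_map(i_bit, q_bit):
    """Map one (I, Q) bit pair to a unit-amplitude QPSK symbol."""
    a = 1.0 / math.sqrt(2.0)         # rescales (+-1, +-1) to amplitude 1
    # Assumed convention: bit 0 -> positive component, bit 1 -> negative.
    return complex(a if i_bit == 0 else -a, a if q_bit == 0 else -a)


def samples_per_symbol(sample_rate, symbol_rate):
    """Interpolation ratio used by the pulse shaping filter."""
    return float(sample_rate) / float(symbol_rate)


symbols = [qpsk_map(i, q) for i in (0, 1) for q in (0, 1)]
ratio = samples_per_symbol(8e6, 4e6)
print(symbols, ratio)
```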

3.2 Receive path

Just like its hardware radio counterpart, the signal receive path is considerably more complex
than the transmit path due to the added need for synchronisation of received symbols. Figure
3.1 illustrates the receive chain starting from the USRP as a signal source through to an output
of MPEG-2 TS packet stream.

FIGURE 3.1: DVB-S software radio receive chain

FIGURE 3.2: DVB-S Demodulator

3.2.1 QPSK Demodulator

Being a standard modulation scheme, it is not surprising that the GNU Radio library contains
a block that is able to take the input samples of the spectrum and perform QPSK demodulation
to obtain the complex symbols.

In our implementation, we based our demodulator on an existing modular implementation for
differential phase shift keying modulation. We created a hierarchical block structured as per
Figure 3.2. The first block is an automatic gain control (AGC) for optimum signal amplitude.
Its parameters are listed in Table 3.2.

                        Parameter                             Value
                        Attack rate for fast changing signals 0.06
                        Decay rate for slow changing signals  0.001
                        Reference value                       1
                        Initial gain value                    1
                        Maximum allowable gain                100
                       TABLE 3.2: Automatic gain control parameters

After adjusting the signal amplitude using the AGC, we then correct any frequency drift in
the signal that may occur due to frequency drift at the local oscillator of the satellite receiver
LNB, using a frequency locked loop based on band-edge filters (gr_fll_band_edge_cc).
After frequency recovery, we then proceed to recover the individual QPSK symbols from the
samples, using the generic gr_mpsk_receiver_cc. Finally, because this block outputs its
symbol constellation in a ‘+’ orientation, we need to rotate it by 45 degrees by multiplying by a
constant of 0.707 + j0.707.
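
The rotation is a single complex multiplication, as the following plain Python sketch shows. The ideal '+' constellation points used here are an illustrative assumption about the receiver output.

```python
import cmath
import math

# Unit-magnitude rotation by 45 degrees, approximately 0.707 + j0.707.
ROTATE_45 = cmath.exp(1j * math.pi / 4)

# Ideal symbols in a '+' orientation, lying on the real and imaginary axes.
plus_points = [1 + 0j, 0 + 1j, -1 + 0j, 0 - 1j]

# After rotation the points lie on the diagonals, as expected for DVB-S.
cross_points = [p * ROTATE_45 for p in plus_points]
print(cross_points)
```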

We have also implemented the symbol recovery from samples using polyphase filterbanks
(gr_pfb_clock_sync_ccf) as an alternative approach. This performs signal timing syn-
chronisation with the input samples and then outputs one sample per symbol clock. Having
recovered the QPSK symbols from the samples, we then pass them through a Costas phase
locked loop for final phase correction.

For both of these demodulator implementations, our overall blocks dvb_s_demodulator_cc
and dvb_s_demodulator2_cc have the following input and output signatures:

 Input:   gr_complex
 Output: gr_complex

3.2.2 Depuncturing, Viterbi Decoder and Sync Decoder

After demodulation, we must then depuncture the data, decode the convolutional coded data
stream using the Viterbi algorithm and finally synchronise to the MPEG-2 TS stream with the
sync decoder. Therefore the block dvb_depuncture_viterbi_cb must take the received
complex symbols as input, and produce properly decoded and synchronised bytes:

 Input:     gr_complex
 Output: unsigned char

FIGURE 3.3: Depuncture-Viterbi hierarchical block structure

This is the most complex block in our DVB-S system, due to the need for a feedback mechanism
in order to resolve several received symbol ambiguities. We name this block the depuncture-
Viterbi hierarchical block for short and its structure is outlined in Figure 3.3.

The astute reader may recall that GNU Radio forbids feedback cycles in the signal flow graph
and may be wondering how we have managed to implement this feedback loop across multiple
sub-blocks. The answer lies in our loosely coupled, modular design combined with the use of
a lightweight callback mechanism to avoid the need to create another signal flow for feedback.

In our design of this signal processing block, each of the individual sub-blocks can be used on
its own in GNU Radio Python applications. However, it is not possible to connect the feedback
mechanism in Python. This is why we have implemented the hierarchical block in C++,
rather than in Python as is typically done. This way, we can define a custom delegate function
that simply iterates over the different combinations of corrections to be applied to the incoming
signal. While we only try combinations of phase rotation, conjugation and depuncturing
boundary, it can easily be extended to allow iteration over the different DVB-S code rates as
well, similar to the information recovery scheme described in [59]. To better understand exactly
why we need this feedback, we begin by first explaining the incoming signal ambiguity.

FIGURE 3.4: Illustration of complex symbol ambiguity

Complex adjustment

For QPSK modulation, the received symbol constellation suffers from two kinds of ambiguity
problems. One of these is phase ambiguity, meaning that the symbols received may have
been subjected to a phase rotation of either 0, π/2, π, or 3π/2, caused by a lack of synchronisation
between the transmitter constellation and the receiver constellation. The square constellation
pattern (Figure 2.5) would appear to have the same orientation even if it has been rotated by those
angles.

The other ambiguity is complex conjugate ambiguity, which causes the received symbols to
appear as the complex conjugate of the true symbols. It arises in the process of down converting the
signal from RF to baseband. The extraction of the in-phase and quadrature components is
done by mixing the RF signal with two local oscillators 90 degrees apart in phase, for example using cos ωc t
and sin ωc t. However, it is undefined whether the in-phase component leads or lags the quadrature
component; for example, cos ωc t and − sin ωc t could have been generated by the oscillators
instead. This can result in the complex conjugate of the true symbols. Figure 3.4 illustrates this
concept: the true constellation points as sent by the QPSK modulator are numbered 1 to 4,
although all eight possible constellation orientations appear identical without this information
at the receiving end.

To correct these ambiguities, we must take the complex conjugate of the symbol and/or apply
a constant phase rotation to the output of the demodulator if necessary. We implemented the
block dvb_complex_adjust_cc that applies a phase rotation and optionally a complex
conjugation to its input complex symbols:

 Input:    gr_complex
 Output: gr_complex
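
The correction described above can be sketched in a few lines of plain Python. The function complex_adjust and the list HYPOTHESES are hypothetical names, and applying the conjugation before the rotation is an assumption made for this illustration.

```python
import math

# Exact multipliers for rotations of 0, 90, 180 and 270 degrees.
ROTATIONS = (1, 1j, -1, -1j)

# All eight combinations of (quarter turns, conjugate) that may be tried.
HYPOTHESES = [(k, c) for c in (False, True) for k in range(4)]


def complex_adjust(symbol, quarter_turns=0, conjugate=False):
    """Optionally conjugate a symbol, then rotate it by quarter turns."""
    if conjugate:
        symbol = symbol.conjugate()
    return symbol * ROTATIONS[quarter_turns % 4]


# Four distinct transmitted symbols, then a channel that rotates by 90
# degrees and conjugates them.
a = 1.0 / math.sqrt(2.0)
sent = [complex(a, a), complex(-a, a), complex(-a, -a), complex(a, -a)]
received = [(s * 1j).conjugate() for s in sent]

matches = [
    (k, c) for (k, c) in HYPOTHESES
    if all(abs(complex_adjust(r, k, c) - s) < 1e-9 for r, s in zip(received, sent))
]
print(matches)
```

Only one of the eight hypotheses restores the whole sequence, yet every hypothesis produces points that lie on a valid QPSK constellation, which is why the choice cannot be made at this stage.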

In addition to the basic function of processing input and output streams, this block also has a
method to choose which corrections to apply, allowing us to control the adjustments. Unfortunately,
neither of these two ambiguities can be identified at the demodulator stage, so we must
rely on information from further downstream to determine the correct set of adjustments.

Symbol dropper

As well as the complex symbol ambiguity, there is another ambiguity in the signal due to
the nature of puncturing. The process of puncturing applies a puncturing matrix to a group
of complex symbols at a time. Therefore when depuncturing, it is vital to properly align the
depuncturing boundary to ensure that symbols are inserted at the correct positions.

To illustrate this problem, let us look at the puncturing matrix for the rate 3/4 code. The dimensions
of the matrix are 3 × 2, indicating that puncturing takes a complete cycle every 3 symbols. In
this matrix, we see two 0’s, which means that one I and one Q component are to be removed,
resulting in an output of 2 symbols for every 3 that are input. When depuncturing this stream
of symbols, we must correctly align the boundary of each depuncturing cycle to each group of
2. For the rate 3/4 code, we can see that there are two possible boundary positions of which we must
correctly choose one.
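
The two boundary positions can be demonstrated with labelled values in place of soft symbols. The sketch below is plain Python with hypothetical function names. It assumes the interleaved rate 3/4 sequence (1, 1, 0, 1, 1, 0), and that a depuncturer re-inserts a neutral erasure value wherever the sequence holds a 0.

```python
PATTERN = (1, 1, 0, 1, 1, 0)  # interleaved X/Y sequence for the rate 3/4 code
ERASURE = "--"                # placeholder for a re-inserted symbol


def depuncture(values, pattern=PATTERN, erasure=ERASURE):
    """Re-insert erasures at the punctured positions of the cyclic pattern."""
    if not any(pattern):
        raise ValueError("pattern must keep at least one position")
    source = iter(values)
    out = []
    position = 0
    while True:
        if pattern[position % len(pattern)]:
            try:
                out.append(next(source))
            except StopIteration:
                return out
        else:
            out.append(erasure)
        position += 1


def slots_consistent(stream):
    """Check that even positions hold X values and odd positions Y values."""
    for index, label in enumerate(stream):
        expected = "X" if index % 2 == 0 else "Y"
        if label != ERASURE and not label.startswith(expected):
            return False
    return True


# Coded stream X1 Y1 X2 Y2 ... and its punctured version.
coded = [axis + str(n) for n in range(1, 7) for axis in "XY"]
punctured = [c for i, c in enumerate(coded) if PATTERN[i % len(PATTERN)]]

aligned = depuncture(punctured)
shifted_one = depuncture(punctured[2:])  # one complex symbol dropped
shifted_two = depuncture(punctured[4:])  # two complex symbols dropped
print(aligned)
print(shifted_one)
```

Dropping one complex symbol places Y values in X slots and vice versa, while dropping a second symbol restores the alignment, which matches the two boundary positions described above.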

To correct the depuncturing boundary, we need a way to shift the depuncturing boundary by
one or more symbols. This is facilitated by our block dvb_drop_cc, which simply passes
the input to the output during normal operation. However, this block has a method that can be
called that raises a flag for the block to skip the output for its next input symbol. This allows
us to shift the boundary for the depuncturing block further downstream.

Similar to the complex number ambiguity, the depuncturing symbol boundary ambiguity cannot
be resolved at this stage, so we must rely on information further downstream.

Depuncturing

Our depuncturing block is implemented in a way similar to the puncturing block. When instantiating
this block, a puncturing sequence must be supplied. The block then iterates over this
sequence for every input to determine whether or not insertion is necessary at that point. The
dvb_depuncture_ff block operates on a stream of interleaved I and Q symbols rather than
a stream of complex symbols:

 Input:    float
 Output: float

Viterbi Decoder

We mentioned previously that we used a signal processing block in the GNU Radio package
called trellis for convolutional encoding. It is no surprise that this package also includes
blocks for Viterbi decoding, named trellis.metrics_f and trellis.viterbi_b.
The metrics block computes the trellis branch weights based on its input, while the Viterbi
block performs the add-compare-select and traceback operations. For metrics computation, we
used Euclidean distances for soft decoding, rather than hard decision decoding.

Sync decoder

The sync decoder is responsible for monitoring the incoming bits and determining the correct
byte boundary. Our implementation is largely inspired by the sync-search scheme described
in [60], which uses a finite state machine to match SYNC bytes. We have extended this algorithm
with an additional synchronisation function for the inverted SYNC byte, so that the first
output of this block is the start of a frame of 8 MPEG-2 TS packets, correctly aligned for
the derandomiser.

                    FIGURE 3.5: Sync search algorithm - state transitions

Our sync search algorithm is detailed in Figure 3.5. Initially, the sync decoder is in an unlocked
state with a confidence level of 0 for synchronisation (lower left corner). In this state, we
are completely desynchronised, so one bit is read in at a time, each time checking for a SYNC
byte. Once a SYNC byte appears, we increase our confidence level by 1 and then skip 204 bytes
in search of another SYNC byte. If this is a true SYNC byte, then we expect to find the next
SYNC after 204 bytes. If we encounter an inverted SYNC byte instead, the confidence level
decreases by only 1, allowing for fast recovery upon the next non-inverted SYNC byte.

This initial sync search process continues until we reach our threshold confidence level, at
which point we transition our operation mode to Seeking. In this operation mode, we continue
to skip 204 bytes at a time to seek an inverted SYNC byte that marks the beginning of a frame
of 8 MPEG-2 TS packets. If the byte is neither a SYNC nor an inverted SYNC, then we decrease
the confidence level until we return to our initial unlocked state. When we reach an inverted
SYNC byte, our operation mode becomes Synchronised, during which we pack all of the input
bits into bytes and begin our output from the first inverted SYNC byte. Once in this mode, the
sync decoder continues to check for the presence of the periodic SYNC byte or its inverted
version. If the number of missed SYNC bytes reaches a threshold, then we return to an unlocked
state.
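
The search and seek phases described above can be sketched in pure Python as follows. This is a simplified model rather than our C++ block: the function names, the default threshold of 3 and the choice to reset the confidence to 0 on an unexpected byte while unlocked are assumptions, and the tracking performed in Synchronised mode is omitted. The values 0x47 and 0xB8 are the MPEG-2 TS SYNC byte and its bitwise inverse.

```python
SYNC, INV_SYNC = 0x47, 0xB8
PACKET_BITS = 204 * 8            # one RS-encoded packet, in bits


def to_bits(data):
    """Unpack bytes into a list of bits, most significant bit first."""
    return [(byte >> (7 - i)) & 1 for byte in data for i in range(8)]


def byte_at(bits, pos):
    """Pack the 8 bits starting at pos (MSB first) into an integer."""
    value = 0
    for b in bits[pos:pos + 8]:
        value = (value << 1) | b
    return value


def find_frame_start(bits, threshold=3):
    """Return the bit offset of the first inverted SYNC seen after locking."""
    pos, confidence, state = 0, 0, "unlocked"
    while pos + 8 <= len(bits):
        byte = byte_at(bits, pos)
        if state == "unlocked":
            if confidence == 0:
                # Completely desynchronised: slide one bit at a time.
                if byte == SYNC:
                    confidence = 1
                    pos += PACKET_BITS
                else:
                    pos += 1
                continue
            if byte == SYNC:
                confidence += 1
            elif byte == INV_SYNC:
                confidence -= 1      # only a small penalty, for fast recovery
            else:
                confidence = 0       # assumption: a wrong byte restarts the search
            if confidence >= threshold:
                state = "seeking"
            pos += PACKET_BITS if confidence > 0 else 1
        else:  # seeking: byte aligned, waiting for the start of a frame
            if byte == INV_SYNC:
                return pos
            if byte != SYNC:
                confidence -= 1
                if confidence == 0:
                    state = "unlocked"
                    pos += 1
                    continue
            pos += PACKET_BITS
    return None
```

A stream that starts part-way through a frame and is offset by a few bits is locked after `threshold` consecutive SYNC bytes, and the returned offset points at the inverted SYNC byte that opens the next frame of 8 packets.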

Our sync decoder has the following input and output signatures:

 Input:    unsigned char
 Output: unsigned char

An additional feature we have added to this block as part of the feedback mechanism is that
it has the ability to report its synchronisation status through the use of a delegate callback
function. After a fixed number of input bits, the sync decoder invokes its delegate callback
with whether or not the block is in synchronisation. If no callback function was assigned, then
no action is taken.

3.2.3 Deinterleaver

The Forney deinterleaver is implemented in the same way as the interleaver, using a hierarchical
block containing shift registers. Its structure is the same as its conceptual form as discussed in
Section 2.1. The dvb_deinterleaver_bb block has the same input and output signatures
as the interleaver:

 Input:    unsigned char
 Output: unsigned char

The only major difference of note is that this block does not output any of its uninitialised
shift register contents. Instead, it discards the first 2244 bytes from its output, because the
first 1122 bytes of incoming data are not usable: they arrive interleaved with 1122
uninitialised values.
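
The figures of 1122 and 2244 bytes can be checked with a small pure-Python model of a Forney interleaver and deinterleaver pair. The model assumes the DVB-S parameters of 12 branches with 17-byte cells; the class and variable names are hypothetical, and `None` stands in for an uninitialised register value.

```python
from collections import deque

BRANCHES, CELL = 12, 17   # assumed DVB-S Forney parameters (I = 12, M = 17)


class Forney:
    """Convolutional (de)interleaver built from one FIFO per branch."""

    def __init__(self, delays, fill=None):
        # Branch j holds delays[j] "uninitialised" values to begin with.
        self.branches = [deque([fill] * d) for d in delays]
        self.index = 0

    def push(self, byte):
        """Feed one byte into the current branch and return that branch's output."""
        fifo = self.branches[self.index]
        self.index = (self.index + 1) % len(self.branches)
        fifo.append(byte)
        return fifo.popleft()


interleaver = Forney([j * CELL for j in range(BRANCHES)])
deinterleaver = Forney([(BRANCHES - 1 - j) * CELL for j in range(BRANCHES)])

# Number of uninitialised values held by the interleaver at start-up.
uninitialised = sum(len(fifo) for fifo in interleaver.branches)

data = list(range(5000))
out = [deinterleaver.push(interleaver.push(x)) for x in data]
```

In this model the interleaver starts with 1122 uninitialised values, the first 2244 outputs of the deinterleaver are all uninitialised, and from output 2244 onwards the original sequence reappears in order.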

3.2.4 Reed-Solomon decoder

The Reed-Solomon decoder block dvb_rs_decoder_pp is implemented in a similar fash-
ion to the Reed-Solomon encoder, but with its input and output signatures reversed:
 Input:    dvb_packet_rs_encoded
 Output: mpeg_ts_packet

3.2.5 Derandomiser

The dvb_derandomizer_pp block performs the inverse operation of the randomiser, and
its input and output signatures are identical to that of the randomiser:

 Input:    mpeg_ts_packet
 Output: mpeg_ts_packet

Despite being an inverse operation, the implementation of this block is identical to the ran-
domiser, due to the property of the XOR operation that X ⊕ X = 0. In other words, a second
application of the XOR operation on the same sequence nullifies its effect.

Correct derandomisation relies on correct synchronisation. This is achieved by detecting the
start of the frame of 8 MPEG-2 packets delimited by the SYNC byte that was inverted by
the randomisation process. In our implementation, the derandomiser does not search for this
delimiter; we instead rely on the sync decoder to suppress its output until the first inverted
SYNC byte. As a precaution against loss of synchronisation, this block resets itself whenever it
receives an input packet with an inverted SYNC byte.

3.3 Testing and quality assurance

We adopted a test driven development approach to developing each new signal processing block
which involves first writing a unit test that ensures correct functionality of the block before the
block itself is written. It is by design that at the time when the test is developed, it should fail
because the actual signal processing block has not been implemented. However, this practice
ensures that the behaviour of the block is well defined prior to implementation.

We used two frameworks for unit testing. CppUnit, a port of the popular Java-based unit testing
framework JUnit, was used to test blocks at the C++ level. For unit testing at the Python
application level, GNU Radio’s built-in unit test framework was used instead because it more
closely resembles the building and running of a signal flow graph.

It becomes increasingly more difficult to perform unit testing for signal processing blocks from
the convolutional encoder onwards, due to a lack of synchronisation and a structure that is
asymmetric between the transmit and receive paths. A complete list of the unit tests we have
implemented for this system can be found in Appendix C.

3.4 Performance tuning

Several optimisations were applied to our system to improve its throughput. The first of
these optimisations was for the randomiser and derandomiser blocks. The standards specify that
an LFSR should be used in the block. Even though computing the PRBS at run-time using the
LFSR costs an insignificant amount of overhead relative to the more computationally intensive
blocks like the Viterbi decoder, we initially adopted the approach described in [54] by
pre-computing the values and hard-coding them in the source code. However, the inelegance of
this approach prompted us to refine it by generating the required PRBS at run-time, but only
once, during the instantiation of the randomisation blocks. This costs approximately 1.5 KB of
memory but, in exchange, only a read operation and an XOR operation are required to process
each input byte. We consider this a good balance between speed and code elegance because
there is no need for a hard-coded byte sequence.
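
A minimal Python sketch of this table-based approach is given below. It is not our C++ implementation: the function names are hypothetical, and the generator polynomial 1 + x^14 + x^15, the initial register contents and the sequence length of 1503 bytes are taken from the DVB-S energy dispersal scheme as we understand it. The handling of SYNC bytes, which are not randomised, is left out.

```python
def dvb_prbs(nbytes=1503):
    """Generate the energy-dispersal PRBS once, packed MSB first into bytes."""
    reg = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]  # stages 1 to 15
    table = bytearray()
    for _ in range(nbytes):
        byte = 0
        for _bit in range(8):
            bit = reg[13] ^ reg[14]    # taps at stages 14 and 15
            reg = [bit] + reg[:-1]     # shift, feeding the new bit into stage 1
            byte = (byte << 1) | bit   # the first bit out becomes the MSB
        table.append(byte)
    return bytes(table)


PRBS = dvb_prbs()   # computed once, about 1.5 KB


def xor_with_prbs(payload, table=PRBS):
    """Randomise or derandomise: one table read and one XOR per byte."""
    return bytes(b ^ k for b, k in zip(payload, table))
```

Because the same table is applied in both directions, calling `xor_with_prbs` twice on a buffer returns the original bytes, which is why the randomiser and derandomiser can share one implementation.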

As part of the receive path, it was necessary to convert complex symbols into interleaved
in-phase and quadrature components as floating point numbers. We initially implemented this
as a hierarchical block that pieces together existing blocks, achieving this function without
the need to create a new one. However, we found that, due to its construction, this hierarchical
block had rather poor throughput given that it is only responsible for separating the real and
imaginary parts of its complex input. To reduce the overhead from memory copy operations, we
re-implemented this as a single signal processing block in C++, achieving a speedup of 5.7 times.

                                          CHAPTER 4

                                   Experimental Results

In the experimental evaluation we focus on the correctness of our implementation and the data
throughput it offers. In addition to the empirical results, we also investigated the scalability of
our approach in light of the increasing number of processing units found in typical computers.

4.1 Experimental Setup

For the experiments, we used a computer with performance characteristics similar to those of a
typical computer that can be purchased today. It is configured with a modest 2.4 GHz quad-core
processor and 2 GB of memory. More details of the computer configuration can be found in
Appendix E.

We divide our experiments into two sections: the first validates the correctness of our system,
while the second benchmarks the throughput of the system as a whole as well as that of its
individual components. For validation, we run a known good RF signal, sampled using the USRP
into a data stream, through our system. If we can correctly view the MPEG-2 TS stream at the
output of our system, then we have validated the correctness of our implementation. The
benchmarking section involves measuring the throughput of our system as a whole as well as
that of each constituent signal processing block.

                      FIGURE 4.1: Computer to satellite dish connection

4.2 Validation using a real satellite data stream

To validate the correctness of our implementation, we needed to show that our receiver chain
of blocks can correctly decode a signal stream from a satellite dish. We used existing captured
data from a USRP connected to the LNB, shown in Figure 4.1.

The connections begin like a conventional set-up for satellite TV, with a connection from the
DVB-S set-top box to the satellite receiver LNB. The RF output of the set-top box was then
daisy-chained to the SMA input of the DBSRX daughterboard in the USRP, which is in turn
connected to the computer via USB. The set-top box was placed between the LNB and the USRP
for two reasons. First, the set-top box could be used for fine positional adjustments to the
LNB and the satellite dish, and at the same time it verifies that the satellite receiver IF
connection has a good signal that can be decoded into a video stream. Second, the LNB needs to
be powered by a DC voltage along the coaxial cable. Even though it is possible to apply a DC
voltage bias via jumper J100 on the DBSRX daughterboard (see Figure 2.10), it is far easier to
simply let the set-top box supply the voltage required.

          Parameter                     Value
          Satellite                     GE-23
          Orbital location              172° East
          Transponder frequency         3915.75 MHz (C Band)
          Transponder polarisation      Horizontal
          Symbol rate                   3.33 Mbaud
          FEC code rate                 3/4
          LNB model                     Microelectronics Technology Inc. AC15-2C
          LNB LO frequency              H: 5.15 GHz; V: 5.75 GHz
          LNB LO frequency stability    ± 1 MHz max. at room temperature

               TABLE 4.1: Satellite equipment and transponder characteristics

The satellite that was tuned to was General Electrics GE-23, launched 29 Dec 2005 and orbiting
at 172.0° E. The satellite channel that was tuned into was a low symbol rate and high forward
error correction channel called “Hope Channel International”. From a transponder frequency of
3915 MHz and the LNB LO frequency of 5150 MHz, the resultant carrier frequency at the RF
input of the USRP is calculated to be 5150 − 3915 = 1235 MHz, which is within the range of
signals receivable by the DBSRX daughterboard. This channel was chosen because the USRP is
fast enough to capture its slow symbol rate. Further details can be found in Table 4.1.

For this section, we used a pre-captured data stream saved to a file instead of decoding a live
video stream. Not only is a data file much easier to work with as a signal source, it also
provides us with a consistent and repeatable data stream for testing.

The data stream that was captured was limited to 250 million complex samples, which equates
to $250 \times 10^{6} / (8 \times 10^{6}) = 31.25$ seconds of capture time given that our sample
rate from the USRP is 8 million complex samples per second, output in the form of interleaved
short integers.
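
The arithmetic behind these figures, together with the intermediate frequency quoted earlier, can be reproduced in a few lines of Python. The variable names are illustrative, and the file size estimate assumes that each short integer occupies 2 bytes, so that one complex sample takes 4 bytes on disk.

```python
# Intermediate frequency seen by the USRP: LNB local oscillator minus transponder.
lnb_lo_mhz = 5150
transponder_mhz = 3915
if_mhz = lnb_lo_mhz - transponder_mhz            # 1235 MHz

# Duration of the capture at the USRP sample rate.
captured_samples = 250_000_000                   # complex samples
sample_rate = 8_000_000                          # complex samples per second
capture_seconds = captured_samples / sample_rate

# Approximate size of the capture file (assumption: 16-bit I and 16-bit Q).
bytes_per_sample = 2 * 2
capture_bytes = captured_samples * bytes_per_sample
```

Under that assumption the 31.25 second capture occupies roughly 1 GB on disk.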

We construct our flow graph in Python as a GNU Radio application, which first reads from the
pre-recorded signal file and converts it back into its complex form. It then feeds the samples
through our signal processing blocks, and the result is finally output to a file in correct
MPEG-2 TS format. The MPEG-2 TS file can be played back using a supported media player, such
as MPlayer. The Python application code, shown in Listing 4.1, is quite short and simple
because its complexity is encapsulated within the signal processing blocks.

#!/usr/bin/env python
from gnuradio import gr
import dvb

sample_rate = 8e6                  # 8 MS/s
symbol_rate = 3.33e6               # 3.33 Mbaud
puncturing = [1,1,0,1,1,0]         # For rate 3/4

tb = gr.top_block("DVBS_Receiver")

# Data file adaptation: interleaved short integers back to complex samples
src = gr.file_source(gr.sizeof_short, "hope.dat", False)
s2c = gr.interleaved_short_to_complex()

# Main flow
demod = dvb.s_demodulator_cc(sample_rate, symbol_rate)
decode = dvb.depuncture_viterbi_cb(puncturing)
deinterleaver = dvb.deinterleaver_bb()
pad = dvb.pad_dvb_packet_rs_encoded_bp()
rs_decoder = dvb.rs_decoder_pp()
derandomizer = dvb.derandomizer_pp()
depad = dvb.depad_mpeg_ts_packet_pb()
dst = gr.file_sink(gr.sizeof_char, "decoded_hope.ts")

tb.connect(src, s2c, demod, decode, deinterleaver, pad,
           rs_decoder, derandomizer, depad, dst)

# Run the flow graph until the file source is exhausted
tb.run()

                  LISTING 4.1: DVB-S receive path from samples to transport stream

After running this flow graph, a new file called decoded_hope.ts was created. We can
then open this transport stream in MPlayer by typing mplayer decoded_hope.ts. We
have successfully played back the video and audio data in the MPEG-2 TS stream without
any problems, and a screen capture of MPlayer is shown in Figure 4.2. This validates the
correctness of our DVB-S implementation.

            FIGURE 4.2: Screen capture of MPEG-2 TS playback using MPlayer

4.3 Performance measurements

Benchmarks were conducted on the receiver flow graphs to determine the throughput of the
system as a whole, and also its constituent blocks. The majority of our benchmarks were
synthetic benchmarks, which involved subjecting the block under test to either null input (input
stream consists only of zeros) or pseudo-random input generated using a Galois LFSR. Only a
few of our benchmarks used the raw captured stream (or partially processed stream).

The reason why we chose to use either the null signal source or the GLFSR source was to
eliminate the throughput bottleneck of reading source data from disk. Many of our smaller
blocks that perform minimal processing with low memory operation overhead, such as
the padding and de-padding blocks, have such high throughput that the throughput of the entire
flow graph was limited by the signal source rather than the block under test. Using a null signal
source or the pseudo-random source allows us to overcome this disk bottleneck.

For some of the simpler blocks that perform the same set of computations regardless of their
input content, the null signal source is an excellent signal source due to its extremely low
overhead. However, blocks that carry out different operations depending on input content,
such as the Reed-Solomon decoder, can show a rather unrealistic throughput when only zeros are
input. Hence we included the pseudo-random signal source, which is more appropriate in those
circumstances. We elected to use a pseudo-random source over a true random source for two
reasons. First, the pseudo-random source is much faster because it uses a GLFSR to generate
its output, compared to, say, reading operating system generated entropy from /dev/urandom.
Second, a pseudo-random source is repeatable, so each of our time measurement runs processes
the same input data for consistency.

    4.3.1 Experiments harness

To run both our synthetic benchmarks and those that use real data, we created a test harness
that reads in a list of experiments described in a comma separated values (CSV) file using
semicolons as delimiters, and conducts the timing experiments accordingly. This list of
experiments specifies the function for which we measure the running duration as well as the
parameters for that function. The functions that we time are defined in a separate Python
file, where each builds a particular flow graph that connects the block under test to a signal
source and a signal sink. An example of such a function is shown in Listing 4.2.

from gnuradio import gr
import dvb

def rs_decoder_rand(n, degree=16):
    """
    Runs a flow graph that performs RS decoding on n pseudo-random bytes
    """
    tb = gr.top_block('rs_decoder_rand')
    src = gr.glfsr_source_b(degree)              # pseudo-random byte source
    head = gr.head(gr.sizeof_char, n)            # stop after n bytes
    pad = dvb.pad_dvb_packet_rs_encoded_bp()
    decoder = dvb.rs_decoder_pp()                # block under test
    dst = gr.null_sink(dvb.sizeof_mpeg_ts_packet)
    tb.connect(src, head, pad, decoder, dst)
    tb.run()                                     # block until all n bytes are processed

           LISTING 4.2: Example of a function used for timing a particular block under test

The harness measures the running time of that function using a built-in Python module called
timeit, which reports the running time of the function call as a floating point number of
seconds. One added benefit of the timeit module is that it disables the Python garbage
collector for the duration of the timing for consistency. The timing process is run 10
times, each time recording the result to an output CSV file for later processing. When
determining a running time, the minimum of the 10 measurements was used because the shortest
duration gives a lower bound on how quickly the computer is able to run the flow graph.
Longer durations are typically caused not by the block under test, but rather by other concurrent
processes interfering with the timing.
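
The measurement strategy can be sketched with the standard library alone. This is an illustration rather than the harness itself: the helper name `best_time` is hypothetical, and the CSV handling is omitted.

```python
import timeit


def best_time(func, *args, repeats=10):
    """Time func(*args) `repeats` times and return the shortest duration in seconds."""
    # number=1 runs the call once per measurement; timeit disables the
    # garbage collector while each measurement is taken.
    durations = timeit.repeat(lambda: func(*args), repeat=repeats, number=1)
    return min(durations)
```

For example, `best_time(rs_decoder_rand, 10000000)` would time ten runs of the function in Listing 4.2 and keep the fastest one.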

     4.3.2 Throughput analysis

     The obvious question when looking at the blocks’ throughput is, how much is considered ad-
     equate for real-time decoding. To answer this question, we must begin by tracing the signal
     flow from the USRP source in the receiver flow graph. The complete receiver flow graph can
     be found in Appendix F.

The USRP is set to its maximum sample rate of 8e6 samples per second. This means that the
blocks directly after the USRP must have a throughput of at least 8e6 complex samples per
second. The throughput requirement changes after the M-PSK receiver, since its output rate is
equal to the satellite transponder’s symbol rate, which is less than our sampling rate. Using the
example of Hope Channel International, we may be sampling at 8 M samples per second, but the
symbol rate remains at 3.33 Mbaud.

       Block                         Minimum throughput                          Example rate
       Automatic gain control        sample rate (complex)                       8e6
       Frequency locked loop         sample rate (complex)                       8e6
       M-PSK receiver                sample rate (complex)                       8e6
       Rotate 45 degrees             symbol rate (complex)                       3.33e6
       Complex adjustment            symbol rate (complex)                       3.33e6
       Drop                          symbol rate (complex)                       3.33e6
       Complex to interleaved float  symbol rate (complex)                       3.33e6
       Depuncture                    2 × symbol rate (float)                     6.66e6
       Trellis metrics               2 × symbol rate (float)                     6.66e6
       Viterbi                       2 × symbol rate (float)                     6.66e6
       Sync decoder                  2 × symbol rate (bit)                       6.66e6
       Convolutional deinterleaver   1/8 × 2 × symbol rate (byte)                8.33e5
       Pad RS encoded packet         1/8 × 2 × symbol rate (byte)                8.33e5
       Reed-Solomon decoder          1/8 × 2 × symbol rate (byte)                8.33e5
       Derandomiser                  188/204 × 1/8 × 2 × symbol rate (byte)      7.67e5
       Depad MPEG-2 TS packet        188/204 × 1/8 × 2 × symbol rate (byte)      7.67e5

                    TABLE 4.2: Minimum throughput for receiver blocks


Following this logic, we obtain the minimum throughput for each block, given in Table 4.2.
Starting at the depuncture block, the rate of processing appears to have doubled from the
previous block, but it is in fact the same amount of data because a complex value is composed
of two floating point numbers. For simplicity, our calculations ignore data expansion due to
depuncturing, i.e. we assume that the punctured convolutional code rate is 1/2. For other code
rates, this can increase the data rate requirement for all of the blocks after the depuncturing
block by a factor of up to 7/4 (for rate 7/8).
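
As a worked example, the rates for Hope Channel International can be derived in Python by following the formulas of Table 4.2. The function and key names are illustrative only.

```python
def minimum_rates(sample_rate=8e6, symbol_rate=3.33e6):
    """Minimum throughput per stage, following the formulas of Table 4.2."""
    floats = 2 * symbol_rate            # I and Q as separate floats
    rs_bytes = floats / 8               # 8 bits packed into each byte
    ts_bytes = rs_bytes * 188 / 204     # RS parity bytes removed
    return {
        "sample rate (complex/s)": sample_rate,
        "symbol rate (complex/s)": symbol_rate,
        "interleaved floats (float/s)": floats,
        "RS-encoded bytes (byte/s)": rs_bytes,
        "transport stream bytes (byte/s)": ts_bytes,
    }


rates = minimum_rates()

# Worst-case expansion at the depuncturing block relative to the rate 1/2
# assumption: a rate 7/8 stream grows by (7/8) / (1/2) = 7/4.
worst_case_expansion = (7 / 8) / (1 / 2)
```

The byte rates come out at about 8.33e5 and 7.67e5 per second, matching the last five rows of the table.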

To calibrate our measured times, we conducted a series of timed runs where the flow graph
contains only the null or pseudo-random source and the null signal sink. Measuring this
duration allows us to quantify the amount of time the flow graph would have taken without
processing by the block under test. We then subtracted the appropriate time measurement (null
or pseudo-random, and data type) for this empty flow graph from the measured time for each
of our measurements.

            Block                             Source    Throughput
            Automatic gain control            Null      5.86e7 complex/s
            Frequency locked loop             Null      6.39e6 complex/s
            Frequency locked loop             Random    6.63e6 complex/s
            M-PSK receiver                    Null      8.37e6 complex/s
            M-PSK receiver                    Random    9.00e6 complex/s
            Rotate 45 degrees                 Null      9.36e7 complex/s
            Complex adjustment (no conj)      Null      1.00e8 complex/s
            Complex adjustment (conjugated)   Null      8.56e7 complex/s
            Drop                              Null      1.30e8 complex/s
            Complex to interleaved float      Null      4.32e8 complex/s
            Depuncture (rate 1/2 code)        Null      8.79e7 float/s
            Trellis metrics                   Null      5.88e7 float/s
            Trellis metrics                   Random    1.30e8 float/s
            Viterbi                           Null      4.02e6 float/s
            Viterbi                           Random    4.35e6 float/s
            Sync decoder                      Null      1.33e8 bit/s
            Convolutional deinterleaver       Null      5.25e7 byte/s
            Pad RS encoded packet             Null      2.73e9 byte/s
            Reed-Solomon decoder              Null      4.22e7 byte/s
            Reed-Solomon decoder              Random    7.53e6 byte/s
            Derandomiser                      Null      2.50e8 byte/s
            Depad MPEG-2 TS packet            Null      2.73e9 byte/s

                            TABLE 4.3: Receiver blocks throughput

To see whether or not each individual block by itself meets our throughput requirements, we
calculated its throughput based on the time it takes to process 10 million input values. Our
throughput measurements for the main receive path blocks are summarised in Table 4.3.

For the blocks that perform different operations depending on the contents of input data, we
used the throughput value for random input data. For blocks like the derandomiser that always
perform the same operation (XOR in this case), there was no need to measure performance
against a random source.

       FIGURE 4.3: Throughput of individual receiver blocks, normalised against the
       requirement for realtime signal processing on a Core 2 Quad computer

To put the throughput values into perspective, we normalised the measurements against what
was necessary to achieve realtime decoding; the result is shown in Figure 4.3. A high normalised
value indicates that the block’s throughput exceeds the minimum requirement, while a score of
less than one means that the block is incapable of performing at realtime.
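
The normalisation is a simple ratio of measured to required throughput. The sketch below applies it to three blocks using the random-source figures from Table 4.3 and the requirements from Table 4.2; the dictionary layout is only an illustration.

```python
# Required throughput (Table 4.2) and measured throughput (Table 4.3, random source).
required = {
    "Frequency locked loop": 8e6,       # complex/s
    "Viterbi": 6.66e6,                  # float/s
    "Reed-Solomon decoder": 8.33e5,     # byte/s
}
measured = {
    "Frequency locked loop": 6.63e6,
    "Viterbi": 4.35e6,
    "Reed-Solomon decoder": 7.53e6,
}

# A score below one means the block cannot keep up with a realtime stream.
scores = {name: measured[name] / required[name] for name in required}
bottlenecks = sorted(name for name, score in scores.items() if score < 1.0)
```

Here the Viterbi decoder scores about 0.65 and the frequency locked loop about 0.83, while the Reed-Solomon decoder has roughly nine times the throughput it needs.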

Figure 4.4 shows the relative amounts of time used by each individual block to process a
realtime stream, assuming all of the blocks can run concurrently. From these results, what is
most obvious is that three blocks in particular represent bottlenecks in the system: the
frequency locked loop, the M-PSK receiver and the Viterbi decoder. For the frequency locked
loop and the M-PSK receiver, a contributing factor to their low normalised scores is that they
must process the greatest amount of data. Those blocks perform processes that are typically
implemented using optimised hardware architectures or dedicated hardware.

       FIGURE 4.4: Relative processing duration for individual signal processing
       blocks as a proportion of all processing required for realtime signal reception

The Viterbi decoder is by far the worst performing component in our system. With a score
of 0.65, this block has a throughput of roughly two thirds of what is required for realtime
processing. We believe that the reason behind this under-performance is that the GNU Radio
trellis package might be designed to implement generic trellis coded systems, and therefore
trades optimised performance for flexibility. While we aimed for a clean software development
approach where code duplication is avoided, this bottleneck calls for a compromise worth
making.

There is a free software digital signal processing library available from the same author as the
Reed-Solomon code library, Phil Karn, with specific optimisations for convolutional codes of
rate 1/2 and constraint length K = 7. He has also implemented specific optimisations for
64-bit machines and more modern machines with support for the streaming SIMD instruction
sets SSE2 and SSE4.1. An optimised Viterbi decoder based on this library should resolve this
bottleneck.

While it is the block by block breakdown of relative throughput that is most important, it also
helps to consider the performance of the system as a whole. Using the same experimental
procedure, we connected our full DVB-S receiver system in a flow graph and measured
its throughput. It performed underwhelmingly, at a rate of 1.19e6 complex samples per second.
Considering that the system needs a throughput of 8e6 complex samples per second, its
normalised throughput is 0.15.

4.4 Utilisation of Parallel Computation Cores

The GNU Radio framework is capable of concurrent multi-processing, allowing different signal
processing blocks to run concurrently on different threads. This capability is a key
performance factor, because modern computers typically have two or more processor cores
available. During our experiments on a computer with 4 processor cores, we typically observed
heavy utilisation of one processor core with a relatively even load on the other three cores;
the total amount of CPU utilisation was in the range of 150% to 200%.

From this observation, we can infer that the amount of parallelism is limited by the rate of
data production by the earlier blocks in the receive path. Therefore, we should expect
a higher overall system performance if we had used a computer with a dual core processor at a
higher clock speed.

Alternatively, we should achieve greater system throughput if the processing required for each
of the slower blocks could be split into two or more blocks, allowing GNU Radio to schedule
concurrent processing on different processor cores.

                                         CHAPTER 5


5.1 Summary

In this project, we developed a new system for transmission and reception of DVB-S signals.
We provide an implementation of the system for the GNU Radio software radio framework.

Our system was designed with software quality in mind, using a test driven development
approach. We have also re-used existing signal processing blocks where available in order to
avoid code and functionality duplication in GNU Radio. The overall structure of our system is
highly modular, with each individual signal processing block completely decoupled from the
others so that future work on improving the system performance can be carried out on a
block-per-block basis. Even in the depuncture-Viterbi block, where there is a feedback
mechanism, sub-blocks can be replaced with others without affecting its overall operation.

We performed extensive evaluations of our implementation of the DVB-S receiver. We have
successfully validated the correctness of our system by correctly decoding and playing back a
real captured media stream using MPlayer. Further evaluations revealed that there are only
three signal processing blocks preventing realtime reception of DVB-S. We expect that realtime
DVB-S reception is possible once these three blocks have been optimised for greater
throughput. We identified that the system is unable to obtain full utilisation of the available
multiprocessing capability of the CPU due to bottleneck constraints of the slower blocks, and
that better overall performance may be achieved if the functions of those blocks could be split
into two.

While discussing the performance of our system, it must be stressed that the performance scores
are given with respect to the best case scenario of a low symbol rate and low code rate satellite
transponder. Typical symbol rates for Ku-band satellite transponders (received by smaller,
60 cm satellite dishes) are in the order of 30 million symbols per second, an order of magnitude
greater than what we have used to determine realtime reception, and these transponders may
use higher code rates at the same time. This is because much greater effective isotropic
radiated power (EIRP) is available at the receiver for Ku-band satellites. Not only is this
symbol rate greater than what is receivable by the USRP hardware, the amount of computation
power required to receive this signal is an order of magnitude greater.

5.2 Future Work

Our main focus for this project was on completeness of functionality and best design practices,
while achieving realtime performance was a secondary goal. To achieve realtime reception of
DVB-S on commodity computers, further optimisation is first required to reduce the processing
time of several key signal processing blocks.

The first candidate for performance improvements is of course the Viterbi decoder. We have
seen in our experiments that it is the slowest block in our receive path, and therefore we expect
the greatest benefit from improving this block. This can be done by replacing our trellis metrics
computation and Viterbi algorithm blocks with a new block that acts as a wrapper around Phil
Karn’s digital signal processing library. Fortunately this extension is a simple task because
of our loosely coupled system design. We expect to see much greater throughput using his
optimised code.

The other main candidates for optimisation are the frequency correction and Q-PSK symbol
recovery blocks. These blocks need the greatest absolute throughput because they process
the raw samples from the USRP before any rate reduction takes place. We have begun
to investigate this improvement by implementing the same demodulation functionality using a
different approach. We have provided both implementations as dvb_s_demodulator_cc
and dvb_s_demodulator2_cc.

As well as reducing the computation requirements, the aim of these optimisations is also to
improve multiprocessing efficiency by equalising the processing requirements for different
blocks. At this stage, we are unable to fully utilise the four processor cores available due to
throughput bottleneck limits.

As we described in Chapter 3, we used a brute-force approach for resolving the received signal
ambiguities. While this strategy is simple and effective, the synchronisation process is con-
sidered relatively slow because it takes quite some time for the sync decoder to determine the
correct set of adjustments to be applied. There are, in fact, a number of different strategies for
resolving the complex phase rotation ambiguity.

The phase ambiguity problem is also known as the “node synchronisation” problem, for which
there are several solutions [61]. The conventional approach is to observe the growth of the
path metric in the Viterbi decoder. If the growth exceeds a certain threshold, then there is a high
probability that the carrier phase used for demodulation is incorrect. The drawback of this
approach is the need to inspect the internals of the Viterbi decoder, which ties the
functionality of the overall block more tightly to the specific behaviour of one implementation.
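
The metric-growth test can be illustrated with a small hard-decision Viterbi add-compare-select
loop. The sketch below is not our receiver code. It assumes the unpunctured rate 1/2 DVB-S
mother code (generators 171 and 133 octal), noise-free hard decisions and hypothetical function
names, and it models the ambiguity as a one-symbol misalignment of the X/Y stream. It only
shows that the best path metric stays at zero for a correctly aligned stream and grows
otherwise.

```python
import random

G1, G2 = 0o171, 0o133    # DVB-S mother code generators, constraint length 7


def parity(value):
    return bin(value).count("1") & 1


def encode(bits):
    """Rate 1/2 encoder; the output is interleaved as X0 Y0 X1 Y1 ..."""
    state, out = 0, []
    for b in bits:
        reg = (b << 6) | state           # newest bit in the MSB of a 7-bit register
        out += [parity(reg & G1), parity(reg & G2)]
        state = reg >> 1                 # keep the 6 most recent bits
    return out


def metric_growth(stream):
    """Average growth of the best path metric per symbol pair (hard decisions)."""
    metrics = [0] * 64                   # every start state is allowed
    pairs = len(stream) // 2
    for k in range(pairs):
        rx, ry = stream[2 * k], stream[2 * k + 1]
        new = [float("inf")] * 64
        for state in range(64):
            for b in (0, 1):
                reg = (b << 6) | state
                cost = (parity(reg & G1) != rx) + (parity(reg & G2) != ry)
                nxt = reg >> 1
                # add-compare-select: keep the cheaper path into each next state
                new[nxt] = min(new[nxt], metrics[state] + cost)
        metrics = new
    return min(metrics) / pairs


random.seed(7)
coded = encode([random.getrandbits(1) for _ in range(500)])
aligned = metric_growth(coded)           # 0.0 for a noise-free aligned stream
misaligned = metric_growth(coded[1:])    # positive: no code sequence matches
print(aligned, misaligned)
```

In a real receiver the threshold would have to sit above the growth caused by channel noise
alone, which is one reason this method reacts slowly at low signal-to-noise ratios.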

Another approach is known as syndrome based node synchronisation [62; 63]. This approach
is faster than the previous one and performs synchronisation before Viterbi decoding,
allowing us to integrate it with the demodulator and further decouple the Viterbi decoder
from the sync decoder. A specific application of the syndrome based node synchronisation
technique to DVB-S systems is described in [64].

Using this approach, we can further improve our overall system architecture because we
can reduce the size of the feedback loop. With syndrome based node synchronisation, we can
resolve signal ambiguities using a feed-forward approach, correcting the input symbol stream
before it reaches the Viterbi decoder. The sync decoder must, however, still report its
synchronisation status, because depuncturing requires correct symbol alignment.
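
The idea behind the syndrome can be sketched in a few lines. For a rate 1/2 code with generator
polynomials g1 and g2, the two encoder outputs are X = u·g1 and Y = u·g2 over GF(2), so
X·g2 + Y·g1 is identically zero for a correctly aligned, error-free stream. The code below is a
simplified illustration under our own assumptions (unpunctured code, hard decisions, no noise,
hypothetical function names) and is not the algorithm of [62; 63; 64].

```python
import random

# DVB-S mother code generators 171 and 133 (octal) as coefficient lists.
G1 = [1, 1, 1, 1, 0, 0, 1]
G2 = [1, 0, 1, 1, 0, 1, 1]


def gf2_convolve(bits, taps):
    """Polynomial product of two bit sequences over GF(2)."""
    out = [0] * (len(bits) + len(taps) - 1)
    for i, b in enumerate(bits):
        if b:
            for j, t in enumerate(taps):
                out[i + j] ^= t
    return out


def encode(bits):
    """Rate 1/2 encoder; the output is interleaved as X0 Y0 X1 Y1 ..."""
    x = gf2_convolve(bits, G1)
    y = gf2_convolve(bits, G2)
    return [v for pair in zip(x, y) for v in pair]


def syndrome_fraction(stream):
    """Fraction of nonzero syndrome bits, assuming the stream starts on an X bit."""
    x, y = stream[0::2], stream[1::2]
    n = min(len(x), len(y))
    a = gf2_convolve(x[:n], G2)
    b = gf2_convolve(y[:n], G1)
    # Drop 6 positions at each end; they depend on bits outside the window.
    s = [p ^ q for p, q in zip(a, b)][6:-6]
    return sum(s) / len(s)


def find_alignment(stream):
    """Feed-forward choice between the two possible X/Y alignments."""
    return min((0, 1), key=lambda offset: syndrome_fraction(stream[offset:]))


random.seed(1)
coded = encode([random.getrandbits(1) for _ in range(2000)])
print(syndrome_fraction(coded))        # 0.0 when aligned
print(syndrome_fraction(coded[1:]))    # roughly half the syndrome bits are set
print(find_alignment(coded[1:]))       # 1: skip one bit to regain alignment
```

Because the decision uses only the demodulated bits, no access to the Viterbi decoder
internals is needed. With channel noise the aligned syndrome is no longer exactly zero, so a
practical detector would compare the syndrome weight against a threshold.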

Once this DVB-S implementation is complete, we can produce a working demonstration of the
benefits of software radio using commodity computers and low cost universal radio hardware.
We can gather different existing implementations of DVB-S, DVB-T, DAB, and ATSC to show
that the same hardware device can be used to receive media streams based on very different
standards simply by downloading the corresponding software.

5.3 Conclusion

In this project, we focused on creating a flexible and modular software radio implementation
of the DVB-S system. While its throughput is not yet sufficient for realtime media
stream reception, our implementation allows individual blocks to be optimised or replaced
easily in future work. We have demonstrated that it is possible to use a software radio
approach to implement a complete DVB-S coding and modulation system, although further
optimisation is required for realtime operation.

                                         APPENDIX A

                                 USRP Daughterboards

The daughterboards available for purchase via the Ettus Research website [42] are, at the time
of writing:

       • BasicTX and BasicRX – transmitter and receiver for use with user supplied hardware;
         these daughterboards do not have any mixer, filter or amplifier
       • LFTX and LFRX – DC to 30 MHz transmitter and receiver
       • TVRX – 50 to 870 MHz receiver
       • DBSRX2 – 800 MHz to 2.4 GHz receiver
       • RFX900 – 800 to 1000 MHz transceiver with 200+ mW output power
       • RFX1200 – 1150 to 1450 MHz transceiver with 200+ mW output power
       • RFX1800 – 1.5 to 2.1 GHz transceiver with 100+ mW output power
       • RFX2200 – 2.0 to 2.4 GHz transceiver with 100+ mW output power
       • RFX2400 – 2.3 to 2.9 GHz transceiver with 20+ mW output power
       • XCVR2450 – 2.4 to 2.5 GHz and 4.9 to 5.85 GHz dual band transceiver
       • WBX – 50 MHz to 2.2 GHz wideband transceiver

Note: the RFX daughterboards are MIMO capable.

                                     APPENDIX B

                    List of implemented GNU Radio blocks

The following GNU Radio signal processing blocks were implemented in our DVB-S system:

      • dvb_pad_mpeg_ts_packet_bp
      • dvb_depad_mpeg_ts_packet_pb
      • dvb_pad_dvb_packet_rs_encoded_bp
      • dvb_depad_dvb_packet_rs_encoded_pb
      • dvb_fifo_shift_register_bb
      • dvb_randomizer_pp
      • dvb_derandomizer_pp
      • dvb_rs_encoder_pp
      • dvb_rs_decoder_pp
      • dvb_interleaver_bb
      • dvb_deinterleaver_bb
      • dvb_puncture_bb
      • dvb_depuncture_cc
      • dvb_convolutional_encoder_bb
      • dvb_s_modulator_bc
      • dvb_s_demodulator_cc
      • dvb_depuncture_viterbi_cb
      • dvb_sync_decoder
      • dvb_complex_adjust_cc
      • dvb_drop_cc
      • dvb_complex_to_interleaved_float

                                           APPENDIX C

                                     List of unit tests

The unit tests written for our software radio based implementation of DVB-S are listed
below:

C++ unit tests
      • Test class: qa_dvb_randomizer_pp
           – Test case: lfsr_start_values
              Ensures that the PRBS generated by the randomiser is correct by comparing the
              output of the randomiser with a known good sequence of pre-generated values.
           – Test case: test_randomize_derandomize
              Checks that the derandomiser correctly performs the inverse operation of the
              randomiser.
      • Test class: qa_dvb_sync_search_impl
           – Test case: test_sync_noise
              Tests to ensure that the sync search algorithm does not lock onto random noise.
           – Test case: test_sync_lock
              This test subjects the sync search algorithm to a specifically crafted sequence of
              SYNC bits in the correct layout, for which the SYNC search algorithm is expected
              to successfully lock onto.
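
A known good sequence of the kind used by lfsr_start_values can be generated independently of
the block under test. The sketch below is our own illustration, not the test code itself. It
assumes the energy dispersal generator defined in EN 300 421 [21], namely the polynomial
1 + X^14 + X^15 loaded with the sequence 100101010000000, and the function name is
hypothetical.

```python
def dvb_prbs_bytes(count):
    """Return the first `count` bytes of the DVB energy dispersal PRBS."""
    reg = [1, 0, 0, 1, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # stages 1 to 15
    out = []
    for _ in range(count):
        byte = 0
        for _ in range(8):
            bit = reg[13] ^ reg[14]      # XOR of stages 14 and 15
            reg = [bit] + reg[:-1]       # feed the result back into stage 1
            byte = (byte << 1) | bit     # assemble bytes MSB first
        out.append(byte)
    return out


print([hex(b) for b in dvb_prbs_bytes(3)])   # ['0x3', '0xf6', '0x8']
```

A randomiser test can then XOR these bytes with known input data and compare the result
against the output of the block.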

Python unit tests
      • Test class: qa_dvb
           – Test case: test_complex_to_interleaved_float
              Ensures correctness of the type conversion block from gr_complex to
              interleaved floats.
           – Test case: test_loopback_pad_depad_mpeg_ts_packet
              Tests that padding and depadding do not damage a stream of data.
           – Test case: test_loopback_pad_depad_dvb_packet_rs_encoded
              Tests that padding and depadding do not damage a stream of data.
           – Test case: test_randomizer_pp_short_sequence
              Tests correct behaviour for randomisation and derandomisation when 0, 1, 7,
              8 and 9 MPEG-2 TS packets are input. The 8th packet is the last one before the
              randomiser should automatically reset; 7 and 9 packets test either side of this
              boundary.
           – Test case: test_randomizer_pp_sync_bytes
              Checks the output of the randomiser for preservation of SYNC bytes and correct
              inversion of every 8th SYNC byte.
           – Test case: test_derandomizer_pp_desync
              Tests the ability of the derandomiser to resynchronise correctly after an
              unexpected desynchronisation of the 8 packet frame boundary.
           – Test case: test_rs_coder_pp_error_correction
              Tests the ability of the pair of Reed-Solomon coding blocks to correct up to
              the theoretical maximum of 8 errors.
           – Test case: test_shift_register_bb
              Tests the correctness of the FIFO shift register in normal operation.
           – Test case: test_loopback_rand_rs
              Loopback test from data source through randomiser, RS encoder and the
              corresponding return path.
           – Test case: test_interleaver_bb
              Tests that interleaving and deinterleaving do not damage a stream of data, and
              verifies that the result agrees with the expected signal delay due to the shift
              registers.
           – Test case: test_loopback_rand_rs_int
              Loopback test from data source through randomiser, RS encoder, convolutional
              interleaver and the corresponding return path.
           – Test case: test_puncture_bb
              Tests correct puncturing behaviour when different puncturing sequences are
              applied.
           – Test case: test_depuncture_puncture
              Tests that the depuncturing block is matched to the puncturing block in a loopback
           – Test case: test_depuncture_viterbi_cb
              Subjects the depuncture-viterbi block to random noise; it is expected to produce
              no output.
           – Test case: test_convolutional_encoder
              Tests correctness of the convolutional encoder in a loopback through a Viterbi
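
The signal delay checked by test_interleaver_bb can be worked out from the interleaver
structure. The sketch below is a stand-alone model, not our GNU Radio block. It assumes the
DVB-S convolutional interleaver parameters from EN 300 421 [21] (I = 12 branches, M = 17 byte
cells per unit of delay) and uses hypothetical class and variable names. Branch j is delayed by
j·M cells in the interleaver and (I − 1 − j)·M cells in the deinterleaver, so every byte is
delayed by I·(I − 1)·M = 2244 bytes in total.

```python
from collections import deque

I, M = 12, 17                        # DVB-S interleaving depth and cell size


class DelayLineBank:
    """A commutator cycling over FIFO shift registers of the given lengths."""

    def __init__(self, delays):
        self.fifos = [deque([0] * d) for d in delays]
        self.branch = 0

    def push(self, byte):
        fifo = self.fifos[self.branch]
        self.branch = (self.branch + 1) % len(self.fifos)
        fifo.append(byte)
        return fifo.popleft()        # a zero-length FIFO passes the byte straight through


interleaver = DelayLineBank([j * M for j in range(I)])
deinterleaver = DelayLineBank([(I - 1 - j) * M for j in range(I)])

delay = I * (I - 1) * M              # 2244 bytes end to end
data = [(n % 255) + 1 for n in range(5000)]       # nonzero test pattern
received = [deinterleaver.push(interleaver.push(b)) for b in data]

print(received[delay:] == data[:-delay])          # True: data survives, shifted by the delay
```

The first 2244 output bytes are the zeros that initially fill the shift registers, which is
why a loopback test has to discard them before comparing the streams.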

                                         APPENDIX D

                               GNU Radio Installation

These instructions are designed for use with a fresh installation of Debian GNU/Linux
release 6.0 “Squeeze”.

    (1) Install the following pre-requisite packages through the package manager prior to in-
        stalling GNU Radio:
           • build-essential
           • sdcc swig guile-1.8
           • autoconf automake libtool
           • ccache
           • python-dev python2.6-dev
           • python-scipy
           • libfftw3-dev
           • libcppunit-dev
           • libboost1.42-dev libboost-thread1.42-dev libboost-program-options1.42-dev
           • libusb-dev
           • wx-common python-wxgtk2.8
           • libasound2-dev
           • libsdl1.2-dev
           • libgsl0-dev
           • doxygen xmlto
           • python-cheetah python-lxml (required for grc)
    (2) Obtain a copy of the GNU Radio sources.

        git clone
    (3) Build GNU Radio from sources, run the unit tests and install system-wide.

        make check
        sudo make install
    (4) Apply workaround for libtool behaviour.
           • If it does not already exist, create the file /etc/,
             and write the line /usr/local/lib to that file.
           • Verify that the file /etc/ contains an entry that includes libc.conf.
             For example: include /etc/*.conf
    (5) Add rules for the Udev daemon to handle USRP plug/unplug events.
           • Add a new group named ‘usrp’ and add the current user to it.
             sudo addgroup usrp
             sudo usermod -G usrp -a `whoami`
           • Create a new Udev rule to handle the USB device, by creating the file
             /etc/udev/rules.d/10-usrp.rules which contains the following single line:
             ACTION=="add", BUS=="usb", SYSFS{idVendor}=="fffe", SYSFS{idProduct}=="0002", GROUP:="usrp", MODE:="0660"
           • Finally, reload the Udev rules and plug in the USRP. If configuration was
             successful, then we should find the device listed under our USB devices.
             sudo udevadm control --reload-rules
             ls -lR /dev/bus/usb | grep usrp

                                      APPENDIX E

                             Test computer specifications

Hardware Components

The computer we used for our experiments was a fairly inexpensive and common general pur-
pose computer. The system hardware components, specifications and operating parameters are
listed in Table E.1 and Table E.2.

      Component        Description
      Motherboard      Gigabyte GA-G31M-ES2L v1.0 (Intel G31 + ICH7)
      CPU              Intel Core 2 Quad Q6600 (8 MB L2, 2.40 GHz, 1066 MHz FSB)
      Memory           2 × 1 GB Hynix HYNP112U64CP8 PC2-5300 DDR2 667 MHz
      Power Supply     Antec TruePower 2.0 480 W TPII-480
                           TABLE E.1: Test computer specifications

                   Parameter                 Setting
                   Front side bus (FSB)      1066 MHz
                   CPU core clock            2.4 GHz
                   CPU core voltage (Vcore)  1.268 V (measured in BIOS)
                   Memory multiplier         4.0
                   Memory clock              800 MHz
                   Memory timings            4-4-4-12 T
                   Memory voltage            1.808 V (measured in BIOS)
                        TABLE E.2: Computer operating parameters

Software Configuration

Our computer was configured with a fresh installation of 64-bit Debian GNU/Linux version 6.0
“Squeeze” with kernel version 2.6.32-5-amd64.

Instead of using the GNU Radio packages from the Debian repository, we used a development
version obtained from the Git repository. Our tests were conducted using GNU Radio revision
c81312cee781a6912eb87f430096f3757e056b28, dated 14 September 2010.

We used GNU Compiler Collection gcc version 4.4.5 (Debian 4.4.5-4) and SWIG version
1.3.40 to build our C++ signal processing blocks. For C++, we also used the -O2 compiler
optimisation flag to optimise runtime speed.

                    APPENDIX F

Complete receive path block structure and interfaces


 [1] J. Mitola, “Software radios-survey, critical evaluation and future directions,” in [Proceed-
     ings] NTC-92: National Telesystems Conference, pp. 13/15–13/23, IEEE, 1992.

 [2] W. H. Tuttlebee, “Software-defined radio: facets of a developing technology,” IEEE Per-
     sonal Communications, vol. 6, pp. 38–44, Apr. 1999.

 [3] A. H. Tewfik and G. E. Sobelman, “Software defined cognitive radios,” in 2007 7th Inter-
     national Conference on ASIC, pp. 14–14, IEEE, Oct. 2007.

 [4] “Wireless Innovation Forum,” 2010.

 [5] W. H. Tuttlebee, “Advances in software defined radio,” Annals of Telecommunications,
     vol. 57, no. 5-6, pp. 314–337, 2002.

 [6] T. Ulversoy, “Software Defined Radio: Challenges and Opportunities,” IEEE Communi-
     cations Surveys & Tutorials, 2010.

 [7] D. Stephens, B. Salisbury, and K. Richardson, “JTRS Infrastructure Architecture and
     Standards,” in MILCOM 2006, pp. 1–5, IEEE, Oct. 2006.

 [8] Free Software Foundation, Inc., “GNU Radio - The GNU Software Radio,” 2010.

 [9] R. K. Jurgen, “Digital video,” IEEE Spectrum, vol. 29, pp. 24–30, Mar. 1992.

[10] D. Anastassiou, “Digital television,” Proceedings of the IEEE, vol. 82, pp. 510–519, Apr.

[11] N. Bourbakis, “Digital video and digital TV: a comparison and the future directions,”
     in Proceedings 1999 International Conference on Information Intelligence and Systems
     (Cat. No.PR00446), pp. 470–481, IEEE Comput. Soc, 1999.

[12] E. Kimura and Y. Ninomiya, “A high-definition satellite television broadcast sys-
     tem—’MUSE’,” Journal of the Institution of Electronic and Radio Engineers, vol. 55,
     no. 10, p. 353, 1985.

[13] U. Reimers, “DVB-the family of international standards for digital video broadcasting,”
     Proceedings of the IEEE, vol. 94, pp. 173–182, Jan. 2006.

[14] R. Hopkins, “Advanced Television Systems,” IEEE Transactions on Consumer Electron-
     ics, vol. CE-32, pp. xi–xvi, May 1986.

[15] G. Sgrignoli, History of ATSC Digital Television Transmission System. No. February
     1993, IEEE, Jan. 2007.

[16] U. Ladebusch and C. Liss, “Terrestrial DVB (DVB-T): a broadcast technology for sta-
     tionary portable and mobile use,” Proceedings of the IEEE, vol. 94, pp. 183–193, Jan.

[17] G. Faria, J. Henriksson, E. Stare, and P. Talmola, “DVB-H: digital broadcast services to
     handheld devices,” Proceedings of the IEEE, vol. 94, pp. 194–209, Jan. 2006.

[18] M. Cominetti and A. Morello, “Digital video broadcasting over satellite (DVB-S): a sys-
     tem for broadcasting and contribution applications,” International Journal of Satellite
     Communications, vol. 18, no. 6, pp. 393–410, 2000.

[19] A. Morello and V. Mignone, “DVB-S2: the second generation standard for satellite broad-
     band services,” Proceedings of the IEEE, vol. 94, pp. 210–227, Jan. 2006.

[20] A. Stienstra, “Technologies for DVB services on the Internet,” Proceedings of the IEEE,
     vol. 94, pp. 228–236, Jan. 2006.

[21] European Telecommunications Standards Institute, “EN 300 421 V1.1.2,” 1997.

[22] International Organization for Standardization, “ISO/IEC 13818-1 Information technol-
     ogy - Generic coding of moving pictures and associated audio information: Systems,”

[23] P. Koch and R. Prasad, “The universal handset,” IEEE Spectrum, vol. 46, pp. 36–41, Apr.

[24] W. Tuttlebee, “Software radio technology: a European perspective,” IEEE Communica-
     tions Magazine, vol. 37, no. 2, pp. 118–123, 1999.

[25] M. Sadiku and C. Akujuobi, “Software-defined radio: a brief overview,” IEEE Potentials,
     vol. 23, pp. 14–15, Oct. 2004.

[26] E. Buracchini, “The software radio concept,” IEEE Communications Magazine, vol. 38,
     no. 9, pp. 138–143, 2000.

[27] R. C. Reinhart, S. K. Johnson, T. J. Kacpura, C. S. Hall, C. R. Smith, and J. Liebetreu,
     “Open Architecture Standard for NASA’s Software-Defined Space Telecommunications
     Radio Systems,” Proceedings of the IEEE, vol. 95, pp. 1986–1993, Oct. 2007.

[28] R. Lackey and D. Upmal, “Speakeasy: the military software radio,” IEEE Communica-
     tions Magazine, vol. 33, pp. 56–61, May 1995.

[29] D. Efstathiou, L. Fridman, and Z. Zvonar, “Recent developments in enabling technologies
     for software defined radio,” IEEE Communications Magazine, vol. 37, no. 8, pp. 112–117,

[30] M. Cummings and S. Haruyama, “FPGA in the software radio,” IEEE Communications
     Magazine, vol. 37, no. 2, pp. 108–112, 1999.

[31] J. Mitola, “The software radio architecture,” IEEE Communications Magazine, vol. 33,
     pp. 26–38, May 1995.

[32] J. Belzile, S. Bernier, C. Auger, and D. Roberge, “Co-design for software defined radio
     using the software communications architecture,” in 2004 IEEE/Sarnoff Symposium on
     Advances in Wired and Wireless Communications, pp. 55–58, IEEE, 2004.

[33] E. Jones, “Software Defined Radios, Cognitive Radio and the Software Communications
     Architecture (SCA) in relation to COMMS, radar and ESM,” in Cognitive Radio and
     Software Defined Radios: Technologies and Techniques, 2008 IET Seminar on, (London),
     pp. 1–7, 2008.

[34] A. C. Tribble, “The software defined radio: Fact and fiction,” in 2008 IEEE Radio and
     Wireless Symposium, pp. 5–8, IEEE, Jan. 2008.

[35] J. Mitola, “Technical challenges in the globalization of software radio,” IEEE Communi-
     cations Magazine, vol. 37, no. 2, pp. 84–89, 1999.

[36] A. Haghighat, “A review on essentials and technical challenges of software defined radio,”
     in MILCOM 2002. Proceedings, pp. 377–382, IEEE, 2002.

[37] M. D. McCool, “Signal Processing and General-Purpose Computing and GPUs [Ex-
     ploratory DSP],” IEEE Signal Processing Magazine, vol. 24, pp. 109–114, May 2007.

[38] C. R. Johns and D. A. Brokenshire, “Introduction to the Cell Broadband Engine Architec-
     ture,” IBM Journal of Research and Development, vol. 51, pp. 503–519, Sept. 2007.

[39] M. Cummings and T. Cooklev, “Tutorial: Software-defined radio technology,” in 2007
     25th International Conference on Computer Design, pp. 103–104, IEEE, Oct. 2007.

[40] J. Mitola, D. Chester, S. Haruyama, T. Turletti, and W. Tuttlebee, “Globalization of Soft-
     ware radio [Guest Editorial],” Feb. 1999.

[41] J. Mitola, “Software Radio Architecture Evolution: Foundations, Technology Tradeoffs,
     and Architecture Implications,” IEICE Transactions on Communications, vol. E83-B,
     no. 6, pp. 1165–1173, 2000.

[42] Ettus Research, “Ettus Research Website,” 2010.

[43] F. A. Hamza, “The USRP under 1.5X Magnifying Lens!,” 2008.

[44] D. Valerio, “Open Source Software-Defined Radio: A survey on GNUradio and its appli-
     cations,” 2008.

[45] G. Abgrall, F. Le Roy, J.-P. Delahaye, J.-P. Diguet, and G. Gogniat, “A comparative study
     of two software defined radio platforms,” in SDR ’08 Technical Conference and Product
     Exposition, 2008.

[46] “The Comprehensive GNU Radio Archive Network,” 2010.

[47] E. Blossom, “How to Write a Signal Processing Block,” 2005.

[48] J. P. Elsner, “Implementation of the DAB physical layer in software using the GNU Radio
     framework,” 2007.

[49] A. Müller, “DAB Software Receiver Implementation,” Aug. 2008.

[50] Y. Jiang, W. Xu, and C. Grassmann, “Implementing a DVB-T/H Receiver on a Software-
     Defined Radio Platform,” International Journal of Digital Multimedia Broadcasting,
     vol. 2009, pp. 1–8, 2009.

[51] U. Ramacher, “Software-Defined Radio Prospects for Multistandard Mobile Phones,”
     Computer, vol. 40, pp. 62–69, Oct. 2007.

[52] Z. Ye, J. Grosspietsch, and G. Memik, “An FPGA Based All-Digital Transmitter with
     Radio Frequency Output for Software Defined Radio,” in 2007 Design, Automation &
     Test in Europe Conference & Exhibition, pp. 1–6, IEEE, Apr. 2007.

[53] V. Pellegrini, G. Bacci, and M. Luise, “Soft-DVB: A Fully-Software GNURadio-based
     ETSI DVB-T Modulator,” in 5th Karlsruhe Workshop on Software Radios, vol. 2, 2008.

[54] L. Rose, R-DVB: Software Defined Radio implementation of DVB-T signal detection func-
     tions for digital terrestrial television. PhD thesis, University of Pisa, 2009.

[55] V. Pellegrini, M. D. Dio, L. Rose, and M. Luise, “A real-time, fully-software receiver
     for DVB-T signals based on the USRP,” in 6th Karlsruhe Workshop on Software Radios,

[56] M. Di Dio, Signal synchronization and channel estimation/equalization functions for
     DVB-T software-defined receivers. PhD thesis, University of Pisa, 2009.

[57] V. Pellegrini and M. Luise, “Fully software OFDM modulation in vehicular, highly time-
     variant channels. An implemented technology and its results,” in 2009 6th International
     Symposium on Wireless Communication Systems, pp. 550–554, IEEE, Sept. 2009.

[58] V. Pellegrini, L. Rose, and M. Di Dio, “On Memory Accelerated Signal Processing within
     Software Defined Radios,” eprint arXiv:1004.0263, Apr. 2010.

[59] Z. Yi and Y. Li, “An Efficient Convolutional Code Information Recovery Scheme in DVB-
     S,” in 2008 4th International Conference on Wireless Communications, Networking and
     Mobile Computing, pp. 1–4, IEEE, Oct. 2008.

[60] Z. Yi and Y. Li, “The Research of Sync-search Scheme in DVB-S Receiver,” in 2010
     Second International Conference on Networks Security, Wireless Communications and
     Trusted Computing, pp. 377–380, IEEE, Apr. 2010.

[61] O. Joeressen and H. Meyer, “Node synchronization for punctured convolutional codes
     of rate (N-1)/N,” in 1994 IEEE GLOBECOM. Communications: The Global Bridge,
     pp. 1279–1283, IEEE, 1994.

[62] M. Moeneclaey and P. Sanders, “Syndrome-based Viterbi decoder node synchronization
     and out-of-lock detection,” in [Proceedings] GLOBECOM ’90: IEEE Global Telecommu-
     nications Conference and Exhibition, pp. 604–608, IEEE, 1990.

[63] M.-L. de Mateo, “Node synchronization technique for any 1/n rate convolutional code,”
     in ICC 91 International Conference on Communications Conference Record, no. D,
     pp. 1681–1687, IEEE, 1991.

[64] J. Fu and Y. Tang, A Parity Check Node Synchronization Based Method for Solving Phase
     Ambiguity Problem in DVB-S Systems. IEEE, Aug. 2009.

