High Speed Modem Concepts and Demonstrator for Adaptive Coding and Modulation with High Order in Satellite Applications¹

C. Berrou(6), R. De Gaudenzi(1), C. Douillard(6), G. Gallinaro(5), R. Garello(3), D. Giancristofaro(2), A. Ginesi(1), M. Luise(4), G. Montorsi(3), R. Novello(2), A. Vernucci(5)

(1) European Space Agency
(2) Alenia Spazio
(3) Politecnico di Torino
(4) Università di Pisa
(5) Space Engineering
(6) ENST Bretagne

1.         Introduction
Within the satellite communications innovation trend, some key issues, such as:
• the usage of increasingly high frequency bands (e.g., from X and Ku up to Ka and Q);
• the higher frequency reuse achievable with multi-beam satellite antenna technology;
• the increasing satellite RF power becoming available thanks to platform technology improvements;
• the exploitation of Adaptive Coding and Modulation (ACM) techniques, in addition to power control, to reduce
     the static link margin and match the physical layer to the location- and time-dependent SNIR [1], [2];
are shifting the focus from classical satellite modulation schemes, such as QPSK, to higher-order M-ary modulation
schemes. The latter can provide a higher spectral efficiency and thus the data rate required for multimedia
applications, for point-to-point high data rate backbone connectivity, and for future Earth Observation missions
requiring downlink data rates exceeding 1 Gb/s.

Usage of the higher frequency bands, while providing improved bandwidth and clear-sky link budget for a given user station antenna dish size, implies the need to cope with an increased depth of fading events as well as a higher level of local oscillator phase noise. These problems can be mitigated by exploiting Fade Mitigation Techniques (FMT), such as gateway site diversity, ACM at the physical layer and power control, together with robust demodulator synchronization techniques, possibly based on inserted pilot symbols. Notwithstanding these technological improvements, satellite RF power still represents the key cost driver of satellites and calls for efficient on-board high-power amplification. To achieve the required power efficiency, highly efficient coding shall be associated with a wide range of (high-order) modulation schemes, optimised for operation over the non-linear satellite channel. On the demodulator side, the challenge is to cope with the high-speed link requirements and to extract synchronisation and channel estimates accurate enough to avoid impairing the data recovery process, even in the presence of challenging phase noise disturbances, at the low Es/N0 imposed by state-of-the-art coding solutions.

This paper reports the preliminary outcomes of the Modem for Higher Order Modulation Schemes (MHOMS) project, funded under the Technology Research Program of the European Space Agency (ESA). The MHOMS program is aimed at the research, design, development and demonstration of an innovative, very high-speed, on-the-fly re-configurable satellite digital modem prototype, with a maximum bit rate of 1 Gbps, supporting a wide range of spectral efficiencies (from 0.5 to 5.4 bps/Hz). The modem performance requirements for all operational modes (spectral efficiencies) are set very close to the Shannon capacity bound (about 1 dB away at most). Furthermore, the specified losses for modem operation over a typical Ka-band nonlinear satellite channel shall be low, calling for innovative modulation/demodulation scheme design. The subject of the demonstration activity is the core HW section of the physical layer, which is expected to maximise throughput and efficiency in the satellite communication link to an unprecedented degree and therefore requires a highly innovative design and development approach. This is particularly true for the demodulator/decoder sub-system, which will require truly innovative high-speed architectures. Together with the modem, the hardware demonstrator will include a channel simulator comprising a hardware non-linearity simulator, programmable phase noise injection, and adjacent and co-channel interference injection. The current contribution focuses on the results of study phase 1 (currently ongoing), which aims at specifying flexible yet power- and spectrally efficient (de)coding, (de)modulation and synchronization techniques able to satisfy the challenging requirements set forth. Phase 2 will lead to the detailed design, implementation and test of the demonstrator.

¹ The present contribution describes the achievements obtained within the ESA-funded MHOMS program (Modems for High-Order Modulation Schemes, Contract No. 16593/02/NL/EC), led by Alenia Spazio as prime contractor.

The envisaged MHOMS modem application scenarios will encompass as a minimum:
• High-speed Distributed Internet Access;
• Trunk Connectivity (Backbone/Backhaul);
• Earth Observation high-speed downlink;
• Point-to-multipoint applications (e.g. high-speed multicasting/broadcasting).

Among the above application scenarios, particular commercial interest is expected for the broadcasting/multicasting and Internet access applications, for which the team is actively participating in the DVB-S2 standardisation group [3]. With respect to the new DVB-S2 standard (limited to the forward link), the MHOMS work also encompasses more advanced functionalities, such as satellite beam hopping, which requires a different frame structure, and an enhanced reverse link compared to the current DVB-RCS.


2.   Modulation schemes and signal detection/synchronisation techniques
Together with state-of-the-art coding, addressed in Section 3, new approaches to the issue of modulation and signal detection/synchronization have been taken into account. Two main aspects were considered in this respect: non-linear distortions introduced by the High-Power Amplifier (HPA) on board the satellite on the one hand, and robust synchronization at low SNIR on the other.


2.1      Modulation formats selection

Concerning non-linear distortions, two basic approaches were investigated. The first one is proper design of the constellations to be used. Traditional square- or cross-QAM constellations in fact have a high peak-to-average power ratio, which causes non-negligible AM/AM and AM/PM distortions when the signal is amplified by the HPA. This remark led to the “rediscovery” and optimisation of different multi-level Amplitude- and Phase-Shift Keying (APSK) constellations based on concentric “rings” of equi-spaced points, such as the 4+12-APSK (an inner 4PSK constellation surrounded by an outer 12PSK) shown in Figure 1, and similarly for 32-APSK and 64-APSK [4]. In 4+12-APSK, for instance, the modulation symbols take only two different amplitudes, thus minimising envelope fluctuations in the transmitted signal. This in turn results in lower distortion of the HPA-amplified received signal. With proper capacity-based optimisation of the APSK constellation parameters, APSK showed superior performance over nonlinear satellite channels and almost identical performance compared to QAM over the linear AWGN channel. Constellation optimisation based on the minimum Euclidean distance (see Figure 1-a), although simpler, provides sub-optimum results in some cases. The 16-APSK and 32-APSK constellations proposed by the team have been retained by the DVB-S2 standard [3].

The second approach to minimizing the effect of the nonlinearity is adaptive constellation (data) pre-distortion in the transmitter. Pre-distortion means intentionally modifying the location of the data symbols on the complex plane with respect to their nominal position. The criterion is simple: the constellation points are pre-distorted at the transmitter so that the constellation observed at the receiver matched filter output looks, on average, like the ideal one that would be obtained in the absence of distortions. Such a technique only calls for a modification of the transmitted constellation points, without the need to resort to analogue devices. This is particularly straightforward and effective for circular constellations such as APSK. The drawback of this technique is that the Earth station needs to know how to pre-distort its own transmitted signal, which depends on how the whole communication chain distorts the signal. The required information can be easily obtained a priori, or a posteriori if the uplink ground station receives its own signal in the downlink. If required, the adaptation is carried out in a training phase on known (pilot) symbols, not necessarily time-contiguous, that can be periodically repeated according to the (slow) dynamics of the drift of the HPA characteristics [5].
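The envelope argument for ring constellations can be checked numerically. The sketch below builds a 4+12-APSK and a square 16-QAM constellation and compares their peak-to-average power ratios at constellation level (before pulse shaping). The ring radius ratio of 2.7 and the function names are illustrative assumptions, not the optimised parameters of the study.

```python
import numpy as np

def apsk(points_per_ring, radii):
    """Concentric rings of equi-spaced points, normalised to unit average power."""
    rings = [r * np.exp(2j * np.pi * (np.arange(n) + 0.5) / n)
             for n, r in zip(points_per_ring, radii)]
    c = np.concatenate(rings)
    return c / np.sqrt(np.mean(np.abs(c) ** 2))

def square_qam(m):
    """Square M-QAM on an odd-integer grid, normalised to unit average power."""
    side = int(round(np.sqrt(m)))
    levels = np.arange(-(side - 1), side, 2)
    c = (levels[:, None] + 1j * levels[None, :]).ravel()
    return c / np.sqrt(np.mean(np.abs(c) ** 2))

def papr_db(c):
    """Peak-to-average power ratio of the constellation points, in dB."""
    p = np.abs(c) ** 2
    return 10 * np.log10(p.max() / p.mean())

# Ring ratio 2.7 is an assumed, illustrative value.
apsk16 = apsk([4, 12], [1.0, 2.7])
qam16 = square_qam(16)
print(f"4+12-APSK: {papr_db(apsk16):.2f} dB, 16-QAM: {papr_db(qam16):.2f} dB")
```

With these assumptions the APSK points take two amplitudes only, and the constellation-level peak-to-average ratio is lower than that of 16-QAM, which takes three.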
Different pre-distortion schemes were investigated, based either on “instantaneous” evaluation of the distortion at the receiver (adaptive static pre-distortion), or on the consideration of a certain amount of “memory” in the combined phenomenon of non-linear distortion plus matched filtering at the receiver (so-called adaptive dynamic pre-distortion) [6]. In particular, the latter approach can compensate both for the non-linear distortion (mainly caused by the chain composed of transmitter filter + HPA + receiver filter) and for the linear distortion (caused by the satellite IMUX and OMUX filters at the input and the output of the satellite transponder, respectively). This makes the use of a decision feedback equalizer (DFE) at the receiver side unnecessary.
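The idea behind static (memoryless) data pre-distortion can be sketched with a toy amplifier model. The example below is only an illustration under strong assumptions: the HPA is represented by the Saleh model with commonly quoted coefficients (not the measured characteristics used in the study), transmit and receive filters are ignored, and the pre-distorted points are obtained by closed-form inversion instead of the adaptive, pilot-based procedure described above.

```python
import numpy as np

# Saleh TWTA model with commonly quoted coefficients (assumed here,
# not the measured HPA characteristics used in the study).
A_A, B_A = 2.1587, 1.1517   # AM/AM coefficients
A_P, B_P = 4.0033, 9.1040   # AM/PM coefficients (radians)

def hpa(x):
    """Memoryless HPA: amplitude compression plus amplitude-dependent phase shift."""
    r = np.abs(x)
    gain = A_A / (1 + B_A * r**2)
    phase = A_P * r**2 / (1 + B_P * r**2)
    return x * gain * np.exp(1j * phase)

def predistort(target):
    """Input points whose HPA output equals `target` (valid below saturation)."""
    y = np.abs(target)
    # Smaller root of B_A*y*r^2 - A_A*r + y = 0, i.e. the AM/AM curve inverted.
    r = (A_A - np.sqrt(A_A**2 - 4 * B_A * y**2)) / (2 * B_A * y)
    phase = A_P * r**2 / (1 + B_P * r**2)
    return r * np.exp(1j * (np.angle(target) - phase))

# Desired 4+12-APSK at the HPA output; outer ring at 0.9 (just below the
# model's saturation level of about 1.0), ring ratio 2.7 assumed.
inner = (0.9 / 2.7) * np.exp(2j * np.pi * (np.arange(4) + 0.5) / 4)
outer = 0.9 * np.exp(2j * np.pi * (np.arange(12) + 0.5) / 12)
target = np.concatenate([inner, outer])

tx_points = predistort(target)
print(np.allclose(hpa(tx_points), target))
```

Because the constellation has only two rings, the pre-distortion reduces to one amplitude and one phase correction per ring, which is why the technique is particularly simple for APSK.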

The set of tested modulation schemes for MHOMS includes “conventional” QPSK and 8-PSK modulations, and two-ring 16-ary, three-ring 32-ary and four-ring 64-ary APSK constellations. Thanks to the coded APSK constellations and pre-distortion techniques, in single-carrier mode the HPA can be operated remarkably close to saturation. This is particularly true for 16-APSK, which operates optimally at an Input Back-Off (IBO) very close to 0 dB. With data (constellation) predistortion, the average Output Back-Off (OBO) can be as low as 1.5 dB even with 16-point variable-amplitude constellations and in the presence of highly efficient channel coding (turbo or LDPC). For instance, Figs. 2 and 3 show curves of the total (power) degradation (TD) due to nonlinear distortions and output power back-off, obtained in a realistic simulation of a satellite channel. The simulation was carried out with measured characteristics of the HPA and IMUX/OMUX filters, and with a special technique of “dynamic” constellation predistortion with memory that optimizes system performance. The advantage of dynamic techniques with respect to “static” (memoryless) predistortion and simple phase/amplitude recovery (DA-AGC) is apparent. The optimum TD for 4-12 APSK with rate-3/4 LDPC coding is 2 dB, of which 0.9 dB is due to the IMUX/OMUX losses and 0.9 dB corresponds to the TWTA power modulation loss due to the non-constant input signal envelope.




Figure 1: Multi-ring APSK constellations: a) Euclidean distances; b) 16-point 4+12-APSK; c) 16-point 6+10-APSK; d) 64-point 4+12+20+28-APSK.
[Plots: Total Degradation [dB] versus OBO [dB]; curves for DA-AGC, Static Data Predistortion and Dynamic Data Predistortion M=5. LDPC code, N = 64800, R = 3/4, 60 iterations, BER = 10^-5; (Es/N0)lin = 9.9 dB for 16-QAM and 10.1 dB for 4-12 APSK.]

Figure 2: Total Degradation for r=3/4 LDPC-coded 16-QAM with DA-AGC, Static Data Predistortion and Memory-5 Dynamic Data Predistortion.

Figure 3: Total Degradation for r=3/4 LDPC-coded 4-12-APSK with DA-AGC, Static Data Predistortion and Memory-5 Dynamic Data Predistortion.



2.2      Synchronization techniques

As for signal synchronization, the receiver has to be able to work with powerful channel coding (turbo or LDPC with large block length), and therefore at an operating Es/N0 ratio considerably lower than that of more conventional coding schemes with shorter code blocks and/or smaller coding gain. The best that a synchronisation unit can do in this respect is given by the performance of data-aided algorithms, which can only be applied to pilot symbols known to the receiver in advance. On the other hand, insertion of pilot symbols decreases the efficiency of the link, reducing the net Eb/N0 ratio on information bits. A pilot “density” of only 20% with respect to the information symbols means a decrease of roughly 1 dB in power efficiency, thus partially offsetting the benefits of adopting more advanced coding schemes.
The approach taken in the design of the MHOMS modem was to investigate the trade-off between (optimum) data-aided synchronisation, with a loss due to pilot insertion, and (suboptimal) “blind” synchronisation, with no pilot insertion loss, and to identify the best solution for the different operating modes. Blind synchronisation places some constraints on the adoption of particular modulation schemes. For instance, 4-12 APSK turns out to be more efficient than conventional 16-QAM on the non-linear satellite channel, but its blind carrier frequency synchronisation is extremely challenging (due to the particular constellation symmetry), whereas it is relatively straightforward with 16-QAM.
Novel techniques for blind carrier phase recovery (which is needed to perform coherent detection and channel decoding) were also investigated. In particular, by exploiting the Expectation Maximisation (EM) algorithm [7], soft-decision-directed iterative phase estimation combined with iterative channel decoding (the so-called turbosync approach) was shown to be applicable to both continuous- and burst-mode operation of the modem, even in the presence of large amounts of oscillator phase noise. With long codewords, the phase noise varies significantly over the channel code block. This means that correcting the carrier phase with a constant term over the entire frame is ineffective. The best approach is to subdivide the block into L sub-blocks, each one comprising a pilot symbol preamble and a “payload” section. The preamble is needed for “pre-compensating”, with a data-aided estimate, the overall phase error in the subsequent payload section, so that the EM algorithm can operate within its phase acquisition range (about [-30,30] degrees). The resulting framing structure is shown in Figure 4.
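[Diagram: a frame of N symbols divided into sub-blocks, each made of a preamble of NP pilot symbols followed by a data field of ND symbols.]

Figure 4: Pilot distribution scheme I.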

Obviously, the performance depends on the pilot symbol density η, defined as

$$\eta = \frac{N_P}{N_P + N_D} \qquad (1)$$

where N_D is the data field length and N_P is the preamble length. For each constellation, we found an optimal value of N_D which represents the best trade-off between two opposite constraints. First, it is necessary to minimise the loss in power efficiency due to the insertion of pilot symbols (need to have a short preamble section); second, we want to achieve an estimate of the carrier phase as accurate as possible (need to lengthen the preamble).
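The first of the two constraints can be quantified with a few lines of code. The sketch below assumes that pilot and data symbols are transmitted with the same energy, so that the fraction η of the transmitted energy spent on pilots is lost for the information bits; the function names are illustrative.

```python
import math

def pilot_density(n_p, n_d):
    """Pilot symbol density eta = N_P / (N_P + N_D), as in equation (1)."""
    return n_p / (n_p + n_d)

def pilot_loss_db(eta):
    """Power-efficiency loss in dB when a fraction eta of the symbols are pilots
    (assumes equal energy per pilot and data symbol)."""
    return -10 * math.log10(1 - eta)

for eta in (0.02, 0.03, 0.20):
    print(f"eta = {eta:.2f}: loss = {pilot_loss_db(eta):.2f} dB")
```

Under this assumption, densities of 2% and 3% cost about 0.09 dB and 0.13 dB respectively, while a density of 20% costs close to 1 dB.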
Figure 5 shows an alternative pilot distribution scheme that has been tested. For each sub-block, the preamble analysed for the previous system is evenly split between the head and the tail of the sub-block. In other words, half of the preamble of the first sub-block is moved to the end of the last sub-block of the frame. Linear interpolation between two subsequent phase estimates is exploited to pre-compensate for phase noise and carrier frequency offset. In particular, this pilot distribution provides an estimate of the “instantaneous” carrier phase at the end of each frame, so that linear interpolation can also be performed between the last two phase estimates \hat{\theta}_{L-1} and \hat{\theta}_L to better compensate for frequency offset and phase noise.
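[Diagram: a frame of N symbols in which each data field of ND symbols is preceded and followed by NP/2 pilot symbols; phase estimates θ0, θ1, θ2, ..., θL are obtained at the sub-block boundaries.]

Figure 5 – Pilot distribution scheme II.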
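The pre-compensation based on pilot scheme II can be sketched as follows. This is a minimal, noise-free illustration, not the receiver of the study: the data-aided estimate is taken as the argument of the correlation between received and known pilot symbols, the estimates are assigned to the centre of each pilot group, and all names and parameter values (4 sub-blocks, NP = 6, ND = 294, all-ones symbols) are assumptions chosen for the example.

```python
import numpy as np

def pilot_groups(n_sub, n_p, n_d):
    """(start, length) of each pilot group for a frame laid out as
    [NP/2][ND][NP][ND]...[NP][ND][NP/2] with n_sub data fields."""
    half = n_p // 2
    groups, pos = [], 0
    for i in range(n_sub + 1):
        length = half if i in (0, n_sub) else n_p
        groups.append((pos, length))
        pos += length + n_d
    return groups

def interpolated_phase(rx, known, groups):
    """Data-aided phase estimate on each pilot group, then linear
    interpolation between successive estimates over the whole frame."""
    centres = np.array([s + (n - 1) / 2 for s, n in groups])
    est = np.array([np.angle(np.sum(rx[s:s + n] * np.conj(known[s:s + n])))
                    for s, n in groups])
    est = np.unwrap(est)  # remove 2*pi jumps between successive estimates
    return np.interp(np.arange(len(rx)), centres, est)

# Noise-free demo with a residual frequency offset of 3e-4 times the symbol rate.
n_sub, n_p, n_d = 4, 6, 294
groups = pilot_groups(n_sub, n_p, n_d)
n = groups[-1][0] + groups[-1][1]           # frame length in symbols
true_phase = 0.3 + 2 * np.pi * 3e-4 * np.arange(n)
tx = np.ones(n, dtype=complex)              # only pilot positions are used as known
rx = tx * np.exp(1j * true_phase)
track = interpolated_phase(rx, tx, groups)
print(np.max(np.abs(track[1:n - 1] - true_phase[1:n - 1])))
```

In this idealised setting the interpolated track follows the linear phase ramp exactly between the first and last pilot groups. With noise and phase noise, the residual error would then be left to the EM-based estimator.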

This algorithm requires an additional computational load to perform phase interpolation; however, it makes it possible to counteract residual frequency errors up to 3 × 10^-4 times the symbol rate. A performance sample of such techniques is given in Figs. 6 and 7.
[Plots: BER versus Es/N0 [dB], from 6.0 to 7.0 dB, for 8PSK with rate-2/3 LDPC code, N = 64800. Fig. 6: sub-block of 300 symbols, pilot density 3%; curves for AWGN and νT = 1 × 10^-4, 3 × 10^-4, 5 × 10^-4. Fig. 7: sub-block of 600 symbols, pilot density 2%; curves for AWGN and νT = 0, 10^-4, 3 × 10^-4.]

Fig. 6 – EM algorithm performance using pilots and sub-block linear interpolation pre-compensation using a 64800 bit rate 2/3 LDPC code with 8PSK, with 300 symbols sub-block and pilot density of 3%.

Fig. 7 – EM algorithm performance using pilots and sub-block linear interpolation pre-compensation using a 64800 bit rate 2/3 LDPC code with 8PSK, 600 symbols sub-block and pilot density of 2%.




3.          Code design and trade-off
Parallel Concatenated Convolutional Codes (PCCC), Serially Concatenated Convolutional Codes (SCCC), and Low Density Parity Check (LDPC) codes have been investigated. All the pre-selected schemes achieve the desired flexibility and maintain excellent performance, very close to the sphere packing lower error bound.


3.1         Serial Concatenated Convolutional Codes

For SCCC, the innovative pragmatic scheme depicted in Figure 8 has been designed. It allows the same interleaver to be maintained with different code rates, with evident advantages in terms of flexibility.
It consists of an outer and an inner encoder, both stemming from the same 4-state, rate ½, recursive, systematic encoder shown in the same figure. For all the SCCC rates but ¼, the outer encoder is punctured to rate 2/3 through the optimal puncturing reported in the figure. The interleaver size is kept fixed to the length N required by the application at hand for all rates and modulation formats. The corresponding information block size is then independent of the SCCC rate, and amounts to Ni=NRo, where Ro=2/3 is the fixed outer code rate. The only exception is the SCCC rate ¼, for which both encoders need to be used without puncturing. In that case, we still keep the information block size fixed, and use an interleaver with size N=2 Ni.
In order to obtain the desired SCCC rate, we perform puncturing at the output of the inner encoder, according to the scheme shown in Figure 8. The upper register at the output of the inner encoder contains the N+2 inner systematic bits, which coincide with the interleaved outer code word plus the 2 bits terminating the inner trellis. The lower register, instead, contains the N+2 parity-check bits generated by the inner encoder. Two different puncturing algorithms are used to puncture bits in the upper and lower registers. Puncturing in the upper register is performed on the N inner systematic bits (excluding the 2 inner code terminating bits, which are always transmitted) according to a puncturing pattern periodic with period 200 trellis steps, which corresponds in our case to 300 outer coded bits.
The systematic bits of the inner encoder correspond to the code word generated by the outer encoder, so the puncturing pattern on these bits has been designed to maximise the free distance of the outer encoder, taking into account that puncturing occurs after interleaving. Since high code rates are also targeted, it is computationally complex to exhaustively optimise the puncturing patterns for the outer encoder. Thus, we used a slightly sub-optimal, yet manageable, search algorithm that works incrementally, in a rate-compatible fashion, so that the punctured positions for a given outer rate are also punctured for all higher rates.
Optimisation of the upper register puncturing pattern involves both the number of bits to be punctured and their position. Denoting by S_sur the number of surviving bits in each 300-bit period of the upper register after puncturing, the outer code “equivalent” rate, as seen from the SCCC output, is then

$$R_o = \frac{200\, n_p - 2}{S_{sur}\, n_p + 2}$$

where n_p is the number of puncturing periods contained in the interleaver, i.e., n_p = N/300.
A design procedure for the outer code puncturing yielding outer code rates in the whole range 200/300 ≤ R_o ≤ 200/201 has been developed for this purpose.
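[Block diagram: information bits u → rate ½ 4-state outer encoder → fixed puncturing (pattern 11/10) → interleaver → rate ½ 4-state inner encoder → separate puncturing of the systematic (S) and parity (P) bits → rate-compatible (RC) interleaver → mapping and modulation (m bits per symbol). The constituent encoder with input u and outputs c1, c2 is shown in the same figure.]

Figure 8: Block diagram of the SCCC scheme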


Puncturing of the bits in the lower register, which contains the parity-check bits generated by the inner encoder, is obtained by applying a rate matching algorithm, and consists in deleting regularly spaced bits according to the rate matching parameter P_sur/(N+2), i.e. the ratio between the number of surviving parity bits P_sur and the overall number of parity bits before puncturing. Denoting by R_SCCC the overall SCCC rate, we can easily derive P_sur as

$$P_{sur} = \frac{2N - 6}{3 R_{SCCC}} - n_p S_{sur} - 2$$
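The two rate expressions can be cross-checked with exact rational arithmetic. The sketch below evaluates them for an illustrative case (N = 3000, no puncturing of the systematic bits, overall rate ½); these values and the function names are assumptions made for the example only. For other combinations the expression for P_sur may not be an integer, and the rounding rule is not specified here.

```python
from fractions import Fraction

PERIOD_OUT = 300    # outer coded bits per puncturing period
PERIOD_INFO = 200   # trellis steps (information bits) per puncturing period

def outer_equivalent_rate(n, s_sur):
    """Equivalent outer rate R_o = (200 n_p - 2) / (S_sur n_p + 2), n_p = N/300."""
    n_p = n // PERIOD_OUT
    return Fraction(PERIOD_INFO * n_p - 2, s_sur * n_p + 2)

def surviving_parity_bits(n, s_sur, r_sccc):
    """P_sur = (2N - 6) / (3 R_SCCC) - n_p S_sur - 2."""
    n_p = n // PERIOD_OUT
    return Fraction(2 * n - 6, 3) / r_sccc - n_p * s_sur - 2

# Illustrative values (assumed): N = 3000, S_sur = 300, R_SCCC = 1/2.
n, s_sur, r = 3000, 300, Fraction(1, 2)
n_p = n // PERIOD_OUT
p_sur = surviving_parity_bits(n, s_sur, r)
info_bits = PERIOD_INFO * n_p - 2
sent_bits = (s_sur * n_p + 2) + p_sur   # surviving systematic + surviving parity
print(outer_equivalent_rate(n, s_sur), p_sur, Fraction(info_bits) / sent_bits)
```

The last printed value recovers the target overall rate, confirming that the two expressions are mutually consistent: the information bits divided by the surviving systematic plus parity bits give R_SCCC.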
The simulation results show that the newly proposed scheme offers good performance over a wide range of code rates, including very high ones. As an example, the FER performance of the designed SCCC codes with code rate 5/6 and codeword lengths of 16384 and 480 bits is reported in Figure 9 for 4-PSK modulation and 50 decoding iterations.
[Plot: FER, from 1.00E+00 down to 1.00E-06, versus Eb/N0 [dB], from 2.5 to 6.5 dB.]

Figure 9: Simulation results for the new proposed SCCC scheme. Rate 5/6, n=16384 and 480 bits.




3.2     Low Density Parity Check Codes

For LDPC, a new class of “modular” parity check matrices has been investigated. These codes have a linear encoder complexity and are particularly suited for parallel implementation of the decoder algorithm. The codes have been optimized for high SNR performance, since FERs as low as 10^-7 are required by most of the considered applications.
To limit the encoding complexity, we impose the following structure on the parity check matrix H of the designed LDPC codes C(n=k+m,k):

$$H = [\,H_p \mid H_d\,]$$

where Hd is an m by k matrix, and Hp is a dual diagonal m by m matrix that is shown below for m=5:

$$H_p = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{bmatrix}$$

The codeword vector x can be split into a systematic part s (with dimension k) and a parity part p (with
dimension m), such that x=(s,p). Encoding is performed by finding the parity vector p that satisfies the
equation H x = 0, which is equivalent to Hp p = v, where v = Hd s (all arithmetic modulo 2).

As a consequence, the m parity check bits can be easily computed by back-substitution:

\[
p_1 = v_1, \qquad p_l = v_l + p_{l-1}, \quad l = 2, \dots, m
\]

and the overall complexity grows linearly with n.
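
The back-substitution can be sketched in a few lines of Python. This is a minimal illustration only, assuming v = Hd s (modulo 2) and a dense list-of-lists representation of Hd; the function name and the toy matrix are hypothetical and are not taken from the designed codes.

```python
def encode_dual_diagonal(Hd, s):
    """Return parity bits p with Hp p = Hd s (mod 2), Hp being dual diagonal."""
    m = len(Hd)
    # v = Hd s over GF(2)
    v = [sum(h * b for h, b in zip(row, s)) % 2 for row in Hd]
    p = [0] * m
    p[0] = v[0]                       # p_1 = v_1
    for l in range(1, m):
        p[l] = (v[l] + p[l - 1]) % 2  # p_l = v_l + p_{l-1}
    return p


# Toy example (m = 3, k = 4); this Hd is arbitrary and for illustration only.
Hd = [[1, 0, 1, 1],
      [0, 1, 1, 0],
      [1, 1, 0, 1]]
s = [1, 0, 1, 1]
p = encode_dual_diagonal(Hd, s)
print(p)
```

Each parity bit costs one addition on top of the computation of v, so the work per codeword is proportional to the number of non-zero entries of Hd plus m.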

The random part of the matrix, Hd, has been designed by properly distributing smaller sub-matrices, a structure
that helps parallel decoder architectures. Each sub-matrix can be:
    • An all-zero matrix
    • A matrix obtained by shifting the m by m diagonal matrix
The positions of the non-zero matrices and their respective cyclic shifts are chosen using the Progressive Edge
Growth (PEG) algorithm [8]. Every time a non-zero matrix is added, this algorithm chooses its position and
cyclic shift so that the girth (length of the shortest cycles) of the graph is as large as possible.
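
The block structure described above can be illustrated by expanding a small "base" matrix of shifts into a binary matrix. This is a sketch under stated assumptions: the block size `z`, the helper names and the shift values are hypothetical, and the shifts are simply written by hand here rather than selected by the PEG algorithm.

```python
def shifted_identity(z, shift):
    """z-by-z identity matrix whose ones are cyclically shifted by `shift` columns."""
    return [[1 if (r + shift) % z == c else 0 for c in range(z)] for r in range(z)]


def expand_base_matrix(base, z):
    """Expand a base matrix into a binary matrix built from z-by-z blocks.

    base[i][j] is None for an all-zero block, or an integer cyclic shift.
    """
    rows = []
    for base_row in base:
        blocks = [
            [[0] * z for _ in range(z)] if sh is None else shifted_identity(z, sh)
            for sh in base_row
        ]
        for r in range(z):
            rows.append([bit for blk in blocks for bit in blk[r]])
    return rows


# Hypothetical 2-by-3 base matrix with block size z = 3.
base = [[0, None, 2],
        [None, 1, 0]]
Hd = expand_base_matrix(base, z=3)
```

Because every non-zero block is a permutation matrix, a decoder can process the z rows of one block row in parallel without memory conflicts, which is the property that makes this structure attractive for parallel architectures.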
The performance of the designed LDPC codes is very good, showing a limited penalty with respect to theoretical
lower bounds. As an example, the FER performance of the designed LDPC codes with code rate 5/6 and codeword
lengths of 16384 and 480 bits is reported in Figure 10 for 4-PSK modulation and 50 decoding iterations.


[Figure 10 plot, titled "LDPC RATE=5/6 MOD=4PSK": FER (1.00E+00 down to 1.00E-08) versus Eb/No (2.5 to 8.5).
Curves: USparse n=480 and Mod(16384,13568).]

           Figure 10: Simulation results for the newly proposed LDPC scheme. Rate 5/6, n=16384 and 480 bits.



The decoding complexity of the designed coding schemes, which is essential for a fair comparison, has been
analyzed and discussed in depth in [9]. This aspect is fundamental for our applications, owing to the very high
data rates and the required flexibility.


3.3        Parallel Concatenated Convolutional Codes

A very flexible error-correcting turbo code has been devised for adaptive coding and modulation in high-throughput
satellite applications. This code, nicknamed TurboΦ, is derived from the extension of the DVB-RCS turbo code [10]
to 16 states. TurboΦ is intended to offer near-Shannon performance on the Gaussian channel for most combinations
of block size, coding rate (from 1/4 up to 8/9) and associated modulation. The FER (frame error rate) versus Eb/N0
curve is expected not to display any marked change in slope (flattening) down to 10^-7 in most situations. Besides
its high degree of versatility with respect to the system requirements, and thanks to the very simple generic model
of its internal permutation, the specification of TurboΦ lends itself naturally to a high level of parallelism in
the associated decoder. This would make it possible to reach throughputs as large as 50 Mbit/s on FPGAs such as the
Virtex2-8000.

TurboΦ is depicted in Fig. 11. The encoder is a parallel concatenation of two 16-state duo-binary circular recursive
systematic convolutional (CRSC) encoders, fed by blocks of k bits (N = k/2 couples). Both component encoders
have identical features: polynomials 23 (recursivity, period 15) and 35 (redundancy), first bit (A) on tap 1, second
bit (B) on taps 1, D and D^3. The use of duo-binary component encoders, already justified in several papers (see for
instance [11-13]), offers better convergence of the iterative decoding process and larger minimum Hamming
distances (MHDs) for medium and high coding rates (1/2 ≤ R). The circularity principle, that is, encoding in such a
way that the final state of the register is equal to the initial one, allows each component convolutional code to
become a perfect block code, without any need for additional information (no tail bits) and without any side effect.
While the former point is of interest only for short blocks, the latter is crucial for the MHD at all block sizes.
Circular termination protects all the bits (or couples, in the case of duo-binary codes) at the same level, including
those in the starting and finishing portions of the block. Since all bits (or couples) are uniformly encoded and all
of them benefit from the whole set of redundancy, there is no insidious effect that would drastically reduce the MHD.
Because no special care has to be taken in the permutation design about the starting and finishing parts of the
block, circular termination considerably simplifies the search for good permutations when the target error rate is
low. One restriction on the use of the circularity principle is that the block length (i.e. N couples) must not be a
multiple of the period of the pseudo-random generator from which the CRSC encoder is derived. With polynomial 23 for
the encoder, N cannot be a multiple of 15 unless stuffing symbols are used. To this condition we add another one:
N has to be a multiple of 4, because of the permutation law detailed below. All the blocks considered are then
counted in bytes.
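
The two conditions on N can be checked mechanically. The sketch below first measures the period of the autonomous shift register with recursion polynomial 23 (taken here as G(D) = 1 + D^3 + D^4), then tests whether a block of k bits gives an admissible N. Function names and the register model are assumptions made for illustration; stuffing symbols are not modelled.

```python
def register_period(taps=(3, 4), memory=4):
    """Period of the autonomous register with feedback taps at the given delays.

    taps=(3, 4) corresponds to G(D) = 1 + D^3 + D^4 (polynomial 23 in octal).
    """
    start = (1,) + (0,) * (memory - 1)
    state, steps = start, 0
    while True:
        fb = 0
        for t in taps:
            fb ^= state[t - 1]          # XOR of the tapped register cells
        state = (fb,) + state[:-1]      # shift the register by one position
        steps += 1
        if state == start:
            return steps


def block_length_ok(k_bits, period=15):
    """True if N = k/2 couples is a multiple of 4 and not a multiple of the period."""
    if k_bits % 2:
        return False
    n_couples = k_bits // 2
    return n_couples % 4 == 0 and n_couples % period != 0


print(register_period())        # 15
print(block_length_ok(1504))    # True: N = 752
print(block_length_ok(120))     # False: N = 60 is a multiple of 15
```

The block sizes used in the simulations reported in this section (k = 408, 816, 1224 and 1504) all pass this check.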


[Figure 12 block diagram: data couples (A, B), N = k/2 couples, feed the systematic part and component encoder 1
(register cells s1 to s4) directly, and component encoder 2 through the permutation Π (N). The redundancy outputs
Y1 and Y2 pass through puncturing and are combined with the systematic part to form the codeword.]

Figure 12: The TurboΦ encoder (R ≥ 1/2): a parallel concatenation of two duo-binary 16-state circular recursive
systematic convolutional (CRSC) encoders. Permutation Π is carried out at the couple level, and couples (A,B) are
permuted to become (B,A) before second encoding.


The choice of the permutation, whose design has a strong impact on the MHD value and therefore on the asymptotic
performance, rests on a trade-off between three concerns. It is well known that a regular permutation is quite
appropriate with respect to error patterns of weight 2 or 3, and more generally with respect to error patterns that
are not composite [13]. The permutation rule therefore has to be almost regular. For composite error patterns, that
is, patterns that link several return-to-zero (RTZ) sequences (see footnote 2) in both dimensions, some disorder has
to be instilled. However, this disorder has to be handled carefully, to prevent non-composite patterns from
dominating again. A short disorder cycle (typically 4) seems to be sufficient to obtain good MHDs. This low degree of
disorder is also helpful for qualifying the permutation, because the code has a small period, with a limited number
of cases to be checked. Furthermore, the simplicity of the permutation law enables easy on-the-fly reconfiguration
on both the encoder and decoder sides.
The TurboΦ permutation is defined as follows:

i (i = 0, …, N-1) is the couple address in the natural order; j (j = 0, …, N-1) is the couple address in the
permuted order.

-            if j = 0 mod. 4, then Q = 0;
-            if j = 1 mod. 4, then Q = 4Q1;
-            if j = 2 mod. 4, then Q = 4P + 4(Q2 + 1);                                      (1.1)
-            if j = 3 mod. 4, then Q = 4P + 4(Q3 + 1).

Finally,      i = Π(j) = Pj + Q + 3 mod. N                                                           (1.2)

P is an integer, relatively prime with N. Q1, Q2 and Q3 are small integers (from 0 up to 8). The four parameters of
the permutation law have to be determined for each block size.

2
    An RTZ sequence is any finite input sequence which makes the encoder quit state 0 and makes it go back to this
     state. Using the D operator, and denoting G(D) the recursivity generator of the encoder (G(D) = 1 + D^3 + D^4
     for polynomial 23), all RTZ sequences are multiples of G(D).
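
The permutation law (1.1)-(1.2) translates directly into code. The sketch below is illustrative only: the function name is hypothetical, and the parameter values P = 19, Q1 = 3, Q2 = 1, Q3 = 2 are arbitrary choices that merely satisfy the stated constraints, not parameters of the actual TurboΦ specification.

```python
from math import gcd


def turbo_phi_permutation(N, P, Q1, Q2, Q3):
    """Return the list [Π(0), ..., Π(N-1)] with Π(j) = (P*j + Q + 3) mod N."""
    if N % 4 != 0:
        raise ValueError("N must be a multiple of 4")
    if gcd(P, N) != 1:
        raise ValueError("P must be relatively prime with N")
    perm = []
    for j in range(N):
        r = j % 4
        if r == 0:
            Q = 0
        elif r == 1:
            Q = 4 * Q1
        elif r == 2:
            Q = 4 * P + 4 * (Q2 + 1)
        else:
            Q = 4 * P + 4 * (Q3 + 1)
        perm.append((P * j + Q + 3) % N)
    return perm


# k = 1504 bits gives N = 752 couples; the parameter values are hypothetical.
perm = turbo_phi_permutation(N=752, P=19, Q1=3, Q2=1, Q3=2)
print(perm[:4])
```

Since every Q is a multiple of 4 and P is odd whenever it is relatively prime with a multiple of 4, the mapping is one-to-one: two addresses can only collide if they are congruent modulo 4, in which case they share the same Q and the collision would contradict gcd(P, N) = 1. Only the four parameters have to be stored for each block size, and the addresses are generated on the fly.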

Fig. 13 compares the performance of TurboΦ and the DVB-RCS turbo code for a particular case (k = 1504, R = 1/2, 2/3
and 3/4). The average improvement for these medium coding rates is about 1 dB or more at FER = 10^-7. The gain
may be significantly larger for higher rates. Fig. 14 shows the performance obtained by associating TurboΦ with
three different modulations, in order to achieve spectral efficiencies of 1, 2 and 3 (bit/s)/Hz, respectively, for
short blocks. The association is made under the so-called pragmatic [14] or BICM [15] conditions. The results show
in particular the absence of any significant change in slope (flattening) down to FER = 10^-7.
[Figure 13 plot: FER (1.0e+00 down to 1.0e-08) versus Eb/N0 (dB) (-1.5 to 4.5). Curves: 8-state and 16-state codes
for R=1/2, R=2/3 and R=3/4.]

Figure 13: Performance comparison between TurboΦ (16-state) and DVB-RCS turbo code (8-state) for QPSK. k =
1504, R = 1/2, 2/3 and 3/4. Max-Log-MAP component algorithm, 8 iterations, 4-bit quantization.




Figure 14: Performance obtained from the association of TurboΦ and QPSK (R = 1/2, k = 408), 8-PSK (R = 2/3,
k = 816) and 16-QAM (R = 3/4, k = 1224). Pragmatic coded modulation. Max-Log-MAP component algorithm, 8
iterations, 6-bit quantization.
3.4            Preliminary coding schemes trade-off
   In previous sections we have described candidate coding schemes drawn from the class of PCCC, SCCC and
LDPC codes, and provided some simulation results. In this section, we will propose a few criteria used as
“goodness” measures to rank the proposed schemes and draw some preliminary conclusions. We have used the
following criteria to compare and rank the described codes:

   1.   Performance. We will use as performance measure the Eb/N0 required to achieve a FER=10^-4 for the
        information block size of 428 and code rates 1/3 and 5/6, and a FER=10^-6 for the code word block size of
        16,384, all rates, and for the information block size of 428 and rate 9/10.
   2.   Complexity. As complexity measure we will use the numbers of elementary operations (arithmetic
        complexity) and memory requirements (memory complexity) per decoded bit per iteration.
   3.   Flexibility. We will comment on the flexibility of the proposed schemes as their ability to cope with the
        stringent requirement of modifying the rate, block size and modulation on a per frame basis.
   4.   Number of iterations required. This number impacts on the complexity parameter, essentially as a
        multiplicative parameter of the estimated arithmetic complexity. We prefer, however, to keep it as a
        separate item in this preliminary stage, since its effect on the overall complexity depends on the chosen
        decoder architecture.
   5.   IPR issues.
   6.   Maturity. We will comment on the maturity of the proposed schemes in terms of existing implementations
        and general industrial acquaintance with them.

  The FER performance of the three coding schemes with QPSK modulation is compared in Figs. 15 and 16.
[Figure 15 plot, titled "Simulation results, k=428, 4PSK-modulation": FER (1.00E+00 down to 1.00E-06) versus
Eb/N0 [dB] (0 to 10). Curves: SCCC, LDPC and PCCC for rates 1/3, 5/6 and 9/10.]

      Figure 15. FER performance of PCCC, SCCC and LDPC for the short block size and all three code rates.
[Figure 16 plot, titled "Simulation results n=16384, 4PSK-modulation": FER (1.00E+00 down to 1.00E-06) versus
Eb/N0 [dB] (0 to 5). Curves: SCCC, LDPC and PCCC for rates 1/3, 5/6 and 9/10.]

     Figure 16. FER performance of PCCC, SCCC and LDPC for the long block size and all three code rates.

In Table 1, we summarize the comparison criteria for the three classes of codes. The table reports for the short and
long block codes the required Eb/N0 for the target FER, the Eb/N0 loss with respect to the best code in dB, the
minimum distance and the number of nearest neighbours when available. In terms of complexity, we report for the
short and long block codes the RAM and ROM memory occupations, the number of sums and max/max* operators
per decoded bit, the number of iterations, and, finally, the evaluation of the other comparison criteria.

The performance comparison
For the short block size, the PCCC scheme offers the best performance. It yields a gain from 0.3 dB (rate 1/3) to
1.05 dB (rate 9/10) over SCCC, and from 0.5 dB (rate 1/3) to 1 dB (rate 9/10) over LDPC. For the long block
size, the LDPC scheme offers the best performance, close to that of the SCCC (gains from 0.15 to 0.25 dB). Neither
scheme shows an error floor. The PCCC scheme, on the other hand, shows a floor for all three rates. For rate 1/3,
this leads to a loss of 0.45 dB with respect to LDPC and 0.3 dB with respect to SCCC. For the higher rates, the
floor is so pronounced that it is difficult, and questionable in the absence of results at FER=10^-6, to measure
the losses.
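
The "loss to best" figures follow directly from the required Eb/N0 values in Table 1: for each block size and rate, the loss of a code is its required Eb/N0 minus the smallest required Eb/N0 among the three codes. The sketch below reproduces this arithmetic for the Table 1 entries that are not flagged with a question mark; the data layout and function name are illustrative.

```python
# Required Eb/N0 in dB at the target FER, copied from Table 1
# (only entries that the table does not mark as uncertain).
EBN0_DB = {
    ("short", "1/3"): {"PCCC": 2.0, "SCCC": 2.3, "LDPC": 2.5},
    ("short", "5/6"): {"PCCC": 5.4, "SCCC": 6.0, "LDPC": 5.9},
    ("long", "1/3"): {"PCCC": 1.2, "SCCC": 0.9, "LDPC": 0.75},
}


def loss_to_best(ebn0_by_code):
    """Loss in dB of each code with respect to the best (lowest Eb/N0) code."""
    best = min(ebn0_by_code.values())
    return {code: round(value - best, 2) for code, value in ebn0_by_code.items()}


for case, values in EBN0_DB.items():
    print(case, loss_to_best(values))
```

For the long block at rate 1/3 this gives 0.45 dB for PCCC and 0.15 dB for SCCC relative to LDPC, matching the values quoted in the text.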

The complexity comparison
In terms of complexity, the SCCC scheme is the simplest in terms of RAM occupation for all block sizes and rates.
The LDPC scheme requires much more memory. In terms of ROM memory (recalling that the area occupation of a
ROM is roughly 1/5 of that of a RAM of the same size), the PCCC scheme is the best, since it does not require
storing the permutation, but only a few parameters from which the permuted addresses are derived on the fly. In the
table we report the ROM occupation for one PCCC block size. This memory should be multiplied by the number of
permutations required according to the frame flexibility.
As to the arithmetic complexity, the SCCC scheme is by far the simplest. It is roughly 4-5 times simpler than the
PCCC scheme when both use the MAX* operation, and 2-3 times less complex when the PCCC uses the MAX
operation (notice that the added complexity of the LUT is almost irrelevant). The LDPC scheme is 5-6 times more
complex. Notice also that for high data rates requiring a highly parallel architecture, the differences in area
due to the arithmetic operations become even more significant.
                            PCCC - TurboΦ                    SCCC                             LDPC
Criterion                   1/3        5/6        9/10       1/3        5/6        9/10       1/3        5/6        9/10

Performance, short block
  Eb/N0 @ target FER [dB]   2          5.4        6.7        2.3        6          7.75       2.5        5.9        8?
  Loss to best [dB]         0          0          0?         0.3        0.6        1.05       0.5        0.5        1?
  dmin, Nmin                36-107     5-32.4     4-337      30-1       5-1        4-3        X          X          X

Performance, long block
  Eb/N0 @ target FER [dB]   1.2        3.9??      5.2?       0.9        3.4        4.35       0.75       3.25       4.1
  Loss to best [dB]         0.45       0.65??     1.1?       0.15       0.15       0.25       0          0          0
  dmin, Nmin                66-1366    11-1365    7-590      66-2       7-1        7-1        X          X          X

Complexity, short block
  RAM Memory                15,683     7,059      6,869      10,914     7,059      6,869      34,413     12,641     10,435
  ROM Memory                3,582      1,712      1,712      6,420      6,420      6,420      47,988     15,829     12,661
  SUM                       2,050      2,050      2,050      730        730        730        950        350        250
  MAX* (MAX)                1580       1580       1580       370        370        370        1600       900        700
  Number of iterations      10         10         10         10         10         10         50         50         50

Complexity, long block
  RAM Memory                196,608    225,276    236,742    139,260    225,276    236,742    552,313    614,137    630,713
  ROM Memory                70,997     88,745     95,843     106,490    307,193    331,763    1,142,383  1,292,527  1,332,783
  SUM                       2,050      2,050      2,050      730        730        730        1,250      550        550
  MAX* (MAX)                1580       1580       1580       370        370        370        2500       1600       1500
  Number of iterations      10         10         10         10         10         10         50         50         50

Other issues
  Flexibility               PCCC: Sensitive to interleaver optimization (8); SCCC: High (10); LDPC: Code must be designed for all cases (5)
  IPR issues                PCCC: YES; SCCC: NO; LDPC: NO
  Maturity                  PCCC: Very High (10); SCCC: High (8); LDPC: Medium (6)

   Table 1: Summary of complexity and performance comparison for the three classes of PCCC, SCCC, and LDPC.

The flexibility comparison
   Flexibility refers here to the ability of the co-decoder to adapt to different code rates, modulation schemes and
block sizes on a per frame basis, and to the related complexity/performance consequences. All schemes can adapt
easily to variations in the modulation scheme, as they are inherently pragmatic schemes, working with binary
co-decoders fed by proper projections of the likelihood functions of the received soft symbols onto the binary
log-likelihood ratios of the bits that form a modulation symbol.
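
As an illustration of this projection, the sketch below computes per-bit log-likelihood ratios from one received symbol using the max-log approximation. The approximation, the Gray-style QPSK labelling and all names (`maxlog_bit_llrs`, `QPSK`, `noise_var`) are assumptions made for the example; the text does not specify how the projection is implemented in the modem.

```python
import math


def maxlog_bit_llrs(r, constellation, bits_per_symbol, noise_var):
    """Max-log bit LLRs, LLR_b ~ ln P(b=0|r) - ln P(b=1|r), for received sample r.

    constellation[label] is the complex point whose binary label is `label`;
    noise_var is the variance of the complex Gaussian noise.
    """
    llrs = []
    for b in range(bits_per_symbol):
        d0 = min(abs(r - s) ** 2 for lab, s in enumerate(constellation)
                 if not (lab >> b) & 1)
        d1 = min(abs(r - s) ** 2 for lab, s in enumerate(constellation)
                 if (lab >> b) & 1)
        llrs.append((d1 - d0) / noise_var)
    return llrs


# Unit-energy QPSK: bit 0 sets the sign of the real part, bit 1 the imaginary part.
QPSK = [((1 - 2 * (lab & 1)) + 1j * (1 - 2 * ((lab >> 1) & 1))) / math.sqrt(2)
        for lab in range(4)]

llrs = maxlog_bit_llrs(QPSK[0], QPSK, bits_per_symbol=2, noise_var=1.0)
print(llrs)   # both values positive: the label 00 is the most likely one
```

Only the constellation table and the number of bits per symbol change from one modulation to another; the binary decoder that consumes the LLRs is unaffected.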
    The SCCC scheme adapts very easily to different code rates. It only requires the definition of the puncturing and
rate matching parameters (performed off line once and for all), to be stored in a memory table. An almost continuous
variation of the code rate from 1/4 to k/(k+1) can be supported. The same is true for the PCCC scheme.
   For the LDPC codes, adaptation to different code rates requires in general working with different codes, which
have to be stored. Also, the decoding engine, although general in terms of elementary and repetitive operations, must
be able to work with possibly different numbers of edges entering and leaving the variable and parity nodes.

The IPR issues comparison
In this case, the situation is rather different for the various schemes. PCCC are protected by several France
Telecom patents. For the LDPC codes, we have examined two patents by Flarion. From the analysis of these patents, we
feel that our LDPC designs are not covered by them. The SCCC structure and decoding algorithm are free from
patent protection. For these reasons, we inserted a YES or NO in the corresponding row of Table 1.

The maturity comparison
PCCC have already been implemented in numerous cases, and accepted as standards in several system applications. The
only novelty concerns perhaps the parallel architectures needed to reach high data rates. SCCC have seen fewer
implementations, but their technology should not pose any further problems. LDPC are relatively newer in terms of
applications, although their acceptance in the new DVB-S2 standard should accelerate the full comprehension and
solution of implementation problems. Also in this case, we have attempted to grade from 1 to 10 the technological
maturity of the three schemes in Table 1.
4.       Conclusions and outlook
The paper presented preliminary results from the on-going Phase 1 of the ESA-funded MHOMS project, aiming at the
study, design and development of an ultra-high-speed, high-performance, fully reconfigurable digital modem. The
MHOMS modem will be able to cover future needs of telecommunications and Earth Observation missions and will
feature state-of-the-art coding, modulation, demodulation, synchronization and decoding algorithms. The MHOMS
technological development will also lay the foundation (building blocks) of a new class of modems able to exploit
adaptive coding and modulation. During the remaining part of the Phase 1 activity, the trade-off among the described
candidate coding, modulation, pre and demodulation/synchronisation techniques will be finalised and complemented by
architectural design considerations. Possible physical layer improvements for the return link of interactive
satellite networks covered by the current DVB-RCS standard will also be proposed.

During Phase 2 of the activity, the MHOMS prototype will be designed, assembled and tested in a laboratory set-up
that includes a satellite channel simulator, with the aim of demonstrating the most challenging modem operating modes.


REFERENCES

[1] R. De Gaudenzi, R. Rinaldo, “Adaptive Coding and Modulation for Next Generation Broadband Multimedia
Systems”, in the Proc. of the 20th AIAA Satellite Communication Systems Conference, Montreal, May 2002, AIAA-
paper 2002-1863.
[2] R. De Gaudenzi, R. Rinaldo, A. Perez Carro-Rios, “Adaptive Coding and Modulation for the Reverse Link of Next
Generation Ka-band Multimedia Systems”, in the Proc. of the Ka-band 8th Utilization Conference, Baveno, Italy, Sept.
25-27, 2002.
[3] Digital Video Broadcasting (DVB), “Second generation framing structure, channel coding and modulation systems
for Broadcasting, Interactive Services, News Gathering”, DVB-S2 draft document DVBS2-74, June 2003. Available on
the DVB organisation web site (http://www.dvb.org) [DVB-S2 ftp directory]
[4] R. De Gaudenzi, A. Guillen i Fabregas, A. M. Vicente, and B. Ponticelli, “High Power and Spectral Efficiency
Coded Digital Modulation Schemes for Nonlinear Satellite Channels”, in Proc. 7th International ESA Workshop on
Digital Signal Processing Techniques for Space Applications, Sesimbra, Portugal, October 2001.
[5] L. Giugno, V. Lottici, M. Luise, “Adaptive Compensation of Nonlinear Satellite Transponder for Uncoded and
Coded High-Level Data Modulations”, submitted to IEEE Transactions on Wireless Communications, November 2002.
[6] G. Karam and H. Sari, “A Data Predistortion Technique with Memory for QAM Radio Systems.” IEEE
Transactions on Communications, Vol. 39, No. 2, pp 336-344, February 1991.
[7] V. Lottici, M.Luise, “Embedding Carrier Phase Recovery into Iterative Decoding of Turbo-Coded Linear
Modulations” to appear on IEEE Trans. Comm. 2003.
[8] X.-Y. Hu, E. Eleftheriou, D.-M. Arnold, “Progressive edge-growth Tanner graphs”, in Proc. IEEE Global
Telecommunications Conference (GLOBECOM '01), Vol. 2, 2001, pp. 995-1001.
[9] Politecnico di Torino, ESA MHOMS project WP 1320 activity report (Modems for High-Order Modulation Schemes
Contract No. 16593/02/NL/EC).
[10] DVB, "Interaction channel for satellite distribution systems", ETSI EN 301 790, V1.2.2, pp. 21-24, Dec. 2000.
[11] C. Berrou and M. Jézéquel, "Non binary convolutional codes for turbo coding", Elect. Letters, Vol. 35, N° 1, pp.
39-40, Jan. 1999.
[12] M. Bingeman and A. K. Khandani, "Symbol-based turbo codes", IEEE Comm. Letters, Vol. 3, N° 10, pp. 285-287,
Oct. 1999.
[13] C. Berrou, "Turbo codes: some simple ideas for efficient communications", ESA DSP '2001, Lisbon, Oct. 2001 and
ESA TTC '2001, Noordwijk, The Netherlands, Oct. 2001.
[14] S. Le Goff, A. Glavieux and C. Berrou, "Turbo-codes and high spectral efficiency modulation", Proc. of IEEE
ICC'94, pp. 645-649, New Orleans, May 1994.
[15] G. Caire, G. Taricco, and E. Biglieri, “Bit-interleaved coded modulation”, IEEE Trans. Inform. Theory, Vol. 44,
No. 3, pp.927-946, May 1998.

				