A modem is a piece of computer hardware that translates a computer's digital signals into pulses that can be transmitted along an ordinary telephone line. A modem at the other end of the line receives these pulses and translates them back into a form the computer can understand. This simple process completes the communication between two computers.
High Speed Modem Concepts and Demonstrator for Adaptive Coding and Modulation with High Order in Satellite Applications¹
C. Berrou(6), R. De Gaudenzi(1), C. Douillard(6), G. Gallinaro(5), R. Garello(3), D. Giancristofaro(2), A. Ginesi(1), M. Luise(4), G. Montorsi(3), R. Novello(2), A. Vernucci(5)
(1) European Space Agency (2) Alenia Spazio (3) Politecnico di Torino (4) Università di Pisa (5) Space Engineering (6) ENST Bretagne
1. Introduction
Within the satellite communications innovation trend, some key issues, such as:
• usage of increasingly high frequency bands (e.g., from X and Ku up to Ka and Q);
• the higher frequency reuse achievable with multi-beam satellite antenna technology;
• the increasing satellite RF power becoming available thanks to platform technology improvements;
• the exploitation of Adaptive Coding and Modulation (ACM) techniques, in addition to power control, to reduce the static link margin and match the physical layer to the location- and time-dependent SNIR [1], [2];
are shifting the focus from classical satellite modulation schemes, such as QPSK, to higher-order M-ary modulation schemes. The latter can provide a higher spectral efficiency and thus the data rate required either for multi-media applications or for applications such as point-to-point high data rate backbone connectivity and future Earth Observation missions requiring downlink data rates exceeding 1 Gb/s. Usage of the higher frequency bands, while providing improved bandwidth and clear-sky link budget for a given user station antenna dish size, implies the need to cope with an increased depth of fading events as well as a higher level of local oscillator phase noise. These problems can be mitigated by the exploitation of Fade Mitigation Techniques (FMT), such as gateway site diversity, ACM for the physical layer and power control, and of robust demodulator synchronization techniques, possibly inserting pilot symbols.
Notwithstanding the above technological improvements, satellite RF power still represents the key cost driver of satellites and calls for efficient on-board high-power amplification. To achieve the required power efficiency, highly efficient coding shall be associated with a wide range of (high-order) modulation schemes, optimised for operation over the non-linear satellite channel. On the demodulator side, the challenge is to cope with the high-speed link requirements and to extract synchronisation and channel estimates accurate enough to avoid impairing the data recovery process, even in the presence of challenging phase noise disturbances, at the low Es/N0 imposed by state-of-the-art coding solutions. This paper reports the preliminary outcomes of the Modem for Higher Modulation Schemes (MHOMS) project funded under the Technology Research Program of the European Space Agency (ESA). The MHOMS program is aimed at the research, design, development and demonstration of an innovative high-rate, very high-speed, on-the-fly re-configurable satellite digital modem prototype, with a maximum bit rate of 1 Gbps, supporting a wide range of spectral efficiencies (from 0.5 to 5.4 bps/Hz). The modem performance requirements for all operational modes (spectral efficiencies) are set to a very small distance from the Shannon capacity bound (about 1 dB max). Furthermore, the specified losses for modem operation over a typical Ka-band satellite nonlinear channel shall be low, calling for innovative modulation/demodulation scheme design. The subject of the demonstration activity is the core HW section of the physical layer, expected to achieve unprecedented throughput/efficiency maximisation in the satellite communication link, requiring a highly innovative design and development approach. This is particularly true for the demodulator/decoder sub-system, which will require truly innovative high-speed architectures.
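To give a feel for the requirement stated above (operation within about 1 dB of the Shannon bound for spectral efficiencies from 0.5 to 5.4 bps/Hz), the following minimal sketch evaluates the unconstrained AWGN capacity limit, Es/N0 = 2^η − 1, assuming ideal Nyquist signalling so that η bits per symbol equals η bit/s/Hz. The function name and the sampled efficiencies are illustrative assumptions; the bound actually used in the project may additionally account for the constellation and the finite block length.

```python
import math

def shannon_limit_db(eta):
    """Return (Es/N0, Eb/N0) in dB at AWGN capacity for spectral efficiency eta.

    Assumes the unconstrained-input bound eta = log2(1 + Es/N0).
    """
    es_n0 = 2.0 ** eta - 1.0   # linear Es/N0 needed to carry eta bits/symbol
    eb_n0 = es_n0 / eta        # energy per information bit
    return 10 * math.log10(es_n0), 10 * math.log10(eb_n0)

# Illustrative sample points inside the 0.5 to 5.4 bps/Hz range quoted in the text.
for eta in (0.5, 1.0, 2.0, 3.0, 5.4):
    es_db, eb_db = shannon_limit_db(eta)
    print(f"eta = {eta:3.1f} bps/Hz: Es/N0 >= {es_db:6.2f} dB, "
          f"Eb/N0 >= {eb_db:5.2f} dB, 1 dB margin target = {es_db + 1.0:6.2f} dB")
```

The spread of these limits across the efficiency range indicates how wide an Es/N0 span an ACM modem of this kind has to cover.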
The hardware demonstrator will include, together with the modem, a channel simulator comprising a non-linearity hardware simulator, programmable phase noise injection, and adjacent and co-channel interference injection. The current contribution focuses on the results of study phase 1 (currently ongoing), which aims at the specification of flexible yet power- and spectrum-efficient (de)coding, (de)modulation and synchronization techniques able to satisfy the challenging requirements set forth. Phase 2 will lead to the detailed design, implementation and test of the demonstrator. The envisaged MHOMS modem application scenarios will encompass as a minimum:
• High-speed Distributed Internet Access;
• Trunk Connectivity (Backbone/Backhaul);
• Earth Observation high-speed downlink;
• Point-to-multipoint applications (e.g. high-speed multicasting/broadcasting).
Among the above application scenarios, particular commercial interest is expected for the broadcasting/multicasting and Internet access applications, for which the team is actively participating in the DVB-S2 standardisation group [3]. With respect to the new DVB-S2 standard (limited to the forward link), the MHOMS work also encompasses more advanced functionalities, such as satellite beam hopping, requiring a different frame structure and an enhanced reverse link compared to the current DVB-RCS.
¹ The present contribution describes the achievements obtained within the ESA funded MHOMS program (Modems for High-Order Modulation Schemes, Contract No. 16593/02/NL/EC) led by Alenia Spazio as prime contractor.
2. Modulation schemes and signal detection/synchronisation techniques
Together with state-of-the-art coding, addressed below, new approaches to the issue of modulation and signal detection/synchronization have been taken into account.
Two main aspects were considered in this respect: non-linear distortions introduced by the High-Power Amplifier (HPA) on board the satellite on one hand, and robust low-SNIR synchronization on the other.
2.1 Modulation formats selection
Concerning non-linear distortions, two basic approaches were investigated. The first one is proper design of the constellations to be used. Traditional square- or cross-QAM constellations have a high peak-to-average power ratio, which causes non-negligible AM/AM and AM/PM distortions when the signal is amplified by the HPA. This remark led to the “rediscovery” and optimisation of different multi-level Amplitude- and Phase-Shift Keying (APSK) constellations based on concentric “rings” of equi-spaced points, such as the 4+12-APSK (an inner 4PSK constellation surrounded by an outer 12PSK) shown in Figure 1, and so on for 32-APSK and 64-APSK [4]. In 4+12-APSK, for instance, the modulation symbols bear only two different amplitudes, thus minimising envelope fluctuations in the transmitted signal. This in turn results in lower distortion of the HPA-amplified received signal. With proper capacity-based optimisation of the APSK constellation parameters, APSK showed superior performance over nonlinear satellite channels and almost identical performance compared to QAM over the linear AWGN channel. Constellation optimisation based on the minimum Euclidean distance (see Figure 1-a), although simpler, provides sub-optimum results in some cases. The 16-APSK and 32-APSK constellations proposed by the team have been retained by the DVB-S2 standard [3]. The second approach to minimize the effect of nonlinearity is adaptive constellation (data) pre-distortion in the transmitter. Pre-distortion means intentionally modifying the location of the data symbols on the complex plane with respect to their nominal position.
The criterion is simple: just try to obtain at the receiver matched filter output a constellation that, in the average, looks like the ideal one you would get in the absence of distortions, by means of appropriately pre-distorting the constellation points at the transmitter. Such a technique only calls for a modification of the transmitted constellation points, without the need to resort to analogue devices. This is particularly straightforward and effective for circular constellations such as APSK. The drawback of this technique is the need for the Earth station to have knowledge on how to pre-distort its own transmitted signal; this depends on how the whole communication chain distorts the signal; the required information can be easily obtained a priori or a posteriori if the uplink ground station receives its own signal in the downlink. If required, the adaptation is carried out in a training phase on known (pilot) symbols, not necessarily time-contiguous, that can be periodically repeated according to the (slow) dynamics of the HPA characteristics drift [5]. Different pre-distortion schemes were investigated, based either on “instantaneous” evaluation of the distortion at the receiver (adaptive static pre-distortion), or on the consideration of a certain amount of “memory” in the combined phenomenon of non-linear distortion plus matched filtering at the receiver (so called adaptive dynamic pre-distortion) [6]. In particular, the latter approach can compensate both for the non-linear distortion (mainly caused by the chain composed by transmitter filter + HPA + receiver filter) and for the linear distortion as well (caused by the satellite IMUX and OMUX filters at the input and the output of satellite transponder, respectively). This makes the use of a decision feedback equalizer (DFE) at the receiver side unnecessary. 
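The idea of static (memoryless) data pre-distortion described above can be illustrated with a deliberately simplified sketch. It assumes that the HPA follows a Saleh-type AM/AM and AM/PM model with commonly quoted parameters and that the model is known exactly, so the pre-distorted points are found by direct inversion. This is not the adaptive, pilot-trained scheme of the paper, it ignores the transmit/receive filters and their memory, and all names and constants below are illustrative assumptions.

```python
import cmath
import math

# Assumed Saleh TWTA model parameters (illustrative, not the measured HPA of the paper).
A_A, B_A = 2.1587, 1.1517      # AM/AM coefficients
A_P, B_P = 4.0033, 9.1040      # AM/PM coefficients (radians)
R_SAT = 1.0 / math.sqrt(B_A)   # input amplitude at which the AM/AM curve peaks

def hpa(x):
    """Memoryless nonlinearity: amplitude compression plus amplitude-dependent phase shift."""
    r = abs(x)
    gain = A_A / (1.0 + B_A * r * r)
    phase = A_P * r * r / (1.0 + B_P * r * r)
    return x * gain * cmath.exp(1j * phase)

def predistort(target):
    """Return the input symbol whose HPA output equals `target` (|target| below saturation)."""
    a = abs(target)
    lo, hi = 0.0, R_SAT
    for _ in range(60):        # bisection: AM/AM is increasing on [0, R_SAT]
        mid = 0.5 * (lo + hi)
        if A_A * mid / (1.0 + B_A * mid * mid) < a:
            lo = mid
        else:
            hi = mid
    r = 0.5 * (lo + hi)
    phase_shift = A_P * r * r / (1.0 + B_P * r * r)
    return cmath.rect(r, cmath.phase(target) - phase_shift)

def apsk_4_12(r_outer=0.9, ring_ratio=2.7):
    """4+12-APSK with an assumed ring ratio; r_outer is the desired output radius."""
    r_inner = r_outer / ring_ratio
    inner = [cmath.rect(r_inner, math.pi / 4 + k * math.pi / 2) for k in range(4)]
    outer = [cmath.rect(r_outer, math.pi / 12 + k * math.pi / 6) for k in range(12)]
    return inner + outer

ideal = apsk_4_12()
sent = [predistort(s) for s in ideal]
worst = max(abs(hpa(x) - s) for x, s in zip(sent, ideal))
uncorrected = max(abs(hpa(s) - s) for s in ideal)
print(f"max error without pre-distortion: {uncorrected:.3f}")
print(f"max error with pre-distortion:    {worst:.2e}")
```

Because an APSK constellation has only a few distinct amplitudes, only one radius and one phase correction per ring need to be determined, which is one way to read the remark that the technique is particularly straightforward for circular constellations.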
The set of tested modulation schemes for MHOMS includes “conventional” QPSK and 8-PSK modulations, and two-ring 16-ary, three-ring 32-ary and four-ring 64-ary APSK constellations. Thanks to the coded APSK constellations and pre-distortion techniques, in single-carrier mode the HPA can be operated remarkably close to saturation. This is particularly true for 16-APSK, which can operate almost optimally at an Input Back-Off (IBO) very close to 0 dB. With data (constellation) predistortion, the HPA can be operated down to 1.5 dB average Output Back-Off (OBO), even with 16-point variable-amplitude constellations and in the presence of highly efficient channel coding (turbo or LDPC). For instance, Figs. 2 and 3 show curves of the total (power) degradation due to nonlinear distortions and output power back-off with data modulation in a realistic simulation of a satellite channel. The simulation was carried out with measured characteristics of the HPA and IMUX/OMUX filters, and with a special technique of “dynamic” constellation predistortion with memory that optimizes system performance. The advantage of dynamic techniques with respect to “static” (memoryless) predistortion and simple phase/amplitude recovery (DA-AGC) is apparent. The optimum total degradation (TD) for 4-12 APSK with rate-3/4 LDPC coding is 2 dB, of which 0.9 dB is due to the IMUX/OMUX losses and 0.9 dB corresponds to the TWTA power modulation loss due to the non-constant input signal envelope.
Figure 1: Multi-ring APSK constellations: a) Euclidean distances; b) 16-point 4+12-APSK; c) 16-point 6+10-APSK; d) 64-point 4+12+20+28-APSK.
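The envelope argument behind the choice of APSK can be checked numerically at the symbol level. The sketch below compares the peak-to-average power ratio of the constellation points of 16-QAM and of a 4+12-APSK. The ring ratio of 2.7 is an assumed illustrative value, not the optimised value from the paper, and the effect of pulse shaping on the signal envelope is ignored.

```python
import cmath
import math

def papr_db(points):
    """Peak-to-average power ratio of a set of constellation points, in dB."""
    powers = [abs(p) ** 2 for p in points]
    return 10 * math.log10(max(powers) / (sum(powers) / len(powers)))

def qam16():
    """Square 16-QAM on the usual {-3, -1, 1, 3} grid."""
    levels = (-3, -1, 1, 3)
    return [complex(i, q) for i in levels for q in levels]

def apsk_4_12(ring_ratio=2.7):
    """4+12-APSK: inner 4PSK ring of radius 1, outer 12PSK ring of radius ring_ratio."""
    inner = [cmath.rect(1.0, math.pi / 4 + k * math.pi / 2) for k in range(4)]
    outer = [cmath.rect(ring_ratio, math.pi / 12 + k * math.pi / 6) for k in range(12)]
    return inner + outer

qam, apsk = qam16(), apsk_4_12()
print(f"16-QAM    : PAPR = {papr_db(qam):.2f} dB, "
      f"{len({round(abs(p), 9) for p in qam})} distinct amplitudes")
print(f"4+12-APSK : PAPR = {papr_db(apsk):.2f} dB, "
      f"{len({round(abs(p), 9) for p in apsk})} distinct amplitudes")
```

With these assumptions the APSK points use two amplitude levels against three for 16-QAM, and their peak-to-average ratio is lower, which is consistent with the qualitative argument made in Section 2.1.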
[Figures 2 and 3: plots of Total Degradation (dB) versus OBO (dB) for DA-AGC, Static Data Predistortion and Dynamic Data Predistortion (M=5). LDPC code, N = 64800, R = 3/4, 60 iterations, BER = $10^{-5}$. Left panel: 16-QAM, (Es/N0)lin = 9.9 dB. Right panel: 4-12 APSK, (Es/N0)lin = 10.1 dB.]
Figure 2: Total Degradation for r=3/4 LDPC-coded 16-QAM with DA-AGC, Static Data Predistortion and Memory-5 Dynamic Data Predistortion.
Figure 3: Total Degradation for r=3/4 LDPC-coded 4-12-APSK with DA-AGC, Static Data Predistortion and Memory-5 Dynamic Data Predistortion.
2.2 Synchronization techniques
As for signal synchronization, the receiver has to be able to work with powerful channel coding (turbo or LDPC with large block length), with an operating Es/N0 ratio considerably lower than that of more conventional coding schemes with shorter code blocks and/or smaller coding gain. The best that a synchronisation unit can do in this respect is given by the performance of data-aided algorithms, which can only be applied to pilot symbols known to the receiver in advance. On the other hand, insertion of pilot symbols decreases the efficiency of the link, reducing the net Eb/N0 ratio on information bits. A pilot “density” of only 20% with respect to the information symbols means a decrease of roughly 1 dB in power efficiency, thus partially offsetting the benefits of adopting more advanced coding schemes. The approach taken in the design of the MHOMS modem was to investigate the trade-off between (optimum) data-aided synchronisation, with a loss due to pilot insertion, and (suboptimal) “blind” synchronisation, with no pilot insertion loss, and to identify the best solution for the different operating modes. Blind synchronisation places some constraints on the adoption of particular modulation schemes.
For instance, 4-12 APSK turns out to be more efficient than conventional 16-QAM on the non-linear satellite channel, but blind carrier frequency synchronisation is extremely challenging (due to the particular constellation symmetry), while on the contrary it is relatively straightforward with 16-QAM. Novel techniques for blind carrier phase recovery (which is needed to perform coherent detection and channel decoding) were also investigated. In particular, by exploiting the Expectation Maximisation (EM) algorithm [7], soft-decision-directed iterative phase estimation combined with iterative channel decoding (the so-called turbosync approach) was shown to be applicable both to continuous- and to burst-mode operation of the modem, even in the presence of large amounts of oscillator phase noise. With long codewords, the phase noise varies significantly over the channel code block. This means that correcting the carrier phase using a constant term over the entire frame is ineffective. The best approach is to subdivide the block into L sub-blocks, each one encompassing a pilot symbol preamble and a “payload” section. The preamble is needed for “pre-compensating”, with a data-aided estimate, the overall phase error in the subsequent payload section, so that the EM algorithm can operate within its phase acquisition range (about [-30,30] degrees). The resulting framing structure is shown in Figure 4.
Figure 4: Pilot distribution scheme I (each sub-block comprises a pilot field of $N_P$ symbols followed by a data field of $N_D$ symbols).
Obviously, the performance depends on the pilot symbol density $\eta$, defined as
$$\eta = \frac{N_P}{N_P + N_D} \qquad (1)$$
where $N_D$ is the data field length and $N_P$ is the preamble length. For each constellation, we found an optimal value of $N_D$ which represents the best trade-off between two opposite constraints.
First, it is necessary to minimise the loss in power efficiency due to the insertion of pilot symbols (need for a short preamble section); second, we want to achieve an estimate of the carrier phase that is as accurate as possible (need to lengthen the preamble). Figure 5 shows an alternative pilot distribution scheme that has been tested. For each sub-block, the preamble analysed for the previous system is evenly split between the head and the tail of the sub-block. In other words, half of the preamble of the first sub-block is shifted to the end of the last sub-block of the frame. A linear interpolation between two subsequent phase estimates is exploited to pre-compensate for phase noise and carrier frequency offset. In particular, this pilot distribution makes it possible to obtain an estimate of the “instantaneous” carrier phase at the end of each frame, so that we can perform linear interpolation between the last two phase estimates $\hat{\theta}_{L-1}$ and $\hat{\theta}_L$ to better compensate for frequency offset and phase noise.
Figure 5: Pilot distribution scheme II (each sub-block has $N_P/2$ pilot symbols at its head and $N_P/2$ at its tail, around $N_D$ data symbols; phase estimates $\theta_0, \theta_1, \theta_2, \ldots, \theta_L$ are taken at the sub-block boundaries).
This algorithm requires an additional computational load to perform phase interpolation; however, it makes it possible to counteract residual frequency errors up to $3 \times 10^{-4}$ times the symbol rate. A performance sample of such techniques is given in Figs. 6 and 7.
[Figures 6 and 7: plots of BER versus Es/N0 (dB), from 6.0 to 7.0 dB, for 8PSK with rate-2/3 LDPC, N = 64800, comparing the AWGN reference with several normalised frequency offsets νT up to $5 \times 10^{-4}$.]
Fig. 6: EM algorithm performance using pilots and sub-block linear interpolation pre-compensation using a 64800 bit rate 2/3 LDPC code with 8PSK, with 300 symbols sub-block and pilot density of 3%.
Fig. 7: EM algorithm performance using pilots and sub-block linear interpolation pre-compensation using a 64800 bit rate 2/3 LDPC code with 8PSK, 600 symbols sub-block and pilot density of 2%.
3. Code design and trade-off
Parallel Concatenated Convolutional Codes (PCCC), Serially Concatenated Convolutional Codes (SCCC), and Low Density Parity Check codes (LDPC) have been investigated. All the pre-selected schemes achieve the desired flexibility and maintain excellent performance, very close to the sphere packing lower error bound.
3.1 Serial Concatenated Convolutional Codes
For SCCC, the innovative pragmatic scheme depicted in Figure 8 has been designed. It allows the same interleaver to be maintained with different code rates, with evident advantages in terms of flexibility. It consists of an outer and an inner encoder stemming from the same 4-state, rate ½, recursive, systematic encoder shown in the same figure. For all the SCCC rates but ¼, the outer encoder is punctured to a rate 2/3 through the optimal puncturing reported in the figure. The interleaver size is kept fixed to the length N required by the application at hand for all rates and modulation formats. The corresponding information block size is then independent of the SCCC rate, and amounts to $N_i = N R_o$, where $R_o = 2/3$ is the fixed outer code rate. The only exception is the SCCC rate ¼, for which both encoders need to be used without puncturing. In that case, we still keep the information block size fixed, and use an interleaver with size $N = 2N_i$. In order to obtain the desired SCCC rate, we perform puncturing at the output of the inner encoder, according to the scheme shown in Figure 8.
The upper register at the output of the inner encoder contains the N+2 inner systematic bits, which coincide with the interleaved outer code word plus the 2 bits terminating the inner trellis. The lower register, instead, contains the N+2 parity-check bits generated by the inner encoder. Two different puncturing algorithms are used to puncture bits in the upper and lower registers. Puncturing in the upper register is performed on the N inner systematic bits (excluding the 2 inner code terminating bits, which are always transmitted) according to a puncturing pattern periodic with period 200 trellis steps, which correspond in our case to 300 outer coded bits. The systematic bits of the inner encoder correspond to the code word generated by the outer encoder, so the puncturing pattern on these bits has been designed to maximise the free distance of the outer encoder, taking into account that puncturing occurs after interleaving. Aiming also at high code rates, it is computationally complex to exhaustively optimise the puncturing patterns for the outer encoder. Thus, we used a slightly sub-optimal, yet manageable, search algorithm that works incrementally, in a rate-compatible fashion, so that the punctured positions for a given outer rate are also punctured for all higher rates. Optimisation of the upper register puncturing pattern involves both the number of bits to be punctured and their position. Denoting by $S_{sur}$ the number of surviving bits in each 300-bit period of the upper register after puncturing, the outer code “equivalent” rate, as seen from the SCCC output, is then
$$R_o = \frac{200\, n_p - 2}{S_{sur}\, n_p + 2}$$
where $n_p$ is the number of puncturing periods contained in the interleaver, i.e., $n_p = N/300$. A design procedure for the outer code puncturing yielding outer code rates in the whole range $200/300 \le R_o \le 200/201$ has been developed for this purpose.
Figure 8: Block diagram of the SCCC scheme (outer rate-½ 4-state encoder with fixed puncturing, interleaver, inner rate-½ 4-state encoder, separate puncturing of the systematic (S) and parity (P) bits, mapping and modulation).
Puncturing bits in the lower register, which contains the parity-check bits generated by the inner encoder, is obtained by applying a rate matching algorithm, and consists in deleting regularly spaced bits according to the rate matching parameter $P_{sur}/(N+2)$, i.e. the ratio between the number of surviving parity bits $P_{sur}$ and the overall number of parity bits before puncturing. Denoting by $R_{SCCC}$ the overall SCCC rate, we can easily derive $P_{sur}$ as
$$P_{sur} = \frac{2N - 6}{3 R_{SCCC}} - n_p S_{sur} - 2$$
The simulation results show that the newly proposed scheme offers good performance in a large range of code rates, including very high ones. As an example, the FER performance for the designed SCCC codes with code rate 5/6 and codeword lengths of 16384 and 480 bits, respectively, is reported in Figure 9 for 4-PSK modulation and 50 decoding iterations.
[Figure 9: plot of FER (from 1.00E+00 down to 1.00E-06) versus Eb/N0 (dB), from 2.5 to 6.5 dB.]
Figure 9: Simulation results for the new proposed SCCC scheme. Rate 5/6, n=16384 and 480 bits.
3.2 Low Density Parity Check Codes
For LDPC, a new class of “modular” parity check matrices has been investigated. These codes have linear encoder complexity and are particularly suited for parallel implementation of the decoder algorithm. The codes have been optimized for high SNR performance, since FERs as low as $10^{-7}$ are required by most considered applications. To limit encoding complexity, we impose the following structure on the parity check matrix H of the designed LDPC codes C(n=k+m,k): $H = [H_p \mid H_d]$, where $H_d$ is an m by k matrix and $H_p$ is a dual diagonal m by m matrix, shown below for m=5:
$$H_p = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 & 0 \\ 0 & 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \end{pmatrix}$$
The codeword vector x can be split into a systematic part s (with dimension k) and a parity part p (with dimension m), such that x=(s,p).
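As a small illustration of why the dual-diagonal $H_p$ shown above keeps encoding simple, the sketch below computes the parity bits as a running XOR of the check sums of the systematic bits and then verifies every parity-check row. The tiny random $H_d$ (m = 5, k = 10), the function names and the seed are illustrative assumptions; they are not the structured, PEG-designed matrices of the paper.

```python
import random

def encode_parity(h_d, s):
    """Parity bits for H = [H_p | H_d] with dual-diagonal H_p, arithmetic over GF(2)."""
    # v = H_d * s (mod 2): one check sum per row of H_d
    v = [sum(h & b for h, b in zip(row, s)) % 2 for row in h_d]
    parity, prev = [], 0
    for v_l in v:
        prev ^= v_l            # p_1 = v_1, then p_l = v_l + p_(l-1)
        parity.append(prev)
    return parity

def all_checks_satisfied(h_d, s, p):
    """Verify every row of H_p * p + H_d * s = 0 (mod 2)."""
    for l, row in enumerate(h_d):
        acc = p[l] ^ (p[l - 1] if l > 0 else 0)      # row l of the dual-diagonal H_p
        acc ^= sum(h & b for h, b in zip(row, s)) % 2
        if acc:
            return False
    return True

random.seed(0)
m, k = 5, 10                   # toy sizes chosen only for illustration
h_d = [[random.randint(0, 1) for _ in range(k)] for _ in range(m)]
s = [random.randint(0, 1) for _ in range(k)]
p = encode_parity(h_d, s)
print("parity bits:", p, "checks satisfied:", all_checks_satisfied(h_d, s, p))
```

Each parity bit costs one extra XOR on top of its check sum, so the encoding effort grows linearly with the block length when $H_d$ is sparse.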
The encoding is performed by searching for the parity vector p which satisfies the equation $H x = 0$, which is equivalent to $H_p\, p = v$, where $v = H_d\, s$ and all arithmetic is over GF(2). As a consequence, the m parity check bits can be easily computed by back-substitution:
$$p_1 = v_1, \qquad p_l = v_l + p_{l-1}, \quad l = 2, \ldots, m$$
and the overall complexity grows linearly with n. The random part of the matrix, $H_d$, has been designed by properly distributing smaller sub-matrices, a structure which helps parallel decoder architectures. Each of them can be:
• An all-zero matrix
• A matrix obtained by cyclically shifting the m by m diagonal matrix
The position of the non-zero matrices and their respective cyclic shifts are chosen using the Progressive Edge Growth (PEG) algorithm [8]. This algorithm guarantees that every time a non-zero matrix is added, its position and cyclic shift are chosen so that the girth (length of the minimal cycles) of the graph is as large as possible. The performance of the designed LDPC codes is very good, showing a limited penalty with respect to theoretical lower bounds. As an example, the FER performance for the designed LDPC codes with code rate 5/6 and codeword lengths of 16384 and 480 bits, respectively, is reported in Figure 10 for 4-PSK modulation and 50 decoding iterations.
[Figure 10: plot titled “LDPC RATE=5/6 MOD=4PSK” of FER (from 1.00E+00 down to 1.00E-08) versus Eb/No, from 2.5 to 8.5 dB, with curves labelled “USparse n=480” and “Mod(16384,13568)”.]
Figure 10: Simulation results for the new proposed LDPC scheme. Rate 5/6, n=16384 and 480 bits.
The decoding complexity of the designed coding schemes, which is essential for proper comparisons, has been analyzed and discussed in depth [9]. This aspect is indeed fundamental for our applications, due to the very high data rates and required flexibility.
3.3 Parallel Concatenated Convolutional Codes
A very flexible error-correcting turbo code has been devised for adaptive coding and modulation in high throughput satellite applications.
This code, nicknamed TurboΦ, is derived from the extension of the DVB-RCS turbo code [10] to 16 states. TurboΦ is intended to offer near-Shannon performance on the Gaussian channel, in most situations of block size, coding rate from 1/4 up to 8/9 and associated modulation. It is expected that the FER (frame error rate) versus Eb/N0 curve does not display any marked change in slope (flattening) down to $10^{-7}$, in most situations. Besides its high degree of versatility regarding the system requirements, and thanks to the very simple generic model of its internal permutation, the specification of TurboΦ gives the natural possibility of using a high level of parallelism in the associated decoder. This would lead to the possibility of reaching throughputs as large as 50 Mbit/s on FPGAs such as Virtex2-8000. TurboΦ is depicted in Fig. 11. The encoder is a parallel concatenation of two 16-state duo-binary circular recursive systematic convolutional (CRSC) encoders, fed by blocks of k bits (N = k/2 couples). Both component encoders have identical features: polynomials 23 (recursivity, period 15) and 35 (redundancy), first bit (A) on tap 1, second bit (B) on taps 1, D and $D^3$. The use of duo-binary component encoders, already justified in several papers (see for instance [11-13]), offers better convergence of the iterative decoding process and larger minimum Hamming distances (MHDs), for medium and high coding rates (R ≥ 1/2). The circularity principle, that is, encoding in such a way that the final state of the register is equal to the initial one, allows each component convolutional code to become a perfect block code, without any need for additional information (no tail bits) and without any side effect. While the former point is interesting only for short blocks, the latter is crucial vis-à-vis MHDs for all sizes.
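The circularity principle can be illustrated on a much simpler encoder than the one specified above. The sketch below assumes a single-input binary recursive register that uses only the recursion polynomial 23 (octal), i.e. $1 + D^3 + D^4$, and finds the circulation state (the starting state that is reached again after encoding the whole block) by brute force over the 16 states. It is not the duo-binary TurboΦ component encoder; the function names and test inputs are illustrative assumptions.

```python
from itertools import product

def step(state, u):
    """One trellis step of a 4-cell recursive register with feedback 1 + D^3 + D^4."""
    feedback = u ^ state[2] ^ state[3]   # input plus taps D^3 and D^4
    return (feedback,) + state[:3]       # shift the register by one cell

def final_state(state, bits):
    """Register state after feeding all input bits, starting from `state`."""
    for u in bits:
        state = step(state, u)
    return state

def circulation_states(bits):
    """All starting states for which the encoder ends where it started."""
    return [s for s in product((0, 1), repeat=4) if final_state(s, bits) == s]

block_16 = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]   # length 16
block_15 = [1] + [0] * 14                                     # length 15 = period

print("length 16:", circulation_states(block_16))   # exactly one state
print("length 15:", circulation_states(block_15))   # no state exists here
```

In this toy model a unique circulation state exists whenever the block length is not a multiple of the register period (15), so no tail bits are needed to turn the convolutional code into a block code.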
Circular termination makes it possible to protect all the bits (or couples in the case of duo-binary codes) at the same level of protection, including the starting and finishing portions of the block. All bits (or couples) being uniformly encoded, and all of them benefiting from the whole set of redundancy, there is no particular insidious effect that would drastically reduce the MHD. Because no special care has to be taken, in the permutation design, about the starting and finishing parts of the block, circular termination constitutes a sizeable simplification in the search for good permutations, when the target error rate is low. One restricting condition on the use of the circularity principle is that the block length (i.e. N couples) must not be a multiple of the period of the pseudo-random generator from which the CRSC encoder is derived. When adopting polynomial 23 for the encoder, N cannot be a multiple of 15, unless stuffing symbols are used. To the previous condition, we add another one on N, which has to be a multiple of 4 because of the permutation law, which is detailed below. All the blocks considered are then counted in bytes.
Figure 12: The TURBOΦ encoder (R ≥ 1/2): a parallel concatenation of two duo-binary 16-state circular recursive systematic convolutional (CRSC) encoders. Permutation Π is carried out at the couple level, and couples (A,B) are permuted to become (B,A) before second encoding.
The rationale in the choice of the permutation, whose design has a strong impact on the MHD value and therefore on the asymptotic performance, rests on a trade-off between three concerns. It is well known that regular permutation is quite appropriate with respect to error patterns with weight 2 or 3, and more generally with respect to error patterns that are not composite [13].
The permutation rule has then to be almost regular. Regarding composite error patterns, that is, patterns which link several return-to-zero (RTZ)² sequences on both dimensions, some disorder has to be instilled. However, this disorder has to be carefully handled, in order to prevent non-composite patterns from dominating again. A short disorder cycle value (typically 4) seems to be sufficient to obtain good MHDs. This low degree of disorder is also a good thing from the point of view of the permutation qualification, because the code has a small period, with a limited number of cases to be checked. Furthermore, the simplicity of the permutation law enables easy on-the-fly reconfigurability on both the encoder and decoder sides. The TurboΦ permutation is defined as follows: i (i = 0, …, N-1) is the couple address in the natural order; j (j = 0, …, N-1) is the couple address in the permuted order.
- if j = 0 mod. 4, then Q = 0;
- if j = 1 mod. 4, then Q = 4Q1;
- if j = 2 mod. 4, then Q = 4P + 4(Q2 + 1); (1.1)
- if j = 3 mod. 4, then Q = 4P + 4(Q3 + 1).
Finally, i = Π(j) = Pj + Q + 3 mod. N (1.2)
P is an integer, relatively prime with N. Q1, Q2 and Q3 are small integers (from 0 up to 8). The four parameters of the permutation law have to be determined for each block size.
² An RTZ sequence is any finite input sequence which makes the encoder quit state 0 and makes it go back to this state. Using the D operator, and denoting by G(D) the recursivity generator of the encoder ($G(D) = 1 + D^3 + D^4$ for polynomial 23), all RTZ sequences are multiples of G(D).
Fig. 13 compares the TurboΦ and DVB-RCS turbo code performance, for a particular case (k = 1504, R = 1/2, 2/3 and 3/4). The average improvement for these medium coding rates is about 1 dB, or more, for FER = $10^{-7}$. The gain may be significantly larger for higher rates. Fig.
14 shows the performance obtained from the association of TurboΦ and three different modulations, in order to achieve spectral efficiencies of 1, 2 and 3 (bit/s)/Hz, respectively, for short blocks. The association is made under the so-called pragmatic [14] or BICM [15] conditions. From the results shown, we can observe in particular the absence of any significant change in the slopes (flattening), down to FER = $10^{-7}$.
[Figure 13: plot of FER (from 1.0e+00 down to 1.0e-08) versus Eb/N0 (dB), from -1.5 to 4.5 dB, with 8-state and 16-state curves for R = 1/2, R = 2/3 and R = 3/4.]
Figure 13: Performance comparison between TurboΦ (16-state) and DVB-RCS turbo code (8-state) for QPSK. k = 1504, R = 1/2, 2/3 and 3/4. Max-Log-MAP component algorithm, 8 iterations, 4-bit quantization.
Figure 14: Performance obtained from the association of TurboΦ and QPSK (R = 1/2, k = 408), 8-PSK (R = 2/3, k = 816) and 16-QAM (R = 3/4, k = 1224). Pragmatic coded modulation. Max-Log-MAP component algorithm, 8 iterations, 6-bit quantization.
3.4 Preliminary coding schemes trade-off
In the previous sections we have described candidate coding schemes drawn from the classes of PCCC, SCCC and LDPC codes, and provided some simulation results. In this section, we propose a few criteria used as “goodness” measures to rank the proposed schemes and draw some preliminary conclusions. We have used the following criteria to compare and rank the described codes:
1. Performance. We will use as performance measure the Eb/N0 required to achieve a FER = $10^{-4}$ for the information block size of 428 and code rates 1/3 and 5/6, and a FER = $10^{-6}$ for the code word block size of 16,384, all rates, and for the information block size of 428 and rate 9/10.
2. Complexity. As complexity measure we will use the number of elementary operations (arithmetic complexity) and the memory requirements (memory complexity) per decoded bit per iteration.
3. Flexibility.
We will comment on the flexibility of the proposed schemes, that is, their ability to cope with the stringent requirement of modifying the rate, block size and modulation on a per-frame basis.
4. Number of iterations required. This number affects the complexity, essentially as a multiplicative factor of the estimated arithmetic complexity. We prefer, however, to keep it as a separate item at this preliminary stage, since its effect on the overall complexity depends on the chosen decoder architecture.
5. IPR issues.
6. Maturity. We will comment on the maturity of the proposed schemes in terms of existing implementations and general industrial acquaintance with them.

The FER performance of the three coding schemes with QPSK modulation is compared in Figs. 15 and 16.

[Figure 15 plot: simulation results, k = 428, 4PSK modulation; FER versus Eb/N0 [dB]; SCCC, LDPC and PCCC curves for rates 1/3, 5/6 and 9/10.]
Figure 15. FER performance of PCCC, SCCC and LDPC for the short block size and all three code rates.

[Figure 16 plot: simulation results, n = 16384, 4PSK modulation; FER versus Eb/N0 [dB]; SCCC, LDPC and PCCC curves for rates 1/3, 5/6 and 9/10.]
Figure 16. FER performance of PCCC, SCCC and LDPC for the long block size and all three code rates.

In Table 1, we summarize the comparison criteria for the three classes of codes. For the short and long block codes, the table reports the required Eb/N0 for the target FER, the Eb/N0 loss in dB with respect to the best code, and the minimum distance and number of nearest neighbours when available. In terms of complexity, it reports, again for the short and long block codes, the RAM and ROM memory occupations, the number of sums and max/max* operators per decoded bit, and the number of iterations. Finally, it gives the evaluation of the other comparison criteria.
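The "loss to best" entries described above are plain differences between each scheme's required Eb/N0 and the smallest required value at the same rate. As a minimal sketch, the Python snippet below recomputes them for the long block size from the Eb/N0 values listed in Table 1. The dictionary layout and the function name `loss_to_best` are illustrative choices, not part of the original work, and the PCCC values that the table flags as uncertain are treated here as exact.

```python
# Required Eb/N0 in dB at the target FER, long block (n = 16,384), copied from
# Table 1. Order of rates: 1/3, 5/6, 9/10. The table flags some PCCC values as
# uncertain ("3.9??", "5.2?"); treating them as exact is an assumption.
RATES = ("1/3", "5/6", "9/10")
REQUIRED_EBN0_DB = {
    "PCCC": (1.2, 3.9, 5.2),
    "SCCC": (0.9, 3.4, 4.35),
    "LDPC": (0.75, 3.25, 4.1),
}


def loss_to_best(required_db):
    """Return, per scheme, the Eb/N0 loss (dB) relative to the best scheme at each rate."""
    n_rates = len(next(iter(required_db.values())))
    # Best (smallest) required Eb/N0 at each rate, over all schemes.
    best = [min(values[r] for values in required_db.values()) for r in range(n_rates)]
    return {
        scheme: tuple(round(values[r] - best[r], 2) for r in range(n_rates))
        for scheme, values in required_db.items()
    }


if __name__ == "__main__":
    for scheme, losses in loss_to_best(REQUIRED_EBN0_DB).items():
        print(scheme, dict(zip(RATES, losses)))
```

Rounding to two decimals only removes floating-point noise; the inputs themselves are quoted to at most two decimals.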
The performance comparison
For the short block size, the PCCC scheme offers the best performance. It yields a gain from 0.3 dB (rate 1/3) to 1.05 dB (rate 9/10) over SCCC, and from 0.5 dB (rate 1/3) to 1 dB (rate 9/10) over LDPC. For the long block size, the LDPC scheme offers the best performance, closely followed by the SCCC (LDPC gains from 0.15 to 0.25 dB). Neither of these two schemes shows an error floor. The PCCC scheme, on the other hand, shows a floor for all three rates. For rate 1/3, this leads to a loss of 0.45 dB with respect to LDPC and 0.3 dB with respect to SCCC. For the higher rates, the floor is so pronounced that measuring the losses is difficult, and questionable in the absence of results at FER = 10^-6.

The complexity comparison
In terms of RAM occupation, the SCCC scheme is the simplest for all block sizes and rates. The complexity of the LDPC scheme is much larger. In terms of ROM memory (recalling that the area occupation of a ROM is roughly 1/5 of that of a RAM of the same size), the PCCC scheme is the best, since it does not require storing the permutation, but only a few parameters from which the permuted addresses are derived on the fly. In the table we report the ROM occupation for one PCCC block size. This memory should be multiplied by the number of permutations required according to the frame flexibility. As to the arithmetic complexity, the SCCC scheme is by far the simplest. It is roughly 4-5 times simpler than the PCCC scheme when both use the MAX* operation, and 2-3 times less complex when the PCCC uses the MAX operation (notice that the added complexity of the LUT is almost irrelevant). The LDPC scheme is 5-6 times more complex. Notice also that, for high data rates requiring a highly parallel architecture, the differences in area due to the arithmetic operations become even more significant.
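The ROM saving noted for the PCCC rests on computing permuted addresses from a handful of parameters rather than storing the whole permutation. As a minimal sketch, the Python function below evaluates equations (1.1) and (1.2) as printed earlier in this paper. The function name `turbo_phi_permutation` and the parameter values in the example (N = 48, P = 7, Q1 = 1, Q2 = 2, Q3 = 3) are hypothetical, chosen only for illustration and not taken from the MHOMS design. The check that N is a multiple of 4 is an added assumption: together with P being relatively prime with N, it is enough to make the mapping one-to-one, since every offset Q is a multiple of 4.

```python
from math import gcd


def turbo_phi_permutation(n, p, q1, q2, q3):
    """Return the list [Pi(0), ..., Pi(n-1)] following equations (1.1) and (1.2).

    j is the couple address in the permuted order, Pi(j) the couple address in
    the natural order. All argument names are illustrative.
    """
    if n % 4 != 0:
        raise ValueError("this sketch assumes N is a multiple of the disorder cycle 4")
    if gcd(p, n) != 1:
        raise ValueError("P must be relatively prime with N")

    # Offsets Q of equation (1.1), indexed by j mod 4.
    offsets = (0, 4 * q1, 4 * p + 4 * (q2 + 1), 4 * p + 4 * (q3 + 1))

    # Equation (1.2): i = Pi(j) = P*j + Q + 3 mod N.
    return [(p * j + offsets[j % 4] + 3) % n for j in range(n)]


if __name__ == "__main__":
    # Hypothetical parameters, not taken from the MHOMS design.
    pi = turbo_phi_permutation(n=48, p=7, q1=1, q2=2, q3=3)
    print(pi[:8])
    # Every natural-order address must be produced exactly once.
    assert sorted(pi) == list(range(48))
```

Only the four integers P, Q1, Q2 and Q3 need to be stored per block size; each address costs one multiplication, one table look-up among four offsets and a modulo reduction, which is what makes on-the-fly generation attractive.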
| Block | Criterion | PCCC - TurboΦ 1/3 | PCCC - TurboΦ 5/6 | PCCC - TurboΦ 9/10 | SCCC 1/3 | SCCC 5/6 | SCCC 9/10 | LDPC 1/3 | LDPC 5/6 | LDPC 9/10 |
|---|---|---|---|---|---|---|---|---|---|---|
| Performance, short | Eb/N0 @ target FER (dB) | 2 | 5.4 | 6.7 | 2.3 | 6 | 7.75 | 2.5 | 5.9 | 8? |
| Performance, short | Loss to best (dB) | 0 | 0 | 0? | 0.3 | 0.6 | 1.05 | 0.5 | 0.5 | 1? |
| Performance, short | dmin, Nmin | 36-107 | 5-32.4 | 4-337 | 30-1 | 5-1 | 4-3 | X | X | X |
| Performance, long | Eb/N0 @ target FER (dB) | 1.2 | 3.9?? | 5.2? | 0.9 | 3.4 | 4.35 | 0.75 | 3.25 | 4.1 |
| Performance, long | Loss to best (dB) | 0.45 | 0.65?? | 1.1? | 0.15 | 0.15 | 0.25 | 0 | 0 | 0 |
| Performance, long | dmin, Nmin | 66-1366 | 11-1365 | 7-590 | 66-2 | 7-1 | 7-1 | X | X | X |
| Complexity, short | RAM memory | 15,683 | 7,059 | 6,869 | 10,914 | 7,059 | 6,869 | 34,413 | 12,641 | 10,435 |
| Complexity, short | ROM memory | 3,582 | 1,712 | 1,712 | 6,420 | 6,420 | 6,420 | 47,988 | 15,829 | 12,661 |
| Complexity, short | SUM | 2,050 | 2,050 | 2,050 | 730 | 730 | 730 | 950 | 350 | 250 |
| Complexity, short | MAX* (MAX) | 1,580 | 1,580 | 1,580 | 370 | 370 | 370 | 1,600 | 900 | 700 |
| Complexity, short | Number of iterations | 10 | 10 | 10 | 10 | 10 | 10 | 50 | 50 | 50 |
| Complexity, long | RAM memory | 196,608 | 225,276 | 236,742 | 139,260 | 225,276 | 236,742 | 552,313 | 614,137 | 630,713 |
| Complexity, long | ROM memory | 70,997 | 88,745 | 95,843 | 106,490 | 307,193 | 331,763 | 1,142,383 | 1,292,527 | 1,332,783 |
| Complexity, long | SUM | 2,050 | 2,050 | 2,050 | 730 | 730 | 730 | 1,250 | 550 | 550 |
| Complexity, long | MAX* (MAX) | 1,580 | 1,580 | 1,580 | 370 | 370 | 370 | 2,500 | 1,600 | 1,500 |
| Complexity, long | Number of iterations | 10 | 10 | 10 | 10 | 10 | 10 | 50 | 50 | 50 |

| Other issues | PCCC - TurboΦ | SCCC | LDPC |
|---|---|---|---|
| Flexibility | Sensitive to interleaver optimization (8) | High (10) | Code must be designed for all cases (5) |
| IPR issues | YES | NO | NO |
| Maturity | Very High (10) | High (8) | Medium (6) |

Table 1: Summary of complexity and performance comparison for the three classes of PCCC, SCCC, and LDPC.

The flexibility comparison
Flexibility refers here to the possibility (and the related complexity/performance consequences) of the co-decoder to adapt to different code rates, modulation schemes, and block sizes on a per frame basis. All schemes can adapt easily to variations in the modulation schemes, as they are inherently pragmatic schemes, working with binary co-decoders fed by proper projections of the received soft symbols likelihood functions onto the binary log-likelihood ratios of the bits that form a modulation symbol. The SCCC scheme adapts very easily to different code rates.
It only requires the definition of the puncturing and rate matching parameters (performed off-line once and for all), to be stored in a memory table. An almost continuous variation of the code rate from 1/4 to k/(k+1) can be coped with. The same is true for the PCCC scheme. For the LDPC codes, adaptation to different code rates requires in general working with different codes, which have to be stored. Also, the decoding engine, although general in terms of elementary and repetitive operations, must be able to work with a possibly different number of edges entering and leaving the variable and parity nodes.

The IPR issues comparison
In this case, the situation appears to be rather different for the various schemes. PCCC are protected by several France Telecom patents. For the LDPC codes, we have examined two patents by Flarion. From the analysis of these patents, we feel that our LDPC designs are not covered by them. The SCCC structure and decoding algorithm are free from patent protection. For these reasons, we inserted a YES or NO in the appropriate entry of Table 1.

The maturity comparison
PCCC have already been implemented in numerous cases, and accepted as standards in several system applications. The only novelty perhaps concerns the parallel architectures needed to reach high data rates. SCCC have seen fewer implementations, but their technology should not pose any further problems. LDPC are relatively newer in terms of applications, although their acceptance in the new DVB-S2 standard should accelerate the full understanding and solution of implementation problems. Here too, we have attempted to grade the technological maturity of the three schemes from 1 to 10 in Table 1.

4. Conclusions and outlook
The paper presented preliminary results from the on-going Phase 1 of the ESA-funded MHOMS project, which aims at the study, design and development of an ultra high-speed, high-performance, fully reconfigurable digital modem.
The MHOMS modem will be able to cover future needs of telecommunications and Earth Observation missions and will feature state-of-the-art coding, modulation, demodulation, synchronization and decoding algorithms. The MHOMS technological development will also lay the foundation (building blocks) of a new class of modems able to exploit adaptive coding and modulation. During the remaining part of the Phase 1 activity, the trade-off among the described candidate coding, modulation, pre and demodulation/synchronisation techniques will be finalised and complemented by architectural design considerations. Possible physical layer improvements for the return link of interactive satellite networks covered by the current DVB-RCS standard will also be proposed. During Phase 2 of the activity, the MHOMS prototype will be designed, assembled and tested in a laboratory set-up including a satellite channel simulator, with the aim of demonstrating the most challenging modem operating modes.

REFERENCES
[1] R. De Gaudenzi, R. Rinaldo, "Adaptive Coding and Modulation for Next Generation Broadband Multimedia Systems", in Proc. of the 20th AIAA Satellite Communication Systems Conference, Montreal, May 2002, AIAA paper 2002-1863.
[2] R. De Gaudenzi, R. Rinaldo, A. Perez Carro-Rios, "Adaptive Coding and Modulation for the Reverse Link of Next Generation Ka-band Multimedia Systems", in Proc. of the 8th Ka-band Utilization Conference, Baveno, Italy, Sept. 25-27, 2002.
[3] Digital Video Broadcasting (DVB), "Second generation framing structure, channel coding and modulation systems for Broadcasting, Interactive Services, News Gathering", DVB-S2 draft document DVBS2-74, June 2003. Available on the DVB organisation web site (http://www.dvb.org) [DVB-S2 ftp directory].
[4] R. De Gaudenzi, A. Guillen i Fabregas, A. M. Vicente, B. Ponticelli, "High Power and Spectral Efficiency Coded Digital Modulation Schemes for Nonlinear Satellite Channels", in Proc. of the 7th International ESA Workshop on Digital Signal Processing Techniques for Space Applications, Sesimbra, Portugal, October 2001.
[5] L. Giugno, V. Lottici, M. Luise, "Adaptive Compensation of Nonlinear Satellite Transponder for Uncoded and Coded High-Level Data Modulations", submitted to IEEE Transactions on Wireless Communications, November 2002.
[6] G. Karam, H. Sari, "A Data Predistortion Technique with Memory for QAM Radio Systems", IEEE Transactions on Communications, Vol. 39, No. 2, pp. 336-344, February 1991.
[7] V. Lottici, M. Luise, "Embedding Carrier Phase Recovery into Iterative Decoding of Turbo-Coded Linear Modulations", to appear in IEEE Trans. Comm., 2003.
[8] X.-Y. Hu, E. Eleftheriou, D.-M. Arnold, "Progressive edge-growth Tanner graphs", in Proc. of the IEEE Global Telecommunications Conference (GLOBECOM '01), Vol. 2, pp. 995-1001, 2001.
[9] Politecnico di Torino, ESA MHOMS project WP 1320 activity report (Modems for High-Order Modulation Schemes, Contract No. 16593/02/NL/EC).
[10] DVB, "Interaction channel for satellite distribution systems", ETSI EN 301 790, V1.2.2, pp. 21-24, Dec. 2000.
[11] C. Berrou, M. Jézéquel, "Non binary convolutional codes for turbo coding", Elect. Letters, Vol. 35, No. 1, pp. 39-40, Jan. 1999.
[12] M. Bingeman, A. K. Khandani, "Symbol-based turbo codes", IEEE Comm. Letters, Vol. 3, No. 10, pp. 285-287, Oct. 1999.
[13] C. Berrou, "Turbo codes: some simple ideas for efficient communications", ESA DSP '2001, Lisbon, Oct. 2001 and ESA TTC '2001, Noordwijk, The Netherlands, Oct. 2001.
[14] S. Le Goff, A. Glavieux, C. Berrou, "Turbo-codes and high spectral efficiency modulation", in Proc. of IEEE ICC '94, pp. 645-649, New Orleans, May 1994.
[15] G. Caire, G. Taricco, E. Biglieri, "Bit-interleaved coded modulation", IEEE Trans. Inform. Theory, Vol. 44, No. 3, pp. 927-946, May 1998.