Prototyping Quadrature Amplitude Modulation for Two-way Communication on CATV Networks

R. Lauwereins, M. Adé
KULeuven-ESAT/ACCA
Kard. Mercierlaan 94
B-3001 Heverlee, Belgium

P. Vandaele, M. Moonen
KULeuven-ESAT/SISTA
Kard. Mercierlaan 94
B-3001 Heverlee, Belgium

P. Schaumont
Imec-VSDM/DISTA
Kapeldreef 75
B-3001 Heverlee, Belgium

Email: Rudy.Lauwereins@esat.kuleuven.ac.be

1. Abstract

The recently discovered potential of two-way communication on CATV networks for advanced telecommunications applications such as video-on-demand has spawned research and development in modem design for upstream communication. This paper reports on the prototyping of such a 16QAM modem and compares the achievable sample rates on 4 DSP processors to simulation speeds obtained on a powerful workstation.

2. Introduction

Recently, coaxial cable networks have received much attention in the context of interactive applications [1,2]. In countries with a high penetration of CATV (e.g. Belgium, >90%), the cable network forms a viable alternative to classical telephone networks. Envisaged applications are telephony, interactive television, home-shopping, video-on-demand, high-speed Web browsing, etc. The interactive nature of these services, however, requires two-way communication on a network that was initially intended only for one-way broadcasting of television signals. What is aimed at now is a low bit-rate upstream link (from the subscriber to the head-end station) and a high bit-rate downstream link (from the head-end station to the local subscribers). The upstream link in particular is challenging because very little is known about the upstream channel, and communication standards are still under development. The projected frequency band for upstream communication is in the 5-25 MHz range, see also Figure 1.

Figure 1. Frequency allocation. [Axis f (MHz) with marks at 5, 25, 40, 300 and 860; bands labelled upstream and downstream; downstream labels: analogue TV, 7 MHz spaced; digital TV and interactive services, 8 MHz spaced.]

The basic configuration of the network is hybrid fibre-coax. This means that the upstream signal first travels through coax before entering a fibre node and going through the fibre trunk to the head-end station. The coaxial part introduces some serious channel impairments which must be compensated for at the receiver. Without going into too much detail, we mention only the most important impairments. Group delay distortion (i.e. signals at different frequencies propagate with a different velocity through the network) causes severe inter-symbol interference at the receiver. Micro-reflections are caused by discontinuities in the transmission medium and cause part of the signal energy to be reflected. Ingress noise models the interference caused by the antenna-like properties of the cable. Burst noise typically originates from household appliances such as electrical motors. Besides these there are common path distortion products, thermal noise, impulse noise, non-linearities, phase noise, frequency offset, etc. If we add to this the variations between networks stemming from the variability of the number of trunk, bridge and distribution amplifiers, it becomes quite clear that it is very hard to build a channel model which incorporates all the statistical and non-statistical phenomena observed in real networks. Studies have shown that because of this hostile environment, in many systems less than half of the spectrum will be available at any given time.

The absence of a good channel model necessitates real-time prototyping of the complete set-up, including the cable, after off-line simulation and before a commitment to silicon is made. Real-time prototyping has several benefits compared to off-line simulation:

• It enables algorithm verification on the real channel;
• It allows for more extensive testing under more varied conditions: measurements 24 hours a day are possible, as well as tests with transmitters placed at different locations;
• It gives faster feedback when modifying algorithmic settings: the prototype allows algorithmic parameters to be modified on the fly, without re-compilation, and their effects to be viewed in real time on the next received data packet.

We aim at developing a 16QAM modem for this upstream communication channel. The projected bit rate is 10 Mbit per second at a bit-error rate of 10^-10. The transmission payload consists of ATM cells and the multiple access protocol is TDMA. This type of multi-access protocol naturally fits the cell-based payload.

The next section explains the receiver architecture. Section 4 describes the rapid prototyping environment GRAPE and indicates how GRAPE was used to obtain first estimates of the achievable sample rate when the 16QAM receiver is implemented on 4 Digital Signal Processors (the TMS320C40).

3. The receiver architecture

The receiver structure ought to be the best compromise between low bit-error rate, short run-in sequences (this is particularly important because a burst mode system becomes very inefficient for long run-in sequences), possibility for digital integration and implementation cost. The solution has to be robust against the various channel impairments and should be able to cope with high dynamic ranges (30 dB). The general structure of the receiver is depicted in Figure 2.

Figure 2. Receiver architecture. [Block labels: AGC, phase, timing, IP, equalizer (FFE, FBE), slicer, ED + EC. ED: Error Detection; EC: Error Correction.]

Matched filtering is done by means of a fixed root raised cosine filter. The Automatic Gain Control then brings the signal back to the right level in order to avoid under- or overflow in subsequent sections of the demodulator. A phase loop compensates phase mismatches (which result in a rotation of the constellation diagram) as well as mismatches between the carrier frequencies of transmitter and receiver. Timing recovery is done by means of interpolation. After timing recovery the signal is down-sampled to symbol rate and finally the channel distortion is countered using an equaliser. Since the compensation of the group delay distortion is the only task of the equaliser, it converges quite rapidly, even with a slowly converging least mean squares (LMS) algorithm. Performance was improved using a decision feedback equaliser (DFE), where the decisions of the slicer are fed into the feedback part of the equaliser. If the decisions of the slicer are correct, the input of the feedback part is error free, which improves performance. Since the output of the slicer is only a few bits wide, a very cost-effective implementation is possible. The equaliser is de-coupled from the rest of the receiver structure by means of the first slicer, in order to avoid loop instability caused by interference. The error correcting mechanisms work with different time constants during the training and tracking phases. During training a fast acquisition is desirable, while in tracking only slow variations of the channel characteristics have to be compensated.

4. Prototyping with GRAPE

This section first describes the design flow of the prototyping environment GRAPE, developed at the K.U.Leuven. It then explains how GRAPE is used to prototype the 16QAM receiver and what sample rates may be expected on a target consisting of four TMS320C40 DSP processors.

4.1. GRAPE's design flow

GRAPE (Graphical RApid Prototyping Environment) is an environment, developed at our laboratory, which facilitates the real-time emulation and implementation of synchronous DSP applications on heterogeneous target platforms consisting of DSPs and FPGAs. Many aspects of GRAPE resemble the environments Ptolemy of UC Berkeley and COSSAP of RWTH Aachen, currently further developed by Synopsys; the main distinction is that GRAPE is targeted at real-time execution whereas the other environments mainly target simulation.

GRAPE's design flow consists of four phases. In the specification phase, the application is described using an extended data flow model, called cyclo-static data flow (CSDF), which is an extension of Lee's Synchronous Data Flow. In short, the application is represented as a directed graph G=(N,E), where the nodes N represent computation tasks, and the edges E the communication of the results (called tokens) from a producing to a consuming task. The functionality of the nodes is specified in a conventional high level language such as C or VHDL. The number of tokens a task produces and consumes during an execution phase is known at compile time, allowing for a compile time analysis of the graph in the next phases of GRAPE's design flow and leading to highly efficient run-time code. Still in GRAPE's specification phase, the target architecture is specified as a connectivity graph, with an indication of the amount and type of resources each processing device possesses. In the second phase, the amount of resources required by each of the tasks when executed on each of the processing devices is estimated. Next, the application is mapped onto the target hardware. In this phase, each task is assigned to a specific processing device, a communication path is established for each edge in the application's graph and a compile time schedule order is determined per device that minimises the total makespan. In GRAPE's fourth and last design phase, code in C or VHDL is generated for each of the processing devices, consisting of a main program and communication primitives. Note that a single design flow is used for software targets (DSPs) as well as for hardware targets (FPGAs).

4.2. Prototype of the 16QAM receiver

The target platform available for implementing the prototype of the receiver consists of 4 fully interconnected TMS320C40 processors, running at 40 MHz. The platform is built from two full-length ISA-bus PC cards. Both cards communicate with the host PC via dual ported RAM for program downloading and modification of algorithmic parameters.

First, the 16QAM receiver application is specified using GRAPE's graphical editor. By carefully inspecting Figure 2, we can coarsen the granularity of the application by merging sub-tasks wherever they are clearly sequential and cannot be pipelined. This reduces the amount of inter-task communication overhead. We end up with Figure 3, which is a screen-dump of the specification tool of GRAPE. The (dark grey) triangles represent algorithmic parameters that may be modified at run-time. We clearly see the matched filters (FIR), the automatic gain control (AGC), the phase loop (PHI), the symbol alignment (MU) and the equaliser (LMS).

Figure 3. High level specification of the QAM receiver.

The tasks of the 16QAM receiver as specified in Figure 3 have been automatically assigned to the 4-processor target hardware, as shown by the shade of grey of the task borders in the application window of Figure 4.

Figure 4. Schedule of the high level tasks.

Then, GRAPE ordered the tasks on each device in time, such that processor idle time is minimised. The bottom of Figure 4 shows this schedule. White tasks are application tasks; grey shaded tasks are inter-device communication primitives that were automatically generated by GRAPE. The obtained sample rate of 63 kHz corresponds to a symbol rate of 15,770 symbols per second. Table 1 compares the data throughput with simulation and ASIC implementation.

                      Symbols/s    Relative to real-time
ASIC                  2,500,000        1
Prototyping              15,770      159
Simulation (HP700)          500    5,000

Table 1. Estimated data rates.

As can be seen, real-time prototyping cannot be achieved. However, when the QAM protocol is implemented in burst mode and when more than 159 transmitters are present in a time-multiplexed way on the same CATV cable, real-time processing for one user can be obtained by buffering a complete burst and processing it while the bursts of the other users are on the cable.

It is expected that a substantial speed improvement can be obtained compared to the figure indicated above. The options will be investigated in the remainder of the project. A limited list of possibilities follows:

1. A very cheap improvement is to switch to 50 MHz processors.
2. The current implementation requires the copying of the results of one block into a software buffer before communication primitives copy them onto the hardware links. Careful scheduling can avoid these copying steps and the associated communication primitives. This would remove all shaded tasks in Figure 4.
3. The current C implementation is not optimized for the TMS320C40 DSP processor. The speed gain is unpredictable. Previous experience showed speed gains between 20% and 300%.
4. Timing critical tasks, especially those containing bit manipulations or extensive conditional processing, may be migrated to FPGAs. The speed gain is unpredictable. This migration reduces the flexibility and observability of the application, and requires a substantial amount of effort, since the C specification of the migrated sub-task needs to be re-written in register transfer level VHDL. This migration will hence only be done during the later stages of prototyping. Previous experience has shown speed gains up to 800%.

5. Conclusion

Although it does not achieve real-time sampling rates, prototyping is shown to be valuable for evaluating the interaction of new modem designs with the real channel, offering a speed-up of one to two orders of magnitude compared to workstation simulation. The use of advanced prototyping environments like GRAPE in combination with programmable target hardware makes prototyping hardly more expensive in development time and equipment than simulation.

6. Acknowledgements

R. Lauwereins and M. Moonen are Senior Research Associates with the NFWO. P. Vandaele is a Research Assistant of the IWT. Research supported by Siemens Atea and the Flemish Government via the Flemish Institute for the Advancement of Scientific-Technological Research in Industry (IWT). This project has partly been made possible by NFWO, ESA, Esprit (Retides, Dipsap-II) and Texas Instruments. K.U.Leuven-ESAT and Imec are members of the DSP Valley network.

7. References

[1] Comerford R. and Tekla S., “Wired for Interactivity”, IEEE Spectrum, April 1996, pp 21-28.
[2] Goldberg L., “Cable Modems: The Journey From Hype To Hardware”, Electronic Design, Apr. 1996, pp 65-80.
[3] Eng W., “IEEE Project 802.14: Standards for Digital Convergence”, IEEE Communications Magazine, May 1995, pp 20-23.
[4] Currivan B., “CATV Upstream Channel Model, Rev 1.0”, IEEE P 802.14 Working Group, June 1996.
[7] R. Lauwereins, M. Engels, M. Adé, J.A. Peperstraete, “Grape-II: A System-Level Prototyping Environment for DSP Applications”, IEEE Computer, Feb. 1995, pp 35-43.
[8] J. Buck, S. Ha, E.A. Lee, D.G. Messerschmitt, “Ptolemy: a Framework for Simulating and Prototyping Heterogeneous Systems”, Int. Journal of Computer Simulation, Vol. 4, Apr. 1994, pp 155-182.
[9] Synopsys Inc., 700 E. Middlefield Rd., Mountain View, CA 94043, USA, COSSAP User’s Manual.
[10] G. Bilsen, M. Engels, R. Lauwereins, J.A. Peperstraete, “Cyclo-Static Dataflow”, IEEE Trans. on Signal Processing, Feb. 1996.
[11] E.A. Lee, D.G. Messerschmitt, “Static Scheduling of Synchronous Data Flow Programs for Digital Signal Processing”, IEEE Trans. on Computers, Vol. C-36, No. 1, Jan. 1987, pp 24-35.
[12] G. Bilsen, M. Engels, R. Lauwereins, J.A. Peperstraete, “Compile-time Makespan-optimal Multi-resource Mapping for Hardware/Software Co-design”, KULeuven Tech. Report ESAT-ACCA 95-02.
[13] M. Adé, R. Lauwereins, J.A. Peperstraete, “Hardware-Software Co-design with GRAPE”, Proc. 6th Int. Workshop on Rapid System Prototyping, Chapel Hill, NC, USA, June 1995, pp 40-47.
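
The figures in Table 1 and the burst-mode argument of Section 4.2 can be cross-checked with a few lines of arithmetic. The following is a minimal sketch and not part of the original prototype: the function and variable names are illustrative assumptions, and the only inputs are numbers quoted in the paper (10 Mbit/s, 16QAM, a 63 kHz sample rate, and 15,770 and 500 symbols per second).

```python
import math

# 16QAM has 16 constellation points, so each symbol carries log2(16) = 4 bits.
BITS_PER_SYMBOL = int(math.log2(16))


def symbol_rate(bit_rate_bps, bits_per_symbol=BITS_PER_SYMBOL):
    """Symbol rate (symbols/s) needed to carry the given bit rate."""
    return bit_rate_bps / bits_per_symbol


def slowdown(real_time_rate, achieved_rate):
    """How many times slower than real time an implementation runs."""
    return real_time_rate / achieved_rate


# Numbers quoted in the paper.
real_time = symbol_rate(10_000_000)   # projected 10 Mbit/s upstream link
prototype = 15_770                    # symbols/s on the 4-DSP prototype
simulation = 500                      # symbols/s on the HP700 workstation
sample_rate_hz = 63_000               # sample rate obtained on the prototype

prototype_slowdown = slowdown(real_time, prototype)
simulation_slowdown = slowdown(real_time, simulation)
samples_per_symbol = sample_rate_hz / prototype
speedup_over_simulation = prototype / simulation

# In burst mode, one user's burst can be processed in real time if that user
# occupies at most 1/slowdown of the channel time, i.e. if the cable is shared
# by at least ceil(slowdown) time-multiplexed transmitters.
min_transmitters = math.ceil(prototype_slowdown)

print(f"real-time symbol rate : {real_time:.0f} symbols/s")
print(f"prototype slowdown    : {prototype_slowdown:.1f}")
print(f"simulation slowdown   : {simulation_slowdown:.0f}")
print(f"samples per symbol    : {samples_per_symbol:.2f}")
print(f"speed-up vs simulation: {speedup_over_simulation:.1f}")
print(f"transmitters needed   : {min_transmitters}")
```

Under these assumptions the sketch reproduces the ASIC symbol rate and the two ratios in the last column of Table 1. It also suggests that the prototype runs at roughly four samples per symbol, and that its speed-up over simulation falls within the "one to two orders of magnitude" stated in the conclusion.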