DSP Architectures for Future Wireless Base-Stations



Sridhar Rajagopal and Joseph Cavallaro

ECE Department

Rice University

April 10, 2000







This work is supported by Texas Instruments, Nokia, Texas Advanced Technology Program and NSF

Overview

• Future Base-Stations
• Current DSP Implementation
• Our Approach
  – Make Algorithms Computationally effective
  – Task Partitioning for pipelining, parallelism
• DSP/ARM Extensions for Performance Acceleration





4/10/00 TI Meeting 2

Evolution of Wireless Comm

First Generation
  Voice

Second/Current Generation
  Voice + Low-rate Data (9.6 Kbps)

Third Generation +
  Voice + High-rate Data (2 Mbps) + Multimedia
  W-CDMA






Communication System Uplink

[Figure: uplink from User 1 and User 2 to the Base Station over a Direct Path and Reflected Paths, with Noise + MAI added at the receiver]






Main Processing Blocks

[Figure: Baseband Layer of Base-Station Receiver, with blocks Channel Estimation, Detection, Decoding]










Proposed Base-Station

[Figure: TI's Wireless Basestation (http://www.ti.com/sc/docs/psheets/diagrams/basestat.htm), annotated "No Multiuser Detection"]






Real-Time Requirements

• Multiple Data Rates by Varying Spreading Factors
• Detection needs to be done in real time
  – 1953 cycles available in a C6x DSP at 250 MHz to detect 1 bit at 128 Kbps

Spreading Factor   Number of Bits / Frame   Data Rate Requirement
4                  10240                    1024 Kbps
32                 1280                     128 Kbps
256                160                      16 Kbps
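The cycle budget and the table rows on this slide follow from simple arithmetic, sketched below. The 10 ms frame duration is an assumption inferred from the table (10240 bits per frame corresponding to 1024 Kbps); it is not stated on the slide. The function names are illustrative only.

```python
# Assumed frame duration, inferred from the table (10240 bits <-> 1024 Kbps).
FRAME_MS = 10.0


def cycles_per_bit(clock_hz, bit_rate_bps):
    """Processor cycles available to detect one bit in real time."""
    return clock_hz / bit_rate_bps


def data_rate_kbps(bits_per_frame, frame_ms=FRAME_MS):
    """Bits per millisecond equals Kbps (1 Kbps taken as 1000 bits/s)."""
    return bits_per_frame / frame_ms


# Rows of the table: (spreading factor, bits per frame).
for spreading_factor, bits in [(4, 10240), (32, 1280), (256, 160)]:
    rate = data_rate_kbps(bits)
    budget = cycles_per_bit(250e6, rate * 1e3)
    print(spreading_factor, rate, int(budget))
```

At 250 MHz and 128 Kbps this gives the 1953 cycles per bit quoted on the slide; the budget shrinks as the spreading factor drops and the data rate rises.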




Current DSP Implementation

[Figure: Data Rate Comparisons for Matched Filter and Multiuser Detector. x-axis: Number of Users (9 to 15). y-axis: Data Rates Achieved (x 10^4, 0 to 18). Curves: Matched Filter (C64)*, Multiuser Detector (C64)*, Matched Filter (C67), Multiuser Detector (C67), Targeted Data Rate. Annotations: Targeted Data Rate = 128 Kbps; Projected (8x); C67 at 166 MHz.]








Complexity

• Algorithm Choice Limited by Complexity
  – Multistage reduces data rate by half.
• Main Features
  – Matrix-based operations
  – High levels of parallelism
  – Bit-level computations
• 32x32 problem size for the Detector shown
• Estimation and Decoding assumed pipelined.






Reasons

• Sophisticated, Compute-Intensive Algorithms
• Need more MIPS/FLOPS performance
• Unable to fully exploit pipelining or parallelism
• Bit-level computations / Storage










Our Approach

• Make algorithms computationally effective
  – without sacrificing error rate performance
• Task Partitioning on Multiple Processing Elements
  – DSPs: Core
  – FPGAs: Application Specific / Bit-level Computations
• VLSI Implementation to find extensions for DSPs.










Algorithms

• Channel Estimation
  – Avoid inversion by iterative scheme
• Detection
  – Avoid block-based detection by pipelining










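Computations Involved

• Model (bits $b_i$, $b_{i+1}$ and received vector $r_i$ shown on a time axis with delay)

  $r_i = A_i b_i + \eta_i$

  $b_i \in \mathbb{R}^{2K}$: bits of K async. users aligned at times i and i-1
  $r_i \in \mathbb{C}^{N}$: received bits of spreading length N for K users

  $R_{br} = \frac{1}{L} \sum_{i=1}^{L} b_i r_i^H$
  $R_{bb} = \frac{1}{L} \sum_{i=1}^{L} b_i b_i^T$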




Multishot Detection

Solve for the channel estimate, $A_i \in \mathbb{C}^{N \times 2K}$:

$R_{bb} A_i^H = R_{br}$, with $A_i = [A_0 \; A_1]$

Multishot Detection

$r = \mathcal{A} b + \eta$, with $\mathcal{A} \in \mathbb{C}^{DN \times DK}$

where $\mathcal{A}$ is a block-banded matrix built from $A_0$ and $A_1$ (zero elsewhere) and $b = [b_{1,1}, \ldots, b_{1,K}, \ldots, b_{D,1}, \ldots, b_{D,K}]^T$.


Differencing Multistage Detection

(y: soft decision; d: detected bits, hard decision)

• Stage 0 - Matched Filter
  $y^0 = \mathrm{Re}[A^H r]$, $S = \mathrm{diag}(A^H A)$
  $d^0 = \mathrm{sign}(y^0)$
• Stage 1
  $y^1 = y^0 - \mathrm{Re}[A^H A - S] \, d^0$
  $d^1 = \mathrm{sign}(y^1)$
• Successive Stages
  $x^l = d^l - d^{l-1}$
  $y^{l+1} = y^l - \mathrm{Re}[A^H A - S] \, x^l$
  $d^{l+1} = \mathrm{sign}(y^{l+1})$
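A minimal sketch of the differencing multistage update, compared against a conventional multistage detector that recomputes each stage from the matched-filter output. It assumes real-valued data (so Re[.] and the Hermitian transpose reduce to a plain transpose) and a noise-free toy channel; the matrix `A`, the received vector `r` and all function names are hypothetical and not taken from the slides.

```python
def sign(v):
    """Hard decision: +1 for non-negative soft values, -1 otherwise."""
    return [1 if x >= 0 else -1 for x in v]


def matvec(m, v):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matched_filter(a, r):
    """Stage 0 soft output y0 = A^T r (real-valued stand-in for Re[A^H r])."""
    return [sum(a[n][k] * r[n] for n in range(len(a))) for k in range(len(a[0]))]


def off_diagonal_gram(a):
    """A^T A with its diagonal S removed, i.e. A^T A - S."""
    k = len(a[0])
    g = [[sum(row[i] * row[j] for row in a) for j in range(k)] for i in range(k)]
    for i in range(k):
        g[i][i] = 0
    return g


def differencing_multistage(a, r, stages):
    """Update y with the change in decisions x = d_l - d_(l-1) at each stage."""
    m = off_diagonal_gram(a)
    y = matched_filter(a, r)
    d_prev = sign(y)                                   # d0
    y = [yi - ci for yi, ci in zip(y, matvec(m, d_prev))]
    d = sign(y)                                        # d1
    for _ in range(stages - 1):
        x = [dn - dp for dn, dp in zip(d, d_prev)]     # entries in {-2, 0, 2}
        y = [yi - ci for yi, ci in zip(y, matvec(m, x))]
        d_prev, d = d, sign(y)
    return d


def conventional_multistage(a, r, stages):
    """Reference form: every stage restarts from y0 and uses the full d."""
    m = off_diagonal_gram(a)
    y0 = matched_filter(a, r)
    d = sign(y0)
    for _ in range(stages):
        d = sign([yi - ci for yi, ci in zip(y0, matvec(m, d))])
    return d


# Toy example: 4 chips, 3 users, user 2 four times stronger than the others.
A = [[1, 4, 1], [1, -4, 1], [1, 4, -1], [1, 4, 1]]
r = [-2, 6, -4, -2]            # noise-free A b for b = [1, -1, 1]
print(sign(matched_filter(A, r)), differencing_multistage(A, r, 2))
```

In this toy case the matched filter gets the weak first user wrong and the first stage corrects it. Because x is zero wherever a decision did not change between stages, later stages have progressively less work to do, which is the attraction of the differencing form.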


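Iterative Scheme

Block computation and the corresponding iterative update:

$R_{bb} = \sum_i b_i b_i^T$ becomes $R_{bb} \leftarrow R_{bb} + b_L b_L^T - b_0 b_0^T$

$R_{br} = \sum_i b_i r_i^H$ becomes $R_{br} \leftarrow R_{br} + b_L r_L^H - b_0 r_0^H$

$R_{bb} A_i^H = R_{br}$ becomes $A \leftarrow A - \mu \, (A R_{bb} - R_{br}^H)$

• Tracking
• Method of Steepest Descent
• Stable convergence behavior
• Same Performance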
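The iterative scheme can be sketched as rank-one sliding-window updates of the correlation matrices followed by steepest-descent steps in place of a matrix inversion. The sketch below assumes real-valued data (Hermitian transpose becomes transpose), iterates on X = A^H so that each step is X <- X - mu (R_bb X - R_br), and uses an invented 2x2 channel `A_TRUE`, an invented bit sequence and a hand-picked step size mu = 0.25. None of these values or function names come from the slides.

```python
def outer(u, v):
    """Outer product u v^T of two real vectors."""
    return [[ui * vj for vj in v] for ui in u]


def add_scaled(m, p, s):
    """Return m + s * p for two equally sized matrices."""
    return [[a + s * b for a, b in zip(rm, rp)] for rm, rp in zip(m, p)]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def correlations(bits, received):
    """Block form: R_bb = sum b b^T and R_br = sum b r^T over the window."""
    n_b, n_r = len(bits[0]), len(received[0])
    rbb = [[0.0] * n_b for _ in range(n_b)]
    rbr = [[0.0] * n_r for _ in range(n_b)]
    for b, r in zip(bits, received):
        rbb = add_scaled(rbb, outer(b, b), 1.0)
        rbr = add_scaled(rbr, outer(b, r), 1.0)
    return rbb, rbr


def slide_window(rbb, rbr, b_old, r_old, b_new, r_new):
    """Iterative form: add the newest (b, r) pair and drop the oldest."""
    rbb = add_scaled(add_scaled(rbb, outer(b_new, b_new), 1.0), outer(b_old, b_old), -1.0)
    rbr = add_scaled(add_scaled(rbr, outer(b_new, r_new), 1.0), outer(b_old, r_old), -1.0)
    return rbb, rbr


def descent_step(x, rbb, rbr, mu):
    """One steepest-descent step on R_bb X = R_br, where X stands for A^H."""
    residual = add_scaled(matmul(rbb, x), rbr, -1.0)   # R_bb X - R_br
    return add_scaled(x, residual, -mu)


# Hypothetical toy data: noise-free r_i = A b_i for a 2x2 real channel.
A_TRUE = [[1.0, 0.5], [-0.5, 2.0]]
BITS = [[1, 1], [1, 1], [1, 1], [1, -1], [-1, 1]]
RECEIVED = [[sum(a * b for a, b in zip(row, bit)) for row in A_TRUE] for bit in BITS]

rbb, rbr = correlations(BITS[:4], RECEIVED[:4])
estimate = [[0.0, 0.0], [0.0, 0.0]]                    # estimate of A^H
for _ in range(60):
    estimate = descent_step(estimate, rbb, rbr, mu=0.25)
print(estimate)   # approaches the transpose of A_TRUE
```

The step size must be small enough for the iteration to converge (below 2 divided by the largest eigenvalue of R_bb); 0.25 was chosen by hand for this window. When tracking a slowly varying channel, the previous estimate serves as the starting point after each `slide_window` call, so only a few descent steps are needed per update.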






Simulations - AWGN Channel

[Figure: Comparison of Bit Error Rates (BER). x-axis: Signal to Noise Ratio (SNR), 4 to 12. y-axis: BER, 10^-3 to 10^-1. Curves: MF, ActMF, ML, ActML. Legend: MF: Matched Filter; ML: Maximum Likelihood; ACT: using inversion. Complexity annotations: O(K^2 N) and O(K^3 + K^2 N). Parameters: Detection Window = 12, SINR = 0, Paths = 3, Preamble L = 150, Spreading N = 31, Users K = 15, 10000 bits/user.]




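Block Based Detector

[Figure: block-based detection with a 12-bit window]

Window 1: bits 1-12 pass through Matched Filter, Stage 1, Stage 2, Stage 3; bits 2-11 are detected
Window 2: bits 11-22 pass through Matched Filter, Stage 1, Stage 2, Stage 3; bits 12-21 are detected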
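The window bookkeeping of the block-based detector can be sketched as follows: each 12-bit window yields only its 10 interior bits, so consecutive windows overlap by two bits. The function names and the `edge` parameter are illustrative, not from the slides.

```python
def block_windows(total_bits, window=12, edge=1):
    """List (window_first, window_last, out_first, out_last), 1-indexed.

    Each window discards `edge` bits at both ends, so it advances by
    window - 2 * edge bits.
    """
    step = window - 2 * edge
    windows = []
    start = 1
    while start + window - 1 <= total_bits:
        last = start + window - 1
        windows.append((start, last, start + edge, last - edge))
        start += step
    return windows


def redundancy(window=12, edge=1):
    """Bits processed per bit detected."""
    return window / (window - 2 * edge)


print(block_windows(22))   # windows 1-12 and 11-22, detecting bits 2-11 and 12-21
print(redundancy())
```

With a 12-bit window every stage processes 12 bits to deliver 10, so about 20% of the per-stage work is repeated in the next window. This is the overhead that pipelining the stages is meant to avoid.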




Pipelined Detector

[Figure: Matched Filter, Stage 1, Stage 2 and Stage 3 each process bits 1-12 in sequence, arranged as a pipeline]
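A minimal sketch of how a pipelined schedule might look, assuming each stage lags the previous one by exactly one bit period. That one-step offset is an assumption made for illustration; the figure does not give the actual offset, and a real detector may need a larger lag so that neighbouring bits are available. Stage 0 denotes the matched filter.

```python
def pipeline_schedule(num_bits, num_stages):
    """Map each time step to the (stage, bit) pairs active in that step."""
    schedule = {}
    for bit in range(1, num_bits + 1):
        for stage in range(num_stages + 1):        # stage 0 = matched filter
            schedule.setdefault(bit + stage, []).append((stage, bit))
    return schedule


schedule = pipeline_schedule(num_bits=12, num_stages=3)
for step in sorted(schedule):
    print(step, sorted(schedule[step]))
```

Once the pipeline is full, every stage is busy in every step and each bit is processed exactly once per stage, in contrast to the overlapping windows of the block-based detector.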










Task Decomposition [Asilomar99]

[Block diagram, flattened in extraction; recoverable content:]

Channel Estimation
  Block I, Correlation Matrices (Per Bit): Rbr[R], O(KN); Rbr[I], O(KN); Rbb, O(K^2)
  Block II, Inverse: Rbb A^H = Rbr[R], O(K^2 N); Rbb A^H = Rbr[I], O(K^2 N)

Multistage Detector
  Block III, Matrix Products: A0^H A1, O(K^2 N); A0^H A0, O(K^2 N); A1^H A1, O(K^2 N)
  Block IV, Multistage Detection (Per Window): O(D K^2 Me), output d

Matched Filter
  A^H r, O(KND), from Data

Signals routed through MUX blocks: Pilot, Data', b, d




Achieved Data Rates

[Figure: Data Rates for Different Levels of Pipelining and Parallelism. x-axis: Number of Users (9 to 15). y-axis: Data Rates (x 10^5, 0 to 3). Curves: (Parallel A) (Parallel+Pipe B); (Parallel A) (Pipe B); (Parallel A) B; AB; Sequential A + B. Reference line: Data Rate Requirement = 128 Kbps.]


VLSI Implementation

• Channel Estimation as a Case Study
• Area-Time Efficient Architecture
• Real-Time Implementation
• Bit-Level Computations - FPGAs
• Core Operations - DSPs








DSP Extensions for Performance

• Bit-level storage / processing support
  – Registers / Memory / ALU
• Efficient Matrix-Based operations
  – Matrix-Vector Multiply
• Support for Complex-valued data
• Efficient memory accesses
• Pre-fetching Data - C64
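To illustrate why bit-level storage and processing support matters, the sketch below packs +1/-1 bits into a single integer and computes a dot product with one XOR and a population count. Dot products of bit sequences are the kind of operation behind entries of a correlation matrix such as R_bb. This is a software illustration only, with hypothetical function names; it does not describe any actual DSP instruction.

```python
def pack_bits(symbols):
    """Pack +1/-1 symbols into an int: bit i is set when symbols[i] == +1."""
    word = 0
    for i, s in enumerate(symbols):
        if s == 1:
            word |= 1 << i
    return word


def packed_dot(word_a, word_b, length):
    """Dot product of two packed +1/-1 vectors.

    Positions where the bits differ contribute -1 and the rest +1, so the
    result is length - 2 * (number of differing bits).
    """
    disagreements = bin(word_a ^ word_b).count("1")
    return length - 2 * disagreements


a = [1, -1, 1, 1, -1, -1, 1, -1]
b = [1, 1, -1, 1, -1, 1, 1, -1]
print(packed_dot(pack_bits(a), pack_bits(b), len(a)))   # same as sum(x * y)
```

Stored this way, eight bits occupy one byte instead of eight words, and a single XOR plus a bit count replaces eight multiply-accumulates.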




Use of ARM Core

• Work on Higher Base Station Layers

Processor   OSI Layer(s)   Function
ARM         3-7            User Interface, Translation, Synchronization, Transport, Network
            2              Data Link Layer (Converts Frames to Bits)
DSP         1              Physical Layer (hardware; raw bit stream)




Software Suggestions

• Limited OS Support
• Compiler Efficiency
  – No more Assembly!
• Performance Analysis Tools
• Code Composer Studio 1.2










Conclusions

• DSPs to play a major role in Future Base-Station Implementations.
• Search for Computationally Efficient Algorithms and Better Processor Designs to meet Real-Time requirements.











