A New Method to Implement the CORDIC Algorithm

KHARRAT M. W., LOULOU M., MASMOUDI N., KAMOUN L.
Laboratoire d'Electronique et des Technologies de l'Information (LETI), Ecole Nationale d'Ingénieurs de Sfax, B.P. W 3038 Sfax, Tunisie
Tél: 216 4 27 40 88, Fax: 216 4 27 55 95, Email: Nouri.Masmoudi@enis.rnu.tn

Abstract: In this paper we present an optimisation of the CORDIC algorithm implementation, which mainly reduces silicon area occupation and gives good precision when calculating trigonometric functions such as sine and cosine. To validate the new method, an implementation of the angle decomposition equation in FPGA technology is presented. This approach shows a considerable area reduction and good precision for calculation resolutions below 20 bits.

I. Introduction

The CORDIC algorithm was proposed in 1959 by J. E. Volder [1] to facilitate trigonometric numerical calculations such as vector rotations. The CORDIC algorithm also supports arithmetic functions such as the Kalman filter and the Discrete Fourier Transform [3,4,5]. The algorithm describes how a rotation, presented by equation (1), can be executed by a finite number of elementary rotations called micro-rotations. The CORDIC algorithm described by Volder, using the Walther hypothesis [3,4], is given by the following set of equations:

$$
\begin{aligned}
x_{i+1} &= x_i - \delta_i\, y_i\, 2^{-i} \\
y_{i+1} &= y_i + \delta_i\, x_i\, 2^{-i} \\
\theta_{i+1} &= \theta_i - \delta_i \arctan 2^{-i}
\end{aligned}
\qquad (1)
$$

where $\delta_i = \pm 1$. After (n-1) iterations the basic equations of the CORDIC system become:

$$
\begin{aligned}
x_n &= K_1\,\{x_0 \cos\theta - y_0 \sin\theta\} \\
y_n &= K_1\,\{y_0 \cos\theta + x_0 \sin\theta\}
\end{aligned}
$$

The conventional CORDIC algorithm is presented by the block diagram of figure 1, and the angle decomposition equation by the block diagram of figure 2.

[Fig. 1. Block diagram of the CORDIC algorithm. Inputs X0, Y0; stage i combines X_i with Y_i 2^{-i} and Y_i with X_i 2^{-i} in add/subtract (A/S) units driven by δ_i; outputs Xn, Yn.]

The CORDIC algorithm continues to interest researchers dealing with applications that demand speed, minimum area and precision.

[Fig. 2. Block diagram of angle decomposition. Stage i combines θ_i with atan 2^{-i} in an A/S unit driven by δ_i, for i = 0 to n-1.]

In section II of this paper, we present methods already used to optimise the CORDIC algorithm, mainly the merged and truncated algorithms. The new method is described in section III.

II. Optimised methods for implementing the CORDIC algorithm

Previous works demonstrated that the CORDIC algorithm can be optimised in area and in execution time. We refer to two methods known as the merged and truncated algorithms [2,3,4].

The merged algorithm is based on combining two iterations into one, which reduces the number of iterations. The number of additions used in this algorithm is identical to that of the conventional algorithm [ ], but half of those additions are performed on (n-i) bits, where i corresponds to the ith iteration. We therefore gain in area occupation but lose in execution time.

The truncated algorithm is based on the observation that the first iterations of the conventional algorithm contribute most to the calculation. Because the micro-rotation angle θ_i decreases progressively as the iteration count increases, the last iterations can be replaced by a full rotation around a reduced angle obtained by executing a fixed number of pure rotations. This substitution reduces execution time and area occupation to the detriment of precision.

For these algorithms the performance problem still remains to be optimised. To remedy this, we propose a new method for performing angle decomposition.

III. New Method

We consider the number Δ, represented in binary by the vector Δ_0, Δ_1, …, Δ_{n-1} = δ_0, δ_1, …, δ_{n-1} (see figure 2), where δ_0 is the LSB of Δ and δ_i represents the sign of the ith iteration angle.

Note that the CORDIC algorithm converges in the interval ]-π/2, π/2[ and that trigonometric functions are symmetric, so our study is limited to the interval [0, π/2[.

By plotting the number Δ versus θ for different bit numbers n, we note that the corresponding curve is piecewise linear, as shown in figure 3. Furthermore, by plotting θ - C0 = f(θ) on the same plot as Δ = f(θ), we notice that the curve Δ = f(θ) is formed by four straight segments with the same slope as θ - C0 = f(θ).

[Fig. 3. Δ = f(θ), resolution 32 bits. Horizontal axis marked at π/4 and π/2; vertical axis from -π/2 to 0, with -π/4 and C0 marked.]

[Fig. 4. (θ - C0) = f(θ) and Δ = f(θ). Horizontal axis marked at 0, A1, A2 = π/4, A3 and π/2; the offsets C0, C1, C2 and C3 are marked between the two curves.]

Once the constants C0, C1, C2 and C3 are determined, the variation domains of θ are known and the value of Δ is deduced from the following equations:

$$
\begin{aligned}
0 < \theta < A_1 &: \quad \Delta = \theta - C_0 \\
A_1 < \theta < A_2 &: \quad \Delta = \theta - (C_0 - C_1) \\
A_2 < \theta < A_3 &: \quad \Delta = \theta - (C_0 - C_1 - C_2) \\
A_3 < \theta < \pi/2 &: \quad \Delta = \theta - (C_0 - C_1 - C_2 - C_3) \\
-A_1 < \theta < 0 &: \quad \Delta = \theta + C_0 \\
-A_2 < \theta < -A_1 &: \quad \Delta = \theta + (C_0 - C_1) \\
-A_3 < \theta < -A_2 &: \quad \Delta = \theta + (C_0 - C_1 - C_2) \\
-\pi/2 < \theta < -A_3 &: \quad \Delta = \theta + (C_0 - C_1 - C_2 - C_3)
\end{aligned}
$$

The numerical values of A1, A2, A3, C0, C1, C2 and C3 are:

A1 = 0.3217468 rad
A2 = 0.7853965 rad
A3 = 1.2490463 rad
C0 = 1.8273468 rad
C1 = 0.01025 rad
C2 = 0.063961 rad
C3 = 0.01025 rad

The new architecture is presented by the flow diagram of figure 5, in which the equations are given for a 10-bit number (θ = θ_9 … θ_0):

$$
\begin{aligned}
G_1 &= \theta_9 + \theta_8 + \theta_7 + \theta_6(\theta_5 + \theta_4(\theta_3 + \theta_2 + \theta_1\theta_0)) \\
G_2 &= \theta_9 + \theta_8 + \theta_7\theta_6(\theta_5 + \theta_4 + \theta_3(\theta_2 + \theta_1)) \\
G_3 &= \theta_9 + \theta_8(\theta_7 + \theta_6) \\
P_1 &= \theta_9\theta_8\theta_7(\theta_6 + \theta_5(\theta_4 + \theta_3\theta_2\theta_1)) \\
P_2 &= \theta_9\theta_8(\theta_7 + \theta_6 + \theta_5\theta_4(\theta_3 + \theta_2\theta_1\theta_0)) \\
P_3 &= \theta_9\theta_8\theta_7\theta_6
\end{aligned}
$$

$$
\begin{aligned}
S_0 &= M_2 \\
S_1 &= M_3 + M_1 M_2 \\
S_2 &= S_3 = S_5 = M_3 \\
S_4 &= M_2 + M_3 \\
S_6 &= M_3 \\
S_7 &= S_8 = 1 \\
S_9 &= 0
\end{aligned}
$$

For resolutions higher than 10 bits these equations must be generalised, and for resolutions lower than 10 bits they must be adapted.

[Figure 5: New architecture. Blocks shown: input θ_{n-1..0}; comparators < -A3, < -A2, < -A1, > A3, > A2, > A1 with outputs P3, P2, P1, G3, G2, G1; MUX with outputs M3, M2, M1; selection of C0, C0 - C1, C0 - C1 - C2 or C0 - C1 - C2 - C3 giving S_{n-1..0}; A/S unit; output Δ_{0..n-1}.]

IV. Implementations and Performances
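As a software point of reference for the comparison in this section, the sketch below models the two things being compared: the conventional iteration of equation (1) with its angle decomposition, and the region table of section III. It is a behavioural illustration under stated assumptions, not the FPGA design: it uses floating point instead of fixed-point words, the function names (`cordic_rotate`, `cordic_gain`, `offset_for`, `delta_from_table`) are hypothetical, and the mapping of θ ∓ offset onto the n-bit word Δ is not specified in the text and is therefore not modelled.

```python
import math

# Constants quoted in section III (radians).
A = (0.3217468, 0.7853965, 1.2490463)        # breakpoints A1, A2, A3
C = (1.8273468, 0.01025, 0.063961, 0.01025)  # C0, C1, C2, C3


def cordic_rotate(x0, y0, theta, n):
    """Apply n micro-rotations of equation (1) in floating point.

    Returns (x_n, y_n, deltas); deltas[i] is the sign (+1 or -1)
    chosen at iteration i, i.e. the angle decomposition of figure 2.
    Assumption: delta_i is the sign of the residual angle.
    """
    x, y, z = x0, y0, theta
    deltas = []
    for i in range(n):
        d = 1 if z >= 0 else -1
        shift = 2.0 ** -i
        # Right-hand side uses the old x and y (tuple assignment).
        x, y = x - d * y * shift, y + d * x * shift
        z -= d * math.atan(shift)
        deltas.append(d)
    return x, y, deltas


def cordic_gain(n):
    """Scale factor accumulated by n micro-rotations."""
    k = 1.0
    for i in range(n):
        k *= math.sqrt(1.0 + 4.0 ** -i)
    return k


def offset_for(theta):
    """Offset selected by the region of |theta| (C0, C0-C1, ...)."""
    off = C[0]
    for a_k, c_k in zip(A, C[1:]):
        if abs(theta) > a_k:
            off -= c_k
    return off


def delta_from_table(theta):
    """theta - offset for theta >= 0, theta + offset for theta < 0.

    Assumption: theta == 0 is treated as positive; the source table
    uses open intervals and does not cover the breakpoints.
    """
    off = offset_for(theta)
    return theta - off if theta >= 0 else theta + off


if __name__ == "__main__":
    theta = math.radians(45.0)
    for n in (8, 16, 24, 32):
        _, y, _ = cordic_rotate(1.0, 0.0, theta, n)
        err = abs(y / cordic_gain(n) - math.sin(theta))
        print(n, err)
```

The loop at the end only varies the iteration count. It does not reproduce word-length truncation, so its figures are not comparable with the measured precision curves.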
Figure 6 shows that he tproposed method CLB number make better precision compared to 500 conventional m e ethod and for r solution less than Of its. 20 course, bthe proposed CORDIC Method 400 solution could ebimproved by formulating for resolutions higher than 20 bits. 300 c The hoice is guided by reprinting with more segments than reviously, which p New method allows best precision to the detriment of 200 area. a By implementing the ngle decomposition 100 equation in FPGA technology, we poof that area measured in CLB number is lowered 0 c compared to the onventional algorithm 8 16 24 32 (figure7) and the latency time or execution Resolution in bits f time (delay) or the new method is lowered compared to the onventional Figure 7: CLB c Number versus bit number algorithm(figure 8). delay time(ns) Precision in % 10 Conventional of CORDIC Conventional CORDIC 1 0,1 0,01 New method 0,001 0,0001 New method 0,0000 1 0,0000 01 8 16 24 32 Bit Number Figure 8: delay time Figure :6. Precision versus of bit number for =45 References [1] VOLDER, « The CORDIC trigonometric V. Conclusion computing technique. » IRE Trans.on Electronic Computers. Vol.EC-8.no.3,pp.330-334,Sep. 1959. In this paper, we proposed an op timisation [2] GHARIANI Moez. « Int gration lectronique de of the CORDIC algorithm im plementation. la Transformation de Park en technologie FPGA The optimisation mainly reside in the Application : V hicule Electrique ». DEA implementation f angle o decomposition d’Electronique de l’ENIS SFAX D embre 1997. h i equation, which as a great nfluence on [3 ] J. S. WALTER. The unified algorithm for performances (area, precision and rapidity). elementary functions. In. Proc. AFIAPS Spring joint An implementation f the so-called o Computing Conf. Volume 38, pp 379, 385 1971. equation in FPGA technology is performed to alid the method and shows urface s v [ 4] Merged CORDIC Algorithm IEEE Transaction on Computer Vol C 29 pp 946 – 950 1990. 4