 AN1442 APPLICATION NOTE
Signal Processing with ST10-DSP
By Andre ROGER and Charles AUBENAS
By providing ST10 micro-controllers with an enhanced 16-bit DSP co-processor, STMicroelectronics proposes a powerful solution for a wide range of applications. ST10 micro-controllers that include a DSP unit are identified by a "2" as the second digit of the variant code, as in ST10F269. Most of the algorithms encountered in signal processing, such as data acquisition (average, MAX/MIN), signal processing control (PID, PD) and filtering (FIR, IIR), can easily be run on the ST10-DSP micro-controllers. Combined with the well-known real-time performance of the ST10-MCU core, the ST10-DSP efficiently supports demanding applications such as electronic steering, suspension, engine control and airbags, among others.
The goal of this application note is to describe the ST10-DSP features and tools. It explains how to use and program the ST10-DSP co-processor for common signal processing algorithms, with a set of dedicated examples.
Additional documentation :
- ST10F2xx product data sheets
- ST10 family programming manual
- ST10F269 user's manual
- Various application notes
Figure 1 : ST10-DSP Architecture
[Block diagram. Blocks shown: dual-port internal RAM; data buses; ST10-DSP CPU with new addressing features (IDX0, IDX1, QX0, QR0, QX1, QR1); external memory; program memory (program code, operands); peripheral interface; MAC co-processor (16 x 16 multiplier, 40-bit ALU, shifter, repeat unit, 40-bit accumulator) with registers MCW, MAL, MRW, MAH and MSW.]
November 2001
TABLE OF CONTENTS
1 INTRODUCTION
2 ST10F2XX DSP FEATURES
2.1 HARVARD ARCHITECTURE
2.2 MULTIPLY AND ACCUMULATE
2.3 MINIMUM AND MAXIMUM
2.4 INSTRUCTION REPEAT UNIT
2.5 BARREL SHIFTER
2.6 DSP ADDRESSING MODES
2.6.1 Index Registers
2.6.2 Offset Registers
2.7 REAL TIME ASPECTS
2.8 ST10 INTRINSIC BENCHMARKS
3 DEVELOPING DSP FUNCTIONS
3.1 DEVELOPING YOUR OWN DSP FUNCTIONS
3.1.1 DSP Functions in Full Assembly
3.1.2 DSP Functions in Mixed "C" and Assembly
3.1.2.1 Example of Mixed "C" and Assembly for a Data Acquisition Routine
3.1.3 User Defined Intrinsics
3.1.4 Tasking Support of ST10-DSP
3.2 ST10-DSP HINTS
3.2.1 Instruction Scheduling
3.2.2 DSP Loops
3.2.3 Memory Mapping
3.2.4 Enhanced 32-bit Arithmetic with ST10-DSP Unit
3.2.5 Development Tools
4 ARCHITECTURAL ADVANTAGES OF ST10F2XX AS A DSP/MCU
4.1 SUMMARY
5 ST10-DSP PROGRAMMING EXAMPLES
5.1 INITIALIZATION
5.2 MATHEMATICS
5.2.1 Double Precision Multiplication
5.2.2 Nth Order Power Series
5.2.3 [NxN][Nx1] Matrix Multiply
5.2.4 N-real Multiply (windowing)
5.3 FIR FILTER-REAL CORRELATION-CONVOLUTION
5.3.1 Multiple Precision FIR Filter
5.4 IIR FILTERS
5.4.1 Nth Order IIR Filter: Direct Form 1
5.4.2 Nth Order IIR Filter: Direct Form 2
5.4.3 N-Cascaded Real Biquads (Direct Form 2)
5.4.4 N-cascaded Real Biquads: Transpose Form
5.5 LMS ADAPTIVE FILTER
5.5.1 Single-Precision LMS Adaptive Filter
5.5.2 Extended-Precision LMS Adaptive Filter
5.6 OPERATIONS ON TABLES
5.6.1 Detection of the Minimum or Maximum in a Collection of Samples
5.6.2 Computing the Sum of a Collection of Samples
5.6.3 Search for an Element Within a Collection of Samples
5.6.4 Table Move
5.6.5 Find the Index of a Maximum Value in a Table
5.6.6 Compare for Search
5.7 SUMMARY OF ROUTINES
6 APPLICATION NOTE VERSION INFORMATION
1 - INTRODUCTION
This application note gathers information and software examples that help take better advantage of the ST10F2XX DSP capabilities. The second chapter reviews the features and key points of the different parts of the ST10-DSP. The third chapter details the ways to develop DSP functions using the ST10 programming tools, completed with some hints related to the tools and to the ST10-DSP. Chapter four highlights the ST10F2XX DSP advantages and strengths. Chapter five completes the application note with various programming examples such as matrix multiply, FIR and LMS routines, and operations on tables.
2 - ST10F2XX DSP FEATURES
ST10F2xx is a combined CPU and DSP :
- As a CPU, it is a powerful real-time oriented 16-bit CPU,
- As a DSP, it is a single-MAC unit with a 16 by 16-bit multiplier and a 40-bit accumulator.
The key features of the ST10-DSP are explained in the following sections. For details on the ST10 instruction set and related information, refer to the "ST10 FAMILY PROGRAMMING MANUAL", especially chapter 3 - "MAC INSTRUCTION SET".
2.1 - Harvard Architecture
The ST10-DSP is a Harvard architecture implementation. On every CPU-cycle it allows :
- 1 opcode fetch,
- 2 operand reads,
- 1 optional operand write.
On the other hand, the ST10 core is based on a unified memory organization : code and data share the same linear addressing space.
2.2 - Multiply and Accumulate
The ST10-DSP supports different multiply and accumulate instructions with several addressing modes. With CoMAC [IDXy+], [Rz+], in a single cycle and in addition to the multiply-accumulate itself, the ST10 is :
- Fetching 2 operands addressed by IDXy and Rz,
- Updating 2 pointers (increment, decrement, add an index).
A Multiply and Subtract instruction ("CoMAC-") is also available with the same addressing modes as the Multiply and Accumulate.
CoMACM is similar to CoMAC except that the ST10 also moves 1 operand towards the bottom of the table (useful for circular buffers in data acquisition routines). The following figure illustrates the behavior of the CoMACM [IDX1+], [R9+] instruction.
Figure 2 : Operand parallel move with CoMACM instruction
[Figure. Operand table addressed by IDX1, at addresses n+4, n+2, n, n-2 : before = oper 4, oper 3, oper 2, oper 1 ; after = oper 4, oper 3, oper 2, oper 2. Coefficient table addressed by R9 : coef 4, coef 3, coef 2, coef 1, unchanged. The operand and the coefficient are multiplied and the product is added to the 40-bit accumulator.]
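The effect shown in Figure 2 can be sketched as a small Python model. This is only an illustration of the documented behavior (multiply-accumulate plus a parallel operand move), not a cycle-accurate or bit-accurate model. The function name `comacm_step`, the use of list indices in place of byte pointers, and the choice of which coefficient R9 points to are assumptions made for the example; MP and MS are assumed cleared, so no fractional shift or saturation is applied.

```python
def comacm_step(mem, idx, coef, r, acc):
    """Model one CoMACM [IDX+], [R+] step on word-indexed lists.

    mem and coef are lists of 16-bit signed integers; idx and r are list
    indices standing in for the byte pointers IDX1 and R9.
    Returns the updated (idx, r, acc).
    """
    acc += mem[idx] * coef[r]      # multiply and accumulate
    mem[idx - 1] = mem[idx]        # parallel move: copy the operand one slot down
    return idx + 1, r + 1, acc     # post-increment both pointers


# oper1..oper4 and coef1..coef4, lowest address first (hypothetical values)
samples = [10, 20, 30, 40]
coefs = [1, 2, 3, 4]
idx, r, acc = comacm_step(samples, 1, coefs, 1, 0)
```

After the call, `oper 1` has been overwritten by `oper 2`, as in the "After" column of the figure, and both pointers have advanced by one word.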
2.3 - Minimum and Maximum
The ST10-DSP has 2 instructions for the detection of minimum and maximum (CoMIN and CoMAX). These instructions are generally used for saturating arithmetic.
Hint : Combined with auto-incrementation, these instructions allow the ST10 to scan a table of samples and detect either the minimum or the maximum in one CPU-cycle per sample.
2.4 - Instruction Repeat Unit
Each instruction can be repeated either a fixed number of times (immediate value) or a variable number of times. Register MRW is used to repeat a variable number of times. Repeat sequences can be interrupted. This allows the ST10-DSP to compute a FIR (16 by 16; real) at a rate of 1 tap per CPU-cycle.
2.5 - Barrel Shifter
The ST10-DSP has a barrel shifter connected to the accumulator : any result or value loaded into the accumulator can be shifted. The shift value must be between 0 and 8 (inclusive).
2.6 - DSP Addressing Modes
New addressing modes and their associated registers have been added : 2 index pointers and 4 offset registers.
2.6.1 - Index Registers
The new index registers (IDX0 and IDX1) give more flexibility in operand addressing : they allow double indirect addressing. When the ST10-DSP addresses 2 operands, 1 data-pointer shall come from the register bank (R0 through R15) and the other pointer must come from the index registers. All pointer post-modifications available with GPR addressing are also available with index addressing (see section 2.6.2).
Restriction : The use of the IDX index registers is limited to internal DPRAM addressing, except for the CoMOV instruction.
2.6.2 - Offset Registers
4 offset registers (QX0, QX1, QR0, QR1) have been added to enhance the ST10 capability in handling matrices. They are useful when doing computations on matrix lines and/or columns. As a result, 5 different pointer modifications are available with the ST10-DSP :
- No pointer modification : ex : [IDX0], [R1]
- Pointer post incrementation : ex : [IDX1+], [R2+]
- Pointer post decrementation : ex : [IDX0-], [R3-]
- Pointer post incrementation by offset : ex : [IDX0 + QX0], [R1 + QR0]
- Pointer post decrementation by offset : ex : [IDX1 - QX0], [IDX1 - QX1], [R2 - QR1]
Pointer post-modifications are done in parallel with the execution of the instruction.
2.7 - Real time Aspects
The ST10-DSP is both a real-time CPU and a DSP. Any DSP code developed for the ST10-DSP can be interrupted at any time (including during repeat sequences) and its execution resumed after the interrupt routine. During the interrupt, bit MR remains set to indicate that a repeated instruction has been interrupted.
- Latency : there is no added latency on interrupts when DSP functions are used.
- Interrupt routine requirements : the MAC registers must be saved at the entry point and restored at the exit of any interrupt routine that uses DSP functions and may interrupt a DSP function. This is the single constraint in using the ST10-DSP with interrupts. The Tasking tool chain can do this automatically with "#pragma savemac" on each task using DSP functions (for details, refer to the Tasking user's manual).
2.8 - ST10 intrinsic Benchmarks
The following table shows ST10 intrinsic benchmarks in DSP algorithms.
Table 1 : ST10 Intrinsic Benchmarks in DSP Algorithms
Routine | Instruction Cycles | Program Words
Mathematics | |
32 by 32 signed multiplication | 12 | 24
Nth Order Power Series | 9N/2+3 | 22
[NxN][Nx1] Matrix Multiply | N^2+5N+5 | 24
N-Real Multiply (Windowing) | 2N+3N/URF 1) | 5+4.URF
DSP Routines 2) | |
32x16 L-tap FIR | 2L+3 | 18
Nth Order IIR filter DF1 3) | 2N | 10
Nth Order IIR filter DF2 4) | 2N+1 | 12
N-cascaded Biquads DF2 | 11N-1 | 19
N-cascaded Biquads TF 5) | 14N-15 | 24
16x16 L-tap LMS | 4L+2(L-2)/URF+1 | 51+2(URF-1)
32x16 L-tap LMS | 6L+2(L-2)/URF+20 | 71+2(URF-1)
Operations on Tables | |
Table Move (L items) | L+3 | 8
Find the Index of a Maximum Value in a table (L items) | 3L/2+10 6) | 21
"Compare For Search" 7) (L items) | L/2+7 8) | 13
Notes:
1. "URF" stands for "UnRolling Factor".
2. Representative part of the routine only.
3. Direct Form 1.
4. Direct Form 2.
5. Transpose Form.
6. On average.
7. First data in a table that matches a specified condition.
8. On average.
3 - DEVELOPING DSP FUNCTIONS
This chapter explains how to develop DSP functions with the ST10F2xx. It covers both ST10-DSP programming aspects and tool chain aspects.
3.1 - Developing Your Own DSP Functions
With the Tasking tool chain, there are 3 different ways of developing a DSP function.
3.1.1 - DSP Functions in Full Assembly
New functions can be developed in full assembly. They need to be compatible with the EABI defined by Tasking (see Tasking documentation). The ST10-DSP library has been developed in this way. The new DSP functions can then either be collected in a library or have their assembly sources included in the Tasking project.
3.1.2 - DSP Functions in Mixed "C" and Assembly
Tasking pseudo registers can be used to develop the user's dedicated DSP functions. This is the recommended way of developing new DSP functions for the ST10. This way of programming is :
- Ideal for short functions (typically a loop of 1 or 2 instructions).
- More portable than full assembly functions : register allocation is done by the compiler depending on the available registers, and the code is independent of the memory model.
Such a function can also be advantageously inlined by using the keyword "_inline".
3.1.2.1 - Example of Mixed "C" and Assembly for a Data Acquisition Routine
The following example detects the maximum value within a table of samples (16-bit unsigned format).
_inline int max_table( int *a, int nb_samp)
{
    int retval;
#pragma asm( @w1=*a, @w2=nb_samp )
    EXTERN XVAL:WORD, BVAL:BYTE, YVAL:WORD
    MOV   @1, @w1      ; initialise register to point to the beginning of the table
    MOV   @2, @w2      ; temporary register
    MOV   MAL, #0      ; initialise accumulator to the minimum possible value
                       ; (table of unsigned int assumed)
    MOV   MAH, #0
    MOV   MAE, #0
    MOV   MRW, @2      ; load repeat counter with nb_samples
    MOV   @2, #0
LOOP1:
    Repeat MRW times CoMAX @2, [@1+]   ; perform comparison MRW times.
    MOV   @w1, MAH
#pragma endasm( retval=@w1 )
    return retval;
}
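For checking results on a host machine, the intended behavior of the routine above can be written as a short Python reference. This is a sketch of what the text says the routine computes (the maximum of a table of unsigned 16-bit samples, starting from an accumulator of zero), not a model of the pseudo registers or of the CoMAX operand concatenation; the range check is an assumption added for the example.

```python
def max_table(samples):
    """Return the largest value in a table of unsigned 16-bit samples."""
    acc = 0                          # accumulator starts at the minimum possible value
    for s in samples:                # one CoMAX per sample in the assembly version
        if not 0 <= s <= 0xFFFF:
            raise ValueError("sample out of 16-bit unsigned range")
        acc = max(acc, s)
    return acc
```

An empty table returns 0, which mirrors the accumulator initialisation in the assembly.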
3.1.3 - User Defined Intrinsics
The Tasking tool chain allows the creation of user-defined intrinsics. For details please refer to the Tasking documentation (m_c166.pdf).
3.1.4 - Tasking Support of ST10-DSP
Tasking is developing new intrinsics made specifically to take advantage of the ST10-DSP. These intrinsics are implemented starting from tool chain V7.0 r5.
3.2 - ST10-DSP hints
3.2.1 - Instruction Scheduling
All ST10-DSP instructions are executed in 1 instruction cycle (2 CPU clock cycles) without any latency. There is no automatic interlock between the DSP unit and the CPU part. As a consequence, 1 dummy instruction shall be inserted between :
- 1 instruction doing pointer initialisation (IDX0/1, QX0/1 and QR0/1) and a DSP instruction using it,
- 1 instruction in the DSP (CoXXX) and a CPU instruction that reads from the DSP : ex : between CoMAC and a compare instruction on MCW.
These are the only constraints in ST10-DSP programming.
3.2.2 - DSP Loops
The ST10 repeat unit is limited to 1 instruction. Bigger loops are built with a loop counter held in an ST10 GPR, using the instruction "CMPD1 reg, #0" together with "JMPR cc_NE, Loop-label".
Hint : it is usually better to unroll loops to get better performance, as long as only 1 accumulator is needed.
3.2.3 - Memory Mapping
The ST10-DSP can read 2 operands from memory in parallel provided they are properly mapped. The "X and Y" memories of the ST10 are the Dual-Port RAM (DPRAM) : the ST10 internal DPRAM has 2 independent read/write ports which allow parallel read and write operations without delay.
=> all operands and coefficients shall be mapped in the ST10's DPRAM.
Hint : when the ST10-DSP functionality is used, it is recommended to limit DPRAM usage to the stack, register banks for context switches, variables in bit format (the bit-addressable area is in the DPRAM), and variables for the DSP unit.
3.2.4 - Enhanced 32-bit Arithmetic with ST10-DSP Unit
The concatenation unit enables the MAC unit to perform a 32-bit operation in 1 CPU-cycle. It concatenates two 16-bit operands into a 32-bit operand before the 32-bit operation is executed in the 40-bit adder/subtracter. The second operand is always the current accumulator content.
This feature can be used for specific 32-bit arithmetic :
- 32-bit arithmetic on tables (sum, comparison),
- custom arithmetic : a multiplication followed immediately by a few extended precision corrections (ex : addition, subtraction, shift) before storing the result.
3.2.5 - Development Tools
Code Debugging
The algorithms developed for the ST10-DSP can be debugged :
- without any hardware : Tasking Crossview functional simulator,
- with an evaluation board : ex : FORTH ST10F269 evaluation board,
- with an emulator.
Code Optimisation
The best tool for code optimisation is the emulator. It allows the user to :
- trace each instruction, the addresses and the values of the operands,
- time stamp each instruction, to verify the performance of the routine.
4 - ARCHITECTURAL ADVANTAGES OF ST10F2XX AS A DSP/MCU
The efficiency of a DSP micro-controller depends on several key features, all fully supported by the ST10-DSPs :
- A hardware multiply and accumulate unit with a 40-bit accumulator (8 guard bits).
- A DSP Harvard architecture allowing a simultaneous program instruction fetch, 2 operand fetches and 1 operand write per cycle.
- An address generation unit providing zero-overhead address updates for the 2 indirect pointers to be used in the next processor cycle. This unit supports double indirect addressing with 2 different offset registers for efficient matrix addressing.
- An on-chip memory architecture that provides zero-penalty accesses to program and data, resulting in high throughput even in the automotive temperature range (see the relevant ST10F2xx product datasheets).
- A choice of proven automotive software components like CAN drivers and OSEK operating systems.
- On-chip DMA (8 channels) to handle automated and sophisticated acquisitions (time based, position based) without software overhead.
- A powerful interrupt controller handling up to 64 interrupt sources with 16 different interrupt levels.
- A proven debugging environment based on real-time emulation, featuring real-time trace of DSP/MCU registers.
4.1 - Summary
The ST10F2xx provides a solid platform for motor control and other control applications in a real-time environment. Its balance of DSP and MCU features allows the ST10F2xx to reach a performance in real applications (interrupts, scaling of results, more variables to handle, such as calibration variables) similar to its performance in benchmarks. The preferred ST10-DSP algorithms are data acquisition (average, MAX/MIN), signal processing oriented control (PID, PD) and filtering (FIR, IIR).
For automotive applications, the differentiating factors of the ST10F2xx compared to other DSP/MCUs are its availability over the entire automotive temperature range and a wide choice of OSEK kernels, certified CAN drivers and communication stacks (TP, NM).
5 - ST10-DSP PROGRAMMING EXAMPLES
5.1 - Initialization
This routine shows an example of initialisation of the DSP registers :
;
; Control Registers Initialization.
;
    MOV   MCW, #mcw       ; (MCW) <- mcw.
    MOV   MRW, #mrw       ; (MRW) <- mrw.
;
; Accumulator Initialization.
;
    MOV   MAH, #data16    ; (MAH) <- #data16,
                          ; (MAE) <- 8 times (MAH.15),
                          ; (MAL) <- 0000h.
;
; Core SFRs Initialization.
;
    MOV   IDX0, #idx0     ; (IDX0) <- idx0.
    MOV   IDX1, #idx1     ; (IDX1) <- idx1.
    EXTR  #4              ; Next 4 instructions will utilize the ESFR space.
    MOV   QX0, #qx0       ; (QX0) <- qx0.
    MOV   QX1, #qx1       ; (QX1) <- qx1.
    MOV   QR0, #qr0       ; (QR0) <- qr0.
    MOV   QR1, #qr1       ; (QR1) <- qr1.
Instruction Cycles Total 10
Program Words 19
ST10-DSP registers are handled like SFRs : all addressing modes available for SFRs can be used to initialise the ST10-DSP : immediate (see above), register, memory.
Reminder : after the initialisation of DSP registers, a dummy instruction shall be inserted to take the pipeline effect into account.
5.2 - Mathematics
5.2.1 - Double Precision Multiplication
This routine assumes that:
- XL (LSW) and XH (MSW) are stored in R0 and R1, respectively.
- YL (LSW) and YH (MSW) are stored in R2 and R3, respectively.
- MP and MS are cleared.
It performs P=X*Y. After computation, the 64-bit product P is stored in R4-R7, where R7 contains the most significant word and R4 the least significant word.
;
; XL*YL multiplication (unsigned)
;
    CoMULu   R0, R2       ; (ACC) <- XL*YL.
    CoSTORE  R4, MAL      ; (R4) <- (ACC)L.
;
; XL*YH multiplication (unsigned/signed) and XH*YL multiplication (signed/unsigned).
;
    CoSHR    #8           ; (ACC) <- (ACC) >> 8.
    CoSHR    #8           ; (ACC) <- (ACC) >> 8.
    CoMACus  R0, R3       ; (ACC) <- (ACC) + XL*YH.
    CoMACsu  R1, R2       ; (ACC) <- (ACC) + XH*YL.
    CoSTORE  R5, MAL      ; (R5) <- (ACC)L.
;
; XH*YH multiplication (signed/signed)
;
    CoASHR   #8           ; (ACC) <- (ACC) >>a 8.
    CoASHR   #8           ; (ACC) <- (ACC) >>a 8.
    CoMAC    R1, R3       ; (ACC) <- (ACC) + XH*YH.
    CoSTORE  R6, MAL      ; (R6) <- (ACC)L.
    CoSTORE  R7, MAH      ; (R7) <- (ACC)H.
Instruction Cycles Total 12
Program Words 24
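The same decomposition into four 16x16 partial products can be checked on a host with a short Python reference. This is a sketch that follows the order of operations of the routine above; the helper names `to_signed` and `mul32x32` are assumptions made for the example, and Python's unbounded integers stand in for the 40-bit accumulator (the intermediate sums in this sequence stay well inside 40 bits).

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def mul32x32(x, y):
    """Signed 32x32 -> 64-bit product built from four 16x16 products."""
    xl, xh = x & 0xFFFF, to_signed(x >> 16, 16)   # XL unsigned, XH signed
    yl, yh = y & 0xFFFF, to_signed(y >> 16, 16)   # YL unsigned, YH signed

    acc = xl * yl                  # CoMULu  R0, R2
    r4 = acc & 0xFFFF              # CoSTORE R4, MAL
    acc >>= 16                     # two CoSHR #8 (logical, acc is non-negative)
    acc += xl * yh                 # CoMACus R0, R3
    acc += xh * yl                 # CoMACsu R1, R2
    r5 = acc & 0xFFFF              # CoSTORE R5, MAL
    acc >>= 16                     # two CoASHR #8 (arithmetic shift)
    acc += xh * yh                 # CoMAC   R1, R3
    r6 = acc & 0xFFFF              # CoSTORE R6, MAL
    r7 = (acc >> 16) & 0xFFFF      # CoSTORE R7, MAH
    return to_signed(r7 << 48 | r6 << 32 | r5 << 16 | r4, 64)
```

Comparing `mul32x32(x, y)` against Python's native `x * y` for a few corner cases (most negative operand, mixed signs) is a quick way to confirm that the signed/unsigned choice of each partial product is right.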
5.2.2 - Nth Order Power Series
The formula is:
$$ y = \sum_{i=0}^{n} a(i)\,x^{i} = [[[a(n)\,x + a(n-1)]\,x + a(n-2)]\,x + a(n-3)]\,x + \dots $$
The associated pseudo code is:
; x = input.
; y = output.
; a(i) for i=0,1,...,n, are the coefficients.
y = a(n);
for (i=1 to n) {
    y = y*x+a(n-i);
}
Assuming that:
- x is a fractional and is located in R1.
- a(i) are fractional.
- y can be represented by a 16-bit data.
The final result is contained in the accumulator (ACC). To minimize the loop overhead, the program uses "loop unrolling" and assumes that n is even.
Figure 3 : Memory Map of an Nth Order Power Series Processing
Low Addr.    a(0)
             a(1)
             ...
             a(n-3)
             a(n-2)
             a(n-1)
High Addr.   a(n)     <- R9
;
; Initialization.
;
    MOV   MCW, #mcw            ; (MCW) <- mcw, MS and MP are cleared.
    MOV   R0, #0               ; (R0) <- 0
    MOV   R9, a(n)_address     ; (R9) <- address of a(n).
;
; Initialize the Loop Count
;
    MOV   R3, #n/2             ; (R3) <- n/2.
;
; Unrolled Loop
;
SERIE_LOOP:
    CoMUL    R1, [R9-]         ; (ACC) <- a(n)*x;
                               ; (R9) <- (R9)-2
    CoADD    R0, [R9-]         ; (ACC) <- (ACC) + a(i-1);
                               ; (R9) <- (R9)-2.
    CoSTORE  R2, MAS           ; (R2) <- limited (ACC)
    CoMUL    R1, R2            ; (ACC) <- (R2)*x
    CoADD    R0, [R9-]         ; (ACC) <- (ACC) + a(i-2);
                               ; (R9) <- (R9)-2.
    CoSTORE  R2, MAS           ; (R2) <- limited (ACC)
;
; End_of_loop Checking.
;
    CMPD1 R3, #0h              ; (R3) <- (R3)-1.
    JMPR  cc_NZ, SERIE_LOOP    ; End-of-Loop test & branch.
Instruction Cycles Total 9N/2+3
Program Words 22
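The nested evaluation used by this routine (Horner's rule) can be sketched in Python for checking coefficients on a host. This is a plain-number reference under stated assumptions: it uses ordinary Python arithmetic rather than 16-bit fractional data, applies no saturation, and does not require n to be even; the function name `power_series` is chosen for the example.

```python
def power_series(coeffs, x):
    """Evaluate sum(coeffs[i] * x**i) with Horner's rule.

    coeffs[i] holds a(i), so coeffs[-1] is a(n).
    """
    y = coeffs[-1]                      # y = a(n)
    for a in reversed(coeffs[:-1]):     # a(n-1), a(n-2), ..., a(0)
        y = y * x + a                   # one multiply and one add per coefficient
    return y
```

For example, `power_series([1, 2, 3], 2)` evaluates 1 + 2*2 + 3*2^2.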
5.2.3 - [NxN][Nx1] Matrix Multiply
Figure 4 : [NxN][Nx1] Matrix Multiply
$$
\begin{bmatrix} C_1 \\ C_2 \\ \vdots \\ C_N \end{bmatrix}
=
\begin{bmatrix}
A_{11} & A_{12} & A_{13} & \cdots & A_{1N} \\
A_{21} & A_{22} & A_{23} & \cdots & A_{2N} \\
\vdots & \vdots & \vdots &        & \vdots \\
A_{N1} & A_{N2} & A_{N3} & \cdots & A_{NN}
\end{bmatrix}
\times
\begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_N \end{bmatrix}
$$
The [NxN][Nx1] matrix multiply memory map is shown below:
Figure 5 : Memory Map of the Matrix Multiply
DPRAM (operands), from low to high address :
    A11, A12, A13, ..., A1N, ..., ANN     (IDX0 points to A11)
    B1, B2, ..., BN                       (R9 points to B1)
Result table, from low to high address :
    Before : xx, xx, xx, ..., xx          (R10 points to the first entry)
    After  : C1, C2, C3, ..., CN
N is assumed to be less than 31.
;
; MAC dedicated registers' initialization:
;
    MOV   IDX0, @A11           ; (IDX0) <- A11_addr.
    EXTR  #1                   ; next instruction uses ESFR space
    MOV   QR0, #2*(N-1)        ; (QR0) <- 2(N-1): N-1 words.
;
; GPRs initialization:
; - R7 is used as loop counter.
; - R9 contains B1 Address.
; - R10 contains C1 Address.
;
    MOV   R7, #N               ; (R7) <- N
    MOV   R9, @B1              ; (R9) <- B1_addr
    MOV   R10, @C1             ; (R10) <- C1_addr
;
; Product prolog
;
MATRIX_LOOP:
    CoMUL [IDX0+], [R9+]       ; (ACC) <- Ai1*B1
                               ; (IDX0) <- (IDX0)+2
                               ; (R9) <- (R9)+2.
;
; Product loop.
;
    REPEAT N-2 TIMES CoMAC [IDX0+], [R9+]
                               ; (ACC) <- (ACC) + Aij*Bj
                               ; (IDX0) <- (IDX0)+2
                               ; (R9) <- (R9)+2.
;
; Product epilog (provide Ci in an appropriate format).
;
    CoMAC [IDX0+], [R9-QR0]    ; (ACC) <- (ACC) + AiN*BN
                               ; (IDX0) <- (IDX0)+2
                               ; (R9) <- (R9)-(N-1) words.
;
; Shift & Rounding: shift to put the result in the right fractional format if needed
;
    CoASHR #data3, rnd         ; (ACC) <- (ACC) >> #data3 + rnd
;
; Write Ci into memory.
;
    CoSTORE [R10+], MAS        ; ((R10)) <- Ci.
                               ; (R10) <- (R10)+2.
;
; End_of_loop Checking.
;
    CMPD1 R7, #0h              ; (R7) <- (R7)-1.
    JMPR  cc_NZ, MATRIX_LOOP   ; End-of-Loop test & branch
Instruction Cycles Total N^2+5N+5
Program Words 24
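The pointer handling of this routine (IDX0 running straight through the row-major matrix, R9 stepping through B and being rewound by N-1 words at the end of each row) can be sketched in Python. This is an illustration under stated assumptions: list indices replace byte pointers, plain integers replace the 40-bit accumulator, the optional shift/rounding and the limiting done by MAS are left out, and N is at least 2. The name `matrix_vector_flat` is chosen for the example.

```python
def matrix_vector_flat(a_flat, b, n):
    """Compute C = A*B with A stored row-major in a_flat (n*n entries)."""
    if len(a_flat) != n * n or len(b) != n or n < 2:
        raise ValueError("expected n*n matrix entries, n vector entries, n >= 2")
    idx0 = 0                                # stands in for IDX0 (points to A11)
    r9 = 0                                  # stands in for R9 (points to B1)
    c = []
    for _ in range(n):                      # outer loop counted by R7
        acc = a_flat[idx0] * b[r9]          # prolog: CoMUL [IDX0+], [R9+]
        idx0 += 1
        r9 += 1
        for _ in range(n - 2):              # REPEAT N-2 TIMES CoMAC [IDX0+], [R9+]
            acc += a_flat[idx0] * b[r9]
            idx0 += 1
            r9 += 1
        acc += a_flat[idx0] * b[r9]         # epilog: CoMAC [IDX0+], [R9-QR0]
        idx0 += 1
        r9 -= n - 1                         # rewind R9 to B1 for the next row
        c.append(acc)                       # CoSTORE [R10+], MAS
    return c
```

Because IDX0 is never rewound, the matrix has to be stored row by row at consecutive addresses, which is what the memory map in Figure 5 shows.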
5.2.4 - N-real Multiply (windowing)
The formula is:
$$ y(i) = x(i)\,w(i) \quad \text{for } i = 0, 1, \dots, N-1 $$
The memory mapping is shown in Figure 6. To minimize the loop overhead, this program uses "loop unrolling". The associated pseudo code is:
; x(n) = input signal at time n.
; w(n) = window coefficient at time n.
;
for (i=0 to N-1) {
    y(i)= x(i)*w(i);
}
Figure 6 : Memory Map of the N-Real Multiply
DPRAM (samples, addressed by IDX0), from low to high address :
    Before : x(n-N+1), x(n-N+2), ..., x(n-3), x(n-2), x(n-1), x(n)
    After  : y(n-N+1), y(n-N+2), ..., y(n-3), y(n-2), y(n-1), y(n)
DPRAM (window coefficients, addressed by R9), from low to high address :
    Before : w(N-1), w(N-2), ..., w(3), w(2), w(1), w(0)
    After  : w(N-1), w(N-2), ..., w(3), w(2), w(1), w(0)
This routine assumes that N is a multiple of 4 and that the following general purpose and co-processor registers (SFRs) have been initialized once and for all:
- R9 contains the w(N-1) address.
- IDX0 contains the x(n-N+1) address.
- QX0 and QR0 with N-1.
;
; Initialize the Loop Count
;
    MOV   R3, #N/4                  ; (R3) <- N/4.
;
; Unrolled Loop
;
WINDOW_LOOP:
    CoMUL   [IDX0], [R9+], rnd      ; (ACC) <- w(i)*x(i) + rnd;
                                    ; (R9) <- (R9)+2
    CoSTORE [IDX0+], MAH            ; ((IDX0)) <- (ACC)
                                    ; (IDX0) <- (IDX0)+2
    CoMUL   [IDX0], [R9+], rnd      ; (ACC) <- w(i+1)*x(i+1) + rnd;
                                    ; (R9) <- (R9)+2
    CoSTORE [IDX0+], MAH            ; ((IDX0)) <- (ACC)
                                    ; (IDX0) <- (IDX0)+2
    CoMUL   [IDX0], [R9+], rnd      ; (ACC) <- w(i+2)*x(i+2) + rnd;
                                    ; (R9) <- (R9)+2
    CoSTORE [IDX0+], MAH            ; ((IDX0)) <- (ACC)
                                    ; (IDX0) <- (IDX0)+2
    CoMUL   [IDX0], [R9+], rnd      ; (ACC) <- w(i+3)*x(i+3) + rnd;
                                    ; (R9) <- (R9)+2
    CoSTORE [IDX0+], MAH            ; ((IDX0)) <- (ACC)
                                    ; (IDX0) <- (IDX0)+2
;
; End_of_loop Checking.
;
    CMPD1 R3, #0h                   ; (R3) <- (R3)-1.
    JMPR  cc_NZ, WINDOW_LOOP        ; End-of-Loop test & branch.
Instruction Cycles Total 2N+3N/4
Program Words 5+ (2*2)*4
Note:
The number of Instruction Cycles and Program Words required for this application depends on the "unrolling factor". "2N" corresponds to 2 cycles per coefficient, and "3N/4" corresponds to the loop test and branch penalty when the "unrolling factor" is 4. Similarly, "(2*2)*4-4" corresponds to the increase in program words when the "unrolling factor" is 4. In general, if URF denotes the unrolling factor, the execution time and number of program words become: 2N+3N/URF instruction cycles, and 5+4*URF program words.
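The trade-off described in this note can be tabulated with a few lines of Python. This is simply the two formulas from the note (2N+3N/URF instruction cycles and 5+4*URF program words) wrapped in a function; the name `window_cost` and the requirement that URF divides N are assumptions made for the example.

```python
def window_cost(n, urf):
    """Return (instruction cycles, program words) for the N-real multiply."""
    if urf < 1 or n % urf:
        raise ValueError("URF must be >= 1 and divide N")
    cycles = 2 * n + 3 * n // urf   # 2 cycles per sample + 3 cycles per loop pass
    words = 5 + 4 * urf             # fixed part + 4 words per unrolled sample
    return cycles, words
```

For N = 64, going from no unrolling (URF = 1) to URF = 4 reduces the cycle count from 320 to 176 at the price of 12 more program words.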
5.3 - FIR Filter-Real Correlation-Convolution
5.3.1 - Multiple Precision FIR Filter
The pseudo code is:
; x(n) = input signal at time n. 16-bit value.
; y(n) = output signal at time n. 16-bit value.
; h(k) = k'th coefficient. 32-bit value.
; L = Number of coefficient taps in the filter.
;
y(n)=0;
for (k=0 to L-1) {
    y(n)= y(n) + h(k)*x(n-k);
}
This program illustrates the use of multiply/multiply-accumulate instructions, "CoMIN & CoMAX" (performing a programmable saturation), and a shift instruction. The corresponding memory map is shown below. It is assumed that the coefficients and samples have been initialized by another routine.
Figure 7 : Memory Map for FIR
DPRAM (samples, addressed by IDX0), from low to high address :
    Before : x(n-L+1), x(n-L+2), ..., x(n-3), x(n-2), x(n-1), x(n-1)
    After  : x(n-L+2), x(n-L+3), ..., x(n-2), x(n-1), x(n), x(n)
DPRAM (coefficients), from low to high address, unchanged by the routine :
    hL(L-1), hH(L-1), ..., hL(1), hH(1), hL(0), hH(0)
    R9 points to hL(L-1) (h(L-1) LSW) and R10 points to hH(L-1) (h(L-1) MSW), both before and after.
This routine assumes that the following general purpose registers and co-processor registers (SFRs) have been initialized:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R9 with the hL(L-1) address (low word of the coefficient)
- R10 with the hH(L-1) address (high word of the coefficient)
- IDX0 with the x(n-L+1) address
- QX0 with 2*(L-1)
- QR0 with 4
- QR1 with 4*(L-1)
;
; Repeat Count Initialization (repeat count > 31)
;
    MOV   MRW, #L-4                  ; (MRW) <- L-4.
;
; Read the new filter input from a (E)SFR and move it into the DPRAM
; at x(n-1) address therefore overwriting x(n-1).
;
    MOV   @x(n), ADC_sfr             ; move the new input x(n)
;
; FIR prolog (LSWs of Impulse response): first multiplication
;
    CoMULsu [IDX0+], [R9+QR0]        ; (ACC) <- hL(L-1)*x(n-L+1)
                                     ; (IDX0) <- (IDX0)+2,
                                     ; (R9) <- (R9)+4.
;
; FIR loop (LSWs of Impulse response) Repeat the same MAC instruction L-2 times
;
    REPEAT MRW TIMES CoMACsu [IDX0+], [R9+QR0]
                                     ; (ACC) <- (ACC) + hL(i)*x(n-i)
                                     ; (IDX0) <- (IDX0)+2,
                                     ; (R9) <- (R9)+4.
;
; FIR epilog (LSWs of Impulse response): last MAC instruction and provide y(n)
; in an appropriate format
;
    CoMACsu [IDX0-QX0], [R9-QR1]     ; (ACC) <- (ACC) + hL(0)*x(n)
                                     ; (IDX0) <- (IDX0)-2*(L-1),
                                     ; (R9) <- (R9)-4*(L-1).
;
; Rounding & Shift
;
    MOV    MRW, #L-4                 ; (MRW) <- L-4.
    CoRND                            ; (ACC) <- (ACC)+rnd
    CoASHR #8                        ; (ACC) <- (ACC)>>8
    CoASHR #8                        ; (ACC) <- (ACC)>>8
;
; FIR prolog (MSWs of Impulse response): first multiplication
;
    CoMAC  [IDX0+], [R10+QR0]        ; (ACC) <- (ACC) + hH(L-1)*x(n-L+1)
                                     ; (IDX0) <- (IDX0)+2,
                                     ; (R10) <- (R10)+4.
;
; FIR loop (MSWs of Impulse response) Repeat the same MAC instruction L-2 times
;
    REPEAT MRW TIMES CoMACM [IDX0+], [R10+QR0]
                                     ; (ACC) <- (ACC) + hH(i)*x(n-i)
                                     ; & x(n-i-1) <- x(n-i),
                                     ; (IDX0) <- (IDX0)+2,
                                     ; (R10) <- (R10)+4.
;
; FIR epilog (MSWs of Impulse response): last MAC instruction and provide
; y(n) in an appropriate format
;
    CoMACM [IDX0-QX0], [R10-QR1]     ; (ACC) <- (ACC) + hH(0)*x(n)
                                     ; & x(n-L+1) <- x(n-L+2),
                                     ; (IDX0) <- (IDX0)-2*(L-1),
                                     ; (R10) <- (R10)-4*(L-1).
;
; Shift & Rounding
;
    CoASHR #data3, rnd               ; (ACC) <- (ACC) >>a #data3 + rnd
;
; Limiting
;
    CoMIN  R0, R1                    ; (ACC) <- Min((ACC), MAX).
    CoMAX  R0, R2                    ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into a (E)SFR.
;
    NOP                              ; Pipeline Effect.
    MOV    DAC_sfr, MAH              ; move the new output y(n).
                        Instruction Cycles    Program Words
Read Input sample       1                     2
Initialization          2                     4
FIR Loop                2L+3                  18
Post-Processing         4                     7
Write Output sample     1                     2
Total                   2L+11                 33
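The two-pass structure of the listing above (low words first, a 16-bit right shift, then high words) can be checked with a small host-side model. The following Python sketch is only an illustration under stated assumptions: the function name `fir_32x16` and its argument layout are hypothetical, the MP bit of MCW is assumed cleared, and the 40-bit accumulator saturation and the final `#data3` shift are not modelled.

```python
def fir_32x16(x_hist, h32):
    """Model of a 32x16 FIR tap sum split into low-word and high-word passes.

    x_hist[k] holds x(n-k) as a signed 16-bit integer (assumed layout).
    h32[k] holds h(k) as a signed 32-bit integer.
    Returns the accumulator value after the high-word pass.
    """
    acc = 0
    # Pass 1: unsigned low words times signed samples (CoMULsu / CoMACsu).
    for xk, hk in zip(x_hist, h32):
        h_lo = hk & 0xFFFF          # low word, read as unsigned
        acc += h_lo * xk
    # Rounding, then two arithmetic shifts by 8 bits (16 bits in total).
    acc = (acc + 0x8000) >> 16
    # Pass 2: signed high words times signed samples (CoMAC / CoMACM).
    for xk, hk in zip(x_hist, h32):
        h_hi = hk >> 16             # arithmetic shift keeps the sign
        acc += h_hi * xk
    return acc
```

Because each coefficient equals `h_hi * 65536 + h_lo`, the result is the full 32x16 sum of products, rounded and scaled down by 2^16. This is why the low-word pass can be shifted out of the way before the high-word pass starts.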
5.4 - IIR Filters
5.4.1 - Nth Order IIR Filter: Direct Form 1
The rules for the implementation of FIR filters can be extended to IIR filters. The Nth-order difference equation is:

$$ y(n) = \sum_{k=1}^{N} a(k)\, y(n-k) + \sum_{k=0}^{M} b(k)\, x(n-k) $$
This can be called "Direct Form 1". The associated pseudo code is:
; x(n) = input signal at time n
; y(n) = output signal at time n
; a(k), b(k) = IIR coefficients
; N, M refer to the above equation
y(n)=0;
for (k=0 to M) {
    y(n)= y(n) +b(k)*x(n-k);
}
for (k=1 to N) {
    y(n)= y(n) +a(k)*y(n-k);
}
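The Direct Form 1 pseudo code can be exercised on a host before it is ported to the MAC co-processor. The sketch below is a floating-point reference model only. The function name `iir_df1` and the list-based delay lines are assumptions made for illustration, and the feedback terms are added with a plus sign, as in the difference equation above.

```python
def iir_df1(x, b, a):
    """Direct Form 1 reference model.

    x : list of input samples
    b : [b(0), ..., b(M)]   feed-forward coefficients
    a : [a(1), ..., a(N)]   feedback coefficients (added, not subtracted)
    Returns the list of output samples y(n).
    """
    M, N = len(b) - 1, len(a)
    xh = [0.0] * (M + 1)        # xh[k] = x(n-k)
    yh = [0.0] * N              # yh[k-1] = y(n-k)
    out = []
    for xn in x:
        xh = ([xn] + xh)[:M + 1]            # shift the input delay line
        y = 0.0
        for k in range(M + 1):              # first loop of the pseudo code
            y += b[k] * xh[k]
        for k in range(1, N + 1):           # second loop of the pseudo code
            y += a[k - 1] * yh[k - 1]
        yh = ([y] + yh)[:N]                 # shift the output delay line
        out.append(y)
    return out
```

For example, `iir_df1([1, 0, 0, 0], [1.0], [0.5])` models y(n) = x(n) + 0.5*y(n-1) and produces the decaying impulse response 1, 0.5, 0.25, 0.125.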
Figure 8 shows the memory map. It has been assumed that the coefficients and samples have been initialized by another routine.

Figure 8 : Memory Map for IIR Direct Form 1

DPRAM           Before          After
Low Addr.       x(n-M)          x(n-M+1)        <- IDX0
                x(n-M+1)        x(n-M+2)
                ...             ...
                x(n-3)          x(n-2)
                x(n-2)          x(n-1)
                x(n-1)          x(n)
                x(n-1)          x(n)
                y(n-N)          y(n-N+1)
                y(n-N+1)        y(n-N+2)
                ...             ...
                y(n-2)          y(n-1)
High Addr.      y(n-1)          y(n)

Low Addr.       b(M)            b(M)            <- R9
                b(M-1)          b(M-1)
                ...             ...
                b(1)            b(1)
                b(0)            b(0)
                a(N)            a(N)
                a(N-1)          a(N-1)
                ...             ...
                a(2)            a(2)
High Addr.      a(1)            a(1)
This routine assumes that the following general purpose and co-processor registers (SFRs) have been initialized:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R9 with the b(M) address
- R10 with the y(n) address
- IDX0 with the x(n-M) address
- QX0 with (N+M)*2
- QR0 with (N+M)*2
;
; Repeat Count Initialization (repeat count > 31) for the first IIR Loop
;
        MOV     MRW, #M-1               ; (MRW) <- M-1.
;
; Read the new filter input from a (E)SFR & move it into the DPRAM
; at x(n-1) address, overwriting x(n-1).
;
        MOV     @x(n), ADC_sfr          ; move the new input x(n)
;
; Prolog of the First IIR loop.
;
        CoMUL   [IDX0+], [R9+]          ; (ACC) <- b(M)*x(n-M)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; First IIR loop.
;
        REPEAT MRW TIMES CoMACM [IDX0+], [R9+]
                                        ; (ACC) <- (ACC)+b(i)*x(n-i)
                                        ; & x(n-i-1) <- x(n-i),
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Repeat Count Initialization (repeat count > 31) for the second IIR loop.
;
        MOV     MRW, #N-3               ; (MRW) <- N-3.
;
; Prolog of the Second IIR loop.
;
        CoMAC   [IDX0+], [R9+]          ; (ACC) <- (ACC)+a(N)*y(n-N)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Second IIR loop.
;
        REPEAT MRW TIMES CoMACM [IDX0+], [R9+]
                                        ; (ACC) <- (ACC) + a(i)*y(n-i)
                                        ; & y(n-i-1) <- y(n-i),
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Epilog of the second IIR loop.
;
        CoMACM  [IDX0-QX0], [R9-QR0]    ; (ACC) <- (ACC)+a(1)*y(n-1)
                                        ; & y(n-2) <- y(n-1),
                                        ; (IDX0) <- (IDX0)-2*(N+M),
                                        ; (R9) <- (R9)-2*(N+M).
;
; Rounding
;
        CoRND                           ; (ACC) <- (ACC) + rnd
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output, y(n), into memory.
;
        CoSTORE [R10], MAH              ; ((R10)) <- y(n).
                        Instruction Cycles    Program Words
Read Input sample       1                     2
Initialization          2                     4
DF1 IIR Loop            N+M                   10
Post-Processing         3                     6
Write Output sample     1                     2
Total                   N+M+7                 24
5.4.2 - Nth Order IIR Filter: Direct Form 2
The following equations equally represent the Nth Order IIR filter:

$$ u(n) = x(n) + \sum_{k=1}^{N} a(k)\, u(n-k) $$
$$ y(n) = \sum_{k=0}^{N} b(k)\, u(n-k) $$

These equations use the intermediate state variable vector U={u(n), u(n-1), u(n-2),..., u(n-N)}. This representation is called "Direct Form 2" and is illustrated by Figure 9. Direct Form 2 has an advantage over Direct Form 1 as it requires less data memory. The associated pseudo code is:

; x(n) = input signal at time n.
; u(n) = state variable at time n.
; y(n) = output signal at time n.
; a(k), b(k) = IIR coefficients.
; It is assumed N = M.
;
u(n)=x(n);
for (k=1 to N) {
    u(n)= u(n) +a(k)*u(n-k);
}
y(n)=b(0)*u(n);
for (k=1 to N) {
    y(n)= y(n) +b(k)*u(n-k);
}
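The Direct Form 2 recurrence keeps a single delay line of state variables instead of separate input and output histories. A floating-point reference model, assuming N = M as the pseudo code does, might look like the following. The name `iir_df2` and the list layout are illustrative assumptions, not part of the ST10 routine.

```python
def iir_df2(x, b, a):
    """Direct Form 2 reference model with N = M.

    x : list of input samples
    b : [b(0), ..., b(N)]   feed-forward coefficients
    a : [a(1), ..., a(N)]   feedback coefficients (added, not subtracted)
    Returns the list of output samples y(n).
    """
    N = len(a)
    u = [0.0] * N               # u[k-1] = u(n-k): the only state kept
    out = []
    for xn in x:
        un = xn
        for k in range(1, N + 1):           # u(n) computation
            un += a[k - 1] * u[k - 1]
        y = b[0] * un
        for k in range(1, N + 1):           # y(n) computation
            y += b[k] * u[k - 1]
        u = ([un] + u)[:N]                  # shift the state delay line
        out.append(y)
    return out
```

Only N state words are stored per filter, against N+M+1 samples for Direct Form 1, which is the memory saving mentioned in the text.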
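Figure 9 : IIR Direct Form 2
[Signal-flow diagram: the input x(n) feeds the state variable u(n); a chain of N unit delays (Z-1) holds u(n-1), u(n-2), ..., u(n-N+1), u(n-N); the feedback taps are a(1), a(2), ..., a(N-1), a(N) and the feed-forward taps are b(0), b(1), b(2), ..., b(N-1), b(N).]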
Figure 10 shows the corresponding memory map. It has been assumed that the coefficients and samples have been initialized by another routine.

Figure 10 : Memory Map for Nth Order IIR Filter Direct Form 2

DPRAM           Before          After
Low Addr.       u(n-N)          u(n-N+1)        <- IDX0
                u(n-N+1)        u(n-N+2)
                ...             ...
                u(n-3)          u(n-2)
                u(n-2)          u(n-1)
                u(n-1)          u(n)
High Addr.      u(n-1)          u(n)            <- R10

Low Addr.       a(N)            a(N)            <- R9
                a(N-1)          a(N-1)
                ...             ...
                a(3)            a(3)
                a(2)            a(2)
                a(1)            a(1)
                b(N)            b(N)
                b(N-1)          b(N-1)
                ...             ...
                b(3)            b(3)
                b(2)            b(2)
                b(1)            b(1)
High Addr.      b(0)            b(0)
This routine assumes that the following general purpose and co-processor registers (SFRs) have been initialized:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R9 with the a(N) address
- R10 with the u(n) address
- IDX0 with the u(n-N) address
- QX0 with 2*(N-1) and QX1 with 2N
- QR0 with 4N-2, [ 2N+2*(N-1) ]
;
; Repeat Count Initialization (repeat count > 31) for the first IIR Loop
;
        MOV     MRW, #N-2               ; (MRW) <- N-2.
;
; Read the new filter input from a (E)SFR and move it into the Accumulator.
;
        MOV     MAH, ADC_sfr            ; (MAH) <- x(n),
                                        ; (MAE) <- 8 times (MAH.15),
                                        ; (MAL) <- 0000h.
;
; First IIR loop: u(n) computation.
;
        REPEAT MRW TIMES CoMAC [IDX0+], [R9+]
                                        ; (ACC) <- (ACC) + a(i)*u(n-i)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Epilog of the first IIR loop.
;
        CoMAC   [IDX0-QX0], [R9+], rnd  ; (ACC) <- (ACC)+a(1)*u(n-1)+rnd
                                        ; (IDX0) <- (IDX0)-2*(N-1),
                                        ; (R9) <- (R9)+2.
;
; Repeat Count Initialization (repeat count > 31) for the second IIR loop.
;
        MOV     MRW, #N-2               ; (MRW) <- N-2.
;
; Move u(n) into memory.
;
        CoSTORE [R10], MAS              ; ((R10)) <- u(n)
;
; Prolog of the Second IIR loop.
;
        CoMAC   [IDX0+], [R9+]          ; (ACC) <- b(N)*u(n-N)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Second IIR loop.
;
        REPEAT MRW TIMES CoMACM [IDX0+], [R9+]
                                        ; (ACC) <- (ACC)+b(i)*u(n-i)
                                        ; & u(n-i-1) <- u(n-i),
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; Epilog of the Second IIR loop.
;
        CoMACM  [IDX0-QX1], [R9-QR0], rnd
                                        ; (ACC) <- (ACC)+b(0)*u(n)+rnd
                                        ; & u(n-1) <- u(n),
                                        ; (IDX0) <- (IDX0)-2N,
                                        ; (R9) <- (R9)-(4N-2).
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into a (E)SFR.
;
        NOP                             ; Pipeline Effect.
        MOV     DAC_sfr, MAH            ; move the new output y(n).
                        Instruction Cycles    Program Words
Read Input sample       1                     2
Initialization          2                     4
DF2 IIR Loop            2N+1                  12
Post-Processing         2                     4
Write Output sample     2                     3
Total                   2N+8                  25
5.4.3 - N-Cascaded Real Biquads (Direct Form 2) A high-order filter can be implemented, either as a single section, or as a combination of first and second order sections. The single section form is quicker and easier to implement, but generates a larger numerical error. This increased error occurs for two reasons: - The long filter computation process accumulates errors from multiplication with quantized coefficients. - The roots of high-order polynomials are increasingly sensitive to changes in their quantized coefficients. Therefore, the single section form is not recommended except for a very low order controller. (see "Nth Order IIR Filter: Direct Form 1" on page 20).
To implement a high-order transfer function, first decompose it into first order and second order blocks (biquads), and then connect these blocks in a cascade. The following paragraphs illustrate this technique for an even number of cascaded biquads.

Unlike conventional digital signal processors, the MAC co-processor is able to repeat a single instruction at high speed but does not offer flexible and fast hardware looping. Consequently, to perform a loop containing more than one instruction, the programmer must use the regular instruction set, incurring a penalty of several cycles for the end-of-loop detection. "Loop Unrolling" minimizes this penalty but increases the number of instructions. In the following section the loop unrolling technique is not employed.

The equations of a Direct Form 2 Nth Order IIR filter applied to a second order filter (N=2) yield:

$$ u_i(n) = x_i(n) - a_i(1)\, u_i(n-1) - a_i(2)\, u_i(n-2) $$
$$ y_i(n) = b_i(0)\, u_i(n) + b_i(1)\, u_i(n-1) + b_i(2)\, u_i(n-2) $$

Where "i" specifies the biquad number. Note that $y_i(n) = x_{i+1}(n)$. For simplicity, it has been assumed that no overflow occurs on ui(n) and yi(n). The naming convention is:

; xi(n) = input signal at time n of biquad number i.
; ui(n) = state variable at time n of biquad number i.
; yi(n) = output signal at time n of biquad number i.
; ai(k), bi(k) = Coefficients of biquad number i.

Figure 11 shows the corresponding memory map and assumes that both coefficients and samples have been initialized by another routine.

Figure 11 : Memory Map for N-Cascaded Real Biquad IIR
DPRAM           Before          After
Low Addr.       u1(n-2)         u1(n-1)         <- IDX0 (on entry)
                u1(n-1)         u1(n)           <- IDX1 (on entry)
                u2(n-2)         u2(n-1)
                u2(n-1)         u2(n)
                ...             ...
                uN(n-2)         uN(n-1)
High Addr.      uN(n-1)         uN(n)

Low Addr.       a1(2)           a1(2)           <- R9 (on entry)
                a1(1)           a1(1)
                b1(2)           b1(2)
                b1(1)           b1(1)
                b1(0)           b1(0)
                ...             ...
                aN(2)           aN(2)
                aN(1)           aN(1)
                bN(2)           bN(2)
                bN(1)           bN(1)
High Addr.      bN(0)           bN(0)
This routine assumes that the following general purpose and co-processor registers have been initialized:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R9 with the a1(2) address
- R10 with the R5 physical address
- IDX0 with the u1(n-2) address
- IDX1 with the u1(n-1) address
- QX0 with 2*(N-1)
- QX1 with 4
;
; Initialize the Loop Count: N
;
        MOV     R3, #N                  ; (R3) <- N.
;
; Read the new filter input from a (E)SFR and move it into the Accumulator.
;
        MOV     MAH, ADC_sfr            ; (MAH) <- x(n),
                                        ; (MAE) <- 8 times (MAH.15),
                                        ; (MAL) <- 0000h.
DF2_BIQUAD_LOOP:
;
; First Biquad iteration
;
        CoMAC   [IDX0+], [R9+]          ; (ACC) <- (ACC)-ai(2)*ui(n-2)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
        CoMAC   [IDX0-], [R9+]          ; (ACC) <- (ACC)-ai(1)*ui(n-1)
                                        ; (IDX0) <- (IDX0)-2,
                                        ; (R9) <- (R9)+2.
        CoRND                           ; (ACC) <- (ACC)+rnd
;
; Write ui(n) into a GPR (R5)
;
        CoSTORE R5, MAS                 ; (R5) <- ui(n).
;
; Second Biquad iteration.
;
        CoMUL   [IDX0+], [R9+]          ; (ACC) <- bi(2)*ui(n-2)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
        CoMACM  [IDX0+], [R9+]          ; (ACC) <- (ACC)+bi(1)*ui(n-1)
                                        ; & ui(n-2) <- ui(n-1)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
        CoMAC   R5, [R9+]               ; (ACC) <- (ACC) + bi(0)*ui(n)
                                        ; (R9) <- (R9)+2.
;
; Write ui(n) to memory.
;
        CoMOV   [IDX1+QX1], [R10]       ; ui(n-1) <- ui(n).
                                        ; (IDX1) <- (IDX1)+4,
;
; End_of_loop Checking.
;
        CMPD1   R3, #0h                 ; (R3) <- (R3)-1.
        JMPR    cc_NZ, DF2_BIQUAD_LOOP  ; End-of-Loop test & branch.
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into a (E)SFR.
;
        NOP                             ; Pipeline Effect.
        MOV     DAC_sfr, MAH            ; move the new output y(n).
                        Instruction Cycles    Program Words
Read Input sample       1                     2
Initialization          1                     2
DF2 Biquad Loop         11N-1                 19
Post-Processing         3                     5
Write Output sample     1                     2
Total                   11N+5                 31
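The cascade of Direct Form 2 biquads can be prototyped with a few lines of host code before fixed-point scaling is chosen. The sketch below is a floating-point model only. The function name `biquad_cascade_df2` and the tuple order `(b0, b1, b2, a1, a2)` are assumptions made for this example, and the feedback terms are subtracted, following the biquad equations of this section.

```python
def biquad_cascade_df2(x, sections):
    """Cascade of Direct Form 2 biquads.

    x        : list of input samples
    sections : list of (b0, b1, b2, a1, a2) tuples, one per biquad
    Returns the output of the last biquad for each input sample.
    """
    # state[i] = [u_i(n-1), u_i(n-2)] for biquad i
    state = [[0.0, 0.0] for _ in sections]
    out = []
    for xn in x:
        v = xn                              # input of the first biquad
        for (b0, b1, b2, a1, a2), s in zip(sections, state):
            u = v - a1 * s[0] - a2 * s[1]           # u_i(n)
            v = b0 * u + b1 * s[0] + b2 * s[1]      # y_i(n) = x_{i+1}(n)
            s[1], s[0] = s[0], u                    # age the state
        out.append(v)
    return out
```

Each biquad needs two state words and five coefficients, which matches the per-section layout of the memory map in Figure 11.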
5.4.4 - N-cascaded Real Biquads: Transpose Form
The equations of a Direct Form 2 Nth Order IIR filter applied to a second order filter (N=2) can yield:

$$ y_i(n) = b_i(0)\, x_i(n) + u_i(n-1) $$
$$ u_i(n) = b_i(1)\, x_i(n) - a_i(1)\, y_i(n) + w_i(n-1) $$
$$ w_i(n) = b_i(2)\, x_i(n) - a_i(2)\, y_i(n) $$

Where "i" is the biquad number. Note that $y_i(n) = x_{i+1}(n)$. This form is also called the "Transpose Form". For simplicity, it has been assumed that no overflow occurs on ui(n) or yi(n). This form is suitable when the input-to-output delay must be minimized. The naming convention is:

; xi(n) = input signal at time n of biquad number i.
; ui(n), wi(n) = state variables at time n of biquad number i.
; yi(n) = output signal at time n of biquad number i.
; ai(k), bi(k) = Coefficients of biquad number i.

Figure 12 shows the corresponding memory map. It is assumed that both coefficients and samples have been initialized by another routine.

Figure 12 : Memory Map of N-Cascaded Real Biquads (Transpose Form)
DPRAM           Before          After
Low Addr.       u1(n-1)         u1(n)           <- R4
                w1(n-1)         w1(n)
                u2(n-1)         u2(n)
                w2(n-1)         w2(n)
                ...             ...
                uN(n-1)         uN(n)
High Addr.      wN(n-1)         wN(n)

Low Addr.       b1(0)           b1(0)           <- R9
                a1(1)           a1(1)
                b1(1)           b1(1)
                a1(2)           a1(2)
                b1(2)           b1(2)
                ...             ...
                bN(0)           bN(0)
                aN(1)           aN(1)
                bN(1)           bN(1)
                aN(2)           aN(2)
High Addr.      bN(2)           bN(2)
This routine assumes that the following general purpose and co-processor registers (SFRs) have been initialized:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R4 with the u1(n-1) address
- R9 with the b1(0) address
- R10 with the R5 physical address
- QR0 with 2*(N-1)
- QR1 with 10N
;
; Initialize the Loop Count: N
;
        MOV     R3, #N-1                ; (R3) <- N-1.
;
; Read the new filter input from a (E)SFR and move it into a GPR (R5).
;
        MOV     R5, ADC_sfr             ; (R5) <- x(n)
;
TF_BIQUAD_LOOP:
;
; Compute yi(n)
;
        CoLOAD  [R4+], R0               ; (ACC) <- ui(n-1)
                                        ; (R4) <- (R4)+2.
        CoMAC   [R9+], R5, rnd          ; (ACC) <- (ACC)+bi(0)*xi(n)+rnd
                                        ; (R9) <- (R9)+2.
;
; Write yi(n) into R8.
;
        CoSTORE R8, MAS                 ; (R8) <- limited(yi(n)).
;
; Compute ui(n)
;
        CoLOAD  [R4], R0                ; (ACC) <- wi(n-1)
        CoMAC   [R9+], R8               ; (ACC) <- (ACC)-ai(1)*yi(n)
                                        ; (R9) <- (R9)+2.
        CoMAC   [R9+], R5, rnd          ; (ACC) <- (ACC)+bi(1)*xi(n)+rnd
                                        ; (R9) <- (R9)+2.
;
; Write ui(n) into memory.
;
        CoSTORE [R4+], MAS              ; ui(n-1) <- ui(n).
                                        ; (R4) <- (R4)+2.
;
; Compute wi(n)
;
        CoMUL   [R9+], R8               ; (ACC) <- -ai(2)*yi(n)
                                        ; (R9) <- (R9)+2.
        CoMAC   [R9+], R5, rnd          ; (ACC) <- (ACC)+bi(2)*xi(n)+rnd
                                        ; (R9) <- (R9)+2.
;
; Write wi(n) into memory.
;
        CoSTORE [R4+], MAS              ; wi(n-1) <- wi(n).
                                        ; (R4) <- (R4)+2.
;
; Write yi(n) into R5.
;
        MOV     R5, R8                  ; xj(n) <- yi(n).
                                        ; j = i+1
;
; End_of_loop Checking.
;
        CMPD1   R3, #0h                 ; (R3) <- (R3)-1.
        JMPR    cc_NZ, TF_BIQUAD_LOOP   ; End-of-Loop test & branch.
;
; Compute yN(n)
;
        CoLOAD  [R4+], R0               ; (ACC) <- uN(n-1)
                                        ; (R4) <- (R4)+2.
        CoMAC   [R9+], R5, rnd          ; (ACC) <- (ACC)+bN(0)*xN(n)+rnd
                                        ; (R9) <- (R9)+2.
;
; Write yN(n) into R8.
;
        CoSTORE R8, MAS                 ; (R8) <- limited(y(n)).
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into a (E)SFR.
;
        NOP                             ; Pipeline Effect
        MOV     DAC_sfr, MAH            ; move the new output y(n).
;
; Compute uN(n)
;
        CoLOAD  [R4], R0                ; (ACC) <- wN(n-1)
        CoMAC   [R9+], R8               ; (ACC) <- (ACC)-aN(1)*yN(n)
                                        ; (R9) <- (R9)+2.
        CoMAC   [R9+], R5, rnd          ; (ACC) <- (ACC)+bN(1)*xN(n)+rnd
                                        ; (R9) <- (R9)+2.
;
; Write uN(n) into memory.
;
        CoSTORE [R4+], MAS              ; uN(n-1) <- uN(n).
                                        ; (R4) <- (R4)+2.
;
; Compute wN(n)
;
        CoMUL   [R9+], R8               ; (ACC) <- -aN(2)*yN(n)
                                        ; (R9) <- (R9)+2.
        CoMAC   [R9-QR1], R5, rnd       ; (ACC) <- (ACC)+bN(2)*xN(n)+rnd
                                        ; (R9) <- (R9)-2*(5N).
;
; Write wN(n) into memory.
;
        CoSTORE [R4-QR0], MAS           ; wN(n-1) <- wN(n).
                                        ; (R4) <- (R4)-2*[2*(N-1)].

                                        Instruction Cycles    Program Words
Read Input sample                       1                     2
Initialization                          1                     2
TF Biquad Loop                          14N-15                24
Output Post-Processing & Write          7                     13
Filter Post-Processing & Update         7                     14
Total                                   14N+1                 55
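The transpose form produces each output from the new input and one stored state word, which is why its input-to-output delay is short. A floating-point sketch of the recurrence, using the standard transposed Direct Form 2 update in which every term uses the current input xi(n), is given below. The name `biquad_cascade_tf` and the tuple order `(b0, b1, b2, a1, a2)` are assumptions for illustration.

```python
def biquad_cascade_tf(x, sections):
    """Cascade of transpose-form biquads.

    x        : list of input samples
    sections : list of (b0, b1, b2, a1, a2) tuples, one per biquad
    Returns the output of the last biquad for each input sample.
    """
    # state[i] = [u_i(n-1), w_i(n-1)] for biquad i
    state = [[0.0, 0.0] for _ in sections]
    out = []
    for xn in x:
        v = xn                              # input of the first biquad
        for (b0, b1, b2, a1, a2), s in zip(sections, state):
            y = b0 * v + s[0]                   # y_i(n)
            s[0] = b1 * v - a1 * y + s[1]       # u_i(n), uses old w_i(n-1)
            s[1] = b2 * v - a2 * y              # w_i(n)
            v = y                               # y_i(n) = x_{i+1}(n)
        out.append(v)
    return out
```

A single section reproduces the difference equation y(n) = b0 x(n) + b1 x(n-1) + b2 x(n-2) - a1 y(n-1) - a2 y(n-2), which gives a convenient cross-check against a Direct Form implementation.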
5.5 - LMS Adaptive Filter
5.5.1 - Single-Precision LMS Adaptive Filter
An adaptive filter contains coefficients that are updated by an adaptive algorithm to optimize the filter's response to a desired performance criterion. Generally, adaptive filters have two distinct parts: a filter with a structure designed to perform a processing function, and an adaptive algorithm for adjusting the filter coefficients to improve its performance.

The incoming signal x(n) is weighted in a digital filter to produce an output y(n). The adaptive algorithm adjusts the filter weights to minimize the error e(n) between the filter output y(n) and the desired response of the filter d(n).

The Single-Precision LMS Adaptive Filter is a FIR filter whose coefficients are updated at each iteration according to an error signal e(n) equal to d(n)-y(n), where d(n) is the desired signal at time n and y(n) is the FIR output. Figure 13 illustrates this filter.

Figure 13 : LMS Adaptive Filter
[Block diagram: the input x(n) passes through a chain of unit delays giving x(n-1), x(n-2), ..., x(n-L+1); the taps are weighted by h(0,n), h(1,n), h(2,n), ..., h(L-1,n) and summed to form y(n); y(n) is combined with the desired signal d(n) to form the error e(n).]
The corresponding pseudo code is:
; x(n) = input signal at time n.
; d(n) = desired signal at time n.
; y(n) = output signal at time n.
; h(k, n) = k'th coefficient at time n.
; Mu = adaptive gain.
; L = Number of coefficient taps in the filter.
;
y(n)=0;
for (k=0 to L-1) {
    y(n)= y(n) + h(k,n)*x(n-k);
}
e(n)=d(n)-y(n);
for (k=0 to L-1) {
    h(k,n+1)= h(k,n) - Mu*x(n-k)*e(n);
}
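One iteration of the pseudo code above can be modelled on a host to tune the adaptive gain before moving to 16-bit arithmetic. The sketch below follows the pseudo code literally, including the minus sign in the coefficient update. The function name `lms_step` and its argument layout are assumptions for this example. With this sign convention and e(n) = d(n) - y(n), the coefficients move toward the desired response only if Mu is negative, so the example uses a negative gain.

```python
def lms_step(h, x_hist, d, mu):
    """One LMS iteration, written as in the pseudo code.

    h      : [h(0,n), ..., h(L-1,n)]   current coefficients
    x_hist : [x(n), ..., x(n-L+1)]     current sample history
    d      : desired signal d(n)
    mu     : adaptive gain Mu (negative under this sign convention)
    Returns (y, e, h_next).
    """
    y = sum(hk * xk for hk, xk in zip(h, x_hist))       # FIR output
    e = d - y                                           # error signal
    h_next = [hk - mu * xk * e for hk, xk in zip(h, x_hist)]
    return y, e, h_next


# Usage sketch: identify an unknown 2-tap system (hypothetical values).
h_true = [0.5, -0.25]
h = [0.0, 0.0]
x_prev = 0.0
for n in range(500):
    x_now = float((n * 7) % 5 - 2)          # small deterministic test signal
    hist = [x_now, x_prev]
    d = sum(a * b for a, b in zip(h_true, hist))
    _, _, h = lms_step(h, hist, d, mu=-0.1)
    x_prev = x_now
```

After the loop, `h` is close to `h_true`. The two-step structure (output first, then coefficient update) is the same as in the assembly routine that follows.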
Figure 14 shows the corresponding memory map. It has been assumed that both the coefficients and samples have been initialized by another routine. Unlike pure DSP filters, this filter is implemented in two steps: the FIR output computation is followed by an update of the coefficients.

Figure 14 : Memory Map for LMS Adaptive Filter

DPRAM           Before          After
Low Addr.       x(n-L+1)        x(n-L+2)        <- IDX0, R10
                x(n-L+2)        x(n-L+3)
                ...             ...
                x(n-3)          x(n-2)
                x(n-2)          x(n-1)
                x(n-1)          x(n)
High Addr.      x(n-1)          x(n)

Low Addr.       h(L-1, n)       h(L-1, n+1)     <- R9
                h(L-2, n)       h(L-2, n+1)
                ...             ...
                h(3, n)         h(3, n+1)
                h(2, n)         h(2, n+1)
                h(1, n)         h(1, n+1)
High Addr.      h(0, n)         h(0, n+1)
This routine assumes that the following general purpose and co-processor registers (SFRs) have been initialized once and for all, and that L is less than 31:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R8 with the adaptive gain (Mu) value
- R9 with the h(L-1,n) address
- R10 with the x(n-L+1) address
- IDX0 with the x(n-L+1) address
- QX0 and QR0 with 2*(L-1)
;
; Read the new filter input from a (E)SFR and move it into the DPRAM
; at x(n-1) address, therefore overwriting x(n-1).
;
        MOV     @x(n), ADC_sfr          ; move the new input x(n)
;
; FIR prolog: first multiplication
;
        CoMUL   [IDX0+], [R9+]          ; (ACC) <- h(L-1,n)*x(n-L+1)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; FIR loop: Repeat L-2 times the same MAC instruction.
;
        REPEAT L-3 TIMES CoMAC [IDX0+], [R9+]
                                        ; (ACC) <- (ACC)+h(i)*x(n-i)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+2.
;
; FIR epilog: last MAC instruction and provide y(n) in an appropriate format
;
        CoMAC   [IDX0-QX0], [R9-QR0]    ; (ACC) <- (ACC)+h(0)*x(n)
                                        ; (IDX0) <- (IDX0)-2*(L-1),
                                        ; (R9) <- (R9)-2*(L-1).
;
; Shift & Rounding
;
        CoASHR  #data3, rnd             ; (ACC) <- (ACC) >>a #data3 + rnd
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into an (E)SFR.
;
        CoSTORE R6, MAH                 ; (R6) <- y(n).
        MOV     DAC_sfr, R6             ; move the new output y(n).
;
; Read d(n) and move it into a GPR.
;
        MOV     R5, @d(n)               ; (R5) <- d(n)
;
; Error, e(n), Calculation
;
        SUB     R5, R6                  ; (R5) <- d(n)-y(n)=e(n)
        MOV     @e(n), R5               ; e(n-1) <- e(n)
        CoMUL   R5, R8                  ; (ACC) <- Mu*e(n)
        CoNEG   rnd                     ; (ACC) <- -(ACC)+rnd
        CoSTORE R11, MAS                ; (R11) <- -Mu*e(n).
;
; Coefficients' Updating
;
        MOV     R3, #L-2                ; (R3) <- L-2.
;
; Coefficient Update Prolog.
;
        CoLOAD  [R9], R0                ; (ACC) <- h(L-1,n)
        CoMAC   R11, [R10+], rnd        ; (ACC) <- h(L-1,n) - Mu.e(n)*x(n-L+1)+rnd
                                        ; (R10) <- (R10)+2.
        CoSTORE [R9+], MAS              ; h(L-1,n) <- h(L-1,n+1).
                                        ; (R9) <- (R9)+2.
;
; Coefficient Update Loop.
;
LMS_LOOP:
        CoLOAD  [R9], R0                ; (ACC) <- h(k,n)
        CoMACM  R11, [R10+], rnd        ; (ACC) <- h(k,n) - Mu.e(n)*x(n-k)+rnd
                                        ; x(n-k-1) <- x(n-k).
                                        ; (R10) <- (R10)+2.
        CoSTORE [R9+], MAS              ; h(k,n) <- h(k,n+1).
                                        ; (R9) <- (R9)+2.
;
; End_of_loop Checking.
;
        CMPD1   R3, #0h                 ; (R3) <- (R3)-1.
        JMPR    cc_NZ, LMS_LOOP         ; End-of-Loop test & branch.
;
; Coefficient Update epilog.
;
        CoLOAD  [R9], R0                ; (ACC) <- h(0,n)
        CoMACM  [R10-QR0], R11, rnd     ; (ACC) <- h(0,n) - Mu.e(n)*x(n)+rnd
                                        ; x(n-1) <- x(n).
                                        ; (R10) <- (R10)-2*(L-1).
        CoSTORE [R9-QR0], MAS           ; h(0,n) <- h(0,n+1).
                                        ; (R9) <- (R9)-2*(L-1).
                        Instruction Cycles    Program Words
Read Input samples      2                     4
Initialization          1                     2
LMS Loop                4L+2(L-2)+1           25
Post/Pre-Processing     9                     16
Write Output sample     2                     4
Total                   4L+2(L-2)+15          51
Note: The branch penalty in the LMS loop is roughly one third of the execution time of the LMS loop. Nevertheless, as shown in Section 5.2.4 - N-real Multiply (windowing), it is possible to minimize the branch penalty by "unrolling" instructions. Therefore, if URF is the UnRolling Factor, the execution times and program words count become respectively: 4L+2(L-2)/URF+1 instruction cycles, and 51+(URF-1)*2 program words.
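The trade-off in the note can be tabulated quickly for a given filter length. The helper below simply evaluates the two expressions quoted above. The function name `lms_unrolled_cost` is hypothetical, and the result is an estimate that is only as accurate as those expressions.

```python
def lms_unrolled_cost(L, urf):
    """Evaluate the cost expressions of the unrolling note.

    L   : number of filter taps
    urf : UnRolling Factor (1 means the loop is not unrolled)
    Returns (instruction_cycles, program_words).
    """
    cycles = 4 * L + 2 * (L - 2) / urf + 1      # 4L + 2(L-2)/URF + 1
    words = 51 + (urf - 1) * 2                  # 51 + (URF-1)*2
    return cycles, words


# Usage sketch for a 16-tap filter:
for urf in (1, 2, 4):
    print(urf, lms_unrolled_cost(16, urf))
```

For L = 16, unrolling by 2 lowers the estimate from 93 to 79 cycles at a cost of 2 extra program words.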
5.5.2 - Extended-Precision LMS Adaptive Filter
16-bit coefficients can be insufficient for LMS filtering. The following routine describes an LMS filter with 32-bit coefficients and 16-bit samples. In most applications 24-bit coefficients provide good results.

The Extended-Precision LMS Adaptive Filter uses the same naming convention as the Single-Precision LMS Adaptive Filter: hL(k,n) and hH(k,n) represent the LS word and MS word (respectively) of the k'th coefficient at time n. For simplicity, you are advised to clear MP of MCW.

Figure 15 shows the corresponding memory map. It is assumed that both coefficients and samples have been initialized by another routine. Note that, like the "Single-Precision LMS Adaptive Filter" on page 35, this loop is not "unrolled".

Figure 15 : Memory Map for Extended Precision LMS Adaptive Filter

DPRAM           Before          After
Low Addr.       x(n-L+1)        x(n-L+2)        <- IDX0
                x(n-L+2)        x(n-L+3)
                ...             ...
                x(n-3)          x(n-2)
                x(n-2)          x(n-1)
                x(n-1)          x(n)
High Addr.      x(n-1)          x(n)

Low Addr.       hL(L-1)         hL(L-1)         <- R9, IDX1
                hH(L-1)         hH(L-1)         <- R10
                hL(L-2)         hL(L-2)
                hH(L-2)         hH(L-2)
                ...             ...
                hL(2)           hL(2)
                hH(2)           hH(2)
                hL(1)           hL(1)
                hH(1)           hH(1)
                hL(0)           hL(0)
High Addr.      hH(0)           hH(0)
This routine assumes that the following general purpose and co-processor registers (SFRs) have been initialized once and for all, and that L is less than 31:
- R0 with 0000h
- R1 with the 16-bit MAXimum tolerated value
- R2 with the 16-bit MINimum tolerated value
- R8 with the adaptive gain value Mu
- R9 with the hL(L-1) address
- R10 with the hH(L-1) address
- IDX0 with the x(n-L+1) address
- QX0 with 2*(L-1)
- QX1 and QR0 with 4
- QR1 with 4*(L-1)
;
; Read the new filter input from a (E)SFR and move it into the DPRAM at
; x(n-1) address, therefore overwriting x(n-1).
;
        MOV     @x(n), ADC_sfr          ; move the new input x(n)
;
; FIR prolog (LSWs of Impulse response): first multiplication
;
        CoMULsu [IDX0+], [R9+QR0]       ; (ACC) <- hL(L-1)*x(n-L+1)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+4.
;
; FIR loop (LSWs of Impulse response): Repeat L-2 times the same MAC instruction.
;
        REPEAT L-3 TIMES CoMACsu [IDX0+], [R9+QR0]
                                        ; (ACC) <- (ACC)+hL(i)*x(n-i)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R9) <- (R9)+4.
;
; FIR epilog (LSWs of Impulse response): last MAC instruction and provide
; y(n) in an appropriate format
;
        CoMACsu [IDX0-QX1], [R9-QR1], rnd
                                        ; (ACC) <- (ACC)+hL(0)*x(n)+rnd
                                        ; & x(n-L+1) <- x(n-L+2),
                                        ; (IDX0) <- (IDX0)-2*(L-1),
                                        ; (R9) <- (R9)-4*(L-1).
;
; Shift
;
        CoASHR  8                       ; (ACC) <- (ACC)>>8
        CoASHR  8                       ; (ACC) <- (ACC)>>8
;
; FIR prolog (MSWs of Impulse response): first multiplication
;
        CoMAC   [IDX0+], [R10+QR0]      ; (ACC) <- (ACC)+hH(L-1)*x(n-L+1)
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R10) <- (R10)+4.
;
; FIR loop (MSWs of Impulse response): Repeat L-2 times the same MAC instruction.
;
        REPEAT L-3 TIMES CoMAC [IDX0+], [R10+QR0]
                                        ; (ACC) <- (ACC)+hH(i)*x(n-i)
                                        ; & x(n-i-1) <- x(n-i),
                                        ; (IDX0) <- (IDX0)+2,
                                        ; (R10) <- (R10)+4.
;
; FIR epilog (MSWs of Impulse response): last MAC instruction and provide
; y(n) in an appropriate format
;
        CoMAC   [IDX0-QX1], [R10-QR1]   ; (ACC) <- (ACC)+hH(0)*x(n)
                                        ; & x(n-L+1) <- x(n-L+2),
                                        ; (IDX0) <- (IDX0)-2*(L-1),
                                        ; (R10) <- (R10)-4*(L-1).
;
; Shift & Rounding
;
        CoASHR  #data3, rnd             ; (ACC) <- (ACC) >>a #data3 + rnd
;
; Limiting
;
        CoMIN   R0, R1                  ; (ACC) <- Min((ACC), MAX).
        CoMAX   R0, R2                  ; (ACC) <- Max((ACC), MIN).
;
; Write the new filter output y(n) into a (E)SFR.
;
        NOP                             ; Pipeline Effect.
        MOV     DAC_sfr, MAH            ; move the new output y(n).
;
; Read d(n) and move it into a GPR.
;
        MOV     R5, @d(n)               ; (R5) <- d(n)
;
; Error, e(n), Calculation
;
        SUB     R5, R6                  ; (R5) <- d(n)-y(n)=e(n)
        MOV     @e(n), R5               ; e(n-1) <- e(n)
        CoMUL   R5, R8                  ; (ACC) <- Mu*e(n)
        CoNEG   rnd                     ; (ACC) <- -(ACC)+rnd
        CoSTORE R11, MAS                ; (R11) <- -Mu*e(n).
;
; Coefficients' Updating
;
        MOV     R12, IDX0               ; (R12) <- (IDX0)
        MOV     IDX1, R9                ; (IDX1) <- (R9)
        MOV     R3, #L-2                ; (R3) <- L-2.
;
; Coefficient Update Prolog.
;
        CoLOAD  [IDX1+QX1], [R10-]      ; (ACC) <- h(L-1,n)
                                        ; (IDX1) <- (IDX1)+4.
                                        ; (R10) <- (R10)-2.
        CoMAC   R11, [R12+]             ; (ACC) <- h(L-1,n) - Mu.e(n)*x(n-L+1)+rnd
                                        ; (R12) <- (R12)+2.
        CoSTORE [R10+], MAL             ; hL(L-1,n) <- hL(L-1,n+1).
                                        ; (R10) <- (R10)+2.
        CoSTORE [R10+QR0], MAH          ; hH(L-1,n) <- hH(L-1,n+1).
                                        ; (R10) <- (R10)+2.
;
; Coefficient Update Loop.
;
EXT_LMS_LOOP:
        CoLOAD  [IDX1+QX1], [R10-]      ; (ACC) <- h(k,n)
                                        ; (IDX1) <- (IDX1)+4.
                                        ; (R10) <- (R10)-2.
        CoMACM  R11, [R12+]             ; (ACC) <- h(k,n) - Mu.e(n)*x(n-k)+rnd
                                        ; x(n-k-1) <- x(n-k).
                                        ; (R12) <- (R12)+2.
        CoSTORE [R10+], MAL             ; hL(k,n) <- hL(k,n+1).
                                        ; (R10) <- (R10)+2.
        CoSTORE [R10+QR0], MAH          ; hH(k,n) <- hH(k,n+1).
                                        ; (R10) <- (R10)+2.
;
; End_of_loop Checking.
;
        CMPD1   R3, #0h                 ; (R3) <- (R3)-1.
        JMPR    cc_NZ, EXT_LMS_LOOP     ; End-of-Loop test & branch.
;
; Coefficient Update epilog.
;
        CoLOAD  [IDX1+QX1], [R10-]      ; (ACC) <- h(0,n)
                                        ; (IDX1) <- (IDX1)+4.
                                        ; (R10) <- (R10)-2.
        CoMACM  R11, [R12+]             ; (ACC) <- h(0,n) - Mu.e(n)*x(n)+rnd
                                        ; x(n-1) <- x(n).
                                        ; (R12) <- (R12)+2.
        CoSTORE [R10+], MAL             ; hL(0,n) <- hL(0,n+1).
                                        ; (R10) <- (R10)+2.
        CoSTORE [R10-QR0], MAH          ; hH(0,n) <- hH(0,n+1).
                                        ; (R10) <- (R10)-(2L-1).
                        Instruction Cycles    Program Words
Read Input samples      2                     4
Initialization          4                     8
EXT LMS Loop            6L+2(L-2)+3           39
Post/Pre-Processing     9                     16
Write Output sample     2                     4
Total                   6L+2(L-2)+20          71
5.6 - Operations on Tables
The ST10-DSP is useful for computations on tables or collections of samples. Additionally, the concatenation unit can be used for computations on 32-bit operands.

5.6.1 - Detection of the Minimum or Maximum in a Collection of Samples
The ST10-DSP instructions CoMIN and CoMAX detect the minimum or the maximum between the accumulator and an external operand. Using a hardware loop with pointer auto-modification, the minimum or the maximum of a parameter in a collection of samples can be detected at a rate of 1 clock per sample, for both 16-bit and 32-bit operands.

5.6.2 - Computing the Sum of a Collection of Samples
Using the repeated instruction CoADD, the ST10-DSP computes the sum of a collection of samples at a rate of 1 clock per sample, for both 16-bit and 32-bit operands.

5.6.3 - Search for an Element Within a Collection of Samples
Using the repeated instruction CoCMP, the ST10-DSP compares an external operand with the accumulator at a rate of 1 clock per sample, for both 16-bit and 32-bit operands.

The following paragraphs show 3 examples of table oriented routines.

5.6.4 - Table Move
This routine moves a table of L 16-bit data items from one memory location to another (where L is the number of data items). "Orig_Address" is the location of the first element of the table and "Dest_Address" is its location after the table move.
; MAC dedicated registers' initialization:
;
        MOV     MRW, #L-1               ; (MRW) <- L-1.
        MOV     IDX0, #Dst_Address      ; (IDX0) <- Dst_Address.
;
; GPR initialization:
        MOV     R1, #Orig_Address       ; (R1) <- Orig_Address
;
; Move the table
        REPEAT MRW TIMES CoMOV [IDX0+], [R1+]
                                        ; ((IDX0)) <- ((R1))
                                        ; (IDX0) <- (IDX0)+2
                                        ; (R1) <- (R1)+2.

                        Instruction Cycles    Program Words
Total                   L+3                   8
5.6.5 - Find the Index of a Maximum Value in a Table
This routine finds the index of the maximum value of data x(i) for i=1 to L, contained in a table. The first element of the table is located at "Orig_Address". The operation is performed in two steps: the maximum value is detected, and then the corresponding index is detected. At the end of the routine, the maximum value is stored in the co-processor accumulator and the index is stored in R1 (GPR).
; MAC dedicated registers' initialization:
        MOV     MRW, #L-1               ; (MRW) <- L-1.
;
; Accumulator Initialization.
        MOV     MAH, #FFFFh             ; (MAH) <- FFFFh,
                                        ; (MAE) <- FFh,
                                        ; (MAL) <- 0000h.
        MOV     MAL, #FFFFh             ; (MAL) <- FFFFh,
;
; GPRs initialization:
        MOV     R0, #0000h              ; (R0) <- 0000h
        MOV     R1, #Orig_Address       ; (R1) <- Orig_Address
;
; First Iteration: Detection of the maximum value
        REPEAT MRW TIMES CoMAX R0, [R1+]
                                        ; (ACC) <- Max((ACC),x(i))
                                        ; (R1) <- (R1)+2
;
; Re-initialization:
        MOV     MRW, #L-1               ; (MRW) <- L-1.
        MOV     R1, #Orig_Address       ; (R1) <- Orig_Address
;
; Second Iteration: Detection of the corresponding index
INDEX_LOOP:
        CoCMP   R0, [R1+]               ; (MSW) <- (ACC)-x(i)
                                        ; (R1) <- (R1)+2
        NOP                             ; Pipeline effect
        JNB     MSW.9, INDEX_LOOP       ; loop if MSW-Z flag not set
;
; Store result
        SUB     R1, #2                  ; R1 is the index of the
                                        ; Maximum value.
                        Instruction Cycles    Program Words
Total                   3L/2+10 (1)           21

Note: 1. On average.
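The two-pass structure of this routine (one pass to find the maximum, a second pass to locate it) can be expressed in a few lines on a host. The sketch below is only a behavioural model. The name `index_of_max` is hypothetical, and the index returned is a zero-based element position, whereas the assembly routine leaves an address in R1.

```python
def index_of_max(table):
    """Two-pass search mirroring the routine's structure.

    table : non-empty list of samples
    Returns (maximum_value, zero_based_index_of_first_occurrence).
    """
    # Pass 1: running maximum, as the repeated CoMAX does.
    acc = table[0]
    for v in table[1:]:
        acc = max(acc, v)
    # Pass 2: compare each element with the accumulator until one
    # matches, as the CoCMP loop does.
    i = 0
    while table[i] != acc:
        i += 1
    return acc, i
```

For instance, `index_of_max([3, 9, -2, 9, 5])` yields the value 9 at index 1. The second pass stops at the first match, which is why the cycle count of the routine is quoted as an average.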
5.6.6 - Compare for Search
This routine finds the index of the first piece of data in a table which matches a specified condition "cc_cond" when compared to the contents of the accumulator. It assumes that data is stored in numerical order in the table. The same assumptions are made as for Section 5.6.4 and Section 5.6.5. When a match is made, the index is stored in R1.
; Initialization:
;
        MOV     MRW, #L-1               ; (MRW) <- L-1.
        MOV     R0, #0000h              ; (R0) <- 0000h
        MOV     R1, #Orig_Address       ; (R1) <- Orig_Address
;
; Accumulator Initialization.
        MOV     MAH, #data16            ; (MAH) <- #data16,
                                        ; (MAE) <- 8 times (MAH15),
                                        ; (MAL) <- 0000h.
;
; Second Iteration: Detection of the corresponding index
NO_MATCH:
        CoCMP   cc_GT R0, [R1+]         ; (MSW) <- (ACC)-x(i)
                                        ; (R1) <- (R1)+2
        NOP                             ; pipeline effect
        JNB     MSW.12, NO_MATCH        ; test C-flag of MSW and
                                        ; jump if no match
;
; Storing index
        SUB     R1, #2                  ; R1 is the index of the
                                        ; matching element

                        Instruction Cycles    Program Words
Total                   L/2+9 (1)             13

Note: 1. On average.
5.7 - Summary of Routines

Table 2 : Summary of routines

Routine                                                 Instruction Cycles        Program Words     Page Number
Co-Processor Initialization                             10                        19                10
Mathematics
  32 by 32 signed multiplication                        12                        24                10
  Nth Order Power Series                                9N/2+3                    22                11
  [NxN][Nx1] Matrix Multiply                            N^2+5N+5                  24                13
  N-Real Multiply (Windowing)                           2.N+3.N/URF 1)            5+4.URF           15
DSP Routines 2)
  32x16 L-tap FIR                                       2L+3                      18                17
  DF1 3) Nth Order IIR filter                           2N                        10                20
  DF2 4) Nth Order IIR filter                           2N+1                      12                23
  DF2 N-cascaded Biquads                                11N-1                     19                27
  TF 5) N-cascaded Biquads                              14N-15                    24                31
  16x16 L-tap LMS                                       4L+2(L-2)/URF+1           51+2(URF-1)       35
  32x16 L-tap LMS                                       6L+2(L-2)/URF+20          71+2(URF-1)       39
Operations on Tables
  Table Move (L items)                                  L+3                       8                 44
  Find the Index of a Maximum Value in a table (L items)  3L/2+10 6)              21                45
  "Compare For Search" 7) (L items)                     L/2+7 8)                  13                46

Notes:
1. "URF" stands for "UnRolling Factor".
2. Representative part of the routine only.
3. Direct Form 1.
4. Direct Form 2.
5. Transpose Form.
6. On average.
7. First data in a table that matches a specified condition.
8. On average.
6 - APPLICATION NOTE VERSION INFORMATION
This document was released on the 6th of November 2001.
Information furnished is believed to be accurate and reliable. However, STMicroelectronics assumes no responsibility for the consequences of use of such information nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under any patent or patent rights of STMicroelectronics. Specifications mentioned in this publication are subject to change without notice. This publication supersedes and replaces all information previously supplied. STMicroelectronics products are not authorized for use as critical components in life support devices or systems without express written approval of STMicroelectronics. The ST logo is a registered trademark of STMicroelectronics (c) 2001 STMicroelectronics - All Rights Reserved STMicroelectronics GROUP OF COMPANIES Australia - Brazil - Canada - China - Finland - France - Germany - Hong Kong - India - Israel - Italy - Japan - Malaysia - Malta - Morocco Singapore - Spain - Sweden - Switzerland - United Kingdom - United States http://www.st.com