# FPGA Implementation of Pipelined 8 Tap FIR Filter

A.Kamaraj<sup>1</sup>, C.Kalyana Sundaram<sup>2</sup>, J.Senthil Kumar<sup>3</sup>

#### ABSTRACT

FIR filters are being designed using HDL languages to enhance the speed of the system. In the whole system if the speed of the individual block is enhanced, the overall speed of the system is enhanced. In order to attain effective utilization hardware is done by applying the pipelining technique. Pipelining is an implementation technique in which multiple instructions are overlapped in execution. The proposed design of this paper is an attempt to optimize the system speed with minimal cost and hardware. The central design concept is to build filters with higher operating frequency without sacrificing the performance of original filters.

To enhance system speed and reducing implementation complexity, a lot of work has been done in the process of achieving digital signal processing by use of the FPGA. In a filter the pipelining of multiplication is achieved by shifts and addition method. This paper describes the design of Third order low pass FIR filter with pipelined architecture. The design synthesis is done using Xilinx ISE 12.1 and implemented in Spartan-3E FPGA. By pipelining the delay of FIR filters can be reduced. Pipelined technique may reduce delay and enhances speed as compared to non-pipelined technique.

Keywords: — FIR filter, Non Pipelining, Pipelining, FPGA

#### I. Introduction

Filter as the name indicates remove unwanted components of signal. Filters are used for extraction of desired signal from noisy signal which consist of unwanted disturbances. Filters can be broadly described as a signal selection system. In the digital era, the digital filter, which has attracted people's broad attention more and more, has been widely applied to communication, voice, image, automatic control, radar, aerospace, medicine and so on. As compared to analog filters, digital filters are having higher precision, better stability and reliability. Digital Filters do not have matching problems. Digital filters represent time division multiplexing which can complete some filtering tasks that a number of analog filters is incapable of doing. We can achieve digital filters by hardware circuits. Also we have an alternative approach as computer software programming.

There are two types of digital filters Finite impulse response filter and infinite impulse response filters. Speaking from the realization, IIR adopts recursive structure and uses the rational fraction which is equal to the ratio of two polynomials approximate to the frequency character, so it is able to get better frequency selection characters by use of less order but it is gained at the cost of the non-linear phase characters. At the same time, the existence of feedback slip requires the higher system stability and easily causes oscillation. Compared with the Infinite Impulse Response (IIR), the FIR filter is capable of meeting the strict requirements of the amplitude and phase characters, avoiding the drift and noise and so on,

<sup>1.</sup> AP / ECE, Mepco Schlenk Engineering College Sivakasi kamarajvlsi@gmail.com

<sup>2.</sup> AP / ECE, Mepco Schlenk Engineering College Sivakasi, gullycks@yahoo.co.in

<sup>3.</sup> AP / ECE, Mepco Schlenk Engineering College Sivakasi, senvimjag@gmail.com

which is generated by the IIR filter, is easily achieved by hardware and has the precise linear-phase and high system stability. In recent years, with the rapid development of the Very Large Scale Integrated (VLSI) technology, FPGA (Field programmable gate array) has been widely applied because of its re-programmability,. re-configurability, low-cost, high logic density and high reliability. FPGA, the structured internal logic array and rich connection resources, particularly can be useful to the digital signal processing [1]. FPGA is a programmable logic device which is having user programmable features. This actually can minimize system design risk and maintenance and shorten the design cycle. To enhance system speed and reducing implementation complexity a lot of work have been done in the process of achieving digital signal processing by use of the FPGA.FIR algorithm can use the software to achieve in the Digital Signal Processor (DSP) [6] or Central Processing Unit (CPU), but in a number of systems demanding high real-time, because of the ordinal implementation of the procedures, the software based on real-time often cannot meet the requirements. So it must use hardware to achieve a high speed. Now FPGA becomes a good choice.

## Direct form structure of fir filter:



Figure 1.1 Direct form structure of FIR filter

There are three basic modules Adder, Delay and Multiplier modules as indicated in Figure 1.1. For an M tap filter, the number of adder and delay modules will be M-1, and the number of multiplier module will be M. Differential equation of digital FIR Filter can be described as [8]

$$y(n) = \sum_{i=0}^{M-1} x(n-i)h(i)$$
 —(1)

#### FIR Filter Design:

FIR filter has been designed using Window method. Matlab special toolbox FDA Tool (Filter Design & Analysis) has been used to simulate and design filter, the filter amplitude-frequency characteristics meet the requirements [2], [3] .The filter coefficients exported to a text file, 8-order impulse response coefficients of FIR filters are as follows:

FIR Filter Design parameters has been given as follows:

- 1) Order of filter 8
- 2) Window type-keiser window
- 3) Sampling Frequency fs=5000
- 4) Cut off frequency fc=1000
- 5) Input data size-16 bit
- 6) Filter Coefficients

## Design flow of Air Filters:



Figure 1.2 Design flow of FIR filters

The design of a digital filter involves following five steps as indicated in figure 1.2.

Filter Specification: This may include stating the type of filter, for example low pass filter, the desired

amplitude and/or phase responses and the tolerances, the sampling frequency, the word length of the input data.

Filter coefficient calculation: The coefficient of a transfer function H (z) is determined in this step, which will satisfy the given specification [7]. The choice of coefficient calculation method will be influenced by several factors. The most important of which are the critical requirements i.e. specification.

**Realization:** This involves converting the transfer function into a suitable filter network or structure.

**Implementation:** This involves producing the software code and/or hardware and performing the actual filtering [6], [10].

## Structure of 8 - Tap FIR Filer:



Figure 1.3 structure of 8-tap FIR filter

#### Pipelining:

In pipelining, any operation along the critical path is broken into smaller, quicker operation with registers between levels, so as to get a smaller critical path delay. As a result there will be an increase in operating frequency which leads to higher throughput [4]. Pipelining transformation leads to a reduction in the critical path, which can be exploited to either increase

the clock speed or sample speed or to reduce power consumption at same speed. Moreover some real time application requires faster input rates where direct structures cannot be used. In such cases we can use pipelining by introducing latches along the critical data path.

The critical path or the minimum time required for processing a new sample is limited by 1 multiply time and 2 add times, i.e., if  $T_M$  is the time taken for multiplication and  $T_A$  is time needed for addition operation then the "sample period"  $(T_{sample})$  is given by [4],

$$T_{\text{sample}}e^{"T}_{M}+2T_{A}$$
.

Therefore, the sampling frequency (s<sub>ample</sub>) (also referred to as the throughput or the iteration rate) is given by

$$f_{sample} = \frac{1}{T_{\rm M} + 2T_{\rm A}}$$

The direct form structure shown in the figure can only be used when the above equations are satisfied. But if some real-time applications demands a faster input rate (sample rate), then this structure cannot be use. In that case, the effective critical path can be reduced by using either pipelining or parallel processing [4].

The following points give the significances of pipelining;

- > The speed of architecture (or the clock period) is limited by the longest path between any 2 latches or between an input and a latch or between a latch and an output or between the input and the output.
- > This longest path or the "critical path" can be reduced by suitably placing the pipelining latches in the architecture.

#### Structure of 8 - Tap Pipelined FIR Fitler:



Figure 1.4: Structure of pipelined FIR filter

Hence from the figures 1.3 and 1.4, its clear that by introducing additional latches the critical path is reduced from TM+7TA to TM+TA [9].

#### Booth Multiplier:

Multiplier modules are common to many DSP applications. The fastest types of multipliers are parallel multipliers. Among these, the Wallace multiplier is among the fastest. However, they suffer from a bad regularity. Hence, when regularity, high performance and low power are primary concerns, Booth multipliers tend to be the primary choice. Booth multipliers allow the operation on signed operands in 2's-complements [5]. They derive from array multipliers where, for each bit in a partial product line, an encoding scheme is used to determine if this bit is positive, negative or zero.

The Modified Booth algorithm achieves a major performance improvement through radix-4 encoding. In this algorithm each partial product line operates on 2 bits at a time, thereby reducing the total number of the partial products. This is particularly true for operands using 16 bits or more.

## Booth Multiplication Table:

Booth Multiplier can be derived using the specalized method as tablulated in Table 1.1. Here 3 bits of the multiplier is taken into account to identify the Booth Multiplier.

Table 1.1: Radix 4 modified Booth algorithm for odd values of i

| X(i) X(i-1) |   | X(i-2) | 5   |  |
|-------------|---|--------|-----|--|
| 0           | 0 | Q      | +0  |  |
| D           | 0 | 1      | +)  |  |
| 0           | ı | Ú      | +1/ |  |
| 0           | 1 | 1      |     |  |
| ı           | 0 | 0      | -2y |  |
| 1           | 6 | I      | -3. |  |
| I           | 1 | 8      | •3' |  |
| 1           | 1 | 1      | ÷Ú  |  |

#### Simulation results for 8-Tap FIR Filter:

The 8 Tap FIR filter is designed with booth multiplier, adder and D-Flip Flop using Modelsim and its result are shown in Figure 1.5, 1.6 and 1.7.

| Messa                 | 988        |                                    |
|-----------------------|------------|------------------------------------|
| <b>D</b> -⊅ /mult1/x  | -No Data-  | 1011160101011010                   |
| <b>⊡</b> -5 /multily  | -No Data-  | 1100111110100010                   |
| 🕦 🥎 /multi1/w         | · No Data- | 1111001010101001100010101011111010 |
| <b>t</b> -'s fedtife  | -No Data-  | 0000110101011(0100010101111110100  |
| <b>⊞-</b> ∜ /mult1/a  | No Data-   | 01000110101001011                  |
| <b>g-</b> ↑ /mulb1/b  | -No Data-  | 1101110010101010                   |
| <b>c</b> i_^ imulti/c | -No Data-  | - 10100011010100101 <del>1</del>   |
| te-^ /multi/d         | -No Data-  | 00100011010100101                  |
| ∰-∮ multije o         | -No Data-  | 0000000000000000                   |
| <b>c-</b> ^ /multi\f  | -No Data-  | 00000000000000000                  |
| <b>g-</b> ♦ /mult1/g  | No Data    | 11911100101011010                  |
| <b>p</b> ^ multifi    | No Data    | 00100011010100101                  |
| <i>♦  </i> multi/yl   | -No Data-  |                                    |
| 4 /mult1/y3           | -No Data-  |                                    |

Figure 1.5: Simulation result of multiplier module

| g-^ /addsub32/dabaa | 10101011010100010 | 10101011010100100101010101000010101  |
|---------------------|-------------------|--------------------------------------|
| g-^ jaddsub32jdatab | 1111000100011110  | 11110001000111101010101000101010101  |
| 🕹 jaddalb32jadd_sub | St1               |                                      |
| 🏄 Jaddsub32/dk      | 511               |                                      |
| 🚣 faddaúi 32 feault | 10011100011110001 | 100111000111000101011101101101111110 |

Figure 1.6: Simulation result of adder module



Figure 1.7: Simulation result of delay module

The Non-Pipelined and Pipelined filter were desingned by combining the above modules. The simulation results Figure 1.8 and Figure 1.9 shows that the Pipelined 8 tap filter produces results quicker than Non-pipelined filter.



Figure 1.8: Simulation result of Non pipelined FIR filter



Figure 1.9: Simulation result of pipelined FIR filter

## Synthesis report for 8 TAP FIR filter:

Xilinx ISE 12.1i is used for implementation of the Non-pipelined and 8 tap Pipelined FIR filter in Spartan 3E FPGA. The resource utilization of the Pipelined system is 46% compared to the non pipelined system from Figure 1.10 and 1.11.

The comparision between Pipelined and Non-Pipelined system is analysed in terms of Number of LUTs, Minimum Period and Maximum Frequency as shown in Table 1.2.

| Logic Utilization                              | Used  | Available | Utilization | Note(s) |
|------------------------------------------------|-------|-----------|-------------|---------|
| Number of Slice Flip Flops                     | 398   | 9,312     | 4%          |         |
| Number of 4 input LUTs                         | 5,896 | 9,312     | 63%         |         |
| Number of occupied Slices                      | 3,165 | 4,656     | 67%         |         |
| Number of Slices containing only related logic | 3,165 | 3,165     | 100%        |         |
| Number of Slices containing unrelated logic    | 0     | 3,165     | 0%          |         |
| Total Number of 4 input LUTs                   | 5,896 | 9,312     | 63%         |         |
| Number of bonded IOBs                          | 177   | 190       | 93%         |         |
| Number of BUFGMUXs                             | 1     | 24        | 4%          |         |
| Average Fanout of Non-Clock Nets               | 4.32  |           |             |         |

Figure 1.10: Synthesis report of Non pipelined FIR filter on Spartan-3E

| Logic Utilization                              | Used  | Available | Urilization | Note(s) |
|------------------------------------------------|-------|-----------|-------------|---------|
| Number of Slice Flip Flops                     | 448   | 9,312     | 496         |         |
| Number of ≠ input LUTs                         | 4,367 | 9,312     | 46%         |         |
| Number of occupied Slices                      | 2,330 | 4,656     | 50%         |         |
| Number of Slices containing only related logic | 2,330 | 2,330     | 100%        |         |
| Number of Slices containing unrelated logic    | Q     | 2,330     | 0%          |         |
| Total Number of 4 input LUTs                   | 4,374 | 9,312     | 46%         |         |
| Number used as logic                           | 4,367 |           |             |         |
| Number used as a route-thru                    | 7     |           |             |         |
| Number of bonded IOBs                          | 177   | 190       | 93%         |         |
| Number of BUFGMUXs                             | 1     | 2-4       | -19-6       | A.C     |
| Average Egnout of Non-Clock Nets               | 4.39  |           |             |         |

Figure 1.11: Synthesis report of pipelined FIR filter on Spartan-3E

## Pipelining (Vs) Non Pipelining:

Table 1.2: Comparison of Non pipelined and Pipelined FIR filter

|                   | 3 TAP          |            | 8 TAP          |            |
|-------------------|----------------|------------|----------------|------------|
|                   | NON PIPELINING | PIPELINING | NON PIPELINING | PIPELINING |
| LUTS              | 2239           | 1113       | 5896           | 4367       |
| MINIMUM PERIOD    | 38.344 ns      | 10.280 ns  | 44.069 ns      | 10.192 ns  |
| MAXIMUM FREQUENCY | 26.080 MHZ     | 97.276 MHZ | 22.692MHZ      | 98.116 MHZ |

#### Resource Utilisation in FPGA:

FPGA editor view in Xilinx describes about the device utilization for the design. The architectural view of FPGA shown in Figure 1.12 and 1.13 provides the details of amount of resources occupied in the device for Non-pipelined and pipelined FIR filter. By comparing both the figures, it is evident that the device utilized for pipelined FIR filters is much reduced when compared with that of the non-pipelined FIR filters.



Figure 1.12 FPGA of Non pipelined FIR filter



Figure 1.13 FPGA of Pipelined FIR filter

#### VIII, Conclusion

This paper mainly describes design and implementation of Pipelined 8 tap FIR filter using FPGA. As the proposed design of this paper is to enhance the speed of the FIR filter, by introducing pipelining operating frequency of the FIR filter is increased approximately 4 times compared to non pipelined filter. Hence the speed and device utilization (LUTs) of the system is improved from the results obtained through simulation and impentation.

## Future Work:

The design and implementation of 8-tap FIR filter is discussed in this paper. Further the design can be extended to n-tap filter to be used in DSP applications.

#### REFERENCES:

- [1] Jiang Xiaoyan and Bao Yujun, "FIR Filter Design Based on FPGA", 2010 International Conference on Computer Application and System Modeling (ICCASM 2010).
- [2] Jieshan Lin and Huang Shizhen, "An Design of the 16-order FIR Digital Filter Based on FPGA", 2009, International Conference on Information Science and Engineering (ICISE2009).
- [3] Zhang Chi and Guo Li Li, Design of FIR filter with Matlab and running on FPGA, Applied Science and Technology. vol. 33, no. 6, pp.83, Jun. 2006
- [4] Sami Khorbotly, Joan E. Carletta, Robert J. Veillette "A Methodology for Implementing Pipelined Fixed-Point Infinite Impulse Response Filters" 41st Southeastern Symposium on System Theory University of Tennessee Space Institute Tullahoma, TN, USA, March 15-17, 2009
- [5] A. Shaw and M. Ahmed, "Pipelined recursive digital filters: a general look-ahead scheme and optimal approximation," IEEE Trans. On Circuits and Systems II: Analog & Digital Signal Processing, vol. 46, no. 11, pp. 1415–1420, Nov. 1999.
- [6] Chao-Huang Wei, Hsiang-Chieh Hsiao, Su-Wei Tsai "FPGA Implementation of FIR Filter with smallest Processor" IEEE, 2005.
- [7] Mohamed Al Mahdi Eshtawie, and Masuri Bin Othman "An Algorithm Proposed for FIR Filter Coefficients Representation". International Journal of Applied Mathematics and Computer Sciences 4;1 2008
- [8] Keshab K.Parhi "VLSI digital signal processing systems" Jhon Wiley and Sons publishing company, New Delhi, 2008.

- [9] Narendra Singh Pal, Harjit Pal Singh, R.K.Sarin and Sarabjeet Singh, "Implementation of High Speed FIR Filter using Serial and Parallel Distributed Arithmetic Algorithm", International Journal of Computer Applications, Volume 25-No.7, July 2011.
- [10] V.Sudhakar, N.S.Murthy, L.Anjaneyulu, "Area Efficient Pipelined Architecture For Realization of FIR Filter Using Distributed Arithmetic", International Conference on Industrial and Intelligent Information, vol.31, 2012.

## Author's Biography



Mr.A.Kamaraj, currently working as a Asst. Prof. in Mepco Schlenk Engineering College, Sivakasi for the past 4.5 years. He has completed his PG in VLSI Design in Kongu Engg. College, Erode in the year 2006 and

UG in the discipline of ECE in Sri Ramakrishna Engg. College, Coimbatore during 1999 to 2003. He has published 1 International and 2 National Papers. His area of interest is System on Chip. Also he has attended and organized many National and International Seminars/Courses.



Mr. C.Kalyana Sundaram, currently working as a Asst. Prof. in Mepco Schlenk Engineering College, Sivakasi for the past 4.5 years. He has completed his PG in VLSI Design in Kongu Engg. College, Erode in the

year 2006 and UG in the discipline of ECE in PSR Engg. College, Sivakasi during 2000 to 2003. He has published 3 International and 5 National Papers. His area of interest is VLSI Design. Also he has attended and organized many National and International Seminars/Courses.



Mr. J.Senthil Kumar, currently working as a Asst. Prof. in Mepco Schlenk Engineering College, Sivakasi for the past 5.5 years. He has completed his PG in Communication Engineering in NEC, Kovilpatti in the year 2005 and

UG in the discipline of ECE in PSR Engg. College, Sivakasi during 1999 to 2003. He has published 1 International and 12 National Papers. His area of interest is Embedded System. Also he has attended and organized many National and International Seminars/Courses.