Middle-East Journal of Scientific Research 23 (4): 750-755, 2015 ISSN 1990-9233 © IDOSI Publications, 2015 DOI: 10.5829/idosi.mejsr.2015.23.04.22167

# High Speed Multiplication and Accumulation (MAC) Design for Digital FIR Filter

<sup>1</sup>S. Chinnapparaj and <sup>2</sup>D. Somasundareswari

<sup>1</sup>Department of ECE, Hindusthan Institute, Coimbatore, Tamilnadu, India <sup>2</sup>Dean/Electronics and Communication, SNS College of Technology, Coimbatore, Tamilnadu, India

**Abstract:** In this paper, an efficient MAC (Multiplication and Accumulation) unit is designed to increase the speed and throughput of a digital FIR filter. Wireless communication technologies have increased the demand for signal processing operations, and the Finite Impulse Response (FIR) filter is one of the key building blocks of signal processing. Considerable effort has gone into improving the performance of digital FIR filters. The best existing accumulation unit, the BEC (Binary to Excess-1 conversion) based SQRT CSLA (Square Root Carry Select Adder), is modified by re-designing the carry selection block. In addition, the modified SQRT CSLA is incorporated into a reduced complexity Wallace multiplier for the addition process. Implementation results show that the proposed MAC unit (reduced complexity Wallace multiplier with modified SQRT CSLA) provides higher speed and occupies less area than the other existing MAC units. Further, the designed MAC unit is integrated into a direct form FIR filter to improve the filtering performance.

Key words: Modified SQRT CSLA (Square Root Carry Select Adder) • BEC based SQRT CSLA • MAC (Multiplication and Accumulation) • Reduced complexity Wallace multiplier • Direct form FIR filter

## INTRODUCTION

Improvements in wireless standards and mobile computing applications have created a strong demand for low power digital signal processing (DSP) architectures. One of the most important DSP operations is Finite Impulse Response (FIR) filtering. An FIR filter is a type of digital filter that offers linear phase and stability characteristics. Considerable effort has gone into implementations of the direct form FIR filter to improve its performance. In this paper, a MAC unit for the FIR filter is designed with reduced chip area, delay and power. The input-output relation of a linear time invariant (LTI) direct form FIR filter is given in equation (1).

$$y(n) = \sum_{k=0}^{N-1} C_k \, x(n-k) \tag{1}$$

where x(n) represents the filter input, y(n) represents the filter output, N is the filter length (the number of taps, so the filter order is N-1) and C<sub>k</sub> denotes the filter coefficients. The filter length N is fixed in a direct form FIR filter. The heart of the direct form FIR filter is the MAC unit. To implement the MAC unit in a VLSI system design environment, efficient adder and multiplier structures are required that address the main VLSI concerns (low power consumption, small area and high speed). FIR filters with a large number of taps are generally necessary to obtain high spectral containment and noise reduction. The computational delay and required chip area of the direct form FIR filter increase when inefficient adder and multiplier structures are used. In earlier work [1], Canonic Signed Digit (CSD) representation is used to reduce the number of adders and multipliers. Multiple Constant Multiplication (MCM) is used in [2] to perform the constant multiplications of the direct form FIR filter, but this approach cannot be used when the filter coefficients change dynamically. Hence, effective Wallace multipliers based on compressors are proposed in [3-12]. The Modified Booth Algorithm [8] is used in the design of a Wallace multiplier to further improve the multiplication process, and effective adder structures are used for adding the partial product results [2]. The Square Root Carry Select Adder (SQRT CSLA) is one of the best adders in terms of area and power. It generally consists of Half Sum Generation (HSG), Carry Generation (CG) and Full Sum Generation (FSG) units, as in [3]. The modified Carry Save Adder also produces good results for the addition process [6, 10].
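As a concrete reading of equation (1), assume a filter length of N = 4 (a value chosen here only for illustration). The sum then expands into one product term per tap:

$$y(n) = C_0\,x(n) + C_1\,x(n-1) + C_2\,x(n-2) + C_3\,x(n-3)$$

Each output sample therefore needs N multiplications and N-1 additions, which is why the speed and area of the multiplier and of the adder dominate the cost of the filter.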

In this paper, an effective MAC unit is designed using a reduced complexity Wallace multiplier. The main objectives are to modify the SQRT CSLA and to incorporate the modified SQRT CSLA into the reduced complexity Wallace multiplier. The SQRT CSLA circuit is re-designed to reduce the number of gates, and the modified adder is then used inside the reduced complexity Wallace multiplier. The proposed design offers better performance in terms of chip area, delay and power than the existing reduced complexity Wallace multiplier that uses a modified carry save adder. Further, both the existing and the proposed reduced complexity Wallace multipliers are incorporated into a direct form FIR filter to analyze their performance.

**Existing Reduced Complexity Wallace Multiplier:** A Wallace multiplier is a parallel multiplier. The architecture of the reduced complexity Wallace multiplier [11] uses fewer half adders and full adders to reduce the partial products. In the reduced complexity Wallace multiplier, the partial products are generated by N<sup>2</sup> AND gates and arranged in a triangular pattern.

The procedure for reducing the partial products in the reduced complexity Wallace multiplier is as follows:

- The partial product matrix is divided into groups of three rows.
- Within a group, every column that contains three bits is added using a full adder.
- Columns that contain a single bit or two bits are passed to the next stage directly.
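The reduction steps above can be sketched in software. The following is a simplified column-wise model written for illustration, not the row-grouped netlist of [11]: every set of three bits in a column goes through a full adder, leftover single bits and pairs are passed on unchanged, and the half adders that the reduced complexity scheme uses in some positions are omitted. The function name `wallace_multiply` and the parameter `n` are assumptions of this sketch.

```python
def wallace_multiply(a, b, n=8):
    """Multiply two unsigned n-bit integers with a column-wise Wallace-style reduction."""
    # Partial products: n * n AND gates; bit (i, j) has weight 2**(i + j).
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append((a >> i) & (b >> j) & 1)

    # Reduce until no column holds more than two bits.
    while any(len(col) > 2 for col in columns):
        next_columns = [[] for _ in range(len(columns) + 1)]
        for weight, col in enumerate(columns):
            full = len(col) // 3
            for g in range(full):
                x, y, z = col[3 * g: 3 * g + 3]
                # Full adder: sum stays in this column, carry moves one column up.
                next_columns[weight].append(x ^ y ^ z)
                next_columns[weight + 1].append((x & y) | (y & z) | (x & z))
            # One or two leftover bits are passed to the next stage directly.
            next_columns[weight].extend(col[3 * full:])
        columns = next_columns

    # Final stage: the two remaining rows go to a carry-propagate adder,
    # modelled here by ordinary integer addition.
    row_a = sum(col[0] << w for w, col in enumerate(columns) if len(col) >= 1)
    row_b = sum(col[1] << w for w, col in enumerate(columns) if len(col) >= 2)
    return row_a + row_b
```

Every full adder replaces three bits by two, so the loop always terminates, and the last addition is the step for which an efficient final adder is needed.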

In the final stage, an effective adder structure is required for the binary addition. In the existing system, a modified Carry Save Adder is used for this addition, but it requires more chip area and delay. Hence, an efficient adder structure is still needed to improve the performance of the reduced complexity Wallace multiplier. To meet this requirement, the SQRT CSLA is re-designed in this paper. The modified SQRT CSLA reduces the chip area and delay of the addition process.

**Modified SQRT CSLA:** The general architecture of the SQRT CSLA consists of a Ripple Carry Adder unit for input carry 0 (RCA0), a Ripple Carry Adder unit for input carry 1 (RCA1) and a full sum generation (FSG) unit. Because two RCA units are used, one for each possible input carry, this scheme needs more chip area and more delay to select the carry outputs. The RCA1 unit was later replaced by a BEC unit to reduce the delay, and the resulting design is called the BEC based SQRT CSLA. However, this architecture still requires considerable chip area and is slow. To overcome this problem, the SQRT CSLA circuit is re-designed. The modified SQRT CSLA consists of a Half Sum Generation (HSG) unit, a Full Sum Generation (FSG) unit, Carry Generation (CG) units for input carry 0 and input carry 1, and a Carry Selection (CS) unit. It uses fewer logic gates than the BEC based SQRT CSLA. The architecture of the modified SQRT CSLA for 4-bit addition is shown in Figure 1, and it can be extended to 16-bit addition in the same way. The number of gates is also reduced by sharing common Boolean expressions, as indicated in the CG unit of the proposed SQRT CSLA: the common expression ab+c is used by both CG units and the CS unit. Hence, the proposed SQRT CSLA offers less area and delay than the conventional BEC based SQRT CSLA [5].
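For comparison, the conventional BEC based selection can be modelled behaviourally. The sketch below assumes one adder group whose operands are given as bit lists, least significant bit first; the helper names `ripple_carry_add`, `bec` and `bec_group` are illustrative and do not come from the paper. A single ripple carry adder computes the result for input carry 0, the BEC adds one to that result, and the actual input carry selects between the two.

```python
def ripple_carry_add(a_bits, b_bits, carry_in=0):
    """Ripple carry adder on LSB-first bit lists; returns (sum_bits, carry_out)."""
    sums, carry = [], carry_in
    for x, y in zip(a_bits, b_bits):
        sums.append(x ^ y ^ carry)
        carry = (x & y) | (carry & (x ^ y))
    return sums, carry


def bec(bits):
    """Binary to Excess-1 conversion: returns bits + 1 at the same width."""
    out, carry = [], 1
    for bit in bits:
        out.append(bit ^ carry)
        carry = bit & carry
    return out


def bec_group(a_bits, b_bits, carry_in):
    """One BEC based carry-select group; returns (sum_bits, carry_out)."""
    sums, carry = ripple_carry_add(a_bits, b_bits, 0)  # result for input carry 0
    zero_case = sums + [carry]
    one_case = bec(zero_case)                          # result for input carry 1
    chosen = one_case if carry_in else zero_case       # multiplexer
    return chosen[:-1], chosen[-1]
```

The excess-1 word always fits in the group width plus one carry bit, so no information is lost by the conversion.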

Similar modifications are made to the 2-bit, 3-bit and 5-bit SQRT CSLA groups, and combining all of these groups gives the 16-bit SQRT CSLA structure. With these modifications, four sets of binary additions are performed concurrently. This architecture is therefore named the modified SQRT CSLA. The block diagram of the 16-bit modified SQRT CSLA is shown in Figure 2.
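A behavioural sketch of the HSG, CG, CS and FSG arrangement described above is given below. It is a software model under stated assumptions, not the gate-level circuit of Figure 1: the group widths (2, 2, 3, 4, 5) are assumed here as a common partition of a 16-bit SQRT CSLA, and all function names are hypothetical.

```python
def to_bits(value, width):
    """Unsigned integer to LSB-first bit list."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits):
    """LSB-first bit list to unsigned integer."""
    return sum(bit << i for i, bit in enumerate(bits))


def carry_chain(half_sums, half_carries, carry_in):
    """CG unit: carry into every bit position, plus the group carry-out."""
    carries = [carry_in]
    for p, g in zip(half_sums, half_carries):
        carries.append(g | (p & carries[-1]))
    return carries


def csla_group(a_bits, b_bits, carry_in):
    """One group of the carry-select adder; returns (sum_bits, carry_out)."""
    half_sums = [x ^ y for x, y in zip(a_bits, b_bits)]     # HSG
    half_carries = [x & y for x, y in zip(a_bits, b_bits)]  # HSG
    chain0 = carry_chain(half_sums, half_carries, 0)        # CG for input carry 0
    chain1 = carry_chain(half_sums, half_carries, 1)        # CG for input carry 1
    chosen = chain1 if carry_in else chain0                 # CS
    sums = [p ^ c for p, c in zip(half_sums, chosen)]       # FSG
    return sums, chosen[-1]


def sqrt_csla_add(a, b, widths=(2, 2, 3, 4, 5), carry_in=0):
    """Add two unsigned integers group by group; widths are an assumption."""
    total = sum(widths)
    a_bits, b_bits = to_bits(a, total), to_bits(b, total)
    result, carry, pos = [], carry_in, 0
    for width in widths:
        sums, carry = csla_group(a_bits[pos:pos + width],
                                 b_bits[pos:pos + width], carry)
        result.extend(sums)
        pos += width
    return from_bits(result) | (carry << total)
```

In hardware the two carry chains of every group are evaluated in parallel, and only the selection depends on the carry arriving from the previous group. The sequential loop here reproduces the arithmetic result, not the timing.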

**Reduced Complexity Wallace Multiplier Using Modified SQRT CSLA:** The reduced complexity Wallace multiplier is a parallel multiplier in which the complexity of the multiplication process is reduced. Its partial products are generated by N<sup>2</sup> AND gates and arranged in a triangular pattern. The procedure for generating the partial products is the same as in the existing reduced complexity Wallace multiplier and is shown in Figure 3 [9].

To add the partial product outputs, an efficient adder structure is essential in the final stage of the reduced complexity Wallace multiplier. In the proposed work, the modified SQRT CSLA is therefore used for the addition process of the reduced complexity multiplier structure. This provides better performance in terms of area, delay and power than the existing reduced complexity Wallace multiplier. The proposed MAC unit (reduced complexity Wallace multiplier with modified SQRT CSLA) is well suited to digital FIR filters.



Fig. 1: Architecture of modified SQRT CSLA



Fig. 2: Block diagram of 16-bit modified SQRT CSLA



Fig. 3: Partial products generation of reduced complexity Wallace multiplier



Fig. 4: Structure of direct form FIR filter

**Proposed Direct Form FIR Filter:** The structure of the direct form FIR filter is shown in Figure 4. It consists of multipliers, adders and delay units that together perform the digital filtering operation. From Figure 4, it is clear that the performance of the direct form FIR filter depends mostly on the MAC unit. Low power and low area schemes have been developed for FIR filters in previous work. To further improve the performance of the digital FIR filter, the proposed MAC unit (reduced complexity Wallace multiplier with modified SQRT CSLA) is incorporated into the direct form FIR filter. Compared with the direct form FIR filter that uses the existing reduced complexity Wallace multiplier, the proposed direct form FIR filter using the modified SQRT CSLA based reduced complexity Wallace multiplier provides better results. Digital FIR filters with a moderately large number of taps, as shown in Figure 4, are essential to attain high spectral suppression and/or noise reduction [7].
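The direct form structure of multipliers, adders and delay units can be illustrated with a short software model. This is a minimal sketch assuming integer coefficients and zero initial state; the class name `DirectFormFIR`, its method names and the coefficient values in the usage note are hypothetical.

```python
from collections import deque


class DirectFormFIR:
    """Direct form FIR filter: a delay line feeding one MAC operation per tap."""

    def __init__(self, coefficients):
        self.coefficients = list(coefficients)
        taps = len(self.coefficients)
        # Delay units: holds x(n), x(n-1), ..., x(n-N+1), initially zero.
        self.delay_line = deque([0] * taps, maxlen=taps)

    def step(self, sample):
        """Shift in one input sample and return the output y(n)."""
        # With maxlen set, appendleft discards the oldest sample on the right.
        self.delay_line.appendleft(sample)
        accumulator = 0
        for coefficient, delayed in zip(self.coefficients, self.delay_line):
            accumulator += coefficient * delayed  # multiply, then accumulate
        return accumulator

    def run(self, samples):
        """Filter a whole sequence of samples."""
        return [self.step(sample) for sample in samples]
```

For example, `DirectFormFIR([1, 2, 3]).run([1, 0, 0, 0])` returns the impulse response `[1, 2, 3, 0]`, that is, the coefficients themselves followed by zero. In the hardware design the multiply and the accumulate inside the loop are the operations carried out by the MAC unit.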

## RESULTS AND DISCUSSIONS

The aim of the proposed work is to improve the performance of the MAC unit for digital FIR filters. An efficient MAC unit is designed using the reduced complexity Wallace multiplier and the modified SQRT CSLA. Both are described in Verilog Hardware Description Language (Verilog HDL). The results for the MAC unit are obtained from simulation and synthesis tools and are analyzed and compared as follows.

Synthesis results for the BEC based SQRT CSLA and the modified SQRT CSLA are compared in Table 1. The results show that the modified SQRT CSLA requires less area and delay than the BEC based SQRT CSLA. This comparison is shown graphically in Figure 5. The modified SQRT CSLA offers a 29.26% reduction in area and a 4.23% reduction in delay compared with the BEC based SQRT CSLA [9].

Fig. 5: Performance of BEC based SQRT CSLA and modified SQRT CSLA

Fig. 6: Performance of existing and proposed reduced complexity Wallace multiplier

Table 1: Comparison of BEC based SQRT CSLA and modified SQRT CSLA

| Type                | Slices | LUTs | Delay (ns) |
|---------------------|--------|------|------------|
| BEC based SQRT CSLA | 41     | 75   | 20.717     |
| Modified SQRT CSLA  | 29     | 53   | 19.839     |

Table 2: Comparison of reduced complexity Wallace multiplier with BEC based SQRT CSLA and with modified SQRT CSLA

| Type                                                                     | Slices | LUTs | Delay (ns) | Power (mW) |
|--------------------------------------------------------------------------|--------|------|------------|------------|
| Existing reduced complexity Wallace multiplier with BEC based SQRT CSLA  | 119    | 224  | 21.499     | 264        |
| Proposed reduced complexity Wallace multiplier with modified SQRT CSLA   | 80     | 155  | 17.74      | 224        |

The modified SQRT CSLA is applied to the reduced complexity Wallace multiplier for the multiplication process. The results for the reduced complexity Wallace multiplier with the BEC based SQRT CSLA and with the modified SQRT CSLA are compared in Table 2. The proposed reduced complexity Wallace multiplier with the modified SQRT CSLA offers a 32.73% reduction in area, a 17.48% reduction in delay and a 15.15% reduction in power compared with the existing reduced complexity Wallace multiplier with the BEC based SQRT CSLA. This comparison is shown graphically in Figure 6.
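The percentage reductions quoted for Table 1 and Table 2 can be rechecked directly from the tabulated values. The short script below assumes the usual definition, reduction = (existing - proposed) / existing, and takes the slice count as the area measure; the dictionary and function names are illustrative. Differences from the quoted figures in the second decimal place may come from rounding or truncation.

```python
def reduction_percent(existing, proposed):
    """Relative reduction of `proposed` with respect to `existing`, in percent."""
    return 100.0 * (existing - proposed) / existing


# (existing, proposed) pairs copied from Table 1 and Table 2.
TABLE_1 = {
    "slices": (41, 29),
    "luts": (75, 53),
    "delay_ns": (20.717, 19.839),
}
TABLE_2 = {
    "slices": (119, 80),
    "luts": (224, 155),
    "delay_ns": (21.499, 17.74),
    "power_mw": (264, 224),
}

if __name__ == "__main__":
    for title, table in (("Table 1", TABLE_1), ("Table 2", TABLE_2)):
        for metric, (existing, proposed) in table.items():
            value = reduction_percent(existing, proposed)
            print(f"{title} {metric}: {value:.2f}% reduction")
```

With these inputs the delay and power reductions of Table 2 reproduce the quoted 17.48% and 15.15%, while the slice counts of Table 2 give roughly 32.8%, close to the quoted 32.73%.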

Further, the proposed reduced complexity Wallace multiplier is incorporated into the direct form FIR filter to improve its performance. The proposed MAC unit is therefore well suited to digital signal processing and wireless communication applications [10].

## CONCLUSION

In this paper, a high speed and area efficient MAC unit for digital FIR filters is designed using a reduced complexity Wallace multiplier and a modified SQRT CSLA. The conventional BEC based SQRT CSLA is re-designed to reduce the chip area and delay of the addition process. The modified SQRT CSLA is incorporated into the reduced complexity Wallace multiplier to improve the performance of the multiplication process. The proposed reduced complexity Wallace multiplier offers a 32.73% reduction in area, a 17.48% reduction in delay and a 15.15% reduction in power compared with the existing reduced complexity Wallace multiplier. Further, the proposed reduced complexity Wallace multiplier is incorporated into the digital FIR filter to improve the filtering performance. In future, the proposed MAC based digital filter will be useful for implementing parallel FIR filters for wireless communication standards and for signal and image processing applications.

## REFERENCES

1. Dempster, Andrew G. and Malcolm D. Macleod, 1995. Use of Minimum-Adder Multiplier Blocks in FIR Digital Filters, IEEE Transactions on Circuits and Systems, 42(9).
2. Dash, Anindita, Swetapadma Dash and S.K. Mandal, 2014. Design of Optimized Wallace Tree Multiplier in Cadence, International Journal of Computer Applications (IJCA).
3. Mohanty, Basant Kumar and Sujit Kumar Patel, 2014. Area–Delay–Power Efficient Carry-Select Adder, IEEE Transactions on Circuits and Systems, 61(6).
4. Chepuri, Satish, Panem Charan Arur, G. Kishore Kumar and G. Mamatha, 2014. An Efficient High Speed Wallace Tree Multiplier, International Journal of Emerging Trends in Electrical and Electronics (IJETEE), 10(4).
5. Bharti, Deepshikha and K. Anusudha, 2013. High Speed FIR Filter Based on Truncated Multiplier and Parallel Adder, International Journal of Engineering Trends and Technology (IJETT), 5(5).
6. Gowrishankar, V., D. Manoranjitham and P. Jagadeesh, 2013. Efficient FIR Filter Design Using Modified Carry Select Adder and Wallace Tree Multiplier, International Journal of Science, Engineering and Technology Research (IJSETR), 2(3).
7. Malini, Hema and C. Srimathi, 2014. Low Complexity Digit Serial FIR Filter By Multiple Constant Multiplication Algorithms, 3(4).
8. Rao, Jagadeshwar M. and Sanjay Dubey, 2012. A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast Arithmetic Circuits, IOSR Journal of Electronics and Communication Engineering (IOSRJECE), 3(1): 07-11.
9. Gahlan, Naveen Kr., Prabhat Shukla and Jasbir Kaur, 2012. Implementation of Wallace Tree Multiplier Using Compressor, International Journal of Technology and Application, 3(3): 1194-1199.
10. Senthilkumar, M. and S. Ramani, 2013. FPGA Implementation of Digital FIR Filter Based on Truncated Multiplier, IOSR Journal of Electronics and Communication Engineering (IOSR-JECE), pp: 44-49.
11. Priyatharshne, T.N., L. Raja and A. Vinodhini, 2014. An Optimized Wallace Tree Multiplier Using Parallel Prefix Han-Carlson Adder for DSP Processors, International Journal of Advanced Research in Electronics and Communication Engineering (IJARECE), 3(11).
12. Yu, Pan and Pramod Kumar Meher, 2011. Bit-Level Optimization of Adder-Trees for Multiple Constant Multiplication FIR Filter Implementation, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, 61(2).