Search results
1 – 10 of 312B.N. Mohan Kumar and H.G. Rangaraju
Digital signal processing (DSP) applications such as finite impulse response (FIR) filter, infinite impulse response and wavelet transformation functions are mainly constructed…
Abstract
Purpose
Digital signal processing (DSP) applications such as finite impulse response (FIR) filter, infinite impulse response and wavelet transformation functions are mainly constructed using multipliers and adders. The performance of any digital applications is dependent on larger size multipliers, area and power dissipation. To optimize power and area, an efficient zero product and feeder register-based multiplier (ZP and FRBM) is proposed. Another challenging task in multipliers is summation of partial products (PP), results in more delay. To address this issue, the modified parallel prefix adder (PPA) is incorporated in multiplier design. In this work, different methods are studied and analyzed for designing FIR filter, optimized with respect to area, power dissipation, speed, throughput, latency and hardware utilization.
Design/methodology/approach
The distributed arithmetic (DA)-based reconfigurable FIR design is found to be suitable filter for software-defined radio (SDR) applications. The performance of adder and multipliers in DA-FIR filter restricts the area and power dissipation due to their complexity in terms of generation of sum and carry bits. The hardware implementation time of an adder can be reduced by using PPA which is based on Ling equation. The MDA-RFIR filter is designed for higher filter length (N), i.e. N = 64 with 64 taps and this design is developed using Verilog hardware description language (HDL) and implemented on field-programmable gate array. The design is validated for SDR channel equalizer; both RFIR and SDR are integrated as single system and implemented on Artix-7 development board of part name XC7A100tCSG324.
Findings
The MDA-RFIR for N = 64 is optimized about 33% in terms of area-delay, power-speed product and energy efficiency. The theoretical and practical comparisons have been done, and the practically obtained results are compared with existing DA-RFIR designs in terms of throughput, latency, area-delay, power-speed product and energy efficiency are better about 3.5 times, 31, 45 and 29%, respectively.
Originality/value
The MDA-RFIR for N = 64 is optimized about 33% in terms of area-delay, power-speed product and energy efficiency.
Details
Keywords
D.J. Evans and S. Ghanemi
In Part I of this article, surveys of the architectural concepts involved in designing special‐purpose VLSI computing structures are given. This leads to a discussion of systolic…
Abstract
In Part I of this article, surveys of the architectural concepts involved in designing special‐purpose VLSI computing structures are given. This leads to a discussion of systolic array, wavefront array, WARP and CHiP architectures and their applications. The INMOS transputer chip design and the parallel language OCCAM are introduced. The authors believe that together they form a modular hardware/software component of the type which is essential in the construction of highly parallel computer systems. (Part II of this article will consider the soft simulation of systolic algorithms via OCCAM and the systolisaion of the pattern matching problem).
Details
Keywords
Adders play a vital role in almost all digital designs, as all four arithmetic operations can be confined within addition. Hence, area and power optimization of the adders will…
Abstract
Purpose
Adders play a vital role in almost all digital designs, as all four arithmetic operations can be confined within addition. Hence, area and power optimization of the adders will result in overall circuit optimization. Being the fastest adder, the carry select adder (CSLA) gains higher importance among the different adder styles. However, it suffers from the drawback of increased power and area. The implementation of CSLA in digital circuits requires lots of study for optimization. Hence, to overcome this problem, various improvements were made to the CSLA structure to reduce area and, consequently, reduce power. Among these, modified CSLAs show a significant improvement, as they utilize a binary excess-1 code (BEC) to replace the add-one circuit.
Design/methodology/approach
This paper presents further enhancement in the modified CSLA by proposing a decision-based CSLA, which activates BEC on demand. This leads to reduced switching activity. The performance of the proposal is done by analyzing and comparing it with different adders. The comparison is done on the basis of three performance parameters: area, speed and power consumption. This is done by implementing the architecture on Xilinx Virtex5 XC5VLX30 in Verilog environment and is synthesized using Cadence® RTL Compiler® using TSMC 180-nm CMOS cell library.
Findings
Optimization of power, area and increasing the speed of operation are the three main areas of research in very-large-scale integration (VLSI) design for portable devices. As adders are the most fundamental units for any VLSI design, optimization at the adder level has a huge impact on the overall circuit. The modified CSLA has a BEC which continuously switches irrespective of the previous carry bit generated. The unwanted switching results in excess power consumption while also introducing additional delay. Hence, the author has proposed a decider circuit to avoid this excess switching activity. This allows switching of the BEC only when a previous carry is generated. The modified CSLA is based on the ripple carry adder, while the decider-based CSLA utilizes a carry look-ahead adder. This makes a decider-based CSLA faster while utilizing less area and power consumption when compared to the modified CSLA.
Originality/value
The efficiency of the proposed decider-based CSLA has been verified using Cadence RTL Compiler using TSMC 180-nm CMOS cell library and has been found to have 17 per cent power and 11.57 per cent area optimization when compared to the modified CSLA, while maintaining operating frequency.
Details
Keywords
The fast implementation of multivariable discrete linear systems via VLSI array processors is presented. Two direct state‐space realizations, the block controller form and the…
Abstract
The fast implementation of multivariable discrete linear systems via VLSI array processors is presented. Two direct state‐space realizations, the block controller form and the block observer form, are used as the basis for the proposed implementations, which are comprised of similar processing elements in a linear configuration with nearest neighbor links. High concurrency is achieved by exploiting both parallelism and pipelining. The implementations are characterized by modularity, local communication and high throughput rates.
Naresh Kattekola, Amol Jawale, Pallab Kumar Nath and Shubhankar Majumdar
This paper aims to improve the performance of approximate multiplier in terms of peak signal to noise ratio (PSNR) and quality of the image.
Abstract
Purpose
This paper aims to improve the performance of approximate multiplier in terms of peak signal to noise ratio (PSNR) and quality of the image.
Design/methodology/approach
The paper proposes an approximate circuit for 4:2 compressor, which shows a significant amount of improvement in performance metrics than that of the existing designs. This paper also reports a hybrid architecture for the Dadda multiplier, which incorporates proposed 4:2 compressor circuit as a basic building block.
Findings
Hybrid Dadda multiplier architecture is used in a median filter for image de-noising application and achieved 20% more PSNR than that of the best available designs.
Originality/value
The proposed 4:2 compressor improves the error metrics of a Hybrid Dadda multiplier.
Details
Keywords
Jai Gopal Pandey, Sanskriti Gupta and Abhijit Karmakar
The paper aims to develop a systematic approach to design, integrate, and implement a set of crypto cores in a system-on-chip SoC) environment for data security applications. The…
Abstract
Purpose
The paper aims to develop a systematic approach to design, integrate, and implement a set of crypto cores in a system-on-chip SoC) environment for data security applications. The advanced encryption standard (AES) and PRESENT block ciphers are deployed together, leading to a common crypto chip for performing encryption and decryption operations.
Design/methodology/approach
An integrated very large-scale integration (VLSI) architecture and its implementation for the AES and PRESENT ciphers is proposed. As per the choice, the architecture performs encryption or decryption operations for the selected cipher. Experimental results of the field-programmable gate array (FPGA) and application-specific integrated circuit (ASIC) implementations and related design analysis are provided.
Findings
FPGA implementation of the architecture on Xilinx xc5vfx70t-1-ff1136 device consumes 19% slices, whereas the ASIC design is implemented in 180 nm complementary metal-oxide semiconductor ASIC technology that takes 1.0746 mm2 of standard cell area and consumes 14.26 mW of power at 50 MHz clock frequency. A secure audio application using the designed architecture on an open source SoC environment is also provided. A test methodology for validation of the designed chip using an FPGA-based platform and tools is discussed.
Originality/value
The proposed architecture is compared with a set of existing hardware architectures for analyzing various design metrics such as latency, area, maximum operating frequency, power, and throughput.
Details
Keywords
A.K. Oudjida, S. Titri and M. Hamerlain
Matrix product is a compute bound problem that can be efficiently handled by elementary systolic algorithms. From a theoretical point of view, most of the algorithms are very…
Abstract
Matrix product is a compute bound problem that can be efficiently handled by elementary systolic algorithms. From a theoretical point of view, most of the algorithms are very simple and sometimes even trivial. However, the task of designing efficient implementation on a fixed‐connection network, such as on FPGA where resources are very limited, has been more demanding, and sometimes quite tedious. The objective of this paper is twofold: we first describe a full‐systolic algorithm for matrix product that has the merit over its existing counterparts, to require no preloading of input data into elementary processors (EPs) and generates output data only from boundary EPs. The resulting architecture can accept an uninterrupted stream of input data and produces an uninterrupted one with a latency of 2N‐1 for N×N matrix product. This architecture is also scalable and complies with the constraint of problem‐size independence (ψ). Secondly, we present a methodology for generating a family of very compact MP arrays on FPGA based essentially upon manual mapping at CLB level coupled with VHDL structural level.
Details
Keywords
Tintu Mary John and Shanty Chacko
This paper aims to concentrate on an efficient finite impulse response (FIR) filter architecture in combination with the differential evolution ant colony algorithm (DE-ACO). For…
Abstract
Purpose
This paper aims to concentrate on an efficient finite impulse response (FIR) filter architecture in combination with the differential evolution ant colony algorithm (DE-ACO). For the design of FIR filter, the evolutionary algorithm (EA) is found to be very efficient because of its non-conventional, nonlinear, multi-modal and non-differentiable nature. While focusing with frequency domain specifications, most of the EA techniques described with the existing systems diverge from the power related matters.
Design/methodology/approach
The FIR filters are extensively used for many low power, low complexities, less area and high speed digital signal processing applications. In the existing systems, various FIR filters have been proposed to focus on the above criterion.
Findings
In the proposed method, a novel DE-ACO is used to design the FIR filter. It focuses on satisfying the economic power utilization and also the specifications in the frequency domain.
Originality/value
The proposed DE-ACO gives outstanding performance with a strong ability to find optimal solution, and it has got quick convergence speed. The proposed method also uses the Software integrated synthesis environment (ISE) project navigator (p.28xd) for the simulation of FIR filter based on DE-ACO techniques.
Details
Keywords
Multiple-input multiple-output (MIMO) combined with multi-user massive MIMO has been a well-known approach for high spectral efficiency in wideband systems, and it was targeted to…
Abstract
Purpose
Multiple-input multiple-output (MIMO) combined with multi-user massive MIMO has been a well-known approach for high spectral efficiency in wideband systems, and it was targeted to detect the MIMO signals. The increasing data rates with multiple antennas and multiple users that share the communication channel simultaneously lead to higher capacity requirements and increased complexity. Thus, different detection algorithms were developed for the Massive MIMO.
Design/methodology/approach
This paper focuses on the various literature analyzes on various detection algorithms and techniques for MIMO detectors. Here, it reviews several research papers and exhibits the significance of each detection method.
Findings
This paper provides the details of the performance analysis of the MIMO detectors and reveals the best value in the case of each performance measure. Finally, it widens the research issues that can be useful for future researchers to be accomplished in MIMO massive detectors
Originality/value
This paper has presented a detailed review of the detection of massive MIMO on different algorithms and techniques. The survey mainly focuses on different types of channels used in MIMO detections, the number of antennas used in transmitting signals from the source to destination, and vice-versa. The performance measures and the best performance of each of the detectors are described.
Details
Keywords
B.N. Mohan Kumar and H.G. Rangaraju
Finite impulse response (FIR) digital filters are a general element in several digital signal processing (DSP) systems. In VLSI platform, FIR is a developing filter because the…
Abstract
Purpose
Finite impulse response (FIR) digital filters are a general element in several digital signal processing (DSP) systems. In VLSI platform, FIR is a developing filter because the complexity of design grows with the length of the FIR filter and also it has less latency. Generally, the FIR filter is designed dominated by the multiplier and adder. The conventional FIR filters occupy more area because of several numbers of adders and multipliers for filter designs.
Design/methodology/approach
To overcome this issue, the Vedic Multiplier (VM) and Moore-based LoopBack Adder (MLBA) approach-based optimal FIR filter were designed in this research. Normally, the coefficient has been generated manually, which performs the FIR filter operation. So, the coefficient was generated from the MATLAB filter design and analysis tool. All pass coefficient was introduced in this research, which performs the processing element (PE). The VM approach was utilized in the PE to multiply the filter inputs and coefficients. This research employs the Moore-based LBA (MLBA) in the accumulator for the adding output of the PE. An MLBA approach is a significantly reduced area and increases speed by applying a looping transform function. Here, the proposed method is called a VM-MLBA-FIR filter. In this research, the FIR filter was done in Field Programmable Gate Array (FPGA) Xilinx by using Verilog code on various Virtex devices.
Findings
The experiment results showed that VM-MLBA-FIR filter reduced 26.88% of device utilization and 0.32 W of minimum power consumption compared to the existing PSA-FIR filter.
Originality/value
The experiment results showed that VM-MLBA-FIR filter reduced 26.88% of device utilization and 0.32 W of minimum power consumption compared to the existing PSA-FIR filter.
Details