Search results
Mehdi Darbandi, Amir Reza Ramtin and Omid Khold Sharafi
Abstract
Purpose
A set of routers connected over communication channels can form a network-on-chip (NoC). High performance, scalability, modularity and the ability to parallelize the communication structure are some of its advantages. As the number of cores in NoCs grows, their arrangement has become more important. Mapping assigns different functional units to different nodes of the NoC, and the way it is done has a significant effect on implementation and network power utilization. The NoC mapping problem is NP-hard. Therefore, meta-heuristic algorithms are a suitable choice for reaching optimal or near-optimal solutions. The purpose of this paper is to design a novel procedure for mapping processing cores that reduces communication delay and cost. A multi-objective particle swarm optimization algorithm based on crowding distance (MOPSO-CD) has been used for this purpose.
Design/methodology/approach
In the proposed approach, which uses a two-dimensional mesh topology as the base structure, the mapping operation is divided into two stages: allocating the tasks to suitable intellectual property (IP) cores; and mapping each of these cores to a specific tile on the NoC platform.
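The second stage, placing cores on mesh tiles, is usually scored with a communication cost. The abstract does not give the paper's cost function, so the following is only a minimal sketch under a common assumption: cost is traffic volume multiplied by the Manhattan hop count between tiles. The core names, traffic volumes and placements are hypothetical.

```python
from typing import Dict, Tuple

Tile = Tuple[int, int]  # (row, column) on the 2D mesh


def hop_count(a: Tile, b: Tile) -> int:
    """Manhattan distance between two tiles of a 2D mesh."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def communication_cost(traffic: Dict[Tuple[str, str], float],
                       placement: Dict[str, Tile]) -> float:
    """Sum of (traffic volume x hop count) over all communicating core pairs."""
    return sum(volume * hop_count(placement[src], placement[dst])
               for (src, dst), volume in traffic.items())


# Hypothetical example: three cores and their pairwise traffic volumes.
traffic = {("cpu", "mem"): 100.0, ("cpu", "dsp"): 20.0, ("dsp", "mem"): 50.0}

# Two candidate placements on a 2 x 2 mesh.
placement_a = {"cpu": (0, 0), "mem": (0, 1), "dsp": (1, 1)}
placement_b = {"cpu": (0, 0), "mem": (1, 1), "dsp": (0, 1)}

cost_a = communication_cost(traffic, placement_a)
cost_b = communication_cost(traffic, placement_b)
print(cost_a, cost_b)
```

Under this assumed cost, placement_a is cheaper because the heaviest flow (cpu to mem) crosses a single link. A search procedure such as a swarm or genetic algorithm would explore many such placements.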
Findings
The proposed method has markedly reduced the related problems and limitations of meta-heuristic algorithms. This algorithm performs better than particle swarm optimization (PSO) and the genetic algorithm in convergence to the Pareto front, in producing a well-distributed set of solutions and in computational time. The simulation results also show that the delay of the proposed method is 1.1 per cent better than that of the genetic algorithm and 0.5 per cent better than that of the PSO algorithm. In communication cost, the proposed method is 2.7 per cent better than the genetic algorithm and 0.16 per cent better than the PSO algorithm.
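The abstract names crowding distance but does not define it. The sketch below assumes the standard definition popularized by NSGA-II, which favors solutions in sparsely populated regions of the front. The function name and the sample front are hypothetical, and this is not claimed to be the authors' implementation.

```python
from typing import List, Sequence


def crowding_distance(points: Sequence[Sequence[float]]) -> List[float]:
    """Crowding distance of each point in one non-dominated front.

    Boundary points of every objective get infinite distance; interior
    points accumulate the normalized gap between their two neighbours.
    """
    n = len(points)
    if n == 0:
        return []
    distance = [0.0] * n
    for m in range(len(points[0])):
        order = sorted(range(n), key=lambda i: points[i][m])
        lo, hi = points[order[0]][m], points[order[-1]][m]
        distance[order[0]] = distance[order[-1]] = float("inf")
        if hi == lo:
            continue  # all points share this objective value
        for k in range(1, n - 1):
            gap = points[order[k + 1]][m] - points[order[k - 1]][m]
            distance[order[k]] += gap / (hi - lo)
    return distance


# Hypothetical front with two objectives, e.g. (delay, communication cost).
front = [(1.0, 5.0), (2.0, 3.0), (4.0, 2.0), (5.0, 1.0)]
distances = crowding_distance(front)
print(distances)
```

A selection step that prefers larger distances keeps the retained solutions spread along the front.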
Originality/value
As yet, the MOPSO-CD algorithm has not been used for solving the task mapping issue in the NoC.
Shu‐yan Jiang, Gang Luo, Su Chen, Wen‐han Zhao and Qi‐zhong Zhou
Abstract
Purpose
The purpose of this paper is to introduce several synchronization test methods of Network‐on‐Chip (NoC) at multi‐clock domains by digital logic circuits.
Design/methodology/approach
First, the authors presented the structure of the NoC and the test methods for NoC in multi‐clock domains, including the Built‐in Self Test (BIST) structure and the architecture of the embedded core test. Then the authors examined four different synchronization structures: two‐level trigger, two kinds of lock methods, toggle and pulse synchronization methods. Based on the NoC working conditions, the authors built experimental structures for the different methods and obtained experimental results at high frequencies.
Findings
The experiments at high frequency show that the toggle and pulse methods are prone to synchronization failure. Therefore, the lock method is more appropriate for NoC under multiple clock domains.
Originality/value
In this paper, several synchronization test methods of NoC at multi‐clock domains are discussed and compared, and the best one determined.
Seyyed Javad Seyyed Mahdavi Chabok and Seyed Amin Alavi
Abstract
Purpose
The routing algorithm is one of the most important components in designing a network-on-chip (NoC). An effective routing algorithm can yield better performance and throughput, and thus lower latency, lower power consumption and higher reliability. Given the high scalability of networks and the occurrence of faults on links, the fewer links a packet crosses to reach its destination, the lower the loss of packets and information. Accordingly, the proposed algorithm is based on reducing the number of links traversed to reach the destination.
Design/methodology/approach
This paper presents a high-performance NoC that increases telecommunication network reliability by traversing fewer links to the destination. A large NoC is divided into small districts with central routers. In such a system, routing over long routes is performed through these central routers, district by district.
Findings
By reducing the number of links, the number of routers also decreases. As a result, the power consumption is reduced, the performance of the NoC is improved, and both the probability of encountering a faulty link and the network latency are decreased.
Originality/value
The simulation is performed using the Noxim simulator because of its ability to manage and inject faults. The proposed algorithm and XY routing, a conventional algorithm for the NoC, were simulated in a 14 × 14 network, a typical network size in recent works.
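For reference, the conventional XY baseline mentioned here can be sketched as follows. This illustrates dimension-order routing on a 2D mesh only, not the proposed district-based algorithm or the Noxim API. The function name and the coordinates are hypothetical.

```python
from typing import List, Tuple

Node = Tuple[int, int]  # (x, y) coordinates of a router on the mesh


def xy_route(src: Node, dst: Node) -> List[Node]:
    """Dimension-order routing: travel along X first, then along Y."""
    x, y = src
    path = [src]
    while x != dst[0]:
        x += 1 if dst[0] > x else -1
        path.append((x, y))
    while y != dst[1]:
        y += 1 if dst[1] > y else -1
        path.append((x, y))
    return path


route = xy_route((0, 0), (3, 2))
links_crossed = len(route) - 1
print(route, links_crossed)
```

The number of links crossed equals the Manhattan distance between source and destination, which is the quantity a link-reducing scheme would try to lower for long routes.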
Sujata S.B. and Anuradha M. Sandi
Abstract
Purpose
The small-area network for data communication within routers suffers from problems with packet storage, throughput, latency and power consumption. There are many solutions for increasing communication speed and optimizing power consumption; one of them is the network-on-chip (NoC). The literature describes several NoCs that can be reconfigured dynamically and whose results can easily be tested and validated on FPGA. However, NoCs still have limitations regarding chip area, reconfiguration time and throughput.
Design/methodology/approach
To address these limitations, this research proposes the dynamically buffered and bufferless reconfigurable NoC (DB2R NoC) using the X-Y algorithm for routing, Torus for switching and Flexible Direction Order (FDOR) for direction finding between source and destination nodes. Thus, the 3 × 3 and 4 × 4 DB2R NoCs are made deadlock-free, with low power, low latency and high throughput. To demonstrate the applicability and analyze the performance of DB2R NoC for 3 × 3 and 4 × 4 routers on FPGA, the 22-bit buffered and 19-bit bufferless designs have been synthesized using Verilog HDL and implemented on an Artix-7 FPGA development board. The virtual input/output ChipScope Pro tool has been incorporated in the design to verify and debug the complete design on the Artix-7 FPGA.
Findings
The results show a 35% improvement in throughput, a 23% improvement in latency and a 47% optimization in area. The complete design has been tested with 28 packets at an injection rate of 0.01; the packets were generated using an NLFSR.
Originality/value
The results show a 35% improvement in throughput, a 23% improvement in latency and a 47% optimization in area. The complete design has been tested with 28 packets at an injection rate of 0.01; the packets were generated using an NLFSR.
Yaseer Arafat Durrani, Teresa Riesgo, Muhammad Imran Khan and Tariq Mahmood
Abstract
Purpose
Low-power consumption has become an important issue that cannot be ignored in System-on-Chip (SoC) design. The key challenge encountered by system design is how to maintain balance between the estimation accuracy and speed. This paper aims at demonstrating an accurate and fast power estimation technique.
Design/methodology/approach
The methodology adopted in the paper is to use input patterns with predefined statistical characteristics, which help to analyze the average power consumption of the different intellectual-property (IP) cores and the interconnects/buses in SoC design. The paper also implements a genetic algorithm (GA) to generate sequences of input signals during the power estimation procedure.
Findings
The GA concurrently optimizes the input signal characteristics that influence the final solution of the pattern. In addition, a Monte Carlo zero-delay simulation is performed at a high level for each individual IP core and bus. Power is predicted by a novel macro-model function through simple addition over these cores and buses. In experiments, the average error is estimated at 13.84%.
Research limitations/implications
To present the research findings with clarity and to avoid complexities, the paper does not consider delay factors like glitches, jitter etc. in the power model.
Practical implications
The proposed methodology allows accurate power/energy analysis of practical applications mapped onto a Network-on-Chip (NoC)-based multiprocessor SoC platform. It enables the performance analysis of different design alternatives under the load imposed by complex applications.
Originality/value
This paper is an original contribution and the results demonstrate that our novel technique could be implemented to achieve fast and accurate power estimation in the early stage of any SoC design.
Anurag Shrivastava and Sudhir Kumar Sharma
Abstract
Purpose
The increase in processor speed has given communication a crucial role in system performance. As a result, routing is considered one of the most important subjects in network-on-chip (NoC) architecture. Deadlock-avoidance routing algorithms restrict the routes packets may take, which prevents packets from being routed entirely according to network traffic conditions. This restriction reduces performance, especially under non-uniform traffic patterns. On the other hand, a true fully adaptive routing algorithm routes packets entirely according to traffic conditions. However, deadlock detection and recovery mechanisms are then needed to handle deadlocks. Using a global bus alongside the NoC as a parallel supporting environment provides a platform that offers the advantages of both the bus and the NoC.
Design/methodology/approach
In this research, the authors use this bus as an escaping path for deadlock recovery technique.
Findings
According to simulation results, this bus is a suitable platform for a deadlock recovery technique.
Originality/value
This bus is useful for broadcast and multicast operations, sending delay sensitive signals, system management and other services.
Afshan Amin Khan, Roohie Naaz Mir and Najeeb-Ud Din
Abstract
Purpose
This work focused on a basic building block of an allocation unit that carries out the critical job of deciding between the conflicting requests, i.e. an arbiter unit. The purpose of this work is to implement an improved hybrid arbiter while harnessing the basic advantages of a matrix arbiter.
Design/methodology/approach
The basic approach of the design methodology involves the extraction of traffic information from the buffer signals of each port. As traffic arrives in the buffers of the respective ports, information from these buffers differentiates between ports receiving low traffic rates and ports receiving high traffic rates. A logic circuit is devised that enables an arbiter to dynamically assign priorities to different ports based on the information from the buffers. For implementation and verification of the proposed design, a two-stage approach was used. Stage I compares the proposed arbiter with other arbiters in the literature using the Vivado integrated design environment. Stage II demonstrates the implementation of the proposed design in the Cadence design environment for application-specific integrated circuit (ASIC)-level implementation. By using such a strategy, this study aims to place a special focus on the feasibility of the design for very large-scale integration implementation.
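As background for the comparison baseline, a conventional matrix arbiter can be modelled behaviourally as below. This is a minimal sketch of the textbook least-recently-served scheme, not the proposed hybrid arbiter, and it omits the buffer-driven priority logic described above. The class and attribute names are hypothetical.

```python
from typing import List, Optional


class MatrixArbiter:
    """Behavioural model of a conventional matrix arbiter."""

    def __init__(self, n: int) -> None:
        self.n = n
        # beats[i][j] is True when port i currently has priority over port j.
        # Initially, lower-numbered ports have priority over higher ones.
        self.beats = [[i < j for j in range(n)] for i in range(n)]

    def grant(self, requests: List[bool]) -> Optional[int]:
        """Return the granted port index, or None if nothing is requested."""
        for i in range(self.n):
            if not requests[i]:
                continue
            # Port i loses if any other requesting port has priority over it.
            if any(requests[j] and self.beats[j][i]
                   for j in range(self.n) if j != i):
                continue
            # The winner drops to the lowest priority for future rounds.
            for j in range(self.n):
                if j != i:
                    self.beats[i][j] = False
                    self.beats[j][i] = True
            return i
        return None


arbiter = MatrixArbiter(3)
grants = [arbiter.grant([True, True, True]) for _ in range(4)]
print(grants)
```

With all ports requesting continuously, grants rotate among the ports, which is the fairness property a hybrid design would aim to retain while adding traffic awareness.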
Findings
According to the simulation results, the proposed hybrid arbiter maintains the advantage of a basic matrix arbiter and also possesses the additional feature of fault-tolerant traffic awareness. These features for a hybrid arbiter are achieved with a 19% increase in throughput, a 1.5% decrease in delay and a 19% area increase in comparison to a conventional matrix arbiter.
Originality/value
This paper proposes a traffic-aware mechanism that increases the throughput of an arbiter unit with some area trade-off. The key feature of this hybrid arbiter is that it can assign priorities to the requesting ports based on the real-time traffic requirements of each port. As a result, the arbiter can make arbitration decisions dynamically. Because buffer information is valuable in winning priority, the fault-tolerant policy ensures that no priority is falsely assigned to a requesting port. This avoids wasted arbitration cycles and also increases throughput.
Abdulla Alateeq, Wael Elmedany, Nedal Ababneh and Kevin Curran
Abstract
Purpose
The purpose of this paper is to investigate the latest research related to secure routing protocols in Wireless Sensor Networks (WSNs) and propose a new approach that can achieve a higher security level than the existing one. One of the main security issues in WSNs is the security of routing protocols. A typical WSN consists of a large number of small, low-power, low-cost sensor devices. These devices are very resource-constrained and usually use cheap short-range radios to communicate with each other in an ad hoc fashion; thus, achieving security in these networks is a big challenge, which is open for research.
Design/methodology/approach
The route updates and data messages of the protocol are authenticated using the Edwards-curve Digital Signature Algorithm (EdDSA). Routing protocols play an essential role in WSNs: they ensure the delivery of the sensed data from the remote sensor nodes to back-end systems via a data sink. Routing protocols depend on route updates received from neighboring nodes to determine the best path to the sink. Manipulating these updates by inserting rogue nodes that advertise false updates into the network can have a catastrophic impact on the performance of the compromised WSN.
Findings
As a result, a new secure energy-aware routing protocol (SEARP) is proposed, which uses security enhanced clustering algorithm and EdDSA to authenticate route advertisements and messages. A secure clustering algorithm is also used as part of the proposed protocol to conserve energy, prolong network lifetime and counteract wormhole attacks.
Originality/value
In this paper, a SEARP is proposed to address network layer security attacks in WSNs. A secure clustering algorithm is also used as part of the proposed protocol to conserve energy, prolong network lifetime and counteract wormhole attacks. A simulation has been carried out using Sensoria Simulator and the performance evaluation has been discussed.
Vipin Sharma, Abdul Q. Ansari and Rajesh Mishra
Abstract
Purpose
The purpose of this paper is to design an efficient layout of a multistage interconnection network that offers a cost-effective solution with high reliability and fault-tolerance capability. For parallel computation, various multistage interconnection networks (MINs) have been discussed in the literature; however, these networks have always required further improvement in reliability and fault-tolerance capability. The fault-tolerance capability of the network can be achieved by increasing the number of disjoint paths; as a result, the reliability of the interconnection network is also improved.
Design/methodology/approach
The proposed design is a modification of the gamma interconnection network (GIN) and the three disjoint path gamma interconnection network (3-DGIN). It has a total of seven paths for all tag values, so the number of paths is uniform. Of these seven paths, three are disjoint, which increases the fault-tolerance capability by two faults. Because it has more paths than the GIN and 3-DGIN, the proposed design is more reliable.
Findings
In this study, a new design layout of a MIN has been proposed which provides three disjoint paths and uniformity in terms of an equal number of paths for all source-destination (S-D) pairs. The new layout contains fewer nodes as compared to GIN and 3-DGIN. This design provides a symmetrical structure, low cost, better terminal reliability and provides an equal number of paths for all tag values (|S-D|) when compared with existing MINs of this class.
Originality/value
A new design layout of MINs has been proposed and its two-terminal reliability is calculated with the help of the reliability block diagram technique.
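The reliability block diagram technique reduces a network to series and parallel blocks. The sketch below applies it to three disjoint paths only, which gives a lower bound on two-terminal reliability because the remaining non-disjoint paths are ignored. The component reliability of 0.9 and the three-component path length are hypothetical values, not figures from the paper.

```python
from math import prod
from typing import Iterable


def series(reliabilities: Iterable[float]) -> float:
    """A series block works only if every component works."""
    return prod(reliabilities)


def parallel(reliabilities: Iterable[float]) -> float:
    """A parallel block fails only if every branch fails."""
    return 1.0 - prod(1.0 - r for r in reliabilities)


# Hypothetical: three disjoint paths, each crossing three components
# that work independently with probability 0.9.
paths = [[0.9, 0.9, 0.9] for _ in range(3)]
path_reliability = [series(p) for p in paths]
terminal_reliability = parallel(path_reliability)
print(path_reliability, terminal_reliability)
```

Each added disjoint path multiplies the failure probability by that path's own failure probability, which is why more disjoint paths improve reliability.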
Hongbin Liu, Hu Ren, Hanfeng Gu, Fei Gao and Guangwen Yang
Abstract
Purpose
The purpose of this paper is to provide an automatic parallelization toolkit for unstructured mesh-based computation. Among all mesh types, unstructured meshes are dominant in engineering simulation scenarios and play an essential role in scientific computations because of their geometrical flexibility. However, high-fidelity applications based on unstructured grids are still time-consuming, both to program and to run.
Design/methodology/approach
This study develops an efficient UNstructured Acceleration Toolkit (UNAT), which provides friendly high-level programming interfaces and elaborates lower level implementation on the target hardware to get nearly hand-optimized performance. At the present state, two efficient strategies, a multi-level blocks method and a row-subsections method, are designed and implemented on Sunway architecture. Random memory access and write–write conflict issues of unstructured meshes have been handled by partitioning, coloring and other hardware-specific techniques. Moreover, a data-reuse mechanism is developed to increase the computational intensity and alleviate the memory bandwidth bottleneck.
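To illustrate why coloring removes write-write conflicts, the sketch below greedily colors mesh faces so that no two faces of the same color update the same cell. Faces of one color can then be processed in parallel. This is a generic greedy coloring, not UNAT's implementation, and the function name and face list are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def color_faces(faces: List[Tuple[int, int]]) -> List[int]:
    """Assign each face (owner cell, neighbour cell) the smallest color
    not yet used by any face touching either of its two cells."""
    used_by_cell: Dict[int, Set[int]] = {}
    colors: List[int] = []
    for a, b in faces:
        taken = used_by_cell.setdefault(a, set()) | used_by_cell.setdefault(b, set())
        color = 0
        while color in taken:
            color += 1
        colors.append(color)
        used_by_cell[a].add(color)
        used_by_cell[b].add(color)
    return colors


# Hypothetical mesh: four cells connected in a ring by four faces.
faces = [(0, 1), (1, 2), (2, 3), (0, 3)]
colors = color_faces(faces)
print(colors)
```

In this example two colors suffice, so the four face updates can run as two conflict-free batches.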
Findings
The authors select sparse matrix-vector multiplication as a performance benchmark of UNAT across different data layouts and different matrix formats. Experimental results show that the speed-ups reach up to 26× compared to a single management processing element, and the utilization ratio tests indicate the capability of achieving nearly hand-optimized performance. Finally, the authors adopt UNAT to accelerate a well-tuned unstructured solver and obtain average speed-ups of 19× for the main kernels and 10× for the overall solver.
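The benchmark kernel can be sketched in a few lines. The abstract does not say which matrix formats were tested, so the compressed sparse row (CSR) layout used here is an assumption, and this serial reference version does not use the UNAT interfaces. The function name and the sample matrix are hypothetical.

```python
from typing import List


def spmv_csr(values: List[float], col_idx: List[int],
             row_ptr: List[int], x: List[float]) -> List[float]:
    """Compute y = A @ x for a matrix A stored in CSR format."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        # Non-zeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]].
        for k in range(row_ptr[row], row_ptr[row + 1]):
            y[row] += values[k] * x[col_idx[k]]
    return y


# Hypothetical 3 x 3 matrix:
# [[4, 0, 1],
#  [0, 3, 0],
#  [2, 0, 5]]
values = [4.0, 1.0, 3.0, 2.0, 5.0]
col_idx = [0, 2, 1, 0, 2]
row_ptr = [0, 2, 3, 5]
x = [1.0, 2.0, 3.0]

y = spmv_csr(values, col_idx, row_ptr, x)
print(y)
```

The indirect read x[col_idx[k]] is the random memory access pattern that makes this kernel bandwidth-bound on unstructured data.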
Originality/value
The authors design an unstructured mesh toolkit, UNAT, to link the hardware and numerical algorithm, and then, engineers can focus on the algorithms and solvers rather than the parallel implementation. For the many-core processor SW26010 of the fastest supercomputer in China, UNAT yields up to 26× speed-ups and achieves nearly hand-optimized performance.