AMD Alveo-Based HPQEA Scales Quantum Simulation to 30 Qubits

High-Performance Quantum Emulation Accelerator (HPQEA)

Quantum Computing Breakthrough: HPQEA Quantum Emulator Achieves Unprecedented 30-Qubit Simulation Scale with GPU-Beating Speed

The exponentially growing complexity of quantum algorithms creates an urgent need for powerful, flexible emulation tools. Existing simulation systems, however, struggle to deliver high performance and efficient resource utilization at the same time. A research team has introduced the High-Performance Quantum Emulation Accelerator (HPQEA), a design intended to overcome these constraints in performance, scalability, applicability, and resource efficiency, and a notable step forward for classical simulation of quantum circuits.

The development team, which included Hoai Luan Pham, Yasuhiko Nakashima, Tran Van Duy, Tuan Hai Vu, and Vu Trung Duong Le from the Nara Institute of Science and Technology, built HPQEA as a quantum emulator based on the state-vector emulation methodology. The system has shown strong performance and scalability in testing.

Setting a New Benchmark: 30-Qubit Scalability

HPQEA is implemented as an FPGA accelerator for scalable quantum simulation, with the AMD Alveo U280 FPGA board at its core. Evaluation and verification on this board show that HPQEA can accurately simulate quantum circuits of up to 30 qubits, well beyond what many current FPGA-based emulators support. Throughout testing, the system maintained low mean square error and high fidelity, two essential indicators of accurate simulation.
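To make the two accuracy indicators concrete, here is a minimal Python sketch of how fidelity and mean square error are commonly computed between a reference state vector and an emulated one. The article does not give the exact formulas the team used, so the definitions below (squared overlap for fidelity, mean squared amplitude difference for MSE), the function names, and the sample values are assumptions for illustration.

```python
import math

def fidelity(reference, emulated):
    """Return |<reference|emulated>|^2 for two normalized state vectors."""
    overlap = sum(r.conjugate() * e for r, e in zip(reference, emulated))
    return abs(overlap) ** 2

def mean_square_error(reference, emulated):
    """Return the mean of |r - e|^2 over all amplitudes."""
    total = sum(abs(r - e) ** 2 for r, e in zip(reference, emulated))
    return total / len(reference)

# Reference: the Bell state (|00> + |11>) / sqrt(2).
S = 1 / math.sqrt(2)
reference = [complex(S), 0j, 0j, complex(S)]

# Hypothetical emulator output with a small rounding error (made-up values).
emulated = [complex(S + 1e-7), 0j, 0j, complex(S - 1e-7)]

print("fidelity:", fidelity(reference, emulated))
print("mse:", mean_square_error(reference, emulated))
```

A fidelity close to 1 and an MSE close to 0 indicate that the emulated state closely matches the reference.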

The HPQEA architecture combines a high-performance processing core, a purpose-built method for computing controlled-NOT (CX) gates, and efficient use of High-Bandwidth Memory (HBM).

Compared with similar simulation platforms, HPQEA shows the following advantages:

Speed and Efficiency: HPQEA executes faster than comparable FPGA-based systems.
Algorithm Support: It supports a wider variety of quantum algorithms.
Resource Footprint: It uses fewer hardware resources than rival FPGA systems.

Notably, in a direct comparison of computational efficiency, HPQEA outperformed an Nvidia A100 GPU in normalized gate speed when simulating circuits of up to 20 qubits. These measurements point to a significant gain in processing efficiency for quantum emulation.

An Architecture Optimized for Quantum Computation

HPQEA’s results stem from a hardware architecture optimized for both computation and storage, with broad algorithm compatibility as a design goal. The system reproduces quantum computations on classical hardware by decomposing quantum circuits into low-level gates and applying them one after another to the initial state, producing a final state identical to that of the theoretical quantum computation.
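The gate-by-gate approach described above can be sketched in a few lines of plain Python. This is a textbook state-vector simulation, not HPQEA's hardware implementation; the function names, the bit-ordering convention (qubit 0 is the least significant bit of the index), and the sample circuit are assumptions made for illustration.

```python
import math

S = 1 / math.sqrt(2)
H = [[S, S], [S, -S]]   # Hadamard gate
X = [[0, 1], [1, 0]]    # Pauli-X (NOT) gate

def apply_single_qubit_gate(state, gate, target):
    """Apply a 2x2 gate to qubit `target` and return the new state vector."""
    new_state = list(state)
    stride = 1 << target
    for i in range(len(state)):
        if (i & stride) == 0:        # i has the target bit clear
            j = i | stride           # j is the same index with the target bit set
            a0, a1 = state[i], state[j]
            new_state[i] = gate[0][0] * a0 + gate[0][1] * a1
            new_state[j] = gate[1][0] * a0 + gate[1][1] * a1
    return new_state

def run_circuit(num_qubits, gates):
    """Start from |0...0> and apply each (gate, target) pair in order."""
    state = [0j] * (1 << num_qubits)
    state[0] = 1 + 0j
    for gate, target in gates:
        state = apply_single_qubit_gate(state, gate, target)
    return state

# Hypothetical two-qubit circuit: H on qubit 0, then X on qubit 1.
final = run_circuit(2, [(H, 0), (X, 1)])
print(final)
```

Each gate touches every amplitude once, so the cost per gate grows as 2^n with the qubit count n, which is why dedicated hardware and memory bandwidth matter at 30 qubits.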

Key elements designed to improve memory management and performance include:

Dual Processing Element Arrays (PEAs): HPQEA uses two PEAs, designed to speed up computation and minimize memory usage, so that quantum gate operations can be carried out in parallel.
Specialized Processing Elements: Each Processing Element (PE) in HPQEA is highly customized. It consists of a Special Unit (SU) and an Arithmetic Logic Unit (ALU), which together carry out the complex computations required for quantum gate operations. This careful hardware optimization, including the specialized ALUs and a dedicated control structure inside each PE, is what enables the high performance.
Optimized CX Handling: Two-qubit controlled-NOT (CX) gates often incur significant latency when executed. To counter this, the architecture includes a dedicated Optimized CX Swapper, which reduces the latency of these operations by improving data transfer scheduling.
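One way to see why a CX gate is mostly a data-movement problem: in a state-vector simulation it performs no arithmetic at all and only swaps pairs of amplitudes. The Python sketch below shows that textbook view. It does not describe how the Optimized CX Swapper schedules transfers, which the article does not detail; the function name and bit-ordering convention (qubit 0 is the least significant index bit) are assumptions.

```python
import math

def apply_cx(state, control, target):
    """Apply CX in place by swapping amplitude pairs.

    For every basis index whose control bit is 1, the amplitude with the
    target bit clear is exchanged with the one with the target bit set.
    """
    c_mask = 1 << control
    t_mask = 1 << target
    for i in range(len(state)):
        if (i & c_mask) and not (i & t_mask):
            j = i | t_mask
            state[i], state[j] = state[j], state[i]
    return state

# Qubit 0 in superposition, qubit 1 in |0>: amplitudes at indices 0 and 1.
S = 1 / math.sqrt(2)
state = [complex(S), complex(S), 0j, 0j]

# CX with control 0 and target 1 yields the Bell state (|00> + |11>) / sqrt(2).
bell = apply_cx(state, control=0, target=1)
print(bell)
```

Because the swapped amplitudes can sit far apart in memory when the target qubit is a high-order bit, the order in which pairs are fetched and written back has a direct effect on latency.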

Leveraging High-Bandwidth Memory (HBM) for Scale

Quantum simulation places heavy demands on memory capacity. HPQEA uses High-Bandwidth Memory (HBM) to handle the massive data storage and access that the large data structures of quantum simulation require, and it employs a bulk data transfer approach to keep data management and access efficient.
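A quick calculation shows how fast the memory demand grows: a state vector for n qubits holds 2^n complex amplitudes. The Python sketch below assumes 8 bytes per amplitude (two 32-bit floats); the article does not state which number format HPQEA uses, so that default and the function name are assumptions.

```python
def state_vector_bytes(num_qubits, bytes_per_amplitude=8):
    """Return the memory needed to hold a full state vector.

    bytes_per_amplitude=8 assumes two 32-bit floats per complex amplitude;
    this is an illustrative assumption, not a documented HPQEA parameter.
    """
    return (1 << num_qubits) * bytes_per_amplitude

for n in (20, 25, 30):
    gib = state_vector_bytes(n) / 2**30
    print(f"{n} qubits: {gib:.4f} GiB")
```

Under that assumption a 30-qubit state vector occupies 8 GiB, and every additional qubit doubles the requirement.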

The simulation flow starts on the host PC, which is connected to the FPGA board for data transfer and program generation. A Qiskit program written in Python creates the circuit context and the initial quantum state. A C application then processes this context before the data is sent to the HPQEA hardware. To reduce transfer bottlenecks, the processed data is sent using Direct Memory Access (DMA), which bypasses the CPU during the transfer phase.
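As a rough illustration of the hand-off between the Python side and a downstream C application, the sketch below packs a small gate list into a flat byte buffer that a C program could read. The buffer layout, opcode numbers, and function names are entirely hypothetical; the article does not describe HPQEA's actual circuit context format.

```python
import struct

# Hypothetical gate opcodes; not HPQEA's real encoding.
OPCODES = {"h": 1, "x": 2, "cx": 3}

def pack_circuit(num_qubits, gates):
    """Pack (name, qubit_a, qubit_b) tuples into a little-endian buffer.

    Layout (assumed): uint32 qubit count, uint32 gate count,
    then three bytes per gate: opcode, qubit_a, qubit_b.
    """
    header = struct.pack("<II", num_qubits, len(gates))
    body = b"".join(
        struct.pack("<BBB", OPCODES[name], a, b) for name, a, b in gates
    )
    return header + body

def unpack_circuit(buffer):
    """Inverse of pack_circuit, mirroring what a C reader would do."""
    num_qubits, count = struct.unpack_from("<II", buffer, 0)
    names = {code: name for name, code in OPCODES.items()}
    gates = []
    for k in range(count):
        code, a, b = struct.unpack_from("<BBB", buffer, 8 + 3 * k)
        gates.append((names[code], a, b))
    return num_qubits, gates

# Bell-state circuit: H on qubit 0, then CX with control 0 and target 1.
bell = [("h", 0, 0), ("cx", 0, 1)]
buffer = pack_circuit(2, bell)
print(len(buffer), "bytes:", buffer.hex())
```

A contiguous, fixed-layout buffer like this is the kind of data that suits a bulk DMA transfer, since it can be moved in one operation without per-element handling.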

Paving the Way for Accessible Quantum Research

The design and demonstration of HPQEA highlight the efficiency and scalability needed in a reliable platform for developing and thoroughly testing complex quantum algorithms. This work could make quantum computing research and development more accessible.

Although HPQEA shows notable improvements over current CPU-, GPU-, and FPGA-based methods, the researchers acknowledge several limitations. In particular, they note that execution times were affected by the current efficiency of high-bandwidth memory usage and by constraints on processing resources.

The group plans to focus its future research on these limitations by further optimizing memory usage and developing more effective resource allocation techniques. The work matters because it could extend the capabilities of quantum emulation and its applicability to more complex quantum algorithms.
