CUNQA Emulates Distributed Quantum Computing on HPC Systems, Opening the Door for Scalable Hybrid Architectures
As the search for more powerful quantum computers intensifies, researchers are looking to scale up by connecting multiple quantum processors, a concept known as Distributed Quantum Computing (DQC). Recognizing that these processors will initially serve as accelerators in existing High-Performance Computing (HPC) environments, a group from the University of Santiago de Compostela and the Galicia Supercomputing Centre (CESGA) created CUNQA.
CUNQA is a new open-source emulator built specifically to test and assess DQC techniques directly on existing HPC systems. It lets researchers investigate programming considerations, architectural difficulties, and performance characteristics before fully realized quantum hardware becomes widely accessible.
Extraordinary Emulation Skills
CUNQA is the first tool that can simulate all three DQC models in an HPC setting. The three models are:
- No-communication (Embarrassingly Parallel): This technique involves no communication at runtime and divides quantum tasks classically across several virtual Quantum Processing Units (vQPUs).
- Classical-communication: This model also distributes quantum tasks classically, but it adds classical links between vQPUs so that an instruction on one QPU can be classically controlled by information (such as a measurement result) received from another QPU during execution. The Iterative Phase Estimation Algorithm (IPEA) fits this model: it uses classical communication to reduce the number of ancilla qubits needed compared with conventional Quantum Phase Estimation (QPE).
- Quantum-communication: This model keeps the classical link and also connects QPUs through a quantum channel. Because the distribution is now genuinely quantum, quantum-communication protocols such as teledata and telegate must be implemented.
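To make the teledata protocol concrete, here is a minimal state-vector sketch of single-qubit teleportation written with NumPy. It is an illustration only and does not use CUNQA's API: the function name `teleport` and the qubit layout (message qubit and one Bell-pair half on the sending QPU, the other half on the receiving QPU) are assumptions made for this example.

```python
import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])

def teleport(msg, rng):
    """Teleport a one-qubit state (hypothetical helper, not CUNQA code).

    Axes of the 2x2x2 state tensor: q0 = message qubit and q1 = sender's
    Bell-pair half (both on QPU A), q2 = receiver's half (on QPU B).
    """
    msg = np.asarray(msg, dtype=complex)
    msg = msg / np.linalg.norm(msg)
    bell = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    state = np.kron(msg, bell).reshape(2, 2, 2)

    # CNOT with control q0 and target q1: flip q1 wherever q0 is 1.
    state[1] = state[1, ::-1, :].copy()
    # Hadamard on q0.
    state = np.einsum("ab,bcd->acd", H, state)

    # Measure q0 and q1 on the sending side.
    probs = np.sum(np.abs(state) ** 2, axis=2).ravel()
    outcome = rng.choice(4, p=probs)
    m0, m1 = divmod(int(outcome), 2)
    receiver = state[m0, m1, :] / np.sqrt(probs[outcome])

    # The two classical bits travel to QPU B, which applies the corrections.
    if m1:
        receiver = X @ receiver
    if m0:
        receiver = Z @ receiver
    return (m0, m1), receiver

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    bits, out = teleport([0.6, 0.8j], rng)
    print("classical bits sent:", bits)
    print("state on receiving QPU:", out)
```

The sketch shows why the quantum-communication model keeps the classical link: the receiving side cannot recover the state until the two measured bits arrive.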
Connecting Classical and Quantum Hardware
The fundamental building blocks of CUNQA are virtual QPUs (vQPUs): classical processes that run on HPC resources and mimic the behaviour of a real QPU. These vQPUs are designed to function as accelerators, taking tasks from the CPU, executing them, and returning results.
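The accelerator pattern can be sketched in a few lines of Python. This is a toy model, not CUNQA's interface: `run_on_vqpu`, `split_shots`, and `distribute` are hypothetical names, each "vQPU" is just a thread that samples the outcomes of an ideal Bell-pair measurement, and the host merges the partial counts, as in the no-communication scheme.

```python
import random
from collections import Counter
from concurrent.futures import ThreadPoolExecutor

def run_on_vqpu(vqpu_id, shots, seed):
    """Stand-in for one vQPU: sample `shots` ideal Bell-pair outcomes."""
    rng = random.Random(seed + vqpu_id)
    return Counter(rng.choice(["00", "11"]) for _ in range(shots))

def split_shots(total_shots, n_vqpus):
    """Divide the shots as evenly as possible among the vQPUs."""
    base, extra = divmod(total_shots, n_vqpus)
    return [base + (1 if i < extra else 0) for i in range(n_vqpus)]

def distribute(total_shots, n_vqpus, seed=0):
    """Host side: send one chunk of shots to each vQPU and merge the counts."""
    chunks = split_shots(total_shots, n_vqpus)
    merged = Counter()
    with ThreadPoolExecutor(max_workers=n_vqpus) as pool:
        partials = pool.map(run_on_vqpu, range(n_vqpus), chunks,
                            [seed] * n_vqpus)
        for partial in partials:
            merged.update(partial)
    return merged

if __name__ == "__main__":
    print(distribute(total_shots=1000, n_vqpus=4))
```

The merge step is the only coordination the host performs, which is why this scheme parallelizes so well as long as dispatch and collection overhead stays small.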
CUNQA supports the integration schemes expected to shape the future of hybrid computing:
- Co-located: The QPU is separate hardware reached over a network, but it is housed within the same HPC facility.
- On-node: Like a GPU accelerator, the QPU is housed directly inside a typical HPC node.
The accelerator paradigm renders obsolete the earlier standalone model, in which quantum systems were completely separate.
CUNQA uses a software architecture that leaves resource management to the user rather than to the middleware. Commands such as qraise, a SLURM wrapper, control the vQPUs’ life cycle and let users reserve and customize resources, including the number of vQPUs and their maximum availability time.
Using QPE to Show Capability
The team used the Quantum Phase Estimation (QPE) algorithm to confirm that the emulator could function with all three DQC approaches. Each scheme’s use of QPE illustrated the trade-offs:
- No-communication (Distribution of Shots): This method achieves strong parallelization and drastically cuts simulation time compared with the base case. However, the speed-up suffers if the overhead of distributing the work and gathering the results is too high.
- Classical-communication (IPEA): This scheme improves on the base case but is slower than the optimized no-communication case, because of synchronization delays and because extracting and altering gate execution removes internal simulator optimizations.
- Quantum-communication (Distributed QPE): This scheme showed execution times two orders of magnitude slower than the base case. Simulation time scales proportionally with the number of vQPUs involved, because tasks involving quantum communication must be simulated within a single executor process and distributing controlled gates requires additional protocols.
Despite performance differences that reflect the real-world complexity of distributed architectures, CUNQA successfully simulated QPE under all three models, producing an estimated phase in line with the theoretical value.
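For readers unfamiliar with IPEA, the following sketch shows how the phase is recovered one bit at a time with classical feedback. It works at the level of measurement probabilities rather than simulating a circuit, and it is not taken from CUNQA: the function `ipea` and its parameters are assumptions for illustration. For an eigenphase phi, round k applies the controlled power U^(2^(k-1)) to a single ancilla, rotates it by a feedback angle built from the bits already measured, and measures 1 with probability sin^2(pi * (2^(k-1) * phi - omega)).

```python
import math
import random

def ipea(phi, n_bits, rng=None):
    """Estimate a phase phi in [0, 1) to n_bits bits, least significant first.

    Hypothetical helper: each round models one ancilla measurement, and the
    measured bit is the classical information fed forward to the next round.
    """
    rng = rng or random.Random(0)
    bits = [0] * n_bits          # bits[0] is the most significant bit
    omega = 0.0                  # feedback 0.0 b_{k+1} b_{k+2} ... in binary
    for k in range(n_bits, 0, -1):
        theta = 2 * math.pi * ((2 ** (k - 1)) * phi - omega)
        p_one = math.sin(theta / 2) ** 2
        bit = 1 if rng.random() < p_one else 0
        bits[k - 1] = bit
        # Shift the known bits one place right and prepend the new bit.
        omega = omega / 2 + bit / 4
    return sum(b / 2 ** (i + 1) for i, b in enumerate(bits))

if __name__ == "__main__":
    true_phase = 0.8125          # 0.1101 in binary, exactly 4 bits
    print("estimate:", ipea(true_phase, n_bits=4))
```

When the phase is exactly representable in the chosen number of bits, as with 0.8125 and four bits here, every round is effectively deterministic and the estimate matches the true value. Only one ancilla is needed, at the cost of a classical dependency between rounds.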
The proactive development of CUNQA is a critical first step in addressing software and architectural issues before they impede the practical implementation of scalable, powerful hybrid quantum-classical computation. CUNQA’s code is open source.



