Enable user to take ownership of MPI#722

Open
otbrown wants to merge 23 commits into QuEST-Kit:devel from otbrown:mpi-update

Conversation

@otbrown
Collaborator

@otbrown otbrown commented Apr 14, 2026

Closes #712.

  • QuEST no longer uses MPI_COMM_WORLD everywhere. Instead it uses mpiCommQuest (which will very often just be a duplicate of MPI_COMM_WORLD). This aligns with best practice for library developers, as it helps avoid clashes with user MPI calls.
  • Added a new custom environment constructor which lets the user positively declare that they want to take ownership of MPI. QuEST relinquishes responsibility, and will not attempt to initialise or finalise MPI itself. It will still throw if there is an incompatible number of MPI processes.
  • Added some minimal testing... We should probably look at environment testing in more detail at some point! In fairness, I think if the mpiCommQuest stuff were broken, essentially any QuEST program would also be broken.
  • Swapped Irecv and Isend in the comms routine, as this was the first thing I was asked about when we encountered issues with Slingshot-11. It doesn't make any difference.
  • The new Subcommunicator interface lets the user create a custom QuESTEnv on a communicator that is not MPI_COMM_WORLD. This interface is guarded behind compile-time flags, as it requires bringing the MPI header into the interface, which would mean all consumers of libQuEST need to include MPI as well. This interface is not yet documented or tested. I think this is tolerable in the short term, as it should only appeal to advanced users.
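To make the intended usage concrete, here is a minimal sketch of the user-ownership flow. It assumes QuEST's quest.h header and the initCustomMpiQuESTEnv signature discussed later in this thread; treat it as illustrative rather than tested code.

```c
// Sketch of a user taking ownership of MPI (untested; assumes the
// initCustomMpiQuESTEnv signature proposed in this PR).
#include <mpi.h>
#include "quest.h"

int main(int argc, char* argv[]) {

    // the user initialises MPI themselves...
    MPI_Init(&argc, &argv);

    // ...and declares ownership, so QuEST neither initialises nor finalises
    // MPI (args: useDistrib, userOwnsMpi, useGpuAccel, useMultithread)
    initCustomMpiQuESTEnv(1, 1, 0, 1);

    Qureg qureg = createQureg(20);
    // ... QuEST calls and the user's own MPI calls can now coexist,
    // since QuEST communicates only over its duplicated mpiCommQuest ...
    destroyQureg(qureg);

    // QuEST tears down its environment but leaves MPI running
    finalizeQuESTEnv();

    // finalising MPI remains the user's responsibility
    MPI_Finalize();
    return 0;
}
```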

Speaking of tests! Test results on my laptop:

otbrown@oliver-win:~/quantum/quest/source/QuEST/build$ mpirun -n 4 ./tests/tests

QuEST execution environment:
  precision:       2
  multithreaded:   1
  distributed:     1
  GPU-accelerated: 0
  GPU-sharing ok:  0
  cuQuantum:       0
  num nodes:       4

Testing configuration:
  test all deployments:  0
  num qubits in qureg:   6
  max num qubit perms:   10
  max num superop targs: 4
  num mixed-deploy reps: 10

Tested Qureg deployments:
  CPU + OMP + MPI

Randomness seeded to: 2200603699
===============================================================================
All tests passed (293841 assertions in 275 test cases)

and min_example output:

Date:
2026-04-13T18:54:07+01:00

OMP_NUM_THREADS=36
SLURM_NTASKS_PER_NODE=8
SLURM_NNODES=1

QuEST execution environment:
  [precision]
    qreal.................double (8 bytes)
    qcomp.................std::complex<double> (16 bytes)
    qindex................long long int (8 bytes)
    validationEpsilon.....1e-12
  [compilation]
    isMpiCompiled....................1
    isMpiSubCommunicatorCompiled.....0
    isGpuCompiled....................0
    isOmpCompiled....................1
    isCuQuantumCompiled..............0
  [deployment]
    isMpiEnabled............1
    doesUserOwnMpi..........0
    isGpuEnabled............0
    isOmpEnabled............1
    isCuQuantumEnabled......0
    isGpuSharingEnabled.....0
  [cpu]
    numCpuCores.......576 per machine
    numOmpProcs.......36 per machine
    numOmpThrds.......36 per node
    cpuMemory.........754.7 GiB per machine
    cpuMemoryFree.....unknown
  [gpu]
    numGpus...........N/A
    gpuDirect.........N/A
    gpuMemPools.......N/A
    gpuMemory.........N/A
    gpuMemoryFree.....N/A
    gpuCache..........N/A
  [distribution]
    isMpiGpuAware.....0
    numMpiNodes.......8
  [statevector limits]
    minQubitsForMpi.............3
    maxQubitsForCpu.............35
    maxQubitsForGpu.............N/A
    maxQubitsForMpiCpu..........37
    maxQubitsForMpiGpu..........N/A
    maxQubitsForMemOverflow.....58
    maxQubitsForIndOverflow.....63
  [density matrix limits]
    minQubitsForMpi.............3
    maxQubitsForCpu.............17
    maxQubitsForGpu.............N/A
    maxQubitsForMpiCpu..........20
    maxQubitsForMpiGpu..........N/A
    maxQubitsForMemOverflow.....28
    maxQubitsForIndOverflow.....31
  [statevector autodeployment]
    8 qubits......[omp]
    29 qubits.....[omp] [mpi]
  [density matrix autodeployment]
    4 qubits......[omp]
    15 qubits.....[omp] [mpi]

Qureg:
  [deployment]
    isMpiEnabled.....1
    isGpuEnabled.....0
    isOmpEnabled.....1
  [dimension]
    isDensMatr.....0
    numQubits......20
    numCols........N/A
    numAmps........2^20 = 1048576
  [distribution]
    numNodes.....2^3 = 8
    numCols......N/A
    numAmps......2^17 = 131072 per node
  [memory]
    cpuAmps...........2 MiB per node
    gpuAmps...........N/A
    cpuCommBuffer.....2 MiB per node
    gpuCommBuffer.....N/A
    globalTotal.......32 MiB

Qureg (20 qubit statevector, 1048576 qcomps over 8 nodes, 4 MiB per node):
    -0.00062991+0.00025551i   |0⟩        [node 0]
    (8.3557e-5)-0.00063022i   |1⟩
    -0.00053343-0.00011416i   |2⟩
    0.00028553-0.00018421i    |3⟩
    -0.00094126-0.00097974i   |4⟩
    0.00016001+0.00042058i    |5⟩
    0.00073393-0.002013i      |6⟩
    0.00061578+0.00046316i    |7⟩
    0.00048523+0.00068937i    |8⟩
    -0.00081878-0.00060096i   |9⟩
    0.0012373+0.00049652i     |10⟩
    0.0001958+0.00014134i     |11⟩
    0.0013159-0.00040911i     |12⟩
    -0.00061634+0.00094838i   |13⟩
    0.00019392-0.00049259i    |14⟩
    -0.00031263+0.00018272i   |15⟩
                ⋮
    -0.00053361+0.00010033i   |1048560⟩  [node 7]
    0.0017006-0.00053801i     |1048561⟩
    0.0010772-(1.5397e-5)i    |1048562⟩
    -0.00085736-0.00062965i   |1048563⟩
    0.00083238-0.00053071i    |1048564⟩
    -0.00025421+0.00064579i   |1048565⟩
    -0.00069857-0.00043407i   |1048566⟩
    (-9.9973e-5)-0.00094092i  |1048567⟩
    0.00092175-0.00068456i    |1048568⟩
    -0.00026758-0.00027345i   |1048569⟩
    0.00034301+0.00011848i    |1048570⟩
    0.00026923-0.00043647i    |1048571⟩
    -0.00088784-0.00011044i   |1048572⟩
    -0.00037642-0.00087563i   |1048573⟩
    0.00031388+0.00031749i    |1048574⟩
    0.00014548+0.00030294i    |1048575⟩

Total probability: 1

JobID           JobName    Elapsed     MaxRSS
------------ ---------- ---------- ----------
161633       QuEST_min+   00:00:04
161633.batch      batch   00:00:04
161633.exte+     extern   00:00:04
161633.0     min_examp+   00:00:01

There is a caveat here: distributed tests on Cirrus keep failing, but this appears to be true of devel as well as this branch, so I'll do some more digging and raise it as a separate issue once I have a better idea of what is going on.

otbrown and others added 21 commits February 27, 2026 15:53
…for SubComm, because of course it enforces CXX
…ything but it makes MPI library implementers less nervous
@otbrown
Collaborator Author

otbrown commented Apr 16, 2026

On the plus side, a distributed QFT benchmark circuit runs perfectly correctly:

QFT Benchmark on 34 Qubits
Date:
2026-04-14T15:24:08+01:00

OMP_NUM_THREADS=36
SLURM_NTASKS_PER_NODE=8
SLURM_NNODES=1

QuEST QFT Benchmark

QuEST execution environment:
  [precision]
    qreal.................double (8 bytes)
    qcomp.................std::complex<double> (16 bytes)
    qindex................long long int (8 bytes)
    validationEpsilon.....1e-12
  [compilation]
    isMpiCompiled...........1
    isGpuCompiled...........0
    isOmpCompiled...........1
    isCuQuantumCompiled.....0
  [deployment]
    isMpiEnabled............1
    isGpuEnabled............0
    isOmpEnabled............1
    isCuQuantumEnabled......0
    isGpuSharingEnabled.....0
  [cpu]
    numCpuCores.......576 per machine
    numOmpProcs.......36 per machine
    numOmpThrds.......36 per node
    cpuMemory.........1.475 TiB per machine
    cpuMemoryFree.....unknown
  [gpu]
    numGpus...........N/A
    gpuDirect.........N/A
    gpuMemPools.......N/A
    gpuMemory.........N/A
    gpuMemoryFree.....N/A
    gpuCache..........N/A
  [distribution]
    isMpiGpuAware.....0
    numMpiNodes.......8
  [statevector limits]
    minQubitsForMpi.............3
    maxQubitsForCpu.............36
    maxQubitsForGpu.............N/A
    maxQubitsForMpiCpu..........38
    maxQubitsForMpiGpu..........N/A
    maxQubitsForMemOverflow.....58
    maxQubitsForIndOverflow.....63
  [density matrix limits]
    minQubitsForMpi.............3
    maxQubitsForCpu.............18
    maxQubitsForGpu.............N/A
    maxQubitsForMpiCpu..........20
    maxQubitsForMpiGpu..........N/A
    maxQubitsForMemOverflow.....28
    maxQubitsForIndOverflow.....31
  [statevector autodeployment]
    8 qubits......[omp]
    29 qubits.....[omp] [mpi]
  [density matrix autodeployment]
    4 qubits......[omp]
    15 qubits.....[omp] [mpi]


Performing QFT on 34 qubits.
Time to run QFT Circuit: 420.544s.

Correctness checks:
Statevector norm: 1
First amplitude: (7.62939e-06, 0)
JobID           JobName    Elapsed     MaxRSS
------------ ---------- ---------- ----------
162993       QFT_bench+   00:07:08
162993.batch      batch   00:07:08
162993.exte+     extern   00:07:08
162993.0            qft   00:07:05

@otbrown otbrown requested a review from JPRichings April 16, 2026 11:56
*/
void initCustomQuESTEnv(int useDistrib, int useGpuAccel, int useMultithread);

void initCustomMpiQuESTEnv(int useDistrib, int userOwnsMpi, int useGpuAccel, int useMultithread);
Contributor


Do we not want a new initialiser, so it's really clear where there is a different setup? Also, this probably breaks compatibility for older software.

Collaborator Author


That is a new initialiser! This is effectively a C interface, so no overloading allowed.

Contributor


Apologies, I had misread this. I thought it was replacing initCustomQuESTEnv, hence the concern. Once again a reading comprehension issue, must do better.

Collaborator Author


The difference is more subtle than I might have liked, but the alternative would be an initialiser that was way too long!

Contributor


initCustomMpiCommQuESTEnv?

@JPRichings
Contributor

Looks good, not run yet.

My only comment is above, and it raises a wider point: we need to think about how we manage change in the programming interface, as we probably want to stabilise as much as possible and only add to existing functionality.

@otbrown
Collaborator Author

otbrown commented Apr 27, 2026

Looks good, not run yet.

My only comment is above, and it raises a wider point: we need to think about how we manage change in the programming interface, as we probably want to stabilise as much as possible and only add to existing functionality.

Agreed. If we're being proper about it, breaking changes to the API require a new major version release, and I don't think we have the time or the inclination currently.

@JPRichings
Contributor

Remaining testing required:

  • world comm -1 rank

  • world_comm as subcomm

  • sub comm only 1 rank

  • 50/50 split of world comm

  • Random splitting (4 rank subcomm or similar)


@iarejula-bsc iarejula-bsc left a comment


Overall, looks good to me, really like the idea! I was wondering if this design would allow running quregs in parallel, each with their own communicator (different envs for each one, maybe?), or if that would be impossible with the current implementation. I think that would be a highly interesting feature. As my team is trying to apply malleability to QuEST, that would be a huge win for us.

A few small comments below, but note this is an untested review.

MPI_Comm_free(&mpiCommQuest);
}

MPI_Comm_dup(newComm, &mpiCommQuest);


I think this function should be more conservative. I propose two improvements:

  1. Validate newComm before use. The function currently assumes newComm is valid, but it could be MPI_COMM_NULL or an already-freed handle.

  2. Check the return value of MPI_Comm_dup. MPI functions return error codes, not exceptions. Ignoring this return value means any failure (e.g. out of resources, invalid communicator) goes undetected and mpiCommQuest could end up in an undefined state.

Suggested fix:

void comm_setMpiComm(MPI_Comm newComm) {
    if (newComm == MPI_COMM_NULL)
        throw std::invalid_argument("newComm cannot be MPI_COMM_NULL");

    if (mpiCommQuest != MPI_COMM_NULL) {
        MPI_Barrier(mpiCommQuest);
        MPI_Comm_free(&mpiCommQuest);
    }

    int err = MPI_Comm_dup(newComm, &mpiCommQuest);
    if (err != MPI_SUCCESS) {
        char errMsg[MPI_MAX_ERROR_STRING];
        int len;
        MPI_Error_string(err, errMsg, &len);
        throw std::runtime_error(std::string("MPI_Comm_dup failed: ") + errMsg);
    }
}



void comm_init() {
void comm_init(int userOwnsMpi) {


Should userOwnsMpi be a bool instead of int? It's only ever used as a boolean. Is there a repo convention for this, or is it to maintain C compatibility?

Collaborator Author


Yes, but repo convention is to use int, for historical (C) reasons.

Comment thread quest/src/comm/comm_config.cpp Outdated
MPI_Init(NULL, NULL);

MPI_Init(NULL, NULL);
MPI_Comm_dup(MPI_COMM_WORLD, &mpiCommQuest);


What if we simplify this by passing the communicator directly as a parameter instead of userOwnsMpi? If the user passes a communicator, QuEST uses it; if null/not provided, we initialize MPI and default to MPI_COMM_WORLD. This removes the ambiguity of who owns MPI and makes the API more explicit:

void comm_init(MPI_Comm userComm = MPI_COMM_NULL) {
    if (userComm == MPI_COMM_NULL) {
        MPI_Init(NULL, NULL);
        MPI_Comm_dup(MPI_COMM_WORLD, &mpiCommQuest);
    } else {
        MPI_Comm_dup(userComm, &mpiCommQuest);
    }
}

Collaborator Author


I think this is what I had in mind originally, but if I recall correctly the issue is that it brings MPI_Comm into the public headers, which we need to avoid as it causes headaches for users who are linking a non-MPI application.

I will double-check though!

Comment thread quest/src/api/subcommunicator.cpp Outdated
Comment on lines +16 to +20
// set mpiCommQuest to user provided communicator
comm_setMpiComm(userQuestComm);

// initialise QuEST around that communicator
initCustomMpiQuESTEnv(USE_DISTRIB, USER_OWNS_MPI, useGpuAccel, useMultithread);

@iarejula-bsc iarejula-bsc May 5, 2026


Does this work correctly? initCustomMpiQuESTEnv calls comm_init, where we overwrite mpiCommQuest with MPI_COMM_WORLD regardless of whether the user has ownership.

void comm_init(int userOwnsMpi) {
    ...
    // QuEST must initialise MPI if the user does not own it
    if (!userOwnsMpi)
        MPI_Init(NULL, NULL);

    MPI_Comm_dup(MPI_COMM_WORLD, &mpiCommQuest);


@iarejula-bsc iarejula-bsc left a comment


In subcommunicator.cpp:17, userQuestComm appears to be duplicated into mpiCommQuest before the normal environment validation runs. If validation later fails under a throwing or custom error handler, the duplicated communicator may be left in mpiCommQuest with no cleanup, which is an MPI resource leak.
I haven't had time to trace this fully, so it may be a non-issue depending on execution order, but it's probably worth a quick check.


// safely callable before MPI initialisation, but NOT after comm_end()
int isInit;
MPI_Initialized(&isInit);

@iarejula-bsc iarejula-bsc May 6, 2026


There are now two distinct initialized states that the code conflates into one:

  • MPI runtime initialized: MPI_Init() has been called
  • QuEST communicator initialized: mpiCommQuest != MPI_COMM_NULL

comm_isInit() only checks the first, so comm_isInit() returning true is not a safe guard against calling MPI with a null communicator.

Precise crash path:

1. MPI_Init()

   State after:
     MPI runtime: initialized
     mpiCommQuest: MPI_COMM_NULL
     QuEST env: not initialized

2. initCustomMpiQuESTEnv(0, 1, 0, 0)
   └─ validateAndInitCustomQuESTEnv(useDistrib=0, userOwnsMpi=1, useGpuAccel=0, useMultithread=0)
      ├─ validate_envNeverInit(...)
      ├─ validate_newEnvDeploymentMode(...)
      ├─ autodep_chooseQuESTEnvDeployment(...)
      ├─ if (useDistrib)
      │  └─ comm_init(userOwnsMpi)        ** skipped: useDistrib == 0 **
      └─ validate_newEnvDistributedBetweenPower2Nodes(...)
         └─ comm_getNumNodes()
            ├─ comm_isInit()
            │  └─ MPI_Initialized(&isInit) returns true   <-- runtime is up, guard passes
            └─ MPI_Comm_size(mpiCommQuest, &numNodes)     <-- CRASH

State at crash:
  MPI runtime: initialized (ok)
  mpiCommQuest: MPI_COMM_NULL (!)
  QuEST env: not initialized (!)

CRASH: MPI_Comm_size called with MPI_COMM_NULL -> abort

I suggest that comm_isInit() should distinguish between "MPI runtime is initialized" and "QuEST communicator is initialized", or all accessors should guard against mpiCommQuest != MPI_COMM_NULL.
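One possible shape for that fix, sketched under the assumption that mpiCommQuest and comm_isInit() behave as described above; the two-predicate split and the fallback behaviour are hypothetical, not QuEST code:

```c
#include <mpi.h>

// Hypothetical split of comm_isInit() into two predicates, so callers
// can distinguish "MPI runtime is up" from "QuEST communicator exists".

static MPI_Comm mpiCommQuest = MPI_COMM_NULL;

static int comm_isMpiRuntimeInit(void) {
    int isInit;
    MPI_Initialized(&isInit);
    return isInit;
}

static int comm_isQuestCommInit(void) {
    return mpiCommQuest != MPI_COMM_NULL;
}

static int comm_getNumNodes(void) {
    // require BOTH states before touching the communicator,
    // avoiding an MPI_Comm_size(MPI_COMM_NULL, ...) abort
    if (!comm_isMpiRuntimeInit() || !comm_isQuestCommInit())
        return 1;  // fall back to the non-distributed answer

    int numNodes;
    MPI_Comm_size(mpiCommQuest, &numNodes);
    return numNodes;
}
```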

Comment on lines +16 to +20
// set mpiCommQuest to user provided communicator
comm_setMpiComm(userQuestComm);

// initialise QuEST around that communicator
initCustomMpiQuESTEnv(useDistrib, userOwnsMpi, useGpuAccel, useMultithread);


I think there is a problem with the order of operations, or you must ensure MPI is initialized: initCustomMpiCommQuESTEnv() duplicates the user communicator before verifying that the MPI runtime exists at all.

In the case where MPI has not been initialized, the guard is skipped:

void comm_setMpiComm(MPI_Comm newComm) {
    if (mpiCommQuest != MPI_COMM_NULL) // false
        error_commDoubleSetMpiComm();
    MPI_Comm_dup(newComm, &mpiCommQuest);         // <-- ... before any MPI_Init() check
}

@otbrown
Collaborator Author

otbrown commented May 6, 2026

@iarejula-bsc :

I was wondering if this design would allow running quregs in parallel, each with their own communicator (different envs for each one, maybe?), or if that would be impossible with the current implementation.

We think it should allow that, yes -- also useful for our CQ project. @JPRichings will explicitly add it to his testing to be sure.

Thank you for catching some of the initialisation bugs -- I will have a more detailed look and try to fix them later this week! Much appreciated.

@iarejula-bsc

We think it should allow that yes

Nice! I was wondering whether using concurrent quregs at the same time could be an issue outside of MPI. I have two main concerns:

  • In v3, it was possible to inject the env when creating a qureg. Now a singleton pattern is used, so all of them would share the same instance.
  • I'm not sure whether other modules, such as the GPU, rely on global configuration, which could also cause issues (CUDA or whatever).
