Edit `CUDAQ_GLOBAL_INDEX_BITS` doc #3005

1tnguyen · 2025-06-06T01:37:56Z

Description

Edit the documentation for CUDAQ_GLOBAL_INDEX_BITS to improve clarity.

Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>

github-actions · 2025-06-06T03:14:59Z

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

ghanem-nv · 2025-06-06T10:30:29Z

Please correct me if I'm wrong, but my understanding is that this variable applies to the communication network between all MPI processes, regardless of whether they are intra-node or inter-node. For example, the given list would also apply to 4 nodes with 8 GPUs per node and NVLink between the GPUs. The current wording gives the impression that it applies only to the network between nodes and that the network inside a node is not relevant. Here is a suggestion that reflects my understanding:

Specify the network structure (faster to slower). For example, assume a simulation with 32 MPI processes whose network topology is divided into 4 groups of 8 processes, with faster communication within each group. In this case, the CUDAQ_GLOBAL_INDEX_BITS environment variable can be set to 3,2. The first element, 3 (log2(8)), represents the 8 processes with fast communication within a group, and the second element, 2 (log2(4)), represents the 4 groups (8 processes each) that make up the 32 processes. The sum of all elements in this list is 5, the base-2 logarithm of the total number of MPI processes (2^5 = 32). The default is an empty list (no customization based on the network structure of the cluster).
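To make the arithmetic in the suggested wording concrete, here is a minimal Python sketch that parses a value such as 3,2 and checks that its elements sum to log2 of the MPI process count. The helper names (`parse_global_index_bits`, `level_sizes`, `matches_process_count`) are hypothetical and are not part of CUDA-Q. The check only mirrors the rule stated in the text and is not the simulator's own validation logic.

```python
import os


def parse_global_index_bits(value: str) -> list[int]:
    """Parse a comma-separated list such as "3,2" into [3, 2].

    An empty string maps to an empty list, which the text describes as the
    default (no customization based on network structure).
    """
    value = value.strip()
    if not value:
        return []
    return [int(part) for part in value.split(",")]


def level_sizes(bits: list[int]) -> list[int]:
    """Number of units at each network level, fastest first: [3, 2] -> [8, 4]."""
    return [1 << b for b in bits]


def matches_process_count(bits: list[int], num_processes: int) -> bool:
    """Return True if 2**sum(bits) equals the number of MPI processes."""
    if not bits:
        return True  # empty list: nothing to check
    return (1 << sum(bits)) == num_processes


# Example from the text: 32 MPI processes, 4 groups of 8 processes each.
bits = parse_global_index_bits("3,2")
print(bits, level_sizes(bits), matches_process_count(bits, 32))

# Assumption: the variable must be present in the environment of every MPI
# process before the simulator starts; setting it from Python is one option.
os.environ["CUDAQ_GLOBAL_INDEX_BITS"] = "3,2"
```

In practice the variable would more commonly be exported in the job script or passed through the MPI launcher, so that all ranks see the same value.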

1tnguyen · 2025-06-10T02:51:50Z

That makes sense. I've incorporated your suggestions in 26e3801

github-actions · 2025-06-10T03:57:19Z

CUDA Quantum Docs Bot: A preview of the documentation can be found here.

* Edit CUDAQ_GLOBAL_INDEX_BITS doc for clarity

  Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>

* Address code review comment

  Edit the docstring for CUDAQ_GLOBAL_INDEX_BITS

  Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>

---------

Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>
Signed-off-by: Anna Gringauze <agringauze@nvidia.com>

Edit CUDAQ_GLOBAL_INDEX_BITS doc for clarity

4b64e03

Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>

1tnguyen added the documentation label (Improvements or additions to documentation) Jun 6, 2025

1tnguyen requested review from mitchdz and sacpis June 6, 2025 01:38

sacpis approved these changes Jun 6, 2025

github-actions bot pushed a commit that referenced this pull request Jun 6, 2025

Docs preview for PR #3005.

5c823fd

Address code review comment

26e3801

Edit the docstring for CUDAQ_GLOBAL_INDEX_BITS

Signed-off-by: Thien Nguyen <thiennguyen@nvidia.com>

Merge branch 'main' into tnguyen/mgpu-doc-edit

679f9c9

1tnguyen enabled auto-merge (squash) June 10, 2025 03:53

github-actions bot pushed a commit that referenced this pull request Jun 10, 2025

Docs preview for PR #3005.

afded6b

1tnguyen merged commit 8136209 into NVIDIA:main Jun 10, 2025
193 checks passed

github-actions bot pushed a commit that referenced this pull request Jun 10, 2025

Cleaning up docs preview for PR #3005.

b582871

bettinaheim added this to the release 0.12.0 milestone Jul 22, 2025
