Pre-built binaries, installer, and building from source guide by bettinaheim · Pull Request #1010 · NVIDIA/cuda-quantum

Conversation

@bettinaheim (Collaborator) commented Dec 6, 2023

Primary features/changes included in this PR:

  • Creates a self-extracting archive using makeself that allows installing a pre-built version of CUDA Quantum on the supported Linux distributions. The new installer should support the same distributions as our Python wheels. (A sketch of a typical makeself invocation follows this list.)
  • The installer properly includes the Notice file and prompts the user to accept the license agreement before extraction.
  • Adds an installation guide for building CUDA Quantum from source and deploying it in a supercomputing center. The instructions in this guide are what we use to build the installer ourselves (the docs include all snippets needed for the build as well as additional prose elaborating on compatibility requirements).
  • Adds a section on installing CUDA Quantum via the new installer to the Getting Started Guide.
  • Limitations/requirements for using the installer are documented in the new docs page(s). Specifically, we rely on rpaths to find certain dependencies (cuquantum and cutensor); these paths need to be the same at build time and at install time, which is why the installer currently requires admin rights.
  • The installer automatically activates MPI support (if MPI_PATH is set) and configures the appropriate environment variables.
  • The install_prerequisites.sh script is now much more complete and comprehensive. We now build zlib, OpenSSL, and CURL from source so that we have suitable static libraries, compiled with -fPIC, that we control; these libraries have a lot of compilation configurations, and the only way to get consistent behavior across different operating systems is to build them from source. (See the build sketch after this list.)
  • Adds jobs to CI and deployment that build the CUDA Quantum binaries in an AlmaLinux 8 environment (the same base image as used for manylinux), run the C++ unit tests, and check that the installer works on different Linux distributions. Specifically, CUDA Quantum is installed via the installer in each base image listed under cpp in validation_config.json, and the same basic validation as for the Docker images is run.
  • Tweaks the activate_custom_mpi.sh script so that the MPI plugin is built with nvq++ by default (i.e. with the only compiler we know for certain is available...), and so that it supports building and using a plugin for a non-default MPI installation (i.e. an MPI installation that is not on the PATH). The validation includes a sanity check for a basic MPI plugin built and enabled after installation, as documented in the installation guide - with the disclaimer that one should really use a more optimized MPI build in a data center.
  • Sets the rpath for nvqir-tensornet so that it finds the cutensor library even when that library is not on the LD_LIBRARY_PATH, as we already do for the cuquantum libraries. (See the rpath sketch after this list.)
  • Adds labels to the CMake tests to mark tests that require a GPU. During CI, we build all components, including GPU-accelerated ones, but the tests that require a GPU are filtered out when no GPU is found. (See the ctest sketch after this list.)
  • Various small-ish changes to make the build work across different environments, and some other tweaks and clean-up.
  • Removes some packages from the CUDA Quantum Docker images that I believe are no longer needed.
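
As referenced in the list above, a minimal sketch of a typical makeself invocation for producing such a self-extracting installer; the staging directory, archive name, and startup script are hypothetical placeholders, not the ones the actual build uses:

# All names below are hypothetical placeholders for illustration.
# makeself bundles the staging directory into a self-extracting archive;
# --license displays the given file and asks the user to accept it
# before extraction.
makeself --gzip --license staging/NOTICE \
    staging install_cuda_quantum.sh "CUDA Quantum" ./install.sh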
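
Next, a sketch of the kind of static dependency builds install_prerequisites.sh performs; versions, prefix, and exact flags are placeholders rather than what the script actually uses:

# Placeholder versions and prefix, shown only to illustrate building
# position-independent static libraries from source.
PREFIX=/usr/local/prereqs
( cd zlib-1.x && CFLAGS=-fPIC ./configure --static --prefix="$PREFIX" && make install )
( cd openssl-3.x && ./config no-shared -fPIC --prefix="$PREFIX" && make install )
( cd curl-8.x && ./configure --disable-shared --without-zstd \
      --with-openssl="$PREFIX" --prefix="$PREFIX" && make install )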
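
Similarly, a sketch of how an rpath can be embedded at link time so that a library such as cutensor is found without LD_LIBRARY_PATH; the paths are illustrative only:

# Illustrative path; the actual cutensor location differs per system.
# -Wl,-rpath embeds the search path into the binary so the dynamic
# linker can locate libcutensor.so at runtime.
clang++ -shared -o libnvqir-tensornet.so tensornet.o \
    -L/opt/nvidia/cutensor/lib -Wl,-rpath,/opt/nvidia/cutensor/lib -lcutensor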
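
Finally, a sketch of the ctest label filtering used to skip GPU tests; the label name gpu_required is a hypothetical stand-in for whatever label the repository defines:

# Hypothetical label name; the label used in the repository may differ.
# Exclude tests labeled as requiring a GPU when none is available:
ctest --output-on-failure -LE gpu_required
# On a machine with a GPU, simply run everything:
ctest --output-on-failure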

Should be addressed/revised in the future (but I don't want to include it in this PR):

  • Add a job using the installer instead of the Docker image to the nightly integration tests.
  • The installer should be signed, and we should add a sentence to the documentation describing how to confirm its authenticity.
  • Add a sentence to the docs on how to update to a newer CUDA Quantum version. For that, it would be handy to include an uninstall script with the installer, which should remove only the files that were newly created during installation and nothing else.
  • Right now, we statically link everything when creating the binaries for the installer, including glibc (as we do for the pip wheels). This is not really a good setup, as commented in the CMakeLists.txt file. I think a good setup in the future would be to not statically link glibc, but instead include the necessary libraries in the installer and have nvq++ dynamically pick either the system library or the library in the CUDA Quantum installation, depending on which one is newer. That requires more work than I want to squeeze into this PR, though.
  • We still have certain (now documented) CUDA runtime dependencies. I think it would be reasonable to simply link those statically so that they do not need to be installed manually. (See the linking sketch after this list.)
  • We still don't build or include the C++ standard library in our installer. In the future, it may be nice to build it from source and install it in a subfolder that nvq++ checks (e.g. by setting the clang resource dir?), rather than linking it statically.
  • TBD: I more or less arbitrarily chose to support zlib compression and disable zstd compression in LLVM and CURL. This choice was purely motivated by seeing that on certain operating systems, disabling zlib compression for the lld linker causes errors when compiling our examples, because certain linked libraries use this compression. I didn't do an in-depth evaluation of whether that is the best compression option, or whether we need to enable more in the future.
  • The CUDAQ_BUILD_RELOCATABLE_PACKAGE option should be reexamined; it currently doesn't meet the need of creating a self-contained installation. We should move from using libstdc++ to libc++, build the latter from source, and include it.
  • Worth pointing out and possibly revising: the mock QPU tests are not built/run unless CUDA Quantum is built with Python support enabled. The reason is that the mock implementations of the server are written in Python and use CUDA Quantum to simulate execution.
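
As referenced in the CUDA runtime item above, a sketch of what statically linking the CUDA runtime could look like; these are standard toolchain flags, not necessarily what the build will adopt:

# Link the CUDA runtime statically so that libcudart.so does not need
# to be installed separately on the target system:
nvcc -cudart static -o program program.cu
# Or, when linking with a host compiler (CUDA_HOME is a placeholder):
g++ -o program program.o -L"$CUDA_HOME/lib64" -lcudart_static -ldl -lpthread -lrt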

Other changes for things I randomly came across and fixed that could (should?) go into a separate PR:

  • If one of the examples in the sanity checks we run to validate the Docker images fails, it should now properly fail the build (to be confirmed).
  • Similarly, fixes an issue where the Python wheel validation did not fail properly when an example failed.
  • Sets the minimum CMake version to 3.26, since this is the version that we test. This is the minimum version required for the Python wheels, and if we formally allow a lower version, we would need to properly test it during CI to ensure it is indeed supported. The install_prerequisites.sh script contains a quick and easy way to install a suitable CMake version.
  • The final validation of the Docker image also ran the examples on the noisy simulator by default. I disabled all samples except the noise-related ones for that backend, since running them there is extremely slow.
  • The nvq++ script required tput, which is not generally available on all systems (this is why we used xterm in CI). I removed that dependency by adding a fallback for coloring the command line output that doesn't need it. (See the sketch after this list.)
  • Enables running a couple of samples that were not previously tested during CI.
  • Updates the googletest submodule to v1.14.0.
  • Logs in to DockerHub before building images (which allows twice the number of pulls compared to pulling images anonymously) and updates some of the GitHub actions used.
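
A sketch of the kind of tput fallback described above, using the RED/NORMAL variables that also appear in the review snippet further down; the actual implementation in nvq++ may differ:

# Use tput when available; otherwise fall back to raw ANSI escape codes.
if [ -x "$(command -v tput)" ] && [ -n "$TERM" ]; then
    RED=$(tput setaf 1) && NORMAL=$(tput sgr0)
else
    RED=$(printf '\033[0;31m') && NORMAL=$(printf '\033[0m')
fi
echo "${RED}error: example message${NORMAL}"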

The pipelines are set up such that a build cache for the installer build, as well as for the basic MPI build, is stored on GHCR (see the sketch below). That means that unless we update any of the dependencies, only CUDA Quantum itself is rebuilt during each CI run. At the same time, we now have a single Docker image that contains a complete CUDA Quantum build in a fairly minimal environment, which validates the new guide for building CUDA Quantum from source.
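
A sketch of how such a registry-backed build cache is typically wired up with docker buildx; the cache reference and image tag are placeholders, not the names the pipelines actually use:

# Placeholder refs for illustration; the pipelines define their own names.
docker buildx build \
    --cache-from type=registry,ref=ghcr.io/<org>/<repo>/build-cache \
    --cache-to type=registry,ref=ghcr.io/<org>/<repo>/build-cache,mode=max \
    -t cuda-quantum-build .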

@github-actions
github-actions bot commented Dec 6, 2023

CUDA Quantum Docs Bot: A preview of the documentation can be found here.
@bettinaheim (Collaborator, Author) commented Dec 18, 2023

/create_cache

Command Bot: Processing...
The launched workflow can be found here.
Running workflow from branch main. The created cache will be owned by that branch.
Checking out source code from head bettinaheim:main (sha: f5e8aa9).

@bettinaheim merged commit dc174b2 into NVIDIA:main Jan 17, 2024
github-actions bot locked and limited conversation to collaborators Jan 17, 2024
Review thread on the following shell snippet:

callerName=$(basename "$(caller 0)")
echo "${RED}${callerName}:$(echo $(caller 0) | cut -d " " -f1): $1${NORMAL}"
message_content="${callerName}:$(echo $(caller 0) | cut -d " " -f1): $1"
if [ -x "$(command -v tput)" ] && [ -n "$TERM"] ; then
Suggested change (from a collaborator; adds the missing space before the closing bracket):
if [ -x "$(command -v tput)" ] && [ -n "$TERM"] ; then
if [ -x "$(command -v tput)" ] && [ -n "$TERM" ] ; then

@bettinaheim added the enhancement (New feature or request) label Jan 22, 2024
@bettinaheim added the documentation (Improvements or additions to documentation) label Jan 29, 2024