Error building VASP 6.5.1 using makefile.include.nvhpc_ompi_mkl_omp_acc


matthew_matzelle
Newbie
Posts: 8
Joined: Mon Jan 20, 2020 9:11 pm

#1 Post by matthew_matzelle » Thu Jun 26, 2025 1:03 pm

Hi All,

I am trying to compile VASP 6.5.1 on Rocky Linux 9.3 on our HPC cluster, starting from the template makefile.include.nvhpc_ompi_mkl_omp_acc.

Here is my current makefile.include after setting the installation paths and uncommenting the options I need.

Code:

# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxNV\" \
              -DMPI -DMPI_INPLACE -DMPI_BLOCK=8000 -Duse_collective \
              -DscaLAPACK \
              -DCACHE_SIZE=4000 \
              -Davoidalloc \
              -Dvasp6 \
              -Dtbdyn \
              -Dqd_emulate \
              -Dfock_dblbuf \
              -D_OPENMP \
              -DACC_OFFLOAD \
              -DNVCUDA \
              -DUSENCCL

CPP         = nvfortran -Mpreprocess -Mfree -Mextend -E $(CPP_OPTIONS) $*$(FUFFIX)  > $*$(SUFFIX)

# N.B.: you might need to change the cuda-version here
#       to one that comes with your NVIDIA-HPC SDK
CC          = mpicc  -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp
FC          = mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp
FCL         = mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp -c++libs

FREE        = -Mfree

FFLAGS      = -Mbackslash -Mlarge_arrays

OFLAG       = -fast

DEBUG       = -Mfree -O0 -traceback

LLIBS       = -cudalib=cublas,cusolver,cufft,nccl -cuda

# Redefine the standard list of O1 and O2 objects
SOURCE_O1  := pade_fit.o minimax_dependence.o wave_window.o
SOURCE_O2  := pead.o

# For what used to be vasp.5.lib
CPP_LIB     = $(CPP)
FC_LIB      = $(FC)
CC_LIB      = $(CC)
CFLAGS_LIB  = -O -w
FFLAGS_LIB  = -O1 -Mfixed
FREE_LIB    = $(FREE)

OBJECTS_LIB = linpack_double.o

# For the parser library
CXX_PARS    = nvc++ --no_warnings

##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##
# When compiling on the target machine itself , change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU ?= -tp cascadelake
FFLAGS     += $(VASP_TARGET_CPU)

# Specify your NV HPC-SDK installation (mandatory)
#... first try to set it automatically
#NVROOT      =$(shell which nvfortran | awk -F /compilers/bin/nvfortran '{ print $$1 }')

# If the above fails, then NVROOT needs to be set manually
NVHPC      ?= /shared/EL9/explorer/nvidia-hpc-sdk/24.7
NVVERSION   = 24.7
NVROOT      = /shared/EL9/explorer/nvidia-hpc-sdk/24.7/Linux_x86_64/24.7

## Improves performance when using NV HPC-SDK >=21.11 and CUDA >11.2
OFLAG_IN   = -fast -Mwarperf
SOURCE_IN  := nonlr.o

# Software emulation of quadruple precision (mandatory)
QD         ?= $(NVROOT)/compilers/extras/qd
LLIBS      += -L$(QD)/lib -lqdmod -lqd
INCS       += -I$(QD)/include/qd

# Intel MKL for FFTW, BLAS, LAPACK, and scaLAPACK
MKLROOT    ?= /shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0
MKLLIBS     = -Mmkl
#MKLLIBS     = -lmkl_intel_lp64 -lmkl_pgi_thread -lmkl_core -pgf90libs -mp -lpthread -lm -ldl

# If you want to use scaLAPACK from MKL
LLIBS_MKL   = -L$(MKLROOT)/lib -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 $(MKLLIBS)

# Use a separate scaLAPACK installation (optional but recommended in combination with OpenMPI)
# Comment out the two lines below if you want to use scaLAPACK from MKL instead
#SCALAPACK_ROOT ?= /path/to/your/scalapack/installation
#LLIBS_MKL   = -L$(SCALAPACK_ROOT)/lib -lscalapack $(MKLLIBS)

LLIBS      += $(LLIBS_MKL)

INCS       += -I$(MKLROOT)/include/fftw

# Use cusolvermp (optional)
# supported as of NVHPC-SDK 24.1 (and needs CUDA-11.8)
#CPP_OPTIONS+= -DCUSOLVERMP -DCUBLASMP
#LLIBS      += -cudalib=cusolvermp,cublasmp -lnvhpcwrapcal

# HDF5-support (optional but strongly recommended, and mandatory for some features)
#CPP_OPTIONS+= -DVASP_HDF5
#HDF5_ROOT  ?= /shared/EL9/explorer/HDF5/1.14.6
#LLIBS      += -L$(HDF5_ROOT)/lib -lhdf5_fortran
#INCS       += -I$(HDF5_ROOT)/include

# For the VASP-2-Wannier90 interface (optional)
CPP_OPTIONS    += -DVASP2WANNIER90
WANNIER90_ROOT ?= /projects/bansil/programs/WANNIER90/wannier90-3.1.0nvidiavaspinterfacenompi
LLIBS          += -L$(WANNIER90_ROOT) -lwannier

# For the fftlib library (hardly any benefit for the OpenACC GPU port, especially in combination with MKL's FFTs)
#CPP_OPTIONS+= -Dsysv
#FCL        += fftlib.o
#CXX_FFTLIB  = nvc++ -mp --no_warnings -std=c++11 -DFFTLIB_USE_MKL -DFFTLIB_THREADSAFE
#INCS_FFTLIB = -I./include -I$(MKLROOT)/include/fftw
#LIBS       += fftlib
#LLIBS      += -ldl

# For machine learning library vaspml (experimental)
#CPP_OPTIONS += -Dlibvaspml
#CPP_OPTIONS += -DVASPML_USE_CBLAS
#CPP_OPTIONS += -DVASPML_DEBUG_LEVEL=3
#CXX_ML      = mpic++ -mp
#CXXFLAGS_ML = -O3 -std=c++17 -Wall -Wextra
#INCLUDE_ML  =

# Add -gpu=tripcount:host to compiler commands for NV HPC-SDK > 25.1
#NVFORTRAN_VERSION := $(shell nvfortran --version | sed -n '2s/^nvfortran \([0-9.]*\).*/\1/p')
# define greater_or_equal
#$(shell printf '%s\n%s\n' '$(1)' '$(2)' | sort -V | head -n1 | grep -q '$(2)' && echo true || echo false)
#endef
#ifeq ($(call greater_or_equal,$(NVFORTRAN_VERSION),25.1),true)
#    CC  += -gpu=tripcount:host
#    FC  += -gpu=tripcount:host
#endif

I am building with the following compiler and library versions:

  • NVIDIA HPC SDK 24.7 (which comes with CUDA 12.5 and OpenMPI 4.1.7a1)

  • Intel MKL 2025.0 (from Intel oneAPI 2025.0.1)

  • Wannier90 3.1

The compilation fails with the following messages in the error file:

Code:

ar: creating libdmy.a
ar: creating libparser.a
NVFORTRAN-W-0006-Input file empty (crayhip.f90)
NVFORTRAN/x86-64 Linux 24.7-0: compilation completed with warnings
NVFORTRAN-W-0006-Input file empty (intelmkl.f90)
NVFORTRAN/x86-64 Linux 24.7-0: compilation completed with warnings
NVFORTRAN-W-0006-Input file empty (vaspml.f90)
NVFORTRAN/x86-64 Linux 24.7-0: compilation completed with warnings
NVFORTRAN-W-0155-Mismatched data type for member kvector (bandgap_tools.F: 394)
NVFORTRAN-W-0155-Mismatched data type for member kvector (bandgap_tools.F: 401)
  0 inform,   2 warnings,   0 severes, 0 fatal for find_band_edges_current_kpoint
NVFORTRAN-W-0921-Redefinition of symbol RANK_SUFFIX (./ml_reader.inc: 5)
/usr/bin/ld: /shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0/lib/libmkl_intel_thread.so: undefined reference to `__kmpc_end_masked'
/usr/bin/ld: /shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0/lib/libmkl_intel_thread.so: undefined reference to `__kmpc_masked'
pgacclnk: child process exit status 1: /usr/bin/ld
make[2]: *** [makefile:153: vasp] Error 2
cp: cannot stat 'vasp': No such file or directory
make[1]: *** [makefile:150: all] Error 1
make: *** [makefile:17: std] Error 2

The first two lines (the ar messages) are printed to the error file even though they are not actually errors.
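
To separate the real linker failures from the harmless warnings in an error file like the one above, a short shell filter can list each library together with the symbol it is missing. This is only a sketch: the function name undefined_refs is made up for illustration, and it assumes the GNU ld message format "undefined reference to `symbol'" exactly as it appears in the log.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read a build log on stdin and print
# "<library> <missing symbol>" for every GNU ld "undefined reference" line.
# Assumes the ASCII quoting style `symbol' used by ld in the log above.
undefined_refs() {
    sed -n "s|^.*ld: \(.*\): undefined reference to \`\(.*\)'\$|\1 \2|p" | sort -u
}
```

It could be used as `undefined_refs < build.err`, where build.err stands for whatever file the make output was redirected to. In the log above, both missing symbols (__kmpc_masked and __kmpc_end_masked) are referenced by libmkl_intel_thread.so, which points at the OpenMP threading layer of MKL rather than at VASP itself.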
The final lines of the output are:

Code:

nvfortran -Mpreprocess -Mfree -Mextend -E -DHOST=\"LinuxNV\" -DMPI -DMPI_INPLACE -DMPI_BLOCK=8000 -Duse_collective -DscaLAPACK -DCACHE_SIZE=4000 -Davoidalloc -Dvasp6 -Dtbdyn -Dqd_emulate -Dfock_dblbuf -D_OPENMP -DACC_OFFLOAD -DNVCUDA -DUSENCCL -DVASP2WANNIER90 rpa_high.F > rpa_high.f90 -DNGZhalf
mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp -Mfree -Mbackslash -Mlarge_arrays -tp cascadelake -fast -I/shared/EL9/explorer/nvidia-hpc-sdk/24.7/Linux_x86_64/24.7/compilers/extras/qd/include/qd -I/shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0/include/fftw  -c rpa_high.f90
nvfortran -Mpreprocess -Mfree -Mextend -E -DHOST=\"LinuxNV\" -DMPI -DMPI_INPLACE -DMPI_BLOCK=8000 -Duse_collective -DscaLAPACK -DCACHE_SIZE=4000 -Davoidalloc -Dvasp6 -Dtbdyn -Dqd_emulate -Dfock_dblbuf -D_OPENMP -DACC_OFFLOAD -DNVCUDA -DUSENCCL -DVASP2WANNIER90 main.F > main.f90 -DNGZhalf
mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp -Mfree -Mbackslash -Mlarge_arrays -tp cascadelake -Mfree -O0 -traceback -I/shared/EL9/explorer/nvidia-hpc-sdk/24.7/Linux_x86_64/24.7/compilers/extras/qd/include/qd -I/shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0/include/fftw  -c main.f90
mpif90 -acc -gpu=cc60,cc70,cc80,cc90,cuda12.5 -mp -c++libs -o vasp c2f_interface.o simd.o base.o string.o tutor.o version.o build_info.o command_line.o vhdf5_base.o incar_reader.o reader_base.o openmp_struct.o openacc_struct.o offload_struct.o mpi.o mpi_shmem.o main_mpi.o mathtools.o profiling.o bse_struct.o mgrid_struct.o pot_struct.o hamil_struct.o radial_struct.o pseudo_struct.o wave_struct.o nl_struct.o mkpoints_struct.o bandgap_struct.o poscar_struct.o esf_struct.o afqmc_struct.o minimax_struct.o setex_struct.o locproj_struct.o fock_glb.o chi_glb.o smart_allocate.o xml.o constant.o plugins.o ml_ff_c2f_interface.o ml_ff_prec.o ml_ff_string.o ml_ff_tutor.o ml_ff_constant.o ml_ff_mpi_help.o ml_ff_neighbor.o ml_ff_taglist.o ml_ff_struct.o ml_ff_mpi_shmem.o vdwforcefield_glb.o jacobi.o scala_struct.o ini.o scala.o nvcuda.o crayhip.o intelmkl.o openmp.o openacc.o offload.o scalapack_wrappers.o blas_wrappers.o lapack_wrappers.o asa.o lattice.o poscar.o fft_comm.o fftw.o fft_wrappers.o fft_base.o mgrid.o libmbd.o ml_asa2.o ml_ff_mpi.o ml_ff_helper.o ml_ff_logfile.o ml_ff_math.o ml_ff_iohandle.o ml_ff_memory.o ml_ff_abinitio.o ml_ff_ff2.o ml_ff_ff3.o ml_ff_ff.o ml_ff_mlff.o vaspml.o ldalib.o wpbe.o ggalib.o mbj.o mggalib.o vdw_nl.o xc_driver.o setex.o pseudo.o radial.o gridq.o coulomb_cutoff.o ebs.o symlib.o gauss_quad.o m_unirnk.o mkpoints.o random.o wave.o wave_mpi.o wave_high.o bext.o spinsym.o symmetry.o lattlib.o nonl.o nonlr.o nonl_high.o dfast.o choleski2.o mix.o hamil.o constrmag.o cl_shift.o relativistic.o LDApU.o paw_base.o tau_mu.o fexcg.o egrad.o pawsym.o pawfock.o pawlhf.o diis.o rhfatm.o hyperfine.o fock_ace.o mkpoints_full.o charge.o us.o extpot.o paw.o Lebedev-Laikov.o stockholder.o pot_electrostat.o dipol.o solvation.o scpc.o fermi_energy.o tet.o dos.o elf.o hamil_rot.o chain.o dyna.o fileio.o vhdf5.o bandgap_tools.o pot.o sphpro.o core_rel.o aedens.o wavpre.o wavpre_noio.o broyden.o dynbr.o reader.o writer.o xml_writer.o brent.o stufak.o opergrid.o 
stepver.o fast_aug.o fock_multipole.o fock.o fock_dbl.o fock_frc.o supercell.o mkpoints_change.o subrot_cluster.o sym_grad.o mymath.o npt_dynamics.o subdftd3.o subdftd4.o internals.o dynconstr.o dimer_heyden.o dvvtrajectory.o vdwforcefield.o nmr.o pead.o k-proj.o subrot.o subrot_scf.o paircorrection.o rpa_force.o ml_reader.o ml_interface_writer.o ml_interface.o coulomb_cutoff_gradients.o force.o pwlhf.o gw_model.o optreal.o steep.o rmm-diis.o davidson.o david_full.o david_inner.o root_find.o lcao_bare.o locproj.o electron_common.o electron.o rot.o electron_all.o shm.o pardens.o optics.o constr_cell_relax.o stm.o finite_diff.o elpol.o hamil_lr.o rmm-diis_lr.o subrot_lr.o lr_helper.o hamil_lrf.o elinear_response.o ilinear_response.o linear_optics.o setlocalpp.o wannier.o electron_OEP.o electron_lhf.o twoelectron4o.o minimax_ini.o minimax_dependence.o minimax_functions1D.o minimax_functions2D.o minimax_varpro.o minimax.o umco.o mlwf.o ratpol.o pade_fit.o screened_2e.o wave_cacher.o crpa.o chi_base.o wpot.o local_field.o ump2.o ump2kpar.o fcidump.o ump2no.o bse_te.o bse_lanczos.o bse.o bse_driver.o time_propagation.o esf.o acfdt.o afqmc.o rpax.o chi.o dmft.o GG_base.o acfdt_GG.o greens_orbital.o lt_mp2.o rnd_orb_mp2.o greens_real_space.o chi_GG.o chi_super.o sydmat.o rmm-diis_mlr.o linear_response_NMR.o wannier_interpol.o wave_interpolate.o wave_rotator.o wave_window.o wap.o elphon_potential_struct.o elphon_base.o elphon_triplets.o elphon_potential.o elphon_accumulators.o elphon_kgrid.o transport.o elphon_common.o elphon_mels.o elphon_selfen_ph.o elphon_driver.o linear_response.o auger.o dmatrix.o phonon.o elphon_derivative.o wannier_mats.o elphon.o core_con_mat.o embed.o rpa_high.o  main.o  -Llib -ldmy -Lparser -lparser -cudalib=cublas,cusolver,cufft,nccl -cuda -L/shared/EL9/explorer/nvidia-hpc-sdk/24.7/Linux_x86_64/24.7/compilers/extras/qd/lib -lqdmod -lqd -L/shared/EL9/explorer/intel-oneapi/2025.0.1/mkl/2025.0/lib -lmkl_scalapack_lp64 -lmkl_blacs_openmpi_lp64 -Mmkl 
-L/projects/bansil/programs/WANNIER90/wannier90-3.1.0nvidiavaspinterfacenompi -lwannier
make[2]: Leaving directory '/projects/bansil/programs/VASP651gpuexplorer/vasp.6.5.1/build/std'
make[1]: Leaving directory '/projects/bansil/programs/VASP651gpuexplorer/vasp.6.5.1/build/std'

I have tried building both with and without linking to the Wannier90 library.
I see the same issue when building VASP 6.4.3.

Please let me know if you need any more info.
Please let me know if anything obvious stands out, or if you can suggest a next step toward fixing this.

Thank you,
Matt


merzuk.kaltak
Administrator
Posts: 316
Joined: Mon Sep 24, 2018 9:39 am

#2 Post by merzuk.kaltak » Fri Jun 27, 2025 1:12 pm

Hello Matthew,
The error seems to be related to MKL and the way you link against the OpenMP runtime:

Code:

MKLLIBS     = -Mmkl
#MKLLIBS     = -lmkl_intel_lp64 -lmkl_pgi_thread -lmkl_core -pgf90libs -mp -lpthread -lm -ldl

Please change this to the recommended lines in makefile.include.nvhpc_ompi_mkl_omp_acc:

Code:

#MKLLIBS     = -Mmkl
MKLLIBS     = -lmkl_intel_lp64 -lmkl_pgi_thread -lmkl_core -pgf90libs -mp -lpthread -lm -ldl
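
Since the two MKLLIBS lines differ only in which one is commented out, it can be worth checking which setting is actually active before rebuilding. The following is a minimal sketch, not part of VASP: the function name check_mkllibs is hypothetical, and it assumes the file is called makefile.include unless another path is passed as the first argument.

```shell
#!/bin/sh
# Sketch (hypothetical helper): print the active, uncommented MKLLIBS
# assignment(s) of a makefile.include and report which variant is in use.
check_mkllibs() {
    file=${1:-makefile.include}
    # Lines starting with '#' do not match, so only active settings are kept.
    active=$(grep -E '^[[:space:]]*MKLLIBS[[:space:]]*[:?+]?=' "$file")
    printf '%s\n' "$active"
    case $active in
        *-Mmkl*)          echo "WARNING: -Mmkl is still active"; return 1 ;;
        *mkl_pgi_thread*) echo "OK: explicit mkl_pgi_thread link line"; return 0 ;;
        *)                echo "NOTE: no recognized MKLLIBS setting"; return 2 ;;
    esac
}
```

Running `check_mkllibs` in the VASP root directory prints the active line followed by a one-line verdict; the return code lets it be used in a build script as well.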

With this makefile.include I was able to compile 6.5.1 using the following modules:

  • NVHPC 24.7

  • MKL 2024.2.1

  • OpenMPI 4.1.6-cuda (provided by NVHPC 24.7)
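
Once the build succeeds, one way to confirm which MKL threading layer the executable picked up is to filter the output of ldd. This is a rough sketch under two assumptions: that MKL was linked dynamically, and that the executable ends up at bin/vasp_std. The function name mkl_threading_layers is made up for illustration.

```shell
#!/bin/sh
# Sketch (hypothetical helper): read `ldd` output on stdin and print the
# names of the MKL threading-layer libraries that appear in it.
mkl_threading_layers() {
    grep -o 'libmkl_[a-z]*_thread' | sort -u
}
```

For example, `ldd bin/vasp_std | mkl_threading_layers` lists the threading layers found. If libmkl_intel_thread shows up there for an NVHPC build, the link line is probably still not the intended one.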

