Clarification on number of MPI ranks for VASP tests.

Queries about input and output files, running specific calculations, etc.


hszhao.cn@gmail.com
Full Member
Posts: 191
Joined: Tue Oct 13, 2020 11:32 pm

Clarification on number of MPI ranks for VASP tests.

#1 Post by hszhao.cn@gmail.com » Wed Jan 03, 2024 2:20 pm

Dear VASP Forum,

I am currently setting up tests for the VASP software and came across some information specifying that "Reference files have been generated with 4 MPI ranks. Note that tests might fail if another number of ranks is used!"

Would it be possible to obtain clarification regarding the use of a different number of MPI ranks for the tests?

As far as the VASP testing is concerned, I would like to know whether it is feasible to use a larger number of MPI ranks to distribute the computation more broadly, or a smaller number if my resources are limited.

Could you please provide some guidance as to the potential pitfalls, if any, of deviating from the 4 MPI ranks? Or if there are certain tests that are more sensitive to the number of ranks used, could you please specify these?

Any advice or resources on this topic would be greatly appreciated and will help me configure my test environment properly.

Many thanks for your assistance in this matter.

Best regards,
Zhao

hszhao.cn@gmail.com
Full Member
Posts: 191
Joined: Tue Oct 13, 2020 11:32 pm

Re: Clarification on number of MPI ranks for VASP tests.

#2 Post by hszhao.cn@gmail.com » Thu Jan 04, 2024 8:10 am

Below is my testing report for the HEG_333_LW example shipped with the vasp.6.4.2 test suite, run with various core counts and with/without the "-genv I_MPI_FABRICS=shm" option. The tests started from one core and increased the core count by one each time.

Here are the two sets of commands I used, where 'n' represents the number of cores:

Without "-genv I_MPI_FABRICS=shm" option:

Code:

module load vasp/6.4.2-intel-oneapi.2023.2.0

export VASP_TESTSUITE_EXE_STD="mpirun -np n vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np n vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np n vasp_gam"

cd vasp.6.4.2

time VASP_TESTSUITE_TESTS=HEG_333_LW make test

With "-genv I_MPI_FABRICS=shm" option:

Code:

module load vasp/6.4.2-intel-oneapi.2023.2.0

export VASP_TESTSUITE_EXE_STD="mpirun -np n -genv I_MPI_FABRICS=shm vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np n -genv I_MPI_FABRICS=shm vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np n -genv I_MPI_FABRICS=shm vasp_gam"

cd vasp.6.4.2

time VASP_TESTSUITE_TESTS=HEG_333_LW make test

Below are my findings:

1. The testing either stalls or fails when run on more than 12 cores.
2. The testing succeeds on 12 cores or fewer, both with and without "-genv I_MPI_FABRICS=shm". However, when the "-genv I_MPI_FABRICS=shm" option is used, there is an observable speedup of one to several seconds.

I would appreciate any insights you could provide regarding my findings. Are these expected behaviors, or are there optimizations we could apply to improve the test runs further?

Looking forward to your response.

Best Regards,
Zhao

svijay
Global Moderator
Posts: 74
Joined: Fri Aug 04, 2023 11:07 am

Re: Clarification on number of MPI ranks for VASP tests.

#3 Post by svijay » Thu Jan 04, 2024 8:18 am

Dear Zhao,

The sentence just states that the reference (against which your calculation will be checked) has been run with four MPI ranks. You may change the number of ranks used to run the tests via the VASP_TESTSUITE_EXE_STD, VASP_TESTSUITE_EXE_GAM, and VASP_TESTSUITE_EXE_NCL environment variables. Please note that changing the parallelization strategy could change certain parameters within the calculation (such as NBANDS: see https://www.vasp.at/wiki/index.php/NBANDS) and hence might lead to some tests failing. There is currently no list of tests that are sensitive to the number of ranks used.
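
For illustration, overriding the launchers to run the whole suite with, say, 8 ranks could look like the sketch below (the rank count and module name are only examples taken from earlier in this thread):

Code:

# Illustrative only: run the full test suite with 8 MPI ranks instead of the
# 4 ranks used to generate the reference files.
module load vasp/6.4.2-intel-oneapi.2023.2.0

export VASP_TESTSUITE_EXE_STD="mpirun -np 8 vasp_std"
export VASP_TESTSUITE_EXE_NCL="mpirun -np 8 vasp_ncl"
export VASP_TESTSUITE_EXE_GAM="mpirun -np 8 vasp_gam"

cd vasp.6.4.2
make test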

Sudarshan

hszhao.cn@gmail.com
Full Member
Posts: 191
Joined: Tue Oct 13, 2020 11:32 pm

Re: Clarification on number of MPI ranks for VASP tests.

#4 Post by hszhao.cn@gmail.com » Thu Jan 04, 2024 2:45 pm

Here is the test script I used; if others have similar needs, it can be adjusted to your own situation:

Code:

#!/usr/bin/env bash
# Source user-specific helper functions (e.g., the cwd helper used below)
script_name_sh=$HOME/.local/libexec/script_name.sh
source $script_name_sh ${BASH_SOURCE[0]}

cwd

if [ $# -ne 1 ]; then
    echo "This script requires exactly one argument!"
    echo "Please provide a feature_set as the argument (for example, 1 for intel or 2 for intel_omp)"
    exit 1
fi

feature_set=$1

# Change directory to the correct path
cd releases/vasp.6.4.2

case "$feature_set" in

    1)    echo "Running code with intel feature_set"

      #https://www.vasp.at/forum/viewtopic.php?f=2&t=18373#p21390
      #This issue could be related to the fabrics control in the Intel MPI library.
      #Can you try setting the argument I_MPI_FABRICS=shm in your mpirun command?
      #mpirun -np 64 -genv I_MPI_FABRICS=shm vasp_std

      # Load the required module
      module --force purge
      module load vasp/6.4.2-intel-oneapi.2023.2.0

      # Iterate over different numbers of cores starting from 1
      for ((n=1; n<=12; n++)); do
          echo "Testing with $n cores without -genv option"

          export VASP_TESTSUITE_EXE_STD="mpirun -np $n vasp_std" 
          export VASP_TESTSUITE_EXE_NCL="mpirun -np $n vasp_ncl" 
          export VASP_TESTSUITE_EXE_GAM="mpirun -np $n vasp_gam"

          # Run the test
          time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS

          echo "Testing with $n cores with -genv option"

          export VASP_TESTSUITE_EXE_STD="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_std"
          export VASP_TESTSUITE_EXE_NCL="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_ncl"
          export VASP_TESTSUITE_EXE_GAM="mpirun -np $n -genv I_MPI_FABRICS=shm vasp_gam"

          # Run the test
          time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS
      done

          ;;

    2)    echo "Running code with intel_omp feature_set"

      # The following setting is adapted from testsuite/impi+omp.conf,
      # and the guidance described on https://www.vasp.at/wiki/index.php/Validation_tests
      #https://stackoverflow.com/questions/71181984/find-the-optimal-combination-of-setting-values-for-number-of-processes-and-om

      # Load the required module
      module --force purge
      module load vasp/6.4.2-intel_omp-oneapi.2023.2.0      

      # Set up an array with the number of MPI processes (ranks) and threads per process
      nranks=(2 4 6 8 10 12)  # array with the number of MPI processes for each test
      nthrd=2                 # number of threads per process for all tests, using hyperthreading (2 threads per core)

      # Iterate over each test
      for nrank in ${nranks[*]}; do
        # Set up MPI parameters
        mpi_params="-np $nrank -genv OMP_NUM_THREADS=$nthrd -genv I_MPI_PIN_DOMAIN=omp -genv KMP_AFFINITY=verbose,granularity=fine,compact,1,0 -genv KMP_STACKSIZE=512m"

        # Define the commands to test by using the mpi variable
        export VASP_TESTSUITE_EXE_STD="mpirun $mpi_params vasp_std"
        export VASP_TESTSUITE_EXE_NCL="mpirun $mpi_params vasp_ncl"
        export VASP_TESTSUITE_EXE_GAM="mpirun $mpi_params vasp_gam"

        # Print the number of nrank and threads
        printf "Running tests with nrank=%s, nthrd=%s\n" $nrank $nthrd

        # Run the test and grep for the SUCCESS status
        time VASP_TESTSUITE_TESTS=HEG_333_LW make test | grep SUCCESS
      done

          ;;

    *)    echo "Invalid argument!"
          echo "Please provide a valid feature_set as the argument (for example, 1 for intel or 2 for intel_omp)"
          exit 1
          ;;

esac
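
For example, assuming the script is saved as run_heg_tests.sh (a hypothetical name), it can be invoked with the feature set as its single argument:

Code:

# Hypothetical file name; pass 1 for the intel build or 2 for intel_omp
bash run_heg_tests.sh 1
bash run_heg_tests.sh 2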

svijay
Global Moderator
Posts: 74
Joined: Fri Aug 04, 2023 11:07 am

Re: Clarification on number of MPI ranks for VASP tests.

#5 Post by svijay » Thu Jan 04, 2024 2:55 pm

Dear Zhao,

Thank you for sharing the results of your tests and the scripts used to generate them! To expand a bit more on my previous message (and after speaking with my colleagues here at VASP): it appears that the test suite might pass for 2, 4, and 8 ranks, while 6 (and, as you have seen, 12) are most likely to fail because of the way NBANDS is determined based on the mode of parallelism used.
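
Roughly speaking, and assuming the simplest case (KPAR=1, NCORE=1, so that NBANDS is rounded up to a multiple of the MPI rank count), the effect can be sketched as follows; the reference value of 16 is purely hypothetical:

Code:

# Rough illustration only: how a hypothetical reference NBANDS of 16
# (generated with 4 ranks) would be rounded for other rank counts.
nbands_ref=16
for n in 2 4 6 8 12; do
    nbands=$(( (nbands_ref + n - 1) / n * n ))  # round up to a multiple of n
    echo "ranks=$n  NBANDS=$nbands"
done

In this sketch 2, 4, and 8 ranks keep NBANDS at 16, whereas 6 and 12 ranks change it to 18 and 24, so those runs no longer match the reference.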

Sudarshan

hszhao.cn@gmail.com
Full Member
Posts: 191
Joined: Tue Oct 13, 2020 11:32 pm

Re: Clarification on number of MPI ranks for VASP tests.

#6 Post by hszhao.cn@gmail.com » Fri Jan 05, 2024 2:19 am

Dear Sudarshan,

You wrote: "it appears that the test suite might pass for 2, 4, and 8 ranks, while 6 (and, as you have seen, 12) are most likely to fail because of the way NBANDS is determined based on the mode of parallelism used."

You only mentioned even core counts above. I also tested odd core counts at the same time, and for the example I used there was no problem with any count less than or equal to 12.

Regards,
Zhao

svijay
Global Moderator
Posts: 74
Joined: Fri Aug 04, 2023 11:07 am

Re: Clarification on number of MPI ranks for VASP tests.

#7 Post by svijay » Fri Jan 05, 2024 8:00 am

Dear Zhao,

Thanks for running these tests and for the clarification! For future reference, could you please share the makefile.include that you used to compile VASP?

Sudarshan
