VASP-6.4.3 NCCL
Posted: Tue Feb 25, 2025 1:15 am
by vladimir.ladygin
Dear VASP Developers,
I ran into an internal error while trying to run an NCCL setup with one core per GPU on NERSC Perlmutter. This is just an ordinary relaxation calculation.
"""internal error in: mpi.F at line: 903
M_init_nccl: Error in ncclCommInitRank
If you are not a developer, you should not encounter this problem.
Please submit a bug report.
"""
Kind Regards,
Vladimir
Re: VASP-6.4.3 NCCL
Posted: Tue Feb 25, 2025 9:48 am
by ferenc_karsai
Thanks for the report, I will try to reproduce the error on our machines.
Re: VASP-6.4.3 NCCL
Posted: Wed Feb 26, 2025 12:30 pm
by ferenc_karsai
I talked to a colleague, and he has seen a similar bug reported by another user on the forum.
Here is the original post:
https://www.vasp.at/forum/viewtopic.php?t=19822
For the moment, the workaround is not to use NCCL: compile without -DUSENCCL in makefile.include.
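As a minimal sketch of that workaround, the snippet below removes the -DUSENCCL token from the text of a makefile.include. The helper name strip_usenccl and the sample CPP_OPTIONS lines are hypothetical and only for illustration; they are not part of VASP, and the sample is not the actual Perlmutter makefile.include. After editing the real file, VASP has to be recompiled for the change to take effect.

```python
import re

# Hypothetical helper (not part of VASP): drop the -DUSENCCL preprocessor
# flag from the contents of a makefile.include, leaving other flags intact.
def strip_usenccl(makefile_text: str) -> str:
    # Match -DUSENCCL only as a whole token: it must start at whitespace or
    # at the beginning of the text, and must not continue into a longer
    # macro name. Trailing spaces or tabs after the flag are removed too.
    pattern = r"(?<!\S)-DUSENCCL(?![A-Za-z0-9_])[ \t]*"
    return re.sub(pattern, "", makefile_text)


# Illustrative input only; the real file will contain more options.
sample = (
    "CPP_OPTIONS = -DMPI \\\n"
    "              -D_OPENACC \\\n"
    "              -DUSENCCL -DUSENVTX\n"
)

if __name__ == "__main__":
    print(strip_usenccl(sample))
```

Editing the file by hand in a text editor works just as well; the point is only that -DUSENCCL has to disappear from the preprocessor options while the remaining flags stay untouched.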