A partial-node MPI job may fail to start when using mpiexec from intelmpi/2019.3 or intelmpi/2019.7, with error messages like:
[mpiexec@o0439.ten.osc.edu] wait_proxies_to_terminate (../../../../../src/pm/i_hydra/mpiexec/intel/i_mpiexec.c:532): downstream from host o0439 was killed by signal 11 (Segmentation fault)
[mpiexec@o0439.ten.osc.edu] main (../../../../../src/pm/i_hydra/mpiexec/mpiexec.c:2114): assert (exitcodes != NULL) failed
/var/spool/torque/mom_priv/jobs/11510761.owens-batch.ten.osc.edu.SC: line 30: 11728 Segmentation fault
/var/spool/slurmd/job00884/slurm_script: line 24: 3180 Segmentation fault (core dumped)
If you are using Slurm, make sure the job requests its CPU allocation with
#SBATCH --ntasks=N
instead of
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=N
A minimal example batch script follows.
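As an illustration, a partial-node batch script using the recommended allocation style might look like the sketch below. The job name, walltime, task count, and the executable name ./a.out are placeholders rather than values from the original report; adjust them for your own job, and note that the module load line assumes the matching compiler environment is already available.

#!/bin/bash
#SBATCH --job-name=partial_node_mpi   # placeholder job name
#SBATCH --ntasks=4                    # request 4 CPUs via --ntasks, not --nodes + --ntasks-per-node
#SBATCH --time=00:10:00               # placeholder walltime

# Load one of the affected Intel MPI versions mentioned above
module load intelmpi/2019.7

# Launch the MPI program; mpiexec typically takes the rank count from the Slurm allocation
mpiexec ./a.out

The key point is that the CPUs are requested with --ntasks alone, rather than the --nodes=1 plus --ntasks-per-node combination reported to trigger the segmentation fault.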