2016-10-05 16:22:19 UTC
Sorry about the incomplete message...
Does anyone have an idea about the following error? There are 15 idle
cores on that node.
$ /share/apps/siesta/openmpi-2.0.1/bin/mpirun --host compute-0-3 -np 2
/share/apps/siesta/siesta-4.0-mpi201/tpar/transiesta < A.fdf
There are not enough slots available in the system to satisfy the 2 slots
that were requested by the application:
Either request fewer slots for your application, or make more slots
available for use.

As briefly mentioned in this FAQ entry, slots are Open MPI's representation of how many processors are available on a given host. The default number of slots on any machine, if not explicitly specified, is 1 (e.g., if a host is listed in a hostfile but has no corresponding 'slots' keyword).
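One way to declare more than the default single slot is a hostfile entry with the 'slots' keyword. The following is only a sketch: the file name myhosts is hypothetical, and slots=2 is an assumption chosen to match the -np 2 in the failing command, not a value taken from the node's real configuration.

```shell
# Write a hostfile (hypothetical name: myhosts) that declares 2 slots
# on compute-0-3. The slot count is an assumption; set it to the number
# of cores you actually want Open MPI to use on that host.
cat > myhosts <<'EOF'
compute-0-3 slots=2
EOF

# The hostfile would then be passed to mpirun in place of --host, e.g.:
#   mpirun --hostfile myhosts -np 2 <program> < A.fdf
```

With the slot count stated explicitly, the request for 2 processes should no longer exceed what Open MPI believes the host offers.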

2014-11-03 12:54:29 UTC
Hi there,
We've started looking at moving to the openmpi 1.8 branch from 1.6 on our
CentOS6/Son of Grid Engine cluster and noticed an unexpected difference
when binding multiple cores to each rank.
Has openmpi's definition of 'slot' changed between 1.6 and 1.8? It used to
mean ranks, but now it appears to mean processing elements (see Details,
below).
PS: Also, the man page for 1.8.3 reports that '--bysocket' is deprecated,
but it doesn't seem to exist when we try to use it:
mpirun: Error: unknown option '-bysocket'
Type 'mpirun --help' for usage.
On 1.6.5, we launch with the following core binding options:
mpirun --bind-to-core --cpus-per-proc <n> <program>
mpirun --bind-to-core --bysocket --cpus-per-proc <n> <program>
where <n> is calculated to maximise the number of cores available to
use - I guess effectively
max(1, int(number of cores per node / slots per node requested)).
openmpi reads the file $PE_HOSTFILE and launches a rank for each slot
defined in it, binding <n> cores per rank.
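
The calculation of <n> described above can be sketched in shell as follows. This is an illustration only: the function name cpus_per_rank and the example core and slot counts are hypothetical, not taken from our job scripts.

```shell
# Hypothetical helper: cores to bind per rank, i.e.
# max(1, int(cores per node / slots per node requested)).
cpus_per_rank() {
    cores=$1
    slots=$2
    n=$(( cores / slots ))        # shell integer division truncates
    if [ "$n" -lt 1 ]; then
        n=1                       # never bind fewer than one core
    fi
    echo "$n"
}

cpus_per_rank 16 4    # 16 cores, 4 slots requested: 4 cores per rank
cpus_per_rank 8 12    # more slots than cores: falls back to 1
```

The result would then be substituted for <n> in the mpirun command lines.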
On 1.8.3, we've tried launching with the following core binding options
(which we hoped were equivalent):
mpirun -map-by node:PE=<n> <program>
mpirun -map-by socket:PE=<n> <program>
openmpi reads the file $PE_HOSTFILE and launches a factor of <n> fewer
ranks than under 1.6.5. We also notice that, where we wanted a single
rank on the box and <n> is the number of cores available, openmpi
refuses to launch and we get the message:
'There are not enough slots available in the system to satisfy the 1
slots that were requested by the application'
I think that error message needs a little work :)

