[torqueusers] problems with Open MPI

SCIPIONI Roberto SCIPIONI.Roberto at nims.go.jp
Sun Nov 16 20:46:44 MST 2008

Dear all,

I recently restored the /home directory on my cluster after it was damaged,
and it now looks like Open MPI jobs submitted through Torque no longer work, while standard LAM/MPI jobs run fine.

The error is:

[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Error in file runtime/orte_universe_exists.c at line 299
[slavenode2:11511] orte_init: could not contact the specified universe name default-universe-11511
[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_init_stage1.c at line 221
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_sds_base_contact_universe failed
  --> Returned value -12 instead of ORTE_SUCCESS

[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_system_init.c at line 42
[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_init.c at line 52
Open RTE was unable to initialize properly.  The error occured while
attempting to orte_init().  Returned value -12 instead of ORTE_SUCCESS.
I read somewhere that this could be due to the /tmp directory not being clean (Open MPI keeps its session directories there).

How do I purge the session files left over from Open MPI jobs that did not finish properly?
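
For reference, a minimal cleanup sketch, assuming a stock Open MPI install where the session directories live under /tmp (the exact directory names depend on the Open MPI version and on any TMPDIR setting, so adjust to what you actually see):

    # On each compute node, as the user who ran the jobs:
    # list any stale Open MPI session directories
    ls -d /tmp/openmpi-sessions-*

    # remove them, but only once no Open MPI job of yours is still running
    rm -rf /tmp/openmpi-sessions-*

Recent Open MPI releases also ship an orte-clean utility that tries to kill leftover ORTE processes and remove their session files; running it on every node (e.g. via ssh or pbsdsh) should have the same effect.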


