[torqueusers] problems with Open MPI

SCIPIONI Roberto SCIPIONI.Roberto at nims.go.jp
Sun Nov 16 20:46:44 MST 2008


Dear all,


I recently restored the /home directory on my cluster after it was damaged, and it now looks like Open MPI jobs submitted through Torque no longer work, while standard LAM-MPI jobs still do.


The error is:


[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Error in file runtime/orte_universe_exists.c at line 299
[slavenode2:11511] orte_init: could not contact the specified universe name default-universe-11511
[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_init_stage1.c at line 221
--------------------------------------------------------------------------
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_sds_base_contact_universe failed
  --> Returned value -12 instead of ORTE_SUCCESS

--------------------------------------------------------------------------
[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_system_init.c at line 42
[slavenode2:11511] [NO-NAME] ORTE_ERROR_LOG: Unreachable in file runtime/orte_init.c at line 52
--------------------------------------------------------------------------
Open RTE was unable to initialize properly.  The error occured while
attempting to orte_init().  Returned value -12 instead of ORTE_SUCCESS.
--------------------------------------------------------------------------
I read somewhere that this could be caused by stale Open MPI session files left in the /tmp directory on the compute nodes.


How do I purge the leftover files from Open MPI jobs that did not finish properly?
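For what it's worth, I was thinking of running something like the rough sketch below on each compute node (e.g. over ssh). I am assuming the stale files live in directories matching /tmp/openmpi-sessions-<user>* , which may not be the right pattern for my Open MPI version, so please correct me if that is wrong:

#!/usr/bin/env python
# Rough sketch: remove this user's stale Open MPI session directories from /tmp.
# Assumes (possibly incorrectly) that they match /tmp/openmpi-sessions-<user>*.
import getpass
import glob
import shutil

user = getpass.getuser()
for path in glob.glob('/tmp/openmpi-sessions-%s*' % user):
    print('removing %s' % path)
    shutil.rmtree(path, ignore_errors=True)

Is something along these lines safe to do, or is there a proper way to clean up after crashed jobs?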


Thanks

RS



