[torqueusers] torque post job file processing error

jacksond at supercluster.org
Mon Oct 18 10:09:58 MDT 2004


Carlos,

   The error PBSE_UNKRESC (code 15035 in your MOM log) occurs only when the
MOM cannot 'chdir' into the requested start directory.  OpenPBS had a
long-standing NULL pointer bug, identified last week by some torque users,
that could fail silently and launch some jobs in the wrong directory.
This is corrected in patch 3.  However, your cluster may have a subset of
compute nodes that lack the user's home directory or the requested launch
directory, and the torque fix may now be exposing that problem.

   Please confirm whether this is the case and let us know.
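
   One way to check for a missing directory, as a rough sketch and not a
torque tool: the Python below runs 'test -d' over ssh on each node and
reports the nodes where the directory is absent.  The helper names
(dir_exists_on_node, nodes_missing_dir), the node names, and the path are
all hypothetical, and the sketch assumes passwordless ssh to every node,
run as the affected user.

```python
import shlex
import subprocess
from typing import Callable, Iterable, List


def dir_exists_on_node(node: str, directory: str) -> bool:
    """Return True if `directory` exists on `node`, checked with `test -d` over ssh.

    Assumption: passwordless ssh to `node` works for the current user.
    An unreachable node also yields False, because ssh exits non-zero.
    """
    remote_cmd = "test -d " + shlex.quote(directory)
    result = subprocess.run(
        ["ssh", "-o", "BatchMode=yes", node, remote_cmd],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def nodes_missing_dir(
    nodes: Iterable[str],
    directory: str,
    check: Callable[[str, str], bool] = dir_exists_on_node,
) -> List[str]:
    """Return the nodes (in input order) on which `directory` was not found."""
    return [node for node in nodes if not check(node, directory)]
```

   For example, nodes_missing_dir(["node0", "node1"], "/home/carlos") would
list the nodes where that (hypothetical) home directory cannot be found.
The 'check' argument can be swapped for another probe, such as rsh, if ssh
is not what your cluster uses.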

Thanks,
Dave

On Sat, 16 Oct 2004 torqueusers at supercluster.org wrote:

> I'm testing torque 1.1.0p3 with maui 3.2.6p9 on one linux/amd64 node, and
> I have the problem that the standard output/error files won't get copied
> back to the user.
> I'm using the exact same mom_priv/config options that worked for me with
> torque 1.1.0p0, but now with 1.1.0p3 I get these error logs:
>
> on the server:
> 10/16/2004 15:17:32;0010;PBS_Server;Job;72.node0.cluster;Exit_status=0 resources_used.cput=00:00:59 resources_used.mem=3800kb resources_used.vmem=22460kb resources_used.walltime=00:01:01
> 10/16/2004 15:17:32;000d;PBS_Server;Job;72.node0.cluster;Post job file processing error; job 72.node0.cluster on host node0/0
>
> on the mom:
> 10/16/2004 15:17:32;0080;   pbs_mom;Job;72.node0.cluster;scan_for_terminated: task 1 terminated, sid 24807
> 10/16/2004 15:17:32;0008;   pbs_mom;Job;72.node0.cluster;Terminated
> 10/16/2004 15:17:32;0080;   pbs_mom;Job;72.node0.cluster;Obit sent
> 10/16/2004 15:17:32;0080;   pbs_mom;Req;req_reject;Reject reply code=15035( REJHOST=node0.cluster), aux=0, type=54, from PBS_Server@node0.cluster
>
>
> Looking through the source code I found that "code=15035" means "Unknown
> resource", which doesn't tell me much. Any help in finding what is
> causing this would be much appreciated.
>
> regards,
>
> Carlos

