[torqueusers] job dying immediately, 0 byte output file being produced

Garrick garrick at usc.edu
Tue Feb 23 09:03:52 MST 2010


Check syslog on the node?
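
A rough sketch of one way to do that: search the node's system log for the job ID. The helper name find_job_lines is made up for illustration and is not part of TORQUE. The log path is an assumption too; /var/log/messages and /var/log/syslog are common locations, depending on the distribution.

```shell
#!/bin/sh
# Hypothetical helper (not part of TORQUE): print every line of a log file
# that mentions the given job ID.
find_job_lines() {
    job_id="$1"     # e.g. 859604
    log_file="$2"   # assumed path, e.g. /var/log/messages
    # -F treats the job ID as a fixed string, not a regular expression.
    grep -F -- "$job_id" "$log_file"
}
```

On the compute node this might be run as `find_job_lines 859604 /var/log/messages`, probably as root if the log is not world-readable.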

If you want output, your batch script should print something.
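
The quoted script only changes directory and sends cat's output to /dev/null, so it writes nothing to stdout even when it runs normally. As a minimal sketch, assuming a POSIX sh script rather than the tcsh one quoted below, a small wrapper can announce each step so the output file shows how far the job got. The name run_step is hypothetical.

```shell
#!/bin/sh
# Sketch: echo a line before and after each command so the job's output
# file is never empty and records each command's exit status.
run_step() {
    echo "[$(date)] starting: $*"
    "$@"
    status=$?
    echo "[$(date)] finished with exit status $status: $*"
    return "$status"
}

# Example use inside a batch script (directory taken from the quoted job):
# run_step cd "$HOME/pbstest" || exit 1
```

With "#PBS -j oe" in effect, these lines land in the same file as any error messages, which makes it easier to tell which step failed.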

HPCC/Linux Systems Admin

On Feb 23, 2010, at 7:16 AM, Sabuj Pattanayek <sabujp at gmail.com> wrote:

> Hi all,
>
> Jobs going to our queue are being terminated immediately and a 0 byte
> output file is being produced. It's being terminated so quickly that
> we can't even checkjob the job. Here's the tracejob on the server:
>
> Job: 859604.pbsserver
>
> 02/23/2010 09:06:00  S    enqueuing into csb, state 1 hop 1
> 02/23/2010 09:06:00  S    Job Queued at request of user at pbsserver, owner = user at pbsserver, job name = urandom-nodes1ppn1-trigger.pbs, queue = csb
> 02/23/2010 09:06:01  S    Job Modified at request of root at pbsserver
> 02/23/2010 09:06:01  S    Job Run at request of root at pbsserver
> 02/23/2010 09:06:01  S    Job Modified at request of root at pbsserver
> 02/23/2010 09:06:01  S    Exit_status=1 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00 session_id=27799
>
> and on the node:
>
> Job: 859604.pbsserver
>
> 02/23/2010 09:06:01  M    scan_for_terminated: job 859604.pbsserver task 1 terminated, sid=27799
> 02/23/2010 09:06:01  M    job was terminated
> 02/23/2010 09:06:01  M    Job Modified at request of PBS_Server at pbsserver
> 02/23/2010 09:06:01  M    obit sent to server
>
> here was the job that was sent (this usually works without problems):
>
> #!/bin/tcsh
> # Beginning of PBS batch script.
> # Status/Progress EMails sent to
> #PBS -M sabujp at gmail.com
> # Email generated at b)eginning, a)bort, and e)nd of jobs
> #PBS -m bae
> # Nodes required (#nodes:#processors per node:CPU type)
> #PBS -l nodes=1:ppn=1
> # Total job memory required (specify how many megabytes)
> #PBS -l mem=512mb
> # You must specify Wall Clock time (hh:mm:ss) [Maximum allowed 30 days = 720:00:00]
> #PBS -l walltime=00:4:00
> # Output file
> # Send (join) both stderr and stdout to file in PBS -o line
> #PBS -j oe
>
>
> cd ~/pbstest
> cat /dev/urandom > /dev/null
>
> # End of PBS batch script.
>
> Any ideas?
>
> Thanks,
> Sabuj Pattanayek

