[torqueusers] job dying immediately, 0 byte output file being produced

Sabuj Pattanayek sabujp at gmail.com
Tue Feb 23 08:16:07 MST 2010


Hi all,

Jobs submitted to our queue are being terminated immediately, and a
0-byte output file is produced. Each job ends so quickly that we can't
even run checkjob on it. Here's the tracejob output on the server:

Job: 859604.pbsserver

02/23/2010 09:06:00  S    enqueuing into csb, state 1 hop 1
02/23/2010 09:06:00  S    Job Queued at request of user at pbsserver, owner = user at pbsserver, job name = urandom-nodes1ppn1-trigger.pbs, queue = csb
02/23/2010 09:06:01  S    Job Modified at request of root at pbsserver
02/23/2010 09:06:01  S    Job Run at request of root at pbsserver
02/23/2010 09:06:01  S    Job Modified at request of root at pbsserver
02/23/2010 09:06:01  S    Exit_status=1 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:00 session_id=27799

and on the node:

Job: 859604.pbsserver

02/23/2010 09:06:01  M    scan_for_terminated: job 859604.pbsserver task 1 terminated, sid=27799
02/23/2010 09:06:01  M    job was terminated
02/23/2010 09:06:01  M    Job Modified at request of PBS_Server at pbsserver
02/23/2010 09:06:01  M    obit sent to server
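
For anyone comparing against their own logs, the exit status can be
pulled out of the server-side tracejob text roughly as follows. This is
only a sketch: the function name extract_exit_status is made up for
illustration, and the sample line is copied from the server output above
rather than read from a live tracejob run.

```shell
#!/bin/sh
# Hypothetical helper (not part of TORQUE): print the first Exit_status
# value found in tracejob-style text read from stdin.
extract_exit_status() {
    sed -n 's/.*Exit_status=\(-\{0,1\}[0-9][0-9]*\).*/\1/p' | head -n 1
}

# Sample line copied from the server-side tracejob output in this post.
sample='02/23/2010 09:06:01  S    Exit_status=1 resources_used.cput=00:00:00'

status=$(printf '%s\n' "$sample" | extract_exit_status)
echo "exit status: $status"
```

For the sample line this prints "exit status: 1", matching the
Exit_status=1 reported by the server.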

Here is the job script that was submitted (it usually works without problems):

#!/bin/tcsh
# Beginning of PBS batch script.
# Status/Progress EMails sent to
#PBS -M sabujp at gmail.com
# Email generated at b)eginning, a)bort, and e)nd of jobs
#PBS -m bae
# Nodes required (#nodes:#processors per node:CPU type)
#PBS -l nodes=1:ppn=1
# Total job memory required (specify how many megabytes)
#PBS -l mem=512mb
# You must specify Wall Clock time (hh:mm:ss) [Maximum allowed 30 days = 720:00:00]
#PBS -l walltime=00:4:00
# Output file
# Send (join) both stderr and stdout to file in PBS -o line
#PBS -j oe


cd ~/pbstest
cat /dev/urandom > /dev/null

# End of PBS batch script.

Any ideas?

Thanks,
Sabuj Pattanayek

