[torqueusers] Processes die without any CPU time

Alberto Simões hashashin at gmail.com
Fri Jun 1 08:56:58 MDT 2007


Hi

I have a process that submits (via qsub) 150 smaller jobs. Of these,
98% finish without using any CPU time. For instance:

[ambs@search ~]$ tracejob 120737
/opt/torque/server_priv/accounting/20070601: Permission denied
/opt/torque/mom_logs/20070601: No such file or directory
/opt/torque/sched_logs/20070601: No such file or directory

Job: 120737.search.di.uminho.pt

06/01/2007 14:32:47  S    enqueuing into default, state 1 hop 1
06/01/2007 14:32:47  S    dequeuing from default, state QUEUED
06/01/2007 14:32:47  S    enqueuing into tcurtos, state 1 hop 1
06/01/2007 14:32:47  S    Job Queued at request of ambs@search.di.uminho.pt,
                          owner = ambs@search.di.uminho.pt, job name =
                          ambs#initmat108.sh, queue = tcurtos
06/01/2007 14:32:47  S    Job Modified at request of maui@search.di.uminho.pt
06/01/2007 14:32:47  S    Job Run at request of maui@search.di.uminho.pt
06/01/2007 14:32:47  S    Job Modified at request of maui@search.di.uminho.pt
06/01/2007 14:32:47  S    Exit_status=-2 resources_used.cput=00:00:00
                          resources_used.mem=0kb resources_used.vmem=0kb
                          resources_used.walltime=00:00:00
06/01/2007 14:32:47  S    Post job file processing error
06/01/2007 14:32:47  S    dequeuing from tcurtos, state COMPLETE
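
To check many jobs at once, the exit line of each trace can be parsed
mechanically. The following is only a minimal sketch: it assumes the
trace text has the same layout as the output above (key=value pairs with
no spaces around "="), and the helper names parse_job_fields and
ran_without_cpu are made up for illustration, not part of TORQUE.

```python
import re

# Matches tokens such as "Exit_status=-2" or "resources_used.cput=00:00:00".
# Entries like "owner = ambs" are skipped because of the spaces around "=".
FIELD_RE = re.compile(r"([A-Za-z_][\w.]*)=(\S+)")


def parse_job_fields(trace_text):
    """Return a dict of the key=value fields found in tracejob-style text."""
    return dict(FIELD_RE.findall(trace_text))


def ran_without_cpu(fields):
    """True if the job reported zero CPU time (assumed HH:MM:SS format)."""
    return fields.get("resources_used.cput") == "00:00:00"


# Excerpt copied from the trace of job 120737 shown above.
SAMPLE = """\
06/01/2007 14:32:47  S    Exit_status=-2 resources_used.cput=00:00:00
                          resources_used.mem=0kb resources_used.vmem=0kb
                          resources_used.walltime=00:00:00
06/01/2007 14:32:47  S    Post job file processing error
"""

fields = parse_job_fields(SAMPLE)
print(fields["Exit_status"], ran_without_cpu(fields))
```

Running this over the saved trace of each of the 150 jobs would give a
quick count of how many exited with the same status and no CPU time.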



Any hints on what might be the problem?
Thank you in advance,
Alberto

-- 
Alberto Simões