[torqueusers] stageout failed on submit_host

Beob Kyun Kim trugens at gmail.com
Mon Mar 2 19:07:23 MST 2009


Hello,

While trying to enable direct job submission from a trusted host to our local
cluster, I ran into a stageout error.
The message seems to be generated when the output file is copied back to
the submit_host.

> 03/03/2009 10:56:27;000d;PBS_Server;Job;160828.***;sending 'e' mail for job 160828.*** to trugens@*** (Exit_status=0
> 03/03/2009 10:56:27;000d;PBS_Server;Job;160828.***;[continued]resources_used.cput=00:00:00
> 03/03/2009 10:56:27;000d;PBS_Server;Job;160828.***;[continued]resources_used.mem=0k)
> 03/03/2009 10:56:27;000d;PBS_Server;Job;160828.***;[continued]
> 03/03/2009 10:56:27;0010;PBS_Server;Job;160828.***;Exit_status=0 resources_used.cput=00:00:00 resources_used.mem=0kb resources_used.vmem=0kb resources_used.walltime=00:00:01
> 03/03/2009 10:56:27;0009;PBS_Server;Job;160828.***;on_job_exit task assigned to job
> 03/03/2009 10:56:27;0009;PBS_Server;Job;160828.***;req_jobobit completed
> 03/03/2009 10:56:27;0004;PBS_Server;Svr;svr_connect;attempting connect to host 2528639686 port 15002
> 03/03/2009 10:56:27;0008;PBS_Server;Job;160828.***;JOB_SUBSTATE_EXITING
> 03/03/2009 10:56:27;0001;PBS_Server;Svr;PBS_Server;svr_setjobstate: setting job 160828.*** state from EXITING-EXITING to EXITING-STAGEOUT (5-51)
> 03/03/2009 10:56:27;0001;PBS_Server;Svr;PBS_Server;[continued]
> 03/03/2009 10:56:27;0008;PBS_Server;Job;160828.***;JOB_SUBSTATE_STAGEOUT
> 03/03/2009 10:56:27;0008;PBS_Server;Job;160828.***;about to copy stdout/stderr/stageout files
> 03/03/2009 10:56:27;0008;PBS_Server;Job;160828.***;copy request failed
>
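For anyone reading the log above, here is a minimal Python sketch that splits one server log record into its semicolon-separated fields and renders a numeric host, like the one in the svr_connect line, as a dotted-quad address. The field names and both function names are my own labels, not TORQUE terminology. The decoding assumes the integer is a plain big-endian IPv4 address, which I have not confirmed against the TORQUE source. The address used below is an illustrative value, not the one from my log.

```python
import ipaddress


def parse_log_line(line):
    """Split one log record into its six ';'-separated fields."""
    # maxsplit=5 keeps any ';' inside the free-text message intact.
    timestamp, event, daemon, obj_class, obj_name, message = line.split(";", 5)
    return {
        "timestamp": timestamp,
        "event": event,
        "daemon": daemon,
        "class": obj_class,
        "object": obj_name,
        "message": message,
    }


def decode_host(value):
    """Render an integer host as a dotted quad (assumes big-endian IPv4)."""
    return str(ipaddress.IPv4Address(value))


record = parse_log_line(
    "03/03/2009 10:56:27;0008;PBS_Server;Job;160828.***;copy request failed"
)
print(record["object"], "->", record["message"])

# Illustrative value only; substitute the integer from the svr_connect record.
print(decode_host(3232235777))
```

Decoding the host this way should show which machine the server tries to reach on port 15002 just before the copy request fails.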

These are the server parameters:

> #
> # Set server attributes.
> #
> set server scheduling = True
> set server acl_host_enable = False
> set server acl_hosts = ***
> set server managers = root@***
> set server operators = root@***
> set server default_queue = dteam
> set server log_events = 511
> set server mail_from = adm
> set server query_other_jobs = True
> set server scheduler_iteration = 600
> set server node_check_rate = 150
> set server tcp_timeout = 6
> set server default_node = lcgpro
> set server node_pack = False
> set server log_level = 7
> set server pbs_version = 2.3.0-snap.200801151629
> set server submit_hosts = ***
> set server allow_node_submit = False
>
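To compare the submission-related attributes at a glance, a dump like the one above can be loaded into a dictionary. This is only a sketch: parse_server_attrs is a hypothetical helper name, it assumes every attribute sits on its own "set server name = value" line (optionally quoted with ">"), and it collects repeated "+=" assignments into a list.

```python
def parse_server_attrs(text):
    """Collect 'set server name = value' lines into {name: [values]}."""
    attrs = {}
    for raw in text.splitlines():
        # Drop mail quote markers and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        if not line.startswith("set server "):
            continue  # skips comments such as '# Set server attributes.'
        name, _, value = line[len("set server "):].partition("=")
        # 'name +=' appends a value to an attribute, so strip the '+'.
        name = name.strip().rstrip("+").strip()
        attrs.setdefault(name, []).append(value.strip())
    return attrs


dump = """\
> set server acl_host_enable = False
> set server submit_hosts = ***
> set server allow_node_submit = False
"""

attrs = parse_server_attrs(dump)
for key in ("acl_host_enable", "submit_hosts", "allow_node_submit"):
    print(key, "=", attrs.get(key))
```

The three attributes printed here are the ones from my configuration that relate to which hosts may submit jobs.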

I have opened the TORQUE server port (15001) to the submit_host.
Does anyone have any ideas about this, or experience with it?

Thanks.

kyun