[torqueusers] torque/maui problem - deferred jobs "Invalid request (15004)"
Austin Godber
godber at mars.asu.edu
Fri Dec 23 17:44:24 MST 2005
Garrick Staples wrote:
> On Thu, Dec 22, 2005 at 01:13:46PM -0700, Austin Godber alleged:
>
>>And I get errors in torque/server_logs/20051222 like this:
>>
>>Invalid request (15004) in send_job, child failed in previous commit
>>request for job
>
> Which linux distro are you using? On RHEL3 x86_64 I've observed very
> slow binds when many are done quickly. It was easy to reproduce with
> something like this:
> seq 1 1000 | xargs -n 1 -i /usr/sbin/pbs_iff -t $pbsservername 15001
>
> Try it on a 32bit and 64bit host. It should finish in about 5 seconds.
> Does it fail on 64bit?
Thanks for your reply. I tried this out and saw rather sporadic
results: it would take anywhere between 5 and 60 seconds to complete,
regardless of host.
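For anyone repeating the test, here is a minimal sketch of a wrapper that times a run and counts failed invocations, so results from different hosts can be compared. It assumes a POSIX shell and a `date` that supports `+%s`; the helper name `time_runs` is hypothetical and not part of TORQUE.

```shell
# time_runs N CMD [ARGS...]
# Hypothetical helper: run CMD sequentially N times, then print the
# elapsed wall-clock seconds and the number of non-zero exits.
time_runs() {
    n=$1
    shift
    failures=0
    i=0
    start=$(date +%s)    # assumes date supports +%s (GNU and BSD date do)
    while [ "$i" -lt "$n" ]; do
        # Discard output; only the exit status is counted.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    end=$(date +%s)
    echo "elapsed_s=$((end - start)) failures=$failures"
}
```

With the command from the quoted message it would be called as `time_runs 1000 /usr/sbin/pbs_iff -t "$pbsservername" 15001`. Unlike the xargs pipeline, this also reports how many invocations exited non-zero.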
I managed to resolve my problem by restarting the pbs_moms on all of
the cluster nodes. It was very strange, but that seems to have fixed it.
Austin