[torqueusers] torque/maui problem - deferred jobs "Invalid request (15004)"

Austin Godber godber at mars.asu.edu
Fri Dec 23 17:44:24 MST 2005


Garrick Staples wrote:
> On Thu, Dec 22, 2005 at 01:13:46PM -0700, Austin Godber alleged:
> 
>>And I get errors in torque/server_logs/20051222 like this:
>>
>>Invalid request (15004) in send_job, child failed in previous commit
>>request for job
> 
> Which linux distro are you using?  On RHEL3 x86_64 I've observed very
> slow binds when many are done quickly.  It was easy to reproduce with
> something like this:
>   seq 1 1000 | xargs -n 1 -i /usr/sbin/pbs_iff -t $pbsservername 15001
> 
> Try it on a 32bit and 64bit host.  It should finish in about 5 seconds.
> Does it fail on 64bit?

Thanks for your reply. I tried this out and saw rather sporadic
results: it would take between 5 and 60 seconds to complete,
regardless of host.
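
For anyone who wants to compare hosts more systematically, here is a
minimal sketch of how the run could be timed in batches. The helper
name time_batch is made up for illustration and is not part of TORQUE.
It only wraps whatever command it is given and reports elapsed whole
seconds.

```shell
# Hypothetical helper (not part of TORQUE): run a command COUNT times
# in sequence and print the elapsed wall-clock time in whole seconds.
time_batch() {
    count=$1
    shift
    start=$(date +%s)
    i=0
    while [ "$i" -lt "$count" ]; do
        # Discard output; only the elapsed time matters here.
        "$@" >/dev/null 2>&1
        i=$((i + 1))
    done
    end=$(date +%s)
    echo $((end - start))
}
```

It could be called with the command from the test above, for example
time_batch 1000 /usr/sbin/pbs_iff -t "$pbsservername" 15001, and
repeated a few times on each host to see how much the numbers vary.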

I managed to resolve my problem by restarting the pbs_moms on all of
the cluster nodes.  It was very strange, but that seems to have fixed it.


Austin

