[torqueusers] PBS job issue

Abhishek Gupta abhig at Princeton.EDU
Fri Jan 16 09:37:24 MST 2009


Hi Steve,
You are right, it is an MPI job. I checked the nodes that were 
assigned to the job, and nothing was running on them. Even a job that 
should finish in a few seconds was completely stuck. Could you please 
tell me what I should do to solve this problem?
Thanks,
Abhishek.

Steve Young wrote:
> Hi,
>     I'm wondering if this is an MPI type of job? Did you make sure to 
> compile MPI to be TM-aware? How do you know the job is not actually 
> running somewhere? I've found that if you don't make MPI aware of 
> torque, the jobs end up on the nodes MPI assigns and don't run on 
> the nodes torque assigns. I ended up using OSC's version of mpiexec, 
> but using a version of MPI that can be compiled to be TM-aware would 
> do the same thing. This is just a guess without knowing what kind of 
> job you're running, what version of torque you have, how you have 
> things configured, and such. Hope this helps,
>
> -Steve
>
>
> On Jan 16, 2009, at 11:13 AM, Abhishek Gupta wrote:
>
>> Hi all,
>> I am facing a problem with job submission: my first job gets 
>> stuck forever (showing state R), but if I submit the same job again 
>> while the first one is still there, the second job runs without any 
>> problem. I found that the problem arises only when I ask for more 
>> than one node. With nodes=1:ppn=2 it runs without any problem, but 
>> nodes=2 does not work the first time. I also found that if another 
>> user's job requiring more than one node is already stuck, my own 
>> job requiring more than one node runs smoothly while that other 
>> user's job stays in that state forever.
>> Could someone tell me what the issue could be? Is there any 
>> parameter that needs to be set?
>> Thanks,
>> Abhishek.
>> _______________________________________________
>> torqueusers mailing list
>> torqueusers at supercluster.org
>> http://www.supercluster.org/mailman/listinfo/torqueusers
>
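
For reference, the check discussed in this thread (which nodes torque assigned to the job, versus where the MPI processes actually run) could be sketched as a small job script. This is only an illustration, not something posted by either participant. The job name and the helper function summarize_nodes are made up for the example. PBS_NODEFILE and PBS_JOBID are the environment variables torque sets inside a job.

```shell
#!/bin/sh
#PBS -l nodes=2
#PBS -N nodecheck

# summarize_nodes is a hypothetical helper, not part of torque.
# It prints each distinct host in a node file with its slot count.
# $1: path to a node file (one hostname per line, repeated once per slot).
summarize_nodes() {
    sort "$1" | uniq -c | awk '{ print $2 " " $1 }'
}

# Inside a job, torque sets PBS_NODEFILE; outside a job, print nothing.
if [ -n "${PBS_NODEFILE:-}" ] && [ -r "$PBS_NODEFILE" ]; then
    echo "Job ${PBS_JOBID:-unknown} started on $(hostname); assigned nodes:"
    summarize_nodes "$PBS_NODEFILE"
fi
```

Submitting this with qsub and comparing the hosts in the job's output file with the hosts where the MPI processes show up (for example with ps on each node) would show whether MPI is honoring torque's assignment.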

