[torqueusers] #PBS -V in version 2.5.10
sm4082 at nyu.edu
Mon Mar 19 17:37:14 MDT 2012
We are also having this problem. The serious issue with this version is that some PBS variables (PBS_JOBNAME, PBS_JOBID) are not being defined. That is why you don't see the err and out files (I am assuming the user has these variables in the PBS -e and -o directives). If you compiled Torque with --enable-syslog, you can see in the logs on the compute nodes that it can't create the files because the variables are undefined.
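To confirm from inside a job whether these variables are missing, something like the following could go near the top of a job script. This is only a minimal sketch: the function name check_pbs_vars is made up for illustration, and it only reports what the job's environment contains, not why.

```shell
#!/bin/sh
# check_pbs_vars (hypothetical helper): report which of the named
# environment variables are unset or empty.
# Prints one missing name per line; returns non-zero if any are missing.
check_pbs_vars() {
    missing=0
    for name in "$@"; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}

# Example call: list any missing variables on stderr so they show up
# in the job's error stream (if one can be created at all).
check_pbs_vars PBS_JOBNAME PBS_JOBID >&2 \
    || echo "warning: some PBS variables are undefined" >&2
```

If the error file itself cannot be created, redirecting the output to a file with an absolute path may be more useful than stderr.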
As a workaround, I asked users to specify absolute paths. For parallel jobs and array jobs, I source a script file through a wrapper. This script file defines PBS_NODEFILE, which is needed for parallel jobs, and the array ID, which is needed for array jobs.
Strangely, if I restart pbs_mom, it works fine for the user whose jobs had failed before. But after a while it happens all over again for a different user. I checked 2.5.11, and there are not many differences between it and 2.5.10, so I am not sure that upgrading to 2.5.11 would solve this problem.
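The sourced script can be sketched roughly as below. This is an illustration under assumptions, not the actual script: the function name set_pbs_fallbacks is hypothetical, PBS_ARRAYID is assumed to be the array index variable the job expects, and how the wrapper obtains the node file path and array index is site specific and not shown here.

```shell
#!/bin/sh
# set_pbs_fallbacks (hypothetical helper), meant to be defined in a file
# that the wrapper sources before the user's job script runs.
#   $1: path to a node list file (supplied by the wrapper)
#   $2: array index for array jobs (supplied by the wrapper, may be empty)
# Each value is applied only when the variable is unset or empty, so
# values that the batch system did define are left untouched.
set_pbs_fallbacks() {
    : "${PBS_NODEFILE:=${1:-}}"
    : "${PBS_ARRAYID:=${2:-}}"
    export PBS_NODEFILE PBS_ARRAYID
}
```

The wrapper would then call, for example, set_pbs_fallbacks "$nodefile" "$index" before starting the job, where both arguments are placeholders for values the site determines itself.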
Sent from my phone. Please excuse my brevity and any typos.
On Mar 19, 2012, at 18:42, Joseph Farran <jfarran at uci.edu> wrote:
> Hi Ken.
> Yes. One of our users has job arrays, and that is the person experiencing this problem. I deleted all jobs prior to upgrading.
> Is there something I forgot to clean out that needs cleaning?
> On 03/19/2012 03:32 PM, Ken Nielson wrote:
>> On Mon, Mar 19, 2012 at 4:21 PM, Joseph Farran <jfarran at uci.edu <mailto:jfarran at uci.edu>> wrote:
>> We were using Torque 2.5.9 and we were able to use the Torque PBS directive "#PBS -V" just fine.
>> On upgrading to Torque 2.5.10, the same scripts which used to work using "#PBS -V" no longer work.
>> When we submit a job using "#PBS -V", the job starts and nothing happens - no output, no errors, nothing. The job starts but nothing happens.
>> Looking at Torque logs /opt/torque/server_logs shows no errors - just the job starting and ending.
>> If we remove "#PBS -V", then the job runs just fine.
>> Has anyone else run into this, or does anyone know what is going on?
>> Did you have any array jobs in your queue when you upgraded?