[torqueusers] Fwd: wrong pbs server name

Samir Gartner jigzat at gmail.com
Thu May 21 12:20:39 MDT 2009


I think I'm gonna cry.... I love you guys!! No, seriously, it worked, but
only when executed as the root user. Now the question is: what did I do
wrong? Jobs should start automatically, right?
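
For anyone following along with the same symptom, here is a minimal sketch for listing the jobs that are still queued, so they can be tried one by one with qrun as suggested in the quoted reply below. The function name queued_jobs is made up for illustration, and it assumes the one-line-per-job qstat -a layout in which the job state is the 10th whitespace-separated field (a job name containing spaces would break that assumption).

```shell
#!/bin/sh
# queued_jobs (hypothetical helper): read "qstat -a" output on stdin and
# print the numeric sequence number of every job whose state column is Q.
# Assumption: state is the 10th field, as in the listings in this thread.
queued_jobs() {
    awk '$10 == "Q" { split($1, id, "."); print id[1] }'
}

# Possible usage (not run here; paths as used in this thread):
#   /opt/pbs/bin/qstat -a | queued_jobs
```

Only the sequence number is printed because qstat -a truncates the server part of the job ID in its first column.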

At first I was following the Globus Toolkit tutorial, but it is kinda
outdated, so I guess I issued some wrong instructions.

One of the weird things was that the tutorial suggested using the /opt/pbs
prefix when executing configure, and now under /opt/pbs I have another
/opt/pbs folder with repeated bin and sbin folders and executables. Is this
wrong, or is this how it is supposed to be?

2009/5/21 Ling C. Ho <ling at fnal.gov>

> Have you configured a scheduler?
>
> What if you use qrun? Would any job start?
>
> ...
> ling
>
> Samir Gartner wrote:
>
>> Ok, I don't see any file named default_server, but server_name has the
>> right server name, rufian.perrera.local, and there is another file with
>> the same content named server_name.new.
>>
>> Right now the PBS server name appears to be correct (after stopping the
>> server and manually deleting the zombie jobs), but still the jobs won't
>> start.
>>
>>
>> [samir at rufian ~]$ echo "sleep 30;date" | /opt/pbs/bin/qsub
>> [samir at rufian ~]$ /opt/pbs/bin/qstat -a
>>
>> rufian.perrera.local:
>>
>>                                                                          Req'd  Req'd   Elap
>> Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
>> -------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
>> 13.rufian.perrer     samir    batch    STDIN               --      1  --    --  01:00 Q   --
>> [samir at rufian ~]$
>>
>>
>> By the way, is top posting allowed?
>>
>> 2009/5/21 Jerry Smith <jdsmit at sandia.gov>
>>
>>
>>    Samir,
>>
>>    What do you have in $PBS_HOME/{server_name,default_server}?
>>
>>    It should be the name that resolves to the Ethernet address that PBS
>>    should be listening on.
>>
>>    --Jerry
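
A rough sketch of the check described above, for readers debugging the same thing: it reads the first line of a server_name file and flags an empty or loopback-style name, which is what the failing jobs in this thread were tagged with. The function name check_server_name and its return codes are invented for illustration, and the file path is passed in as an argument because the location of $PBS_HOME depends on the install.

```shell
#!/bin/sh
# check_server_name (hypothetical helper): inspect a server_name file.
# Returns 0 if the name looks usable, 1 if it is empty or a localhost
# name, 2 if the file cannot be read. It does not do any DNS lookup.
check_server_name() {
    file=$1
    if [ ! -r "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # Take the first line and strip whitespace.
    name=$(head -n 1 "$file" | tr -d '[:space:]')
    case "$name" in
        ""|localhost|localhost.*)
            echo "suspicious: '$name'"
            return 1
            ;;
        *)
            echo "ok: $name"
            return 0
            ;;
    esac
}

# Possible usage (not run here):
#   check_server_name "$PBS_HOME/server_name"
```

A name that passes this check still has to resolve to the right interface; that part has to be verified separately on the machine itself.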
>>
>>
>>
>>
>>    Samir Gartner wrote:
>>
>>        Ok, I finally installed TORQUE under Yellow Dog/ppc, but now I have
>>        another problem. I set up my PBS server as rufian.perrera.local,
>>        but when I submit a job it shows up under localhost.localdomain
>>        and stays in the queued state forever. And if I try to qdel the
>>        job, it can't reach the server and the connection times out. Any
>>        ideas about what could be wrong?
>>        I'm not trying to set up anything complicated; it is just one
>>        machine that works as both server and client.
>>
>>        This is the shell output:
>>
>>        [root at rufian bin]# /opt/pbs/bin/qstat -a
>>
>>        rufian.perrera.local:
>>
>>                                                                                 Req'd  Req'd   Elap
>>        Job ID               Username Queue    Jobname          SessID NDS   TSK Memory Time  S Time
>>        -------------------- -------- -------- ---------------- ------ ----- --- ------ ----- - -----
>>        7.localhost.loca     samir    batch    STDIN               --      1  --    --  01:00 Q   --
>>        8.localhost.loca     samir    batch    STDIN               --      1  --    --  01:00 Q   --
>>        9.localhost.loca     samir    batch    STDIN               --      1  --    --  01:00 Q   --
>>        10.localhost.loc     samir    batch    STDIN               --      1  --    --  01:00 Q   --
>>        [root at rufian bin]# /opt/pbs/bin/qdel 7.localhost.localdomain
>>        Connection timed out
>>        qdel: cannot connect to server localhost.localdomain (errno=110)
>>        Connection timed out
>>        You have new mail in /var/spool/mail/root
>>        [root at rufian bin]# /opt/pbs/bin/qdel 7.rufian.perrera.local
>>        qdel: Unknown Job Id 7.rufian.perrera.local
>>        [root at rufian bin]# su - samir
>>        [samir at rufian ~]$ /opt/pbs/bin/qdel 7.localhost.localdomain
>>        Connection timed out
>>        qdel: cannot connect to server localhost.localdomain (errno=110)
>>        Connection timed out
>>        [samir at rufian ~]$
>>
>>
>>
>>