[torqueusers] jobs terminated half way

RB. Ezhilalan (Principal Physicist, CUH) RB.Ezhilalan at hse.ie
Thu Oct 31 08:55:59 MDT 2013


Hi Ricardo,

Please see below answers to your questions:

"Is the cluster yours? Can you run the program outside torque? It's the
easiest way to know if it's torque or the program itself that aborted
the
task."

Yes, this mini cluster is solely ours. I ran the calculations on a
single PC without any problem, i.e. interactively, without involving Torque.
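
For reference, a minimal submission script for a single test run under
Torque, with the walltime and memory requested explicitly rather than taken
from the queue defaults, might look roughly like this (the script, executable
and input file names are placeholders only, not our actual files):

#!/bin/bash
#PBS -q long                  # submit to the 'long' queue defined in the config below
#PBS -l nodes=1:ppn=1         # one core on one node
#PBS -l walltime=24:00:00     # explicit walltime instead of the queue default
#PBS -l mem=2gb               # explicit memory request (placeholder value)
#PBS -N beamnrc_test          # job name (placeholder)

cd $PBS_O_WORKDIR             # run from the directory the job was submitted from
# Placeholder command line; the real BEAMnrc/DOSXYZnrc invocation goes here
./run_simulation input.egsinp > beamnrc_test.log 2>&1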

"Also can you print us your PBS_server configuration?"

I have printed the pbs_server configuration below; I hope this is the
right way to print it. There are six PCs in total, one of which is
dual-core.
Thanks again for your help.
*****************************************************
ezhil at linux-01:~/egsnrc_mp/dosxyznrc> qmgr -c 'p s'
#
# Create queues and set their attributes.
#
#
# Create and define queue long
#
create queue long
set queue long queue_type = Execution
set queue long resources_default.ncpus = 7
set queue long resources_default.nodes = 1
set queue long resources_default.walltime = 120:00:00
set queue long enabled = True
set queue long started = True
#
# Create and define queue batch
#
create queue batch
set queue batch queue_type = Execution
set queue batch resources_min.ncpus = 7
set queue batch resources_default.nodes = 1
set queue batch resources_default.walltime = 100:00:00
set queue batch enabled = True
set queue batch started = True
#
# Create and define queue short
#
create queue short
set queue short queue_type = Execution
set queue short resources_default.ncpus = 7
set queue short resources_default.nodes = 1
set queue short resources_default.walltime = 03:00:00
set queue short enabled = True
set queue short started = True
#
# Create and define queue medium
#
create queue medium
set queue medium queue_type = Execution
set queue medium resources_default.ncpus = 7
set queue medium resources_default.nodes = 6
set queue medium resources_default.walltime = 15:00:00
set queue medium enabled = False
set queue medium started = False
#
# Set server attributes.
#
set server scheduling = True
set server acl_hosts = linux-01
set server managers = ezhil at linux-01.physics
set server operators = ezhil at linux-01.physics
set server default_queue = batch
set server log_events = 511
set server mail_from = adm
set server scheduler_iteration = 600
set server node_check_rate = 150
set server tcp_timeout = 6
set server mom_job_sync = True
set server keep_completed = 300
set server auto_node_np = True
set server next_job_number = 2108
ezhil at linux-01:~/egsnrc_mp/dosxyznrc>   
************************************************************************
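
If the default memory allocation does turn out to matter, note that no
per-queue memory defaults or limits are set in the configuration above;
they could be added with qmgr along these lines (the 2gb/4gb values are
placeholders only, not recommendations):

qmgr -c "set queue long resources_default.mem = 2gb"
qmgr -c "set queue long resources_max.mem = 4gb"
qmgr -c "set queue batch resources_default.mem = 2gb"
qmgr -c "set queue batch resources_max.mem = 4gb"

Individual jobs can also request memory explicitly at submission time with
qsub -l mem=..., which takes precedence over the queue default.
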
Ezhilalan Ramalingam M.Sc.,DABR.,
Principal Physicist (Radiotherapy),
Medical Physics Department,
Cork University Hospital,
Wilton, Cork
Ireland
Tel. 00353 21 4922533
Fax.00353 21 4921300
Email: rb.ezhilalan at hse.ie 

-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of
torqueusers-request at supercluster.org
Sent: 31 October 2013 13:27
To: torqueusers at supercluster.org
Subject: torqueusers Digest, Vol 111, Issue 40



Today's Topics:

   1. Re: Require route queue (Ken Nielson)
   2. jobs terminated half way
      (RB. Ezhilalan (Principal Physicist, CUH))
   3. Re: jobs terminated half way (Ricardo Román Brenes)


----------------------------------------------------------------------

Message: 1
Date: Wed, 30 Oct 2013 14:02:56 -0600
From: Ken Nielson <knielson at adaptivecomputing.com>
Subject: Re: [torqueusers] Require route queue
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	
<CADvLK3dmvaDAak4TK+3qqep0YpHzPvP7Tan625q=47L251itVA at mail.gmail.com>
Content-Type: text/plain; charset="windows-1252"

I am going to look for that one too.


On Fri, Oct 25, 2013 at 3:02 PM, Andrus, Brian Contractor
<bdandrus at nps.edu>wrote:

>  Thanks.
>
> Wow. Why couldn't I find that? Must be Friday.
>
> Now I can deal with these users that are sneaking around specifying
> walltimes to try and get unlimited time.
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
>
> *From:* torqueusers-bounces at supercluster.org
> [mailto:torqueusers-bounces at supercluster.org] *On Behalf Of *Matt Britt
> *Sent:* Friday, October 25, 2013 1:47 PM
> *To:* Torque Users Mailing List
> *Subject:* Re: [torqueusers] Require route queue
>
> I haven't tested it, but in the queue attributes man page, there is the
> 'from_route_only' attribute.
>
>  - Matt
>
> --------------------------------------------
> Matthew Britt
> CAEN HPC Group - College of Engineering
> msbritt at umich.edu
>
> On Fri, Oct 25, 2013 at 4:41 PM, Andrus, Brian Contractor <
> bdandrus at nps.edu> wrote:
>
> All,
>
> Is there a way to have a queue ONLY allow jobs that are coming from a
> routing queue?
>
> Brian Andrus
> ITACS/Research Computing
> Naval Postgraduate School
> Monterey, California
> voice: 831-656-6238
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>


-- 
Ken Nielson
+1 801.717.3700 office +1 801.717.3738 fax
1712 S. East Bay Blvd, Suite 300  Provo, UT  84606
www.adaptivecomputing.com
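
For reference, the attribute Matt mentions is a per-queue setting in qmgr;
a minimal sketch (the queue name 'batch' here is just an example) would be:

qmgr -c "set queue batch from_route_only = True"

With that set, jobs can only enter the queue by being routed from a routing
queue, not by direct qsub submission.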

------------------------------

Message: 2
Date: Thu, 31 Oct 2013 10:50:42 -0000
From: "RB. Ezhilalan (Principal Physicist, CUH)" <RB.Ezhilalan at hse.ie>
Subject: [torqueusers] jobs terminated half way
To: <torqueusers at supercluster.org>
Message-ID:
	
<4659DE6B4825AD4F908C85260F0F2195273507 at ckvex001.south.health.local>
Content-Type: text/plain;	charset="us-ascii"

Hi Ricardo,

Thank you for looking at the log files. I noticed that the jobs get
terminated half way when the calculation time for each job is increased
(i.e. the number of histories). Could the default memory allocation be the
problem? I have used the default settings for the pbs_server. For your
info I am running BEAMnrc Monte Carlo simulations. Any suggestions?

Regards,
Ezhil
Ezhilalan Ramalingam M.Sc.,DABR.,
Principal Physicist (Radiotherapy),
Medical Physics Department,
Cork University Hospital,
Wilton, Cork
Ireland
Tel. 00353 21 4922533
Fax.00353 21 4921300
Email: rb.ezhilalan at hse.ie 
-----Original Message-----
From: torqueusers-bounces at supercluster.org
[mailto:torqueusers-bounces at supercluster.org] On Behalf Of
torqueusers-request at supercluster.org
Sent: 30 October 2013 18:59
To: torqueusers at supercluster.org
Subject: torqueusers Digest, Vol 111, Issue 39

------------------------------

Message: 1
Date: Tue, 29 Oct 2013 09:35:18 -0600
From: Ricardo Román Brenes <roman.ricardo at gmail.com>
Subject: Re: [torqueusers] jobs terminated half way
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	
<CAG-vK_xU5vNFROLhgOxG8en=aK67eeEVZcJqVjkV7DyXOje9iQ at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi again,

The only error I could read in the 6 logs you sent regarding those jobs
was this:

pbs_mom;Svr;pbs_mom;LOG_ERROR::Permission denied (13) in job_purge, Unlink of job file failed

I am not sure if this is an actual error, just an error in the logging, or
if this "permission denied" should abort your jobs. Maybe check the workdir
permissions.
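
A quick way to follow up on that suggestion is to check the permissions on
the job's working directory and on the MOM spool area that job_purge cleans
up; a sketch, assuming the default /var/spool/torque install path (adjust
if torque_home was set differently at build time):

ls -ld ~/egsnrc_mp/dosxyznrc                # working directory the jobs run from
ls -ld /var/spool/torque/mom_priv/jobs      # directory holding the per-job files pbs_mom unlinks
ls -l  /var/spool/torque/mom_priv/jobs      # the job files themselves, with owner and mode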

------------------------------

Message: 2
Date: Tue, 29 Oct 2013 09:54:17 -0700
From: Michael Jennings <mej at lbl.gov>
Subject: Re: [torqueusers] Problem building rpms torque-2.5.13
To: torqueusers at supercluster.org
Message-ID: <20131029165417.GA27774 at lbl.gov>
Content-Type: text/plain; charset=us-ascii

On Tuesday, 29 October 2013, at 15:40:04 (+0100),
Carles Acosta wrote:

> I am trying to build the rpms for the new torque 2.5.13 release.
> After applying the patch fix_mom_priv_2.5.patch, I use the following
> options:
> 
> # rpmbuild -ta --with munge --with scp --define 'torque_home
> /var/spool/pbs' --define 'torque_server XXXXXXX' --define 'acflags
> --enable-maxdefault --with-readline --with-tcp-retry-limit=2
> --disable-spool' torque-2.5.13.tar.gz
> 
> The process fails with the error:

This is a known issue which has already been fixed in Git.  Here's the
mailing list thread from September:

http://www.supercluster.org/pipermail/torquedev/2013-September/004587.html

Here's the pull request (with patch):

https://github.com/adaptivecomputing/torque/pull/183

Michael

-- 
Michael Jennings <mej at lbl.gov>
Senior HPC Systems Engineer
High-Performance Computing Services
Lawrence Berkeley National Laboratory
Bldg 50B-3209E        W: 510-495-2687
MS 050B-3209          F: 510-486-8615
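
For anyone applying the fix to the 2.5.13 tarball rather than building from
Git, one possible route (assuming the pull request above is still available
and applies cleanly) is to download the patch form of the pull request and
apply it to the unpacked source before repacking the tarball for rpmbuild:

wget https://github.com/adaptivecomputing/torque/pull/183.patch
tar xzf torque-2.5.13.tar.gz
cd torque-2.5.13 && patch -p1 < ../183.patch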


------------------------------

Message: 3
Date: Tue, 29 Oct 2013 21:03:45 +0100
From: "Carles Acosta (PIC)" <cacosta at pic.es>
Subject: Re: [torqueusers] Problem building rpms torque-2.5.13
To: Torque Users Mailing List <torqueusers at supercluster.org>
Cc: "torqueusers at supercluster.org" <torqueusers at supercluster.org>
Message-ID: <AA390888-DD39-4BC8-805E-CFF6D2A61334 at pic.es>
Content-Type: text/plain;	charset=us-ascii

Hi Michael,

Thank you very much!

Regards,

Carles 

On Oct 29, 2013, at 5:54 PM, Michael Jennings <mej at lbl.gov> wrote:
> On Tuesday, 29 October 2013, at 15:40:04 (+0100),
> Carles Acosta wrote:
> 
>> I am trying to build the rpms for the new torque 2.5.13 release.
>> After applying the patch fix_mom_priv_2.5.patch, I use the following
>> options:
>> 
>> # rpmbuild -ta --with munge --with scp --define 'torque_home
>> /var/spool/pbs' --define 'torque_server XXXXXXX' --define 'acflags
>> --enable-maxdefault --with-readline --with-tcp-retry-limit=2
>> --disable-spool' torque-2.5.13.tar.gz
>> 
>> The process fails with the error:
> 
> This is a known issue which has already been fixed in Git.  Here's the
> mailing list thread from September:
> 
> http://www.supercluster.org/pipermail/torquedev/2013-September/004587.html
> 
> Here's the pull request (with patch):
> 
> https://github.com/adaptivecomputing/torque/pull/183
> 
> Michael
> 
> -- 
> Michael Jennings <mej at lbl.gov>
> Senior HPC Systems Engineer
> High-Performance Computing Services
> Lawrence Berkeley National Laboratory
> Bldg 50B-3209E        W: 510-495-2687
> MS 050B-3209          F: 510-486-8615
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers


------------------------------

Message: 4
Date: Wed, 30 Oct 2013 16:00:26 +0100
From: Luca Nannipieri <nannipieri at pi.ingv.it>
Subject: [torqueusers] priority queue
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID: <52711F0A.2070407 at pi.ingv.it>
Content-Type: text/plain; charset=ISO-8859-15; format=flowed

I have 2 queues:

[root@ ~]# qstat -Q -f
Queue: default
     queue_type = Execution
     Priority = 50
     total_jobs = 1
     state_count = Transit:0 Queued:1 Held:0 Waiting:0 Running:0 Exiting:0
     mtime = 1383139362
     resources_assigned.nodect = 0
     enabled = True
     started = True

Queue: batch
     queue_type = Execution
     Priority = 20
     total_jobs = 1
     state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
     mtime = 1383139335
     resources_assigned.nodect = 1
     enabled = True
     started = True

default has Priority = 50 and batch has Priority = 20, but if I submit a
job to the default queue the scheduler puts it in "queued" status, even
though there is a running job from the batch queue and no running job from
the default queue, and it put the job submitted to the batch queue in
"queued" as well. Why?

-- 
Ing. Luca Nannipieri
Istituto Nazionale di Geofisica e Vulcanologia
Sezione di Pisa
Via della Faggiola, 32 - 56126 Pisa - Italy
Tel. +39 050 8311926
fax: +39 050 8311942
http://www.pi.ingv.it/chisiamo/paginepersonali/nannipieri.html
PEC: aoo.pisa at pec.ingv.it

------------------------------

Message: 5
Date: Wed, 30 Oct 2013 09:52:46 -0600
From: David Beer <dbeer at adaptivecomputing.com>
Subject: Re: [torqueusers] priority queue
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	
<CAFUQeZ1z-D9o23V69_bAue0svZ2dc7H0XsZjKh8zeWf=Opk0iA at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Luca,

The priority assigned by the queue is meant to be interpreted by the
scheduler you are using. Usually, having two jobs where job1 has a priority
of 20 and job2 has a priority of 50 means that both jobs are eligible to
run, but job1 should be evaluated to run before job2 (or the other way
around, depending on whether your scheduler treats lower or higher priority
numbers as running first).

In other words, the "queued" state simply means the job is eligible to be
run. Two jobs having the same state doesn't mean that they are equal
priority for running.

HTH

David


On Wed, Oct 30, 2013 at 9:00 AM, Luca Nannipieri
<nannipieri at pi.ingv.it>wrote:

> I have 2 queues:
>
> [root@ ~]# qstat -Q -f
> Queue: default
>      queue_type = Execution
>      Priority = 50
>      total_jobs = 1
>      state_count = Transit:0 Queued:1 Held:0 Waiting:0 Running:0 Exiting:0
>      mtime = 1383139362
>      resources_assigned.nodect = 0
>      enabled = True
>      started = True
>
> Queue: batch
>      queue_type = Execution
>      Priority = 20
>      total_jobs = 1
>      state_count = Transit:0 Queued:0 Held:0 Waiting:0 Running:1 Exiting:0
>      mtime = 1383139335
>      resources_assigned.nodect = 1
>      enabled = True
>      started = True
>
> default has Priority = 50 and batch has Priority = 20, but if I submit a
> job to the default queue the scheduler puts it in "queued" status, even
> though there is a running job from the batch queue and no running job
> from the default queue, and it put the job submitted to the batch queue
> in "queued" as well. Why?
>
> --
> Ing. Luca Nannipieri
> Istituto Nazionale di Geofisica e Vulcanologia
> Sezione di Pisa
> Via della Faggiola, 32 - 56126 Pisa - Italy
> Tel. +39 050 8311926
> fax: +39 050 8311942
> http://www.pi.ingv.it/chisiamo/paginepersonali/nannipieri.html
> PEC: aoo.pisa at pec.ingv.it
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>



-- 
David Beer | Senior Software Engineer
Adaptive Computing
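
As a concrete illustration of the point that the queue Priority attribute
is only advice to the external scheduler: if, for example, Maui were the
scheduler in use (an assumption; the post does not say which scheduler is
running), queue priorities would normally also be expressed in maui.cfg
rather than relying on the qmgr attribute alone, e.g.:

CLASSCFG[default] PRIORITY=50
CLASSCFG[batch]   PRIORITY=20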

------------------------------

Message: 6
Date: Wed, 30 Oct 2013 10:50:29 -0700 (PDT)
From: Eva Hocks <hocks at sdsc.edu>
Subject: [torqueusers] customizing xbpsmon
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	<Pine.GSO.4.30.1310301038510.7397-100000 at multivac.sdsc.edu>
Content-Type: TEXT/PLAIN; charset=US-ASCII


Anybody using xpbsmon? I would like to change the size of the cluster
frame.

I changed the height in the xpbsmonrc without success.

*nodeBoxFullMaxHeight:  1000
*nodeBoxMirrorMaxHeight:        1000
*serverBoxMaxHeight:    1000
*siteBoxMaxHeight:      1000


I also tried to change the same variable in the xpbsmon script with the
same result.

Any help appreciated
Thanks
Eva



------------------------------

Message: 7
Date: Wed, 30 Oct 2013 14:48:29 -0400
From: Kevin Van Workum <vanw at sabalcore.com>
Subject: [torqueusers] TCL scheduler
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	
<CAHom8ysghfYWq9TB8Si-cHw2StToD=qJTdQ7PMztB_LFcr+6Mw at mail.gmail.com>
Content-Type: text/plain; charset="us-ascii"

I'm curious if the TCL scheduler is still supported in 4.2.x? Trying to
build it throws lots of errors.

-- 
Kevin Van Workum, PhD
Sabalcore Computing Inc.
"Where Data Becomes Discovery"
http://www.sabalcore.com
877-492-8027 ext. 11

-- 


------------------------------

_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers


End of torqueusers Digest, Vol 111, Issue 39
********************************************


------------------------------

Message: 3
Date: Thu, 31 Oct 2013 07:16:42 -0600
From: Ricardo Román Brenes <roman.ricardo at gmail.com>
Subject: Re: [torqueusers] jobs terminated half way
To: Torque Users Mailing List <torqueusers at supercluster.org>
Message-ID:
	
<CAG-vK_xOMvJGVFt5GASQuqddTe5kG8X0n4umciH1J_y_QDMDCw at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Is the cluster yours? Can you run the program outside torque? It's the
easiest way to know if it's torque or the program itself that aborted the
task.

Also can you print us your PBS_server configuration?

------------------------------

_______________________________________________
torqueusers mailing list
torqueusers at supercluster.org
http://www.supercluster.org/mailman/listinfo/torqueusers


End of torqueusers Digest, Vol 111, Issue 40
********************************************

