From fcaba at uns.edu.ar Mon Aug 13 12:14:22 2012
From: fcaba at uns.edu.ar (Fernando Caba)
Date: Mon, 13 Aug 2012 15:14:22 -0300
Subject: [Mauiusers] Moving jobs from one node to another
Message-ID: <502943FE.2030607@uns.edu.ar>

Hi,

I would like to know about moving jobs from one node to another.
I need to do some maintenance on a node that has a certain number of
running jobs, and they cannot be killed.
Can I move all of those jobs (or specific ones) to another node (free or not)?
If yes, how?

Sorry for asking the same thing again. Is it a dumb question?

Regards

Fernando

--
----------------------------------------------------
Ing. Fernando Caba
Director General de Telecomunicaciones
Universidad Nacional del Sur
http://www.dgt.uns.edu.ar
Tel/Fax: (54)-291-4595166
Tel: (54)-291-4595101 int. 2050
Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina
----------------------------------------------------

From denismpa at gmail.com Mon Aug 13 12:17:23 2012
From: denismpa at gmail.com (Denis)
Date: Mon, 13 Aug 2012 15:17:23 -0300
Subject: [Mauiusers] Moving jobs from one node to another
In-Reply-To: <502943FE.2030607@uns.edu.ar>
References: <502943FE.2030607@uns.edu.ar>
Message-ID:

2012/8/13 Fernando Caba:
> Hi,
>
> I would like to know about moving jobs from one node to another.
> I need to do some maintenance on a node that has a certain number of
> running jobs, and they cannot be killed.
> Can I move all of those jobs (or specific ones) to another node (free or not)?
> If yes, how?
>
> Sorry for asking the same thing again. Is it a dumb question?

Hello, Fernando.

You cannot move a running job to another node.

That would be possible with Condor, if you link your code against its
libraries when compiling.

D.
>
> Regards
>
> Fernando
>
> --
> ----------------------------------------------------
> Ing. Fernando Caba
> Director General de Telecomunicaciones
> Universidad Nacional del Sur
> http://www.dgt.uns.edu.ar
> Tel/Fax: (54)-291-4595166
> Tel: (54)-291-4595101 int. 2050
> Avda. Alem 1253, (B8000CPB) Bahía Blanca - Argentina
> ----------------------------------------------------
>
> _______________________________________________
> mauiusers mailing list
> mauiusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/mauiusers
>

--
Denis Anjos,
www.versatushpc.com.br

From jkusznir at gmail.com Mon Aug 20 13:23:54 2012
From: jkusznir at gmail.com (Jim Kusznir)
Date: Mon, 20 Aug 2012 12:23:54 -0700
Subject: [Mauiusers] Node Priority
Message-ID:

Hi all:

I've recently updated my cluster and added some more nodes. I now
have three categories of nodes:

24 8-core intel nodes (features: intel)
6 16-core AMD nodes with infiniband (features: amd,infiniband)
4 64-core AMD nodes (features: amd, smp)

Some of my users don't submit a feature request and don't much care
where they get dumped; some users do supply the feature set.

I would like the priority of non-featured jobs that can go anywhere to
be in the order above. However, right now, they all go to my most
limited node type: the 64-core nodes.

How do I change the order when "it's all the same" to the scheduler?

Thanks!
--Jim

From akohlmey at cmm.chem.upenn.edu Mon Aug 20 14:17:28 2012
From: akohlmey at cmm.chem.upenn.edu (Axel Kohlmeyer)
Date: Mon, 20 Aug 2012 22:17:28 +0200
Subject: [Mauiusers] Node Priority
In-Reply-To:
References:
Message-ID:

On Mon, Aug 20, 2012 at 9:23 PM, Jim Kusznir wrote:
> Hi all:
>
> I've recently updated my cluster and added some more nodes.
> I now have three categories of nodes:
>
> 24 8-core intel nodes (features: intel)
> 6 16-core AMD nodes with infiniband (features: amd,infiniband)
> 4 64-core AMD nodes (features: amd, smp)
>
> Some of my users don't submit a feature request and don't much care
> where they get dumped; some users do supply the feature set.
>
> I would like the priority of non-featured jobs that can go anywhere to
> be in the order above. However, right now, they all go to my most
> limited node type: the 64-core nodes.
>
> How do I change the order when "it's all the same" to the scheduler?

Usually nodes are handed out in the reverse of the order in which they
are listed in the node file. Just try to order the nodes in that file
accordingly and see if that helps.

cheers,
axel.

>
> Thanks!
> --Jim

--
Dr. Axel Kohlmeyer akohlmey at gmail.com
http://sites.google.com/site/akohlmey/

Institute for Computational Molecular Science
Temple University, Philadelphia PA, USA.

From roy.dragseth at cc.uit.no Tue Aug 21 00:41:41 2012
From: roy.dragseth at cc.uit.no (Roy Dragseth)
Date: Tue, 21 Aug 2012 08:41:41 +0200
Subject: [Mauiusers] Node Priority
In-Reply-To:
References:
Message-ID: <6797159.rBhMz93LaZ@lux>

The simplest way is to reverse the node list in Torque.

You could also take a peek at the NODEALLOCATIONPOLICY parameter in Maui.
We have this

    NODEALLOCATIONPOLICY PRIORITY
    NODECFG[DEFAULT] PRIORITYF=JOBCOUNT

in our maui.cfg, which packs jobs onto the most loaded nodes, keeping as
many nodes free as possible. It might be possible to tweak this to your
liking.

r.

On Monday 20. August 2012 12.23.54 Jim Kusznir wrote:
> Hi all:
>
> I've recently updated my cluster and added some more nodes.
> I now have three categories of nodes:
>
> 24 8-core intel nodes (features: intel)
> 6 16-core AMD nodes with infiniband (features: amd,infiniband)
> 4 64-core AMD nodes (features: amd, smp)
>
> Some of my users don't submit a feature request and don't much care
> where they get dumped; some users do supply the feature set.
>
> I would like the priority of non-featured jobs that can go anywhere to
> be in the order above. However, right now, they all go to my most
> limited node type: the 64-core nodes.
>
> How do I change the order when "it's all the same" to the scheduler?
>
> Thanks!
> --Jim

--
The Computer Center, University of Tromsø, N-9037 TROMSØ Norway.
phone:+47 77 64 41 07, fax:+47 77 64 41 00
Roy Dragseth, Team Leader, High Performance Computing
Direct call: +47 77 64 62 56. email: roy.dragseth at uit.no

From pankaj.dorlikar at gmail.com Wed Aug 22 13:14:15 2012
From: pankaj.dorlikar at gmail.com (pankaj dorlikar)
Date: Thu, 23 Aug 2012 00:44:15 +0530
Subject: [Mauiusers] exception
Message-ID:

Hi,

How can we exclude a particular partition from a list of partitions?
The (!) syntax does not seem to be working.

We want:
a) Jobs from queue1 and queue2 must not be sent to partition3. They
   should only go to partition1 or partition2.
b) Jobs submitted through queue3 should go to partition3.

Part b is achieved, but part a is not working.

--
Pankaj V. Dorlikar

From marco.perosa at gmail.com Mon Aug 13 13:18:16 2012
From: marco.perosa at gmail.com (Marco Perosa)
Date: Mon, 13 Aug 2012 19:18:16 -0000
Subject: [Mauiusers] Maui 3.3.1 segfaults on MPBSNodeUpdate
In-Reply-To:
References:
Message-ID:

Hi,

I have a problem with Maui version 3.3.1, used in conjunction with
Torque version 2.5.9.
The problem seems to occur only when a large number of jobs (including
completed ones) is in the queue/state of a single node. In fact, the
output of 'qnodes node06' (the node in question in this case) is very,
very large ('qnodes node06 | wc -m' ---> 510372).

On the cluster that I administer, one of the users usually launches a
very large job array (2000 IDs). Because each job runs very quickly, it
can happen that all of the IDs are executed on a single node while the
other nodes are occupied by different jobs that take more time to
complete. This is how the situation described above can arise.

This is the debug output of the crash:

dmesg:
[262129.550823] maui[30562]: segfault at 7ffffffff000 ip 00007ffff7602845 sp 00007ffffffdbe68 error 6 in libc-2.11.3.so[7ffff74f8000+159000]

log:
07/27 17:07:30 INFO: PBS node node06 set to state Idle (free)
07/27 17:07:30 MNodeFind(node06,N)
07/27 17:07:30 MRMNodePreUpdate(node06,Idle,BUNET)
07/27 17:07:30 MPBSNodeUpdate(node06,node06,Idle,BUNET)
07/27 17:07:30 __MPBSIGetSSSStatus(node06,rectime=1343401619,varattr=,jobs=,state=free,netload=142294541287,gres=,loadave=0.00,ncpus=8,physmem=4059908kb,availmem=4991168kb,totmem=5104124kb,idletime=262568,nusers=0,nsessions=? 0,sessions=? 0,uname=Linux node06 2.6.32-5-amd64 #1 SMP Sun May 6 04:00:17 UTC 2012 x86_64,opsys=linux)

gdb:
Program received signal SIGSEGV, Segmentation fault.
0x00007ffff7602883 in ?? () from /lib/libc.so.6
(gdb) where
#0 0x00007ffff7602883 in ?? () from /lib/libc.so.6
#1 0x00000000004a46b9 in MPBSNodeUpdate (N=0x2345da0, PNode=, NState=, R=) at MPBSI.c:3171
#2 0x2b72657473616d40 in ?? ()
#3 0x6a392b32312b3230 in ?? ()
...

I think a size limit on one of the values involved is responsible, but
I'm not sure what the right way to avoid this problem would be.

Thank you for any help you may provide.
Ciao,
Marco

From daniel at dep.fem.unicamp.br Tue Aug 14 09:07:53 2012
From: daniel at dep.fem.unicamp.br (Daniel Lopes de Carvalho)
Date: Tue, 14 Aug 2012 15:07:53 -0000
Subject: [Mauiusers] TORQUE/MAUI Fairshare.
Message-ID: <502A6850.3020400@dep.fem.unicamp.br>

Hello.

I'm new to TORQUE/MAUI and I'm looking for help setting up a fairshare
policy. Could someone help me, please?

The scenario is the following:

One cluster with 15 work nodes (576 procs) and 5 queues with their
priorities and properties. All the queues have access to all the work
nodes.

queue    priority  property
prior    44        preemptor
tecno    43        preemptor
fast     42        preemptor
normal   41        preemptor
long     40        preemptee

The queue priorities are working as desired.

The issue is:

If UserA sends 576 jobs and the cluster is free, all 576 jobs will run
immediately. If UserB also sends 576 jobs, the last 288 jobs from UserA
will be suspended and the first 288 jobs from UserB will start
immediately. The remaining 288 jobs from UserB will be held until a
resource is free.

Summarizing: I would like to balance execution between users, regardless
of the queue and the users' group.

Thanks and best regards

Daniel

--
Daniel Lopes de Carvalho
daniel at dep.fem.unicamp.br
http://www.unisim.cepetro.unicamp.br
19 3521-1221
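A possible starting point for the per-user balancing asked about above is
Maui's fairshare and throttling settings. The fragment below is only a
sketch for maui.cfg: every value is an illustrative assumption rather
than a tuned recommendation, and the 288-processor cap simply reflects
half of the 576 processors mentioned in the message. Fairshare influences
the priority of queued jobs; it does not by itself suspend jobs that are
already running.

```
# maui.cfg (sketch; all values are illustrative assumptions)

# Track usage as dedicated processor-seconds.
FSPOLICY         DEDICATEDPS
# Keep 7 fairshare windows of 24 hours each.
FSDEPTH          7
FSINTERVAL       24:00:00
# Each older window counts 80% as much as the next newer one.
FSDECAY          0.80

# Let fairshare contribute to job priority, per user.
FSWEIGHT         1
FSUSERWEIGHT     10

# Default per-user fairshare target (percent) and a hard cap on
# processors in use by any single user at one time.
USERCFG[DEFAULT] FSTARGET=25.0 MAXPROC=288
```

The MAXPROC cap is the blunt part of this sketch: it would keep one user
from filling the whole cluster even when nobody else is waiting, so it
may be worth testing the fairshare weights alone first.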
From s.prabhakaran at grs-sim.de Thu Aug 16 07:56:15 2012
From: s.prabhakaran at grs-sim.de (Suraj Prabhakaran)
Date: Thu, 16 Aug 2012 13:56:15 -0000
Subject: [Mauiusers] Torque Maui Communication during job submission
Message-ID:

Hello,

I have been looking into Torque and Maui communication for some days,
and I have a question regarding job submission. During a qsub command,
does Maui get the information about the submission only from Torque, or
does it also get it directly from the client?

Again, any pointers to Torque-Maui documentation with more detail would
be very helpful!

Best regards,
Suraj

From chenweiguang82 at 163.com Sat Aug 25 20:41:54 2012
From: chenweiguang82 at 163.com (chenweiguang82)
Date: Sun, 26 Aug 2012 02:41:54 -0000
Subject: [Mauiusers] Suspended jobs can not resume when preemptible jobs completed
Message-ID: <5081218c.7b0.13960ce3660.Coremail.chenweiguang82@163.com>

Hi,

Our Maui scheduler uses a preemption policy. When I submitted
high-priority jobs, the last job was suspended and the preemptible jobs
started. I found that the suspended jobs still exist in memory, so there
is not enough free memory for the new preemptible jobs. Is there a way
to have suspended jobs stored on disk instead of occupying memory?
After the preemptible jobs completed, the suspended jobs were still in
the suspended state. I would like the suspended jobs to resume
automatically once the preemptible jobs have completed. How can I do
that?

Best wishes,
Weiguang Chen

--
************************************************
# Chen, Weiguang
#
# Postgraduate, Ph. D
# 75 University Road, Physics Building # 218
# School of Physics & Engineering
# Zhengzhou University
# Zhengzhou, Henan 450052 CHINA
#
# Tel: 86-13203730117;
# E-mail:chenweiguang82 at 163.com
#**********************************************
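For the memory concern raised above, one option that may be worth testing
is to have Maui requeue preempted jobs instead of suspending them, so
that they do not stay resident while higher-priority work runs. The
maui.cfg fragment below is a sketch under assumptions: the class names
"low" and "high" are hypothetical, and a requeued job starts again from
the beginning unless the application checkpoints itself.

```
# maui.cfg (sketch; class names are hypothetical)

# Requeue preempted jobs rather than suspending them in place.
PREEMPTPOLICY   REQUEUE

# Jobs in class "low" may be preempted; jobs in class "high" may preempt.
CLASSCFG[low]   FLAGS=PREEMPTEE
CLASSCFG[high]  FLAGS=PREEMPTOR
```

This trades lost progress for freed memory, so it only suits jobs that
are safe to rerun. It does not address why suspended jobs fail to resume
under the SUSPEND policy.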