[torqueusers] Qstat reporting false node use
Clevenger, Kevin
KClevenger at coh.org
Wed Apr 11 12:38:56 MDT 2007
Hi,
When running multiple NAMD jobs on the cluster (Rocks 4.2.1), we see qstat -n report that the jobs start on separate nodes, but cluster-ps shows that the processes are in fact all running on the same nodes (c-0-0 through c-0-7). Does anyone know why this happens and how to straighten it out? Output below.
Thanks
Kevin
###################################################
$ qstat -n
cluster.coh.org:
                                                                   Req'd  Req'd   Elap
Job ID               Username Queue    Jobname    SessID NDS   TSK Memory Time  S Time
-------------------- -------- -------- ---------- ------ ----- --- ------ ----- - -----
153.cluster.coh.org  bob      longrun  eq32.submi   5042     8   1     -- 1000: R 00:23
    c-0-24+c-0-24+c-0-23+c-0-23+c-0-22+c-0-22+c-0-21+c-0-21+c-0-20+c-0-20+c-0-19
    +c-0-19+c-0-18+c-0-18+c-0-17+c-0-17
154.cluster.coh.org  bob      longrun  eq08.submi  32618     4   1     -- 1000: R 00:22
    c-0-16+c-0-16+c-0-15+c-0-15+c-0-14+c-0-14+c-0-13+c-0-13
155.cluster.coh.org  bob      longrun  TAK779-eq0   1383     4   1     -- 1000: R 00:18
    c-0-12+c-0-12+c-0-11+c-0-11+c-0-10+c-0-10+c-0-9+c-0-9
~~~~~~~~~~~~~~~~~~~~
$ cluster-ps vaidsimpl
c-0-0:
bob 5148 47.2 9.4 215432 195080 ? R 11:03 12:14 /home/bob/vaidsimpl /home/bob/STAT3 eq32.namd
bob 5168 40.8 5.4 126060 112676 ? R 11:03 10:34 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 5207 26.4 2.1 56656 44536 ? R 11:04 6:42 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 5216 31.5 3.6 90228 74756 ? S 11:04 7:59 /home/bob/vaidsimpl /home/bob/CCR2APO/MD eq08-con.namd
bob 5287 29.6 3.5 88252 72392 ? R 11:08 6:17 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD eq08-con.namd
bob 5295 27.0 2.2 57572 45428 ? S 11:08 5:43 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-1:
bob 4307 40.8 5.5 127340 113164 ? R 11:03 10:35 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4313 38.9 5.3 123800 110232 ? R 11:03 10:06 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4357 29.8 2.3 61648 49316 ? S 11:04 7:35 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4361 29.6 2.3 61464 49196 ? S 11:04 7:31 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4427 24.8 2.2 57520 45452 ? S 11:08 5:16 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 4439 28.5 2.4 61528 50088 ? R 11:08 6:03 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-2:
bob 3449 45.0 5.8 135184 120840 ? S 11:03 11:42 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3450 45.6 5.8 135752 121192 ? R 11:03 11:51 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3495 30.2 2.4 63072 50484 ? S 11:04 7:41 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3499 30.0 2.4 62080 49692 ? S 11:04 7:38 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3572 26.1 2.3 59872 48448 ? R 11:08 5:32 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 3576 26.1 2.3 58600 47340 ? S 11:08 5:33 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-3:
bob 4699 44.5 5.7 132996 118492 ? S 11:03 11:35 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4718 46.3 5.7 131752 117528 ? S 11:03 12:03 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4763 29.2 2.2 58268 46084 ? R 11:04 7:26 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4767 29.4 2.4 61616 49428 ? S 11:04 7:30 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4841 28.3 2.3 60092 47744 ? R 11:08 6:01 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 4845 25.4 2.1 57012 44872 ? R 11:08 5:25 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-4:
bob 4077 45.1 5.8 135304 120804 ? S 11:03 11:46 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4081 44.9 5.8 134500 120244 ? S 11:03 11:44 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4126 29.2 2.4 62180 49728 ? S 11:04 7:27 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4130 30.2 2.3 61008 48688 ? R 11:04 7:43 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4194 24.1 2.2 57740 45532 ? S 11:08 5:09 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 4208 27.2 2.3 61304 48748 ? R 11:08 5:49 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-5:
bob 3971 42.6 5.5 128356 114548 ? R 11:03 11:08 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3991 42.8 5.6 130088 116216 ? R 11:03 11:11 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 4036 30.8 2.4 62748 50392 ? R 11:04 7:53 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4040 29.9 2.4 62376 49828 ? R 11:04 7:39 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 4108 26.1 2.2 59388 47088 ? R 11:08 5:34 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 4118 26.7 2.3 61724 49184 ? R 11:08 5:41 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-6:
bob 3881 46.5 5.6 130016 115668 ? S 11:03 12:11 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3885 43.2 5.3 124064 110516 ? S 11:03 11:19 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3913 29.5 2.3 60320 47860 ? R 11:04 7:35 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3933 28.2 2.1 57148 44992 ? S 11:04 7:15 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3999 27.2 2.3 61412 48792 ? S 11:08 5:51 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 4011 26.0 2.1 56988 44844 ? S 11:08 5:34 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-7:
bob 3789 46.1 5.8 134716 121036 ? R 11:03 12:05 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3792 45.7 5.8 134084 119676 ? S 11:03 11:58 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
bob 3837 30.1 2.4 62072 49784 ? R 11:04 7:43 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3841 24.3 2.1 55472 43388 ? R 11:04 6:14 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
bob 3903 26.8 2.3 60792 48452 ? R 11:08 5:45 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
bob 3919 27.9 2.3 61240 48856 ? R 11:08 5:58 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-8:
c-0-9:
c-0-10:
c-0-11:
c-0-12:
bob 1414 0.0 0.0 5848 764 ? S 11:08 0:00 /home/bob/vaidsim ++remote-shell ssh ++nodelist /share/data/etc/nodelist +p16 /home/bob/vaidsimpl /home/bob/CCR2TAK779/MD/eq08-con.namd
c-0-13:
c-0-14:
c-0-15:
c-0-16:
bob 32649 0.0 0.0 5848 764 ? S 11:04 0:00 /home/bob/vaidsim ++remote-shell ssh ++nodelist /share/data/etc/nodelist +p16 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
c-0-17:
c-0-18:
c-0-19:
c-0-20:
c-0-21:
c-0-22:
c-0-23:
c-0-24:
bob 5069 0.0 0.0 5848 764 ? S 11:03 0:00 /home/bob/vaidsim ++remote-shell ssh ++nodelist /share/data/etc/nodelist +p16 /home/bob/vaidsimpl /home/bob/STAT3/eq32.namd
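~~~~~~~~~~~~~~~~~~~~
To make the mismatch easier to see, here is a minimal Python sketch that compares the node list qstat -n reports for a job against the nodes where cluster-ps actually shows matching processes. The function names (parse_exec_hosts, parse_cluster_ps, compare) are made up for illustration, and the parsing assumes the output looks like the listings above: a "+"-separated host list, and cluster-ps sections introduced by a "nodename:" line. It is not part of Torque or Rocks.

```python
def parse_exec_hosts(text):
    """Turn a qstat -n host list like 'c-0-16+c-0-16+c-0-15' into a set of node names."""
    hosts = set()
    # The list may wrap across lines, so join the lines before splitting on '+'.
    for token in text.replace("\n", "").split("+"):
        token = token.strip()
        if token:
            # Assumption: some qstat versions print 'node/cpu'; keep only the node part.
            hosts.add(token.split("/")[0])
    return hosts


def parse_cluster_ps(text, pattern):
    """Return the set of nodes whose cluster-ps section has a line containing pattern."""
    nodes = set()
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        # Assumption: a section header is a single word ending in ':' (e.g. 'c-0-0:').
        if stripped.endswith(":") and " " not in stripped:
            current = stripped[:-1]
        elif current is not None and pattern in stripped:
            nodes.add(current)
    return nodes


def compare(reported, running):
    """List nodes running the job without being reported, and reported nodes with no process."""
    return {
        "unexpected": sorted(running - reported),
        "idle": sorted(reported - running),
    }


if __name__ == "__main__":
    # Shortened sample taken from job 154 in the output above.
    qstat_hosts = "c-0-16+c-0-16+c-0-15+c-0-15+c-0-14+c-0-14+c-0-13+c-0-13"
    ps_output = """c-0-0:
bob 5207 26.4 2.1 56656 44536 ? R 11:04 6:42 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
c-0-1:
bob 4357 29.8 2.3 61648 49316 ? S 11:04 7:35 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
c-0-13:
c-0-16:
bob 32649 0.0 0.0 5848 764 ? S 11:04 0:00 /home/bob/vaidsim ++remote-shell ssh ++nodelist /share/data/etc/nodelist +p16 /home/bob/vaidsimpl /home/bob/CCR2APO/MD/eq08-con.namd
"""
    reported = parse_exec_hosts(qstat_hosts)
    running = parse_cluster_ps(ps_output, "CCR2APO")
    print(compare(reported, running))
```

The pattern is matched against the whole process line, so the launcher process on the first reported node counts as "running" there too; pick a pattern that identifies the job, such as its working directory.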