Bugzilla – Bug 113
munge support (AlternateUserAuthentication) has still some issues on torque-2.5.4
Last modified: 2013-02-13 18:16:50 MST
When munged is not running on a host or node, torque creates an empty file in the credentials directory and doesn't close its open fd, which can eventually cause the torque process to run out of file descriptors. The error seems to be in req_getcred.c, line 257+: I guess a "close(fd);unlink(mungeFileName);" is missing.

The bigger problem is that even when munge is disabled on a node (with node submit allowed), a user can still submit or query:

[zrshj01@n010102 ~]$ qsub -l nodes=1:ppn=24 test.sh
munge: Error: Unable to access "/var/run/munge/munge.socket.2": No such file or directory
2484465.icmu03

02/16/2011 16:32:09;0100;PBS_Server;Req;;Type AlternateUserAuthentication request received from zrshj01@n010102, sock=12
02/16/2011 16:32:09;0080;PBS_Server;Req;req_reject;Reject reply code=15021(Invalid credential MSG=cannot authenticate), aux=0, type=AlternateUserAuthentication, from zrshj01@n010102
02/16/2011 16:32:09;0100;PBS_Server;Req;;Type QueueJob request received from zrshj01@n010102, sock=12
02/16/2011 16:32:09;0100;PBS_Server;Req;;Type JobScript request received from zrshj01@n010102, sock=12
02/16/2011 16:32:09;0100;PBS_Server;Req;;Type Commit request received from zrshj01@n010102, sock=12
02/16/2011 16:32:09;0100;PBS_Server;Job;2484465.icmu03;enqueuing into user, state 1 hop 1
02/16/2011 16:32:09;0100;PBS_Server;Job;2484465.icmu03;dequeuing from user, state QUEUED
02/16/2011 16:32:09;0100;PBS_Server;Job;2484465.icmu03;enqueuing into tue-short, state 1 hop 1
02/16/2011 16:32:09;0008;PBS_Server;Job;2484465.icmu03;Job Queued at request of zrshj01@n010102, owner = zrshj01@n010102, job name = test.sh, queue = tue-short

Juergen Hennerich
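
For illustration, the cleanup suggested for the descriptor leak could look roughly like the sketch below. This is not the actual code from req_getcred.c: the function name write_credential, its parameters, and the use of a shell command to stand in for the munge call are all assumptions made for the example. The point it shows is that the temporary credential file's fd is closed on every path, and the file is unlinked whenever the encoder fails or produces no output.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not TORQUE's real function): creates a temporary
 * credential file from path_template (a mkstemp-style template ending in
 * XXXXXX), runs encoder_cmd with its stdout redirected into that file,
 * and returns 0 on success or -1 on failure. */
int write_credential(const char *encoder_cmd, char *path_template)
{
    int fd = mkstemp(path_template);
    if (fd < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        /* Could not start the encoder: release the fd and the empty file. */
        close(fd);
        unlink(path_template);
        return -1;
    }
    if (pid == 0) {
        /* Child: send the encoder's stdout into the credential file. */
        dup2(fd, STDOUT_FILENO);
        close(fd);
        execl("/bin/sh", "sh", "-c", encoder_cmd, (char *)NULL);
        _exit(127);
    }

    int status = 0;
    struct stat st;
    int ok = waitpid(pid, &status, 0) == pid
          && WIFEXITED(status) && WEXITSTATUS(status) == 0
          && fstat(fd, &st) == 0 && st.st_size > 0;

    /* Always close the descriptor, whether or not the encoder succeeded. */
    close(fd);

    if (!ok) {
        /* Encoder failed or wrote nothing: do not leave an empty file behind. */
        unlink(path_template);
        return -1;
    }
    return 0;
}
```

With a failing encoder (for example, the daemon socket is missing), the call returns -1 and leaves neither an open fd nor an empty file; with a working encoder, the file remains for the caller to read and remove.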
This was fixed somewhere between 2.5.9 and 2.5.12. Please close the ticket, gentlemen.
Resolved per Lucasz' recommendation.