[torqueusers] Torque 2.5.0 beta on Cygwin

Felix Wolfheimer Felix.Wolfheimer at cst.com
Wed Jul 7 08:09:24 MDT 2010

Hi Torque Team,

Thank you for uploading the beta of Torque 2.5.0! I've just downloaded it and tried to install it on a Linux cluster as well as on a cluster running Windows. While the installation on the Linux cluster worked out of the box, I have some problems getting the system to work on Windows using Cygwin (version 1.7.5). As a first step on the Windows side, I'm not working on the Windows cluster but locally on my desktop machine, which I use as a test environment to gain some experience with Torque on Windows. The setup of the machine is as follows: it is running Windows XP (x64) and is a member of a Windows domain, and I usually work on it using a domain account. However, to keep things as simple as possible, I decided to run the Torque services under the local user account "Administrator". I wanted to install all components (pbs_server, pbs_sched and pbs_mom) on this machine and then submit jobs locally using my normal user account, which is defined in the domain.

All Torque services start without any problem. However, as soon as I submit a job using my normal user account, the job ends immediately, and the stdout and stderr files each contain only one line: "shell "/bin/bash" is not executable by user "FelixWolfheimer"" (stdout) and "PBS: exec of shell '/bin/bash' failed" (stderr). This is very strange, as I can use the bash shell as this user. I guess this has something to do with user permissions, but after hours of trying to figure out what is happening I'm quite clueless.
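
A minimal sketch of the kind of check meant by "I can use the bash shell as this user", run from that user's own Cygwin shell. The helper name check_shell is purely illustrative and not part of Torque. It only tests execute permission for the calling user, which is not necessarily the same test pbs_mom performs on the user's behalf.

```shell
# Hypothetical helper (not part of Torque): report whether the current
# user has execute permission on the given shell (default: /bin/bash).
check_shell() {
    shell_path="${1:-/bin/bash}"
    if [ -x "$shell_path" ]; then
        echo "executable: $shell_path"
    else
        echo "NOT executable: $shell_path"
        return 1
    fi
}

# Typical manual checks to run as the submitting user:
#   id                 # uid, gid and groups Cygwin assigns to the account
#   ls -l /bin/bash    # owner, group and mode bits of the shell
#   check_shell /bin/bash
```

Comparing the output of "id" with the owner and group shown by "ls -l" indicates which permission bits apply to the account.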

This is what I did so far:

Logged in as "Administrator" (using the local Admin account to avoid any complications with the domain setup).
Downloaded and installed the Cygwin packages mentioned in README.cygwin.

Downloaded Torque and configured it using the following options:
configure --disable-gui \
          --disable-gcc-warnings \
          --enable-force-nodefile \
          --enable-static \
          --enable-shared \
          --with-rcp=scp \
          --disable-unixsockets
make and make install then ran fine and gave no errors or warnings.

The setup of passwordless ssh for Administrator and my normal user account worked seamlessly, and I can log in without being prompted for a password.

The AddPriviledges script did its work and editrights -l -u Administrator gives me:

I found out that you need to edit the /etc/passwd and /etc/group files manually to assign the Administrator account to the correct primary group ("Administrators", 544), as this is not done automatically. Maybe this could be included in README.cygwin; it took me quite a while to find out. Without this change, the account "Administrator" was not able to start the services (the message was "Must be started by user with Administrator priviledges" or something similar).
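
A minimal sketch for verifying that change, assuming the standard colon-separated passwd layout in which the fourth field is the primary group ID. The helper name primary_gid is hypothetical and not part of Cygwin or Torque.

```shell
# Hypothetical helper: print the primary group ID (4th field) of an
# account from a passwd-format file. The file defaults to /etc/passwd.
primary_gid() {
    account="$1"
    passwd_file="${2:-/etc/passwd}"
    awk -F: -v name="$account" '$1 == name { print $4 }' "$passwd_file"
}

# Example (after the manual edit the value should be 544, the
# "Administrators" group mentioned above):
#   primary_gid Administrator
```

An empty result means the account has no entry in the file at all.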

The relevant part of my /etc/passwd now looks like
LocalService:*:19:544:U-NT AUTHORITY\LocalService,S-1-5-19::
NetworkService:*:20:544:U-NT AUTHORITY\NetworkService,S-1-5-20::

where "<my_account>" is the name of my user account.

The relevant part of my /etc/group looks like

I registered the Torque services using cygrunsrv and they start up correctly and run under the "Administrator" account.

Does anyone have an idea what I'm doing wrong?

Thank you very much for your help.

Best regards

Felix Wolfheimer

