[torqueusers] ha torque

Brock Palen brockp at umich.edu
Wed Apr 9 15:36:22 MDT 2008


Ours is 868 MB, but that is almost all because we don't rotate out our
accounting logs. Right now we have 6025 jobs, and server_priv/jobs is
just 73 MB.

[root@nyx server_priv]# du -h --max-depth=1
8.0K    ./acl_svr
4.0K    ./disallowed_types
796M    ./accounting
73M     ./jobs
4.0K    ./acl_groups
72K     ./queues
48K     ./acl_users
24K     ./acl_hosts
868M    .
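
If you would rather keep the accounting directory small, a rough sketch
of compressing old logs could look like the following. It assumes the
stock layout where each daily file in server_priv/accounting is named
YYYYMMDD; the function name, the 30-day cutoff and the path in the
example call are illustrative choices of mine, not anything Torque
ships.

```shell
#!/bin/sh
# Hypothetical helper: gzip Torque accounting files older than N days.
# Assumes daily files are named YYYYMMDD (eight digits).
compress_old_accounting() {
    acct_dir=$1      # e.g. /var/spool/torque/server_priv/accounting
    max_age_days=$2  # files last modified more than this many days ago

    # Match only eight-digit names so already-compressed *.gz files
    # and anything else in the directory are left alone.
    find "$acct_dir" -maxdepth 1 -type f \
        -name '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]' \
        -mtime +"$max_age_days" -exec gzip {} \;
}

# Example call (path is an assumption; adjust to your install):
# compress_old_accounting /var/spool/torque/server_priv/accounting 30
```

Run from cron, something like this should keep recent logs readable
as plain text while shrinking the older ones.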

Good luck. You really don't need more than a 1 GB LUN to do this.
Maybe try DRBD; I use it all the time for virtualized OSes, to
mirror their partitions across hosts.
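
To sanity-check the sizing on your own server before carving out the
LUN, a minimal sketch along these lines may help. The function name and
the 1024 MB limit are my own assumptions for illustration; it relies on
du accepting -m to report megabytes.

```shell
#!/bin/sh
# Hypothetical check: does a directory fit in a volume of limit_mb MB?
fits_in_volume() {
    dir=$1
    limit_mb=$2
    # du -sm prints the total in megabytes, a tab, then the path.
    used_mb=$(du -sm "$dir" | cut -f1)
    # Succeeds (exit status 0) only if usage is within the limit.
    [ "$used_mb" -le "$limit_mb" ]
}

# Example call (path is an assumption; adjust to your install):
# fits_in_volume /var/spool/torque/server_priv 1024 && echo "fits"
```

Leave yourself some headroom, since the jobs and accounting
sub-directories grow with load.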

Brock Palen
www.umich.edu/~brockp
Center for Advanced Computing
brockp at umich.edu
(734)936-1985



On Apr 9, 2008, at 5:30 PM, Steve Snelgrove wrote:
> On my test system, the size of this directory is 13 MB.  However,
> this includes the jobs sub-directory, so the size will vary
> depending on how many jobs are running.
>
> root# cd /var/spool/torque
> root# du -h server_priv
> 652K    server_priv/jobs
> 4.0K    server_priv/arrays
> 8.5M    server_priv/accounting
> 4.0K    server_priv/disallowed_types
> 20K     server_priv/acl_svr
> 4.0K    server_priv/hostlist
> 4.0K    server_priv/acl_hosts
> 4.0K    server_priv/acl_users
> 4.0K    server_priv/acl_groups
> 8.0K    server_priv/queues
> 13M     server_priv
>
> I am not qualified to give an opinion on Maui or the scheduler, sorry.
>
> Daniel Bourque wrote:
>> thanks
>>
>> How much disk space does /var/spool/torque/server_priv typically
>> use?
>>
>> How about the Maui scheduler? Should it be running on both
>> head nodes, trying to communicate with localhost?
>>
>> I'm a little confused by the example, where the scheduler runs on
>> the same hosts as pbs_mom and not pbs_server. Is the intent to
>> also fail over the scheduler along with the shared file system?
>>
>>
>> thanks again.
>>
>> Daniel Bourque
>> Sr. Systems Engineer
>> WeatherData Service Inc
>> An Accuweather Company
>>
>>
>>
>>
>> Steve Snelgrove wrote:
>>
>>> The 2.3 release of Torque has support for HA by allowing two head
>>> node servers to access the server_priv files on a shared file
>>> system.  See
>>> http://www.clusterresources.com/torquedocs21/4.3high-availability.shtml
>>> for more details.
>>>
>>>
>>
>
> _______________________________________________
> torqueusers mailing list
> torqueusers at supercluster.org
> http://www.supercluster.org/mailman/listinfo/torqueusers
>
>


