TORQUE Admin Manual Overview

This collection of documentation for the TORQUE resource manager is intended as a reference for users and system administrators alike. The chapters are organized so that related information is easy to find together.

The first chapter, 1.0 Overview, provides the details for installation and for the basic and advanced configuration options necessary to get the system up and running. System testing is also covered. The second chapter, 2.0 Submitting and Managing Jobs, covers the different actions applicable to jobs. Its first section, 2.1 Job Submission, details how to submit a job and request resources (nodes, software licenses, etc.) and provides several examples. Other actions include monitoring, canceling, preemption, and keeping completed jobs.

Chapter 3 covers administrative tasks relating to nodes. These include adding nodes and changing their properties and state. Section 3.4 Host Security explains how to configure restricted user access to nodes. Chapter 4 details server-side configuration of queue and server-level policies. Chapter 5 discusses using the native scheduler versus an advanced scheduler.

Chapter 6 deals with data management. For non-network filesystems, the SCP/RCP Setup section details setting up ssh keys and nodes to automate transferring data. The NFS and Other Networked Filesystems section covers configuration for these filesystems. This chapter also addresses File Stage-In/Stage-Out using the stagein and stageout directives of the qsub command.

Chapter 7 details support for MPI (Message Passing Interface) and PVM (Parallel Virtual Machine). Chapter 8 covers the configuration, utilization, and states of resources. Chapter 9 explains how TORQUE tracks jobs for accounting purposes. Chapter 10 provides a troubleshooting guide, including help with general problems, a FAQ (Frequently Asked Questions) list, how to set up and use compute node checks, and how to debug TORQUE.

The appendices provide tables of commands, parameters, configuration options, and error codes, along with case studies, the Quick Start Guide, and more.
© 2001-2008 Cluster Resources, Incorporated