#format wiki
#language en
||<>||

= What is PBS? =

PBS is an acronym for Portable Batch System. TORQUE is an open source resource manager that provides control over batch jobs and distributed compute nodes. It is a community effort based on the original *PBS.

= Getting Started =

You need a server and some clients. The server can also have the client package installed, so that users can log in to the server (over ssh) and submit jobs to the queues.

For Ubuntu 10.04 it is recommended to manually download and install the packages from the Maverick repository, because the Lucid packages are old and problematic.

== Set up your Server ==

 1. Install the required packages
 {{{
sudo aptitude install torque-server torque-scheduler torque-client
 }}}
 The rest should be installed as dependencies.<<BR>>
 '''NOTE:''' there is a bug in the Maverick torque-scheduler package. Check that the proper PIDFILE is set in /etc/init.d/torque-scheduler:
 {{{
PIDFILE=/var/spool/torque/sched_priv/sched.lock
 }}}
 2. Set the server and queue parameters:
  * start the server in 'create' mode
  * set up the server settings (scheduling = True, query_other_jobs = True, and later default_queue)
  * define the nodes
  * create and set up the queues (queue_type, resources_max, started, enabled)
 Everything except the initial 'create' step can be done with the '''qmgr''' command, either interactively or non-interactively (with the -c option).

== Set up Clients ==

 1. Install the required packages
 {{{
sudo aptitude install torque-client torque-mom
 }}}
 2. Set up the node: set the default server in {{{/var/spool/torque/server_name}}}
 3. Check that pbs_mom starts automatically at boot (/etc/rcXd...; alternatively, you can put the pbs_mom command in the /etc/rc.local file)

= Using Torque PBS system =

== Basics of pbs scripts ==

 * {{{#!}}} sha-bang line
 * {{{-l}}} PBS resource parameters, including the number of CPUs (cores), memory and time needed
 * {{{-q}}} name of the queue you want to submit to
 * {{{-m}}} mail notification settings
 * {{{-V}}} export the user's environment variables to the job

{{{
#!/bin/bash
#PBS -l nodes=1:ppn=8,mem=1g,walltime=72:00:00
#PBS -q YourQueue_name_here
#PBS -m abe
#PBS -V
cd /to_some_directory
some_commands and_parameters
}}}

== Submitting, viewing and managing jobs ==

Basic tasks:
 * submit a job to the queue: {{{qsub yourfile.pbs}}}
 * check which jobs are running on which nodes: {{{qstat -an}}}
 * remove a job from the queue: {{{qdel XXX.host}}}

This is only a stub, to be continued soon...

= Resources =

http://www.clusterresources.com/products/torque-resource-manager.php

----
CategoryScience
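
The server and queue settings listed under "Set up your Server" could be sketched as a small shell helper that only prints qmgr directives, so they can be reviewed before being applied. This is a minimal sketch, not a definitive configuration: the function name {{{queue_setup_commands}}}, the queue name {{{batch}}} and the 72-hour walltime limit are assumptions chosen for illustration. The server must already have been started in 'create' mode, and the nodes still have to be defined separately.

```shell
#!/bin/sh
# Hypothetical helper: print the qmgr directives for one execution queue.
# Nothing is applied here; the function only writes text to stdout.
queue_setup_commands() {
    queue="$1"      # queue name (assumed example: batch)
    walltime="$2"   # maximum walltime per job (assumed example: 72:00:00)
    cat <<EOF
set server scheduling = True
set server query_other_jobs = True
create queue $queue
set queue $queue queue_type = Execution
set queue $queue resources_max.walltime = $walltime
set queue $queue started = True
set queue $queue enabled = True
set server default_queue = $queue
EOF
}

# Print the directives for review.
queue_setup_commands batch 72:00:00
```

Once the output looks right, it can be piped into qmgr as root on the server, for example {{{queue_setup_commands batch 72:00:00 | sudo qmgr}}}, or each line can be passed separately with {{{qmgr -c}}}.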