I updated this on Jul 21 using the comments below.
Torque is a batch job queuing system that is used on clusters, but I find it handy on my multi-core workstation as well. It lets multiple users schedule the jobs they need to run. The scheduler makes sure that not too many jobs run simultaneously, which could otherwise cause high system load or memory problems.
I previously posted how to install Torque on Ubuntu Hardy from the Torque source package. Torque is now in the Lucid repositories, however, and here are the steps I had to take to get it working on my workstation.
For this setup I kept the server host name 'torqueserver', which is the default in the package. You can do the same or use a fully qualified domain name. In that case, you will have to adapt the steps somewhat.
My workstation has 8 cores, and I only want to give 6 of them to the queue. Please adapt your numbers accordingly.
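If you would rather derive these numbers than hard-code them, a minimal sketch along these lines might help. It assumes you want to keep two cores out of the queue, as I do; the variable names are just placeholders.

```shell
# Count the cores that are currently online
total_cores=$(getconf _NPROCESSORS_ONLN)

# Keep two cores for interactive work (my choice: 8 cores, 6 for the queue)
reserved=2
queue_slots=$((total_cores - reserved))

# On very small machines still leave the queue at least one slot
if [ "$queue_slots" -lt 1 ]; then
    queue_slots=1
fi

echo "cores: $total_cores, queue slots: $queue_slots"
```

With 8 cores this yields 6 slots: 8 is the np= value for the nodes file, and 6 is the value used for max_running and max_user_run in the queue settings.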
0) open root terminal
1) add torqueserver as an alias in /etc/hosts:
change 127.0.1.1 myHostName to 127.0.1.1 myHostName torqueserver
*) see the post by drlemon. Alternatively, put a resolvable host name (check with 'host $HOSTNAME') in the file /var/lib/torque/server_name, and wherever torqueserver is used below, use that host name instead.
2) install torque from repositories
apt-get install torque*
3) stop torque
4) check torque is not running (otherwise you can kill it)
5) create missing directory
6) add torqueserver as serverhost
echo "SERVERHOST localhost" >> /var/lib/torque/torque.cfg
7) add the node to the nodes file and tell the node daemon where the server is
echo "torqueserver np=8" >> /var/lib/torque/server_priv/nodes
echo "pbs_server = 127.0.1.1" >> /var/lib/torque/mom_priv/config
8) setup database
pbs_server -t create
9) create queue and set server settings in database (the lines below are qmgr directives; enter them at the qmgr prompt)
create queue batch
set queue batch queue_type = Execution
set queue batch max_running = 6
set queue batch resources_max.ncpus = 8
set queue batch resources_max.nodes = 1
set queue batch resources_default.ncpus = 1
set queue batch resources_default.neednodes = 1:ppn=1
set queue batch resources_default.walltime = 24:00:00
set queue batch max_user_run = 6
set queue batch enabled = True
set queue batch started = True
set server default_queue = batch
set server scheduling = True
10) restart server and scheduler and node server
pbs_sched #this will give some warning about missing files
11) check that the nodes are up
12) exit the root terminal and as a normal user test the queue
echo "sleep 30" | qsub
13) see drlemon: run gedit /etc/init.d/torque* and change the pidfile= line in all three files so that it points to /var/lib instead of /var/spool. Additionally, remove the -t create from the server options in the torque-server file.
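The pidfile change in step 13 can also be made with sed instead of an editor. The sketch below only works on a scratch file with made-up contents, so it is safe to try. The exact spelling of the pidfile= line in the real init scripts may differ, which is why the match ignores case (this needs GNU sed).

```shell
# Scratch file standing in for one init script; its contents are invented for the demo
script=$(mktemp)
cat > "$script" <<'EOF'
# spool area: /var/spool/torque
DAEMON=/usr/sbin/pbs_server
PIDFILE=/var/spool/torque/server_priv/server.lock
EOF

# Rewrite /var/spool to /var/lib, but only on the pidfile= line
sed -i '/pidfile=/I s#/var/spool#/var/lib#' "$script"

cat "$script"
```

To apply it for real, you would run the same sed line as root against each of the /etc/init.d/torque* files, after making a backup.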
This works for me, but a demanding computing environment will probably need more configuration. Check out the Torque website for more queue configurations, user management, etc.
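As a starting point for trying other queue configurations, the settings from step 9 can be kept in a file and fed to qmgr in one go, since qmgr reads its directives from standard input. This is only a sketch with a subset of the settings: the file location and the slots variable are my own placeholders, and the script just writes the file and prints the command rather than touching the server.

```shell
# Number of jobs allowed to run at once (6 on my 8-core workstation)
slots=6

# Placeholder location for the settings file
settings=$(mktemp)

# Write the qmgr directives, filling in the slot count
cat > "$settings" <<EOF
create queue batch
set queue batch queue_type = Execution
set queue batch max_running = $slots
set queue batch max_user_run = $slots
set queue batch resources_default.walltime = 24:00:00
set queue batch enabled = True
set queue batch started = True
set server default_queue = batch
set server scheduling = True
EOF

echo "Review $settings, then as root run: qmgr < $settings"
```

Keeping the directives in a file makes it easy to change one value, such as the slot count, and re-apply the whole set.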