Installing Son of Grid Engine

With many thanks to Michel Villerius for his Shark Setup guide.

1. Install Ubuntu 14.04.3 LTS on master

Make sure to include the OpenSSH server (or install it later with apt-get install openssh-server) for remote access.

1.1 Network settings

Make sure that the line for ‘ngsmaster’ in /etc/hosts contains an IP address that is not a 127.0.x.x address (Ubuntu usually adds a 127.0.1.1 entry for the hostname by default).
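
The check above can be scripted. The following is a minimal sketch, not part of the original guide: the function name host_is_loopback is hypothetical, and it only inspects the hosts file, not DNS.

```shell
#!/bin/sh
# Hypothetical helper: succeed if HOST is mapped to a 127.x.x.x address.
# Usage: host_is_loopback HOSTNAME [HOSTS_FILE]   (HOSTS_FILE defaults to /etc/hosts)
host_is_loopback() {
    awk -v h="$1" '
        $1 ~ /^#/ { next }                      # skip comment lines
        {
            for (i = 2; i <= NF; i++) {
                if ($i ~ /^#/) break            # ignore trailing comments
                if ($i == h && $1 ~ /^127\./) found = 1
            }
        }
        END { exit found ? 0 : 1 }
    ' "${2:-/etc/hosts}"
}

if host_is_loopback ngsmaster; then
    echo "ngsmaster points at a 127.x.x.x address; edit /etc/hosts"
else
    echo "no loopback entry found for ngsmaster"
fi
```

The same function can be run on an execution node by passing ngsnode-01 as the first argument.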

My VirtualBox ngsmaster had these settings for eth0 (VirtualBox ‘Internal networking’ interface):

1.2 Get it up to date

sudo apt-get update ; sudo apt-get -y dist-upgrade ; sudo reboot

1.3 Install Son of Grid Engine

1.3.1 Get compile dependencies

sudo su
apt-get build-dep gridengine-common gridengine-client gridengine-exec gridengine-master gridengine-qmon
apt-get -y install libhwloc-dev

1.3.2 Create Son of Grid Engine directory

mkdir /usr/local/SonOfGridEngine
cd /usr/local/SonOfGridEngine

1.3.3 Get Son of Grid Engine source, extract and compile

See https://gridscheduler.sourceforge.net/CompileGridEngineSource.html for details

wget https://arc.liv.ac.uk/downloads/SGE/releases/8.1.8/sge-8.1.8.tar.gz
tar xvfz sge-8.1.8.tar.gz
cd sge-8.1.8/source
./aimk -no-java -no-jni -no-secure -spool-classic -only-depend
./scripts/zerodepend
./aimk -no-java -no-jni -no-secure -spool-classic depend
./aimk -no-java -no-jni -no-secure -spool-classic

1.3.4 Install Son of Grid Engine

mkdir /usr/local/SonOfGridEngine/gridengine-sge-8.1.8
ln -s /usr/local/SonOfGridEngine/gridengine-sge-8.1.8 /usr/local/SonOfGridEngine/gridengine
export SGE_ROOT=/usr/local/SonOfGridEngine/gridengine
cd /usr/local/SonOfGridEngine/sge-8.1.8/source
./scripts/distinst -all -local -noexit

1.3.5 Configuration of Son of Grid Engine master

cd $SGE_ROOT
./install_qmaster

. /usr/local/SonOfGridEngine/gridengine/default/common/settings.sh

Add the gridadmin user to the Grid Engine manager list. This way, that user can change the configuration without sudo.

qconf -am gridadmin

1.4 Set up common environment variables

mkdir /usr/local/COMMON-ENV
nano /usr/local/COMMON-ENV/common-cluster-env.sh

Enter the following:
#!/bin/bash
#### This file contains system wide variables and PATH for all hosts
#### in the cluster. It will be loaded by /etc/bash.bashrc on all hosts.

#### Configure Son of Grid Engine ####
. /usr/local/SonOfGridEngine/gridengine/default/common/settings.sh

#### Add /usr/local/bin to PATH ####
export PATH=$PATH:/usr/local/bin

chmod a+x /usr/local/COMMON-ENV/common-cluster-env.sh
echo ". /usr/local/COMMON-ENV/common-cluster-env.sh" >> /etc/bash.bashrc
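
Running the echo line above twice leaves a duplicate entry in /etc/bash.bashrc. One possible way to make such appends safe to repeat is sketched below. The function name append_once is hypothetical, and the demonstration writes to a scratch file rather than the real /etc/bash.bashrc.

```shell
#!/bin/sh
# Hypothetical helper: append LINE to FILE only if FILE lacks that exact line.
append_once() {
    line="$1"
    file="$2"
    # grep: -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration on a scratch file
demo_file=$(mktemp)
append_once ". /usr/local/COMMON-ENV/common-cluster-env.sh" "$demo_file"
append_once ". /usr/local/COMMON-ENV/common-cluster-env.sh" "$demo_file"
wc -l < "$demo_file"    # the file holds one line, not two
```

The same pattern would apply to the /etc/exports and /etc/fstab lines appended elsewhere in this guide.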

1.5 Set up NFS so that we can export the /usr/local and /home directories

apt-get -y install nfs-kernel-server
echo "/usr/local/ ngsnode-??(rw,sync,no_root_squash,no_subtree_check)" >> /etc/exports
echo "/home/ ngsnode-??(rw,sync,no_root_squash,no_subtree_check)" >> /etc/exports
service nfs-kernel-server start

2. Install Ubuntu 14.04.3 LTS on execution node

2.1 Network settings

Make sure that the line for ‘ngsnode-01’ in /etc/hosts contains an IP address that is not a 127.0.x.x address.

My VirtualBox ngsnode-01 had these settings for eth0 (VirtualBox ‘Internal networking’ interface):

If DNS is working, you should not need to edit the hosts file. If it is not, make sure that both ‘ngsmaster’ and ‘ngsnode-01’ are present in the hosts file of both machines so that they can find each other.

2.2 Get it up to date

sudo apt-get update ; sudo apt-get -y dist-upgrade ; sudo reboot

2.3 Set up the /usr/local and /home NFS mounts from the master

sudo su
apt-get -y install nfs-common
echo "ngsmaster:/usr/local /usr/local nfs rsize=8192,wsize=8192,timeo=14,intr" >> /etc/fstab
echo "ngsmaster:/home /home nfs rsize=8192,wsize=8192,timeo=14,intr" >> /etc/fstab
mount -a
echo ". /usr/local/COMMON-ENV/common-cluster-env.sh" >> /etc/bash.bashrc
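
After mount -a it is worth confirming that both directories really come from the master. This is a rough sketch under the assumption that the kernel mount table is readable at /proc/mounts. The function name is_nfs_mounted is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if DIR is listed as an NFS mount in a mounts table.
# Usage: is_nfs_mounted DIR [MOUNTS_FILE]   (MOUNTS_FILE defaults to /proc/mounts)
is_nfs_mounted() {
    # /proc/mounts columns: device, mount point, filesystem type, options, ...
    awk -v d="$1" '$2 == d && $3 ~ /^nfs/ { found = 1 } END { exit found ? 0 : 1 }' \
        "${2:-/proc/mounts}"
}

for dir in /usr/local /home; do
    if is_nfs_mounted "$dir"; then
        echo "$dir is mounted over NFS"
    else
        echo "$dir is NOT an NFS mount; check /etc/fstab and the exports on ngsmaster"
    fi
done
```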

2.4 Set up Son of Grid Engine execution node daemon

NOTE! First add ‘ngsnode-01’ to the administrative host list by running qconf -ah ngsnode-01 on ‘ngsmaster’. This is needed because the execution node registers itself with the master node during installation, adding itself to the host lists and such.

apt-get -y install hwloc
cd $SGE_ROOT
./install_execd
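
Once install_execd has finished, a quick smoke test can confirm that the node registered and accepts jobs. The sketch below is an illustration, not part of the original procedure. It assumes the settings.sh file has been sourced so that the Grid Engine commands are on the PATH. The function name check_sge_cluster and the job name sge_smoke_test are made up for this example.

```shell
#!/bin/sh
# Sketch: sanity-check the cluster from a shell where settings.sh has been sourced.
check_sge_cluster() {
    if ! command -v qconf >/dev/null 2>&1; then
        echo "SGE commands not found; source the settings.sh file first"
        return 1
    fi
    qconf -sh      # administrative hosts; ngsnode-01 should be listed
    qconf -sel     # execution hosts known to the master
    qhost          # architecture and load of each execution host
    # Submit a trivial job read from stdin
    echo "hostname" | qsub -N sge_smoke_test
    qstat          # the job shows up here until it has finished
}

check_sge_cluster || echo "cluster check skipped"
```

If the job runs, its output file should contain the name of the execution node that picked it up.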

Links