With many thanks to Michel Villerius for his Shark Setup guide.
Make sure to include the OpenSSH server (or install it later with apt-get install openssh-server) for remote access.
In particular, make sure that in /etc/hosts the line containing ‘ngsmaster’ has an IP address that is not a 127.0.x.x (loopback) address.
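The hosts-file requirement can be checked with a few lines of shell. This is only a minimal sketch: check_hosts_entry is a hypothetical helper name, and the hosts file path is a parameter (defaulting to /etc/hosts) so the check can be tried on a copy first.

```shell
# Sketch: report which address a host name is mapped to in a hosts file.
# check_hosts_entry is a hypothetical helper, not part of Grid Engine.
check_hosts_entry() {
    local name="$1" hosts_file="${2:-/etc/hosts}"
    # Strip comments, then print the address of every line that lists the name.
    awk -v name="$name" '
        { sub(/#.*/, "") }
        { for (i = 2; i <= NF; i++) if ($i == name) print $1 }
    ' "$hosts_file" | {
        found=0
        bad=0
        while read -r addr; do
            found=1
            case "$addr" in
                127.*) echo "$name -> $addr (loopback, change this)"; bad=1 ;;
                *)     echo "$name -> $addr (ok)" ;;
            esac
        done
        [ "$found" -eq 1 ] || echo "$name not found in $hosts_file"
        # Succeed only if the name was found and never mapped to 127.x.x.x.
        [ "$found" -eq 1 ] && [ "$bad" -eq 0 ]
    }
}
```

Running check_hosts_entry ngsmaster on the master (or check_hosts_entry ngsnode-01 on the node) returns a non-zero status if the name is missing or still points at a loopback address.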
My VirtualBox ngsmaster had these settings for eth0 (VirtualBox ‘Internal networking’ interface):
sudo apt-get update ; sudo apt-get -y dist-upgrade ; sudo reboot
sudo su
apt-get build-dep gridengine-common gridengine-client gridengine-exec gridengine-master gridengine-qmon
apt-get -y install libhwloc-dev
mkdir /usr/local/SonOfGridEngine
cd /usr/local/SonOfGridEngine
See https://gridscheduler.sourceforge.net/CompileGridEngineSource.html for details
wget https://arc.liv.ac.uk/downloads/SGE/releases/8.1.8/sge-8.1.8.tar.gz
tar xvfz sge-8.1.8.tar.gz
cd sge-8.1.8/source
./aimk -no-java -no-jni -no-secure -spool-classic -only-depend
./scripts/zerodepend
./aimk -no-java -no-jni -no-secure -spool-classic depend
./aimk -no-java -no-jni -no-secure -spool-classic
mkdir /usr/local/SonOfGridEngine/gridengine-sge-8.1.8
ln -s /usr/local/SonOfGridEngine/gridengine-sge-8.1.8 /usr/local/SonOfGridEngine/gridengine
export SGE_ROOT=/usr/local/SonOfGridEngine/gridengine
cd /usr/local/SonOfGridEngine/sge-8.1.8/source
./scripts/distinst -all -local -noexit
cd $SGE_ROOT
./install_qmaster
. /usr/local/SonOfGridEngine/gridengine/default/common/settings.sh
Add the gridadmin user to the Grid Engine managers group. This way, we don't need to sudo to configure anything.
qconf -am gridadmin
mkdir /usr/local/COMMON-ENV
nano /usr/local/COMMON-ENV/common-cluster-env.sh
Enter the following:
#!/bin/bash
#### This file contains system wide variables and PATH for all hosts
#### in the cluster. It will be loaded by /etc/bash.bashrc on all hosts.
#### Configure Son of Grid Engine ####
. /usr/local/SonOfGridEngine/gridengine/default/common/settings.sh
#### Add /usr/local/bin to PATH ####
export PATH=$PATH:/usr/local/bin
chmod a+x /usr/local/COMMON-ENV/common-cluster-env.sh
echo ". /usr/local/COMMON-ENV/common-cluster-env.sh" >> /etc/bash.bashrc
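Note that echo with >> appends a new copy of the line every time the command is run. If you expect to repeat these steps, a small guard avoids duplicates. The following is a sketch only; append_once is a hypothetical helper name.

```shell
# Sketch: append a line to a file only if that exact line is not already there.
# append_once is a hypothetical helper, not a standard command.
append_once() {
    local line="$1" file="$2"
    # -q quiet, -x match the whole line, -F treat the pattern as a fixed string
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}
```

As root, append_once ". /usr/local/COMMON-ENV/common-cluster-env.sh" /etc/bash.bashrc then has the same effect as the echo command, but is safe to run twice. The same helper works for the /etc/exports and /etc/fstab lines.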
apt-get -y install nfs-kernel-server
echo "/usr/local/ ngsnode-??(rw,sync,no_root_squash,no_subtree_check)" >> /etc/exports
echo "/home/ ngsnode-??(rw,sync,no_root_squash,no_subtree_check)" >> /etc/exports
service nfs-kernel-server start
In particular, make sure that in /etc/hosts the line containing ‘ngsnode-01’ has an IP address that is not a 127.0.x.x (loopback) address.
My VirtualBox ngsnode-01 had these settings for eth0 (VirtualBox ‘Internal networking’ interface):
If DNS is working, you should not need to edit the hosts file. If it is not, make sure that both ‘ngsmaster’ and ‘ngsnode-01’ are listed in the hosts file on both machines so that they can find each other.
sudo apt-get update ; sudo apt-get -y dist-upgrade ; sudo reboot
sudo su
apt-get -y install nfs-common
echo "ngsmaster:/usr/local /usr/local nfs rsize=8192,wsize=8192,timeo=14,intr" >> /etc/fstab
echo "ngsmaster:/home /home nfs rsize=8192,wsize=8192,timeo=14,intr" >> /etc/fstab
mount -a
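Before continuing, it is worth confirming that both directories really are mounted from the master, since mount -a may fail quietly if the master is unreachable. A minimal sketch, assuming a Linux /proc/mounts layout; is_nfs_mount is a hypothetical helper name.

```shell
# Sketch: succeed if the given mount point is listed with an NFS filesystem type.
# is_nfs_mount is a hypothetical helper; the mounts file defaults to /proc/mounts.
is_nfs_mount() {
    local mount_point="$1" mounts_file="${2:-/proc/mounts}"
    # Fields in /proc/mounts: device, mount point, filesystem type, options, ...
    awk -v mp="$mount_point" '
        $2 == mp && $3 ~ /^nfs/ { found = 1 }
        END { exit !found }
    ' "$mounts_file"
}
```

On the node, is_nfs_mount /usr/local && is_nfs_mount /home && echo "NFS mounts OK" should print the message once both shares are mounted.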
echo ". /usr/local/COMMON-ENV/common-cluster-env.sh" >> /etc/bash.bashrc
NOTE! First add ‘ngsnode-01’ to the administrative host list by running qconf -ah ngsnode-01 on ‘ngsmaster’. This is needed because the execution node registers itself with the master node during installation, adding itself to the host lists and so on.
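To see whether the node is already on the administrative host list, the output of qconf -sh (which prints that list, one host per line) can be filtered. The sketch below assumes the host may be listed either by its short name or by a fully qualified name; host_in_list is a hypothetical helper name.

```shell
# Sketch: read a host list on stdin and succeed if the given host is in it,
# either as the bare name or as the first label of a fully qualified name.
# host_in_list is a hypothetical helper, not part of Grid Engine.
host_in_list() {
    awk -v h="$1" '
        $1 == h || index($1, h ".") == 1 { found = 1 }
        END { exit !found }
    '
}

# Intended use on ngsmaster, with the Grid Engine environment loaded:
#   qconf -sh | host_in_list ngsnode-01 || qconf -ah ngsnode-01
```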
apt-get -y install hwloc
cd $SGE_ROOT
./install_execd
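Once install_execd has finished, a quick smoke test shows whether the node is known to the master and accepts jobs. This is a sketch under the assumption that the Grid Engine environment (settings.sh) is loaded in the current shell; sge_smoke_test is a hypothetical function name, while qhost, qconf -sel, qsub -b y and qstat are standard Grid Engine commands.

```shell
# Sketch: list the execution hosts and submit a trivial job.
# sge_smoke_test is a hypothetical wrapper; run it as a regular user on ngsmaster.
sge_smoke_test() {
    if ! command -v qhost >/dev/null 2>&1; then
        echo "qhost not found; source settings.sh first" >&2
        return 1
    fi
    qhost                      # execution hosts with load and memory figures
    qconf -sel                 # names on the execution host list
    qsub -b y -cwd hostname    # submit 'hostname' as a binary job
    qstat                      # the job should show up as queued or running
}
```

If everything works, ngsnode-01 appears in the qhost output, and the job's output file in the current directory contains the name of the node that ran it.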