Installation

Tarantella needs to be built from source. Since Tarantella is built on top of TensorFlow, you will need a supported version of it (2.0 to 2.7, see below). Additionally, you will need the open-source communication libraries GaspiCxx and GPI-2, which Tarantella uses to implement distributed training.

Lastly, you will need pybind11, which Tarantella uses for interoperability between its Python and C++ code.

In the following we will look at the required steps in detail.

Installing dependencies

Compiler and build system

Tarantella can be built with a recent gcc compiler that supports C++17 (gcc 7.4.0 or later). You will also need the build tool CMake (version 3.12 or later).
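
Before building, the version requirements above can be checked with a short script. The following is a minimal sketch and not part of Tarantella: the helpers version_ge and report_tool are hypothetical names, and the comparison assumes that sort supports the -V (version sort) option, as GNU coreutils does.

```shell
# Succeeds if version $1 is at least version $2.
# Assumption: `sort -V` (version sort, GNU coreutils) is available.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report whether a detected tool version satisfies the documented minimum.
# $1: tool name, $2: detected version (may be empty), $3: minimum version
report_tool() {
    if [ -z "$2" ]; then
        echo "$1: not found"
    elif version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Detect the installed versions; both stay empty if the tool is missing.
gcc_version="$(gcc -dumpfullversion -dumpversion 2>/dev/null || true)"
cmake_version="$(cmake --version 2>/dev/null | sed -n '1s/^cmake version //p' || true)"

report_tool gcc "$gcc_version" 7.4.0
report_tool cmake "$cmake_version" 3.12
```

The minimum versions passed to report_tool are the ones stated above; everything else in the script is illustrative.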

Installing TensorFlow

First you will need to install TensorFlow. Tarantella supports TensorFlow versions 2.0 to 2.7 (some features are only available in versions above 2.2). Any of these versions can be installed in a conda environment using pip, as recommended on the TensorFlow website.

In order to do that, first install conda on your system. Then, create and activate an environment for Tarantella:

conda create -n tarantella
conda activate tarantella

Now, you can install the latest supported TensorFlow version with:

conda install python=3.9
pip install --upgrade tensorflow==2.7.*

Tarantella requires at least Python 3.7. Make sure the selected version also matches the TensorFlow requirements.
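
To confirm that the active environment provides a TensorFlow version in the supported range, a check along the following lines may help. This is a sketch under assumptions: is_supported_tf is a hypothetical helper, and the script expects python to refer to the interpreter of the activated conda environment.

```shell
# Succeeds if $1 is a TensorFlow version in the supported range 2.0 to 2.7.
is_supported_tf() {
    case "$1" in
        2.[0-7] | 2.[0-7].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the version from the active Python environment. stderr is silenced
# because importing TensorFlow can print log messages.
tf_version="$(python -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null || true)"

if [ -z "$tf_version" ]; then
    echo "TensorFlow could not be imported in the current environment"
elif is_supported_tf "$tf_version"; then
    echo "TensorFlow $tf_version is in the supported range"
else
    echo "TensorFlow $tf_version is outside the supported range (2.0 to 2.7)"
fi
```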

Installing pybind11

The next dependency you will need to install is pybind11, which is available through pip and conda. We recommend installing pybind11 via conda:

conda install pybind11 -c conda-forge

Installing GPI-2

Next, you will need to download, compile and install the GPI-2 library. GPI-2 is an API for high-performance, asynchronous communication for large scale applications, based on the GASPI (Global Address Space Programming Interface) standard.

The currently supported versions are v1.4 to v1.5, which need to be built as position-independent code (-fPIC). To download the required version, clone the GPI-2 git repository and check out the latest supported tag:

git clone https://github.com/cc-hpc-itwm/GPI-2.git
cd GPI-2
git fetch --tags
git checkout -b v1.5.1 v1.5.1

Now, use autotools to configure and compile the code:

./autogen.sh
export GPI2_INSTALLATION_PATH=/your/gpi2/installation/path
CFLAGS="-fPIC" CPPFLAGS="-fPIC" ./configure --with-ethernet --prefix=${GPI2_INSTALLATION_PATH}
make

where /your/gpi2/installation/path needs to be replaced with the path where you want to install GPI-2. Note the --with-ethernet option, which uses standard TCP sockets for communication. This is the correct option for laptops and workstations.

If you want to use InfiniBand, replace the above option with --with-infiniband. Now you are ready to install GPI-2 with:

make install
export PATH=${GPI2_INSTALLATION_PATH}/bin:$PATH

where the last command makes the GPI-2 binaries visible to your system. If required, GPI-2 can be removed from the target directory by running make uninstall.
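
The choice between --with-ethernet and --with-infiniband in the configure step above could also be scripted. The sketch below is a heuristic only: gpi2_network_flag is a hypothetical helper, and it assumes that a non-empty /sys/class/infiniband directory (the Linux sysfs location for InfiniBand devices) means the machine has usable InfiniBand hardware and drivers, which may not hold on every system.

```shell
# Print the GPI-2 configure flag suggested for this machine.
# $1: directory listing InfiniBand devices (defaults to the Linux sysfs path)
gpi2_network_flag() {
    ib_dir="${1:-/sys/class/infiniband}"
    if [ -d "$ib_dir" ] && [ -n "$(ls -A "$ib_dir" 2>/dev/null)" ]; then
        echo "--with-infiniband"
    else
        echo "--with-ethernet"
    fi
}

echo "Suggested configure flag: $(gpi2_network_flag)"
```

The printed flag can then be passed to ./configure in place of the hard-coded --with-ethernet option.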

Installing GaspiCxx

GaspiCxx is a C++ abstraction layer built on top of the GPI-2 library, designed to provide easy-to-use point-to-point and collective communication primitives. Tarantella’s communication layer is based on GaspiCxx and its PyGPI API for Python. Currently we support GaspiCxx version v1.1.0.

To install GaspiCxx and PyGPI, first download the supported release from the git repository:

git clone https://github.com/cc-hpc-itwm/GaspiCxx.git
cd GaspiCxx
git fetch --tags
git checkout -b v1.1.0 v1.1.0

GaspiCxx requires an already installed version of GPI-2, which should be detected at configuration time (as long as ${GPI2_INSTALLATION_PATH}/bin is added to the current ${PATH} as shown above).
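
Whether GPI-2 is visible on the current PATH can be checked before running cmake. The following sketch assumes that GPI-2 installed its gaspi_run launcher into ${GPI2_INSTALLATION_PATH}/bin, as standard GPI-2 installations do; on_path is a hypothetical helper name.

```shell
# Succeeds if the command $1 can be found on the current PATH.
on_path() {
    command -v "$1" >/dev/null 2>&1
}

# gaspi_run is the launcher that GPI-2 places in its bin directory.
if on_path gaspi_run; then
    echo "GPI-2 found: $(command -v gaspi_run)"
else
    echo "gaspi_run not found; add \${GPI2_INSTALLATION_PATH}/bin to PATH first"
fi
```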

Compile and install the library as follows, making sure the previously created conda environment is activated:

conda activate tarantella

mkdir build && cd build
export GASPICXX_INSTALLATION_PATH=/your/gaspicxx/installation/path
cmake -DBUILD_PYTHON_BINDINGS=ON    \
      -DBUILD_SHARED_LIBS=ON        \
      -DCMAKE_INSTALL_PREFIX=${GASPICXX_INSTALLATION_PATH} ../
make install

where ${GASPICXX_INSTALLATION_PATH} needs to be set to the path where you want to install the library.

SSH key-based authentication

In order to use Tarantella on a cluster, make sure you can ssh between nodes without a password. For details, refer to the FAQ section. In particular, to test Tarantella on your local machine, make sure you can ssh to localhost without a password.
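
Passwordless access can be verified non-interactively with the ssh options BatchMode and ConnectTimeout. The sketch below is one possible check, not a procedure prescribed by Tarantella; can_ssh_without_password is a hypothetical helper, and the SSH_CMD variable is a hypothetical override that exists only to make the function easy to test.

```shell
# Succeeds if a command can be run on host $1 over ssh without any prompt.
# BatchMode=yes makes ssh fail instead of asking for a password or passphrase;
# ConnectTimeout=5 bounds the wait for unreachable hosts to five seconds.
# SSH_CMD is a hypothetical override hook (defaults to the real ssh client).
can_ssh_without_password() {
    "${SSH_CMD:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

if can_ssh_without_password localhost; then
    echo "passwordless ssh to localhost works"
else
    echo "passwordless ssh to localhost failed; set up key-based authentication"
fi
```

Note that in batch mode ssh also fails when the host key has not been accepted yet, so a first interactive login to each node may be needed. On a cluster, the same function can be called once per node name.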

Building Tarantella from source

With all dependencies installed, we can now download, configure and compile Tarantella. To download the source code, simply clone the GitHub repository:

git clone https://github.com/cc-hpc-itwm/tarantella.git
cd tarantella
git checkout tags/v0.8.0 -b v0.8.0

Next, we need to configure the build system using CMake. For a standard out-of-source build, we create a separate build folder and run cmake in it:

conda activate tarantella

cd tarantella  # skip if you are already inside the cloned repository
mkdir build && cd build
export TARANTELLA_INSTALLATION_PATH=/your/installation/path
cmake -DCMAKE_INSTALL_PREFIX=${TARANTELLA_INSTALLATION_PATH} \
      -DCMAKE_PREFIX_PATH=${GASPICXX_INSTALLATION_PATH} ../

This will configure your installation to use the previously installed GPI-2 and GaspiCxx libraries. To install Tarantella on a cluster equipped with InfiniBand, make sure that GPI-2 is installed with InfiniBand support, as described in the "Installing GPI-2" section above.

Now, we can compile and install Tarantella to ${TARANTELLA_INSTALLATION_PATH}:

make
make install
export PATH=${TARANTELLA_INSTALLATION_PATH}/bin:${PATH}
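
The export commands used throughout this guide only affect the current shell session. One way to restore the environment later is to generate a small file of export statements that can be sourced in a new shell. The following is a sketch: print_tarantella_env is a hypothetical helper, and the default values are the placeholder paths used above, not real locations.

```shell
# Placeholders from the steps above; replace them with your actual paths.
: "${GPI2_INSTALLATION_PATH:=/your/gpi2/installation/path}"
: "${GASPICXX_INSTALLATION_PATH:=/your/gaspicxx/installation/path}"
: "${TARANTELLA_INSTALLATION_PATH:=/your/installation/path}"

# Print export statements that restore the environment in a new shell.
# ${PATH} is kept literal so that it is expanded when the output is sourced.
print_tarantella_env() {
    printf 'export GPI2_INSTALLATION_PATH="%s"\n' "$GPI2_INSTALLATION_PATH"
    printf 'export GASPICXX_INSTALLATION_PATH="%s"\n' "$GASPICXX_INSTALLATION_PATH"
    printf 'export TARANTELLA_INSTALLATION_PATH="%s"\n' "$TARANTELLA_INSTALLATION_PATH"
    printf 'export PATH="%s/bin:%s/bin:${PATH}"\n' \
        "$GPI2_INSTALLATION_PATH" "$TARANTELLA_INSTALLATION_PATH"
}

print_tarantella_env
```

Redirecting the output to a file of your choice and sourcing that file in a new shell restores the three installation paths and the PATH entries.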

[Optional] Building and running tests

In order to build Tarantella with tests, you will also need to install Boost (for C++ tests), and pytest (for Python tests). Additionally, the PyYAML and NetworkX libraries are required by some tests.

To install Boost with the required development packages on Ubuntu, you can use

sudo apt install libboost-all-dev

while on Fedora you can use

sudo dnf install boost boost-devel

The other dependencies can be installed in the existing conda environment:

pip install -U pytest
pip install PyYAML==3.13
conda install networkx

After having installed these libraries, make sure to configure Tarantella with testing switched on:

cd tarantella
mkdir -p build && cd build
export LD_LIBRARY_PATH=`pwd`:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${GPI2_INSTALLATION_PATH}/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${GASPICXX_INSTALLATION_PATH}/lib:${LD_LIBRARY_PATH}

export PYTHONPATH=`pwd`:${PYTHONPATH}
export PYTHONPATH=${GASPICXX_INSTALLATION_PATH}/lib:${PYTHONPATH}

cmake -DENABLE_TESTING=ON ../

Now you can compile Tarantella and run its tests in the build directory:

make
ctest
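
The export lines in the configuration step prepend directories unconditionally. If a variable was empty beforehand, the result ends in a colon, and the dynamic linker treats such an empty LD_LIBRARY_PATH entry as the current directory. A small helper can avoid both the trailing colon and duplicate entries. This is a sketch: prepend_once is a hypothetical name, and the default path is the placeholder used earlier.

```shell
# Print path list $1 with directory $2 prepended, unless $2 is already listed.
# Avoids a trailing colon when $1 is empty.
prepend_once() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;
        "::")     printf '%s\n' "$2" ;;
        *)        printf '%s:%s\n' "$2" "$1" ;;
    esac
}

# Placeholder default from the steps above; replace with your actual path.
gpi2_lib="${GPI2_INSTALLATION_PATH:-/your/gpi2/installation/path}/lib64"
LD_LIBRARY_PATH="$(prepend_once "${LD_LIBRARY_PATH:-}" "$gpi2_lib")"
export LD_LIBRARY_PATH
echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

The same function works for PYTHONPATH and for the GaspiCxx library directory.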

[Optional] Building documentation

If you would like to build the documentation locally, run the following cmake command

cmake -DCMAKE_INSTALL_PREFIX=${TARANTELLA_INSTALLATION_PATH} -DBUILD_DOCS=ON ..

before compiling. This requires you to have Sphinx installed:

pip install -U sphinx