Installation
Tarantella needs to be built from source. Since Tarantella is built on top of TensorFlow, you will require a recent version of it. Additionally, you will need an installation of the open-source communication libraries GaspiCxx and GPI-2, which Tarantella uses to implement distributed training.
Lastly, you will need pybind11, which is required for interoperability between Python and C++ code.
In the following we will look at the required steps in detail.
Installing dependencies
Compiler and build system
Tarantella can be built using a recent gcc compiler with support for C++17 (starting with gcc 7.4.0).
You will also need the build tool CMake (version 3.12 or later).
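Before continuing, it can help to confirm that the installed tools meet these minimums. The following is a minimal sketch, not part of Tarantella: the helper names version_ge and report are hypothetical, and the comparison assumes a sort that supports -V (as in GNU coreutils).

```shell
# version_ge A B: succeeds when version A >= version B.
# Relies on `sort -V` (version sort), available in GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# report NAME FOUND MINIMUM: print whether FOUND satisfies MINIMUM.
report() {
    if version_ge "$2" "$3"; then
        echo "$1 $2: OK (>= $3)"
    else
        echo "$1 $2: too old (need >= $3)"
    fi
}

# Minimum versions as stated in this section.
if command -v gcc >/dev/null 2>&1; then
    report gcc "$(gcc -dumpfullversion -dumpversion 2>/dev/null)" 7.4.0
else
    echo "gcc not found on PATH"
fi

if command -v cmake >/dev/null 2>&1; then
    # The first line of `cmake --version` reads "cmake version X.Y.Z".
    report cmake "$(cmake --version | head -n 1 | awk '{print $3}')" 3.12
else
    echo "cmake not found on PATH"
fi
```

If either tool is reported as too old or missing, install a newer version before moving on to the remaining dependencies.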
Installing TensorFlow
First you will need to install TensorFlow.
Tarantella supports TensorFlow versions 2.0 to 2.7 (some features are only available in versions above 2.2).
Any of these versions can be installed in a conda environment using pip, as recommended on the TensorFlow website.
In order to do that, first install conda on your system. Then, create and activate an environment for Tarantella:
conda create -n tarantella
conda activate tarantella
Now, you can install the latest supported TensorFlow version with:
conda install python=3.9
pip install --upgrade tensorflow==2.7.*
Tarantella requires at least Python 3.7. Make sure the selected version also matches the TensorFlow requirements.
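To confirm that the active environment ended up with a supported TensorFlow release, a quick check along the following lines may be useful. This is only a sketch: the helper name tf_version_supported is hypothetical, and the accepted range is simply the 2.0 to 2.7 range stated above.

```shell
# tf_version_supported VERSION: succeeds for TensorFlow 2.0.x through 2.7.x.
tf_version_supported() {
    case "$1" in
        2.[0-7]|2.[0-7].*) return 0 ;;
        *) return 1 ;;
    esac
}

# Query the active environment; the result is empty if TensorFlow
# (or python itself) is not available.
installed_tf=$(python -c 'import tensorflow as tf; print(tf.__version__)' 2>/dev/null || true)

if [ -z "$installed_tf" ]; then
    echo "TensorFlow is not importable in this environment"
elif tf_version_supported "$installed_tf"; then
    echo "TensorFlow $installed_tf is within the supported range"
else
    echo "TensorFlow $installed_tf is outside the supported range 2.0 to 2.7"
fi
```

Run it with the tarantella conda environment activated, so that python refers to the interpreter installed above.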
Installing pybind11
The next dependency you will need to install is pybind11, which is available through pip and conda.
We recommend installing pybind11 via conda:
conda install pybind11 -c conda-forge
Installing GPI-2
Next, you will need to download, compile and install the GPI-2 library. GPI-2 is an API for high-performance, asynchronous communication for large scale applications, based on the GASPI (Global Address Space Programming Interface) standard.
The currently supported versions are v1.4-1.5, which need to be built with position-independent code flags (-fPIC).
To download the required version, clone the GPI-2 git repository and check out the latest supported tag:
git clone https://github.com/cc-hpc-itwm/GPI-2.git
cd GPI-2
git fetch --tags
git checkout -b v1.5.1 v1.5.1
Now, use autotools to configure and compile the code:
./autogen.sh
export GPI2_INSTALLATION_PATH=/your/gpi2/installation/path
CFLAGS="-fPIC" CPPFLAGS="-fPIC" ./configure --with-ethernet --prefix=${GPI2_INSTALLATION_PATH}
make
where ${GPI2_INSTALLATION_PATH} needs to be set to the path where you want to install GPI-2. Note the --with-ethernet option, which will use standard TCP sockets for communication.
This is the correct option for laptops and workstations.
In case you want to use Infiniband, replace the above option with --with-infiniband.
Now you are ready to install GPI-2 with:
make install
export PATH=${GPI2_INSTALLATION_PATH}/bin:$PATH
where the last command adds the GPI-2 executables to your PATH, making them visible to your system.
If required, GPI-2 can be removed from the target directory by using make uninstall.
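A quick way to confirm that the installation is visible is to look for one of the GPI-2 executables on the PATH. The sketch below assumes that GPI-2 installs its launcher as gaspi_run under ${GPI2_INSTALLATION_PATH}/bin; the helper name require_tool is hypothetical.

```shell
# require_tool NAME: print where NAME is found, or fail with a message.
require_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 found at: $(command -v "$1")"
    else
        echo "$1 not found on PATH" >&2
        return 1
    fi
}

# Assumption: the GPI-2 launcher is called gaspi_run.
require_tool gaspi_run ||
    echo "check that \${GPI2_INSTALLATION_PATH}/bin was added to PATH"
```

If the tool is not found, re-run the export PATH command above in the current shell and try again.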
Installing GaspiCxx
GaspiCxx is a C++ abstraction layer built on top of the GPI-2 library, designed to provide easy-to-use point-to-point and collective communication primitives. Tarantella’s communication layer is based on GaspiCxx and its PyGPI API for Python. Currently we support GaspiCxx version v1.1.0.
To install GaspiCxx and PyGPI, first download the latest release from the git repository:
git clone https://github.com/cc-hpc-itwm/GaspiCxx.git
cd GaspiCxx
git fetch --tags
git checkout -b v1.1.0 v1.1.0
GaspiCxx requires an already installed version of GPI-2, which should be detected at configuration time (as long as ${GPI2_INSTALLATION_PATH}/bin is added to the current ${PATH} as shown above).
Compile and install the library as follows, making sure the previously created conda environment is activated:
conda activate tarantella
mkdir build && cd build
export GASPICXX_INSTALLATION_PATH=/your/gaspicxx/installation/path
cmake -DBUILD_PYTHON_BINDINGS=ON \
-DBUILD_SHARED_LIBS=ON \
-DCMAKE_INSTALL_PREFIX=${GASPICXX_INSTALLATION_PATH} ../
make install
where ${GASPICXX_INSTALLATION_PATH} needs to be set to the path where you want to install the library.
SSH key-based authentication
In order to use Tarantella on a cluster, make sure you can ssh between nodes without a password. For details, refer to the FAQ section.
In particular, to test Tarantella on your local machine, make sure you can ssh to localhost without a password.
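The requirement above can be checked non-interactively. The following sketch uses only standard OpenSSH options; the function names are hypothetical, and the FAQ remains the reference for the actual setup. Note that setup_local_key is only defined here, not executed, because it modifies files under ~/.ssh.

```shell
# can_ssh_without_password HOST: succeeds if a non-interactive login works.
# BatchMode=yes makes ssh fail instead of prompting for a password.
# A host whose key has not been accepted yet will also be reported as failing.
can_ssh_without_password() {
    ssh -o BatchMode=yes -o ConnectTimeout=5 "$1" true >/dev/null 2>&1
}

# setup_local_key: create a key pair without passphrase (if none exists)
# and authorize it for logins to this machine. Call it manually if needed.
setup_local_key() {
    mkdir -p "$HOME/.ssh" && chmod 700 "$HOME/.ssh"
    [ -f "$HOME/.ssh/id_ed25519" ] ||
        ssh-keygen -t ed25519 -N "" -f "$HOME/.ssh/id_ed25519"
    cat "$HOME/.ssh/id_ed25519.pub" >> "$HOME/.ssh/authorized_keys"
    chmod 600 "$HOME/.ssh/authorized_keys"
}

if can_ssh_without_password localhost; then
    echo "passwordless ssh to localhost works"
else
    echo "passwordless ssh to localhost is not set up"
fi
```

On a cluster, the same check can be repeated for each node by passing its hostname to can_ssh_without_password.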
Building Tarantella from source
With all dependencies installed, we can now download, configure and compile Tarantella. To download the source code, simply clone the GitHub repository:
git clone https://github.com/cc-hpc-itwm/tarantella.git
cd tarantella
git checkout tags/v0.8.0 -b v0.8.0
Next, we need to configure the build system using CMake.
For a standard out-of-source build, we create a separate build folder and run cmake in it:
conda activate tarantella
cd tarantella
mkdir build && cd build
export TARANTELLA_INSTALLATION_PATH=/your/installation/path
cmake -DCMAKE_INSTALL_PREFIX=${TARANTELLA_INSTALLATION_PATH} \
-DCMAKE_PREFIX_PATH=${GASPICXX_INSTALLATION_PATH} ../
This will configure your installation to use the previously installed GPI-2 and GaspiCxx libraries. To install Tarantella on a cluster equipped with Infiniband capabilities, make sure that GPI-2 is installed with Infiniband support, as described in the Installing GPI-2 section above.
Now, we can compile and install Tarantella to TARANTELLA_INSTALLATION_PATH:
make
make install
export PATH=${TARANTELLA_INSTALLATION_PATH}/bin:${PATH}
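The exported variables only last for the current shell session, while later steps of this guide refer to them again. One option is to record them in a small file that can be sourced from a new shell. The sketch below is an illustration only: the function name write_env_file and any file name passed to it are hypothetical, not something Tarantella provides.

```shell
# write_env_file FILE: record the installation paths used in this guide
# so that a new shell can restore them with `source FILE`.
# The installation paths are expanded when the file is written; the PATH
# line is kept unexpanded so it is evaluated when the file is sourced.
write_env_file() {
    cat > "$1" <<EOF
export GPI2_INSTALLATION_PATH="${GPI2_INSTALLATION_PATH:-}"
export GASPICXX_INSTALLATION_PATH="${GASPICXX_INSTALLATION_PATH:-}"
export TARANTELLA_INSTALLATION_PATH="${TARANTELLA_INSTALLATION_PATH:-}"
export PATH="\${GPI2_INSTALLATION_PATH}/bin:\${TARANTELLA_INSTALLATION_PATH}/bin:\${PATH}"
EOF
}
```

For example, after calling write_env_file with a file name of your choice in the shell used for the build, a fresh shell can run conda activate tarantella, source that file, and continue with the optional steps below.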
[Optional] Building and running tests
In order to build Tarantella with tests, you will also need to install Boost (for C++ tests), and pytest (for Python tests). Additionally, the PyYAML and NetworkX libraries are required by some tests.
To install Boost with the required development packages on Ubuntu, you can use
sudo apt install libboost-all-dev
while in Fedora you can use
sudo dnf install boost boost-devel
The other dependencies can be installed in the existing conda environment:
pip install -U pytest
pip install PyYAML==3.13
conda install networkx
After having installed these libraries, make sure to configure Tarantella with testing switched on:
cd tarantella
mkdir -p build && cd build
export LD_LIBRARY_PATH=`pwd`:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${GPI2_INSTALLATION_PATH}/lib64:${LD_LIBRARY_PATH}
export LD_LIBRARY_PATH=${GASPICXX_INSTALLATION_PATH}/lib:${LD_LIBRARY_PATH}
export PYTHONPATH=`pwd`:${PYTHONPATH}
export PYTHONPATH=${GASPICXX_INSTALLATION_PATH}/lib:${PYTHONPATH}
cmake -DENABLE_TESTING=ON ../
Now you can compile Tarantella and run its tests in the build directory:
make
ctest
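When a test fails, plain ctest only prints a summary. The sketch below wraps ctest with its standard --output-on-failure option, which prints the output of failing tests, and refuses to run outside a configured build directory. The function name run_tarantella_tests is hypothetical.

```shell
# run_tarantella_tests [CTEST_ARGS...]: run ctest only inside a configured
# build directory, i.e. one where CMake has generated CTestTestfile.cmake.
run_tarantella_tests() {
    if [ ! -f CTestTestfile.cmake ]; then
        echo "no CTestTestfile.cmake here; run this from the build directory" >&2
        return 1
    fi
    # --output-on-failure prints the captured output of failing tests.
    ctest --output-on-failure "$@"
}
```

Additional arguments are passed through to ctest, so run_tarantella_tests -N lists the available tests without running them, and run_tarantella_tests -R followed by a regular expression runs only the tests whose names match.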
[Optional] Building documentation
If you would like to build the documentation locally, run the following cmake command
cmake -DCMAKE_INSTALL_PREFIX=${TARANTELLA_INSTALLATION_PATH} -DBUILD_DOCS=ON ..
before compiling. This requires you to have Sphinx installed:
pip install -U sphinx