Installation

Before you can run workloads on your nodes, you need to install the fudgelet on each of them.

The fudgelet is a lightweight agent that runs as a background process on your node. It manages workloads, reports hardware metrics, and sends your application logs to the server.

It can be downloaded as a binary or installed via your preferred Linux package manager. We also offer quick setup scripts that streamline the installation, as well as a container image that can be used to quickly evaluate the platform.

Running inside a container

To quickly see Clusterfudge in action, without installing anything onto your machine, you can pull a fudgelet container image and run it locally.

Run the fudgelet container image

docker run -e CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE --gpus=all ghcr.io/clusterfudgeai/fudge/fudgelet:latest --interactive

This requires the NVIDIA container runtime to be configured on your machine; without this the fudgelet will not be able to detect your GPUs.
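
Before starting the container, it can help to confirm that Docker knows about an NVIDIA runtime. The following is a minimal sketch and not part of Clusterfudge: the helper name has_nvidia_runtime is hypothetical, and it assumes that a configured NVIDIA container runtime appears under the name "nvidia" in the runtime list reported by docker info.

```shell
# Hypothetical helper: reads a runtime listing on stdin and succeeds if
# an "nvidia" runtime is mentioned. Assumption: the NVIDIA container
# runtime registers itself with Docker under that name.
has_nvidia_runtime() {
  grep -qiw 'nvidia'
}

# Example (not run here):
#   docker info --format '{{json .Runtimes}}' | has_nvidia_runtime \
#     && echo "NVIDIA runtime found" \
#     || echo "No NVIDIA runtime listed; GPUs may not be detected"
```

If the check fails, the container may still start, but as noted above the fudgelet will not be able to detect your GPUs.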

Using a setup script

The quickest way to get started is via our get.clusterfudge.com setup scripts. These will grab the latest version of the fudgelet for you and initialise the fudgelet.toml file.

Downloading

If you'd just like to download the latest fudgelet version and have your fudgelet.toml config file initialised, use the download.sh script.

Download the fudgelet binary using curl

curl https://get.clusterfudge.com/download.sh | { export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE; sudo -E bash; }

Installing

If you'd prefer to install the fudgelet using the standard package manager for your distribution, you can use the install.sh script instead.

Install the fudgelet using a Linux package manager

curl https://get.clusterfudge.com/install.sh | { export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE; sudo -E bash; }

Running

If you'd just like to run the fudgelet as the current user, without systemd and without modifying your PATH, we offer a run.sh script as well.

Download and run the fudgelet as your current user

curl https://get.clusterfudge.com/run.sh | CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE bash
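
All three setup scripts above are piped straight from curl into bash. If you would rather read a script before executing it, one option is to save it to a file first. This is a sketch only: the helper name fetch_setup_script is hypothetical, and it assumes the scripts behave the same when run from a file as when piped.

```shell
# Hypothetical helper: download one of the setup scripts to a local file
# so it can be inspected before it is executed.
# Arguments: $1 = script name (e.g. run.sh), $2 = output path.
fetch_setup_script() {
  curl -fsSL -o "$2" "https://get.clusterfudge.com/$1"
}

# Example (not run here):
#   fetch_setup_script run.sh ./run.sh
#   less ./run.sh
#   CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE bash ./run.sh
```

The -f flag makes curl exit with an error on an HTTP failure instead of saving an error page as the script.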

Setup script configuration

When you use one of our setup scripts, you can modify the behaviour of the fudgelet through environment variables. These variables update the values written to the fudgelet.toml file, so you can enable or disable specific features.

The options are:

Setup script options

export HARDWARE_ONLY=true
export SLURM_ONLY=true
export ENABLED_CLUSTERFUDGE_LAUNCHES=true

Here is an example of setting these values when using the install.sh script:

Example of setting installer script options

curl https://get.clusterfudge.com/install.sh | { export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE; export SLURM_ONLY=true; sudo -E bash; }
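
Because the scripts take their input from the environment, an unset API key variable is passed along as an empty value. A small guard can catch that before anything is downloaded. This is a sketch under our own assumptions: require_api_key is a hypothetical helper, not something the setup scripts provide.

```shell
# Hypothetical guard: fail early with a clear message if
# CLUSTERFUDGE_API_KEY is empty or unset.
require_api_key() {
  if [ -z "${CLUSTERFUDGE_API_KEY:-}" ]; then
    echo "CLUSTERFUDGE_API_KEY is not set" >&2
    return 1
  fi
}

# Example (not run here):
#   export CLUSTERFUDGE_API_KEY=$YOUR_API_KEY_HERE
#   export SLURM_ONLY=true
#   require_api_key && curl https://get.clusterfudge.com/install.sh | sudo -E bash
```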

Custom Install

If you want more control over how the fudgelet is installed, you can download the latest packages directly from our package repository.

Download a fudgelet version

curl -fsSL -o $OUT https://storage.googleapis.com/clusterfudge-releases/fudgelet/$OS/$ARCH/latest/fudgelet
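
The command above expects $OUT, $OS and $ARCH to be set. As an illustration, the URL can be assembled by a small helper. The helper name fudgelet_url is hypothetical, and the values "linux" and "amd64" in the example are assumptions, since the accepted OS and architecture names are not listed here.

```shell
# Hypothetical helper: build the download URL for the latest fudgelet
# binary. Arguments: $1 = OS, $2 = architecture.
fudgelet_url() {
  printf 'https://storage.googleapis.com/clusterfudge-releases/fudgelet/%s/%s/latest/fudgelet\n' "$1" "$2"
}

# Example (not run here; "linux" and "amd64" are assumed values):
#   OUT=./fudgelet
#   curl -fsSL -o "$OUT" "$(fudgelet_url linux amd64)"
#   chmod +x "$OUT"
```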

Apt repo

# Writing to /usr/share/keyrings requires root, so gpg and chmod run under sudo.
curl https://europe-west2-apt.pkg.dev/doc/repo-signing-key.gpg | sudo gpg --dearmor -o /usr/share/keyrings/google-ar-archive-keyring.gpg
sudo chmod 0644 /usr/share/keyrings/google-ar-archive-keyring.gpg

echo "deb [signed-by=/usr/share/keyrings/google-ar-archive-keyring.gpg] https://europe-west2-apt.pkg.dev/projects/clusterfudge-images fudgelet main" | sudo tee -a /etc/apt/sources.list.d/artifact-registry.list

Yum repo

sudo tee -a /etc/yum.repos.d/artifact-registry.repo << EOL
[fudgelet]
name=fudgelet
baseurl=https://europe-west2-yum.pkg.dev/projects/clusterfudge-images/fudgelet-rpm
enabled=1
repo_gpgcheck=0
type=rpm
gpgcheck=0
metadata_expire=6h
EOL

# Refresh the metadata cache so the new repository is visible, then install.
sudo yum makecache
sudo yum install -y fudgelet
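
If you are scripting a custom install across mixed machines, you may want to choose between the Apt and Yum instructions automatically. The following is a minimal sketch; detect_pkg_manager is a hypothetical helper that only checks which tool is on the PATH.

```shell
# Hypothetical helper: print "apt", "yum" or "none" depending on which
# package manager command is available on this machine.
detect_pkg_manager() {
  if command -v apt-get >/dev/null 2>&1; then
    echo apt
  elif command -v yum >/dev/null 2>&1; then
    echo yum
  else
    echo none
  fi
}

# Example:
#   case "$(detect_pkg_manager)" in
#     apt) echo "Follow the Apt repo steps" ;;
#     yum) echo "Follow the Yum repo steps" ;;
#     *)   echo "Download the binary instead" ;;
#   esac
```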

Configuration

The fudgelet is configured via a fudgelet.toml file, which it looks for at /etc/clusterfudge/fudgelet.toml by default.

The fudgelet reads your API key from this file.

Required fudgelet.toml configuration items

api_key = "$YOUR_API_KEY_HERE"
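
If you are not using a setup script, you can create this file yourself. The sketch below writes a minimal config containing only the required entry. The helper name write_fudgelet_config is hypothetical, and restricting the file to its owner is our suggestion on the assumption that the fudgelet runs as the user who owns the file.

```shell
# Hypothetical helper: write a minimal fudgelet.toml with only api_key.
# Arguments: $1 = config path, $2 = API key.
write_fudgelet_config() {
  mkdir -p "$(dirname "$1")"
  # The file holds a secret. With umask 077, a newly created file is
  # readable and writable by its owner only.
  ( umask 077 && printf 'api_key = "%s"\n' "$2" > "$1" )
}

# Example (not run here):
#   write_fudgelet_config ./fudgelet.toml "$YOUR_API_KEY_HERE"
#   sudo install -D -m 600 ./fudgelet.toml /etc/clusterfudge/fudgelet.toml
```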

The following values are optional:

Optional fudgelet.toml configuration items

verbosity = "trace|debug|info|warn|error"
log_file_path = "..."
log_file_directory = "..."
hostname_override = "..."
git_known_hosts_file_path = "..."
git_private_ssh_key_file_path = "..."
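
The verbosity entry above lists its accepted levels separated by "|". We read that as "pick exactly one", so a concrete line would look like verbosity = "debug". If you generate the config from a script, a check like the following sketch can reject anything else. The helper name is_valid_verbosity is hypothetical.

```shell
# Hypothetical check: succeed only if the argument is one of the five
# verbosity levels listed in the optional configuration items.
is_valid_verbosity() {
  case "$1" in
    trace|debug|info|warn|error) return 0 ;;
    *) return 1 ;;
  esac
}

# Example:
#   level=debug
#   is_valid_verbosity "$level" && printf 'verbosity = "%s"\n' "$level"
```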