Homelab K8s with GPU Nodes
If you happen to have a computer with a GPU, I highly recommend incorporating it into your homelab! As with most things, setting things up can vary from environment to environment, so I have listed the components I am personally using. If any of them align with your setup, I hope the guide below can help with the installation.
Setup:
- K3s Cluster
- NVIDIA GPU (tested with 20 and 30 series cards)
- Ubuntu 20.04 LTS
After the GPU-enabled node has joined the cluster, I will provide the steps to deploy t-rex onto it, a miner that uses the GPU to mine cryptocurrency.
Setting Up The Node
Some work needs to be done on GPU-enabled nodes. Unlike most compute workloads on Kubernetes, containers that use NVIDIA GPU compute require the NVIDIA kernel driver and container runtime on the host in order to run. The commands below give an overview of how to prepare the node specifically for K3s.
# Install NVIDIA Drivers
apt search nvidia-driver # Identify the desired driver version
sudo apt install nvidia-headless-xxx-server # Replace xxx with the version identified above
distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
&& curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add - \
&& curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
sudo apt-get update \
&& sudo apt-get install -y nvidia-container-runtime
# K3s renders this template into containerd's config when the agent starts
sudo mkdir -p /var/lib/rancher/k3s/agent/etc/containerd
sudo wget https://k3d.io/v5.2.2/usage/advanced/cuda/config.toml.tmpl -O /var/lib/rancher/k3s/agent/etc/containerd/config.toml.tmpl
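A couple of quick sanity checks before moving on (paths and the service name assume a default K3s agent install):
# The new kernel modules usually need a reboot (or a modprobe nvidia) to load
sudo reboot
# Once the node is back up, confirm the driver is loaded
cat /proc/driver/nvidia/version
# Restart the agent if needed; K3s regenerates containerd's config from the template on startup
sudo systemctl restart k3s-agent
grep -i nvidia /var/lib/rancher/k3s/agent/etc/containerd/config.toml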
# Deploy the NVIDIA device plugin (run this on the control plane node)
sudo mkdir -p /var/lib/rancher/k3s/server/manifests
cat <<EOF | sudo tee /var/lib/rancher/k3s/server/manifests/nvidia-plugins.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChart
metadata:
  name: nvidia-device-plugin
  namespace: kube-system
spec:
  chart: nvidia-device-plugin
  repo: https://nvidia.github.io/k8s-device-plugin
  valuesContent: |-
    nodeSelector:
      gpu: enabled # Modify label selector accordingly for your cluster
EOF
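K3s automatically applies any manifest dropped into that directory. To confirm the chart was picked up (run from somewhere with kubectl access):
# The HelmChart resource should exist and a helm-install job should have run
kubectl -n kube-system get helmchart nvidia-device-plugin
kubectl -n kube-system get jobs | grep nvidia-device-plugin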
# On the GPU node, set a label matching the device plugin's nodeSelector (and an optional taint). Feel free to modify for your setup
sudo mkdir -p /etc/rancher/k3s
cat <<EOF | sudo tee /etc/rancher/k3s/config.yaml
node-taint:
- "nvidia.com/gpu=2060:NoSchedule"
node-label:
- "gpu=enabled"
- "gpu-type=nvidia"
EOF
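K3s only applies these flags when the node registers, so if the node has already joined you may need to add the label and taint with kubectl label node / kubectl taint node instead. Either way, you can verify everything fell into place:
# The label, the taint, and the advertised GPU should all show on the node
kubectl get nodes --show-labels | grep gpu=enabled
kubectl describe node <gpu-node-name> | grep -i nvidia.com/gpu
# The device plugin pod should now be scheduled onto the GPU node
kubectl -n kube-system get pods -o wide | grep nvidia-device-plugin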
The instructions above are based on the information found in these links:
- https://itnext.io/enabling-nvidia-gpus-on-k3s-for-cuda-workloads-a11b96f967b0
- https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/install-guide.html#install-guide
Now that the node is set up, let's work on deploying the crypto miner.
Creating the Container
The Dockerfile is simple: use an NVIDIA base image and copy the t-rex executable into it. The script the release provides will need to be modified slightly so that we can pass arguments to the container. This can be customized further to suit your preferences. For example, if you want to make the algorithm customizable, replace kawpow with ${2} and the algorithm becomes the second argument you pass into the container.
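As a sketch of that customization (the argument order here is my own choice), a wallet-plus-algorithm version of the script would look like:
#!/bin/sh
# ${1} = wallet address, ${2} = algorithm (e.g. kawpow)
./t-rex -a ${2} -o stratum+tcp://stratum.ravenminer.com:3838 -u ${1}.k8s --api-bind-http 0.0.0.0:4067 -p pps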
# Download the T-Rex release
wget https://github.com/trexminer/T-Rex/releases/download/0.24.8/t-rex-0.24.8-linux.tar.gz
tar xvf t-rex-0.24.8-linux.tar.gz
# Quote the heredoc delimiter so ${1} is written literally into the script
cat <<'EOF' | tee RVN-ravenminer.sh
#!/bin/sh
# Modify the script accordingly. This has been modified to expect an argument for the wallet address
./t-rex -a kawpow -o stratum+tcp://stratum.ravenminer.com:3838 -u ${1}.k8s --api-bind-http 0.0.0.0:4067 -p pps
EOF
# Create Dockerfile
cat <<EOF | tee Dockerfile
FROM nvidia/cuda:11.4.0-base-ubuntu20.04
COPY t-rex .
COPY RVN-ravenminer.sh .
RUN chmod +x t-rex RVN-ravenminer.sh
RUN apt-get update && apt-get install libnvidia-ml-dev -y
EXPOSE 4067
ENTRYPOINT [ "./RVN-ravenminer.sh" ]
EOF
# Build and tag the Docker image
docker build -t private.registry/miner/raven:0.0.0 .
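If you have Docker with the NVIDIA container toolkit wherever you build, you can give the image a quick test run before pushing it to your registry (the wallet address placeholder is yours to fill in):
# Optional local test; Ctrl-C to stop it
docker run --rm --gpus all private.registry/miner/raven:0.0.0 <your-wallet-address>
# Push the image so the cluster can pull it
docker push private.registry/miner/raven:0.0.0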
Creating the Helm Chart
The chart is adapted from the default chart Helm generates. The only essential difference from the original is that the Deployment has been changed to a StatefulSet. A StatefulSet is used because most GPUs cannot be split into fractions of their compute: with a Deployment's rolling update, the replacement pod is created while the old one is still holding the GPU, so the new pod can never schedule. A StatefulSet terminates the existing pod before creating its replacement, which lets the pods cycle.
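Whichever chart you use, the pod template ultimately needs three things: a toleration for the taint set earlier, a selector for the labeled node, and a request for the GPU resource the device plugin exposes. Roughly, the relevant fragment of the pod spec looks like this (illustrative values, not the chart's literal contents):
  nodeSelector:
    gpu: enabled
  tolerations:
    - key: nvidia.com/gpu
      operator: Exists
      effect: NoSchedule
  containers:
    - name: miner
      image: private.registry/miner/raven:0.0.0
      args: ["<your-wallet-address>"] # becomes ${1} in RVN-ravenminer.sh
      ports:
        - containerPort: 4067
      resources:
        limits:
          nvidia.com/gpu: 1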
The helm chart is linked here: https://github.com/sunshuu/miner-chart. To deploy it, simply modify it according to your environment and run helm upgrade -i raven-miner /path/to/helmchart -f your_customization.yaml.
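Once the release is up, the pod should land on the GPU node and start reporting a hashrate (the pod name below is an example):
kubectl get pods -o wide | grep raven
kubectl logs <raven-miner-pod-name> --tail=20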
Hopefully It Worked!
If your cluster has an Ingress configured, you should be able to view the T-Rex web page, a nice little dashboard that shows all the stats of your mining.
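If you don't have an Ingress, kubectl port-forward works too (the resource name is an example; use whatever your release created):
kubectl port-forward statefulset/raven-miner 4067:4067
# then open http://localhost:4067 in a browser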
