How Can Distributed Storage (Longhorn) Be Configured on EKS?

adil · Feb 2, 2024

A Kubernetes cluster may use distributed storage for a variety of purposes.


The first things that come to mind are disaster recovery, scalability, and high availability.

These are baseline requirements. Another thing I need is simplicity.

For Kubernetes, there is a distributed block storage system available: Longhorn.

Longhorn Architecture

Image source: longhorn.io

How to Install Longhorn on EKS?

On Longhorn's official website, the installation instructions are clearly laid out.

The open-iscsi package is a prerequisite for installing Longhorn on a Kubernetes node. However, the default EKS nodes do not have the open-iscsi package installed.

A DaemonSet is required in order to install the package on the host OS, since a DaemonSet runs a container on every Kubernetes node.

00-iscsi-dependency.yaml

---
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: longhorn-os-dependency
spec:
  selector:
    matchLabels:
      app: longhorn-os-dependency
  template:
    metadata:
      labels:
        app: longhorn-os-dependency
    spec:
      hostNetwork: true
      # hostPID is needed so nsenter can reach the host's namespaces via /proc/1
      hostPID: true
      initContainers:
        - name: dependency-install
          image: alpine:latest
          command: ["/bin/sh"]
          # enter the host's mount namespace and install the iSCSI initiator
          # (the open-iscsi package is named iscsi-initiator-utils on Amazon Linux)
          args: ["-c", "nsenter --mount=/proc/1/ns/mnt -- sh -c 'yum -y install iscsi-initiator-utils'"]
          securityContext:
            privileged: true
      containers:
        # a pause container keeps the pod alive after the init container finishes
        - name: pause
          image: public.ecr.aws/eks-distro/kubernetes/pause:3.8

This DaemonSet runs with privileged access rights so that it can install the OS package on the node.

Apply:

kubectl apply -f 00-iscsi-dependency.yaml

Let’s check the logs:
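The init container's output can be inspected with a command like this (assuming the DaemonSet was applied to the default namespace):

kubectl logs -l app=longhorn-os-dependency -c dependency-install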

The required package has been installed.

Now let’s set up Longhorn:
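If the Longhorn chart repository has not been added yet:

helm repo add longhorn https://charts.longhorn.io
helm repo update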

helm install longhorn longhorn/longhorn --namespace longhorn-system --create-namespace --version 1.5.3

After installation, the following services and containers ought to be visible to you:
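They can be listed with:

kubectl get pods,svc -n longhorn-system

Among the pods you should see components such as longhorn-manager, longhorn-ui, and the CSI driver pods.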

How to access Longhorn’s UI?

To access the service longhorn-frontend, we’ll create a tunnel:

kubectl port-forward service/longhorn-frontend -n longhorn-system 8181:80

The user interface should be accessible at localhost:8181

Create a Persistent Volume Claim

(See: Terminology Confusion in Kubernetes: StorageClass, PersistentVolume, PersistentVolumeClaim)

01-pvc.yaml

---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: longhorn-001-pvc
spec:
  accessModes:
    - ReadWriteMany
  # explicitly request the StorageClass created by the Longhorn Helm chart,
  # so the claim does not fall back to the EKS default (gp2)
  storageClassName: longhorn
  resources:
    requests:
      storage: 1Gi

Apply:

➜  ~ kubectl apply -f 01-pvc.yaml
persistentvolumeclaim/longhorn-001-pvc created

By default, Longhorn creates three replicas of a volume.

On the UI, three copies of the longhorn-001-pvc are visible:

Their status is Stopped, since no pod has the volume attached yet.
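The same information is available from the CLI, since Longhorn exposes its objects as custom resources:

kubectl get replicas.longhorn.io -n longhorn-system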

Create a Deployment with Volume

02-deployment.yaml

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - image: nginx
          name: nginx
          volumeMounts:
            - name: 001-pvc
              mountPath: /data/example
      volumes:
        - name: 001-pvc
          persistentVolumeClaim:
            claimName: longhorn-001-pvc

Apply:

➜  ~ kubectl apply -f 02-deployment.yaml
deployment.apps/nginx created

The status of the replicas has changed to Running, now that the volume is attached.

Let’s test disaster recovery!

I have four Kubernetes nodes up and running. The replicas are running on three nodes. I’ll take down one of those nodes and see how Longhorn responds.

Current state:

I’ll take down this node: 192.168.31.59

I chose this node because it hosts both an Nginx container and a volume replica.

Let’s see what’s happening:
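First, from the CLI:

kubectl get nodes
kubectl get pods -o wide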

The status of the node is NotReady. The pod is inaccessible even though it appears to be running.

Longhorn UI:

Another replica was created on the fourth node after around ten minutes:
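This delay is not arbitrary: it appears to correspond to Longhorn's replica-replenishment-wait-interval setting, which defaults to 600 seconds. It can be inspected (and tuned) through the settings custom resource:

kubectl get settings.longhorn.io replica-replenishment-wait-interval -n longhorn-system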

The view is too big to fit on one screen, but you can contrast the three blue partitions with the previous screenshot.

The Nginx container is recreated on another node.

Pro Tip:

Longhorn uses NFS to serve ReadWriteMany volumes. When you create a volume in ReadWriteMany mode, a share-manager pod that exports the volume over NFS is created in the longhorn-system namespace.

Example:
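The share-manager pod can be located with a command like the one below; the exact pod name includes the volume's generated ID:

kubectl get pods -n longhorn-system | grep share-manager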

This pod should always be running. If it crashes for some reason, either delete your workload pod or scale the deployment down to 0 (zero). After that, make sure the volume is detached; the volume’s current status is shown on the Longhorn UI.

Next, recreate your pod or scale the deployment back up to one (or any desired number). A new NFS connection will then be established.
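For the Nginx deployment above, the recovery sequence would look roughly like this:

kubectl scale deployment nginx --replicas=0
# wait until the Longhorn UI shows the volume as Detached
kubectl scale deployment nginx --replicas=1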
