Volume Backup, Delete and Restore
This tutorial introduces backing up the data on a volume and restoring it. It demonstrates how to back up the data to a gzip-compressed tar archive, delete the data from the volume, and restore it from the archive.
Before you begin
Before you begin this tutorial, you should be familiar with the following Kubernetes concepts:
- Pods
- PersistentVolumes
- The kubectl command line tool
You also need a Kubernetes cluster that has only one Node, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a single-node cluster, you can create one by using Minikube.
In addition, familiarize yourself with the material in Persistent Volumes.
Objectives
After this tutorial, you will be familiar with the following:
- Understanding the importance of volume backups
- Backing up Kubernetes volumes
- Deleting and recreating volumes
- Restoring applications with backed-up data
- Practical workflow for backup and restore
Configure Persistent Volume and Persistent Volume Claim
Use the existing YAML files from the Configure a Pod to Use a PersistentVolume for Storage task to configure a PersistentVolume (PV) and a PersistentVolumeClaim (PVC). Follow that task from the beginning so that you also create the hostPath directory and the index.html file.
View information about the PersistentVolume:
kubectl get pv task-pv-volume
The output shows that the PersistentVolume has a STATUS of Available. This means it has not yet been bound to a PersistentVolumeClaim.
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS      CLAIM     STORAGECLASS   REASON    AGE
task-pv-volume   10Gi       RWO           Retain          Available             manual                   4s
After you create the PersistentVolumeClaim, the Kubernetes control plane looks for a PersistentVolume that satisfies the claim's requirements. If the control plane finds a suitable PersistentVolume with the same StorageClass, it binds the claim to the volume.
Look again at the PersistentVolume:
kubectl get pv task-pv-volume
Now the output shows a STATUS of Bound.
NAME             CAPACITY   ACCESSMODES   RECLAIMPOLICY   STATUS    CLAIM                   STORAGECLASS   REASON    AGE
task-pv-volume   10Gi       RWO           Retain          Bound     default/task-pv-claim   manual                   2m
Look at the PersistentVolumeClaim:
kubectl get pvc task-pv-claim
The output shows that the PersistentVolumeClaim is bound to your PersistentVolume, task-pv-volume.
NAME            STATUS    VOLUME           CAPACITY   ACCESSMODES   STORAGECLASS   AGE
task-pv-claim   Bound     task-pv-volume   10Gi       RWO           manual         30s
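The STATUS column can also be read from a script instead of by eye. The following is a minimal sketch, assuming the object names used in this tutorial; the helper names pv_phase and pv_is_bound are hypothetical, while -o jsonpath and the .status.phase field are standard kubectl and Kubernetes API features.

```shell
#!/usr/bin/env bash
# Print the phase (Available, Bound, Released, or Failed) of a PersistentVolume.
# pv_phase is a hypothetical helper; the default name is the one used in this tutorial.
pv_phase() {
  local pv="${1:-task-pv-volume}"
  kubectl get pv "$pv" -o jsonpath='{.status.phase}'
}

# Succeed only when the PersistentVolume is bound to a claim.
pv_is_bound() {
  [ "$(pv_phase "${1:-}")" = "Bound" ]
}
```

For example, `pv_is_bound task-pv-volume && echo "bound"` prints a message only once the claim has been bound.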
Create a Pod
The next step is to create a Pod that uses your PersistentVolumeClaim as a volume.
Here is the configuration file for the Pod:
apiVersion: v1
kind: Pod
metadata:
  name: pv-pod-backup
spec:
  volumes:
    - name: task-pv-storage
      persistentVolumeClaim:
        claimName: task-pv-claim
  initContainers:
    - name: init-volume
      image: httpd:2.4
      command: ["/bin/bash", "-c"]
      args:
        - |
          echo "Initializing the mounted volume...";
          echo "Initialized at: $(date)" >> /usr/local/apache2/htdocs/index.html;
      volumeMounts:
        - mountPath: "/mnt/data"
          name: task-pv-storage
  containers:
    - name: task-pv-container
      image: httpd:2.4
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/usr/local/apache2/htdocs"
          name: task-pv-storage
Notice that the Pod's configuration file specifies a PersistentVolumeClaim, but it does not specify a PersistentVolume. From the Pod's point of view, the claim is a volume.
Create the Pod:
kubectl apply -f https://k8s.io/examples/pods/storage/pv-pod-backup.yaml
Verify that the container in the Pod is running:
kubectl get pod pv-pod-backup
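Instead of re-running the command until the Pod shows Running, a script can block until the Pod is ready. This is a sketch only: wait_for_pod is a hypothetical helper name and the 120s default timeout is an arbitrary assumption; kubectl wait with --for=condition=Ready is a standard kubectl subcommand.

```shell
#!/usr/bin/env bash
# Block until the Pod reports the Ready condition, or fail after a timeout.
# wait_for_pod is a hypothetical helper; the 120s default is an arbitrary choice.
wait_for_pod() {
  local pod="${1:-pv-pod-backup}"
  local timeout="${2:-120s}"
  kubectl wait --for=condition=Ready "pod/$pod" --timeout="$timeout"
}
```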
Get a shell to the container running in your Pod:
kubectl exec -it pv-pod-backup -- /bin/bash
In your shell, verify that httpd is serving the index.html file from the hostPath volume:
# Be sure to run these 3 commands inside the root shell that comes from
# running "kubectl exec" in the previous step
apt update
apt install curl
curl http://localhost/
The output shows the text that you wrote to the index.html file on the hostPath volume:
Hello from Kubernetes storage
If you see that message, you have successfully configured a Pod to use storage from a PersistentVolumeClaim.
Backup, Delete and Restore Volumes
Create Backup
To create a backup, open a shell in the pod, change to the directory where the volume is mounted (/usr/local/apache2/htdocs), and compress its contents into a .tar.gz file in the pod's temporary directory (/tmp). After you exit the pod shell, copy the backup file to your local machine with kubectl cp, so that a copy of the data exists outside the pod and can be used for restoration.
Access the Pod: To start the backup process, first, access the pod that has the volume you want to back up:
kubectl exec -it pv-pod-backup -- /bin/bash
This command opens an interactive shell inside the specified pod (pv-pod-backup).
Navigate to the Mounted Volume Directory: Once inside the pod, navigate to the directory where the volume is mounted:
cd /usr/local/apache2/htdocs
This directory contains the data you want to back up.
Create a Compressed Backup File (gzip): Use the tar command to create a compressed backup file of the directory's contents:
tar -czvf /tmp/volume-backup.tar.gz .
Exit the pod: After creating the backup file, exit the pod:
exit
Copy the Backup to Local Machine: Use kubectl cp to copy the backup file from the pod to your local system:
kubectl cp pv-pod-backup:/tmp/volume-backup.tar.gz ./volume-backup.tar.gz
This saves the backup file volume-backup.tar.gz in your current local directory.
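The same backup can be taken without an interactive shell. The sketch below assumes the pod name and paths used in this tutorial and that tar is available in the container (kubectl cp relies on it as well); backup_volume is a hypothetical helper name, not part of kubectl.

```shell
#!/usr/bin/env bash
# Back up a directory from a running pod without opening an interactive shell.
# backup_volume is a hypothetical helper; the defaults match this tutorial.
backup_volume() {
  local pod="${1:-pv-pod-backup}"
  local dir="${2:-/usr/local/apache2/htdocs}"
  local out="${3:-./volume-backup.tar.gz}"
  local remote="/tmp/volume-backup.tar.gz"

  # Create the archive inside the pod. -C changes into the directory first,
  # so the archive holds relative paths such as ./index.html.
  kubectl exec "$pod" -- tar -czf "$remote" -C "$dir" . || return 1

  # Copy the archive from the pod to the local machine.
  kubectl cp "$pod:$remote" "$out" || return 1

  # List the archive contents locally as a basic sanity check.
  tar -tzf "$out"
}
```

Calling `backup_volume` with no arguments uses the tutorial's defaults; a failing exec or cp step stops the function before the listing.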
Delete the Data from the Volume
To see why the backup is useful, delete the data from the volume. Open a shell in the pod, change to the mounted volume directory (/usr/local/apache2/htdocs), delete all files and subdirectories in it with rm -rf *, and confirm the deletion by listing the directory contents with ls -l. Exit the pod shell once you have confirmed that the directory is empty.
Access the Pod Again: To delete the data from the volume, access the pod that uses the volume:
kubectl exec -it pv-pod-backup -- /bin/bash
Navigate to the Mounted Volume Directory: Inside the pod, go to the directory where the volume is mounted:
cd /usr/local/apache2/htdocs
Delete All Files: Delete all files and subdirectories in the volume:
rm -rf *
Verify the Deletion: List the contents of the directory to confirm if it is empty:
ls -l
If the deletion was successful, no files or directories are listed (ls -l may still print a summary line such as "total 0").
Exit the Pod: After verifying the deletion, exit the pod:
exit
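One caveat: by default the * glob in bash does not match names that begin with a dot, so rm -rf * leaves hidden files in place. The following sketch demonstrates this in a throwaway local directory (not in the pod) and shows find with -mindepth 1 -delete as one possible alternative; the file names are made up for illustration.

```shell
#!/usr/bin/env bash
# Demonstrate in a scratch directory that `rm -rf *` skips dotfiles.
scratch="$(mktemp -d)"
cd "$scratch" || exit 1

touch index.html .htaccess      # illustrative file names
mkdir assets

rm -rf *                        # removes index.html and assets, but not .htaccess
after_glob="$(ls -A)"
echo "left after rm -rf *: $after_glob"

find . -mindepth 1 -delete      # removes everything below the current directory
after_find="$(ls -A)"
echo "left after find -delete: ${after_find:-<nothing>}"
```

For the tutorial's volume, which holds only index.html, rm -rf * is sufficient; the distinction matters for volumes that contain hidden files.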
Restore the Backup Data
To restore the data, copy the backup file (volume-backup.tar.gz) from the local machine to the pod's temporary directory with kubectl cp. Then open a shell in the pod, change to the mounted volume directory (/usr/local/apache2/htdocs), and extract the backup with tar. Finally, verify the restored files with ls -l and exit the pod shell.
Copy the Backup File to the Pod: Transfer the backup file from your local machine back to the pod:
kubectl cp ./volume-backup.tar.gz pv-pod-backup:/tmp/volume-backup.tar.gz
This copies the backup file to the /tmp directory of the pv-pod-backup pod.
Access the Pod: Open an interactive shell inside the pod:
kubectl exec -it pv-pod-backup -- /bin/bash
Extract the Backup: Navigate to the mounted volume directory and extract the backup:
cd /usr/local/apache2/htdocs
tar -xzvf /tmp/volume-backup.tar.gz
Verify the Restoration: List the contents of the directory to confirm the files have been restored:
ls -l
If successful, you should see all the files and directories from the backup.
Exit the Pod:
exit
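The restore steps can also be run non-interactively. This is a sketch under the same assumptions as the rest of the tutorial (pod name, archive name, mount path); restore_volume is a hypothetical helper name, and the archive check before copying is an added precaution rather than a required step.

```shell
#!/usr/bin/env bash
# Restore an archive into a directory inside a running pod, non-interactively.
# restore_volume is a hypothetical helper; the defaults match this tutorial.
restore_volume() {
  local pod="${1:-pv-pod-backup}"
  local archive="${2:-./volume-backup.tar.gz}"
  local dir="${3:-/usr/local/apache2/htdocs}"
  local remote="/tmp/volume-backup.tar.gz"

  # Stop early if the local archive is missing or cannot be read by tar.
  tar -tzf "$archive" > /dev/null || return 1

  # Copy the archive into the pod, then unpack it into the mounted directory.
  kubectl cp "$archive" "$pod:$remote" || return 1
  kubectl exec "$pod" -- tar -xzf "$remote" -C "$dir"
}
```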
Test the Restored Data
Check that the Pod pv-pod-backup is still in the Running state. This confirms only that the Pod is healthy after the restore; the restored content itself is checked in the steps that follow.
kubectl get pod pv-pod-backup -o wide
Next, open a shell in the Pod again to check the served content:
kubectl exec -it pv-pod-backup -- /bin/bash
Request the restored file from the web server:
# Be sure to run these 3 commands inside the root shell that comes from
# running "kubectl exec" in the previous step
apt update
apt install curl
curl http://localhost/
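The restored file can also be compared directly with the copy in the archive, without installing curl in the container. The sketch below assumes the archive was created from inside the mounted directory with `.` as the source, so members are stored as ./index.html; verify_restore is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Compare a file inside the pod with the copy stored in the local archive.
# verify_restore is a hypothetical helper; the defaults match this tutorial.
verify_restore() {
  local pod="${1:-pv-pod-backup}"
  local archive="${2:-./volume-backup.tar.gz}"
  local file="${3:-index.html}"
  local dir="${4:-/usr/local/apache2/htdocs}"
  local in_pod in_backup

  # Read the file as it currently exists on the mounted volume.
  in_pod="$(kubectl exec "$pod" -- cat "$dir/$file")" || return 1

  # -O writes the extracted member to stdout instead of to disk.
  in_backup="$(tar -xzOf "$archive" "./$file")" || return 1

  # Succeed only when both copies are identical.
  [ "$in_pod" = "$in_backup" ]
}
```

For example, `verify_restore && echo "restore verified"` prints the message only when the file on the volume matches the backup.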
Understanding the importance of volume backups
Some key points about the importance of volume backups:
- Backups keep your data safe if it is accidentally deleted, corrupted, or lost due to hardware failure.
- If your system crashes or is attacked, backups help you restore data and reduce downtime.
- Backups let you transfer data between environments or cloud platforms without losing it.
- Losing data can stop your work. Backups help you avoid this.
- Backups let you migrate data from development to testing or production environments.
- Backups allow you to create a safe copy of your production data for testing new features or bug fixes.