\documentclass{beamer}
%\setbeamercovered{transparent}
\usetheme{poul}
%\usetheme{Madrid}
\usepackage[utf8]{inputenc}
\usepackage[svgpath=images/]{svg}
\usepackage{graphicx}
\graphicspath{ {images/} }
\usepackage[hyphenbreaks]{breakurl}
\usepackage{hyperref}
\def\UrlBreaks{\do\/\do-}
%Information to be included in the title page:
\title{Backup and (hopefully) Restore}
\author{Andrea Gussoni}
\institute{P.O.u.L.}
\date{23 March 2017}
\titlegraphic{\includesvg[height=1.5cm]{logowhite}}
\begin{document}
\frame{\titlepage}
\begin{frame}
\frametitle{Why do we need backups?}
Bad things can happen and do happen:
\begin{itemize}
\item You may drop your computer accidentally.
\item The disk may be damaged by vibrations during the daily commute.
\item The computer where you keep the only copy of your thesis
may be stolen.
\item After some time the disk may simply stop working because of ageing.
\item But often the main cause of data loss is the thing that sits between the keyboard and the chair.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Why do we need backups?}
\begin{center}
\includegraphics[width=0.7\textwidth]{gitlab}
\end{center}
\footnotetext{\url{https://twitter.com/gitlabstatus/status/826591961444384768}}
\end{frame}
\begin{frame}
\frametitle{What are backups?}
\begin{block}{Definition}
The copying and archiving of computer data so that it may be
used to restore the original after a data loss event.
\end{block}
\end{frame}
\begin{frame}
\frametitle{What to back up?}
It is important to distinguish what needs to be backed up from what
does not.\\\pause
Obviously this depends on the setup that you are using (native services, containers, VMs, etc.).
\end{frame}
\begin{frame}
\frametitle{A general guideline}
Must:
\begin{itemize}
\item /home
\end{itemize}
\vfill
At your discretion:
\begin{itemize}
\item /etc
\item /var
\item /mnt /media
\end{itemize}
\vfill
Not necessary\footnote{If these folders contain something important, you are probably doing something wrong in your setup.}:
\begin{itemize}
\item /proc /sys
\item /dev /tmp
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Backup types}
Backups can be:
\begin{itemize}
\item \textbf{full}: a complete backup of all files and folders starting from a root node.
\item \textbf{incremental}: contains the changes since the last backup of any type (full or incremental).
\item \textbf{differential}: contains the changes since the last full backup.
\end{itemize}
\end{frame}
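\begin{frame}
\frametitle{Backup types: a worked example}
As an illustration (the weekly schedule is an assumption, not a recommendation): a full backup runs on Sunday and another backup runs on every following day. To restore the state of Wednesday you need:
\begin{center}
\small
\begin{tabular}{ll}
\textbf{Strategy} & \textbf{Archives needed} \\
\hline
full every day & Wed \\
full + incremental & Sun (full), Mon, Tue, Wed \\
full + differential & Sun (full), Wed \\
\end{tabular}
\end{center}
Incremental backups are the smallest to store, but a restore needs the whole chain. Differential backups grow during the week, but a restore needs only two archives.
\end{frame}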
\begin{frame}
\frametitle{Backup Media}
\begin{itemize}
\item Hard disks (HDD).
\item Solid-state drives (SSD).
\item Optical media: DVDs, Blu-ray.
\item Flash drives.
\item Cloud\footnote{Remember that there is no cloud, just other people's computers.}.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{dd}
\textbf{dd} is a powerful tool that can copy anything that is a file or a block device. It is commonly used for disk cloning.\\
Usage example:
\begin{itemize}
\item \textit{dd if=/dev/sdX of=/dev/sdY conv=fdatasync\footnote{Makes dd flush the data to the output device before exiting, which avoids incomplete copies.}}
\begin{itemize}
\item \textbf{if:} input file/device
\item \textbf{of:} output file/device
\end{itemize}
\end{itemize}
\vfill\pause
\begin{alertblock}{Caution}
\textbf{dd} often needs to run with \textit{sudo} privileges, so if you mix up the device names you can wipe the contents of your primary hard disk. Always double-check the arguments before pressing Enter.
\end{alertblock}
\end{frame}
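\begin{frame}[fragile]
\frametitle{dd: a cautious example}
A possible workflow, assuming GNU dd (needed for \texttt{status=progress}). The names \texttt{/dev/sdX} and \texttt{disk.img} are placeholders, not real devices or files:
\footnotesize
\begin{verbatim}
# 1. Identify the disks before touching anything
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT

# 2. Save the whole disk to an image file
sudo dd if=/dev/sdX of=disk.img bs=4M \
    conv=fdatasync status=progress

# 3. Later, write the image back to a disk
sudo dd if=disk.img of=/dev/sdX bs=4M \
    conv=fdatasync status=progress
\end{verbatim}
\end{frame}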
\begin{frame}
\frametitle{GNU ddrescue}
ddrescue is an enhanced version of dd that tries to rescue the good parts in case of read errors. It may be useful to recover data from a drive with some damaged sectors.\\
Usage example:
\begin{itemize}
\item \textit{ddrescue [options] /dev/sdX outfile mapfile}
\begin{itemize}
\item \textbf{mapfile:} a human-readable text file that ddrescue uses to keep track of the copy
\end{itemize}
\end{itemize}\pause
\begin{alertblock}{Caution}
For the rescued data to be consistent, both dd and ddrescue are best used on unmounted devices.
\end{alertblock}\pause
\begin{block}{Tip}
ddrescue can also help when a drive has a few unreadable sectors: overwriting the drive with ddrescue should make it reallocate the bad sectors.
\end{block}
\end{frame}
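\begin{frame}[fragile]
\frametitle{GNU ddrescue: a two-pass rescue}
A sketch of a two-pass rescue, in the style of the example in the ddrescue manual. The device and file names are placeholders:
\footnotesize
\begin{verbatim}
# Pass 1: copy the easy areas first, skip the scraping phase
sudo ddrescue -n /dev/sdX rescue.img rescue.map

# Pass 2: direct disk access, retry bad areas 3 times
sudo ddrescue -d -r3 /dev/sdX rescue.img rescue.map
\end{verbatim}
\normalsize
Both passes share the same mapfile, so an interrupted run resumes where it stopped.
\end{frame}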
\begin{frame}
\frametitle{rsync}
Often described as an advanced version of cp.
\begin{exampleblock}{Pros}
\begin{itemize}
\item (unlike plain cp) can preserve links, file permissions and ownership, modification times, etc.
\item designed to be network efficient because it only transfers the parts of files that changed.
\item easy to use.
\end{itemize}
\end{exampleblock}
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
\begin{frame}
\frametitle{rsync: usage}
\begin{itemize}
\item rsync -Pr source destination
\begin{itemize}
\item \textbf{P:} show progress and keep partially transferred files if the transfer is interrupted.
\item \textbf{r:} recurse into directories.
\end{itemize}
\vfill
\pause
\item rsync source host:destination\footnote{But please don't do this: \textit{rsync -av -{}-delete source host:$\sim$/}}
\begin{itemize}
\item uses ssh by default, but it can also be forced with the -e ssh option.
\end{itemize}
\vfill
\pause
\item rsync -aAXv -{}-exclude=\{...\} /* /backup/folder
\begin{itemize}
\item back up /* in archive mode, preserving symlinks, permissions, ACLs (-A) and extended attributes (-X).
\end{itemize}
\end{itemize}
\end{frame}
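\begin{frame}[fragile]
\frametitle{rsync: a full system backup sketch}
A sketch in the style of the Arch Wiki article listed in the references, assuming bash (for brace expansion). The exclude list and \texttt{/backup/folder} are examples to adapt:
\footnotesize
\begin{verbatim}
# Preview: -n (dry run) only lists what would be copied
sudo rsync -aAXvn \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*"} \
  --exclude={"/mnt/*","/media/*","/lost+found","/backup/*"} \
  / /backup/folder

# Real run: the same command without -n
sudo rsync -aAXv \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/run/*"} \
  --exclude={"/mnt/*","/media/*","/lost+found","/backup/*"} \
  / /backup/folder
\end{verbatim}
\normalsize
The destination itself is excluded, otherwise the backup would be copied into itself.
\end{frame}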
\begin{frame}
\frametitle{rsnapshot: rsync automated}
rsnapshot produces automated, periodic system snapshots.
\vfill
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves links, file permissions and ownership, modification times, etc.
\item network efficient.
\item each snapshot contains a full system backup.
\item easy to use.
\end{itemize}
\end{exampleblock}
\vfill
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
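\begin{frame}[fragile]
\frametitle{rsnapshot: a configuration sketch}
An excerpt of the lines one would typically adapt in \texttt{/etc/rsnapshot.conf}, not a complete file. The paths and retention counts are assumptions. In the real file the fields must be separated by tabs (shown here as spaces):
\footnotesize
\begin{verbatim}
snapshot_root   /backup/snapshots/
retain          daily   7
retain          weekly  4
backup          /home/  localhost/
backup          /etc/   localhost/
\end{verbatim}
\begin{verbatim}
rsnapshot configtest   # check the configuration syntax
rsnapshot daily        # take a snapshot (usually from cron)
\end{verbatim}
\normalsize
Unchanged files are hard-linked between snapshots, so each snapshot looks complete while only the changes use new space.
\end{frame}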
\begin{frame}
\frametitle{duplicity}
duplicity produces encrypted, incremental backups in tar format.
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves links, file permissions and ownership, modification times, etc.
\item network efficient.
\item incremental backups.
\item supports storage encryption with gpg.
\item easy to use.
\end{itemize}
\end{exampleblock}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity /home/user scp://user@host//backup/directory
\vfill\pause
\item duplicity [restore] scp://user@host//backup/directory /home/user
\vfill\pause
\item duplicity full /home/user scp://user@host//backup/directory
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity list-current-files scp://user@host//backup/directory
\begin{itemize}
\item list the files contained in the backup.
\end{itemize}
\vfill\pause
\item duplicity [restore] -t 3D scp://user@host//backup/directory /home/user
\begin{itemize}
\item restore the files as they were at the specified time (here, 3 days ago).
\end{itemize}
\vfill\pause
\item duplicity remove-older-than 30D scp://user@host//backup/directory
\begin{itemize}
\item remove backup sets older than the specified period (add -{}-force to actually delete them instead of just listing them).
\end{itemize}
\end{itemize}
\end{frame}
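\begin{frame}[fragile]
\frametitle{duplicity: checking a backup}
Two further commands that may help to inspect a backup. The URL and the local path are placeholders:
\footnotesize
\begin{verbatim}
# Show the chains of full and incremental backup sets
duplicity collection-status \
    scp://user@host//backup/directory

# Compare the backup with the current local files
duplicity verify --compare-data \
    scp://user@host//backup/directory /home/user
\end{verbatim}
\end{frame}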
\begin{frame}
\frametitle{Demo}
\begin{center}
{\Huge Demo!}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Last but not Least}
\begin{itemize}
\item When you use duplicity with encryption enabled, always remember to back up the GPG keys you use to encrypt and sign the backup.\\
If you lose them you won't be able to restore the backup.\pause
\item Always check that the backup is actually taking place. Don't just assume that everything is working because you followed a guide to the letter.\pause
\item Always test that the backup really works by trying to restore it. You'll be surprised how often backup procedures do not really work, and if you do not test them you'll only notice when the files are gone.
\end{itemize}
\end{frame}
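\begin{frame}[fragile]
\frametitle{Testing a restore}
A sketch of a restore test and of a key export. \texttt{KEYID}, the URL and the paths are placeholders:
\footnotesize
\begin{verbatim}
# Restore into a scratch directory, never over the live data
duplicity restore scp://user@host//backup/directory \
    /tmp/restore-test

# Compare the restored tree with the original
diff -r /home/user /tmp/restore-test

# Export the secret key and store it somewhere safe,
# away from the machine being backed up
gpg --export-secret-keys --armor KEYID > backup-key.asc
\end{verbatim}
\normalsize
Files changed since the last backup will show up as differences, which is expected.
\end{frame}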
\begin{frame}
\frametitle{Hi again GitLab}
\begin{center}
\includegraphics[width=0.9\textwidth,height=0.5\textheight]{gitlab2}
\end{center}
\footnotetext{\url{https://docs.google.com/document/d/1GCK53YDcBWQveod9kfzW-VCxIABGiryG7_z_6jHdVik/pub}}
%\pause
%\footnotesize{I don't want to put shame on GitLab for this incident, but only to use it as a case study.\\ In fact I think that the incident has been managed really well by the GitLab Team.\\
%Instead of starting blaming each other and finding silly excuses as usually happens in cases like this, they have been really open from the beginning about the problem and put as a priority the restore of the functionality of the service.}
\end{frame}
\begin{frame}
\frametitle{Before the Backup}
A different approach to data protection is to use RAID (\textit{Redundant Array of Independent Disks}).\\
\pause
In general, what we try to obtain with RAID is:
\begin{itemize}
\item Survival of the system if a disk fails.
\item Under certain conditions, higher performance than in the single-disk case.
\end{itemize}
\footnotetext{For further information, see \url{https://www.digitalocean.com/community/tutorials/an-introduction-to-raid-terminology-and-concepts}}
\end{frame}
\begin{frame}
\frametitle{RAID Configurations}
\begin{figure}
\centering
\includesvg[width = 75pt]{RAID_0}
\includesvg[width = 75pt]{RAID_1}\\
\includesvg[width = 150pt]{RAID_5}
\end{figure}
\end{frame}
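\begin{frame}
\frametitle{RAID levels: capacity and fault tolerance}
As a rough guide, for $n$ identical disks of size $s$ (an idealised model that ignores metadata overhead):
\begin{center}
\small
\begin{tabular}{lll}
\textbf{Level} & \textbf{Usable capacity} & \textbf{Failures survived} \\
\hline
RAID 0 (striping) & $n \cdot s$ & $0$ \\
RAID 1 (mirroring) & $s$ & $n - 1$ \\
RAID 5 (striping + parity) & $(n - 1) \cdot s$ & $1$ \\
\end{tabular}
\end{center}
For example, three 2\,TB disks give 6\,TB in RAID 0, 2\,TB in RAID 1 and 4\,TB in RAID 5.
\end{frame}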
\begin{frame}
\frametitle{New generation filesystems}
A new generation of filesystems tries to solve some of the problems that we usually have in data storage. The two main examples are ZFS and Btrfs\footnote{Keep in mind that Btrfs is still in heavy development. Before using it in production, check at \url{https://btrfs.wiki.kernel.org/index.php/Status} that the features you need are considered stable.}. Typical features of this kind of filesystem are:
\begin{itemize}
\item Copy-on-write.
\item Deduplication.
\item Data \& metadata checksums.
\item Integrated RAID.
\item Volume management.
\item Snapshots.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Snapshots}
\begin{itemize}
\item Snapshots are particularly useful because they give us an (almost) instant copy of the state of a volume, which we can restore later, archive somewhere, etc.\\\pause
\item So we can use them to make potentially risky modifications to a system and restore the previous state with little effort.\\\pause
\item Remember that having a separate \textit{classical} backup is always useful, in particular for the important data of our applications.\pause
\item RAID is not a backup.
\end{itemize}
\end{frame}
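\begin{frame}[fragile]
\frametitle{Snapshots: example commands}
A sketch of the basic commands. The subvolume, pool and snapshot names are assumptions: on Btrfs, \texttt{/home} must be a subvolume and \texttt{/.snapshots} must be a directory on the same filesystem.
\footnotesize
\begin{verbatim}
# Btrfs: read-only snapshot of the /home subvolume
sudo btrfs subvolume snapshot -r /home \
    /.snapshots/home-2017-03-23

# ZFS: snapshot the dataset tank/home, then roll back to it
sudo zfs snapshot tank/home@2017-03-23
sudo zfs rollback tank/home@2017-03-23
\end{verbatim}
\end{frame}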
\begin{frame}
\frametitle{References}
\begin{itemize}
\item \url{https://wiki.archlinux.org/index.php/Full_system_backup_with_rsync}
\item \url{https://wiki.archlinux.org/index.php/Duplicity}
\item \url{http://duplicity.nongnu.org/}
\item \url{https://www.digitalocean.com/community/tutorials/how-to-use-duplicity-with-gpg-to-securely-automate-backups-on-ubuntu}
\item \url{https://github.com/zertrin/duplicity-backup.sh}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Special Thanks}
I used the material from the previous editions of the course as a reference and starting point for this presentation.\\
Special thanks to \textit{Valeria Mazzola}\footnote{\url{https://slides.poul.org/2016/corsi-linux-avanzati/Backup_and_Restore.pdf}} and \textit{Federico Amedeo Izzo}\footnote{\url{https://filesystem.izzo.ovh/}} for the slides of the two previous editions of this talk.
\end{frame}
\begin{frame}
\frametitle{License}
\begin{center}
{\Huge Thank you!}
\vfill
\includesvg[height=1.5cm]{by-sa}\\
{\footnotesize These slides are published under a Creative Commons Attribution-ShareAlike 4.0 license.}
\end{center}
\end{frame}
\end{document}