\documentclass{beamer}
%\setbeamercovered{transparent}
\usetheme{poul}
%\usetheme{Madrid}
\usepackage[utf8]{inputenc}
\usepackage[svgpath=images/]{svg}
\usepackage{graphicx}
\graphicspath{ {images/} }
\usepackage[hyphenbreaks]{breakurl}
\usepackage{hyperref}
\def\UrlBreaks{\do\/\do-}
%Information to be included in the title page:
\title{Backup and (hopefully) Restore}
\author{Andrea Gussoni}
\institute{P.O.u.L.}
\date{23 March 2017}
\titlegraphic{\includesvg[height=1.5cm]{logowhite}}
\begin{document}
\frame{\titlepage}
\begin{frame}
\frametitle{Why do we need backups?}
Bad things can happen and do happen:
\begin{itemize}
\item You may drop your computer accidentally.
\item The disk may be damaged by vibrations during the daily commute.
\item The computer where you keep the only copy of your thesis
may be stolen.
\item After some time the disk may simply stop working because of ageing.
\item But often the main cause of data loss is the thing that sits between the keyboard and the chair.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Why do we need backups?}
\begin{center}
\includegraphics[width=0.7\textwidth]{gitlab}
\end{center}
\footnotetext{\url{https://twitter.com/gitlabstatus/status/826591961444384768}}
\end{frame}
\begin{frame}
\frametitle{What are backups?}
\begin{block}{Definition}
The copying and archiving of computer data so that it may be
used to restore the original after a data loss event.
\end{block}
\end{frame}
\begin{frame}
\frametitle{What to back up?}
It is important to distinguish what needs to be backed up from what
does not.\\\pause
Obviously this depends on the setup that you are using (native services, containers, VMs, etc.).
\end{frame}
\begin{frame}
\frametitle{A general guideline}
Must:
\begin{itemize}
\item /home
\end{itemize}
\vfill
At your discretion:
\begin{itemize}
\item /etc
\item /var
\item /mnt /media
\end{itemize}
\vfill
Not necessary\footnote{If these folders contain something important, you are probably doing something wrong in your setup.}:
\begin{itemize}
\item /proc /sys
\item /dev /tmp
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Backup types}
Backups can be:
\begin{itemize}
\item \textbf{full}: a complete backup of all files and folders starting from a root node.
\item \textbf{incremental}: contains the changes since the last backup of any type (full or incremental).
\item \textbf{differential}: contains the changes since the last full backup.
\end{itemize}
\end{frame}
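\begin{frame}[fragile]
\frametitle{Backup types: a sketch with GNU tar}
A minimal sketch of the three types, assuming GNU tar and its \texttt{-{}-listed-incremental} snapshot file. The archive names and \texttt{/home/user} are placeholders chosen for illustration.
\scriptsize
\begin{verbatim}
# Full (level 0): a fresh snapshot file records what was saved
tar --create --file=full.tar \
    --listed-incremental=level0.snar /home/user

# Incremental: copy the level 0 file ONCE, then keep reusing
# work.snar so each run stores the changes since the previous run
cp level0.snar work.snar
tar --create --file=incr-1.tar \
    --listed-incremental=work.snar /home/user

# Differential: copy the level 0 file before EVERY run,
# so each run stores the changes since the full backup
cp level0.snar diff.snar
tar --create --file=diff-1.tar \
    --listed-incremental=diff.snar /home/user
\end{verbatim}
\normalsize
Restoring needs the full archive plus either every incremental in order, or only the latest differential.
\end{frame}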
\begin{frame}
\frametitle{Backup Media}
\begin{itemize}
\item Hard disks (HDD).
\item Solid-state drives (SSD).
\item Optical media: DVDs, Blu-ray.
\item Flash drives.
\item Cloud\footnote{Remember that there is no cloud, just other people's computers.}.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{dd}
\textbf{dd} is a powerful tool that can copy anything that is a file or a block device. It is commonly used for disk cloning.\\
Usage example:
\begin{itemize}
\item \textit{dd if=/dev/sdX of=/dev/sdY conv=fdatasync\footnote{useful to actually wait for the end of the data transfer and avoid corrupted copies}}
\begin{itemize}
\item \textbf{if:} input file/device
\item \textbf{of:} output file/device
\end{itemize}
\end{itemize}
\vfill\pause
\begin{alertblock}{Caution}
Since \textbf{dd} often requires \textit{sudo} privileges to run, mistyping a device name can wipe the contents of your primary hard disk. Always double-check the arguments before pressing Enter.
\end{alertblock}
\end{frame}
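\begin{frame}[fragile]
\frametitle{dd: imaging a disk to a file}
A hedged sketch of a more careful workflow, assuming GNU coreutils \textbf{dd} and \textbf{lsblk} from util-linux. \textit{/dev/sdX} and \textit{disk.img} are placeholders.
\footnotesize
\begin{verbatim}
# Identify the device first: check size and model
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT

# Save the whole disk into an image file
sudo dd if=/dev/sdX of=disk.img bs=4M \
    conv=fdatasync status=progress

# Restore: same command with if and of swapped
sudo dd if=disk.img of=/dev/sdX bs=4M \
    conv=fdatasync status=progress
\end{verbatim}
\normalsize
\textit{bs=4M} only sets the block size used for copying, and \textit{status=progress} needs a reasonably recent coreutils.
\end{frame}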
\begin{frame}
\frametitle{GNU ddrescue}
ddrescue is an enhanced version of dd that tries to rescue the good parts in case of read errors. It can be useful for recovering data from a drive with some damaged sectors.\\
Usage example:
\begin{itemize}
\item \textit{ddrescue [options] /dev/sdX outfile mapfile}
\begin{itemize}
\item \textbf{mapfile:} a human-readable text file that ddrescue uses to manage the copy
\end{itemize}
\end{itemize}\pause
\begin{alertblock}{Caution}
For the rescued data to be correct, both dd and ddrescue are best used on unmounted devices.
\end{alertblock}\pause
\begin{block}{Tip}
ddrescue can also be useful when trying to reallocate sectors on a drive with a few unreadable sectors. Wiping the drive with ddrescue should make the drive reallocate the bad sectors.
\end{block}
\end{frame}
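\begin{frame}[fragile]
\frametitle{GNU ddrescue: a two-pass sketch}
A sketch of a common two-pass rescue, following the pattern of the examples in the GNU ddrescue manual. \textit{rescued.img} and \textit{rescue.map} are placeholder names.
\footnotesize
\begin{verbatim}
# Pass 1: copy everything that reads easily,
# skipping the slow scraping phase (-n)
sudo ddrescue -n /dev/sdX rescued.img rescue.map

# Pass 2: go back to the bad areas with direct disc
# access (-d) and up to 3 retry passes (-r3)
sudo ddrescue -d -r3 /dev/sdX rescued.img rescue.map
\end{verbatim}
\normalsize
Because both runs share the same mapfile, the second pass only touches the areas that the first pass could not read, and an interrupted run can be resumed. Writing to a device instead of an image file additionally requires \textit{-f}.
\end{frame}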
\begin{frame}
\frametitle{rsync}
Often described as an advanced version of cp.
\begin{exampleblock}{Pros}
\begin{itemize}
\item (unlike plain cp) can preserve links, file permissions and ownership, modification times, etc.
\item designed to be network efficient: it only transfers the changes to files.
\item easy to use.
\end{itemize}
\end{exampleblock}
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
\begin{frame}
\frametitle{rsync: usage}
\begin{itemize}
\item rsync -Pr source destination
\begin{itemize}
\item \textbf{P:} show progress and keep partially transferred files if the transfer is interrupted.
\item \textbf{r:} copy directories recursively.
\item this does not preserve file attributes.
\end{itemize}
\vfill
\pause
\item rsync source host:destination\footnote{But please don't do this: \textit{rsync -av -{}-delete source host:$\sim$}}
\begin{itemize}
\item uses ssh by default, but this can also be forced with the -e ssh option.
\end{itemize}
\vfill
\pause
\item rsync -aAXv -{}-exclude=\{...\} /* /backupfolder
\begin{itemize}
\item back up /* while preserving symlinks and file properties (permissions, ownership, times, ACLs and extended attributes).
\end{itemize}
\end{itemize}
\end{frame}
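\begin{frame}[fragile]
\frametitle{rsync: a full-system sketch}
A possible way to spell out the last command, assuming bash (for the brace expansion) and an illustrative exclude list built from the folders marked as not necessary earlier, plus the destination itself so that the backup does not copy itself.
\scriptsize
\begin{verbatim}
# Preview first: -n (--dry-run) only lists what would be copied
sudo rsync -aAXvn \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/backupfolder/*"} \
  / /backupfolder

# Then run it for real by dropping -n
sudo rsync -aAXv \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/backupfolder/*"} \
  / /backupfolder
\end{verbatim}
\normalsize
The exclude list is an assumption and should be adapted to the actual setup. This sketch uses \textit{/} as the source so that the exclude patterns are anchored at the root of the transfer.
\end{frame}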
\begin{frame}
\frametitle{rsnapshot: rsync automated}
rsnapshot produces automated, periodic system snapshots.
\vfill
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves links, file permissions and ownership, modification times, etc.
\item network efficient.
\item each snapshot contains a full system backup (unchanged files are hard links to the previous snapshot).
\item easy to use.
\end{itemize}
\end{exampleblock}
\vfill
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
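\begin{frame}[fragile]
\frametitle{rsnapshot: a configuration sketch}
A minimal configuration sketch. The paths, retention counts and schedule are assumptions chosen for illustration, and the fields in the real file must be separated by tabs, not spaces.
\scriptsize
\begin{verbatim}
# /etc/rsnapshot.conf (excerpt, fields separated by TABS)
snapshot_root   /backup/snapshots/
retain          daily   7
retain          weekly  4
backup          /home/  localhost/
backup          /etc/   localhost/

# /etc/cron.d/rsnapshot: run the rotations automatically
30 3 * * *   root   /usr/bin/rsnapshot daily
0  3 * * 1   root   /usr/bin/rsnapshot weekly
\end{verbatim}
\normalsize
Before enabling the cron entries, \textit{rsnapshot configtest} checks the syntax of the file and \textit{rsnapshot daily} takes a first snapshot by hand.
\end{frame}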
\begin{frame}
\frametitle{duplicity}
duplicity produces encrypted, incremental backups in tar format.
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves links, file permissions and ownership, modification times, etc.
\item network efficient.
\item incremental backups.
\item supports storage encryption with gpg.
\item easy to use.
\end{itemize}
\end{exampleblock}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity /home/user scp://user@host//backup/directory
\vfill\pause
\item duplicity [restore] scp://user@host//backup/directory /home/user
\vfill\pause
\item duplicity full /home/user scp://user@host//backup/directory
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity list-current-files scp://user@host//backup/directory
\begin{itemize}
\item list the files contained in the backup.
\end{itemize}
\vfill\pause
\item duplicity [restore] -t 3D scp://user@host//backup/directory /home/user
\begin{itemize}
\item restore the files as they were at the given time (here, 3 days ago).
\end{itemize}
\vfill\pause
\item duplicity remove-older-than 30D scp://user@host//backup/directory
\begin{itemize}
\item list the backup sets older than the specified period; add -{}-force to actually delete them.
\end{itemize}
\end{itemize}
\end{frame}
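\begin{frame}[fragile]
\frametitle{duplicity: encryption and checks}
A short sketch of three further invocations that complement the ones above. \textit{KEYID} is a placeholder for your own GnuPG key, and the URL is the same example target used so far.
\footnotesize
\begin{verbatim}
# Encrypt the backup to a specific GnuPG key
duplicity --encrypt-key KEYID /home/user \
    scp://user@host//backup/directory

# Show the chain of full and incremental backup sets
duplicity collection-status \
    scp://user@host//backup/directory

# Compare the backup against the local files
duplicity verify \
    scp://user@host//backup/directory /home/user
\end{verbatim}
\normalsize
Without an explicit key, duplicity falls back to symmetric encryption and asks for a passphrase.
\end{frame}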
\begin{frame}
\frametitle{Demo}
\begin{center}
{\Huge Demo!}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Last but not Least}
\begin{itemize}
\item When you use duplicity with encryption enabled, always remember to back up the gpg keys you use to encrypt and sign the backup.\\
If you lose them you won't be able to restore the backup.\pause
\item Always check that the backup is actually taking place; don't just assume that everything is working because you followed a guide exactly.\pause
\item Always test the backup by actually restoring it. You'll be surprised how often backup procedures do not really work, and if you do not test them you'll only notice when the files are gone.
\end{itemize}
\end{frame}
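\begin{frame}[fragile]
\frametitle{Last but not Least: a restore drill}
One possible way to put these points into practice. \textit{KEYID}, \textit{backup-key.asc} and \textit{/tmp/restore-test} are placeholders chosen for illustration.
\footnotesize
\begin{verbatim}
# Export the secret key and keep the file away from
# the machine that is being backed up
gpg --export-secret-keys --armor KEYID > backup-key.asc

# Restore into a scratch directory, never over live data
duplicity restore scp://user@host//backup/directory \
    /tmp/restore-test

# Compare the restored tree with the original
diff -r /home/user /tmp/restore-test
\end{verbatim}
\normalsize
\textit{diff -r} only compares file contents, so files changed since the last backup will show up as differences; it does not check permissions or ownership.
\end{frame}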
\begin{frame}
\frametitle{Hi again GitLab}
\begin{center}
\includegraphics[width=0.9\textwidth,height=0.5\textheight]{gitlab2}
\end{center}
\footnotetext{\url{https://docs.google.com/document/d/1GCK53YDcBWQveod9kfzW-VCxIABGiryG7_z_6jHdVik/pub}}
%\pause
%\footnotesize{I don't want to put shame on GitLab for this incident, but only to use it as a case study.\\ In fact I think that the incident has been managed really well by the GitLab Team.\\
%Instead of starting blaming each other and finding silly excuses as usually happens in cases like this, they have been really open from the beginning about the problem and put as a priority the restore of the functionality of the service.}
\end{frame}
\begin{frame}
\frametitle{Before the Backup}
A different approach to data protection is to use RAID (\textit{Redundant Array of Independent Disks}).\\
\pause
In general, what we try to obtain with RAID is:
\begin{itemize}
\item Survival of the system if a disk failure happens.
\item Under certain conditions, higher performance than in the single-disk case.
\end{itemize}
\footnotetext{For further information, see \url{https://www.digitalocean.com/community/tutorials/an-introduction-to-raid-terminology-and-concepts}}
\end{frame}
\begin{frame}
\frametitle{RAID Configurations}
\begin{figure}
\centering
\includesvg[width = 75pt]{RAID_0}
\includesvg[width = 75pt]{RAID_1}\\
\includesvg[width = 150pt]{RAID_5}
\end{figure}
\end{frame}
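\begin{frame}
\frametitle{RAID levels: capacity and fault tolerance}
As a rough worked summary of the three layouts on the previous slide, assuming $n$ identical disks of size $s$ (a simplification: real arrays lose a little space to metadata):
\begin{center}
\begin{tabular}{lccc}
Level & Min. disks & Usable capacity & Disk failures survived \\
\hline
RAID 0 & 2 & $n \cdot s$ & 0 \\
RAID 1 & 2 & $s$ & $n - 1$ \\
RAID 5 & 3 & $(n - 1) \cdot s$ & 1 \\
\end{tabular}
\end{center}
For example, with $n = 4$ disks of $s = 2$\,TB each, RAID 0 gives 8\,TB, RAID 1 gives 2\,TB and RAID 5 gives 6\,TB.
\end{frame}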
\begin{frame}
\frametitle{New generation filesystems}
There are new kinds of filesystems that try to solve some of the problems we usually have in data storage. The two main examples are ZFS and Btrfs\footnote{Please remember that Btrfs is still in heavy development; before using it in production, check at \url{https://btrfs.wiki.kernel.org/index.php/Status} that the features you need are considered stable.}. Typical features of these filesystems are:
\begin{itemize}
\item Copy-on-write.
\item Deduplication.
\item Data \& metadata checksums.
\item Integrated RAID.
\item Volume management.
\item Snapshots.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Snapshots}
\begin{itemize}
\item Snapshots are particularly useful because they give us an (almost) instant copy of the state of a volume, which we can restore later, archive somewhere, etc.\\\pause
\item So we can use them to make potentially risky modifications to a system and restore the previous state with little effort.\\\pause
\item Remember that having a separate \textit{classical} backup is always useful, in particular for the important data of our applications.\pause
\item RAID is not a backup.
\end{itemize}
\end{frame}
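\begin{frame}[fragile]
\frametitle{Snapshots: a command sketch}
A sketch of what taking a snapshot before a risky change might look like. It assumes that \textit{/home} is a Btrfs subvolume, that \textit{/home/.snapshots} already exists, and that \textit{tank/home} is a ZFS dataset; all three names are hypothetical.
\footnotesize
\begin{verbatim}
# Btrfs: read-only snapshot of the subvolume at /home
sudo btrfs subvolume snapshot -r /home \
    /home/.snapshots/before-upgrade

# ZFS: snapshot a dataset, then roll back to it
# if the change goes wrong
sudo zfs snapshot tank/home@before-upgrade
sudo zfs rollback tank/home@before-upgrade
\end{verbatim}
\normalsize
Both snapshots live on the same disks as the original data, which is why a separate backup is still needed.
\end{frame}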
\begin{frame}
\frametitle{References}
\begin{itemize}
\item \url{https://wiki.archlinux.org/index.php/Full_system_backup_with_rsync}
\item \url{https://wiki.archlinux.org/index.php/Duplicity}
\item \url{http://duplicity.nongnu.org/}
\item \url{https://www.digitalocean.com/community/tutorials/how-to-use-duplicity-with-gpg-to-securely-automate-backups-on-ubuntu}
\item \url{https://github.com/zertrin/duplicity-backup.sh}
\item \url{https://wiki.archlinux.org/index.php/Rsnapshot}
\item \url{https://slides.poul.org/2017/corsi-linux-avanzati/backup_handbook.pdf}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Special Thanks}
I used the material from the previous editions of the course as a reference and starting point for this presentation.\\
Special thanks to \textit{Valeria Mazzola}\footnote{\url{https://slides.poul.org/2016/corsi-linux-avanzati/Backup_and_Restore.pdf}} and \textit{Federico Amedeo Izzo}\footnote{\url{https://filesystem.izzo.ovh/}} for the slides of the two previous editions of this talk.
\end{frame}
\begin{frame}
\frametitle{License}
\begin{center}
{\Huge Thank you!}
\vfill
\includesvg[height=1.5cm]{by-sa}\\
{\footnotesize These slides are published under a Creative Commons Attribution-ShareAlike 4.0 license.}
\end{center}
\end{frame}
\end{document}
\grid