\documentclass{beamer}
%\setbeamercovered{transparent}
\usetheme{poul}
%\usetheme{Madrid}
\usepackage[utf8]{inputenc}
\usepackage[svgpath=images/]{svg}
\usepackage{graphicx}
\graphicspath{ {images/} }
\usepackage[hyphenbreaks]{breakurl}
\usepackage{hyperref}
\def\UrlBreaks{\do\/\do-}
%Information to be included in the title page:
\title{Backup and (hopefully) Restore}
\author{Andrea Gussoni}
\institute{P.O.u.L.}
\date{23 March 2017}
\titlegraphic{\includesvg[height=1.5cm]{logowhite}}
\begin{document}
\frame{\titlepage}
\begin{frame}
\frametitle{Why do we need backups?}
Bad things can and do happen:
\begin{itemize}
\item You may accidentally drop your computer.
\item The disk may be damaged by vibrations during the daily commute.
\item The computer holding the only copy of your thesis may be stolen.
\item Or it may simply wear out with age and stop working.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Why do we need backups?}
\begin{center}
\includegraphics[width=0.7\textwidth]{gitlab}
\end{center}
\footnotetext{\url{https://twitter.com/gitlabstatus/status/826591961444384768}}
\end{frame}
\begin{frame}
\frametitle{What are backups?}
\begin{block}{Definition}
The copying and archiving of computer data so that it may be
used to restore the original after a data loss event.
\end{block}
\end{frame}
\begin{frame}
\frametitle{What to back up?}
It is important to distinguish what needs to be backed up from what
does not.\\\pause
This depends on the setup you are using (native services, containers, VMs, etc.).
\end{frame}
\begin{frame}
\frametitle{A general guideline}
Must:
\begin{itemize}
\item /home
\end{itemize}
\vfill
At your discretion:
\begin{itemize}
\item /etc
\item /var
\end{itemize}
\vfill
Not necessary\footnote{If these folders contain something important, your setup is probably wrong.}:
\begin{itemize}
\item /proc /sys /tmp
\item /dev /mnt /media
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Backup types}
Backups can be:
\begin{itemize}
\item \textbf{full}: a complete copy of all files and folders under a given root directory.
\item \textbf{incremental}: contains the changes since the last backup of any type (full or incremental).
\item \textbf{differential}: contains the changes since the last full backup.
\end{itemize}
\end{frame}
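\begin{frame}
\frametitle{Backup types: a worked example}
As a rough illustration (the weekly schedule here is an assumption, not a recommendation), suppose a full backup runs on Sunday and the disk fails on Thursday morning:
\begin{itemize}
\item With \textbf{incremental} backups from Monday to Wednesday, a restore needs the Sunday full backup plus the Monday, Tuesday and Wednesday increments, applied in order.
\item With \textbf{differential} backups from Monday to Wednesday, a restore needs only the Sunday full backup plus the Wednesday differential.
\end{itemize}
\vfill
Each incremental backup is typically smaller to create, while a differential chain is shorter to restore.
\end{frame}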
\begin{frame}
\frametitle{Backup Media}
\begin{itemize}
\item Hard disks (HDD).
\item Solid-state drives (SSD).
\item Optical media: DVDs, Blu-ray.
\item Flash drives.
\item Cloud\footnote{Remember that there is no cloud, just other people's computers.}.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{dd}
\textbf{dd} is a powerful tool that can copy anything that is a file or a block device. It is commonly used for disk cloning.\\
Usage example:
\begin{itemize}
\item \textit{dd if=/dev/sdX of=/dev/sdY \&\& sync\footnote{\textit{sync} waits until buffered data has been written to the device, which avoids incomplete copies.}}
\begin{itemize}
\item \textbf{if:} input file/device
\item \textbf{of:} output file/device
\end{itemize}
\end{itemize}
\vfill\pause
\begin{alertblock}{Caution}
Since \textbf{dd} often requires \textit{sudo} privileges to run, mixing up device names can wipe the contents of your primary hard disk. Always double-check the arguments before pressing Enter.
\end{alertblock}
\end{frame}
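\begin{frame}[fragile]
\frametitle{dd: imaging a disk (sketch)}
A minimal sketch of copying a disk to an image file and back, assuming GNU dd (needed for \texttt{status=progress}). The names \texttt{/dev/sdX} and \texttt{disk.img} are placeholders to adapt, not values from a real system.
\footnotesize
\begin{verbatim}
# Identify the right device before touching anything
lsblk -o NAME,SIZE,MODEL,MOUNTPOINT

# Disk -> image file; conv=fsync flushes the output before dd exits
sudo dd if=/dev/sdX of=disk.img bs=4M status=progress conv=fsync

# Image file -> disk (overwrites everything on /dev/sdX)
sudo dd if=disk.img of=/dev/sdX bs=4M status=progress conv=fsync
\end{verbatim}
\end{frame}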
\begin{frame}
\frametitle{GNU ddrescue}
ddrescue is an enhanced alternative to dd that tries to rescue the good parts of a device when read errors occur. It can be useful for recovering data from a drive with some damaged sectors.\\
Usage example:
\begin{itemize}
\item \textit{ddrescue [options] /dev/sdX outfile mapfile}
\begin{itemize}
\item \textbf{mapfile:} a human-readable text file that ddrescue uses to track progress, so an interrupted rescue can be resumed.
\end{itemize}
\end{itemize}\pause
\begin{alertblock}{Caution}
For the rescued data to be correct, both dd and ddrescue are best used on unmounted devices.
\end{alertblock}\pause
\begin{block}{Tip}
ddrescue can also help when a drive has a few unreadable sectors: overwriting the whole drive with ddrescue should make the drive reallocate the bad sectors.
\end{block}
\end{frame}
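\begin{frame}[fragile]
\frametitle{GNU ddrescue: two-pass rescue (sketch)}
A common pattern, sketched here along the lines of the example in the ddrescue manual, is to grab the easy data first and retry the bad areas afterwards. The names \texttt{/dev/sdX}, \texttt{rescued.img} and \texttt{rescue.map} are placeholders.
\footnotesize
\begin{verbatim}
# Pass 1: copy everything that reads cleanly, skip scraping (-n)
sudo ddrescue -n /dev/sdX rescued.img rescue.map

# Pass 2: retry the bad areas up to 3 times (-r3),
# using direct disc access (-d)
sudo ddrescue -d -r3 /dev/sdX rescued.img rescue.map
\end{verbatim}
\normalsize
Reusing the same mapfile lets the second run continue from where the first one stopped instead of starting over.
\end{frame}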
\begin{frame}
\frametitle{rsync}
Often described as an advanced version of cp.
\begin{exampleblock}{Pros}
\begin{itemize}
\item (unlike plain cp) can preserve hard and symbolic links, file permissions and ownership, modification times, etc.
\item designed to be network efficient: it transfers only the parts of files that changed.
\item easy to use.
\end{itemize}
\end{exampleblock}
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
\begin{frame}
\frametitle{rsync: usage}
\begin{itemize}
\item rsync -Pr source destination
\begin{itemize}
\item \textbf{P:} show progress and keep partially transferred files if the transfer is interrupted.
\item \textbf{r:} recurse into directories.
\end{itemize}
\vfill
\pause
\item rsync source host:destination\footnote{But please don't do this: \textit{rsync -av -{}-delete source host:$\sim$/}}
\begin{itemize}
\item uses ssh by default; this can also be set explicitly with the \textit{-e ssh} option.
\end{itemize}
\vfill
\pause
\item rsync -aAXv -{}-exclude=\{...\} /* /backup folder
\begin{itemize}
\item back up /* while preserving symlinks, permissions, ACLs, extended attributes and other file properties.
\end{itemize}
\end{itemize}
\end{frame}
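\begin{frame}[fragile]
\frametitle{rsync: full system backup (sketch)}
A hedged sketch of a full system backup. The exclude list simply mirrors the ``not necessary'' folders from the general guideline and is illustrative only; \texttt{/mnt/backup/} is a hypothetical destination. The brace expansion requires bash.
\scriptsize
\begin{verbatim}
# Preview first: -n (--dry-run) lists what would be copied, changes nothing
sudo rsync -aAXvn \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/mnt/*","/media/*"} \
  / /mnt/backup/

# Run for real once the file list looks right
sudo rsync -aAXv \
  --exclude={"/dev/*","/proc/*","/sys/*","/tmp/*","/mnt/*","/media/*"} \
  / /mnt/backup/
\end{verbatim}
\normalsize
Excluding \texttt{/mnt/*} also keeps the backup from copying itself when the destination lives under \texttt{/mnt}.
\end{frame}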
\begin{frame}
\frametitle{rsnapshot: rsync automated}
rsnapshot produces automated, periodic system snapshots.
\vfill
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves hard and symbolic links, file permissions and ownership, modification times, etc.
\item network efficient.
\item each snapshot looks like a full system backup (unchanged files are hard links, so they take no extra space).
\item easy to use.
\end{itemize}
\end{exampleblock}
\vfill
\begin{alertblock}{Cons}
\begin{itemize}
\item no storage encryption.
\end{itemize}
\end{alertblock}
\end{frame}
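\begin{frame}[fragile]
\frametitle{rsnapshot: configuration (sketch)}
A minimal sketch of the lines one would typically adjust in \texttt{/etc/rsnapshot.conf}; it is a fragment of the default configuration, not a complete file. The paths and retention counts are assumptions for illustration.
\footnotesize
\begin{verbatim}
# Fields must be separated by TABs, not spaces
snapshot_root   /backup/snapshots/
retain  daily   7
retain  weekly  4
backup  /home/  localhost/
\end{verbatim}
\begin{verbatim}
sudo rsnapshot configtest   # check the configuration syntax
sudo rsnapshot daily        # take a "daily" snapshot (usually run from cron)
\end{verbatim}
\end{frame}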
\begin{frame}
\frametitle{duplicity}
duplicity produces encrypted, incremental backups in tar format.
\begin{exampleblock}{Pros}
\begin{itemize}
\item preserves hard and symbolic links, file permissions and ownership, modification times, etc.
\item network efficient.
\item incremental backups.
\item supports storage encryption with gpg.
\item easy to use.
\end{itemize}
\end{exampleblock}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity /home/user scp://user@host//backup/directory
\vfill\pause
\item duplicity [restore] scp://user@host//backup/directory /home/user
\vfill\pause
\item duplicity full /home/user scp://user@host//backup/directory
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{duplicity: usage}
\begin{itemize}
\item duplicity list-current-files scp://user@host//backup/directory
\begin{itemize}
\item list the files contained in the backup.
\end{itemize}
\vfill\pause
\item duplicity [restore] -t 3D scp://user@host//backup/directory /home/user
\begin{itemize}
\item restore the files as they were at the specified time (here, 3 days ago).
\end{itemize}
\vfill\pause
\item duplicity remove-older-than 30D scp://user@host//backup/directory
\begin{itemize}
\item remove backup sets older than the specified period (add \textit{-{}-force} to actually delete them rather than just list them).
\end{itemize}
\end{itemize}
\end{frame}
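\begin{frame}[fragile]
\frametitle{duplicity: encryption and verification (sketch)}
A hedged sketch of an encrypted backup followed by a check of what was stored. \texttt{KEYID} is a placeholder for your own GPG key ID; the URL is the same example target used in the previous slides.
\scriptsize
\begin{verbatim}
# Encrypt the backup with a specific GPG key
duplicity --encrypt-key KEYID /home/user scp://user@host//backup/directory

# Check the backup; --compare-data also compares file contents
# with the local copies
duplicity verify --compare-data scp://user@host//backup/directory /home/user

# Show the chain of full and incremental backup sets
duplicity collection-status scp://user@host//backup/directory
\end{verbatim}
\end{frame}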
\begin{frame}
\frametitle{Demo}
\begin{center}
{\Huge Demo!}
\end{center}
\end{frame}
\begin{frame}
\frametitle{Last but not Least}
\begin{itemize}
\item When you use duplicity with encryption enabled, always remember to back up the GPG keys you use to encrypt and sign the backup.\\
If you lose them you won't be able to restore the backup.\pause
\item Always check that the backup is actually taking place; don't assume everything works just because you followed a guide to the letter.\pause
\item Always test the backup by actually restoring it. You would be surprised how often backup procedures do not really work, and if you never test them you will find out only when the files are gone.
\end{itemize}
\end{frame}
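\begin{frame}[fragile]
\frametitle{Keys and test restores (sketch)}
The two habits above can be sketched as follows. \texttt{KEYID}, the output file names and \texttt{/tmp/restore-test} are placeholders chosen for illustration.
\scriptsize
\begin{verbatim}
# Export the keys and store them away from the machine being backed up
gpg --export --armor KEYID > backup-public.asc
gpg --export-secret-keys --armor KEYID > backup-secret.asc

# Restore into a scratch directory, then compare with the live data
duplicity restore scp://user@host//backup/directory /tmp/restore-test
diff -r /home/user /tmp/restore-test
\end{verbatim}
\normalsize
Files changed since the last backup will show up in the \texttt{diff} output; anything else that differs deserves a closer look.
\end{frame}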
\begin{frame}
\frametitle{Hi again GitLab}
\begin{center}
\includegraphics[width=0.9\textwidth,height=0.5\textheight]{gitlab2}
\end{center}
\footnotetext{\url{https://docs.google.com/document/d/1GCK53YDcBWQveod9kfzW-VCxIABGiryG7_z_6jHdVik/pub}}
%\pause
%\footnotesize{I don't want to put shame on GitLab for this incident, but only to use it as a case study.\\ In fact I think that the incident has been managed really well by the GitLab Team.\\
%Instead of starting blaming each other and finding silly excuses as usually happens in cases like this, they have been really open from the beginning about the problem and put as a priority the restore of the functionality of the service.}
\end{frame}
\begin{frame}
\frametitle{Before the Backup}
A different approach to data protection is to use RAID (\textit{Redundant Array of Independent Disks}).\\
\pause
In general, RAID aims to provide:
\begin{itemize}
\item Survival of the system when a disk fails.
\item In certain configurations, higher performance than a single disk.
\end{itemize}
\footnotetext{For further information you can visit \url{https://www.digitalocean.com/community/tutorials/an-introduction-to-raid-terminology-and-concepts}}
\end{frame}
\begin{frame}
\frametitle{RAID Configurations}
\begin{figure}
\centering
\includesvg[width = 75pt]{RAID_0}
\includesvg[width = 75pt]{RAID_1}\\
\includesvg[width = 150pt]{RAID_5}
\end{figure}
\end{frame}
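\begin{frame}
\frametitle{RAID Configurations: capacity (worked example)}
As a rough guide to the three layouts pictured, take $n$ identical disks of size $s$. The figures in the last column assume three 4 TB disks, a size chosen purely for illustration.
\vfill
\begin{center}
\begin{tabular}{lccc}
Level & Usable capacity & Disk failures tolerated & $n=3$, $s=4$ TB \\
\hline
RAID 0 & $n \cdot s$ & 0 & 12 TB \\
RAID 1 & $s$ & $n-1$ & 4 TB \\
RAID 5 & $(n-1) \cdot s$ & 1 & 8 TB \\
\end{tabular}
\end{center}
\vfill
RAID 5 needs at least three disks.
\end{frame}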
\begin{frame}
\frametitle{Problems}
RAID can help in the event of a disk failure, but it doesn't protect us against \textbf{Silent Data Corruption}.\\\pause
To address this problem, new-generation filesystems like ZFS and Btrfs have been created. Typical features of this kind of filesystem are:
\begin{itemize}
\item Copy-on-write
\item Deduplication
\item Data \& metadata checksums
\item Integrated RAID
\item Volume management
\item Snapshots
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Snapshots}
\begin{itemize}
\item Snapshots are particularly useful because they give us an (almost) instant copy of a volume that we can restore later, archive somewhere, etc.\\\pause
\item We can therefore take one before making potentially risky changes to a system and restore the previous state with little effort.\\\pause
\item Remember that a separate \textit{classical} backup is always useful, in particular for the important data of our applications.\pause
\item RAID is not a backup.
\end{itemize}
\end{frame}
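\begin{frame}[fragile]
\frametitle{Snapshots: commands (sketch)}
A minimal sketch of taking a snapshot before a risky change. It assumes that \texttt{/} is a Btrfs subvolume with an existing \texttt{/.snapshots} directory on the same filesystem, and that \texttt{tank/home} is a ZFS dataset; all names are hypothetical.
\footnotesize
\begin{verbatim}
# Btrfs: read-only (-r) snapshot of the subvolume mounted at /
sudo btrfs subvolume snapshot -r / /.snapshots/before-upgrade

# ZFS: snapshot a dataset, then roll back if the change goes wrong
sudo zfs snapshot tank/home@before-upgrade
sudo zfs rollback tank/home@before-upgrade
\end{verbatim}
\end{frame}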
\begin{frame}
\frametitle{References}
\begin{itemize}
\item \url{https://wiki.archlinux.org/index.php/Full_system_backup_with_rsync}
\item \url{https://wiki.archlinux.org/index.php/Duplicity}
\item \url{http://duplicity.nongnu.org/}
\item \url{https://www.digitalocean.com/community/tutorials/how-to-use-duplicity-with-gpg-to-securely-automate-backups-on-ubuntu}
\item \url{https://github.com/zertrin/duplicity-backup.sh}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Special Thanks}
This presentation uses the material from previous editions of the course as its reference and starting point.\\
Special thanks to \textit{Valeria Mazzola}\footnote{\url{https://slides.poul.org/2016/corsi-linux-avanzati/Backup_and_Restore.pdf}} and \textit{Federico Amedeo Izzo}\footnote{\url{https://filesystem.izzo.ovh/}} for the slides of the two previous editions of this talk.
\end{frame}
\begin{frame}
\frametitle{License}
\begin{center}
{\Huge Thank you!}
\vfill
\includesvg[height=1.5cm]{by-sa}\\
{\footnotesize These slides are published under a Creative Commons Attribution-ShareAlike 4.0 license.}
\end{center}
\end{frame}
\end{document}