|
Here follow the main features of the dar/libdar tools. Each feature entry gives an overview and provides pointers you are welcome to follow for more detailed information. |
FILTERS |
references: man dar / command line usage notes |
keywords: -I -X -P -g -[ -] -am |
|
dar can back up anything from a whole file system down to a single file, thanks to its filter mechanism. This mechanism is dual-headed: the first head lets you decide which part of a directory tree to consider for the operation (backup, restoration, etc.), while the second head defines which type of file to consider (a filter based only on the filename, for example on the file's extension). |
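The two heads can be pictured with a small Python sketch. This is only an illustration of the idea, not dar's actual matching rules: the function names, the argument layout and the use of shell-style masks via `fnmatchcase` are assumptions made for the example.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath


def in_subtree(path, included_dirs, excluded_dirs):
    """First head: keep paths located under an included directory
    and not under an excluded one."""
    p = PurePosixPath(path)

    def under(directory):
        d = PurePosixPath(directory)
        return p == d or d in p.parents

    return (any(under(d) for d in included_dirs)
            and not any(under(d) for d in excluded_dirs))


def name_matches(path, include_masks, exclude_masks):
    """Second head: keep files whose name (not their path) matches an
    include mask and no exclude mask."""
    name = PurePosixPath(path).name
    return (any(fnmatchcase(name, m) for m in include_masks)
            and not any(fnmatchcase(name, m) for m in exclude_masks))


def selected(path, included_dirs, excluded_dirs, include_masks, exclude_masks):
    """A file is considered only if both heads accept it."""
    return (in_subtree(path, included_dirs, excluded_dirs)
            and name_matches(path, include_masks, exclude_masks))
```

For instance, `selected("home/alice/notes.txt", ["home"], ["home/alice/tmp"], ["*"], ["*.o"])` accepts the file, while the same call on `home/alice/main.o` rejects it because of the filename mask.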
DIFFERENTIAL BACKUP | references: man dar/TUTORIAL |
keywords: -A | |
When making a backup with dar, you can make either a full backup or a differential backup. A full backup, as expected, saves all the files specified on the command line (with or without filters). A differential backup, in addition to applying the filter mechanism, saves only the files that have changed since a given reference backup. Additionally, files that existed in the reference backup but no longer exist at the time of the differential backup are recorded in the backup as having "been removed". At recovery time (unless you deactivate this behavior), restoring a differential backup will update changed files and add new files, but also remove the files that have been recorded as "been removed". Note that the reference backup can be a full backup or another differential backup. This way you can, for example, make a first full backup, then many differential backups, each taking the last backup made as its reference. |
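The comparison against a reference backup can be sketched as follows. This is a simplified model, assuming each backup is summarized as a dictionary mapping a path to a hypothetical `(mtime, size)` pair; the criteria dar really uses to decide that a file has changed are described in its man page.

```python
def differential(reference, current):
    """Compare the current state of the files with a reference backup.

    Both arguments map a path to a (mtime, size) tuple.
    Returns the paths to save (new or changed files) and the paths to
    record as removed (present in the reference, absent now).
    """
    to_save = sorted(path for path, meta in current.items()
                     if reference.get(path) != meta)
    removed = sorted(path for path in reference if path not in current)
    return to_save, removed


reference = {"a": (1, 10), "b": (1, 20), "c": (1, 30)}
current = {"a": (1, 10), "b": (2, 25), "d": (3, 5)}
to_save, removed = differential(reference, current)
```

Here `b` (changed) and `d` (new) would be saved, `c` would be recorded as removed, and `a` would be left out of the differential backup.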
SLICES | references: man dar/TUTORIAL |
keywords: -s -S -p -aSI -abinary |
|
Dar stands for Disk ARchive. From the beginning it was designed to be able to split an archive over several removable media, whatever their number and whatever their size. To restore from such a split archive, dar fetches the requested data directly from the correct slice(s). Thus dar is able to save and restore using old floppy disks, CD-R, DVD-R, CD-RW, DVD-RW, Zip, Jaz, etc. However, dar does not mount or unmount removable media itself; it is independent of the hardware. Given a size, it splits the archive into several files (called SLICES), optionally pausing before creating the next one. This gives the user time to mount or unmount a medium, burn the file on a CD-R, or send it by email (if your mail system does not allow huge files in emails, dar can help you here too). By default (no size specified), dar makes a single slice, whatever its size. Additionally, the size of the first slice can be specified separately, for example if you first want to fill up a partially filled disk before starting to use empty ones. Last, at restoration time, dar pauses and prompts the user for a slice only if it is missing. Note that all these operations can be automated using the "user command between slices" feature (presented below), which lets dar do whatever you want once a slice is created or before a slice is read. |
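The slicing arithmetic described above can be sketched in a few lines of Python. This is a rough model only: the function and parameter names are hypothetical, and the sketch ignores the header and other bookkeeping data that real slices carry.

```python
def slice_sizes(total, slice_size=None, first_slice_size=None):
    """Return the size of each slice for an archive of `total` bytes.

    With no slice size, the archive is a single slice. Otherwise the
    first slice may have its own size and all following slices use
    `slice_size`, the last one holding whatever remains.
    """
    if slice_size is None:
        return [total]
    if slice_size <= 0 or (first_slice_size is not None and first_slice_size <= 0):
        raise ValueError("slice sizes must be positive")
    sizes = []
    remaining = total
    next_size = first_slice_size if first_slice_size is not None else slice_size
    while remaining > 0:
        chunk = min(remaining, next_size)
        sizes.append(chunk)
        remaining -= chunk
        next_size = slice_size  # only the first slice uses the special size
    return sizes
```

For example, `slice_sizes(1000, 300, 100)` models filling up a partially used medium with a 100-byte first slice before continuing with 300-byte slices.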
DIRECTORY TREE SNAPSHOT | references: man dar |
keywords: -A + |
|
Dar can make a snapshot of a directory tree, recording the inode status of its files. This may be used to detect changes in the filesystem by "diffing" the resulting archive against the filesystem at a later time. The resulting archive can also be used as a reference to save the files that have changed since the snapshot was made. A snapshot archive is very small compared to the corresponding full backup, but it cannot be used to restore any data. |
COMPRESSION | references: man dar |
keywords: -z |
|
dar can use compression. By default no compression is used. Currently the gzip, bzip2 and lzo algorithms are implemented, and there is still room for other compression algorithms. Note that compression is done before slicing, which means that using compression together with slices will not make the slices smaller, but will probably result in fewer slices in the backup. |
DIRECT ACCESS | |
Even when compression and/or encryption is used, dar does not have to read the whole backup to extract one file. This way, if you just want to restore one file from a huge backup, the process will be much faster than with tar. Dar first reads the catalogue (i.e. the contents of the backup), then goes directly to the location of the saved file(s) you want to restore and proceeds with the restoration. In particular, when slices are used, dar will ask only for the slice(s) containing the file(s) to restore. |
SEQUENTIAL ACCESS |
references: man dar |
(suitable for tapes) |
--sequential-read, -at |
The direct access feature described above is well suited to random access media like disks, but not to tapes. Since release 2.4.0, dar provides a sequential mode in which it reads and writes archives sequentially. This mode is efficient with tapes but suffers from the same drawback as tar archives: restoring a single file from a huge archive is slow. |
HARD LINK CONSIDERATION | |
Hard links are properly saved in all cases and properly restored whenever possible. For example, if a restoration spans a mount point, hard linking across the two filesystems will fail, but dar will then duplicate the inode and file contents and issue a warning. Hard link support includes the following inode types: plain files, char devices, block devices and symlinks (yes, you can hard link symbolic links! Thanks to Wesley Leggette for the info ;-) ) |
SPARSE FILES |
references: man dar |
--sparse-file-min-size, -ah | |
By default dar takes care of sparse files, even if the underlying filesystem does not support sparse files(!). When a long sequence of zeroed bytes is met in a file during backup, those bytes are not stored in the archive; only their number is stored instead (a structure known as a "hole"). When the time comes to restore that file, dar restores the normal data, but when it meets a hole in the archive it skips directly to the position of the data following that hole. If the underlying filesystem supports sparse files, this (re)creates a hole in the restored file, making it a sparse file. Sparse files can report a size of several hundred gigabytes while needing only a few bytes of disk space. Being able to save and restore them properly avoids wasting disk space both in archives and at restoration time. |
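The hole mechanism can be illustrated with the following Python sketch. It is a toy model under stated assumptions: the segment representation and the `min_hole` threshold (here 15 bytes, an arbitrary value for the example) are hypothetical and do not describe dar's archive format.

```python
import io


def encode(data, min_hole=15):
    """Split `data` into ("data", bytes) and ("hole", length) segments.

    Runs of zeroed bytes at least `min_hole` long become holes; shorter
    runs stay inside the surrounding data segment.
    """
    segments = []
    start = 0  # start of the literal data not yet emitted
    i = 0
    n = len(data)
    while i < n:
        if data[i] == 0:
            j = i
            while j < n and data[j] == 0:
                j += 1
            if j - i >= min_hole:
                if i > start:
                    segments.append(("data", data[start:i]))
                segments.append(("hole", j - i))
                start = j
            i = j
        else:
            i += 1
    if n > start:
        segments.append(("data", data[start:]))
    return segments


def restore(segments, out):
    """Write data segments and seek over holes instead of writing zeros."""
    for kind, value in segments:
        if kind == "data":
            out.write(value)
        else:
            out.seek(value, io.SEEK_CUR)
    # On a regular file this extends the file when the last segment is a hole.
    out.truncate()
```

Seeking forward rather than writing zeros is what lets a filesystem with sparse file support leave the skipped range unallocated.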
EXTENDED ATTRIBUTES (EA) | references: man dar |
MacOS X FILE FORKS / ACL | keywords: -u -U -am -ae --alter=list-ea |
Dar is able to save and restore EA, either all of them or just those matching a given pattern. File forks (MacOS X) are implemented on top of EA, as are Linux ACLs; they are thus transparently saved, tested, compared and restored by dar. Note that ACLs under MacOS do not seem to rely on EA, so they are ignored by dar; they are, however, only marginally used.
|
ARCHIVE TESTING | references: man dar/TUTORIAL/Good Backup Practice |
keywords: -t |
|
Thanks to CRC (cyclic redundancy checks), dar is able to detect data corruption in the archive. Only the files in which data corruption occurred cannot be restored; dar will restore all the others, even when compression or encryption (or both) is used. |
DATA PROTECTION | references: man dar/Parchive integration |
keywords: -al |
|
dar relies on the Parchive program for data protection against media errors. Thanks to dar's ability to run user commands or scripts, and thanks to the ad hoc scripts provided, using Parchive is as simple as adding one word (par2) on the command line. Depending on the context (archive creation, archive testing, ...), dar will by this means create parity data for each slice, or verify and if necessary repair the archive slices. However, even without Parchive, dar can use an isolated catalogue as a backup of an archive's internal catalogue, whose corruption could otherwise make the whole archive unreadable. The other vital information (like the slice layout) is replicated in each slice, which lets dar overcome data corruption in that part too and still restore something in case of a major problem. As a last resort, dar also offers a "lax" mode in which the user is asked questions (like the compression algorithm used, ...) to help dar recover very corrupted archives. However, this does not replace Parchive and has to be considered a last-resort option. |
REMOTE OPERATIONS | references: command line usage notes, man dar/dar_slave/dar_xform |
USING PIPES | keywords: -i -o - |
dar is able to produce an archive on its standard output or on a named pipe. It is also able to read an archive from its standard input or from a named pipe, which makes remote backups easy. However, this requires reading the archive in sequential mode, which means transferring the whole archive just to restore a single file. For that reason, dar is also able to read an archive through a pair of pipes, with dar_slave at one end and dar at the other. Through one pipe of the pair, dar tells dar_slave which portion of the archive to send through the other pipe. This makes a remote restoration much more efficient, and the transfer can still be protected: simply running dar_slave remotely through an ssh session, for example, encrypts all exchanges. |
ISOLATION | references: man dar |
keywords: -C -A -@ |
|
The catalogue (i.e. the contents of an archive) can be extracted to a small file (this operation is called isolation), which can in turn be used as the reference for a differential archive. There is then no need to provide the archive itself to create a differential backup based on it; its catalogue alone is enough. Such an isolated catalogue can also be used to rescue the archive it has been isolated from, should the archive's internal catalogue become corrupted. An isolated catalogue can be created at the same time as the archive (an operation called on-fly isolation) or as a separate operation (called isolation). |
RE-SHAPE SLICES OF AN EXISTING ARCHIVE | references: man dar_xform |
|
|
The provided program named "dar_xform" is able to change the size of the slices of a given archive. The resulting archive is identical to an archive created directly by dar. The source archive can be taken from a set of slices, from standard input or even from a named pipe. Note that dar_xform can work on encrypted and/or compressed data without having to decompress or even decrypt it. |
USER COMMAND BETWEEN SLICES | references: man dar dar_slave dar_xform/command line usage notes |
keywords: -E -F -~ |
|
Several hooks are provided for dar to call a given command once a slice has been written or before a slice is read. Several macros allow the user command or script to know the requested slice number, the path and the archive basename. |
USER COMMAND BEFORE AND AFTER SAVING A DIRECTORY OR A FILE |
references: man dar/command line usage notes |
keywords: -< -> -= |
|
It is possible to define a set of files for which a command is executed before dar starts saving them and again once dar has finished saving them. This is especially intended for backing up live databases. Before entering a directory, dar calls the specified user command, then proceeds to back up that directory. Once the whole directory has been saved, dar calls the same user command again (with slightly different arguments) and then continues the backup process. Such a user command may, for example, stop the database beforehand and restart it afterward. |
STRONG ENCRYPTION | references: man dar |
keywords: -K -J -# -* blowfish, twofish, aes256, serpent256, camellia256 |
|
Dar can use the blowfish, twofish, aes256, serpent256 and camellia256 algorithms to encrypt the whole archive. Two "elastic buffers" are inserted and encrypted with the rest of the data, one at the beginning and one at the end of the archive, to prevent clear text (known-plaintext) or codebook attacks. |
SLICE HASHING |
references: man dar |
--hash, md5, sha1 |
|
When creating an archive, dar can compute an md5 or sha1 hash of each slice before it is written to disk and produce a small file, compatible with md5sum or sha1sum, that lets you verify that each slice of the archive is not corrupted. |
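As an illustration of what "compatible with md5sum or sha1sum" implies, here is a minimal Python sketch that builds a checksum line in the layout those tools read (hexadecimal digest, two spaces, file name). The function name and the in-memory `data` argument are assumptions made for the example; this is not how dar itself computes its hashes.

```python
import hashlib


def checksum_line(data, filename, algo="md5"):
    """Return one line in the format read by `md5sum -c` or `sha1sum -c`:
    the hexadecimal digest, two spaces, then the file name."""
    digest = hashlib.new(algo, data).hexdigest()
    return f"{digest}  {filename}\n"
```

Writing such a line next to a slice lets the standard checksum tools verify that slice later without involving dar.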
CONFIGURATION FILE | references: man dar, conditional syntax and user targets |
keywords: -B |
|
dar can read parameters from a file. This is a way to work around the limited length of the command line. A configuration file can ask dar to read (or include) other configuration files. A simple but efficient mechanism forbids a file from including itself, directly or indirectly, and there is no limit on the depth of recursion for the inclusion of configuration files. Two special configuration files, $HOME/.darrc and /etc/darrc, are read if they exist. They share the same syntax as any configuration file, which is the syntax used on the command line, optionally completed by newlines and comments. Any configuration file can also contain conditional statements, which describe which options are to be used under different conditions. Conditions are: "restoration", "listing", "testing", "difference", "saving", "isolation", "any operation", "none yet defined" (which may be useful in case of recursive inclusion of files) ... |
SELECTIVE COMPRESSION | references: man dar/samples |
keywords: -Y -Z -m -am |
|
dar can be given a special filter that determines which files will be compressed and which will not. This way you can speed up the backup operation by not trying to compress *.mp3, *.mpg, *.zip, *.gz and other already compressed files, for example. Moreover, another mechanism allows you to specify that files under a given size (whatever their name) will not be compressed. |
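The decision can be pictured with a short Python sketch combining both mechanisms. The function name, the default mask list and the `min_size` threshold are illustrative assumptions, not dar's defaults.

```python
from fnmatch import fnmatchcase


def should_compress(name, size,
                    no_compress_masks=("*.mp3", "*.mpg", "*.zip", "*.gz"),
                    min_size=100):
    """Decide whether a file is worth compressing.

    Files smaller than `min_size` bytes are stored as-is whatever their
    name, and so are files whose name matches a no-compression mask.
    """
    if size < min_size:
        return False
    lowered = name.lower()  # compare case-insensitively in this sketch
    return not any(fnmatchcase(lowered, mask) for mask in no_compress_masks)
```

With these hypothetical defaults, `should_compress("song.mp3", 5_000_000)` is false while `should_compress("notes.txt", 5000)` is true.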
DAR MANAGER | references: man dar_manager |
The advantage of differential backups is that they take much less space to store and much less time to complete than always making full backups. On the other hand, because their reduced space requirement lets you keep many of them, restoring a particular file may require spending time finding which backup holds its most recent version. This is solved by dar_manager. This command-line program gathers information about the contents of all your backups. At restoration time, it calls dar for you to restore the requested file(s) from the proper backup. |
FLAT RESTORATION | references: man dar |
keywords: -f | |
It is possible to restore any file without restoring the directories and subdirectories it was in at the time of the backup. If this option is activated, all files are restored directly in the (-R) root directory, whatever location is recorded for them inside the archive. |
NODUMP FLAG | references: man dar |
keywords: --nodump | |
Linux ext2/3/4 filesystems provide a set of flags for each inode, among which is the "nodump" flag, which in substance says "don't save this file for backup". This flag is used by the so-called dump backup program. Dar can be told not to save the files that have this flag set. |
ONE FILESYSTEM | references: man dar |
keywords: -M | |
dar can back up the files of a given filesystem only: even if some subdirectories in the scope are mount points for other filesystems, dar will not recurse into these directories. |
ARCHIVE MERGING | references: man dar |
keywords: -+ -ak -A -@ |
|
From version 2.3.0, dar supports the merging of two existing archives into a single one. This merging operation is subject to the same filtering mechanism as archive creation, which lets the user define which files will be part of the resulting archive. By extension, archive merging can also take a single source archive as input. This may sound a bit strange at first, but it lets you make a subset of a given archive without having to extract any file to disk. In particular, if your filesystem does not support Extended Attributes (EA), you can still use this feature to clean up an archive, removing files you do not want to keep anymore, without losing any EA or changing any standard file attribute (like modification dates) of the files that stay in the resulting archive. Last, this merging feature also gives you the opportunity to change the compression level or algorithm used, as well as the encryption algorithm and passphrase. Of course, from a pair of source archives you can do all of this at the same time: filter out the files you do not want in the resulting archive, use a different compression level and algorithm or a different encryption password and algorithm than the source archive(s), and use a different slicing or no slicing at all (though dar_xform is more efficient if re-slicing is all you need, see "RE-SHAPE SLICES OF AN EXISTING ARCHIVE" above for details). |
ARCHIVE SUBSETTING |
references: man dar |
keywords: -+ -ak |
|
As seen above under the "archive merging" feature description, it is possible to define a subset of the files of an archive and put them into a new archive without actually extracting these files to disk. To speed up the process, it is possible to avoid uncompressing and recompressing the files that are kept in the resulting archive; alternatively, you can change their compression, as well as the encryption scheme used. Last, you can manipulate files and their EA this way even if EA support is not available on your system. |
DECREMENTAL BACKUP | references: man dar / Decremental backup |
keywords: -+ -ad |
|
As opposed to incremental backups, where the oldest backup is a full backup and each subsequent backup contains only the changes from the previous one, decremental backups let the full backup be the most recent one, while each older backup contains only the differences from the backup made just after it. This has the advantage of requiring a single archive to restore a whole system (dar_manager is not necessary) while reducing the overall amount of data needed to retain older versions of files (the same amount as with differential backups). Another advantage is that you do not have to keep several sets of backups: you just delete the oldest backup when you need storage space. The drawback is that each new backup requires creating a full backup, then transforming the previous full backup into a so-called decremental backup. Everything has a cost! ;-) |
DRY-RUN EXECUTION |
references: man dar |
keywords: -e |
|
You can run any feature without actually performing the action. Dar will report any problem but will not create, remove or modify any file. |
DIRTY FILES |
references: man dar |
keywords: --dirty-behavior , --retry-on-change |
|
At backup time, dar checks that each saved file has not changed while it was being read. If a file has changed during that time, it is flagged as "dirty" in the archive and handled differently from other files at restoration time. A dirty file can be handled in one of three ways: warn the user before restoring it, ignore it and not restore it, or ignore the dirty flag and restore it normally. Dar can also retry saving a file that has been found dirty before actually setting the "dirty" flag for that file in the archive. This retry option is limited to a maximum number of tries per file, after which the file is definitively marked as dirty and the backup process continues with the next file. |
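The retry logic can be sketched in Python as follows. This is a simplified model, assuming a change is detected by comparing a modification time taken before and after reading; the callables `read_file` and `get_mtime` and the `max_retries` default are hypothetical names and values introduced for the example.

```python
def save_with_retry(read_file, get_mtime, max_retries=3):
    """Read a file, retrying while it changes during the read.

    `read_file` returns the file contents and `get_mtime` returns its
    current modification time. Returns (data, dirty): `dirty` is True
    when the file still changed after the last allowed retry.
    """
    retries = 0
    while True:
        before = get_mtime()
        data = read_file()
        if get_mtime() == before:
            return data, False  # file was stable while being read
        if retries >= max_retries:
            return data, True   # give up and flag the file as dirty
        retries += 1
```

With `max_retries=0` the file is read exactly once and flagged as dirty as soon as a change is detected.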
ARCHIVE USER COMMENTS |
references: man dar |
keywords: --user-comment, -l -v, -l -q |
|
The archive header can include a message from the user. This message is never ciphered nor compressed and is always available to anyone listing the archive summary (-l and -q options). Several macros are available to make this option more convenient, such as the current date, the uid and gid used for archive creation, the hostname, and the command line used for the archive creation. |
PADDED ZEROS TO SLICE NUMBER |
references: man dar |
keywords: --min-digits |
|
Dar slices are numbered by integers starting at 1, which gives filenames of the following form: archive.1.dar, archive.2.dar, ..., archive.10.dar, etc. However, the lexicographical order used by many directory listing tools does not show the slices in numerical order. For that reason, dar lets the user define how many leading zeros to add to the slice numbers so that usual file browsers list the slices as expected. For example, with 3 as the minimum number of digits, the slice names become: archive.001.dar, archive.002.dar, ... archive.010.dar. |
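The effect of padding on lexicographical sorting can be checked with a short Python sketch. The helper below is hypothetical; it only reproduces the naming pattern shown above.

```python
def slice_name(basename, number, min_digits=1, extension="dar"):
    """Build a slice filename, zero-padding the number to `min_digits`."""
    return f"{basename}.{number:0{min_digits}d}.{extension}"


plain = sorted(slice_name("archive", n) for n in (1, 2, 10))
padded = sorted(slice_name("archive", n, min_digits=3) for n in (1, 2, 10))
```

Without padding, `plain` sorts archive.10.dar before archive.2.dar; with three digits, `padded` keeps slices 1, 2 and 10 in numerical order.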
CACHE DIRECTORY TAGGING STANDARD |
references: man dar |
keywords: --cache-directory-tagging |
|
Many programs use cache directories (the mozilla web browser, for example), that is, directories holding temporary data that is not worth backing up. The Cache Directory Tagging Standard provides a standard way for software applications to identify this type of data, which lets dar take it into account and avoid saving it. |
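For reference, the standard identifies a cache directory by a file named CACHEDIR.TAG whose content starts with a fixed signature. The following Python sketch shows such a check; it illustrates the standard itself and is not taken from dar's implementation.

```python
import os

# Fixed header defined by the Cache Directory Tagging Standard.
SIGNATURE = b"Signature: 8a477f597d28d172789f06886806bc55"


def is_cache_directory(path):
    """Return True if `path` contains a valid CACHEDIR.TAG file."""
    tag = os.path.join(path, "CACHEDIR.TAG")
    try:
        with open(tag, "rb") as handle:
            return handle.read(len(SIGNATURE)) == SIGNATURE
    except OSError:
        # Missing or unreadable tag file: not a tagged cache directory.
        return False
```

A backup tool walking a tree could call such a function on each directory and skip the ones for which it returns true.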