Linux data compressors (gzip, bzip2 and xz)
Data compression saves space on backup media and makes backups more efficient.
Linux has three main data compressors, each using a different compression algorithm. The first to appear was gzip, followed by bzip2 and finally xz.
Gzip and gunzip data compressor
The first widely used data compressor is gzip. It uses a compression algorithm called Lempel-Ziv (LZ77). This technique finds duplicate strings in the input data. The second occurrence of a string is replaced by a pointer to the earlier one, in the form of a distance and length pair. When compressing a file, gzip adds the suffix .gz.
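To compress a file (here a placeholder named `file`), run `gzip file`.
To decompress it, run `gunzip file.gz`.
Or use gzip's own -d option: `gzip -d file.gz`.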
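The gzip round trip can be tried safely in a scratch directory. The sketch below is only an illustration: the temporary directory and the file name notes.txt are made up for the demo.

```shell
# Hypothetical scratch directory and file name, used only for this demo.
workdir=$(mktemp -d)
printf 'hello hello hello hello\n' > "$workdir/notes.txt"
orig_sum=$(cksum < "$workdir/notes.txt")

# Compress: notes.txt is replaced by notes.txt.gz.
gzip "$workdir/notes.txt"
after_compress=$(ls "$workdir")

# Decompress: notes.txt.gz is replaced by notes.txt.
# "gzip -d" would do the same job as gunzip here.
gunzip "$workdir/notes.txt.gz"
after_decompress=$(ls "$workdir")
new_sum=$(cksum < "$workdir/notes.txt")

echo "after gzip:   $after_compress"
echo "after gunzip: $after_decompress"
rm -rf "$workdir"
```

Comparing the two checksums confirms that the restored file matches the original byte for byte.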
bzip2 and bunzip2 data compressor
The bzip2 compressor compresses files using the Burrows-Wheeler block-sorting transform followed by Huffman coding. This technique operates on large blocks of data. The larger the block size, the higher the compression ratio achieved. It generally compresses better than gzip, at the cost of speed. When compressing a file, bzip2 adds the suffix .bz2.
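To compress a file (here a placeholder named `file`), run `bzip2 file`.
To decompress it, run `bunzip2 file.bz2`.
Or use bzip2's own -d option: `bzip2 -d file.bz2`.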
In some cases the compressed file may end up larger than the original. This can happen when the algorithm finds no repetition to exploit in the data, so the output is roughly the original data plus the compressor's header.
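This effect is easy to reproduce with random data, which contains essentially nothing to compress. The sketch below uses gzip and a made-up file name, random.bin, purely for illustration.

```shell
# Hypothetical scratch directory and file name, used only for this demo.
workdir=$(mktemp -d)

# 4096 random bytes: practically no repetition for the compressor to exploit.
head -c 4096 /dev/urandom > "$workdir/random.bin"

# -c writes the compressed data to stdout, leaving the original in place.
gzip -c "$workdir/random.bin" > "$workdir/random.bin.gz"

orig_size=$(wc -c < "$workdir/random.bin" | tr -d ' ')
gz_size=$(wc -c < "$workdir/random.bin.gz" | tr -d ' ')
echo "original: $orig_size bytes, compressed: $gz_size bytes"
rm -rf "$workdir"
```

The compressed copy comes out slightly larger because the gzip header and trailer are added to data that could not be shrunk.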
xz and unxz data compressor
Finally, there is the xz data compressor, which uses the LZMA2 algorithm, a member of the same Lempel-Ziv family as the one used by gzip. It produces files with the extension .xz or .lzma.
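To compress a file (here a placeholder named `file`), run `xz file`.
To decompress it, run `unxz file.xz`.
Or use xz's own -d option: `xz -d file.xz`.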
To give you an idea of the difference between the three compressors gzip, bzip2 and xz, see the comparative example of the TAR package from a website backup file:
site.tar.xz 2.1M # file compressed with xz
### Joining Files with Tarball
Tarball files are bundles of files and directories that maintain the original directory and file structure in a tar archive, with the possibility of compressing data.
The command to package files and directories is tar. Its name comes from “tape archive”. It reads files and directories and saves them to tape or to a file.
Along with the data, it saves important information such as the last modification time, access permissions, and other metadata. This lets it restore the data to its original state.
Despite the name, the tar options are not really optional. The command takes at least two arguments:
- options: Tell tar what to do;
- [source]: If tar is used for backup, this parameter can be a file, a device, or a directory to be copied;
- [destination]: If the command is used for backup, this parameter specifies the destination for the data. It can be a tarball file or a device. If used to restore files, it specifies the tarball file or the device from which the data will be extracted.
First, you must choose what tar should do using the options:
- -c: Creates a new .tar file;
- -u: Adds files to the .tar file only if they are new or modified;
- -r: Appends the specified files to the end of the .tar file;
- -g: Creates an incremental backup;
- -t: Lists the contents of a .tar file;
- -x: Extracts the files from a .tar archive;
It also has auxiliary options:
- -j: Uses bzip2 to compress and decompress .tar.bz2 files;
- -J: Uses xz to compress and decompress .tar.xz files;
- -z: Uses gzip to compress and decompress .tar.gz files;
- -v: Lists all processed files;
- -f: Indicates that the destination is a file on disk, not a magnetic tape drive;
The tar options can be combined into a single parameter such as “cvzf”.
Because tar was originally designed to read from and write to tape, the “f” option must always be used when creating a tar archive on disk or reading one from disk.
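The main operations can be sketched on a throwaway archive. This is a minimal illustration assuming GNU tar. The directory and file names are hypothetical, and the -C option (change to a directory before processing), which is not covered in the lists above, is used only to keep the stored paths relative.

```shell
# Hypothetical scratch directory and file names, used only for this demo.
workdir=$(mktemp -d)
mkdir "$workdir/project"
echo "first"  > "$workdir/project/a.txt"
echo "second" > "$workdir/project/b.txt"

# -c creates the archive; -f names the archive file on disk.
tar -cf "$workdir/project.tar" -C "$workdir" project/a.txt

# -r appends another file to the end of the existing, uncompressed archive.
tar -rf "$workdir/project.tar" -C "$workdir" project/b.txt

# -t lists the contents without extracting anything.
listing=$(tar -tf "$workdir/project.tar")
echo "$listing"

# -x extracts the files, here into a separate restore directory.
mkdir "$workdir/restore"
tar -xf "$workdir/project.tar" -C "$workdir/restore"
restored=$(cat "$workdir/restore/project/b.txt")
rm -rf "$workdir"
```

Note that -r works only on uncompressed archives, so it cannot be combined with -z, -j or -J.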
Examples:
To save the /var/lib/mysql directory to the file /var/backup/mysql.tar.gz, run `tar cvzf /var/backup/mysql.tar.gz /var/lib/mysql`.
To extract the same package, run `tar xvzf /var/backup/mysql.tar.gz`.
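Since /var/lib/mysql normally requires root access, the same backup and restore cycle can be rehearsed on a scratch directory first. The paths below are hypothetical stand-ins, and the v option is left out only to keep the output short.

```shell
# Hypothetical stand-ins for /var/lib/mysql and /var/backup.
workdir=$(mktemp -d)
mkdir -p "$workdir/data/mysql" "$workdir/backup" "$workdir/restore"
echo "sample row" > "$workdir/data/mysql/table.txt"

# c = create, z = compress with gzip, f = archive file name.
# -C makes tar change into the data directory before reading "mysql".
tar czf "$workdir/backup/mysql.tar.gz" -C "$workdir/data" mysql

# x = extract, z = decompress with gzip, f = archive file name.
tar xzf "$workdir/backup/mysql.tar.gz" -C "$workdir/restore"

restored=$(cat "$workdir/restore/mysql/table.txt")
echo "$restored"
rm -rf "$workdir"
```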
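You can open the contents of a gzip-compressed tarball in two steps.
First run `gunzip arquivo.tar.gz`. The gzip command decompresses arquivo.tar.gz and removes the suffix .gz, leaving arquivo.tar.
Then run `tar xvf arquivo.tar`. The tar utility extracts the contents of the package.
We can also use a simpler one-step form: `tar xvzf arquivo.tar.gz`.
Or pipe the decompressed stream straight into tar: `gzip -dc arquivo.tar.gz | tar xvf -`.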
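If the file is compressed with bzip2, decompress it with bunzip2 or with bzip2’s -d option: `bzip2 -d arquivo.tar.bz2`.
Or `bunzip2 arquivo.tar.bz2`.
And then extract it with `tar xvf arquivo.tar`.
In the case of files compressed with xz, the xz command can be used: `xz -d arquivo.tar.xz`.
Followed by `tar xvf arquivo.tar`.
Or do both in one step with `tar xvJf arquivo.tar.xz`.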
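The two-step and one-step approaches give the same result, as the following sketch shows for a gzip-compressed tarball. The archive is built on the spot from a made-up file, so every name other than arquivo.tar.gz is hypothetical.

```shell
# Hypothetical scratch directory and sample file, used only for this demo.
workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/two_step" "$workdir/one_step"
echo "payload" > "$workdir/src/file.txt"
tar czf "$workdir/arquivo.tar.gz" -C "$workdir/src" file.txt
cp "$workdir/arquivo.tar.gz" "$workdir/copy.tar.gz"

# Two steps: gunzip leaves copy.tar behind, then tar unpacks it.
gunzip "$workdir/copy.tar.gz"
tar xf "$workdir/copy.tar" -C "$workdir/two_step"

# One step: the z option makes tar run the gzip decompression itself.
tar xzf "$workdir/arquivo.tar.gz" -C "$workdir/one_step"

# cmp -s is silent and succeeds only if both files are identical.
if cmp -s "$workdir/two_step/file.txt" "$workdir/one_step/file.txt"; then
  same=yes
else
  same=no
fi
echo "identical results: $same"
rm -rf "$workdir"
```

The same pattern applies to bzip2 and xz archives by swapping in the j or J option.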
In a graphical environment, you can decompress and extract a tarball with little effort, simply by clicking on the file. The desktop then invokes the appropriate compressor in the background, together with tar, to extract the package into the current directory.
See the comparison between the compression performed by the gzip, bzip2 and xz compressors and an uncompressed tar archive:
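A comparison of this kind can be reproduced on any machine. The sketch below is an illustration under stated assumptions: it uses generated sample data rather than a real website backup, and it skips any compressor that is not installed. The actual sizes depend on the data, so none are shown here.

```shell
# Hypothetical scratch directory with repetitive sample data.
workdir=$(mktemp -d)
mkdir "$workdir/site"
i=1
while [ "$i" -le 2000 ]; do
  echo "<p>line $i of a sample page</p>"
  i=$((i + 1))
done > "$workdir/site/index.html"

# Plain, uncompressed tar archive used as the baseline.
tar cf "$workdir/site.tar" -C "$workdir" site

# Compress the same archive with each tool that is available.
for pair in gzip:gz bzip2:bz2 xz:xz; do
  tool=${pair%%:*}
  ext=${pair##*:}
  if command -v "$tool" > /dev/null 2>&1; then
    "$tool" -c "$workdir/site.tar" > "$workdir/site.tar.$ext"
  fi
done

# Print the byte count of the plain archive and of each compressed copy.
wc -c "$workdir"/site.tar*

tar_size=$(wc -c < "$workdir/site.tar" | tr -d ' ')
gz_size=$(wc -c < "$workdir/site.tar.gz" | tr -d ' ')
rm -rf "$workdir"
```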
For the exam, it is recommended to memorize the following table:
| **Extension** | **Compressor** | **Tar Option** |
| --- | --- | --- |
| .tar.gz | Gzip | $ tar xvzf arquivo.tar.gz |
| .tar.bz2 | Bzip2 | $ tar xvjf arquivo.tar.bz2 |
| .tar.xz | Xz | $ tar xvJf arquivo.tar.xz |