Working with tar on Linux

A tarball is a common way to distribute free software: a directory tree containing source files, a Makefile, documentation, and other files, packed into a single tar archive and compressed with gzip.

This distribution method is popular because the tar and gzip utilities are available on practically every Linux system.

The result is a file with the suffix .tar.gz or .tgz.

You may also find tarballs with the suffix .tar.bz2 or .tbz2. These files were compressed with bzip2, whose algorithm usually produces smaller files than gzip, at the cost of slower compression.

There are two ways to unpack a tarball. The first takes two steps:

# gzip -d arquivo.tar.gz

The gzip command decompresses arquivo.tar.gz and removes the .gz suffix, leaving arquivo.tar.

# tar xvf arquivo.tar

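The tar utility then extracts the contents of the archive: x extracts, v lists each file as it is processed, and f names the archive to read.

The two steps can also be combined into one command, in which the z option tells tar to run gzip itself: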

# tar xvzf arquivo.tar.gz

Or

# gzip -dc arquivo.tar.gz | tar xvf -
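To try these commands without downloading anything, the following sketch builds a small throwaway tarball and extracts it with both the one-step and the piped form. The names used (demo, hello.txt, demo.tar.gz) are invented for illustration, and the -C option for choosing the output directory is assumed to be available, as it is in GNU tar.

```shell
#!/bin/sh
# Sketch: build a tiny tarball, then extract it with the two equivalent forms.
# All names here (demo, hello.txt, demo.tar.gz) are illustrative only.
set -e
work=$(mktemp -d)
cd "$work"

mkdir demo
echo "hello" > demo/hello.txt
tar czf demo.tar.gz demo                      # c = create, z = gzip, f = archive name

mkdir one_step two_step
tar xzf demo.tar.gz -C one_step               # tar runs gzip itself
gzip -dc demo.tar.gz | tar xf - -C two_step   # gzip writes to stdout, tar reads stdin ("-")

# Both forms should produce the same directory tree
diff -r one_step two_step && echo "identical"
```

The piped form is mostly useful with a tar that has no built-in gzip support; otherwise the z option is shorter.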

If the file was compressed with bzip2, decompress it with bunzip2 or with bzip2's -d option:

# bzip2 -d file.tar.bz2

Or

# bunzip2 file.tar.bz2

Then

# tar xvf file.tar
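A bzip2 tarball can also be handled in one step, in the same way that the z option works for gzip. The sketch below assumes GNU tar (or another tar that accepts the j option) and an installed bzip2. The file names are invented for illustration.

```shell
#!/bin/sh
# Sketch: one-step handling of a bzip2 tarball with tar's j option.
# Names (demo, hello.txt, demo.tar.bz2) are illustrative only.
set -e
command -v bzip2 >/dev/null 2>&1 || { echo "bzip2 not installed, skipping"; exit 0; }

work=$(mktemp -d)
cd "$work"
mkdir demo
echo "hello" > demo/hello.txt

tar cjf demo.tar.bz2 demo          # c = create, j = filter through bzip2
listing=$(tar tjf demo.tar.bz2)    # t = list contents without extracting
echo "$listing"

mkdir out
tar xjf demo.tar.bz2 -C out        # extract in one step into out/
cat out/demo/hello.txt
```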

The following comparison shows the size in bytes of the same tar archive uncompressed, compressed with bzip2, and compressed with gzip:

10,772,480  apache.tar
 2,097,339  apache.tar.bz2  (about 5x smaller)
 2,467,371  apache.tar.gz   (about 4x smaller)
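A similar comparison can be reproduced on any machine. The sketch below generates repetitive sample text as a stand-in for a real source tree, so the ratios it prints will differ from the apache figures. The names sample and data.txt are made up for the example, and the bzip2 step is skipped if bzip2 is not installed.

```shell
#!/bin/sh
# Sketch: compare the size of one tar archive uncompressed, gzipped and bzip2ed.
# sample/ and data.txt are illustrative stand-ins for a real source tree.
set -e
work=$(mktemp -d)
cd "$work"

mkdir sample
i=0
while [ "$i" -lt 200 ]; do
    printf 'line %s of repetitive sample text\n' "$i" >> sample/data.txt
    i=$((i + 1))
done

tar cf sample.tar sample
gzip -c sample.tar > sample.tar.gz      # -c writes to stdout and keeps sample.tar

tar_size=$(wc -c < sample.tar)
gz_size=$(wc -c < sample.tar.gz)
echo "sample.tar      $tar_size bytes"
echo "sample.tar.gz   $gz_size bytes"

if command -v bzip2 >/dev/null 2>&1; then
    bzip2 -c sample.tar > sample.tar.bz2
    echo "sample.tar.bz2  $(wc -c < sample.tar.bz2) bytes"
fi
```

On very small inputs like this one, bzip2 is not guaranteed to beat gzip; its advantage tends to show on larger archives.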

It is important to know that on Linux the suffix of a file does not determine its type. Tools inspect the beginning of the file to find out what format it is in and, from that, which program handles it.

Therefore, suffixes such as .tar.gz, .tgz, .tar.bz2, and .tbz2 exist mainly to help humans recognize the file type.

These first bytes are called magic numbers: each file format begins with its own distinctive byte sequence, which can be read to identify it.

The file utility can be used to read the first bytes of the files and identify the type of file.

Examples:

# file php-5.3.0.tar.gz

php-5.3.0.tar.gz: gzip compressed data, was "php-5.3.0.tar", from Unix, last modified: Mon Jun 29 12:46:48 2009, max compression

# file IntegrACAO.zip

IntegrACAO.zip: Zip archive data, at least v2.0 to extract

# file /bin/bash

/bin/bash: ELF 32-bit LSB shared object, Intel 80386, version 1 (SYSV), for GNU/Linux 2.6.4, dynamically linked (uses shared libs), stripped

# file mytext

mytext: ASCII text

# file player.html

player.html: HTML document text

# file script

script: Bourne-Again shell script text
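The magic number itself can be inspected by hand. As a minimal sketch, the commands below use od to print the first two bytes of a gzip file in hexadecimal, then repeat the check on a copy with a different suffix. The file names note.txt and renamed.dat are hypothetical.

```shell
#!/bin/sh
# Sketch: print the leading bytes ("magic number") of a gzip file in hex.
# note.txt and renamed.dat are illustrative names.
set -e
work=$(mktemp -d)
cd "$work"

echo "hello" > note.txt
gzip -c note.txt > note.txt.gz

# -An: no offset column, -tx1: one hex byte per column, -N2: read only 2 bytes
magic=$(od -An -tx1 -N2 note.txt.gz | tr -d ' \n')
echo "first two bytes: $magic"      # gzip streams start with the bytes 1f 8b

# Copy under another name: the suffix changes, the leading bytes do not
cp note.txt.gz renamed.dat
magic_renamed=$(od -An -tx1 -N2 renamed.dat | tr -d ' \n')
echo "after renaming:  $magic_renamed"
```

Because the leading bytes are unchanged, file would still report renamed.dat as gzip compressed data.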
