Text Filters on Linux

Linux has several tools for working with and transforming text files. These tools are called text filters and are useful for working with scripts, verifying log files, etc.

Cat command

Use:

$ cat [options] file

When the options appear in brackets, they are actually optional.

The cat command concatenates files, prints their contents to the screen, and can also store text typed at the keyboard in a file.

Let’s see how to create a file with just a few lines of text:

$ cat > teste.txt

Now you can type any text. When you are done, press Ctrl+D on an empty line to end the input and save the teste.txt file.

To view the content of the newly created file:

$ cat teste.txt

Cat can also be used to concatenate files.

$ cat texto1.txt > texto.txt

Observe that in this example the previous contents of the texto.txt file are replaced by the contents of texto1.txt.

To add the contents of texto1.txt to the end of the texto.txt file instead, the correct command is:

$ cat texto1.txt >> texto.txt
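The difference between the two redirections can be checked with a small sketch like the one below. It assumes a throwaway directory created with mktemp, and texto2.txt is a hypothetical second file introduced only for illustration.

```shell
# Work in a throwaway directory so no real files are touched.
dir=$(mktemp -d)
printf 'first\n'  > "$dir/texto1.txt"
printf 'second\n' > "$dir/texto2.txt"   # hypothetical second file

# ">" truncates texto.txt, so it holds only the contents of texto1.txt.
cat "$dir/texto1.txt" > "$dir/texto.txt"
# ">>" appends, so texto2.txt lands after the existing contents.
cat "$dir/texto2.txt" >> "$dir/texto.txt"

# Naming several files concatenates them in the order given.
cat "$dir/texto1.txt" "$dir/texto2.txt" > "$dir/both.txt"

cat "$dir/texto.txt"
```

After these commands, texto.txt and both.txt should hold the same two lines.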

Cut command

Use:

$ cut [options] file

The cut command does what its name says: it reads the content of one or more files and prints selected vertical columns (bytes, characters, or fields) of each line.

The most frequent options are:

  • -b number: Prints the byte at position number of each line (counting from left to right);
  • -c number: Prints the character at position number of each line (counting from left to right);
  • -d delimiter: Sets the delimiter that separates one column from another. The default is the Tab;
  • -f number: Prints column (field) number of each line.

Examples:

$ cut -d: -f 1 /etc/passwd

$ cut -b 1 /etc/passwd

$ cat /etc/hosts | cut -f 2
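To see what these options select without depending on the contents of a real system file, the sketch below uses two hypothetical lines in the colon-separated style of /etc/passwd. The list form -f1,7 and the range form -c1-4 are standard cut syntax that the examples above do not show.

```shell
# Hypothetical sample data in the style of /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin'

# Field 1 with ":" as the delimiter: the login names.
names=$(printf '%s\n' "$sample" | cut -d: -f1)
# Fields 1 and 7: login name and shell; the delimiter is kept between them.
shells=$(printf '%s\n' "$sample" | cut -d: -f1,7)
# Characters 1 to 4 of every line.
prefix=$(printf '%s\n' "$sample" | cut -c1-4)

printf '%s\n' "$names" "$shells" "$prefix"
```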

Expand command

Use:

$ expand [options] file

The expand command converts the tab characters in a text to the corresponding number of spaces. It is useful for making a text that uses tabs display consistently on different output devices, such as the screen, a printer, or files.

The commonly used options are:

  • -t number: Specifies the number of spaces the tab contains. The default is 8;
  • -i: Converts only at the beginning of lines.

Example:

$ expand LEIAME.TXT
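A minimal sketch of how the tab stops work, using a made-up line instead of LEIAME.TXT. Each tab is replaced by as many spaces as are needed to reach the next tab stop, so the number of spaces varies with the column where the tab occurs.

```shell
# One line with a leading tab and a tab between two words.
line=$(printf '\tname\tvalue')

# Default tab stops every 8 columns: 8 spaces, "name", 4 spaces, "value".
wide=$(printf '%s\n' "$line" | expand)
# Tab stops every 4 columns: 4 spaces, "name", 4 spaces, "value".
narrow=$(printf '%s\n' "$line" | expand -t 4)
# -i converts only the leading tab; the tab after "name" is left alone.
initial=$(printf '%s\n' "$line" | expand -i)

printf '[%s]\n' "$wide" "$narrow" "$initial"
```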

Unexpand command

Use:

$ unexpand [options] file

The unexpand command replaces runs of spaces with tab characters. By default, only the spaces at the beginning of the lines are converted. It is the inverse of the expand command.

The commonly used options are:

  • -t number: Specifies the number of spaces the tab contains. The default is 8;
  • -a: Converts all runs of spaces, not only those at the beginning of lines.

Example:

$ unexpand LEIAME.TXT
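The difference between the default behaviour and -a can be sketched with a made-up line whose words start exactly on the default 8-column tab stops. This assumes GNU unexpand.

```shell
# 8 leading spaces, "name", 4 spaces, "value": "value" starts at column 16.
line=$(printf '%8sname%4svalue' '' '')

# Default: only the leading run of spaces becomes a tab.
leading=$(printf '%s\n' "$line" | unexpand)
# -a: the spaces between the words, which end on a tab stop, are converted too.
all=$(printf '%s\n' "$line" | unexpand -a)

# od -c makes the tabs visible as \t.
printf '%s\n' "$leading" "$all" | od -c
```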

fmt command

Use:

$ fmt [options] [file]

The fmt (format) command reformats a text to a specific width. It joins short lines and breaks long ones so that each line fits the desired width. The default is 75 characters.

The frequently used option is:

  • -w number: Set the desired width for the text.

Example:

$ fmt -w 50 LEIAME.TXT
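As a rough sketch, the effect of -w can be observed on a single long line of made-up text. The exact places where fmt breaks the line are chosen by the tool, so the example only checks that no output line exceeds the requested width.

```shell
text='The fmt command joins short lines and breaks long ones so that each output line fits the requested width.'

# Reflow the text to at most 30 characters per line.
wrapped=$(printf '%s\n' "$text" | fmt -w 30)
printf '%s\n' "$wrapped"

# Report the longest output line; it should not exceed 30.
printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print "longest line:", max }'
```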

Head command

Use:

$ head [options] [file]

The head command shows the first lines of a text. By default it shows the first 10 lines.

The frequently used option is:

  • -n number: Configures the number of lines that the head will show.

Example:

$ head -n 50 LEIAME.TXT
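A quick way to try head without a real file is to feed it generated input. The sketch below assumes the seq command is available to produce the numbers 1 to 20.

```shell
# With -n 3, only the first three lines pass through.
first_three=$(seq 1 20 | head -n 3)
# Without options, head stops after 10 lines.
default_count=$(seq 1 20 | head | wc -l)

printf '%s\n' "$first_three"
echo "default line count: $default_count"
```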

Join command

Use:

$ join [options] file1 file2

The join command combines the lines of two files that share a common index field. Both files must be sorted on that field. The join command can be used like a simple database.

The commonly used options are:

  • -1 number: Use field number as the index for file1 (older syntax: -j1 number).
  • -2 number: Use field number as the index for file2 (older syntax: -j2 number).
  • -j number: Use field number as the index for both files.

Example:

Assume that file1 contains the following content:

1 GZH-1234

2 HYD-2389

3 GIS-2348

And file2 has the following content:

1 Fiat Uno Mille Smart

2 Audi A3

3 Monza

After the command:

$ join -j 1 file1 file2

The output will be as follows:

1 GZH-1234 Fiat Uno Mille Smart

2 HYD-2389 Audi A3

3 GIS-2348 Monza
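The example can be reproduced with the sketch below, which writes the two files into a temporary directory (an assumption made so that nothing in the current directory is overwritten) and joins them on the first field.

```shell
dir=$(mktemp -d)

# Both files are already sorted on field 1, which join requires.
printf '1 GZH-1234\n2 HYD-2389\n3 GIS-2348\n' > "$dir/file1"
printf '1 Fiat Uno Mille Smart\n2 Audi A3\n3 Monza\n' > "$dir/file2"

# The index is printed once, followed by the rest of each matching line.
joined=$(join -j 1 "$dir/file1" "$dir/file2")
printf '%s\n' "$joined"
```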

Notice that the join command requires that both files contain an index, as in the example.

TABLE 5 - Symbols to differentiate header and footer (used by the nl command)

Symbol    Description
\:\:\:    Symbol used to start the text header
\:\:      Symbol used to start the body of the text
\:        Symbol used to start the text footer

nl Command

Use:

$ nl [options] [file]

The nl (number lines) command is used to number the lines of a file. The command applies special rules to the header and footer of the file.

The commonly used options are:

  • -h suboption: Used to format the text header. The default is not to number the header;
  • -b suboption: Used to format the body of the text. The default is to number only non-empty lines;
  • -f suboption: Used to format the text footer. The default is to not number the footer.

The sub-options are:

  • a: Number all lines;
  • t: Number only non-empty lines;
  • n: Do not number lines.

Example:

Assume that arquivo.txt has the following content. Each section marker must be on a line by itself:

\:\:\:

Grades and Attendance Report for Software Engineering students

------------------------------------------------------------

\:\:

Carlos Torres 8.5 80% Approved

José Antônio 10.0 100% Approved

Maria de Lourdes 10.0 100% Approved

Mário Cabral 9.5 100% Approved

\:

------------------------------------------------------------

And we use the nl command:

$ nl arquivo.txt

The result will be:

Grades and Attendance Report for Software Engineering students

------------------------------------------------------------

1 Carlos Torres 8.5 80% Approved

2 José Antônio 10.0 100% Approved

3 Maria de Lourdes 10.0 100% Approved

4 Mário Cabral 9.5 100% Approved

------------------------------------------------------------
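The effect of the section markers and of the -h, -b and -f options can be sketched with a shorter, made-up file. The example counts how many lines receive a number instead of relying on the exact spacing of the nl output.

```shell
dir=$(mktemp -d)

# '%s\n' prints each argument literally, so the \: markers reach the file unchanged.
printf '%s\n' '\:\:\:' 'Report title' '\:\:' 'Carlos' '' 'Maria' '\:' 'End of report' > "$dir/arquivo.txt"

# Helper for this sketch: count the output lines that start with a line number.
count_numbered() { grep -c '^ *[0-9]'; }

# Default: only the non-empty body lines (Carlos, Maria) are numbered.
default_n=$(nl "$dir/arquivo.txt" | count_numbered)
# -b a: the empty body line is numbered as well.
body_all_n=$(nl -b a "$dir/arquivo.txt" | count_numbered)
# -h a -f a: header and footer lines are numbered too.
everything_n=$(nl -h a -f a "$dir/arquivo.txt" | count_numbered)

echo "$default_n $body_all_n $everything_n"
```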

Od command

Use:

$ od [options] [file]

The od command (Octal and Other Formats) is used to view the content of a file in octal, hexadecimal, ASCII, and named-character formats.

The frequently used option is:

  • -t type: Specifies the type of output that the od command should generate.

Possible types are:

  • a: Character name;
  • c: ASCII character or backslash escape;
  • o: Octal;
  • x: Hexadecimal.

Example:

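$ cat > arquivo.txt

Certificação Linux

Ctrl+D

$ od -t x arquivo.txt

0000000 74726543 63696669 6fe3e761 6e694c20

0000020 00007875

0000022

The dump shows the 18 bytes (octal 22) of the text “Certificação Linux” in ISO-8859-1 encoding, grouped into 4-byte words in little-endian order, so the first word 74726543 holds the bytes 43 65 72 74 (“Cert”).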
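Because the grouping into words depends on the machine, a byte-by-byte dump is often easier to read. The sketch below uses two size and address flags that are not covered above but are standard od options: the 1 suffix in x1 and o1 prints one byte at a time, and -An suppresses the address column.

```shell
# Hexadecimal, one byte per column.
hex=$(printf 'Linux\n' | od -An -t x1)
# Octal, one byte per column.
oct=$(printf 'Linux\n' | od -An -t o1)
# Character names: the final newline is shown as "nl".
names=$(printf 'Linux\n' | od -An -t a)

printf '%s\n' "$hex" "$oct" "$names"
```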

Paste command

Use:

$ paste [options] file1 file2

The paste command merges the corresponding lines of several files side by side, as vertical columns.

The commonly used options are:

  • -d's': Separate columns with the s symbol inside single quotes;
  • -s: Joins all the lines of each file into a single line, producing one output line per file.

Example:

Assume that file1 has the following content:

Lemmoraes

Rodrigues

Aduarte

And file2 has the following content:

Provedor.com.br

provedor2.com.br

provider3.com.br

When using the paste command, the result will be as follows:

$ paste -d'@' file1 file2

Lemmoraes@Provedor.com.br

Rodrigues@provedor2.com.br

Aduarte@provider3.com.br
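The same result, plus the -s option, can be tried with the sketch below. It recreates the two files in a temporary directory, which is an assumption made to keep the example self-contained.

```shell
dir=$(mktemp -d)
printf 'Lemmoraes\nRodrigues\nAduarte\n' > "$dir/file1"
printf 'Provedor.com.br\nprovedor2.com.br\nprovider3.com.br\n' > "$dir/file2"

# Line N of file1 is glued to line N of file2 with "@" in between.
emails=$(paste -d'@' "$dir/file1" "$dir/file2")
# -s puts all lines of one file on a single line, here separated by commas.
serial=$(paste -s -d',' "$dir/file1")

printf '%s\n' "$emails" "$serial"
```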

Pr command

Use:

$ pr [options] file

The pr (print) command formats a text file for paginated output with a defined header, margins, and width. It is useful for formatting raw text for printing. The header consists of the date, time, file name, and page number.

The commonly used options are:

  • -d: Specifies double spacing;
  • -l number: Specifies the page length in lines. The default is 66 lines;
  • -o number: Specifies the number of spaces in the left margin.

Example:

$ pr -l 75 -o 5 file
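Since the header contains the current date and time, the full pr output changes from run to run. The sketch below, which assumes GNU pr, therefore looks only at the left margin and at the total page length. The -t flag used in the first command is an extra option, not listed above, that omits the header and trailer.

```shell
# -o 5 indents every line by five spaces; -t leaves out header and trailer.
body=$(seq 1 3 | pr -t -o 5)
# With header and trailer, a short text is padded to one full page of 20 lines.
page_lines=$(seq 1 3 | pr -l 20 -o 5 | wc -l)

printf '%s\n' "$body"
echo "lines per page: $page_lines"
```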

Split command

Use:

$ split [options] input_file [output_prefix]

The split command is used to split a large file into n smaller files. The output files are generated according to the number of lines in the input file.

The default is to split the file every 1000 lines. The names of the output files are formed from the output prefix followed by aa, ab, ac, and so on. For the prefix arquivosaida, the files are arquivosaidaaa, arquivosaidaab, arquivosaidaac, and so on.

The most common option is:

  • -l n: Where n is the number of lines in each output file. GNU split also accepts the older short form -n (for example, -20).

Example:

$ split -l 20 arquivo1.txt arquivosaida
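A small sketch of the naming scheme, using a generated 50-line file and the hypothetical prefix part_ inside a temporary directory:

```shell
dir=$(mktemp -d)
seq 1 50 > "$dir/arquivo1.txt"

# 50 lines in pieces of 20: part_aa (20), part_ab (20), part_ac (10).
split -l 20 "$dir/arquivo1.txt" "$dir/part_"

pieces=$(ls "$dir" | grep -c '^part_')
last_size=$(wc -l < "$dir/part_ac")
echo "$pieces pieces, last one has $last_size lines"

# Concatenating the pieces in name order rebuilds the original file.
cat "$dir"/part_* > "$dir/rebuilt.txt"
```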

Tail command

Use:

$ tail [options] file

The tail command displays the last 10 lines from a file. It works like the opposite of the head command.

The most common options are:

  • -n number: Specifies the number of final lines that the tail will show from a file;
  • -f: Keeps showing the last lines of a file continuously while another process appends more lines. Very useful for following log files.

Example:

$ tail -n 50 /var/log/messages

$ tail -f /var/log/messages
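The -n option can be tried on generated input, as in this sketch (it assumes the seq command). The -f option is left out because it runs until interrupted with Ctrl+C.

```shell
# The last three of twenty lines.
last_three=$(seq 1 20 | tail -n 3)
# Without options, tail keeps 10 lines.
default_count=$(seq 1 20 | tail | wc -l)
# head and tail combined pick lines from the middle: here lines 11 and 12.
middle=$(seq 1 20 | head -n 12 | tail -n 2)

printf '%s\n' "$last_three" "$middle"
```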

Tr command

Use:

$ tr [options] set1 [set2]

The tr command replaces the characters of the first set with the corresponding characters of the second set. It does not take file names as arguments: it reads only from standard input, so it must be used with a pipe from another command or with input redirection.

The most common options are:

  • -d: Delete occurrences of the characters in the first set;
  • -s: Squeeze repeated occurrences of the characters in the first set into a single occurrence.

Example:

$ cat file1 | tr a-z A-Z

In this example, the tr command converts all the lowercase letters from a to z to capital letters.

$ cat file1 | tr -d a

In this example, the tr command erases the letter a.

$ cat file1 | tr -s 1

In this example, the tr command squeezes each run of repeated 1 digits into a single 1.

The tr command swaps each character of the first set for the character at the same position in the second set, so both sets should have the same number of characters. With GNU tr, a shorter second set is extended by repeating its last character.
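The three uses above can be checked on short made-up strings, as in this sketch:

```shell
# Translate: every lowercase letter becomes its uppercase counterpart.
upper=$(printf 'linux text filters\n' | tr 'a-z' 'A-Z')
# Delete: every "a" is removed.
no_a=$(printf 'banana\n' | tr -d 'a')
# Squeeze: runs of "1" collapse to a single "1"; the "22" is untouched.
squeezed=$(printf '1112211\n' | tr -s '1')

printf '%s\n' "$upper" "$no_a" "$squeezed"
```

For a file, input redirection such as tr 'a-z' 'A-Z' < file1 gives the same result as the cat pipeline.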

Wc command

Use:

$ wc [options] [files]

The wc command counts the lines, words, and characters of one or more files. If more than one file is passed as an argument, it will present the statistics for each file and also the total.

The most common options are:

  • -c: Count the number of bytes in one or more files (for plain ASCII text this equals the number of characters);
  • -l: Count the number of lines in one or more files;
  • -L: Count the number of characters in the longest line of the file;
  • -w: Count the words in one or more files.

Example:

$ wc LEIAME.TXT
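Each option can be tried on two lines of made-up text, as in the sketch below. The -L option is a GNU extension, so this assumes GNU wc.

```shell
text='one two three
four five'

lines=$(printf '%s\n' "$text" | wc -l)     # 2 lines
words=$(printf '%s\n' "$text" | wc -w)     # 5 words
bytes=$(printf '%s\n' "$text" | wc -c)     # 24 bytes, newlines included
longest=$(printf '%s\n' "$text" | wc -L)   # 13: length of "one two three"

echo "$lines $words $bytes $longest"
```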

Sort command

Use:

$ sort [-b] [-d] [-f] [-i] [-m] [-M] [-n] [-r] [-u] [-o output_file] file

The sort command sorts the lines of a text file.

The options are:

  • -b: Ignore spaces at the beginning of the line;
  • -d: Sort in dictionary order, considering only letters, digits, and blanks (punctuation is ignored);
  • -f: Ignores the difference between uppercase and lowercase letters;
  • -i: Ignore non-printable (control) characters;
  • -m: Merge two or more already sorted files into an ordered output file;
  • -M: Treat the first three letters of the lines as a month name (e.g. JAN) and sort in calendar order;
  • -n: Sort by the numeric value of the numbers at the beginning of the lines;
  • -r: Sort in reverse order;
  • -u: If the line is duplicated, show only the first line;
  • -o: Send the command output to the given file.

E.g.:

$ sort file
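The difference between the default ordering and -n is a common source of surprises. The sketch below uses made-up numeric data and sets LC_ALL=C (an assumption, made so that the character order does not depend on the system locale).

```shell
data='10
9
100
9'

# Default: lines are compared as text, so "100" sorts before "9".
lexical=$(printf '%s\n' "$data" | LC_ALL=C sort)
# -n compares numeric values.
numeric=$(printf '%s\n' "$data" | LC_ALL=C sort -n)
# -n -r reverses the numeric order; -n -u drops the duplicate 9.
reversed=$(printf '%s\n' "$data" | LC_ALL=C sort -n -r)
unique=$(printf '%s\n' "$data" | LC_ALL=C sort -n -u)

printf '%s\n' "$lexical" "---" "$numeric"
```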

Uniq command

Use:

$ uniq [Options]… [Input_File [Output_File]]

The uniq command removes adjacent duplicate lines, so the input file should be sorted first.

The most common options are:

  • -c: Prefixes each line with its number of occurrences;
  • -d: Prints only duplicate lines;
  • -i: Ignores the difference between uppercase and lowercase letters;
  • -u: Prints only unique lines, which have no duplicates.
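A brief sketch of these options on made-up, already sorted input. For unsorted data, the usual pattern is sort file | uniq.

```shell
data='apple
apple
banana
cherry
cherry
cherry'

unique=$(printf '%s\n' "$data" | uniq)       # one copy of each line
dups=$(printf '%s\n' "$data" | uniq -d)      # only lines that repeat
singles=$(printf '%s\n' "$data" | uniq -u)   # only lines that do not repeat
counts=$(printf '%s\n' "$data" | uniq -c)    # each line prefixed with its count

printf '%s\n' "$counts"
```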

sed command

Use:

$ sed [Options] {script} file

The sed command is a powerful stream editor for filtering or editing text.

The most common options are:

  • -e script: Adds a script to the commands to be executed;
  • -f file: Adds the content of a file as a script to be executed;
  • -r: Uses extended regular expressions in the script.

Examples:

To replace one piece of text with another, we use the “s” command with “/” as the delimiter: the first pattern is the text to be searched for, and the second is the text that will replace it. Note that sed changes only the first occurrence on each line and is case-sensitive. By default, sed writes its output to the terminal.

$ cat file

Today will be hot during the night. The Night is beautiful.

$ sed s/night/day/ file

Today will be hot during the day. The Night is beautiful.
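The sketch below extends the example with a third sentence (an addition made only for illustration) to show the difference between replacing the first match and replacing every match. The g flag at the end of the s command, not covered above, is the standard way to replace all occurrences on a line.

```shell
line='Today will be hot during the night. The Night is beautiful. Good night.'

# Without flags, only the first lowercase "night" on the line changes.
first=$(printf '%s\n' "$line" | sed 's/night/day/')
# With g, every lowercase "night" changes; "Night" still does not match.
all=$(printf '%s\n' "$line" | sed 's/night/day/g')
# Two -e scripts run in sequence, so the capitalized form is handled too.
both=$(printf '%s\n' "$line" | sed -e 's/night/day/g' -e 's/Night/Day/')

printf '%s\n' "$first" "$all" "$both"
```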