2.4.7. Inspecting files
As discussed in Section 2.4.6.3, typing ls will list files in your current working directory.
You might be starting to wonder what’s in them.
Here we explore several tools to inspect file content.
2.4.7.1. Display file content in the terminal
The most basic command to display the content of a text file in your terminal is cat (short for concatenate), e.g.:
$ cat myfile.txt
would display the content of the file myfile.txt in your current working directory, if it exists.
Exercise 2.35
Test cat on a file that comes pre-installed on your Ubuntu system, namely the GPL license file
located in the /usr/share/common-licenses/ system directory:
$ cat /usr/share/common-licenses/GPL
As you can see, cat just spits out the contents of a file to the terminal.
If the file is longer than can fit on your screen, you will not be able to see the first part.
Your terminal usually does have a memory of the last printed lines (which is configurable in the terminal settings),
so you can try to scroll up with its scroll bar.
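cat also accepts several file names at once and prints their contents one after another. The following is a minimal sketch of that behaviour; the scratch directory and the files part1.txt and part2.txt are hypothetical and are created only for this illustration.

```shell
# Create a scratch directory with two small files (hypothetical names).
workdir=$(mktemp -d)
printf 'first file\n' > "$workdir/part1.txt"
printf 'second file\n' > "$workdir/part2.txt"

# Print both files to the terminal, one after the other.
cat "$workdir/part1.txt" "$workdir/part2.txt"

# Capture the same output in a variable so it can be inspected later.
combined=$(cat "$workdir/part1.txt" "$workdir/part2.txt")

# Remove the scratch directory again.
rm -r "$workdir"
```

The content of part1.txt appears first, followed directly by the content of part2.txt, with nothing inserted between them.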
2.4.7.2. Interactively look through file content in the terminal
A more interactive solution to inspect large text files is to use a program called less:
$ less /usr/share/common-licenses/GPL
If this seems familiar from when you looked at man pages, it is. In fact, man uses less to display its output!
The same navigation options described in Section 2.4.5 therefore apply here as well, e.g. press q to quit.
Other useful keys are g to jump to the top and G to jump to the bottom. Pressing h will give you an overview of all the commands of less.
Exercise 2.36
Try this:
$ less /usr/share/common-licenses/GPL
and search for the kind of WARRANTY you get on Ubuntu.
In the interactive mode, press / and type a word or phrase, then press Enter to start the search.
You can jump through the search results with n (next) and N (previous).
2.4.7.3. Display parts of a file in the terminal
Sometimes you only want to check the beginning or the end of a file in the terminal.
For this, you can use the commands head and tail,
which print just the first or last few lines of a file (ten lines by default). Try:
$ head /usr/share/common-licenses/GPL
and
$ tail /usr/share/common-licenses/GPL
Both commands also take a switch with a number, -[N], to show the \(N\) first or last lines.
For example,
$ head -2 /usr/share/common-licenses/GPL
shows only the first two lines.
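The number of lines can also be given with the -n option (for example head -n 2), which is the spelling specified by POSIX. Below is a small sketch on a hypothetical five-line sample file, created only for this illustration so that the result is easy to check by eye.

```shell
# Create a hypothetical sample file with five numbered lines.
sample=$(mktemp)
printf 'line 1\nline 2\nline 3\nline 4\nline 5\n' > "$sample"

# First three lines of the file.
first_lines=$(head -n 3 "$sample")
echo "$first_lines"

# Last two lines of the file.
last_lines=$(tail -n 2 "$sample")
echo "$last_lines"

# Remove the sample file again.
rm "$sample"
```

The first command prints line 1 to line 3, the second prints line 4 and line 5.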
Exercise 2.37
Show just the very last line of the file /usr/share/common-licenses/GPL in the terminal.
2.4.7.4. Count words and lines
Sometimes you just want to count the number of words or lines in a file.
This can be done with wc, short for “word count”:
$ wc /usr/share/common-licenses/GPL
You should see three numbers: the number of lines, the number of words, and the total number of bytes in the file.
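To see how the three numbers come about, it helps to run wc on a file small enough to count by hand. The sketch below uses a hypothetical two-line sample file; reading it through standard input (wc < file) is a choice made here so that the file name does not appear in the output.

```shell
# Create a hypothetical sample file: 2 lines, 3 words, 14 bytes.
sample=$(mktemp)
printf 'one two\nthree\n' > "$sample"

# Split the three counts into the positional parameters $1, $2 and $3.
# The command substitution is left unquoted on purpose so that the
# shell splits the output of wc on whitespace.
set -- $(wc < "$sample")
lines=$1
words=$2
bytes=$3

echo "lines=$lines words=$words bytes=$bytes"
# prints: lines=2 words=3 bytes=14

# Remove the sample file again.
rm "$sample"
```

The byte count includes the two newline characters: "one two" plus newline is 8 bytes and "three" plus newline is 6 bytes.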
Exercise 2.38
Look in the manual for wc for a switch to only display the number of lines, and not the other counts.
Use it to print the number of lines of the GPL file.
2.4.7.5. Cutting a file ‘vertically’ with cut
While head and tail allow you to split a file “horizontally” by its lines,
you can use cut to split a file “vertically” by its columns.
This tool can operate in a few ways. For example, it can be used to cut a specific range of columns:
$ cut -c 1-10 /usr/share/common-licenses/GPL
Here -c 1-10 indicates that only the first to the tenth character of every line should be displayed.
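On a long text such as the GPL it can be hard to verify which characters were kept. The following sketch applies a different range to a hypothetical two-line sample file in which every character position is easy to recognise.

```shell
# Create a hypothetical sample file with recognisable character positions.
sample=$(mktemp)
printf 'abcdefgh\n12345678\n' > "$sample"

# Keep only the third to the fifth character of every line.
middle=$(cut -c 3-5 "$sample")
echo "$middle"

# Remove the sample file again.
rm "$sample"
```

This prints cde for the first line and 345 for the second.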
Exercise 2.39
What does the following do?
$ cut -c 2,4,6 /usr/share/common-licenses/GPL
Confirm by comparing the output to what you can see with less /usr/share/common-licenses/GPL.
Exercise 2.40
Check the help of cut for information on the syntax to indicate character ranges.
Print every line of /usr/share/common-licenses/GPL except for its first 5 characters.
Another common usage of cut is to split content by a particular delimiter into multiple fields,
and then only display specified fields.
This can be done to split a sentence into words, using a space as the delimiter,
or to split a file with tabular data in which the columns are separated by ; (using ; as the delimiter).
To do this, we must specify the delimiter with the -d option and the requested field with the -f option.
Here is how we can cut the first word of every line:
$ cut -d ' ' -f 1 /usr/share/common-licenses/GPL
Note that if we specify a space as the delimiter, we have to put the space in single quotes ' ' so that
our shell understands that this space is input for the cut tool rather than a separator between arguments.
You may also note that cut treats every single space as a separator, so a line starting with multiple spaces will be interpreted as starting with one or more “empty” words.
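Both points can be tried on small inputs. The sketch below uses a hypothetical semicolon-separated table (its column names and values are made up for illustration) and a single line that starts with two spaces.

```shell
# Create a hypothetical table with ; as the column separator.
table=$(mktemp)
printf 'name;city;year\nAda;London;1815\nLinus;Helsinki;1969\n' > "$table"

# Show only the second column. The semicolon is quoted because the
# shell would otherwise read it as the end of the command.
cities=$(cut -d ';' -f 2 "$table")
echo "$cities"

# A line with two leading spaces: fields 1 and 2 are empty,
# so the first visible word is field 3.
third_field=$(printf '  indented line\n' | cut -d ' ' -f 3)
echo "$third_field"

# Remove the table file again.
rm "$table"
```

The first command prints city, London and Helsinki on separate lines; the second prints indented.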
Exercise 2.41
Cut the GPL file by space, and display the first three fields only.
2.4.7.6. Sorting lines
The sort program will sort lines in a file alphanumerically,
comparing lines by the very first character first,
and only if those are the same moving on to their second character, etc.
The line with the character that appears earlier in the alphanumeric order should come first.
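The exact order depends on the locale settings of your system.
In plain byte order (the C locale, which you can select by putting LC_ALL=C in front of the command),
we first get spaces ( ), then numbers (0-9), then uppercase characters (A-Z), and finally lowercase characters (a-z).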
To show the lines in the GPL file sorted, simply use:
$ sort /usr/share/common-licenses/GPL
Note that this does not alter the file; it just displays the sorted lines in the terminal.
This can be useful if you have an unsorted log file where every line starts with a time stamp and then a message, for example.
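The log file case can be sketched as follows. The log entries are made up for illustration, and LC_ALL=C is set here only as an assumption to make the result independent of the locale of your system.

```shell
# Create a hypothetical log file whose entries are out of order.
log=$(mktemp)
printf '%s\n' \
  '2024-03-02 09:15 disk check finished' \
  '2024-03-01 23:50 backup started' \
  '2024-03-02 00:10 backup finished' > "$log"

# Remember the file content before sorting.
before=$(cat "$log")

# Display the lines sorted; timestamps written as year-month-day
# hour:minute sort chronologically when compared character by character.
sorted=$(LC_ALL=C sort "$log")
echo "$sorted"

# The file itself still has its original, unsorted content.
after=$(cat "$log")

# Remove the log file again.
rm "$log"
```

The output starts with the 2024-03-01 23:50 entry and ends with the 2024-03-02 09:15 entry, while the content of the file is the same before and after running sort.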
Exercise 2.42
Display the GPL file sorted, but in reverse order, so lines starting with many spaces appear last in the output.
Use the help for sort to find the correct switch.
Note that you may have to scroll up in your terminal a lot to validate the result, since the file contains many empty lines.