The Cut Command: A Comprehensive Guide with Real-World Examples
Introduction
Working with files and text is part of the daily routine for many developers, and Linux provides various command-line utilities for processing them. One of the most useful is cut. The cut command slices a particular section out of each line of a file or of standard input. It is typically used in combination with awk and sed for further data massaging.
Usage
The cut command takes options that determine which section of each line is selected, followed by the file or files to be processed. If no file is given, it reads from standard input.
cut OPTION... [FILE]...
Some of the most commonly used options for the cut command:
- -f: specifies the field or column to be selected
- -d: specifies the delimiter to use when separating fields
- -b: specifies the bytes to be selected
- -c: specifies the characters to be selected
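To make the options above concrete, here is a minimal sketch that applies each of them to a single line. The sample line and the colon delimiter are made up for illustration and do not come from any file used in this guide.

```shell
# Hypothetical sample line; the colon delimiter is chosen only for illustration.
line='alpha:beta:gamma'

# -d sets the delimiter and -f picks the field: here, the second field.
second_field=$(printf '%s\n' "$line" | cut -d ':' -f 2)

# -c selects by character position: here, characters 1 through 5.
first_chars=$(printf '%s\n' "$line" | cut -c 1-5)

# -b selects by byte position: here, bytes 1 through 3.
first_bytes=$(printf '%s\n' "$line" | cut -b 1-3)

printf '%s\n' "$second_field" "$first_chars" "$first_bytes"
# prints:
# beta
# alpha
# alp
```

For plain ASCII text such as this, -b and -c select the same thing, because every character is exactly one byte.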
Examples
$ cat example.csv
apple,orange,banana
grapefruit,kiwi,strawberry
# select the first column from the file
$ cut -f 1 -d ',' example.csv
apple
grapefruit
In this example, the cut command extracts the first field (column) of each line in example.csv and drops the second and third, using a comma as the delimiter (-d ','). The -f option specifies the field or column to be selected in the output.
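One behaviour worth knowing when selecting fields: a line that does not contain the delimiter at all is printed unchanged, unless the -s option is given. The sketch below shows the difference on a made-up two-line input (the second line is an assumption added for illustration, not part of example.csv).

```shell
# Hypothetical input: the second line contains no comma at all.
sample=$(printf 'apple,orange,banana\nno commas here\n')

# Default behaviour: a line without the delimiter is passed through as-is.
with_passthrough=$(printf '%s\n' "$sample" | cut -d ',' -f 1)

# With -s, lines that lack the delimiter are suppressed.
only_delimited=$(printf '%s\n' "$sample" | cut -s -d ',' -f 1)

printf 'without -s:\n%s\n' "$with_passthrough"
printf 'with -s:\n%s\n' "$only_delimited"
```

The first call prints both "apple" and the untouched second line, while the second call prints only "apple".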
Here are a few more examples to understand the cut command:
# select the first and second column and change output delimiter to space
$ cut -f1,2 --output-delimiter=" " -d "," example.csv
apple orange
grapefruit kiwi
# select the 1st through 5th characters of each line
$ cut -c1-5 example.txt
apple
grape
# redirect the output to a new file using the '>' operator
$ cut -f 1 -d ',' example.csv > new_file.csv
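The field list passed to -f can also contain ranges, which the examples above do not show. The following is a small sketch on a made-up four-column row; the variable names are only illustrative.

```shell
# Hypothetical row with four comma-separated columns.
row='apple,orange,banana,kiwi'

# "2-" means the second field through the end of the line.
from_second=$(printf '%s\n' "$row" | cut -d ',' -f 2-)

# "-2" means the first field through the second.
up_to_second=$(printf '%s\n' "$row" | cut -d ',' -f -2)

# Single fields and ranges can be mixed in one list.
mixed=$(printf '%s\n' "$row" | cut -d ',' -f 1,3-4)

# cut never reorders: fields come out in input order, whatever the list order.
not_reordered=$(printf '%s\n' "$row" | cut -d ',' -f 3,1)

printf '%s\n' "$from_second" "$up_to_second" "$mixed" "$not_reordered"
# prints:
# orange,banana,kiwi
# apple,orange
# apple,banana,kiwi
# apple,banana
```

If the columns need to be reordered, a tool such as awk is a better fit than cut.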
Here are a few real-world examples of the cut command that I have come across and that might be useful to you as well.
- Extracting the domain name from a list of URLs:
$ cat url_list.txt
https://www.google.com
https://www.facebook.com
https://www.amazon.com
$ cut -d/ -f3 url_list.txt
www.google.com
www.facebook.com
www.amazon.com
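This command extracts the domain name from each line in the url_list.txt file by using the '/' character as the delimiter and selecting the 3rd field. The first field is "https:" and the second is the empty string between the two slashes, so the host ends up in the third.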
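The same idea still works when the URL carries a path or query string, and a second cut can strip a leading "www." label. This is only a sketch: the URL is made up, and the second step assumes the host really does start with "www.".

```shell
# Hypothetical URL with a path and a query string.
url='https://www.example.com/products/item?id=42'

# Field 3 of a '/'-delimited URL is the host, whatever follows it.
host=$(printf '%s\n' "$url" | cut -d '/' -f 3)

# Assumption: the host begins with "www."; keep everything after the first dot.
bare_domain=$(printf '%s\n' "$host" | cut -d '.' -f 2-)

printf '%s\n' "$host" "$bare_domain"
# prints:
# www.example.com
# example.com
```

For hosts without a "www." prefix the second step would remove a real part of the name, so it should be applied only when the input is known to be uniform.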
- Extracting specific fields from a system log file:
$ cat syslog.txt
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuset
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpu
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuacct
$ cut -f 1,4,6 -d " " syslog.txt
Jan localhost [0.000000]
Jan localhost [0.000000]
Jan localhost [0.000000]
This command extracts the first, fourth and sixth fields from each line in the syslog.txt file, using a space as the delimiter.
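A caveat with space-delimited logs: cut treats every single space as a separator, so a day of the month padded with two spaces (as syslog often writes it) shifts the field numbers. A common workaround, sketched here on a made-up line, is to squeeze repeated spaces with tr -s before cutting.

```shell
# Hypothetical log line: note the two spaces between "Jan" and "1".
line='Jan  1 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuset'

# Without squeezing, the empty string between the two spaces counts as field 2,
# so every later field number is off by one.
shifted=$(printf '%s\n' "$line" | cut -d ' ' -f 1,4,6)

# tr -s ' ' collapses runs of spaces into one, restoring the expected numbering.
squeezed=$(printf '%s\n' "$line" | tr -s ' ' | cut -d ' ' -f 1,4,6)

printf '%s\n' "$shifted" "$squeezed"
# prints:
# Jan 12:34:56 kernel:
# Jan localhost [0.000000]
```

When the spacing in the input is irregular, awk is often the simpler choice, since it splits on runs of whitespace by default.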
- Extract the name of the CSV files present in the folder:
# show the files and folders in a hierarchical view
$ tree -L 1
.
├── [ 0] file1.py
├── [ 0] file2.csv
├── [ 0] file3.csv
└── [ 0] file3.txt
$ find . -type f -name "*.csv" | cut -d "/" -f2 | cut -d"." -f1
file2
file3
In this example, the first piece of the command finds all the CSV files in the folder and passes its output to the standard input of the cut command. The first cut command removes the leading "./" by splitting on the delimiter (-d "/") and outputs the file name with its extension (.csv). Finally, the last piece of the command trims the extension and leaves only the file name.
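To try this pipeline without touching real data, the folder can be recreated in a temporary directory. The sketch below assumes mktemp is available and adds a sort at the end, because find does not guarantee any particular output order.

```shell
# Recreate the example folder in a throwaway directory (assumes mktemp exists).
workdir=$(mktemp -d)
touch "$workdir/file1.py" "$workdir/file2.csv" "$workdir/file3.csv" "$workdir/file3.txt"

# Same pipeline as above, run from inside the folder so paths look like ./name.csv.
# sort is added only to make the order of the result predictable.
csv_names=$(cd "$workdir" && find . -type f -name '*.csv' \
  | cut -d '/' -f 2 \
  | cut -d '.' -f 1 \
  | sort)

printf '%s\n' "$csv_names"
# prints:
# file2
# file3

# Clean up the temporary directory.
rm -r "$workdir"
```

Note that selecting field 2 only works for files directly inside the folder, and cutting on "." keeps only the part of the name before the first dot, so nested folders or names such as report.v2.csv would need a different approach.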
These are just a few examples of how the cut command can be used in creative ways to manipulate and analyze data. With a little experimentation and a good understanding of the command's options and capabilities, you can use it to solve a wide range of text-processing problems.
Conclusion
In this short guide, we explored the cut command, a versatile tool for extracting and analyzing data, and one that belongs in every developer's toolkit. With this, we're now ready to use the cut command in our day-to-day work to solve the problems we encounter.