Working with files and text is part of the daily routine for many developers, and Linux provides various command-line utilities for processing both. One of the most useful is cut.
The cut command in Linux slices a particular section out of each line of a file or of standard input. It is typically used in combination with sed for further data massaging.
The cut command is followed by options that determine the specific section to cut, and then by the file to process:
cut OPTION... [FILE]...
Some of the most commonly used options for the cut command are:
-f: specifies the field or column to be selected
-d: specifies the delimiter to use when separating fields
-b: specifies the bytes to be selected
-c: specifies the characters to be selected
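A minimal sketch of the -c and -b options listed above, run on an inline sample string rather than a real file. The sample text and the variable names are made up for illustration. For plain ASCII input, where every character is one byte, both options address the same positions.

```shell
# Sample input is an assumption; any ASCII line behaves the same way.
line='abcdefgh'

# -c: select characters 2 through 4
chars=$(printf '%s\n' "$line" | cut -c 2-4)
echo "$chars"    # bcd

# -b: select bytes 1 and 3
bytes=$(printf '%s\n' "$line" | cut -b 1,3)
echo "$bytes"    # ac
```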
$ cat example.csv
apple,orange,banana
grapefruit,kiwi,strawberry
# select the first column from file
$ cut -f 1 -d ',' example.csv
apple
grapefruit
In this example, the cut command selects the first field (column) of each line in example.csv and drops the second and third, using a comma as the delimiter (-d ','). The -f option specifies the field or column to be selected in the output.
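One detail worth knowing when using -f with -d: a line that does not contain the delimiter at all is printed unchanged, unless the -s option is given. The sketch below shows both behaviors on a hypothetical two-line input; the input text and variable names are assumptions for illustration only.

```shell
# Hypothetical input: one comma-delimited line and one line with no comma.
input=$(printf 'apple,orange\nno commas here\n')

# Default: the line without a delimiter is printed whole.
default_out=$(printf '%s\n' "$input" | cut -d ',' -f 1)
printf '%s\n' "$default_out"
# apple
# no commas here

# With -s, lines that lack the delimiter are skipped.
strict_out=$(printf '%s\n' "$input" | cut -s -d ',' -f 1)
printf '%s\n' "$strict_out"
# apple
```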
Here are a few more examples to understand the cut command:
# select the first and second column and change output delimiter to space
$ cut -f1,2 --output-delimiter=" " -d "," example.csv
apple orange
grapefruit kiwi

# select characters between 1st and 5th
$ cut -c1-5 example.txt
apple
grape

# redirect the output to a new file using the '>' operator
$ cut -f 1 -d ',' example.csv > new_file.csv
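Field lists also accept ranges, which the examples above only hint at with -c1-5. The following sketch, run on a made-up five-field row, shows closed and open-ended ranges with -f. It also shows that cut always writes fields in input order, so asking for 3,1 does not swap them. The row and the variable names are illustrative assumptions.

```shell
# Made-up sample row with five comma-separated fields.
row='a,b,c,d,e'

# N-M: fields 2 through 4
mid_part=$(printf '%s\n' "$row" | cut -d ',' -f 2-4)
echo "$mid_part"     # b,c,d

# N-: field 3 through the end of the line
tail_part=$(printf '%s\n' "$row" | cut -d ',' -f 3-)
echo "$tail_part"    # c,d,e

# -M: from the first field through field 2
head_part=$(printf '%s\n' "$row" | cut -d ',' -f -2)
echo "$head_part"    # a,b

# The order of the list does not reorder the output.
reordered=$(printf '%s\n' "$row" | cut -d ',' -f 3,1)
echo "$reordered"    # a,c
```

If fields need to be reordered, a tool such as awk is a better fit than cut.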
Here are a few real-world examples of the cut command that I have come across and that might be useful to you as well.
- Extracting the domain name from a list of URLs:
$ cat url_list.txt
https://www.google.com
https://www.facebook.com
https://www.amazon.com
$ cut -d/ -f3 url_list.txt
www.google.com
www.facebook.com
www.amazon.com
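This command extracts the domain name (the host part of the URL) from each line in the url_list.txt file by using the '/' character as the delimiter and selecting the 3rd field. The first field is "https:" and the second is the empty string between the two slashes, which is why the host lands in field 3.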
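The same approach keeps working when the URLs carry a path, and a second cut can drop a port number if one is present. This is only a sketch: the URLs below are placeholders invented for the example, and it assumes every line starts with a scheme followed by "//".

```shell
# Placeholder URLs (assumed for illustration), one with a path and one with a port.
urls=$(printf '%s\n' \
  'https://www.example.com/path/to/page' \
  'http://localhost:8080/index.html')

# Field 3 is still the host, because the path only adds fields after it.
hosts=$(printf '%s\n' "$urls" | cut -d '/' -f 3)
printf '%s\n' "$hosts"
# www.example.com
# localhost:8080

# A second pass splitting on ':' removes the port where there is one.
# Lines without ':' pass through unchanged.
hosts_no_port=$(printf '%s\n' "$hosts" | cut -d ':' -f 1)
printf '%s\n' "$hosts_no_port"
# www.example.com
# localhost
```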
- Extracting specific fields from a system log file:
$ cat syslog.txt
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuset
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpu
Jan 21 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuacct
$ cut -f 1,4,6 -d " " syslog.txt
Jan localhost [0.000000]
Jan localhost [0.000000]
Jan localhost [0.000000]
This command extracts the first, fourth, and sixth fields from each line of syslog.txt, using a single space as the delimiter.
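A caveat for log files: cut treats every single space as a delimiter, so a line padded with two spaces (syslog often writes "Jan  1" for single-digit days) gains an empty field and the numbering shifts. One common workaround, sketched here on a made-up log line, is to squeeze repeated spaces with tr -s before cutting. The log line and variable names are assumptions for illustration.

```shell
# Made-up log line with two spaces between "Jan" and "1".
entry='Jan  1 12:34:56 localhost kernel: [0.000000] Initializing cgroup subsys cpuset'

# Without squeezing, field 2 is empty and field 4 is the timestamp.
raw=$(printf '%s\n' "$entry" | cut -d ' ' -f 1,4)
echo "$raw"         # Jan 12:34:56

# tr -s ' ' collapses runs of spaces, so field 4 is the host again.
squeezed=$(printf '%s\n' "$entry" | tr -s ' ' | cut -d ' ' -f 1,4)
echo "$squeezed"    # Jan localhost
```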
- Extracting the names of the CSV files present in a folder:
# command to show the folder and file in hierarchical manner
$ tree -L 1
.
├── [ 0] file1.py
├── [ 0] file2.csv
├── [ 0] file3.csv
└── [ 0] file3.txt
$ find . -type f -name "*.csv" | cut -d "/" -f2 | cut -d"." -f1
file2
file3
In this example, the first part of the pipeline finds all the CSV files in the folder and pipes their paths (such as ./file2.csv) to the standard input of the first cut command. That cut command removes the folder path by splitting on the "/" delimiter (-d "/") and outputs the file name with its extension (.csv). Finally, the second cut command splits on "." and keeps only the first field, which trims the extension and leaves just the file name.
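To try this pipeline without touching real data, it can be reproduced in a throwaway directory. The sketch below recreates the four empty files from the listing with mktemp and touch. The trailing sort is an addition of mine so the order is predictable, since find does not guarantee one.

```shell
# Build a scratch folder that mirrors the listing above.
workdir=$(mktemp -d)
touch "$workdir/file1.py" "$workdir/file2.csv" "$workdir/file3.csv" "$workdir/file3.txt"

# Same pipeline as above, run inside the scratch folder.
# sort is added only to make the output order deterministic.
names=$(cd "$workdir" && find . -type f -name "*.csv" | cut -d "/" -f2 | cut -d "." -f1 | sort)
printf '%s\n' "$names"
# file2
# file3

# Clean up the scratch folder.
rm -r "$workdir"
```

Note that the pipeline assumes the files sit directly in the folder and have no extra dots in their names: a file in a subfolder would yield the subfolder name, and report.v2.csv would be trimmed to report.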
These are just a few examples of how the
cut command can be used in creative ways to manipulate and analyze data. With a little experimentation and a good understanding of the command's options and capabilities, you can use it to solve a wide range of text-processing problems.
In this short guide, we explored the cut command, a versatile tool for extracting and analyzing data, and one that belongs in every developer's toolkit. With this, we are ready to use the cut command in our day-to-day work to solve the problems we encounter.