Awk
About Awk
Awk is a powerful text-processing utility on GNU/Linux. It can perform complex text-processing tasks, such as pulling out certain columns of information or doing substitutions. One thing to note is that there are different variants of Awk. On many GNU/Linux distros, Awk is actually Gawk (GNU Awk), although some ship another variant such as mawk. So the package name is Gawk, but the command you run is just “awk”.
Basic Awk Commands
awk options 'selection_criteria {action}' input-file > output-file
Print out file (similar to cat)
- awk '{print}' test.sh
- awk '{print $0}' test.sh
- cat /etc/shells | awk …
- awk … /etc/shells
Field separators
- awk 'BEGIN{FS=":"; OFS="-"} {print $1,$6,$7}' /etc/passwd
- cat /etc/shells | awk '/^\// {print $0}'
- cat /etc/shells | awk '/^\// {print $NF}'
- awk -F "/" '/^\//{print $NF}' /etc/shells | uniq
- awk -F ":" '{print $2}' /etc/passwd
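Splitting on "/" has a quirk that is easier to see on sample data: a path that starts with the separator gets an empty first field, so NF is one more than the number of path components. The sketch below uses two made-up paths in place of /etc/shells, and the variable names are only for illustration. It also shows that -F on the command line and FS in a BEGIN block give the same result.

```shell
# Two made-up shell paths standing in for /etc/shells (assumption for the demo).
paths='/bin/bash
/usr/bin/fish'

# -F "/" on the command line...
with_flag=$(printf '%s\n' "$paths" | awk -F '/' '{ print NF, $NF }')

# ...does the same job as assigning FS in a BEGIN block.
with_begin=$(printf '%s\n' "$paths" | awk 'BEGIN { FS = "/" } { print NF, $NF }')

# Prints the field count and the last field of each path:
# 3 bash
# 4 fish
printf '%s\n' "$with_flag"
```

The count is 3 for /bin/bash because the fields are an empty string, "bin" and "bash".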
&& and ||
- ps -ef | awk '/root/ && $2<100'
- ps -ef | awk '/root/ || /dt/ && (length($NF) < 30)'
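When && and || are mixed, && binds more tightly than ||, so a condition like A || B && C is read as A || (B && C). Here is a small sketch of the difference on three made-up records (user name and a number); the data and variable names are assumptions for illustration only.

```shell
# Made-up records: a user name and a number (assumption for the demo).
data='root 50
dt 500
root 500'

# Without parentheses this reads as: $1=="root" || ($1=="dt" && $2<100)
loose=$(printf '%s\n' "$data" | awk '$1 == "root" || $1 == "dt" && $2 < 100')

# Parentheses force the || to be evaluated first.
strict=$(printf '%s\n' "$data" | awk '($1 == "root" || $1 == "dt") && $2 < 100')

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
```

The first command keeps both root lines, while the second keeps only "root 50".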
Printing columns
- ps -ef | head | awk '{print $1" "$8}'
- use "\t" or "\n" instead of " " to separate the columns with a tab or a newline
- use $NF instead of $8 to print the last field
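As a quick sketch of those two variations, the example below runs them on one made-up line shaped like ps -ef output, where the command happens to be field 8 and also the last field. The sample line and variable names are assumptions.

```shell
# One made-up line shaped like `ps -ef` output (assumption for the demo).
line='derek 1234 1 0 10:00 ? 00:00:01 /usr/bin/fish'

# Tab between the first and last field.
tabbed=$(printf '%s\n' "$line" | awk '{ print $1 "\t" $NF }')

# Newline between them, so each field lands on its own line.
stacked=$(printf '%s\n' "$line" | awk '{ print $1 "\n" $NF }')

printf '%s\n%s\n' "$tabbed" "$stacked"
```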
Filter results for search pattern
/\[/ (matches lines that contain a literal [ character)
- df | awk '/\/dev\/loop/ {print $1"\t"$2"\t"$3}'
- df | awk '/\/dev\/loop/ {print $1"\t"$2 - $3}'
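Since real df output differs from machine to machine, here is the same idea on a few made-up rows: select the lines whose first characters are /dev/ and print the device name followed by total blocks minus used blocks. The numbers are invented, and the parentheses around the subtraction are only there for readability.

```shell
# Made-up rows shaped like `df` output (assumption for the demo).
df_output='Filesystem 1K-blocks Used Available
/dev/loop0 1000 1000 0
/dev/sda1 5000 2000 3000
tmpfs 800 100 700'

# Slashes inside the /.../ pattern must be escaped as \/.
free_blocks=$(printf '%s\n' "$df_output" | awk '/^\/dev\// { print $1 "\t" ($2 - $3) }')

printf '%s\n' "$free_blocks"
```

The header line and the tmpfs line are skipped because they do not start with /dev/.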
Filter results by line length (number of characters)
awk 'length($0) > 7' /etc/shells
Find a specific string in the last column
ps -ef | awk '{ if($NF == "/bin/fish") print $0};'
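The command above compares only the last field ($NF). To test every column instead, one option is to loop over fields 1 to NF and print the line on the first hit. This is a sketch on made-up data, not the only way to do it.

```shell
# Made-up records where /bin/fish shows up in different columns (assumption).
data='alice /bin/fish extra
bob /bin/bash /bin/fish
carol /bin/zsh none'

# Check each field in turn; print the line once and stop at the first match.
hits=$(printf '%s\n' "$data" | awk '{
    for (i = 1; i <= NF; i++)
        if ($i == "/bin/fish") { print; break }
}')

printf '%s\n' "$hits"
```

This prints the alice and bob lines and skips carol.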
Increment/Decrement
Post-increment
awk 'BEGIN { for(i=1;i<=10;i++) print "square of", i, "is",i*i; }'
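To make the difference between the operator forms concrete, here is a short sketch: i++ yields the old value and then increments, ++i increments first, and i-- counts down the same way. The variable names are arbitrary.

```shell
# a gets the value before the increment, b the value after the second one.
values=$(awk 'BEGIN { i = 5; a = i++; b = ++i; print a, b, i }')

# Post-decrement in a loop: counts down from 3 to 1.
countdown=$(awk 'BEGIN { for (i = 3; i >= 1; i--) printf "%d ", i; print "go" }')

printf '%s\n%s\n' "$values" "$countdown"
# 5 7 7
# 3 2 1 go
```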
Regex
The ~ is the regular expression match operator. It checks if a string matches the provided regular expression.
awk '$1 ~ /^[bc]/ {print $0}' .bashrc
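Two details are worth a quick sketch here. Inside a bracket expression a comma is a literal character, so [b,c] also matches a comma, and !~ is the negated form of ~. The sample words below are made up for the demo.

```shell
# Made-up input, one word per line; the last one starts with a comma.
data='bash
csh
fish
,odd'

# First field starts with b or c.
starts_bc=$(printf '%s\n' "$data" | awk '$1 ~ /^[bc]/')

# With a comma in the brackets, the ",odd" line matches as well.
with_comma=$(printf '%s\n' "$data" | awk '$1 ~ /^[b,c]/')

# !~ selects the lines that do NOT match.
not_bc=$(printf '%s\n' "$data" | awk '$1 !~ /^[bc]/')

printf '%s\n---\n%s\n---\n%s\n' "$starts_bc" "$with_comma" "$not_bc"
```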
Print substr()
We use the substr() function, which returns a substring of the given string. We apply it to each line and start at the fourth character, which skips the first three. In other words, we print each record from the fourth character to its end.
awk '{print substr($0, 4)}' numbered.txt
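Since numbered.txt is not shown, here is a sketch on two made-up lines in a "1. item" layout (an assumption about what such a file might look like). It also shows the optional third argument of substr(), which limits the length of the result.

```shell
# Made-up numbered lines (assumption: "N. text" layout).
rest=$(printf '1. apple\n2. pear\n' | awk '{ print substr($0, 4) }')

# Third argument: take at most 3 characters, starting at character 4.
first3=$(printf '1. apple\n' | awk '{ print substr($0, 4, 3) }')

printf '%s\n%s\n' "$rest" "$first3"
# apple
# pear
# app
```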
Match and RSTART
The match() function sets the RSTART variable to the index where the match starts (counting from 1), or to 0 if there is no match.
awk 'match($0, /o/) {print $0 " has \"o\" character at " RSTART}' numbered.txt
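match() also sets RLENGTH, the length of the matched text, so RSTART and RLENGTH together can be passed to substr() to pull the match out. The sketch below uses made-up strings; when nothing matches, match() returns 0 and RLENGTH is -1.

```shell
# Extract the first run of digits from a made-up string.
found=$(printf 'foo123bar\n' | awk 'match($0, /[0-9]+/) {
    print RSTART, RLENGTH, substr($0, RSTART, RLENGTH)
}')

# No digits here: return value, RSTART and RLENGTH report "no match".
missing=$(printf 'abc\n' | awk '{ print match($0, /[0-9]+/), RSTART, RLENGTH }')

printf '%s\n%s\n' "$found" "$missing"
# 4 3 123
# 0 0 -1
```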
Built-in Variables in Awk
NR
The NR variable keeps a running count of the input records read so far. Remember that records are usually lines. Awk runs the pattern/action statements once for each record in a file.
Print range of lines
df | awk 'NR==7, NR==11 {print NR, $0}'
NOTE: The NR inside print adds the line numbers; delete it if line numbers are not needed!
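The comma between the two conditions makes a range pattern: the action runs from the first record that matches the left side through the first record that matches the right side, inclusive. A minimal sketch on five made-up lines:

```shell
# Five made-up input lines; select records 2 through 4.
numbered=$(printf 'a\nb\nc\nd\ne\n' | awk 'NR==2, NR==4 { print NR, $0 }')

# Same range without the line numbers.
plain=$(printf 'a\nb\nc\nd\ne\n' | awk 'NR==2, NR==4')

printf '%s\n---\n%s\n' "$numbered" "$plain"
```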
Print the line count
awk 'END {print NR}' /etc/shells
awk 'END {print NR}' /etc/shells /etc/passwd
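With more than one input file, NR keeps counting across all of them, so the second command prints the combined total. The related FNR variable restarts at 1 for each file. The sketch below uses two temporary files with made-up contents; the file names are hypothetical.

```shell
# Two throwaway files with made-up contents (2 lines and 3 lines).
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/one.txt"
printf 'c\nd\ne\n' > "$tmpdir/two.txt"

# NR in END is the total across both files.
total=$(awk 'END { print NR }' "$tmpdir/one.txt" "$tmpdir/two.txt")

# NR keeps counting; FNR resets when the second file starts.
per_file=$(awk '{ print NR, FNR }' "$tmpdir/one.txt" "$tmpdir/two.txt")

rm -r "$tmpdir"
printf '%s\n%s\n' "$total" "$per_file"
```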
NF
The NF variable holds the number of fields in the current input record. We’ve already used $NF to print the last field.
Using NF to print all non-empty lines
awk 'NF > 0' /etc/shells
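With the default field separator, a line that contains only spaces has no fields either, so NF > 0 drops it along with truly empty lines. A small sketch on made-up input, with the field count printed in front of each surviving line:

```shell
# Made-up input: a word, an empty line, a spaces-only line, two words.
kept=$(printf 'one\n\n   \ntwo words\n' | awk 'NF > 0 { print NF ": " $0 }')

printf '%s\n' "$kept"
# 1: one
# 2: two words
```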
FS
The FS variable holds the field separator, which is used to split each input line into fields. The default is “white space”, meaning space and tab characters. FS can be reassigned to another character (typically in BEGIN) to change the field separator.
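One difference between the default and a reassigned FS is worth a sketch: with the default, runs of spaces count as one separator and leading or trailing blanks are ignored, while a single character such as "," separates fields at every occurrence, so empty fields are possible. The input strings are made up.

```shell
# Default FS: extra spaces collapse, so this line has 2 fields.
default_count=$(printf '  a   b  \n' | awk '{ print NF }')

# FS set to a comma: the empty slot between the commas is a field too (3 fields).
comma_count=$(printf 'a,,b\n' | awk 'BEGIN { FS = "," } { print NF }')

printf '%s %s\n' "$default_count" "$comma_count"
# 2 3
```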
RS
The RS variable stores the current record separator. Since an input line is the input record by default, the default record separator is a newline.
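As a rough illustration, RS can be set in BEGIN just like FS. The sketch below splits made-up input on ";" and then uses an empty RS, which switches Awk to treating blank-line-separated paragraphs as records.

```shell
# Records separated by ";" instead of newlines (input has no trailing newline).
by_semicolon=$(printf 'a;b;c' | awk 'BEGIN { RS = ";" } { print NR, $0 }')

# RS = "" means paragraph mode: blank lines separate the records.
paragraphs=$(printf 'l1\nl2\n\nl3\n' | awk 'BEGIN { RS = "" } END { print NR }')

printf '%s\n%s\n' "$by_semicolon" "$paragraphs"
```

The first command prints three numbered records, and the second counts two paragraphs.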
OFS
The OFS variable stores the output field separator, which separates the fields when Awk prints them. The default is a single space. Whenever print is given several arguments separated by commas, it prints the value of OFS between them.
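The commas matter: without them the values are simply concatenated and OFS is not used. A plain print of the whole record also ignores OFS until the record is rebuilt, for which assigning a field to itself ($1 = $1) is a common idiom. A sketch on one made-up line:

```shell
line='a b c'

# Commas between arguments: OFS is inserted.
with_commas=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "-" } { print $1, $2, $3 }')

# No commas: plain concatenation, OFS is ignored.
no_commas=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "-" } { print $1 $2 $3 }')

# $0 is printed unchanged unless a field assignment rebuilds it.
untouched=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "-" } { print }')
rebuilt=$(printf '%s\n' "$line" | awk 'BEGIN { OFS = "-" } { $1 = $1; print }')

printf '%s\n%s\n%s\n%s\n' "$with_commas" "$no_commas" "$untouched" "$rebuilt"
# a-b-c
# abc
# a b c
# a-b-c
```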
ORS
The ORS variable stores the output record separator, which separates the output lines when Awk prints them. The default is a newline character. print automatically appends the contents of ORS to whatever it is given to print.
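For instance, setting ORS to ", " joins the input lines onto one output line. Because ORS is appended after every record, the result also ends with the separator. This is a sketch on made-up input.

```shell
# Join three made-up lines with ", "; note the trailing separator.
joined=$(printf 'a\nb\nc\n' | awk 'BEGIN { ORS = ", " } { print }')

# Brackets make the trailing ", " visible.
printf '[%s]\n' "$joined"
# [a, b, c, ]
```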
Footer
Copyright © 2020-2021 Derek Taylor (DistroTube)
This page is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License (CC-BY-ND 4.0).
The source code for distro.tube can be found on GitLab. User-submitted contributions to the site are welcome, as long as the contributor agrees to license their submission with the CC-BY-ND 4.0 license.