Grep, Awk, and Sed in bash on OSX
Over the last couple of days I have been working with huge log files in bash on OSX (the command line). I used grep, awk, and sed to manipulate and rewrite these log files. Here are some commands that I found useful and want to document for the future. If you have never used these bash tools, this might be useful, especially if you are working with files that are really big. I’m going to consider a sample log file and explain a few different things you can do with these tools. I am using these tools on OSX, and these commands or one-line scripts should also work on most Linux flavors.
So, let’s consider the following log file:
Now, here are some things you can do to this log file:
How many files are in a directory: ls | wc -l
Print the file: cat sample.log
Print lines that match a particular word: grep "RAW" sample.log
Print those lines to a file called test.log: grep "RAW" sample.log > test.log
Print particular columns and sort: cat sample.log | awk '{ print $1,$2 }' | sort -k 1
Find and Replace using SED and Regex: cat sample.log | sed 's/TEST/JSON/g'
Split a log file into multiple files using a column as name with AWK: awk '{ print >> ($4".log"); close($4".log") }' sample.log
Use substr (removes last character) in AWK to manipulate a string per line: cat sample.log | awk '{ print $1,$2,$3,substr($4,1,length($4)-1),$5 }'
Print first line of file with SED: sed q test.log
Print last line of file with SED: sed '$!d' sample.log
Remove a trailing 5 from the last line of the file using SED: cat sample.log | sed '$ s/5$//'
Add some text to the beginning and end of a file with AWK: cat sample.log | awk 'BEGIN{ print "START" } { print } END{ print "END" }'
Count how many unique values appear in the first column using AWK: cat sample.log | awk '{ if (a[$1]++ == 0) print $1 }' | wc -l
Make everything lowercase with AWK: cat sample.log | awk '{ print tolower($0) }'
Multiple SED regular expressions: sed '1s/^/START/;$ s/5$/END/' sample.log
Regex with SED on multiple files: for file in *; do sed '1s/^/START/' "$file" > "$file.json"; done
The last one does not apply to the sample log file; it runs on all files in a directory. I included it to show how you can write a for-loop as a one-liner. If you are looking for more sed and awk one-liners, check out these.
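To try a few of the one-liners above end to end, here is a minimal sketch that builds a small log file in a temporary directory and runs them against it. The file contents and column layout are invented for illustration and are not the sample log from this post. The variable names (workdir, log, raw_count and so on) are likewise just placeholders.

```shell
# Hypothetical sample data: this column layout is an assumption,
# not the log file discussed in the post.
workdir=$(mktemp -d)
log="$workdir/sample.log"

cat > "$log" <<'EOF'
2013-01-01 10:00:01 RAW alpha, 1
2013-01-01 10:00:02 TEST beta, 2
2013-01-02 10:00:03 RAW alpha, 3
2013-01-02 10:00:04 TEST gamma, 5
EOF

# Count the lines that contain the word RAW.
raw_count=$(grep -c "RAW" "$log")

# Count the distinct values in column 1 (tr strips the padding
# that wc adds on OSX).
unique_dates=$(awk '{ if (a[$1]++ == 0) print $1 }' "$log" | wc -l | tr -d ' ')

# Replace TEST with JSON, then count the lines that now contain JSON.
json_count=$(sed 's/TEST/JSON/g' "$log" | grep -c "JSON")

# Column 4 of the first line, with its trailing comma removed by substr.
first_name=$(awk '{ print substr($4, 1, length($4) - 1) }' "$log" | sed q)

echo "RAW lines: $raw_count"
echo "Unique dates: $unique_dates"
echo "Lines with JSON: $json_count"
echo "First name: $first_name"
```

Passing the file name directly to grep, awk, and sed gives the same result as piping it in with cat, and saves one process per command.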
More OSX Bash Commands
Run a command on each file in a directory (echo stands in for the command): for f in *; do echo "$f"; done
Run a command on each txt file within a folder using an if statement: for file in *; do if [[ ${file: -4} == ".txt" ]]; then echo "$file"; fi; done
Erase the contents of every file in a directory: for f in *; do : > "$f"; done
Rename the extension of all files in a folder: for old in *.txt; do mv "$old" "$(basename "$old" .txt).json"; done
Merge all files in a directory with a comma separator: find . -type f -not -name output.txt -exec cat {} \; -exec echo ',' \; > output.txt
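Loops like the ones above can break on file names that contain spaces, so here is a minimal sketch of the rename and merge steps with every variable quoted. It works in a throwaway directory with made-up file names (a.txt and "b c.txt"). It also swaps in the shell's ${old%.txt} suffix removal for basename, and a plain for-loop for find.

```shell
# Hypothetical files in a throwaway directory; the names are made up.
workdir=$(mktemp -d)
mkdir "$workdir/in"
printf 'one\n' > "$workdir/in/a.txt"
printf 'two\n' > "$workdir/in/b c.txt"   # a name with a space, to exercise quoting

# Rename .txt to .json; ${old%.txt} strips the suffix without calling basename.
for old in "$workdir/in"/*.txt; do
  mv "$old" "${old%.txt}.json"
done

# Merge every file, writing a line with a comma after each one.
# The output lives outside the input directory, so the glob never picks it up.
merged="$workdir/merged.txt"
for f in "$workdir/in"/*.json; do
  cat "$f"
  echo ','
done > "$merged"

# Count the renamed files.
renamed_count=$(find "$workdir/in" -type f -name '*.json' | wc -l | tr -d ' ')
echo "Renamed files: $renamed_count"
cat "$merged"
```

Globbing with * instead of parsing the output of ls lets the shell pass each file name through as a single word, whatever characters it contains.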