Data, Maps, Usability, and Performance

Grep, Awk, and Sed in bash on OSX

Last updated on September 12, 2016 in Development

linux bash scripting osx

Over the last couple of days I have been working with huge log files in bash on OSX (the command line). I used grep, awk, and sed to manipulate and rewrite these log files. Here are some commands that I found useful and want to document for the future. If you have never used these tools, this might be useful, especially if you are working with files that are really big. I'm going to take a sample log file and explain a few different things you can do with these tools. I am using them on OSX, and these commands or one-line scripts should also work on most Linux flavors, although the BSD versions shipped with OSX and the GNU versions on Linux differ in a few options.

So, let’s consider the following log file:

Now, here are some things you can do to this log file:

How many files are in a directory: ls | wc -l

Print the file: cat sample.log

Print lines that match a particular word: grep "RAW" sample.log

Print those lines to a file called test.log: grep "RAW" sample.log > test.log
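As a rough sketch of the grep commands above: the sample log from the post is not shown here, so the three lines below are made-up data, and the five-column layout is an assumption. The grep -c call is an extra that counts matching lines instead of printing them.

```shell
# Hypothetical sample data; the real sample.log is not shown in the post.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/grepdemo.XXXXXX")
cd "$workdir" || exit 1
printf '%s\n' \
  '2016-09-10 10:00:01 RAW alpha 1' \
  '2016-09-10 10:00:02 TEST beta 2' \
  '2016-09-11 11:30:00 RAW gamma 5' > sample.log

# Keep only the lines containing RAW and write them to test.log.
grep "RAW" sample.log > test.log

# grep -c counts the matching lines instead of printing them.
raw_count=$(grep -c "RAW" sample.log)
echo "$raw_count lines matched"
```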

Print particular columns and sort: cat sample.log | awk '{ print $1,$2}' | sort -k 1
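Here is a small sketch of the column-and-sort idea, assuming a made-up log whose first two columns are a date and a time. Since awk can read the file itself, the leading cat is optional.

```shell
# Hypothetical sample data, deliberately out of order.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/awkdemo.XXXXXX")
printf '%s\n' \
  '2016-09-11 11:30:00 RAW gamma 5' \
  '2016-09-10 10:00:02 TEST beta 2' \
  '2016-09-10 10:00:01 RAW alpha 1' > "$workdir/sample.log"

# Print columns 1 and 2, then sort on field 1 through field 2.
sorted=$(awk '{ print $1, $2 }' "$workdir/sample.log" | sort -k1,2)
echo "$sorted"
```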

Find and Replace using SED and Regex: cat sample.log | sed 's/TEST/JSON/g'
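The sed command above only prints the result. If you want the change saved back to the file, one portable option is to write to a temporary file and move it over the original, since the in-place flag differs between BSD sed on OSX (sed -i '') and GNU sed on Linux (sed -i). A minimal sketch, with made-up data and a hypothetical sample.log.tmp name:

```shell
# Hypothetical sample data.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/seddemo.XXXXXX")
printf '%s\n' \
  '2016-09-10 10:00:01 RAW alpha 1' \
  '2016-09-10 10:00:02 TEST beta 2' > "$workdir/sample.log"

# Print the result without touching the file.
sed 's/TEST/JSON/g' "$workdir/sample.log"

# Save the change: write to a temp file, then replace the original.
sed 's/TEST/JSON/g' "$workdir/sample.log" > "$workdir/sample.log.tmp" &&
  mv "$workdir/sample.log.tmp" "$workdir/sample.log"
```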

Split a log file into multiple files using a column as name with AWK: awk '{ print >>($4".log"); close($4".log") }' sample.log
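To see what the split produces, here is a sketch that assumes the fourth column holds a short name (alpha, beta). The data is invented for illustration. Closing each file after writing keeps awk from running out of open file handles when there are many distinct names.

```shell
# Hypothetical sample data; column 4 becomes the output file name.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/splitdemo.XXXXXX")
cd "$workdir" || exit 1
printf '%s\n' \
  '2016-09-10 10:00:01 RAW alpha 1' \
  '2016-09-10 10:00:02 TEST beta 2' \
  '2016-09-11 11:30:00 RAW alpha 5' > sample.log

# Append each line to <column 4>.log, closing the file after every write.
awk '{ out = $4 ".log"; print >> out; close(out) }' sample.log

# Show how many lines ended up in each file.
wc -l alpha.log beta.log
```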

Use substr (removes last character) in AWK to manipulate a string per line: cat sample.log | awk '{ print $1,$2,$3,substr($4,1,length($4)-1),$5}'
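A quick sketch of the substr trick on a single line, assuming the fourth column carries a trailing comma you want to drop (the line itself is made up):

```shell
# Hypothetical log line whose fourth column ends in a comma.
line='2016-09-10 10:00:01 RAW alpha, 1'

# substr(s, 1, length(s) - 1) keeps everything except the last character.
trimmed=$(printf '%s\n' "$line" |
  awk '{ print $1, $2, $3, substr($4, 1, length($4) - 1), $5 }')
echo "$trimmed"
```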

Print first line of file with SED: sed q test.log

Print last line of file with SED: sed '$!d' sample.log

Apply a regex substitution to the end of the last line of a file with SED (here, removing a trailing 5): cat sample.log | sed '$ s/5$//'

Add some text to beginning and end of a file with AWK: cat sample.log | awk 'BEGIN{print "START" } { print } END{print "END"}'
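The BEGIN block runs before any input is read and the END block runs after the last line, which is what makes the wrapping work. A tiny sketch with two made-up input lines:

```shell
# BEGIN runs before the first line, END after the last one.
wrapped=$(printf 'one\ntwo\n' |
  awk 'BEGIN { print "START" } { print } END { print "END" }')
echo "$wrapped"
```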

Count how many unique values appear in the first column using AWK: cat sample.log | awk '{ if (a[$1]++ == 0) print $1 }' | wc -l
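The counting can also be done inside awk, without the extra wc. This is only a sketch on invented data where the first column is a date, and the array name seen is arbitrary:

```shell
# Hypothetical sample data with two distinct dates in column 1.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/uniqdemo.XXXXXX")
printf '%s\n' \
  '2016-09-10 10:00:01 RAW alpha 1' \
  '2016-09-10 10:00:02 TEST beta 2' \
  '2016-09-11 11:30:00 RAW gamma 5' > "$workdir/sample.log"

# seen[$1]++ is 0 (false) only the first time a value shows up,
# so n is incremented once per distinct value in column 1.
unique_count=$(awk '!seen[$1]++ { n++ } END { print n + 0 }' "$workdir/sample.log")
echo "$unique_count unique values in column 1"
```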

Make everything lowercase with AWK: cat sample.log | awk '{print tolower($0)}'

Multiple SED regular expressions: sed '1s/^/START/;$ s/5$/END/' sample.log
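The same two substitutions can be written with one -e flag per expression, which some people find easier to read than the semicolon form. A sketch on made-up data whose last line happens to end in 5:

```shell
# Hypothetical sample data; the last line ends in 5.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/multised.XXXXXX")
printf '%s\n' \
  '2016-09-10 10:00:01 RAW alpha 1' \
  '2016-09-11 11:30:00 RAW gamma 5' > "$workdir/sample.log"

# 1s/^/START/ prefixes the first line; $s/5$/END/ rewrites the end of the last line.
result=$(sed -e '1s/^/START/' -e '$s/5$/END/' "$workdir/sample.log")
echo "$result"
```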

Regex with SED on multiple files: for file in *; do sed '1s/^/START/' "$file" > "$file.json"; done

The last one does not apply to the sample log file but to all files in a directory. I included it to show how you can write a for-loop as a one-liner. If you are looking for more sed and awk one-liners, check out these.
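Spread over several lines, the for-loop looks like the sketch below. The two input files are made up, and the glob is narrowed to *.txt here (an assumption, not part of the original one-liner) so that a second run does not pick up the .json files it created the first time.

```shell
# Hypothetical input files in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/loopdemo.XXXXXX")
cd "$workdir" || exit 1
printf 'a\n' > one.txt
printf 'b\n' > two.txt

# Prefix the first line of each file and write the result next to it.
# Quoting "$file" keeps names with spaces intact.
for file in *.txt; do
  sed '1s/^/START/' "$file" > "$file.json"
done

ls
```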

More OSX Bash Commands

Run a command on each file in a directory (replace echo with the command you want to run): for i in *; do echo "$i"; done

Run a command on each txt file within a folder using an if statement: for file in *; do if [[ ${file: -4} == ".txt" ]]; then echo "$file"; fi; done
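The [[ ... ]] test and the ${file: -4} substring are bash features. If the script might run under a plain POSIX sh, a case statement is one alternative. A sketch with made-up file names:

```shell
# Hypothetical files in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/casedemo.XXXXXX")
cd "$workdir" || exit 1
touch notes.txt data.csv readme.txt

# case matches the file name against a glob pattern, no substring math needed.
matches=""
for file in *; do
  case "$file" in
    *.txt) matches="$matches $file" ;;
  esac
done
echo "txt files:$matches"
```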

Erase the contents of every file in a directory: for i in *; do : > "$i"; done

Rename the extension of all files in a folder: for old in *.txt; do mv "$old" "$(basename "$old" .txt).json"; done
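The shell can also strip the extension itself with ${old%.txt}, which avoids calling basename once per file. A sketch on two made-up, empty files:

```shell
# Hypothetical files in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/renamedemo.XXXXXX")
cd "$workdir" || exit 1
touch a.txt b.txt

# ${old%.txt} removes the trailing .txt from the name.
for old in *.txt; do
  mv "$old" "${old%.txt}.json"
done

ls
```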

Merge all files in a directory with a comma separator: find . -type f -not -name output.txt -exec cat {} \; -exec echo ',' \; > output.txt
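Here is the merge on two made-up files. Each -exec must end with an escaped semicolon (\;), and the comma goes in ordinary quotes, because backticks would try to run it as a command. Note that find does not promise any particular file order, and that a comma line follows every file, including the last one.

```shell
# Hypothetical input files in a scratch directory.
workdir=$(mktemp -d "${TMPDIR:-/tmp}/mergedemo.XXXXXX")
cd "$workdir" || exit 1
printf 'one\n' > a.txt
printf 'two\n' > b.txt

# Print each file followed by a line holding a comma; skip output.txt itself.
find . -type f ! -name output.txt -exec cat {} \; -exec echo ',' \; > output.txt

cat output.txt
```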

Related:

Unix Tricks
