Yesterday, I wrote production-level code using awk. It was either that or write the code in Go, which I figured would take all day, and I really only had about an hour. About 40 lines of shell script, a mixture of roughly 20 lines of awk plus bash/sed/grep/tail/xargs/cut/cat/curl, got the job done.
AWK is very freaking powerful if you learn to parse data with it.
Probably a good book, but it serves a way different purpose than this article which is more of a gateway drug for newbies like myself who have never worked with awk before and don't quite understand the value-add.
I really enjoyed this article, because now that I have a sampling of the class of problems that awk can help me with, I am more likely to use it. And it only took 10 minutes to get this overview. My brain is already buzzing with ideas on how I can use this with gron[1] that was mentioned a few days ago on HN.
We all need a compelling reason to use a tool before investing the time in whatever Bible may exist, so "just grab the Bible first" isn't the most helpful advice :) But I do appreciate you flagging it!
Might I ask why this old version (I haven't read it) instead of the manual/book for GNU awk? [1] It does a good job of noting the differences between implementations, and GNU awk extensions are marked.
That old version is written by the ‘A’[0], ‘W’[1], and ‘K’[2] that are the eponymous inventors of awk, and the book is at least occasionally cited as basically a masterpiece.[3]
I've also seen this book mentioned elsewhere but never made the effort to read it. Hopefully I'll go through it sometime soon. It may be ~30 years old, but the core concepts are likely to work in gawk even now.
Also, the gawk book is written by Arnold Robbins, who is one of the current maintainers of the tool, and as per [1], gawk was started by David Trueman and Arnold.
Hmm, the vital thing to say is that awk is a series of
PATTERN { ACTION }
statements, where { ACTION } is optional and defaults to { print }
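To make the default action concrete, here is a minimal sketch. The sample data and the variable names (data, implicit, explicit) are made up for illustration; the only point is that a pattern on its own behaves like the same pattern followed by { print }.

```shell
# Hypothetical sample data: a name and a score, whitespace-separated.
data='alice 12
bob 7
carol 30'

# Pattern only: the action defaults to { print }, so matching lines are printed whole.
implicit=$(printf '%s\n' "$data" | awk '$2 > 10')

# The same program with the action spelled out.
explicit=$(printf '%s\n' "$data" | awk '$2 > 10 { print }')

printf '%s\n' "$implicit"
```

Both variables should end up holding the same two lines, the ones whose second field is greater than 10.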
Side note; I almost always use Perl or Ruby with the -n or -p flags instead of awk. I don't need to have three or four syntaxes to remember when I can use just one.
>where { ACTION } is optional and defaults to { print }
There's more to that:
Either PATTERN or { ACTION } is optional (but not both).
If no PATTERN is given, the default means "match every line", so the given ACTION runs for every line.
The ACTION could be anything: an explicit print (whether or not PATTERN is given), summing up some column, filtering further with substr(...), assigning values to variables, comparing variables and acting on the result (awk's if statement), calling a built-in or user-defined function, etc.
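A rough sketch of those cases, using made-up sample data and variable names (data, names, summary) that are only for illustration: one program with an action and no pattern, and one that combines a pattern, a running sum, and an if statement in an END block.

```shell
# Hypothetical sample data: a name and a score, whitespace-separated.
data='alice 12
bob 7
carol 30'

# Action only: with no pattern, the action runs for every line.
names=$(printf '%s\n' "$data" | awk '{ print $1 }')

# Pattern plus action: sum and count the scores above 10,
# then report them once all input has been read.
summary=$(printf '%s\n' "$data" | awk '
    $2 > 10 { total += $2; n++ }
    END     { if (n > 0) print n, total }
')

printf '%s\n' "$names" "$summary"
```

END is a special pattern that matches once after the last line, which is what makes the sum-a-column idiom so short.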
I studied under Al Aho a few semesters ago (the 'A' in AWK). I'm dipping my toes in Data Science in Python at my current internship, and while I'm seriously impressed at the bevy of NLP/Machine Learning algorithms I have at my disposal, I've come to realize that some of the stuff I'm working on can be solved by unix tools written in the 70s/80s. And that's dope.
The point you made comes up periodically on HN. And yes, it is "dope" :)
More people should check prior art before making new stuff. Reuse / stand on the shoulders of giants / don't reinvent the wheel, etc. But NIH is alive and well ...
"And, unlike some languages, awk’s syntax is familiar, and borrows some of the best parts of languages like C, python, and bash (although, technically, awk was created before both python and bash)."
If anyone's interested, I created a repo that contains some common one-liners that I've used https://github.com/kirang89/awk-cookbook. There's a PDF version as well :)
In the early 2000s, I spent quite some time learning from the tutorials on the IBM developer site. Many of them are well written and easy to read compared with blog posts. And they paid the authors.
I recently looked for a simple scripting language for a task where I needed to do some arithmetic and print the results formatted to 2 decimal places. I found standalone awk scripts a good option. Bash itself is not well suited, and Python also felt like a worse option in my case. I also looked at bc as a scripting language, but its reusability and formatting features were more limited.
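For anyone curious what that looks like, here is a minimal sketch, not the script described above. The values and variable names (price, qty, tax) are invented for illustration. A BEGIN block runs without reading any input, and printf with %.2f handles the two-decimal formatting.

```shell
# Standalone awk: the BEGIN block runs before (and here, without) any input.
result=$(awk 'BEGIN {
    price = 19.99; qty = 3; tax = 0.0825   # hypothetical values
    subtotal = price * qty
    total = subtotal * (1 + tax)
    printf "subtotal %.2f\ntotal %.2f\n", subtotal, total
}')

printf '%s\n' "$result"
```

awk does its arithmetic in floating point, so there is no need for the $(( )) integer workarounds or a pipe to bc that plain Bash would require.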