Something I've come to internalize over the years is to avoid being clever, especially with lingua francas like Bash.
There will be people - seasoned professional developers, even - reading these languages that don't know the first thing about them!
`[` is one of those things that are second nature for seasoned bash people but that are utterly ungoogleable by a bash beginner. Beginners don't know `[` is an executable, and it would never occur to them to `man [`. It can become quite the rabbit hole to figure out that there's also `[[` (which is NOT an executable), that `[ a > b ]` is very very different from `[[ a > b ]]`, etc.
So do these people a favor and put the `if` there so they at least have some hope of stumbling upon a stack overflow post talking about some of this stuff.
I have a somewhat quirky style and almost always use "if test ..." since I find it clearer than "[" and usually don't need the additional functionality that "[[" provides.
I always use "test" and never "[" (and I've only used "[[" once when I thought I needed a builtin regex matcher for efficiency). Partly because fish does the same, partly because despite writing bash scripts for years I do not want to have the "[ vs [[" information in my brain (nor should whoever reads my code).
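As a minimal illustration of that style (the variable names here are invented for the example):

```shell
# "test" is an ordinary command whose exit status (0 = true) drives the
# "if" -- it works identically in dash, bash, and other POSIX shells.
name="root"

if test "$name" = "root"; then
    msg="running as root"
else
    msg="running as $name"
fi

echo "$msg"
```

Nothing here needs `[` or `[[` at all, which is the point: the reader only has to know that `if` runs a command and branches on its exit status.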
(For similar reasons, whenever I'm grepping for a nontrivial regex, I always use egrep rather than grep, because the escaping semantics are simpler—"if it's escaped with a \, then it's not special"—and again I do not want to have the information of "which characters are special only when escaped in base grep" in my brain.)
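A quick sketch of that difference, using `grep -E` (the modern spelling of `egrep`):

```shell
# Extended regular expressions follow the simple rule described above:
# escaped with \ means literal, unescaped metacharacters are special.
ere_match=$(printf 'aaa\n' | grep -E 'a+')    # unescaped + means "one or more"
literal=$(printf 'a+b\n' | grep -E 'a\+b')    # \+ is a literal plus sign
echo "$ere_match / $literal"
```

With basic REs (plain `grep`), `+` is literal and the rule inverts, which is exactly the extra state the comment is trying not to carry around.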
> I do not want to have the information of "which characters are special only when escaped in base grep" in my brain
Tangential, but relevant to the editor thread today where vis was mentioned: “No more {very-,}{no,}magic modes with different semantics and escaping rules. Instead we use the familiar and predictable POSIX Extended Regular Expression variant.”¹ is a real advantage over vim.
I honestly thought I was the only one, how cool is that? I too find `test` preferable over `[` since it leaves no doubt we're calling a command just like any other and the same assumptions still apply. I also avoid `[[` since it isn't posix and won't work in simpler shells.
Couldn't agree more.
As a very occasional bash script user (who'll go with Python instead whenever possible) I sort of lost interest already at the first, banal example:
>>if [ -r ~/.profile ]; then
>> source ~/.profile
>>fi
>>We can simplify this using control operators:
>>[ -r ~/.profile ] && . ~/.profile
I wonder which version I would prefer for readability three months down the road.
One of the most useful tricks I've learned and now apply to every non-throwaway shell script is the use of "||" for guard clauses to catch failing operations, along with the use of a block statement for running multiple operations "{ }".
Like so:
# Download the binary
wget --quiet --output-document /usr/local/bin/mybinary "$download_url" || {
    error 'Failed downloading the CLI'
    exit 1
}

debug "Making CLI executable"

# Make it executable
chmod +x /usr/local/bin/mybinary || {
    error 'Failed making CLI executable'
    exit 1
}
This has been completely transformative for me and has made writing maintainable + debuggable scripts so much easier.
if ! wget --quiet --output-document /usr/local/bin/mybinary "$download_url"; then
    error 'Failed downloading the CLI'
    exit 1
fi
?
I know someone who wrote most of his Bash that way, using `cmd && { ...; }` instead of `if cmd; then ...; fi` and `cmd || { ...; }` instead of `if ! cmd; then ...; fi`. I never thought it was particularly clear or maintainable--that it was more on the "clever" side than on the "clear" side.
Sure, it's a little clunky to have an `if` for everything that might fail, but uh... these days I write Go for a living.
My brain parses that || as quickly as it verifies the !, and having the actual command at the left of the line is helpful for clarity; FWIW, this is actually a pretty common paradigm in many languages including Ruby--with "or die"--which people generally liked for cute syntax.
Yep, this was a common Perl construct as well, since you could follow "or die" with a custom error string. I used to amuse myself (way back in the Stone Age, when I was in college and didn't know better) by using amusing and unique invectives for error messages like 'or die "you son of a motherless goat"'. Not only did they make me giggle a little, they made searching code for the line that failed a little easier...
What did Larry Wall do to deserve that? Ruby cribbed the "or die" idiom (and others) from Perl because they were useful. At least let him own his good ideas along with the rest of the language.
I mentioned that these days I write Go for a living, which involves a lot of:
if err := thingThatMightFail(); err != nil {
    ...
}
so I was saying that of course I think lots of `if` statements for error handling are reasonable. (Hmm, thinking back, I learned Go in 2012, and my Bash probably peaked in 2013, I wonder if having learned Go made me more amenable to this in Bash.)
I think a single-line `cmd || die "msg"` like in Perl or Ruby is handy and fine. Or maybe even a two-line
some command that might fail ||
    die "Some message that is long enough that it wants to be its own line"
But once it starts getting to have multiple statements that you're grouping with `{ }`, just use an `if` statement. If I were reviewing someone's Ruby that had multiple lines after the "or" in "or die", I'd tell them to just use an `if`.
I think you answered your own question. Those `if` constructs are indeed clunky. In fact I'd venture to say it's really just "C-like language Stockholm syndrome" that makes them feel more "maintainable" or "clear".
The bash "or" syntax reads very clearly. Do X or do Y. Do X or fail like so. It's very English-like and intuitive. And understanding why it works is just a matter of understanding logical operator short circuiting, which you have to understand to work in C-like languages anyway.
die() { echo "$@" >&2; exit 1; }

wget --quiet --output-document /usr/local/bin/mybinary "$download_url" ||
    die 'Failed downloading the CLI'
chmod +x /usr/local/bin/mybinary ||
    die 'Failed making CLI executable'
If you need more complex cleanup, that doesn't work as well... but (a) hopefully you can avoid that cleanup and just use a plain "|| die", and (b) if you need cleanup, hopefully you can put that into one function (possibly "die" itself) and have its cleanup work no matter where it's called from, and still run everything as "|| cleanup_and_die" or whatever you'd call it.
Prepending the script name is immensely helpful when diagnostic messages are mixed on stderr. It's also standard style for Unix utilities in general. The BSD extensions err(3) and warn(3), provided by all modern Unix libc implementations and commonly used by shell utilities, prepend the program name.
Another common style guideline is to include context for the diagnostic, often prepended to the message.
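A sketch of both conventions together; the `warn` helper and the file path are hypothetical stand-ins, loosely modeled on BSD warn(3):

```shell
# Prepend the program name ($0) to every diagnostic, and send it to
# stderr; per-message context (here, a file name) goes in front of the
# message itself.
warn() {
    echo "$0: $*" >&2
}

conffile="/etc/myapp.conf"    # hypothetical context for the diagnostic
warn "$conffile: permission denied"
```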
It's not Bash. I purposefully stick to POSIX shell constructs because memorizing all the various Bash (and ksh and Zsh) extensions and their oddball semantics is too much trouble. By sticking to POSIX shell it's easier to learn and remember actual shell programming semantics--parameter expansion, positional parameters, pathname globbing, etc. You have to understand those things, anyhow, if you actually care to understand the language; they happen whether you know it or not, and they're fundamental to the language.
Plus, the POSIX specification for the shell is significantly more concise and clear than the manuals for Bash, Zsh, etc. See Shell & Utilities -> Shell Command Language at https://pubs.opengroup.org/onlinepubs/9699919799/ I know exactly where to go if I can't remember the difference between ${foo:-X} vs ${foo+Y}.
Pro-pro tip: Add $0, but don't do pattern expansion on it. Worst case it's just slightly more verbose output. Best case, it's simpler to understand & maintain, and showing the path to your script will help unfamiliar people locate it quicker for debugging. (& I think it looks cleaner)
Also, don't run a command that directly overwrites a file unless you know it's going to succeed. A temp file lets you atomically replace the file only if the download succeeded. And a cleanup trap helps clean this up in the event of errors. Finally, you can die on unset variables or errors (cleanup still works) and you can enable tracing if environment variable DEBUG=1 was set.
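A runnable sketch of that pattern, with placeholders: `fetch` stands in for the real download command (wget/curl), and `./mybinary` for the real install path, so none of the names here are from the original comment:

```shell
# Die on errors and unset variables; allow opt-in tracing via DEBUG=1.
set -eu
[ "${DEBUG:-0}" = 1 ] && set -x

target="./mybinary"
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT          # temp file is removed on any exit path

fetch() { printf '#!/bin/sh\necho ok\n' > "$1"; }   # simulated download

fetch "$tmpfile"                      # a failure here aborts before touching $target
chmod +x "$tmpfile"
mv "$tmpfile" "$target"               # replace only after success (atomic on the same filesystem)
```

If `fetch` fails, `set -e` aborts the script, the EXIT trap removes the temp file, and the target is never overwritten with a partial download.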
#!/usr/bin/env bash
set -xeuo pipefail
-x to print the commands
-e to exit on error
-u to error out if there are unbound variables
-o pipefail to make a pipeline fail when any command in it fails, not just the last one (so -e can catch it)
No need to || {}.
Only word of warning here is that -x can print out secrets if you're not careful.
I stopped using `set -e`. It is disabled if the function you're running is part of an `if` statement, for example:
thingThatCanFail() {
    echo "step one succeeded"
    echo "step two failed"
    false
    echo "step three was run too"
}

if ! thingThatCanFail; then
    echo "thingThatCanFail failed!"
fi
With or without `set -e`, step three is run, and the function returns success, even though you might expect the failure of step two to prevent step three from running.
If "thingThatCanFail" is called _outside_ of an if statement, then `set -e` causes different behavior (i.e. step three _is_ skipped).
I instead use lots of chaining with && (as in the article), or explicit checks after each command. I have two utility functions I define in nearly every script.
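The comment doesn't show the functions themselves; a hypothetical pair in the same spirit (not the commenter's actual code) might be:

```shell
# Hypothetical utility pair: "die" prints a prefixed diagnostic and
# exits; "try" runs a command and dies with a message naming it on failure.
die() {
    echo "$0: $*" >&2
    exit 1
}

try() {
    "$@" || die "command failed: $*"
}

# usage:
#   try chmod +x ./mybinary
#   some_command || die 'some_command failed'
```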
Do yourself a favor and read the linked section. The "elegant" method sure can be nice, but comes with a downside you really need to be aware of.
I love shell scripting and this highlights one of the major problems with bash: small changes like this can appear completely interchangeable with other mechanisms for doing the same thing but introduce edge cases that can unexpectedly bite you.
For example... consider what happens if "[ $foo -eq 0 ] && /bin/do-something" is the last statement within a function or the end of your script.
I cannot believe this is not at the top. Short-circuiting is useful and fun in places, but to suggest it as a general replacement for conditionals as the OP does is absolutely wrong and will cause bugs.
Does this caveat still apply when you only use a single short-circuit operator in a given command? As a long time shell scripter I'd honestly be a little afraid to use more than one at a time.
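It does, even with a single `&&`: the issue is only about where the list sits. A sketch of the scenario described above (function names invented):

```shell
# When the && list is the last statement of a function and its test is
# false, the function itself returns nonzero, even though nothing really
# "failed". Under set -e that return value would abort the caller.
maybe_do() {
    foo=$1
    [ "$foo" -eq 0 ] && echo "doing something"
}

maybe_do 0 && echo "foo=0 -> function returned success"
maybe_do 1 || echo "foo=1 -> function returned failure"
```

Ending the function with `if [ "$foo" -eq 0 ]; then ...; fi` instead makes it return 0 in both cases.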
I always found it delightful how `if [ condition ]` works in bash, which, in case you don't know: the way `if <something>` works in bash is that <something> is a command that runs like everything else in bash. If the command succeeds (i.e. if the return code for the process is 0), the body of the if executes.
So how does `if [ <condition> ]` work then? Is it some kind of special case? No! The way it works is that there's an executable named `[` in UNIX that takes the condition expression as arguments, and returns "success" or "failure" depending on whether the condition evaluates to true. It's right there in the filesystem at /bin/[
Now, some sticklers might argue that it's maybe not the greatest idea to spawn a process every time you need to evaluate an if statement, but I really do admire the UNIX purity of it: of COURSE that's how bash conditionals work! This is UNIX, after all! Why add an expression parser/evaluator to bash when you can just have its own process for that? Do one thing and do it well!
It is neat, but it also trips up a lot of us in the beginning. Because at one point you instead write: 'if [condition]' and get weird errors. And you can't understand how suddenly you can't even do a simple if-statement.
And there are tons of gotchas like that that made me terrified of bash. Even the simplest construct can attack you from any angle and the trial and error and distrust is everywhere. I'm getting better at it but for me and I guess many others, I just don't use bash enough to remember these quirks and the resulting experience is rather poor. Which is a shame!
The savior for me was shellcheck. Instead of needing character-perfect memory of each construct and common snippet I can run shellcheck on it and it will tell me what common pitfall I might have stumbled upon and most importantly, why it is an issue. For a beginner I think it is an absolute must.
Actually... it usually is a special case. For performance reasons there are some sneaky things Bash tends to do to avoid spawning lots of extra processes. In this case the `[` binary is more for backwards compatibility.
You can verify this by writing a simple test script and doing (at least on Linux) "strace ./test.sh 2>&1 | grep exec" and observe no explicit call to that binary.
Because Bash already has an expression parser/evaluator that it uses for this case, so now there isn't just one thing doing it, there are two, and you have to know when they are used and when they aren't.
Honestly the idea that a certain benign-looking syntax could (but maybe not!) spawn some external subprocess is terrifying to me, and I don't find it delightful at all.
While it might not be a special case exactly, you most likely won’t see /bin/[ being executed either. Typically your shell will implement it as a built in command. Because as you say, spawning processes is expensive.
You should not overuse chaining of commands. It obscures individual exit status, it can fail in subtle ways (esp. with features like set -e or pipefail), eventually you'll need to refactor it to do something more complicated, and it can hide important behavior from the casual reader.
Parameter expansion is a great way to simplify your code, but it can be obtuse to the casual reader. Setting a default variable using DEFAULT="${DEFAULT:-myvalue}" is a bit easier to understand than just ${DEFAULT:=myvalue}.
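For comparison, both spellings in runnable form:

```shell
# Each leaves the variable set to "myvalue" when it was unset or empty,
# but := assigns as a side effect of the expansion itself (":" is a
# no-op command that discards the expanded value).
unset DEFAULT
DEFAULT="${DEFAULT:-myvalue}"   # explicit assignment; reads naturally
echo "$DEFAULT"

unset OTHER
: "${OTHER:=myvalue}"           # assignment hidden inside the expansion
echo "$OTHER"
```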
Read the dash manual and try to stick to just those features as it's almost entirely POSIX. Most scripts do not need to rely on bash functionality. https://linux.die.net/man/1/dash
If you find it will really simplify your life to have arrays, hashes/maps/dicts, a while loop reading from a subshell's output, etc, then use Bash and use those features. Otherwise, stick to POSIX semantics. You can do almost everything you need with parameter expansion, expr and other unix tools.
With any programming language, you should try to use the least amount of syntax and functionality possible to accomplish your goal, as long as it is readable, maintainable, and does not hide tricky behavior. Being verbose is always preferable to being hard to maintain; verbose and uncomplicated things can easily be simplified later.
Strongly disagree on elegance of [[ condition ]] || cmd and [[ condition ]] && cmd.
Every developer has read enough if statements to understand if then else at a glance. Few people write enough bash scripts to have that same instant parsing of [[ condition ]] && cmd or [[ condition ]] || cmd, especially when intermingled.
I like the idea of cmd || { raise error; } that others have mentioned, but I hate the idea of using these in place of an if statement to save 2 lines.
(or the equivalent for languages where the logical operators are written differently) are not bash-exclusive idioms; I've seen them in a number of expression-oriented languages (less frequently in statement-oriented languages, where the particular action is a function rather than a statement, but the fact that it isn't usable consistently in those languages tends to make it less idiomatic there.)
Sure, it's weird to write (`then` is a separate statement, it has to end with `fi`, that's different), but I prefer making things easier to read than write. I believe that it's easier to read that boilerplate even if you have to look up how to write it.
I guess I've been reading that construction long enough that it's obvious (at least to me). It's also helpful that the same construction is common in obfuscated and tightly-compiled Javascript...
It will also fail if there is a stdout, but echo cannot write to it, for instance because you are redirecting to a file and have run out of disk space.
I agree. Also, not sure what's being improved. I also would have expected to run into a reference of double and single brackets used in if statements, e.g. [[ ]] and [ ] -- which I always seem to find confusing.
[[ ]] is effectively an entirely different parser than the surrounding shell, similar to (( )).
so you can do things like
[[ 1 < 2 ]] && echo true    # < is a comparison here, not a redirect, as in (( 1 < 2 ))
[[ "x" == 'y' || x == x ]]  # note how || does not behave the same as in [ '' || 'y' ]
`man 1 test` explains the more conventional `[ ]`, whereas the bash man page better explains the `[[` case.
The `[` character is actually the program called "test"[1], so you can think of it as running an external binary just like grep or sed, whereas `[[` is a shell-specific builtin. This blog isn't using a "bash-ism" but the external test operator from `coreutils`, which generally works from sh, zsh, and others; on zsh, though, it doesn't work quite the same (really, it's that the equality operator doesn't work well with `[` but is fine with `[[`).
Here it is in pdksh:
# [ "Z" == "z" ] && echo Bob
# [ "Z" == "Z" ] && echo Bob
Bob
I chose pdksh on purpose - because _it_ also supports [[ (builtin like bash)[2], so while [[ is a "bash-ism" it's actually present in some/many other shells as well. However, the operational aspect is not identical between [ and [[ in the bash and pdksh implementations, and let's add zsh to show the same line of shell breaking:
# [[ "Z" == "Z" && "A" == "A" ]] && echo Bob
Bob
# [ "Z" == "Z" && "A" == "A" ] && echo Bob
pdksh: [: missing ]
# [[ "Z" == "Z" && "A" == "A" ]] && echo Bob
Bob
# [ "Z" == "Z" && "A" == "A" ] && echo Bob
-bash: [: missing `]'
# [[ "Z" == "Z" && "A" == "A" ]] && echo Bob
Bob
# [ "Z" == "Z" && "A" == "A" ] && echo Bob
zsh: = not found
The [ operator requires that you use the bash/pdksh method to combine them with && "outside the brackets" in a more POSIX like use, like so:
# if ([ "Z" == "Z" ] && [ "A" == "A" ]); then echo Bob; fi;
Bob
...except on zsh, because (surprise!) zsh has decided to internalize the [ command rather than use the external one (which works OK); but it also has a problem with the "=" sign being used like this[3] with the test operator. zsh requires we then handle the == sign by quoting it (or unsetting a value internally):
# if [ "Z" == "Z" ] && [ "A" == "A" ]; then echo Bob; fi;
zsh: = not found
# if [ "Z" '==' "Z" ] && [ "A" '==' "A" ]; then echo Bob; fi;
Bob
This last example is a subtle point - the use of [[ is actually _more_ compatible between bash/pdksh/zsh than the use of [ due to the way zsh handles the input which is different than bash/pdksh and even old school POSIX Bourne shell (/bin/sh).
This got kinda long sorry, hope it helps.
[1] there's actually a binary "[" and a binary "test" in the `coreutils` package on most systems, however I'm not sure why they have different binary file sizes to be honest
> Only if the readability check expression is truthful/succeeds, we want to source the file, so we use the && operator.
Almost. The real way to do this is to check for the non-existence of the file as the “success” case and do the action via a || on failure.
Otherwise if you run in a strict mode with error on any unhandled non-zero command (set -e), you’ll exit the script with failure when the profile doesn’t exist:
[[ ! -r ~/.profile ]] || . ~/.profile
Note that the if expression does not have this issue, as it doesn't trigger the error-on-exit handling. Only the && approach does.
cat ~/.profile && echo This is your profile || echo Failed to read profile
I think these are commonly called short-circuits. Using them with brackets too, to gain greater control or write more complex comparisons, is super useful if you weren't aware of it.
I'm no bash expert, just a sysadmin with a little dangerous knowledge, but constructing things that way just feels more natural (to me, at least, it's probably 'teaching your grandmother to suck eggs' to a proper bash hacker).
You can group with curly braces too. This can be especially useful for input/output redirection over a chunk of commands.
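For example, one redirection covering a whole group (the file here is just a scratch temp file):

```shell
# Group several commands with braces so a single redirection applies to
# all of them. Mind the POSIX syntax quirks: whitespace after "{" and a
# newline (or ";") before "}".
logfile=$(mktemp)
{
    echo "first line"
    echo "second line"
} > "$logfile"

wc -l < "$logfile"
```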
Be careful with using parenthesis -- this means you're now in a subshell! That "exit 1" will NOT terminate the entire script, but instead set "$?" to 1 for the next command. Additionally, any variable or environmental changes will be lost (which may be desirable) once you're outside of that block.
Thanks for the tip! I knew what I was doing was probably not right, but it worked when I needed it in the past (probably because of setting pipefail). It's very much welcome to know how to do it better :)
Curly braces are POSIX and work fine in dash. (Perhaps when testing you forgot these restrictions, which are the same as in bash: there must be whitespace after {, and a newline or semicolon before }.)
And also the same typo on the more general line just above what @anderskaseorg mentions (i.e. `{ expr }`), which should read `{ expr; }` (or line breaks instead of leading space / trailing ';', etc. Such quirky syntax.)
> [ $USER != "root" ] && echo You must be root && exit 1
I get that the author's point was more focused on the conditional usage, but this seems like a pretty bad way to check if the user is root, and checking EUID = 0 (or some of the other methods) isn't really any more complicated.
This is one of those articles written by someone who wants to make themselves look important but doesn't really understand the language/tool they are demonstrating. $_ shouldn't be used in scripts but is okay for an interactive session. Also he uses compound commands such as { cmd1 ; cmd2 } without the required semicolon: { cmd1 ; cmd2 ; }
I'm not entirely sure what I think of "x or y" replacing "if not x then y", but it's a common enough idiom that I don't have trouble reading or writing it.
I have tended to use it more recently for vertical compactness, and in python because pylint doesn't like single-line if statements but happily tolerates "x or y".
From personal messing around in the past, ||/&& generally don't seem as performant as using plain statements, by a fair margin (I'm guessing it's causing subshells to be spawned for whatever reason, but I didn't bother digging further)
and I'd call most of this discussion flow control, not conditionals. Strictly speaking, conditionals are one of the primary benefits of zsh over bash imo.