The article is dangerously wrong in its discussion of IFS.
What you should do to avoid the problem of mishandling spaces is use proper quoting (for i in "$@"; do ...), not changing IFS; setting IFS to \n\t will still break embedded tabs and newlines.
In general, in bash scripts any use of $ should always be between double quotes unless you have a reason to do otherwise.
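A quick illustration of the difference (a throwaway sketch, not anything from the article):

set -- "one two" "three"
for i in $@;   do echo "unquoted: [$i]"; done   # three iterations: one, two, three
for i in "$@"; do echo "quoted:   [$i]"; done   # two iterations: "one two" and "three"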
Seconded. It is quite off the mark. This will break code which depends on splitting, like when you have some variable called FOO_FLAG which contains "--blah arg" that's supposed to expand to two arguments. Observing proper quoting is the way (except for internal data representations that you can guarantee not to have spaces).
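(If you control that code, a bash array avoids relying on word splitting at all; a rough sketch with invented names:)

foo_flags=(--blah "arg")
some_command() { printf 'arg: [%s]\n' "$@"; }   # stand-in for whatever consumes the flags
some_command "${foo_flags[@]}"                  # expands to exactly two arguments: --blah and arg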
And also, the newline and tab are not explained! What is with that?
"We don't want accidental field splitting of interpolated expansions on spaces, ... but we do want it on embedded tabs or newlines?"
Huh?
If you don't want field splitting, set IFS to empty! (And then you don't need the dollar sign Bash extension for \t and \n):
$ VAR="a b c d"
$ for x in $VAR ; do echo $x ; done
a
b
c
d
$ IFS='' ; for x in $VAR ; do echo $x ; done
a b c d
No splitting on anything: not spaces, tabs or newlines!
Agreed. In addition to the remaining trouble with tabs and newlines, setting IFS still leaves the other big problem with unquoted variables: unexpected expansion of wildcards. The shell treats any unquoted string that contains *, ?, or [ as a glob expression, and will replace it with a list of matching files. This can cause some really strange bugs.
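A minimal sketch of that failure mode, assuming nothing more than a pattern stored in a variable:

pattern='*.c'
echo $pattern     # may print a list of .c files from the current directory, or *.c if none match
echo "$pattern"   # always prints *.c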
Also, an unquoted variable that happens to be null will essentially vanish from the argument list of any command it's used with, which can cause another class of weird bugs. Consider the shell statement:
if [ -n $var ]; then
... which looks like it should execute the condition if $var is nonblank, but in fact will execute it even if $var is blank (the reason is complex, I'll leave it as a puzzle for the reader).
Setting IFS is a crutch that only partly solves the problem; putting double-quotes around variable references fully solves it.
The test command has certain rules depending on the number of arguments.
The most pertinent rule is:
For one argument, the expression is true if, and only if, the argument is not null.
In this case
[ -n $var ]
is the same as
test -n $var
$var is not quoted, so when this command is run, word splitting occurs; since $var is empty, it vanishes from the argument list entirely and test is left with the single argument -n. That falls under the one-argument rule above. Therefore, always quote your variables.
> the reason is complex, I'll leave it as a puzzle for the reader
[ -n ]
is the same as
test -n
In this case -n has no argument, so it cannot be parsed as "-n STRING", instead it is parsed as "STRING", where STRING is "-n", with the behaviour "True if string is not empty.".
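A quick demonstration of both cases, with var empty:

var=""
[ -n $var ]   && echo "unquoted test says non-empty"   # runs: test only saw the string -n
[ -n "$var" ] && echo "quoted test says non-empty"     # does not run: -n sees an empty string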
A loop like `for filename in *.txt; do ...; done` has no issue with spaces in filenames. If *.txt matches "foo bar.txt", then that's what the filename variable is set to. In the body of the loop you have to make sure you have "$filename".
You don't need to play games with IFS to correctly process filesystem entry names expanded from a pattern.
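In other words, something like this is already safe without touching IFS (a minimal sketch):

for filename in *.txt; do
    printf 'found: [%s]\n' "$filename"   # quoting in the body is the only thing to remember
done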
Wildcards are not variables. Wildcards don't get expanded in quotes. Variables get expanded in double quotes but not single quotes. $@ obeys all the same expansion rules as all other variables. Command substitution with both $() and `` follow the same rules as variables.
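A few throwaway lines to make those rules concrete:

star='*'
echo "$star"        # the variable expands inside double quotes, and the * is not globbed
echo '$star'        # nothing expands inside single quotes: prints $star literally
echo "$(echo hi)"   # command substitution follows the same quoting rules: prints hi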
> Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?
> set -e was an attempt to add "automatic error detection" to the shell. Its goal was to cause the shell to abort any time an error occurred, so you don't have to put || exit 1 after each important command. That goal is non-trivial, because many commands intentionally return non-zero.
http://mywiki.wooledge.org/BashFAQ/112:
> What are the advantages and disadvantages of using set -u (or set -o nounset)?
> Bash (like all other Bourne shell derivatives) has a feature activated by the command set -u (or set -o nounset). When this feature is in effect, any command which attempts to expand an unset variable will cause a fatal error (the shell immediately exits, unless it is interactive).
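Roughly what that looks like in practice (the variable name here is invented):

set -u
# echo "$undefined_var"     # would abort a script with an "unbound variable" error
echo "${undefined_var:-}"   # fine: the :- default makes it expand to the empty string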
pipefail is not quite as bad, but is nevertheless incompatible with most other shells.
grep some-string /some/file | sort
is a good example of why -e and pipefail are dangerous. grep will return an error status if it gets an error (e.g. file not found) or if it simply fails to find any matches. With -e and pipefail, this command will terminate the script if there happen to be no matches, so you have to use something like || true at the end... which completely breaks the exit-on-error behavior that was the point of the exercise.
Solution: do proper error checking.
> GreyCat's personal recommendation is simple: don't use set -e. Add your own error checking instead.
> rking's personal recommendation is to go ahead and use set -e, but beware of possible gotchas. It has useful semantics, so to exclude it from the toolbox is to give into FUD.
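For what it's worth, a rough sketch of what that explicit checking can look like for the grep | sort pipeline above (same placeholder pattern and path; assumes set -e is not in effect). grep's own status is inspected, so "no matches" (1) is treated differently from a real error (2 or higher):

matches=$(grep some-string /some/file)
status=$?

if [ "$status" -ge 2 ]; then
    echo "grep failed with status $status" >&2
    exit 1
fi

if [ -n "$matches" ]; then
    printf '%s\n' "$matches" | sort
fi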
To be honest, I think traditional shells are now only good for environments where you know you aren’t doing anything too weird and all the most likely inputs work as expected without a lot of effort. This is spending time wisely; just because everything except a NUL byte can technically be part of a Unix filename doesn’t mean that I want to invest hours or days making damn sure everything works for pathological cases.
If I actually do want to guard against every case imaginable, I immediately switch to Python or some other language that at least knows how to quote things unambiguously without a lot of effort.
Shell is a lot better than the other languages I know at many tasks involving I/O redirection, spawning programs, etc. It's kind of arcane, but so are the APIs for doing that stuff in other scripting languages. I'm eagerly awaiting some new contender in the system scripting language arena though.
Fails to mention what is in my opinion the most devious, subtle potential pitfall with `set -e`: assigning (or even just a bare evaluation of) an arithmetic zero. `foo=0` won't do anything surprising, but `let foo=0` will return 1, and thus abort your script if you're not careful.
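A sketch of the pitfall and the usual ways around it:

set -e
# let foo=0             # arithmetic value is 0, so let returns 1: the script would die right here
foo=0                   # plain assignment always returns 0
let foo=0 || true       # or explicitly tolerate the non-zero status
(( bar = 0 )) || true   # (( )) has the same gotcha as let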
Also, as an alternative to the proposed `set +e; ...; set -e` wrapper for retrieving the exit status of something expected to exit non-zero (generally cleaner in my opinion, if slightly "clever"):
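Presumably something along these lines; a sketch of the common idiom rather than the exact snippet meant here:

set -e
might_fail() { return 3; }   # placeholder for a command that is expected to exit non-zero
rc=0
might_fail || rc=$?          # a failure on the left-hand side of || does not trip set -e
echo "might_fail exited with status $rc"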
I still don't get why bash (or zsh) doesn't try to integrate more Korn shell (88 & 93) scripting features. But there the focus seems to be more on more colorful prompts and autocompletion handholding…
And even despite more free licenses (AFAIR, IANAL), you can't depend on actual Korn shells being available on Unices. At least the dependent app situation has been getting a lot better, mostly by the death of workstations and their proprietary OSs (try depending on almost any grep/awk/sed option/switch when it has to run on Solaris/AIX/HP-UX).
Although "all the world's a GNU/Linux" seems the new plague upon our lands here…
So after all these years, I'd say we're still in pretty much the same situation that birthed Perl. Which still would be my preferred choice if I'd actually have to distribute scripts and we're not talking about my own private, context-specific shortcuts, scripts and functions.
http://robertmuth.blogspot.com/2012/08/better-bash-scripting...
what the author is doing is like this in Python:
https://github.com/koalaman/shellcheck
and `brew install shellcheck`