The article is dangerously wrong in its discussion of IFS.
What you should do to avoid the problem of mishandling spaces is use proper quoting (for i in "$@"; do ...), not changing IFS; setting IFS to \n\t will still break embedded tabs and newlines.
In general, in bash scripts any use of $ should always be between double quotes unless you have a reason to do otherwise.
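A quick illustration of the difference (a throwaway sketch, not anything from the article):

set -- "one two" "three"
for i in $@;   do echo "unquoted: [$i]"; done   # three iterations: one, two, three
for i in "$@"; do echo "quoted:   [$i]"; done   # two iterations: "one two" and "three"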
Seconded. It is quite off the mark. This will break code which depends on splitting, like when you have some variable called FOO_FLAG which contains "--blah arg" that's supposed to expand to two arguments. Observing proper quoting is the way (except for internal data representations that you can guarantee not to have spaces).
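(If you control that code, a bash array avoids relying on word splitting at all; a rough sketch with invented names:)

foo_flags=(--blah "arg")
some_command() { printf 'arg: [%s]\n' "$@"; }   # stand-in for whatever consumes the flags
some_command "${foo_flags[@]}"                  # expands to exactly two arguments: --blah and arg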
And also, the newline and tab are not explained! What is with that?
"We don't want accidental field splitting of interpolated expansions on spaces, ... but we do want it on embedded tabs or newlines?"
Huh?
If you don't want field splitting, set IFS to empty! (And then you don't need the dollar sign Bash extension for \t and \n):
$ VAR="a b c d"
$ for x in $VAR ; do echo $x ; done
a
b
c
d
$ IFS='' ; for x in $VAR ; do echo $x ; done
a b c d
No splitting on anything: not spaces, tabs or newlines!
Agreed. In addition to the remaining trouble with tabs and newlines, setting IFS still leaves the other big problem with unquoted variables: unexpected expansion of wildcards. The shell treats any unquoted string that contains *, ?, or [ as a glob expression, and will replace it with a list of matching files. This can cause some really strange bugs.
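A minimal sketch of that failure mode, assuming nothing more than a pattern stored in a variable:

pattern='*.c'
echo $pattern     # may print a list of .c files from the current directory, or *.c if none match
echo "$pattern"   # always prints *.c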
Also, an unquoted variable that happens to be null will essentially vanish from the argument list of any command it's used with, which can cause another class of weird bugs. Consider the shell statement:
if [ -n $var ]; then
... which looks like it should execute the condition if $var is nonblank, but in fact will execute it even if $var is blank (the reason is complex, I'll leave it as a puzzle for the reader).
Setting IFS is a crutch that only partly solves the problem; putting double-quotes around variable references fully solves it.
The test command has certain rules depending on the number of arguments.
The most pertinent rule is:
For one argument, the expression is true if, and only if, the argument is not null.
In this case
[ -n $var ]
is the same as
test -n $var
$var is not quoted, so when this command is run, word splitting occurs; since $var is empty, it vanishes from the argument list entirely and test is left with the single argument -n. That falls under the one-argument rule above. Therefore, always quote your variables.
> the reason is complex, I'll leave it as a puzzle for the reader
[ -n ]
is the same as
test -n
In this case -n has no argument, so it cannot be parsed as "-n STRING", instead it is parsed as "STRING", where STRING is "-n", with the behaviour "True if string is not empty.".
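A quick demonstration of both cases, with var empty:

var=""
[ -n $var ]   && echo "unquoted test says non-empty"   # runs: test only saw the string -n
[ -n "$var" ] && echo "quoted test says non-empty"     # does not run: -n sees an empty string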
A loop like `for filename in *.txt; do ...; done` has no issue with spaces in filenames. If *.txt matches "foo bar.txt", then that's what the filename variable is set to. In the body of the loop you have to make sure you have "$filename".
You don't need to play games with IFS to correctly process filesystem entry names expanded from a pattern.
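In other words, something like this is already safe without touching IFS (a minimal sketch):

for filename in *.txt; do
    printf 'found: [%s]\n' "$filename"   # quoting in the body is the only thing to remember
done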
Wildcards are not variables. Wildcards don't get expanded in quotes. Variables get expanded in double quotes but not single quotes. $@ obeys all the same expansion rules as all other variables. Command substitution with both $() and `` follow the same rules as variables.
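A few throwaway lines to make those rules concrete:

star='*'
echo "$star"        # the variable expands inside double quotes, and the * is not globbed
echo '$star'        # nothing expands inside single quotes: prints $star literally
echo "$(echo hi)"   # command substitution follows the same quoting rules: prints hi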
> Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?
> set -e was an attempt to add "automatic error detection" to the shell. Its goal was to cause the shell to abort any time an error occurred, so you don't have to put || exit 1 after each important command. That goal is non-trivial, because many commands intentionally return non-zero.
http://mywiki.wooledge.org/BashFAQ/112:
> What are the advantages and disadvantages of using set -u (or set -o nounset)?
> Bash (like all other Bourne shell derivatives) has a feature activated by the command set -u (or set -o nounset). When this feature is in effect, any command which attempts to expand an unset variable will cause a fatal error (the shell immediately exits, unless it is interactive).
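Roughly what that looks like in practice (the variable name here is invented):

set -u
# echo "$undefined_var"     # would abort a script with an "unbound variable" error
echo "${undefined_var:-}"   # fine: the :- default makes it expand to the empty string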
pipefail is not quite as bad, but is nevertheless incompatible with most other shells.
grep some-string /some/file | sort
is a good example of why -e and pipefail are dangerous. grep will return an error status if it gets an error (e.g. file not found) or if it simply fails to find any matches. With -e and pipefail, this command will terminate the script if there happen to be no matches, so you have to use something like || true at the end... which completely breaks the exit-on-error behavior that was the point of the exercise.
Solution: do proper error checking.
> GreyCat's personal recommendation is simple: don't use set -e. Add your own error checking instead.
> rking's personal recommendation is to go ahead and use set -e, but beware of possible gotchas. It has useful semantics, so to exclude it from the toolbox is to give into FUD.
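For what it's worth, a rough sketch of what that explicit checking can look like for the grep | sort pipeline above (same placeholder pattern and path; assumes set -e is not in effect). grep's own status is inspected, so "no matches" (1) is treated differently from a real error (2 or higher):

matches=$(grep some-string /some/file)
status=$?

if [ "$status" -ge 2 ]; then
    echo "grep failed with status $status" >&2
    exit 1
fi

if [ -n "$matches" ]; then
    printf '%s\n' "$matches" | sort
fi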
To be honest, I think traditional shells are now only good for environments where you know you aren’t doing anything too weird and all the most likely inputs work as expected without a lot of effort. This is spending time wisely; just because everything except a NUL byte can technically be part of a Unix filename doesn’t mean that I want to invest hours or days making damn sure everything works for pathological cases.
If I actually do want to guard against every case imaginable, I immediately switch to Python or some other language that at least knows how to quote things unambiguously without a lot of effort.
Shell is a lot better than the other languages I know at many tasks involving I/O redirection, spawning programs, etc. It's kind of arcane, but so are the APIs for doing that stuff in other scripting languages. I'm eagerly awaiting some new contender in the system scripting language arena though.
Fails to mention what is in my opinion the most devious, subtle potential pitfall with `set -e`: assigning (or even just a bare evaluation of) an arithmetic zero. `foo=0` won't do anything surprising, but `let foo=0` will return 1, and thus abort your script if you're not careful.
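A sketch of the pitfall and the usual ways around it:

set -e
# let foo=0             # arithmetic value is 0, so let returns 1: the script would die right here
foo=0                   # plain assignment always returns 0
let foo=0 || true       # or explicitly tolerate the non-zero status
(( bar = 0 )) || true   # (( )) has the same gotcha as let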
Also, as an alternative to the proposed `set +e; ...; set -e` wrapper for retrieving the exit status of something expected to exit non-zero (generally cleaner in my opinion, if slightly "clever"):
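Presumably something along these lines; a sketch of the common idiom rather than the exact snippet meant here:

set -e
might_fail() { return 3; }   # placeholder for a command that is expected to exit non-zero
rc=0
might_fail || rc=$?          # a failure on the left-hand side of || does not trip set -e
echo "might_fail exited with status $rc"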
I still don't get why bash (or zsh) doesn't try to integrate more Korn shell (88 & 93) scripting features. But there the focus seems to be more on more colorful prompts and autocompletion handholding…
And even despite more free licenses (AFAIR, IANAL), you can't depend on actual Korn shells being available on Unices. At least the dependent app situation has been getting a lot better, mostly by the death of workstations and their proprietary OSs (try depending on almost any grep/awk/sed option/switch when it has to run on Solaris/AIX/HP-UX).
Although "all the world's a GNU/Linux" seems the new plague upon our lands here…
So after all these years, I'd say we're still in pretty much the same situation that birthed Perl. Which still would be my preferred choice if I'd actually have to distribute scripts and we're not talking about my own private, context-specific shortcuts, scripts and functions.
http://robertmuth.blogspot.com/2012/08/better-bash-scripting...
what the author is doing is like this in Python:
https://github.com/koalaman/shellcheck
and `brew install shellcheck`