Pipe, standard input and command line arguments in Bash

grep "hehe" < test.sh

Input redirection works for a single file only, of course, whereas cat works for any number of input files.
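For instance (other.sh here is just a hypothetical second file):

grep "hehe" < test.sh                  # redirection: exactly one file on standard input
cat test.sh other.sh | grep "hehe"     # cat concatenates both files into the pipe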


Consider the notations:

grep "hehe" $(cat test.sh)
grep "hehe" `cat test.sh`

These are equivalent in this context; it is much easier to use the '$(cmd)' notation in nested uses, such as:

x=$(dirname $(dirname $(which gcc)))
x=`dirname \`dirname \\\`which gcc\\\`\``

(This gives you the base directory in which GCC is installed, in case you are wondering.)
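Expanding the nested substitution one step at a time makes it clearer (the paths shown are only illustrative; your installation may differ):

which gcc                              # /usr/local/bin/gcc
dirname /usr/local/bin/gcc             # /usr/local/bin
dirname /usr/local/bin                 # /usr/local
x=$(dirname $(dirname $(which gcc)))   # x is now /usr/local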

In the grep example, what happens is that the contents of test.sh are read and split into white-space separated words, and each such word is provided as an argument to grep. (grep, of course, never sees the double quotes around "hehe" – and they are not needed in this case; as a general rule, prefer single quotes to double quotes, especially around complex strings such as regular expressions, which often contain shell metacharacters.) grep therefore treats the words after the pattern as file names, and tries to open each one, usually failing dismally because the files do not exist. This is why the notation is not appropriate in this context.
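To make that concrete, suppose (purely hypothetically) that test.sh contains the single line echo hehe world. Then:

grep "hehe" $(cat test.sh)
# expands to:
grep "hehe" echo hehe world
# grep now looks for files named 'echo', 'hehe' and 'world',
# and reports 'No such file or directory' for each one that does not exist.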


After revisiting the question, there is more that could be said that hasn’t already been said.

First off, many Unix commands are designed to work as ‘filters’; they read input from some files, transform it in some way, and write the result onto standard output. Such commands are designed for use within command pipelines. Examples include:

  • cat
  • grep
  • troff and relatives
  • awk (with caveats)
  • sed
  • sort

All these filters have the same general behaviour: they take command line options to control their behaviour, and then they either read the files specified as command line arguments or, if there are no such arguments, they read their standard input. Some (like sort) can have options to control where their output goes instead of standard output, but that is relatively uncommon.
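As a sketch of that convention (names.txt and sorted.txt are hypothetical file names):

sort -u names.txt                  # file named on the command line
sort -u < names.txt                # no file arguments, so sort reads standard input
cat names.txt | sort -u            # same again, with the input arriving through a pipe
sort -u -o sorted.txt names.txt    # the less common case: an option (-o) redirects the output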

There are a few pure filters – tr is one such – that strictly read standard input and write to standard output.
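You can see the difference by trying to hand tr a file name (notes.txt is hypothetical):

tr 'a-z' 'A-Z' < notes.txt         # works: tr reads its standard input
tr 'a-z' 'A-Z' notes.txt           # fails: tr accepts no file name operands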

Other commands have different behaviours. Eric Raymond provides a taxonomy of command types in “The Art of UNIX Programming”.

Some commands generate lists of file names on standard output – the two classics are ls and find.

Sometimes, you want to apply the output from a file name generator as command line arguments for a filter. There’s a program that does that automatically – it is xargs.

Classically, you would use:

find . -name '*.[chyl]' | xargs grep -n magic_name /dev/null

This would generate a complete list of files with the extensions '.c', '.h', '.y' and '.l' (C source, headers, Yacc and Lex files). As the list is read by xargs, it creates command lines with grep -n magic_name /dev/null at the start and each white-space separated word as a following argument (batching as many names as will fit on one command line). The /dev/null guarantees that grep always sees at least two file names, so every match is prefixed with the name of the file it came from.
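You can watch xargs assemble those command lines by putting echo in front of the command it runs (the file names here are made up):

printf '%s\n' main.c util.h parser.y lexer.l | xargs echo grep -n magic_name /dev/null
# prints: grep -n magic_name /dev/null main.c util.h parser.y lexer.l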

In the old days, Unix file names didn’t include spaces. Under the influence of Mac and Windows, such spaces are now commonplace. The GNU versions of find and xargs have complementary options to deal with this problem:

find . -name '*.[chyl]' -print0 | xargs -0 grep -n magic_name /dev/null

The '-print0' option means “print file names terminated by a NUL '\0'” (because the only characters that cannot appear in a (simple) file name are '/' and NUL, and obviously, '/' can appear in path names). The corresponding '-0' tells xargs to look for NUL-terminated names instead of space-separated names.
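To see why this matters, imagine a (hypothetical) source file called my notes.c, with a space in its name:

find . -name '*.c' | xargs grep -n magic_name /dev/null
# xargs splits the name into './my' and 'notes.c', two files that do not exist

find . -name '*.c' -print0 | xargs -0 grep -n magic_name /dev/null
# the NUL-terminated name './my notes.c' reaches grep intact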
