What is the general syntax of a Unix shell command?

These days, the POSIX convention implemented by getopt() (aka getopt(3)) is widely used as the standard notation, but in the early days, people were experimenting. On some machines, the sort command no longer supports the old + notation. However, various commands (notably ar and tar) accept controls without any prefix character, and dd (alluded to by Alok in a comment) uses another convention altogether.

The GNU convention of using ‘--’ for long options (supported by getopt_long(3)) replaced an earlier convention of using ‘+’. Of course, the X11 software uses a single dash before multi-character options. So, the whole thing is a collection of historic relics left behind as people experimented with how best to handle options.

POSIX documents the Utility Conventions that it works to, except where historical precedent is stronger.


What styles of option handling are there?

[At one time, SO 367309 contained the following material as my answer. The question was originally asked 2008-12-15 02:02 by FerranB, but was subsequently closed and deleted.]

How many different types of options do you recognize? I can think of
many, including:

  • Single-letter options preceded by single dash, groupable when there is
    no argument, argument can be attached to option letter or in next
    argument (many, many Unix commands; most POSIX commands).
  • Single-letter options preceded by single dash, grouping not allowed,
    arguments must be attached (RCS).
  • Single-letter options preceded by single dash, grouping not allowed,
    arguments must be separate (pre-POSIX SCCS, IIRC).
  • Multi-letter options preceded by single dash, arguments may be
    attached or in next argument (X11 programs; also Java and many programs on Mac OS X with a NeXTSTEP heritage).
  • Multi-letter options preceded by single dash, may be abbreviated
    (Atria Clearcase).
  • Multi-letter options preceded by single plus (obsolete).
  • Multi-letter options preceded by double dash; arguments may follow ‘=’
    or be separate (GNU utilities).
  • Options without prefix/suffix, some names have abbreviations or are
    implied, arguments must be separate (AmigaOS Shell).
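As a rough illustration of the first style in the list (groupable single-letter options, with arguments either attached or separate), here is a minimal sketch using the POSIX shell built-in getopts. The function name parse_opts and the option letters -a, -b and -o are hypothetical, chosen only for this example.

```shell
#!/bin/sh
# Hypothetical parser: -a and -b are flags, -o takes an argument.
parse_opts() {
    aflag=0 bflag=0 outfile=
    OPTIND=1                        # reset so the function can be called repeatedly
    while getopts 'abo:' opt; do
        case $opt in
            a) aflag=1 ;;
            b) bflag=1 ;;
            o) outfile=$OPTARG ;;   # works for both -oFILE and -o FILE
            *) return 2 ;;          # getopts has already reported the error
        esac
    done
    shift $((OPTIND - 1))           # discard the options, leaving the operands
    echo "a=$aflag b=$bflag o=$outfile operands=$*"
}
```

With this sketch, parse_opts -ab -oout.txt file1 and parse_opts -a -b -o out.txt file1 both report a=1 b=1 o=out.txt operands=file1.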

For options taking an optional argument, sometimes the argument must be attached (co -p1.3 rcsfile.c),
sometimes it must follow an ‘=’ sign. POSIX doesn’t support optional
arguments meaningfully (the POSIX getopt() only allows them for the last
option on the command line).

All sensible option systems use an option consisting of double-dash
(‘--’) alone to mean “end of options”: the following arguments are
“non-option arguments” (usually file names; POSIX calls them ‘operands’)
even if they start with a dash. (I regard supporting this notation as
an imperative. Be aware that if the -- is preceded by an option requiring
an argument, the -- will be treated as the argument to the option, not
as the ‘end of options’ marker.)
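Both behaviours of the double-dash can be seen in a small sketch built on the POSIX shell built-in getopts. The function show_args and its options (-v as a flag, -o taking an argument) are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Hypothetical parser: -v is a flag, -o takes an argument.
show_args() {
    verbose=0 outfile=
    OPTIND=1                        # reset so the function can be called repeatedly
    while getopts 'vo:' opt; do
        case $opt in
            v) verbose=1 ;;
            o) outfile=$OPTARG ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))
    echo "v=$verbose o=$outfile operands=$*"
}

# "--" ends the options, so "-o" and "file" are both operands.
show_args -v -- -o file

# "--" follows an option that requires an argument, so it is consumed
# as that argument and "-v" is still parsed as an option.
show_args -o -- -v file
```

The first call leaves o empty and treats -o as an ordinary operand; the second stores -- as the value of -o.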

Many but not all programs accept single dash as a file name to mean
standard input (usually) or standard output (occasionally). Sometimes,
as with GNU ‘tar’, both can be used in a single command line:

... | tar -cf - -T - | ...

The first solo dash means ‘write to stdout’; the second means ‘read file
names from stdin’.
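A script that wants to honour the same convention for its own operands might do something like the following. This is only a sketch: count_lines is a hypothetical helper, and it handles the ‘dash means standard input’ case only.

```shell
#!/bin/sh
# Hypothetical helper: report the line count of each operand,
# treating a solo "-" as standard input rather than a file name.
count_lines() {
    for f in "$@"; do
        if [ "$f" = "-" ]; then
            n=$(wc -l)              # no file operand: wc reads standard input
        else
            n=$(wc -l < "$f")       # redirect so wc prints only the number
        fi
        echo "$f: $((n))"           # arithmetic expansion strips any padding
    done
}
```

For example, printf 'a\nb\n' | count_lines - reports a count of 2 for the operand -.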

Some programs use other conventions — that is, options not preceded by a
dash. Many of these are from the oldest days of Unix. For example,
‘tar’ and ‘ar’ both accept options without a dash, so:

tar cvzf /tmp/somefile.tgz some/directory

The dd command uses opt=value exclusively:

dd if=/some/file of=/another/file bs=16k count=200
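Handling this name=value style in a shell script needs no option parser at all; a case statement over the arguments is enough. The sketch below is not how dd itself is implemented. The function parse_kv is hypothetical, and it recognizes only three of dd's operand names.

```shell
#!/bin/sh
# Hypothetical dd-like parser: every argument must have the form name=value.
parse_kv() {
    infile= outfile= bs=512         # default block size is an assumption here
    for arg in "$@"; do
        case $arg in
            if=*) infile=${arg#if=} ;;      # strip the "if=" prefix
            of=*) outfile=${arg#of=} ;;
            bs=*) bs=${arg#bs=} ;;
            *)    echo "unknown operand: $arg" >&2
                  return 2 ;;
        esac
    done
    echo "if=$infile of=$outfile bs=$bs"
}
```

Because each argument names itself, the operands can appear in any order, which is one practical difference from positional styles.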

Some programs allow you to interleave options and other arguments
completely; the C compiler, make and the GNU utilities run without
POSIXLY_CORRECT in the environment are examples. Many programs expect
the options to precede the other arguments.

Note that git and other VCS commands often use a hybrid system:

git commit -m 'This is why it was committed'

Here, a sub-command is one of the arguments. Often, there are optional ‘global’ options that can be specified between the command and the sub-command. POSIX has examples of this: the sccs command is in this category, and you can argue that some of the other commands that run other commands are too. From POSIX, nice and xargs spring to mind; sudo is a non-POSIX example, as are svn and cvs.
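One way to sketch this hybrid layout in a shell script is to run getopts twice: once for the global options, and again for the sub-command's own options. Everything named here is hypothetical (a tool called vcs, a global -q option, and a commit sub-command taking -m); it mimics the shape of the git example rather than any real tool's behaviour.

```shell
#!/bin/sh
# Hypothetical tool: vcs [-q] subcommand [subcommand options] [operands]
vcs() {
    quiet=0
    OPTIND=1
    while getopts 'q' opt; do       # global options stop at the sub-command name
        case $opt in
            q) quiet=1 ;;
            *) return 2 ;;
        esac
    done
    shift $((OPTIND - 1))

    [ $# -gt 0 ] || { echo "usage: vcs [-q] subcommand [options]" >&2; return 2; }
    subcmd=$1
    shift

    case $subcmd in
        commit)
            message=
            OPTIND=1                # restart option parsing for the sub-command
            while getopts 'm:' opt; do
                case $opt in
                    m) message=$OPTARG ;;
                    *) return 2 ;;
                esac
            done
            shift $((OPTIND - 1))
            echo "commit quiet=$quiet message=$message files=$*"
            ;;
        *)
            echo "vcs: unknown sub-command: $subcmd" >&2
            return 2
            ;;
    esac
}
```

The first loop stops at the sub-command name because it does not begin with a dash, which is what keeps the global options and the sub-command options separate.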


I don’t have strong preferences between the different systems. When
there are few enough options, then single letters with mnemonic value
are convenient. GNU supports this, but recommends backing it up with
multi-letter options preceded by a double-dash.

There are some things I do object to. One of the worst is the same
option letter being used with different meanings depending on what other
option letters have preceded it. In my book, that’s a no-no, but I know
of software where it is done.

Another objectionable behaviour is inconsistency in style of handling
arguments (especially for a single program, but also within a suite of
programs). Either require attached arguments or require detached
arguments (or allow either), but do not have some options requiring an
attached argument and others requiring a detached argument. And be
consistent about whether ‘=’ may be used to separate the option and
the argument.

As with many, many (software-related) things — consistency is more
important than the individual decisions. Using tools that automate
and standardize the argument processing helps with consistency.


Whatever you do, please read TAOUP’s Command-Line Options and
consider Standards for Command Line Interfaces. (Added by J F
Sebastian; thanks, I agree.)
