Option Parsing On a Budget
Recently I was writing a little code generation utility which took lots of
positional arguments. I wanted to add two optional features to this utility,
both controlled by options that take no arguments. I decided to use getopt but
realised that this would make the code depend on POSIX. I liked the idea of
staying dependency free, so I quickly investigated really simple solutions for
option parsing (without compromises) which would be equivalent to POSIX and GNU
getopt.
The first iteration of the code used getopt. This was some pretty standard
getopt code, portable to all systems which implement the basic POSIX getopt.
while (c = getopt(argc, argv, "dl"), c != -1) {
    switch (c) {
    case 'd': des_init = true; break;
    case 'l': comp_lit = true; break;
    case '?': usage();
    default: assert("Option not implemented" == NULL);
    }
}
My first attempt at replacing this code looked similar to the following:
int opti;
for (opti = 1; argv[opti] != NULL && argv[opti][0] == '-'; opti++) {
    if (strcmp(argv[opti], "--") == 0) {
        opti++;
        break;
    }
    for (const char *opt = &argv[opti][1]; *opt != '\0'; opt++) {
        switch (*opt) {
        case 'd': des_init = true; break;
        case 'l': comp_lit = true; break;
        default:
            fprintf(stderr, "%s: unknown option -- %c\n", argv0, *opt);
            usage();
        }
    }
}
This replacement was POSIX getopt compliant in that it parsed options until it
hit -- or until the first non-option argument. It was twice as long as the
getopt version but meant that the code no longer relied on POSIX. The opti
variable had the same purpose as optind in getopt style code.
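As a rough sketch of how opti stands in for optind, the loop can be wrapped in a
function that returns the index of the first positional argument. The names
parse_leading_opts and print_positionals are made up for this illustration, and
returning -1 instead of calling usage() is an assumption made only to keep the
example self-contained.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: parse leading -d / -l flags, stopping at "--" or at
 * the first argument that does not start with '-'.
 * Returns the index of the first positional argument (the role optind plays
 * in getopt style code), or -1 if an unknown option was seen.
 */
int parse_leading_opts(char **argv, bool *des_init, bool *comp_lit)
{
    int opti;

    assert(argv != NULL && argv[0] != NULL);
    for (opti = 1; argv[opti] != NULL && argv[opti][0] == '-'; opti++) {
        if (strcmp(argv[opti], "--") == 0) {
            opti++; /* skip the terminator itself */
            break;
        }
        for (const char *opt = &argv[opti][1]; *opt != '\0'; opt++) {
            switch (*opt) {
            case 'd': *des_init = true; break;
            case 'l': *comp_lit = true; break;
            default:
                fprintf(stderr, "%s: unknown option -- %c\n", argv[0], *opt);
                return -1;
            }
        }
    }
    return opti;
}

/* Print every positional argument, starting where option parsing stopped. */
void print_positionals(char **argv, int opti)
{
    for (int i = opti; argv[i] != NULL; i++)
        printf("positional: %s\n", argv[i]);
}
```

Called with the arguments "-dl -- -x file", this should set both flags and
return 3, so the positional arguments are "-x" and "file".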
I would have been happy with this version, but I noticed that my program did not
actually permit any positional arguments beginning with -, which meant options
could safely be mixed in among them, and I was also up for the challenge. That
being said, I don't think handling mixed arguments is an essential feature.
The final version, after a few unreadable iterations, ended up being only 21
lines long. This version handles mixed positional arguments and options by
relying on the C standard, which allows modification of argv. Additionally,
this version made the code which followed it more readable than the getopt
version. It really seems like a win-win.
bool opts_end = false;
argc = 0;
for (int i = 1; argv[i] != NULL; i++) {
    if (opts_end || argv[i][0] != '-') {
        argv[argc++] = argv[i];
        continue;
    }
    if (strcmp(argv[i], "--") == 0) {
        opts_end = true;
        continue;
    }
    for (const char *opt = &argv[i][1]; *opt != '\0'; opt++) {
        switch (*opt) {
        case 'd': des_init = true; break;
        case 'l': comp_lit = true; break;
        default:
            fprintf(stderr, "%s: unknown option -- %c\n", argv0, *opt);
            usage();
        }
    }
}
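To make the effect on argv concrete, here is the same loop wrapped in a function,
as a sketch rather than the exact code from the utility. The name
parse_mixed_opts, the argv0 parameter, and the -1 return on an unknown option
(in place of usage()) are assumptions for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical wrapper around the loop above. Positional arguments are
 * compacted to the front of argv, starting at index 0, so argv[0] is
 * overwritten and the program name must be passed separately as argv0.
 * Returns the number of positional arguments, or -1 on an unknown option.
 */
int parse_mixed_opts(char **argv, const char *argv0,
                     bool *des_init, bool *comp_lit)
{
    bool opts_end = false;
    int argc = 0;

    assert(argv != NULL && argv[0] != NULL);
    for (int i = 1; argv[i] != NULL; i++) {
        if (opts_end || argv[i][0] != '-') {
            argv[argc++] = argv[i]; /* keep positional argument */
            continue;
        }
        if (strcmp(argv[i], "--") == 0) {
            opts_end = true; /* everything after this is positional */
            continue;
        }
        for (const char *opt = &argv[i][1]; *opt != '\0'; opt++) {
            switch (*opt) {
            case 'd': *des_init = true; break;
            case 'l': *comp_lit = true; break;
            default:
                fprintf(stderr, "%s: unknown option -- %c\n", argv0, *opt);
                return -1;
            }
        }
    }
    return argc;
}
```

For the arguments "-d a -l b -- -c" this should return 3 and leave "a", "b" and
"-c" in argv[0] to argv[2]. Entries from index 3 onwards are stale leftovers and
argv[3] is not NULL, so code that follows should loop up to the returned count
rather than until a NULL pointer.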
That being said, implementing mixed options and non-options could be considered
a misfeature. It can cause unexpected problems more often than it solves them.
Additionally, although GNU getopt does this, modifying argv is considered by
some to be a bit of a dirty trick. But, as mentioned before, positional
arguments in this particular codebase could not start with a hyphen, and
implementing this feature seemed like a fun task.
Obviously this code does not handle option arguments; that's because I didn't
have a need for those. Had I needed option arguments, I would likely have gone
with arg.h or getopt.
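For comparison, this is roughly what the getopt route looks like once an option
takes an argument. It is only a sketch: the -o option, the settings struct and
the parse_with_getopt name are hypothetical and not part of the utility, and an
unknown option returns -1 here instead of calling usage().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical settings; the output field exists only for this example. */
struct settings {
    bool des_init;
    bool comp_lit;
    const char *output;
};

/*
 * Parse -d, -l and a hypothetical -o <file> using POSIX getopt.
 * Returns optind (the index of the first positional argument),
 * or -1 if getopt reported an unknown option or a missing argument.
 */
int parse_with_getopt(int argc, char **argv, struct settings *s)
{
    int c;

    assert(s != NULL);
    optind = 1; /* start scanning from the first argument */
    while ((c = getopt(argc, argv, "dlo:")) != -1) {
        switch (c) {
        case 'd': s->des_init = true; break;
        case 'l': s->comp_lit = true; break;
        case 'o': s->output = optarg; break; /* ':' in "o:" means -o takes a value */
        default: return -1; /* '?': getopt has already printed a diagnostic */
        }
    }
    return optind;
}
```

The trailing colon in the option string is what tells getopt that -o expects a
value, which it hands back through optarg. Doing the same by hand means dealing
with both "-ofile" and "-o file", which is where a library starts to pay off.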
As a final note, the code in this post is taken from an MIT licensed codebase, and should be published as part of pack shortly.