Writing Small CLI Programs in Common Lisp

Posted on March 17th, 2021.

I write a lot of command-line programs. For tiny programs I usually go with the typical UNIX approach: throw together a half-assed shell script and move on. For large programs I make a full Common Lisp project, with an ASDF system definition and such. But there's a middle ground of smallish programs that don't warrant a full repository on their own, but for which I still want a real interface with proper --help and error handling.

I've found Common Lisp to be a good language for writing these small command line programs. But it can be a little intimidating to get started (especially for beginners) because Common Lisp is a very flexible language and doesn't lock you into one way of working.

In this post I'll describe how I write small, stand-alone command line programs in Common Lisp. It might work for you, or you might want to modify things to fit your own needs.

  1. Requirements
  2. Solution Skeleton
    1. Directory Structure
    2. Lisp Files
    3. Building Binaries
    4. Building Man Pages
    5. Makefile
  3. Case Study: A Batch Coloring Utility
    1. Libraries
    2. Package
    3. Configuration
    4. Errors
    5. Colorization
    6. Not-Quite-Top-Level Interface
    7. User Interface
    8. Top-Level Interface
  4. More Information

Requirements

When you're writing programs in Common Lisp, you've got a lot of options. Laying out the requirements I have helped me decide on an approach.

First: each new program should be one single file. A few other files for the collection as a whole (e.g. a Makefile) are okay, but once everything is set up creating a new program should mean adding one single file. For larger programs a full project directory and ASDF system are great, but for small programs having one file per program reduces the mental overhead quite a bit.

The programs need to be able to be developed in the typical Common Lisp interactive style (in my case: with Swank and VLIME). Interactive development is one of the best parts of working in Common Lisp, and I'm not willing to give it up. In particular this means that a shell-script style approach, with #!/path/to/sbcl --script and the top and directly running code at the top level in the file, doesn't work for two main reasons:

The programs need to be able to use libraries, so Quicklisp will need to be involved. Common Lisp has a lot of nice things built-in, but there are some libraries that are just too useful to pass up.

The programs will need to have proper user interfaces. Command line arguments must be robustly parsed (e.g. collapsing -a -b -c foo -d into -abcfoo -d should work as expected), malformed or unknown options must be caught instead of dropping them on the floor, error messages should be meaningful, and the --help should be thoroughly and thoughtfully written so I can remember how to use the program months later. A man page is a nice bonus, but not required.

Relying on some basic conventions (e.g. a command foo is always in foo.lisp and defines a package foo with a function called toplevel) is okay if it makes my life easier. These programs are just for me, so I don't have to worry about people wanting to create executables with spaces in the name or something.

Portability between Common Lisp implementations is nice to have, but not required. If using a bit of SBCL-specific grease will let me avoid a bunch of extra dependencies, that's fine for these small personal programs.

Solution Skeleton

After trying a number of different approaches I've settled on a solution that I'm pretty happy with. First I'll describe the general approach, then we'll look at one actual example program in its entirety.

Directory Structure

I keep all my small single-file Common Lisp programs in a lisp directory inside my dotfiles repository. Its contents look like this:

…/dotfiles/lisp/
    bin/
        foo
        bar
    man/
        man1/
            foo.1
            bar.1
    build-binary.sh
    build-manual.sh
    Makefile
    foo.lisp
    bar.lisp

The bin directory is where the executable files end up. I've added it to my $PATH so I don't have to symlink or copy the binaries anywhere.

man contains the generated man pages. Because it's adjacent to bin (which is on my path) the man program automatically finds the man pages as expected.

build-binary.sh, build-manual.sh, and Makefile are some glue to make building programs easier.

The .lisp files are the programs. Each new program I want to add only requires adding the <programname>.lisp file in this directory and running make.

Lisp Files

My small Common Lisp programs follow a few conventions that make building them easier. Let's look at the skeleton of a foo.lisp file as an example. I'll show the entire file here, and then step through it piece by piece.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:with-user-abort) :silent t))

(defpackage :foo
  (:use :cl)
  (:export :toplevel *ui*))

(in-package :foo)

;;;; Configuration -----------------------------------------------
(defparameter *whatever* 123)

;;;; Errors ------------------------------------------------------
(define-condition user-error (error) ())

(define-condition missing-foo (user-error) ()
  (:report "A foo is required, but none was supplied."))

;;;; Functionality -----------------------------------------------
(defun foo (string))

;;;; Run ---------------------------------------------------------
(defun run (arguments)
  (map nil #'foo arguments))

;;;; User Interface ----------------------------------------------
(defmacro exit-on-ctrl-c (&body body)
  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
     (with-user-abort:user-abort () (sb-ext:exit :code 130))))

(defparameter *ui*
  (adopt:make-interface
    :name "foo"))

(defun toplevel ()
  (sb-ext:disable-debugger)
  (exit-on-ctrl-c
    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*); Handle options.
      (handler-case (run arguments)
        (user-error (e) (adopt:print-error-and-exit e))))))

Let's go through each chunk of this.

(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:with-user-abort) :silent t))

First we quickload any necessary libraries. We always want to do this, even when compiling the file, because we need the appropriate packages to exist when we try to use their symbols later in the file.

with-user-abort is a library for easily handling control-c, which all of these small programs will use.

(defpackage :foo
  (:use :cl)
  (:export :toplevel *ui*))

(in-package :foo)

Next we define a package foo and switch to it. The package is always named the same as the resulting binary and the basename of the file, and always exports the symbols toplevel and *ui*. These conventions make it easy to build everything automatically with make later.

;;;; Configuration -----------------------------------------------
(defparameter *whatever* 123)

Next we define any configuration variables. These will be set later after parsing the command line arguments (when we run the command line program) or at the REPL (when developing interactively).

;;;; Errors ------------------------------------------------------
(define-condition user-error (error) ())

(define-condition missing-foo (user-error) ()
  (:report "A foo is required, but none was supplied."))

We define a user-error condition, and any errors the user might make will inherit from it. This will make it easy to treat user errors (e.g. passing a mangled regular expression like (foo+ as an argument) differently from programming errors (i.e. bugs). This makes it easier to treat those errors differently:

;;;; Functionality -----------------------------------------------
(defun foo (string))

Next we have the actual functionality of the program.

;;;; Run ---------------------------------------------------------
(defun run (arguments)
  (map nil #'foo arguments))

We define a function run that takes some arguments (as strings) and performs the main work of the program.

Importantly, run does not handle command line argument parsing, and it does not exit the program with an error code, which means we can safely call it to say "run the whole program" when we're developing interactively without worrying about it killing our Lisp process.

Now we need to define the command line interface.

;;;; User Interface ----------------------------------------------
(defmacro exit-on-ctrl-c (&body body)
  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
     (with-user-abort:user-abort () (adopt:exit 130))))

We'll make a little macro around with-user-abort to make it less wordy. We'll exit with a status of 130 if the user presses ctrl-c. Maybe some day I'll pull this into Adopt so I don't have to copy these three lines everywhere.

(defparameter *ui*
  (adopt:make-interface
    :name "foo"))

Here we define the *ui* variable whose symbol we exported above. Adopt is a command line argument parsing library I wrote. If you want to use a different library, feel free.

(defun toplevel ()
  (sb-ext:disable-debugger)
  (exit-on-ctrl-c
    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*); Handle options.
      (handler-case (run arguments)
        (user-error (e) (adopt:print-error-and-exit e))))))

And finally we define the toplevel function. This will only ever be called when the program is run as a standalone program, never interactively. It handles all the work beyond the main guts of the program (which are handled by the run function), including:

That's it for the structure of the .lisp files.

Building Binaries

build-binary.sh is a small script to build the executable binaries from the .lisp files. ./build-binary.sh foo.lisp will build foo:

#!/usr/bin/env bash

set -euo pipefail

LISP=$1
NAME=$(basename "$1" .lisp)
shift

sbcl --load "$LISP" \
     --eval "(sb-ext:save-lisp-and-die \"$NAME\"
               :executable t
               :save-runtime-options t
               :toplevel '$NAME:toplevel)"

Here we see where the naming conventions have become important — we know that the package is named the same as the binary and that it will have the symbol toplevel exported, which always names the entry point for the binary.

Building Man Pages

build-manual.sh is similar and builds the man pages using Adopt's built-in man page generation. If you don't care about building man pages for your personal programs you can ignore this. I admit that generating man pages for these programs is a little bit silly because they're only for my own personal use, but I get it for free with Adopt, so why not?

#!/usr/bin/env bash

set -euo pipefail

LISP=$1
NAME=$(basename "$LISP" .lisp)
OUT="$NAME.1"
shift

sbcl --load "$LISP" \
     --eval "(with-open-file (f \"$OUT\" :direction :output :if-exists :supersede)
               (adopt:print-manual $NAME:*ui* :stream f))" \
     --quit

This is why we always name the Adopt interface variable *ui* and export it from the package.

Makefile

Finally we have a simple Makefile so we can run make to regenerate any out of date binaries and man pages:

files := $(wildcard *.lisp)
names := $(files:.lisp=)

.PHONY: all clean $(names)

all: $(names)

$(names): %: bin/% man/man1/%.1

bin/%: %.lisp build-binary.sh Makefile
    mkdir -p bin
    ./build-binary.sh $<
    mv $(@F) bin/

man/man1/%.1: %.lisp build-manual.sh Makefile
    mkdir -p man/man1
    ./build-manual.sh $<
    mv $(@F) man/man1/

clean:
    rm -rf bin man

We use a wildcard to automatically find the .lisp files so we don't have to do anything extra after adding a new file when we want to make a new program.

The most notable line here is $(names): %: bin/% man/man1/%.1 which uses a static pattern rule to automatically define the phony rules for building each program. If $(names) is foo bar this line effectively defines two phony rules:

foo: bin/foo man/man1/foo.1
bar: bin/bar man/man1/bar.1

This lets us run make foo to make both the binary and man page for foo.lisp.

Case Study: A Batch Coloring Utility

Now that we've seen the skeleton, let's look at one of my actual programs that I use all the time. It's called batchcolor and it's used to highlight regular expression matches in text (usually log files) with a twist: each unique match is highlighted in a separate color, which makes it easier to visually parse the result.

For example: suppose we have some log files with lines of the form <timestamp> [<request ID>] <level> <message> where request ID is a UUID, and messages might contain other UUIDs for various things. Such a log file might look something like this:

2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8b7543/
2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543 is not cached, retrieving...
2021-01-02 14:01:45 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] WARN User 28b2d548-eff1-471c-b807-cc2bcee76b7d does not have access to thing 7ca6d8d2-5038-42bd-a559-b3ee0c8b7543, denying request.
2021-01-02 14:01:46 [f788a624-8dcd-4c5e-b1e8-681d0a68a8d3] INFO Returning HTTP 404.
2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Incoming request GET /users/28b2d548-eff1-471c-b807-cc2bcee76b7d/things/7ca6d8d2-5038-42bd-a559-b3ee0c8d7543/
2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Thing 7ca6d8d2-5038-42bd-a559-b3ee0c8d7543 is not cached, retrieving...
2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] INFO Incoming request POST /users/sign-up/
2021-01-02 14:01:46 [bea6ae06-bd06-4d2a-ae35-3e83fea2edc7] INFO Returning HTTP 200.
2021-01-02 14:01:46 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Error running SQL query: connection refused.
2021-01-02 14:01:47 [b04ced1d-1cfa-4315-aaa9-0e245ff9a8e1] ERR Returning HTTP 500.

If I try to just read this directly, it's easy for my eyes to glaze over unless I laboriously walk line-by-line.

Screenshot of uncolored log output

I could use grep to highlight the UUIDs:

grep -P \
    '\b[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}\b' \
    example.log

Unfortunately that doesn't really help too much because all the UUIDs are highlighted the same color:

Screenshot of grep-colored log output

To get a more readable version of the log, I use batchcolor:

batchcolor \
    '\b[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}\b' \
    example.log

batchcolor also highlights matches, but it highlights each unique match in its own color:

Screenshot of batchcolored log output

This is much easier for me to visually parse. The interleaving of separate request logs is now obvious from the colors of the IDs, and it's easy to match up various user IDs and thing IDs at a glance. Did you even notice that the two thing IDs were different before?

batchcolor has a few other quality of life features, like picking explicit colors for specific strings (e.g. red for ERR):

Screenshot of fully batchcolored log output

I use this particular batchcolor invocation so often I've put it in its own tiny shell script. I use it to tail log files when developing locally almost every day, and it makes visually scanning the log output much easier. It can come in handy for other kinds of text too, like highlighting nicknames in an IRC log.

Let's step through its code piece by piece.

Libraries

(eval-when (:compile-toplevel :load-toplevel :execute)
  (ql:quickload '(:adopt :cl-ppcre :with-user-abort) :silent t))

First we quickload libraries. We'll use Adopt for command line argument processing, cl-ppcre for regular expressions, and the previously-mentioned with-user-abort to handle control-c.

Package

(defpackage :batchcolor
  (:use :cl)
  (:export :toplevel :*ui*))

(in-package :batchcolor)

We define and switch to the appropriately-named package. Nothing special here.

Configuration

;;;; Configuration ------------------------------------------------------------
(defparameter *start* 0)
(defparameter *dark* t)

Next we defparameter some variables to hold some settings. *start* will be used later when randomizing colors, don't worry about it for now.

Errors

;;;; Errors -------------------------------------------------------------------
(define-condition user-error (error) ())

(define-condition missing-regex (user-error) ()
  (:report "A regular expression is required."))

(define-condition malformed-regex (user-error)
  ((underlying-error :initarg :underlying-error))
  (:report (lambda (c s)
             (format s "Invalid regex: ~A" (slot-value c 'underlying-error)))))

(define-condition overlapping-groups (user-error) ()
  (:report "Invalid regex: seems to contain overlapping capturing groups."))

(define-condition malformed-explicit (user-error)
  ((spec :initarg :spec))
  (:report
    (lambda (c s)
      (format s "Invalid explicit spec ~S, must be of the form \"R,G,B:string\" with colors being 0-5."
              (slot-value c 'spec)))))

Here we define the user errors. Some of these are self-explanatory, while others will make more sense later once we see them in action. The specific details aren't as important as the overall idea: for user errors we know might happen, display a helpful error message instead of just spewing a backtrace at the user.

Colorization

Next we have the actual meat of the program. Obviously this is going to be completely different for every program, so feel free to skip this if you don't care about this specific problem.

;;;; Functionality ------------------------------------------------------------
(defun rgb-code (r g b)
  ;; The 256 color mode color values are essentially r/g/b in base 6, but
  ;; shifted 16 higher to account for the intiial 8+8 colors.
  (+ (* r 36)
     (* g 6)
     (* b 1)
     16))

We're going to highlight different matches with different colors. We'll need a reasonable amount of colors to make this useful, so using the basic 8/16 ANSI colors isn't enough. Full 24-bit truecolor is overkill, but the 8-bit ANSI colors will work nicely. If we ignore the base colors, we essentially have 6 x 6 x 6 = 216 colors to work with. rgb-code will take the red, green, and blue values from 0 to 5 and return the color code. See Wikipedia for more information.

(defun make-colors (excludep)
  (let ((result (make-array 256 :fill-pointer 0)))
    (dotimes (r 6)
      (dotimes (g 6)
        (dotimes (b 6)
          (unless (funcall excludep (+ r g b))
            (vector-push-extend (rgb-code r g b) result)))))
    result))

(defparameter *dark-colors*  (make-colors (lambda (v) (< v 3))))
(defparameter *light-colors* (make-colors (lambda (v) (> v 11))))

Now we can build some arrays of colors. We could use any of the 216 available colors, but in practice we probably don't want to, because the darkest colors will be too dark to read on a dark terminal, and vice versa for light terminals. In a concession to practicality we'll generate two separate arrays of colors, one that excludes colors whose total value is too dark and one excluding those that are too light.

(Notice that *dark-colors* is "the array of colors which are suitable for use on dark terminals" and not "the array of colors which are themselves dark". Naming things is hard.)

Note that these arrays will be generated when the batchcolor.lisp file is loaded, which is when we build the binary. They won't be recomputed every time you run the resulting binary. In this case it doesn't really matter (the arrays are small) but it's worth remembering in case you ever have some data you want (or don't want) to compute at build time instead of run time.

(defparameter *explicits* (make-hash-table :test #'equal))

Here we make a hash table to store the strings and colors for strings we want to explicitly color (e.g. ERR should be red, INFO cyan). The keys will be the strings and values the RGB codes.

(defun djb2 (string)
  ;; http://www.cse.yorku.ca/~oz/hash.html
  (reduce (lambda (hash c)
            (mod (+ (* 33 hash) c) (expt 2 64)))
          string
          :initial-value 5381
          :key #'char-code))

(defun find-color (string)
  (gethash string *explicits*
           (let ((colors (if *dark* *dark-colors* *light-colors*)))
             (aref colors
                   (mod (+ (djb2 string) *start*)
                        (length colors))))))

For strings that we want to explicitly color, we just look up the appropriate code in *explicits* and return it.

Otherwise, we want to highlight unique matches in different colors. There are a number of different ways we could do this, for example: we could randomly pick a color the first time we see a string and store it in a hash table for subsequent encounters. But this would mean we'd grow that hash table over time, and one of the things I often use this utility for is tail -fing long-running processes when developing locally, so the memory usage would grow and grow until the batchcolor process was restarted, which isn't ideal.

Instead, we'll hash each string with a simple DJB hash and use it to index into the appropriate array of colors. This ensures that identical matches get identical colors, and avoids having to store every match we've ever seen.

There will be some collisions, but there's not much we can do about that with only ~200 colors to work with. We could have used 16-bit colors like I mentioned before, but then we'd have to worry about picking colors different enough for humans to easily tell apart, and for this simple utility I didn't want to bother.

We'll talk about *start* later, ignore it for now (it's 0 by default).

(defun ansi-color-start (color)
  (format nil "~C[38;5;~Dm" #\Escape color))

(defun ansi-color-end ()
  (format nil "~C[0m" #\Escape))

(defun print-colorized (string)
  (format *standard-output* "~A~A~A"
          (ansi-color-start (find-color string))
          string
          (ansi-color-end)))

Next we have some functions to output the appropriate ANSI escapes to highlight our matches. We could use a library for this but it's only two lines. It's not worth it.

And now we have the beating heart of the program:

(defun colorize-line (scanner line &aux (start 0))
  (ppcre:do-scans (ms me rs re scanner line)
    ;; If we don't have any register groups, colorize the entire match.
    ;; Otherwise, colorize each matched capturing group.
    (let* ((regs? (plusp (length rs)))
           (starts (if regs? (remove nil rs) (list ms)))
           (ends   (if regs? (remove nil re) (list me))))
      (map nil (lambda (word-start word-end)
                 (unless (<= start word-start)
                   (error 'overlapping-groups))
                 (write-string line *standard-output* :start start :end word-start)
                 (print-colorized (subseq line word-start word-end))
                 (setf start word-end))
           starts ends)))
  (write-line line *standard-output* :start start))

colorize-line takes a CL-PPCRE scanner and a line, and outputs the line with any of the desired matches colorized appropriately. There are a few things to note here.

First: if the regular expression contains any capturing groups, we'll only colorize those parts of the match. For example: if you run batchcolor '^<(\\w+)> ' to colorize the nicks in an IRC log, only the nicknames themselves will be highlighted, not the surrounding angle brackets. Otherwise, if there are no capturing groups in the regular expression, we'll highlight the entire match (as if there were one big capturing group around the whole thing).

Second: overlapping capturing groups are explicitly disallowed and a user-error signaled if we notice any. It's not clear what do to in this case — if we match ((f)oo|(b)oo) against foo, what should the output be? Highlight f and oo in the same color? In different colors? Should the oo be a different color than the oo in boo? There's too many options with no clear winner, so we'll just tell the user to be more clear.

To do the actual work we iterate over each match and print the non-highlighted text before the match, then print the highlighted match. Finally we print any remaining text after the last match.

Not-Quite-Top-Level Interface

;;;; Run ----------------------------------------------------------------------
(defun run% (scanner stream)
  (loop :for line = (read-line stream nil)
        :while line
        :do (colorize-line scanner line)))

(defun run (pattern paths)
  (let ((scanner (handler-case (ppcre:create-scanner pattern)
                   (ppcre:ppcre-syntax-error (c)
                     (error 'malformed-regex :underlying-error c))))
        (paths (or paths '("-"))))
    (dolist (path paths)
      (if (string= "-" path)
        (run% scanner *standard-input*)
        (with-open-file (stream path :direction :input)
          (run% scanner stream))))))

Here we have the not-quite-top-level interface to the program. run takes a pattern string and a list of paths and runs the colorization on each path. This is safe to call interactively from the REPL, e.g. (run "<(\\w+)>" "foo.txt"), so we can test without worrying about killing the Lisp process.

User Interface

In the last chunk of the file we have the user interface. There are a couple of things to note here.

I'm using a command line argument parsing library I wrote myself: Adopt. I won't go over exactly what all the various Adopt functions do. Most of them should be fairly easy to understand, but check out the Adopt documentation for the full story if you're curious.

If you prefer another library (and there are quite a few around) feel free to use it — it should be pretty easy to adapt this setup to a different library. The only things you'd need to change would be the toplevel function and the build-manual.sh script (if you even care about building man pages at all).

You might also notice that the user interface for the program is almost as much code as the entire rest of the program. This may seem strange, but I think it makes a certain kind of sense. When you're writing code to interface with an external system, a messier and more complicated external system will usually require more code than a cleaner and simpler external system. A human brain is probably the messiest and most complicated external system you'll ever have to deal with, so it's worth taking the extra time and code to be especially careful when writing an interface to it.

First we'll define a typical -h/--help option:

(defparameter *option-help*
  (adopt:make-option 'help
    :help "Display help and exit."
    :long "help"
    :short #\h
    :reduce (constantly t)))

Next we'll define a pair of options for enabling/disabling the Lisp debugger:

(adopt:defparameters (*option-debug* *option-no-debug*)
  (adopt:make-boolean-options 'debug
    :long "debug"
    :short #\d
    :help "Enable the Lisp debugger."
    :help-no "Disable the Lisp debugger (the default)."))

By default the debugger will be off, so any unexpected error will print a backtrace to standard error and exit with a nonzero exit code. This is the default because if I add a batchcolor somewhere in a shell script, I probably don't want to suddenly hang the entire script if something breaks. But we still want to be able to get into the debugger manually if something goes wrong. This is Common Lisp — we don't have to settle for a stack trace or core dump, we can have a real interactive debugger in the final binary.

Note how Adopt's make-boolean-options function creates two options here:

Even though disabled is the default, it's still important to have both switches for boolean options like this. If someone wants the debugger to be enabled by default instead (along with some other configuration options), they might have a shell alias like this:

alias bcolor='batchcolor --debug --foo --bar'

But sometimes they might want to temporarily disable the debugger for a single run. Without a --no-debug option, they would have to run the vanilla batchcolor and retype all the other options. But having the --no-debug option allows them to just say:

bcolor --no-debug

This would expand to:

batchcolor --debug --foo --bar --no-debug

The later option wins, and the user gets the behavior they expect.

Next we'll define some color-related options. First an option to randomize the colors each run, instead of always picking the same color for a particular string, and then a toggle for choosing colors that work for dark or light terminals:

(adopt:defparameters (*option-randomize* *option-no-randomize*)
  (adopt:make-boolean-options 'randomize
    :help "Randomize the choice of color each run."
    :help-no "Do not randomize the choice of color each run (the default)."
    :long "randomize"
    :short #\r))

(adopt:defparameters (*option-dark* *option-light*)
  (adopt:make-boolean-options 'dark
    :name-no 'light
    :long "dark"
    :long-no "light"
    :help "Optimize for dark terminals (the default)."
    :help-no "Optimize for light terminals."
    :initial-value t))

The last option we'll define is -e/--explicit, to allow the user to select an explicit color for a particular string:

(defun parse-explicit (spec)
  (ppcre:register-groups-bind
      ((#'parse-integer r g b) string)
      ("^([0-5]),([0-5]),([0-5]):(.+)$" spec)
    (return-from parse-explicit (cons string (rgb-code r g b))))
  (error 'malformed-explicit :spec spec))

(defparameter *option-explicit*
  (adopt:make-option 'explicit
    :parameter "R,G,B:STRING"
    :help "Highlight STRING in an explicit color.  May be given multiple times."
    :manual (format nil "~
      Highlight STRING in an explicit color instead of randomly choosing one.  ~
      R, G, and B must be 0-5.  STRING is treated as literal string, not a regex.  ~
      Note that this doesn't automatically add STRING to the overall regex, you ~
      must do that yourself!  This is a known bug that may be fixed in the future.")
    :long "explicit"
    :short #\e
    :key #'parse-explicit
    :reduce #'adopt:collect))

Notice how we signal a malformed-explicit condition if the user gives us mangled text. This is a subtype of user-error, so the program will print the error and exit even if the debugger is enabled. We also include a slightly more verbose description in the man page than the terse one in the --help text.

Next we write the main help and manual text, as well as some real-world examples:

(adopt:define-string *help-text*
  "batchcolor takes a regular expression and matches it against standard ~
   input one line at a time.  Each unique match is highlighted in its own color.~@
   ~@
   If the regular expression contains any capturing groups, only those parts of ~
   the matches will be highlighted.  Otherwise the entire match will be ~
   highlighted.  Overlapping capturing groups are not supported.")

(adopt:define-string *extra-manual-text*
  "If no FILEs are given, standard input will be used.  A file of - stands for ~
   standard input as well.~@
   ~@
   Overlapping capturing groups are not supported because it's not clear what ~
   the result should be.  For example: what should ((f)oo|(b)oo) highlight when ~
   matched against 'foo'?  Should it highlight 'foo' in one color?  The 'f' in ~
   one color and 'oo' in another color?  Should that 'oo' be the same color as ~
   the 'oo' in 'boo' even though the overall match was different?  There are too ~
   many possible behaviors and no clear winner, so batchcolor disallows ~
   overlapping capturing groups entirely.")

(defparameter *examples*
  '(("Colorize IRC nicknames in a chat log:"
     . "cat channel.log | batchcolor '<(\\\\w+)>'")
    ("Colorize UUIDs in a request log:"
     . "tail -f /var/log/foo | batchcolor '[a-f0-9]{8}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{4}-[a-f0-9]{12}'")
    ("Colorize some keywords explicitly and IPv4 addresses randomly (note that the keywords have to be in the main regex too, not just in the -e options):"
     . "batchcolor 'WARN|INFO|ERR|(?:[0-9]{1,3}\\\\.){3}[0-9]{1,3}' -e '5,0,0:ERR' -e '5,4,0:WARN' -e '2,2,5:INFO' foo.log")
    ("Colorize earmuffed symbols in a Lisp file:"
     . "batchcolor '(?:^|[^*])([*][-a-zA-Z0-9]+[*])(?:$|[^*])' tests/test.lisp")))

Finally we can wire everything together in the main Adopt interface:

(defparameter *ui*
  (adopt:make-interface
    :name "batchcolor"
    :usage "[OPTIONS] REGEX [FILE...]"
    :summary "colorize regex matches in batches"
    :help *help-text*
    :manual (format nil "~A~2%~A" *help-text* *extra-manual-text*)
    :examples *examples*
    :contents (list
                *option-help*
                *option-debug*
                *option-no-debug*
                (adopt:make-group 'color-options
                                  :title "Color Options"
                                  :options (list *option-randomize*
                                                 *option-no-randomize*
                                                 *option-dark*
                                                 *option-light*
                                                 *option-explicit*)))))

All that's left to do is the top-level function that will be called when the binary is executed.

Top-Level Interface

Before we write toplevel we've got a couple of helpers:

(defmacro exit-on-ctrl-c (&body body)
  `(handler-case (with-user-abort:with-user-abort (progn ,@body))
     (with-user-abort:user-abort () (adopt:exit 130))))

(defun configure (options)
  (loop :for (string . rgb) :in (gethash 'explicit options)
        :do (setf (gethash string *explicits*) rgb))
  (setf *start* (if (gethash 'randomize options)
                  (random 256 (make-random-state t))
                  0)
        *dark* (gethash 'dark options)))

Our toplevel function looks much like the one in the skeleton, but fleshed out a bit more:

(defun toplevel ()
  (sb-ext:disable-debugger)
  (exit-on-ctrl-c
    (multiple-value-bind (arguments options) (adopt:parse-options-or-exit *ui*)
      (when (gethash 'debug options)
        (sb-ext:enable-debugger))
      (handler-case
          (cond
            ((gethash 'help options) (adopt:print-help-and-exit *ui*))
            ((null arguments) (error 'missing-regex))
            (t (destructuring-bind (pattern . files) arguments
                 (configure options)
                 (run pattern files))))
        (user-error (e) (adopt:print-error-and-exit e))))))

This toplevel has a few extra bits beyond the skeletal example.

First, we disable the debugger immediately, and then re-enable it later if the user asks us to. We want to keep it disabled until after argument parsing because we can't know whether the user wants it or not until we parse the arguments.

Instead of just blindly running run, we check for --help and print it if desired. We also validate that the user passes the correct amount of arguments, signaling a subtype of user-error if they don't. Assuming everything looks good we handle the configuration, call run, and that's it!

Running make generates bin/batchcolor and man/man1/batchcolor.1, and we can view our log files in beautiful color.

More Information

I hope this overview was helpful. This has worked for me, but Common Lisp is a flexible language, so if you want to use this layout as a starting point and modify it for your own needs, go for it!

If you want to see some more examples you can find them in my dotfiles repository. Some of the more fun ones include:

The approach I laid out in this post works well for small, single-file programs. If you're creating a larger program you'll probably want to move to a full ASDF system in its own directory/repository. My friend Ian wrote a post about that which you might find interesting.