Playing With Syntax
Posted on August 19th, 2016.
One of the things I love about Lisp is that it gives you the ability to change and mold the syntax of the language to what you need. In this post I want to look at the evolution of a little macro I've been playing around with for a while now.
Mutation is a fundamental concept in most programming languages. Functional
programming may be beautiful, but mutation is still useful (and fast). In most
languages assignment is done with an =
or :=
operator:
x = 10
In Common Lisp the operator is named setf
instead of =
for historical
reasons, and of course it's used in prefix notation:
(setf x 10)
Aside from the prefix ordering, Common Lisp's syntax is already a bit more elegant because you can set arbitrarily many variables without repeating the assignment operator over and over again:
; Have to type out `=` three times
x = 10
y = 20
z = 30
; But only one `setf` required
(setf x 10
y 20
z 30)
setf
assigns values one-by-one, in-order. Common Lisp also has psetf
which
is "parallel setf
". This evaluates all the values first, then sets all the
variables at once:
(let ((a 10) (b 20))
(setf a 50
b (+ 1 a)) ; => (50 51)
(psetf a 90
b (+ 1 a))) ; => (90 52)
Assignment is nice, but often you want to change the value of a variable by taking its current value and computing something with it. If you're making a ball that travels across the screen you might have something like this:
def move_ball(ball, distance):
ball.x = ball.x + distance
(defun move-ball (ball distance)
(setf (slot-value ball 'x)
(+ (slot-value ball 'x)
distance)))
If you find slot-value
too wordy to type, you could make a macro to get rid of
it (and save yourself a quote too). .
is already used for something else so
you'd need to pick a different character:
(defmacro $ (instance slot-symbol)
`(slot-value ,instance ',slot-symbol))
(defun move-ball (ball distance)
(setf ($ ball x)
(+ ($ ball x)
distance)))
In practice, though, you rarely use slot-value
. Instead you usually use
accessor functions:
(defun move-ball (ball distance)
(setf (ball-x ball)
(+ (ball-x ball)
distance)))
Anyway, back to assignment. What we wrote above works, but it's a bit
annoying to have to type the "place" that we're working with twice. To cut down
on this repetition many languages offer operators like +=
and ++
:
x = x + 10
⬇
x += 10
y = y + 1
⬇
y++
Common Lisp's version of these is called incf
.
(incf x 10)
(incf y)
Notice how the s-expression syntax means we can use the same operator for both
of the forms, instead of needing separate +=
and ++
operators.
Other languages often have similar "in-place" operators for other numeric
operations like -=
, --
, *=
, and so on. Common Lisp has decf
for
subtraction, but doesn't have versions for multiplication or division built in.
But we can add them with a few macros:
(defmacro mulf (place factor)
`(setf ,place (* ,place ,factor)))
(defmacro divf (place divisor)
`(setf ,place (/ ,place ,divisor)))
Unfortunately these are not quite right — we've committed the cardinal
macro-writing sin of splicing in the place
expression multiple times. If
place
has side effects they'll happen more times than expected.
Luckily, Common Lisp has a macro that wraps up the boring process of handling
this for us. define-modify-macro
lets us fix the problem:
(define-modify-macro mulf (factor) *)
(define-modify-macro divf (divisor) /)
Now we've got versions of *=
and /=
in Common Lisp. While we're here we
should do things right and allow divf
to be used with just a place, which
will mean "set place
to 1/place
":
(define-modify-macro divf (&optional divisor)
(lambda (value divisor)
(if (null divisor)
(/ 1 value)
(/ value divisor))))
(let ((x 10))
(divf x 2) ; => 5
(divf x) ; => 1/5
(divf x)) ; => 5
Note that we don't really need the 1
in (/ 1 value)
, because Common Lisp's
/
works the same way when you call it with a single argument. I'm not sure
what behavior would make sense for a unary (mulf place)
, so let's ignore that
one and move on.
So far we've been talking about the idea of setting a variable to the result of
computing some function on its current value. We used define-modify-macro
to
define some support for extra functions, but defining a mutator macro for every
function we might possibly want to use could get tedious. What if we generalize
this a bit more?
(define-modify-macro callf (function)
(lambda (value function)
(funcall function value)))
(let ((x -1))
(callf x #'abs) ; => 1
(callf x #'1+) ; => 2
(callf x #'sin) ; => 0.9092974
(callf x #'round) ; => 1
(callf x #'-)) ; => -1
Now we've got something a bit higher-level. With callf
we can mutate
a variable with a unary function. But what about functions that need more
arguments? Again, Common Lisp does the right thing and handles &rest
parameters correctly in define-modify-macro
:
(define-modify-macro callf (function &rest args)
(lambda (value function &rest args)
(apply function value args)))
(let ((x -1))
(callf x #'abs) ; => 1
(callf x #'+ 10) ; => 11
(callf x #'* 2 4)) ; => 88
(let ((foo (list 0 1 2 3)))
(callf foo #'reverse)
; => (3 2 1 0)
(callf foo #'append (list -1 -2 -3)))
; => (3 2 1 0 -1 -2 -3)
That's better. Now we can mutate a variable by applying a function to its current value and some other arguments. This might come in handy when we don't want to bother defining a full-blown modification macro just to use it once or twice.
At this point we've got something similar to Michael Malis' zap macro.
A minor difference is for zap
the order of the function and the place are
reversed. I believe his reason was that it makes the resulting form look more
like the function call that is conceptually being made:
(zap #'+ x 5)
; ↓ ↓ ↓
(setf x ( + x 5 ))
Unfortunately, because the place is no longer the first argument we can't use
define-modify-macro
to handle the single-evaluation boilerplate for us any
more. Instead we need to use get-setf-expander
and work with its results.
Malis' excellent Getting Places article explains how that works, and if you
want to fully understand the rest of the code in this post you should read that
one first.
The difference between callf
and Malis' zap
are just cosmetic. But they're
still not completely flexible. What if we want to call a function, but we need
the value to be an argument other than the first? Of course we could use
a lambda
:
(let ((x (list :coffee :soda :tea)))
(callf x (lambda (l) (remove :soda l))))
But that is really ugly. It would be nice if we had a way to specify which argument to the function should be the current value. This is Lisp, so we can pick whatever syntax we want. What should we choose? Often a good first step is to look at how other languages do things. Clojure's anonymous function syntax is one possibility:
(map #(+ 10 %) [1 2 3])
; => (11 12 13)
Clojure uses the magic symbol %
as a placeholder for the argument. We can
modify Malis' zap
macro to do the same. We'll call it zapf
, and we'll swap
the place back into the first argument like all the other modify macros, since
the aesthetic reason no longer holds. The changes are in bold:
; Original version
(defmacro zap (fn place &rest args)
(multiple-value-bind
(temps exprs stores store-expr access-expr)
(get-setf-expansion place)
`(let* (,@(mapcar #'list temps exprs)
(,(car stores)
(funcall ,fn ,access-expr ,@args)))
,store-expr)))
; New version
(defmacro zapf (place fn &rest args)
(multiple-value-bind
(temps exprs stores store-expr access-expr)
(get-setf-expansion place)
`(let* (,@(mapcar #'list temps exprs)
(,(car stores)
(funcall ,fn ,@(substitute access-expr '% args))))
,store-expr)))
And now we can use %
to say where the place's value should go:
(let ((x (list :coffee :soda :tea)))
(zapf x #'remove :soda %)
; => (:coffee :tea)
(zapf x #'cons :water %)
; => (:water :coffee :tea)
(zapf x #'append % (list :cocoa :chai)))
; => (:water :coffee :tea :cocoa :chai)
Note that substitute
will replace all occurrences of %
with the access
expression, so we can do something like (zapf x #'append % %)
if we want.
Some people will find this new version unpleasant, because it effectively
captures the %
variable. It's not hygienic, by design. I personally don't
mind it, and I like the brevity it allows.
We're almost done, but I want to push things just a bit further. Let's revisit the original example of the ball moving across the screen:
def move_ball(ball, distance):
ball.x = ball.x + distance
(defun move-ball (ball distance)
(setf (ball-x ball)
(+ (ball-x ball) distance)))
Of course we can simply use +=
or incf
here:
def move_ball(ball, distance):
ball.x += distance
(defun move-ball (ball distance)
(incf (ball-x ball) distance))
Suppose we also wanted to make the ball wrap around the screen when it flies off one side. We can do this using modulus:
def move_ball(ball, distance, screen_width):
ball.x += distance
ball.x %= screen_width
(defun move-ball (ball distance screen-width)
(incf (ball-x ball) distance)
(callf (ball-x ball) #'mod % screen-width))
We could define modf
if we wanted to make that second call a bit nicer. But
let's take a moment to reflect. Notice how we're back to having to type out
(ball-x ball)
twice again. It's better than typing it four times, but can we
do better?
zapf
currently takes a place, a function, and a list of arguments. What if we
made it more like setf
, and just have it take a place and an expression? We
could still use %
as a placeholder, but could let it appear anywhere in the
(possibly nested) expression form.
One way to do this would be to simply replace substitute
with subst
in the
zapf
code. (The difference between these extremely-confusingly-named
functions is that substitute
replaces element of a sequence, and subst
replaces elements of a tree.)
This, however, is the wrong strategy. Blindly replacing %
everywhere in the
expression will work for many cases, but will break in edge cases like (god help
you) nested zapf
forms. We could try to walk the code of the expression we
get, but down this path lies madness and implementation-specificity.
The solution is much, much simpler. We can just use a normal let
to capture
%
in the expression:
; Final version
(defmacro zapf (place expr)
(multiple-value-bind
(temps exprs stores store-expr access-expr)
(get-setf-expansion place)
`(let* (,@(mapcar #'list temps exprs)
(,(car stores)
(let ((% ,access-expr))
,expr)))
,store-expr)))
And now move-ball
only requires the bare minimum of typing:
(defun move-ball (ball distance screen-width)
(zapf (ball-x ball)
(mod (+ distance %) screen-width)))
For completeness it would be nice to allow zapf
to take any number of
place/value pairs, just like setf
. I'll leave this as an exercise for you to
try if you're interested.
It took me a while to arrive at this form of the macro. I personally find it lovely and readable. If you disagree, that's fine too! Lisp is a wonderful substrate for building a language that fits how you think and talk about problems. Play around with your own syntax and find out what feels most natural for you.
Before we leave, let's just ponder one more thing. Performance is often ignored
these days, but what if we care about making the computer go fast? The final
zapf
macro is handy and expressive, but what cost have we paid for this
abstraction?
First let's tell SBCL that we want to go fast and return to our original setf
strategy:
(deftype nonnegative-fixnum ()
`(integer 0 ,most-positive-fixnum))
(deftype positive-fixnum ()
`(integer 1 ,most-positive-fixnum))
(defstruct ball
(x 0 :type nonnegative-fixnum)
(y 0 :type nonnegative-fixnum))
(declaim (ftype (function (ball fixnum positive-fixnum))
move-ball))
(defun move-ball (ball distance screen-width)
(declare (optimize (speed 3) (safety 0) (debug 0)))
(setf (ball-x ball)
(mod (+ distance (ball-x ball))
screen-width)))
How does that turn out?
; disassembly for MOVE-BALL
; Size: 82 bytes. Origin: #x1005E5E12E
; 2E: 488B410D MOV RAX, [RCX+13] ; no-arg-parsing entry point
; 32: 48D1F8 SAR RAX, 1
; 35: 498D3C00 LEA RDI, [R8+RAX]
; 39: 488BDE MOV RBX, RSI
; 3C: 48D1FB SAR RBX, 1
; 3F: 4885DB TEST RBX, RBX
; 42: 7431 JEQ L3
; 44: 488BC7 MOV RAX, RDI
; 47: 4899 CQO
; 49: 48F7FB IDIV RAX, RBX
; 4C: 4C8BC0 MOV R8, RAX
; 4F: 488BC2 MOV RAX, RDX
; 52: 48D1E0 SHL RAX, 1
; 55: 4885C0 TEST RAX, RAX
; 58: 7510 JNE L2
; 5A: L0: 488BD8 MOV RBX, RAX
; 5D: L1: 4889590D MOV [RCX+13], RBX
; 61: 488BD3 MOV RDX, RBX
; 64: 488BE5 MOV RSP, RBP
; 67: F8 CLC
; 68: 5D POP RBP
; 69: C3 RET
; 6A: L2: 4885FF TEST RDI, RDI
; 6D: 7DEB JNL L0
; 6F: 488D1C30 LEA RBX, [RAX+RSI]
; 73: EBE8 JMP L1
; 75: L3: 0F0B0A BREAK 10 ; error trap
; 78: 07 BYTE #X07
; 79: 08 BYTE #X08 ; DIVISION-BY-ZERO-ERROR
; 7A: FE9E03 BYTE #XFE, #X9E, #X03 ; RDI
; 7D: FE9E01 BYTE #XFE, #X9E, #X01 ; RBX
Well that's not too bad. I'm not sure why SBCL still performs a check for
division by zero for the mod
even though we said that screen-width
can't
possibly be zero. But hey, we're at a manageable amount of assembly, which is
pretty nice for a high-level language!
Now for the zapf
version:
(defun move-ball (ball distance screen-width)
(declare (optimize (speed 3) (safety 0) (debug 0)))
(zapf (ball-x ball)
(mod (+ distance %) screen-width)))
Okay, let's bite the bullet. How bad are we talking?
; disassembly for MOVE-BALL
; Size: 70 bytes. Origin: #x1005F47C7B
; 7B: 488B410D MOV RAX, [RCX+13] ; no-arg-parsing entry point
; 7F: 48D1F8 SAR RAX, 1
; 82: 488D3C03 LEA RDI, [RBX+RAX]
; 86: 4885F6 TEST RSI, RSI
; 89: 742B JEQ L2
; 8B: 488BC7 MOV RAX, RDI
; 8E: 4899 CQO
; 90: 48F7FE IDIV RAX, RSI
; 93: 488BD8 MOV RBX, RAX
; 96: 488D0412 LEA RAX, [RDX+RDX]
; 9A: 4885C0 TEST RAX, RAX
; 9D: 750D JNE L1
; 9F: L0: 48D1E2 SHL RDX, 1
; A2: 4889510D MOV [RCX+13], RDX
; A6: 488BE5 MOV RSP, RBP
; A9: F8 CLC
; AA: 5D POP RBP
; AB: C3 RET
; AC: L1: 4885FF TEST RDI, RDI
; AF: 7DEE JNL L0
; B1: 4801F2 ADD RDX, RSI
; B4: EBE9 JMP L0
; B6: L2: 0F0B0A BREAK 10 ; error trap
; B9: 07 BYTE #X07
; BA: 08 BYTE #X08 ; DIVISION-BY-ZERO-ERROR
; BB: FE9E03 BYTE #XFE, #X9E, #X03 ; RDI
; BE: FE1E03 BYTE #XFE, #X1E, #X03 ; RSI
Wait, it's actually shorter? What about all those extra variables the zapf
macro expands into? It turns out that SBCL is really quite good at optimizing
let
statements and local variables.
This is why I love Common Lisp. You can be rewriting the syntax of the language and working on a tower of abstraction one moment, and looking at X86 assembly to see what the hell the computer is actually doing the next. It's wonderful.