While I was doing some programming in Racket, I found that I wanted to be able to define variables in the middle of a cond block, so that I could use the variable in subsequent conditions, without having to add an extra level of nesting.
To do this, I created the following macro:
(define-syntax (cond/define stx)
  (syntax-parse stx
    ;; terminating with else
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]
    ;; continuing with define
    [(_ ((~datum define) name expr) cond ...)
     #'(let ([name expr])
         (cond/define cond ...))]
    ;; continuing with a condition
    [(_ [condition expr ...] cond ...)
     #'(if condition
           (begin expr ...)
           (cond/define cond ...))]))
Here is an example of using the macro to write a program that detects if a byte string is text in the ISO-2022-JP character encoding:
(: iso-2022-jp? : Bytes -> Boolean)
(define (iso-2022-jp? bs)
  (let loop ([i 0])
    (cond/define
      ;; reached the end with no match?
      [(>= i (bytes-length bs)) #f]
      (define b (bytes-ref bs i))
      ;; must only use 7-bits
      [(>= b 128) #f]
      ;; must contain byte 0x1B somewhere
      [(not (= b #x1B)) (loop (add1 i))]
      ;; byte 0x1B must not be the final byte (avoids reading past the end)
      [(>= (add1 i) (bytes-length bs)) #f]
      ;; byte 0x1B must be immediately followed by ( or $
      (define b2 (bytes-ref bs (add1 i)))
      [else (or (= b2 (char->integer #\())
                (= b2 (char->integer #\$)))])))
If I hadn't used cond/define, I would have had to nest several conds and elses, which would have been annoying and would have made the code hard to read.
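To make the comparison concrete, here is a rough sketch of the same checks written with plain cond. The name iso-2022-jp?/nested is made up for this illustration, and the sketch includes an extra guard so that a trailing 0x1B byte doesn't read past the end of the byte string.

```racket
#lang racket/base

;; Each mid-way definition needs an else clause plus a let,
;; which adds another level of nesting every time.
(define (iso-2022-jp?/nested bs)
  (let loop ([i 0])
    (cond
      ;; reached the end with no match?
      [(>= i (bytes-length bs)) #f]
      [else
       (let ([b (bytes-ref bs i)])
         (cond
           ;; must only use 7-bits
           [(>= b 128) #f]
           ;; must contain byte 0x1B somewhere
           [(not (= b #x1B)) (loop (add1 i))]
           ;; guard: 0x1B as the final byte has nothing after it
           [(>= (add1 i) (bytes-length bs)) #f]
           [else
            (let ([b2 (bytes-ref bs (add1 i))])
              (or (= b2 (char->integer #\())
                  (= b2 (char->integer #\$))))]))])))

;; 0x1B followed by $ (byte 36), then B (byte 66)
(iso-2022-jp?/nested (bytes #x1B 36 66))
(iso-2022-jp?/nested #"plain ascii")
```

Two definitions already cost two extra levels of indentation here, and each further definition would add one more.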
You are welcome to use, reproduce and modify my cond/define macro without restriction. Attribution is optional.
To demonstrate more or less how the macro works, let's work with a smaller example that just checks if a byte string is 7-bit:
(: 7-bit? : Bytes -> Boolean)
(define (7-bit? bs)
  (let loop ([i 0])
    (cond/define
      ;; reached the end with no match?
      [(>= i (bytes-length bs)) #t]
      (define b (bytes-ref bs i))
      ;; must only use 7-bits
      [(>= b 128) #f]
      [else (loop (add1 i))])))
Now to wade into the macro:
(define-syntax (cond/define stx)
This defines the symbol cond/define as a macro. Whenever cond/define is used in the program, it will be processed at compile time. This will capture the surrounding s-exp and save it as stx. So in the above example, stx is the following:
(cond/define
  [(>= i (bytes-length bs)) #t]
  (define b (bytes-ref bs i))
  [(>= b 128) #f]
  [else (loop (add1 i))])
Technically, stx is a syntax object, not an s-expression. So it's not quite the datum above. It's a special data type that also has extra information attached, like where that s-exp appeared in the source code, and what its scope is. But we won't worry about that too much. You should read Fear of Macros (section 3.2) if you want to learn all about syntax objects.
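If you want to poke at a syntax object yourself, here is a small sketch you can run outside of any macro. The variable name stx and the tiny cond/define form are just for illustration. It uses #' to build a syntax object, then syntax->datum to strip it back down to a plain s-exp.

```racket
#lang racket/base

;; #' is shorthand for (syntax ...): it builds a syntax object.
(define stx #'(cond/define [else 42]))

;; It is a syntax object, not a list.
(syntax? stx)
(list? stx)

;; syntax->datum throws away the extra information,
;; leaving only the plain s-exp.
(syntax->datum stx)

;; Part of the extra information: the source line
;; (this can be #f when no location is recorded).
(syntax-line stx)
```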
(syntax-parse stx
We want to extract information from stx and do pattern-matching on it. syntax-parse is critical for this. It lets us go through and see if the syntax object looks like something. And if it does, we can process parts of that something.
cond/define can contain multiple conditions and definitions within it. To process a variable amount of data, we will make cond/define recursive. The usual way to program recursion is to handle the base case first, and make the other cases build upon it. Because the else block is the final thing to be executed, let's use that as our base case.
;; terminating with else
[(_ [(~datum else) expr ...])
This is a pattern for pattern matching. It will be matched against stx. Let's say we're currently processing the base case, so stx is:
(cond/define [else (loop (add1 i))])
It will be matched against this pattern:
(_ [(~datum else) expr ...])
Key things to know about the pattern:
- _ in the pattern means match any single s-exp.
- Any word in the pattern, like expr, means match any single s-exp and save it as that word for later use. This is called a pattern variable. You know how (define life 42) lets you save the life variable for later use? Words in the pattern do the same thing: they save a pattern variable for later use.
- ... in the pattern means match the last thing 0 or more times.
- (~datum foo) in the pattern means it expects to match literally the s-exp foo (rather than treating foo as a pattern variable, which it would otherwise do).
Let's step through it:
- cond/define matches _
- [ starts a nested s-exp in the data, and [ starts a nested s-exp in the pattern
- else matches (~datum else)
- (loop (add1 i)) matches expr ...
  - This binds expr ... as a pattern variable
  - When expr ... is used later, it will return the saved code (loop (add1 i))
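You can try this match outside of a macro as well. The sketch below requires syntax/parse at the normal phase (not for-syntax) so that syntax-parse can run on a syntax object written by hand. The name matched is just for illustration.

```racket
#lang racket/base
(require syntax/parse)

;; Match the base-case data against the base-case pattern,
;; then look at what the pattern variable expr ... captured.
(define matched
  (syntax-parse #'(cond/define [else (loop (add1 i))])
    [(_ [(~datum else) expr ...])
     (syntax->datum #'(expr ...))]))

matched ; a list holding the one captured expression, (loop (add1 i))
```

Because expr is followed by ..., it captures a sequence of expressions. Here that sequence has just one item in it.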
Great, that's how it matches. Back to looking at the big macro again:
#'(begin expr ...)]
This line starts with #'. This is a template. Since it is the last expression in this syntax-parse clause, it will be used as the result of the macro. This happens at compile time. As you're aware, the result of a macro is not just a Racket value, it is code that replaces the original code.
The resulting code is (begin (loop (add1 i))) (because expr ... was used as a pattern variable).
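Here is a sketch of that substitution, again run by hand at the normal phase rather than inside a macro. The names expanded and expanded-many are made up for this illustration.

```racket
#lang racket/base
(require syntax/parse)

;; The template #'(begin expr ...) has the captured
;; expressions substituted into it.
(define expanded
  (syntax-parse #'(cond/define [else (loop (add1 i))])
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]))

(syntax->datum expanded)

;; With several expressions in the else clause,
;; all of them end up inside the begin.
(define expanded-many
  (syntax-parse #'(cond/define [else (displayln "done") 42])
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]))

(syntax->datum expanded-many)
```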
;; continuing with define
[(_ ((~datum define) name expr) cond ...)
This is the next pattern in the syntax-parse. If the s-exp didn't match the previous pattern, we try matching it against this one instead.
Looking at the pattern, you can see it's very similar to the previous one, but it matches define instead of else. There are also some more subtle differences:
- Rather than binding the pattern variable expr ... to have many expressions, we bind two pattern variables, name and expr. These correspond to the two arguments that define takes.
- This is not the recursion base case, so it needs to do recursion. The pattern ends with cond ... which will pick up any further definitions/conditions and allow us to recurse through those later.
That was dense. Let's look at an example. We'll match this data:
(cond/define (define life (* 2 21)) [else life])
Against this pattern:
(_ ((~datum define) name expr) cond ...)
- life matches name, this binds name as a pattern variable with the value life
- (* 2 21) matches expr, this binds expr as a pattern variable with the value (* 2 21)
  - This isn't 42 yet because we're still in compile time.
  - Later, during run time, this will be evaluated and will result in 42.
- [else life] matches cond ...
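The same bindings can be inspected by hand. This sketch runs syntax-parse at the normal phase and collects what each pattern variable captured. The name bindings is just for illustration.

```racket
#lang racket/base
(require syntax/parse)

;; Collect the three pattern variables into a list so we can see them.
(define bindings
  (syntax-parse #'(cond/define (define life (* 2 21)) [else life])
    [(_ ((~datum define) name expr) cond ...)
     (list (syntax->datum #'name)          ; the variable name
           (syntax->datum #'expr)          ; the unevaluated expression
           (syntax->datum #'(cond ...)))])) ; everything that follows

bindings
```

Notice that the second item is still the code (* 2 21), not 42. Nothing has been evaluated yet.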
#'(let ([name expr]) (cond/define cond ...))]
Since this line starts with #', it's another template. Like before, the pattern variables will be substituted in the template, and this is the returned code. After substitution, it looks like this:
(let ([life (* 2 21)]) (cond/define [else life]))
Wait a sec - the returned code includes another use of cond/define? Yep, that's the recursion. Immediately after this substitution finishes, Racket will notice that the returned code still contains a macro, so it'll need to process that macro as well.
This recursion is how we process multiple definitions/conditions in cond/define. We process the first one and recurse through the others.
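To see where the recursion ends up, here is a sketch that runs the small example through the real macro and compares it with the fully expanded code written out by hand. The names with-macro and by-hand are made up for this illustration.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

(define-syntax (cond/define stx)
  (syntax-parse stx
    ;; terminating with else
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]
    ;; continuing with define
    [(_ ((~datum define) name expr) cond ...)
     #'(let ([name expr])
         (cond/define cond ...))]
    ;; continuing with a condition
    [(_ [condition expr ...] cond ...)
     #'(if condition
           (begin expr ...)
           (cond/define cond ...))]))

;; Using the macro:
(define with-macro
  (cond/define
    (define life (* 2 21))
    [else life]))

;; Step 1 (define clause): (let ([life (* 2 21)]) (cond/define [else life]))
;; Step 2 (else clause):   (let ([life (* 2 21)]) (begin life))
(define by-hand
  (let ([life (* 2 21)])
    (begin life)))

with-macro
by-hand
```

Both versions evaluate to the same value, since the second is what the first turns into after two rounds of expansion.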
;; continuing with a condition
[(_ [condition expr ...] cond ...)
 #'(if condition
       (begin expr ...)
       (cond/define cond ...))]))
This is the final syntax-parse pattern. You know everything you need to know to understand this one, so I won't explain it - instead, I'll point you to the macro stepper in DrRacket. The macro stepper allows you to visualise how Racket processes all the macros in your code.
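If you'd like something quick to check your understanding against before opening DrRacket, here is a sketch that applies only this final pattern by hand, at the normal phase. The names one-step and define-as-condition are made up for this illustration. The second half shows that a define form would also fit this more general pattern, since [ and ( are interchangeable, which is presumably why the order of the patterns matters and this one comes last.

```racket
#lang racket/base
(require syntax/parse)

;; One step of expansion for a condition clause.
(define one-step
  (syntax-parse #'(cond/define [(>= b 128) #f] [else #t])
    [(_ [condition expr ...] cond ...)
     (syntax->datum
      #'(if condition
            (begin expr ...)
            (cond/define cond ...)))]))

one-step

;; The same pattern applied to a define form: it matches,
;; but treats the word define as the condition.
(define define-as-condition
  (syntax-parse #'(cond/define (define b 1) [else b])
    [(_ [condition expr ...] cond ...)
     (syntax->datum #'condition)]))

define-as-condition
```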
Open DrRacket and paste in all the following code:
#lang racket/base
(require (for-syntax racket/base syntax/parse))

(define-syntax (cond/define stx)
  (syntax-parse stx
    ;; terminating with else
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]
    ;; continuing with define
    [(_ ((~datum define) name expr) cond ...)
     #'(let ([name expr])
         (cond/define cond ...))]
    ;; continuing with a condition
    [(_ [condition expr ...] cond ...)
     #'(if condition
           (begin expr ...)
           (cond/define cond ...))]))

(define (7-bit? bs)
  (let loop ([i 0])
    (cond/define
      ;; reached the end with no match?
      [(>= i (bytes-length bs)) #f]
      (define b (bytes-ref bs i))
      ;; must only use 7-bits
      [(>= b 128) #f]
      [else #t])))

(7-bit? #"abc123")
(7-bit? #"\11")
(7-bit? #"a\223c123")
Now click the Macro Stepper button at the top. The Macro Stepper window will open. Make sure it has "Macro hiding" set to "Standard" at the bottom. Now look at the middle pane to see the macro transformation. Racket has noticed that the code highlighted in red is a macro. It has replaced it with the green code below.
Since this is a recursive macro, the green code below still contains a macro. Racket knows about this. It has just analysed the first macro for now so that you can see what it's doing.
You can use the "Step" buttons at the top of the window to continue stepping through each execution of the macro. You'll notice that the amount of code being changed is reduced with each step as it gets closer to the recursion base case.
You can also click things in the Macro Stepper to highlight them yellow. This helps you track where something was before and after the macro.
Once you reach the end, the Macro Stepper will say "Expansion finished", and it'll look like this:
The different colours correspond to code being generated from different steps in the Macro Stepper. The red code was generated in the first step, the light green in the second step, the dark green in the third step, and the brown in the fourth step. All of the black code - you'll notice it before, after, and in the middle of the 7-bit function - is code that is present in the source file and wasn't created by any macro.
That's it for this example. You can find more examples of syntax-parse in the documentation.
If you're just dipping your toes into Racket's macros, the Fear of Macros guide is excellent. I had to read it several times and experiment with my own macros for several months before I was able to write cond/define. Previous iterations of it were much more messy and "unhygienic". syntax-parse lets you write hygienic macros without direct s-exp manipulation - but it is much trickier to do at first because it's a whole separate API that you have to learn.
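As a small illustration of what "hygienic" buys you, here is a contrived sketch where the calling code shadows if with its own local binding. The shadowing and the name result are made up for this example; you would hopefully never do this on purpose. The if written in the macro's template still refers to Racket's own if, so the macro keeps working.

```racket
#lang racket/base
(require (for-syntax racket/base syntax/parse))

(define-syntax (cond/define stx)
  (syntax-parse stx
    ;; terminating with else
    [(_ [(~datum else) expr ...])
     #'(begin expr ...)]
    ;; continuing with define
    [(_ ((~datum define) name expr) cond ...)
     #'(let ([name expr])
         (cond/define cond ...))]
    ;; continuing with a condition
    [(_ [condition expr ...] cond ...)
     #'(if condition
           (begin expr ...)
           (cond/define cond ...))]))

;; The caller rebinds if to a function that ignores its arguments.
;; The macro's own if is unaffected by this local binding.
(define result
  (let ([if (lambda args 'shadowed)])
    (cond/define
      [#f 'first]
      [else 'second])))

result
```

A macro that pasted a bare if symbol into the caller's code would have picked up the caller's function instead.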
— Cadence
P.S.: if you forget to (require (for-syntax racket/base syntax/parse)) when trying to run any of this code, you'll get some very strange errors, like:
- _: wildcard not allowed as an expression after encountering unbound identifier (which is possibly the real problem): syntax-parse
- syntax: unbound identifier; also, no #%app syntax transformer is bound in the transformer phase