[R6RS] Enumerations proposal pre-draft

Tue May 23 15:15:06 EDT 2006

Mike wrote:
> > How about we flush the <index-of> procedure altogether?
> > Nothing in the proposal uses it, and it's hard to imagine
> > any good use for it.
> 
> I was thinking that it doubled as a predicate for the enumeration
> values.

But we can get that predicate by extracting it from any subset
of the universe.

> Arguably, that could be done via memq, but I was getting the
> impression that a macro could generate a faster version, also of the
> index-of operation.

Presumably the predicate we extract from any subset of the
universe would be fast.

> Possibly I'm wrong again---will Larceny
> recognize, say, for memq, that its list argument is constant and
> revert to hashing or something?

Yes.  When inlining is authorized for memq, as it will always
be when memq is imported from an R6RS library, and the second
argument is a constant list, it gets rewritten as a disjunction
of eq? tests.  Larceny has been doing that for a long time.

Larceny fully expands case expressions into the core language.
The resulting calls to memv are rewritten into calls to memq
when appropriate.  Those calls to memq are rewritten into calls
to eq? when appropriate.  Control optimization recognizes such
calls in the core language, long after case expressions and
other syntactic sugar have been macro-expanded away.

But all that is immaterial.  The predicate we extract from any
subset of the universe might be written in a less performant
language, such as C.

> > 2.  revert <type-name> to your original macro (although
> >     your draft also referred to it as a procedure and
> >     used it as such; my repair of that bug was what
> >     led me to overload <type-name>).
> 
> I'm probably slightly confused about what this means exactly---what's
> the canonical way to generate a list or vector of the enumeration
> values after the change?

I had in mind that we might require even the short form "to
specify either <constructor-syntax> or <constructor>".

> If we reverted to my original macro, wouldn't it simplify matters to
> also revert to giving a separate name to some opaque object
> representing the enum-set type?  Like so:
> 
> (define-enum-type :color
>   color
>   (black white purple maroon)
>   ...)

No.  That would complicate rather than simplify.

> This would avoid confusion over, say, `enum-set-predicate', which, if
> I understand your proposal correctly, would have:
> 
> ((enum-set-predicate ((enum-set-constructor (color)) 'black)
>  ((enum-set-constructor (color)) 'white))
>  => #t

Good point.  We should just do away with enum-set-predicate.
Your example above would become

(define-enumeration color
  (black white purple maroon)
  color-set)

(enum-set-member? 'black (enum-set-universe (color-set))) => #t

> This would allow us to have the `define-enumeration' macro to be about
> the syntax only, and relegate all the procedural stuff (<index-of>,
> <constructor>, <predicate>) to operations that extract them from the
> enum-set-type object.

Okay.  I have done that in my revision of the proposal, below.

> > Another issue that needs to be resolved is whether the
> > <constructor> takes a bunch of symbols as its arguments,
> > or a list of symbols as its single argument.
> 
> I would think that <constructor-syntax> will be much more common.
> Given that there's `enum-set-constructor', I don't think <constructor>
> is needed in the syntax form.  Given that it's not expected to be the
> common form, I think a list of symbols would be more appropriate.

Okay.

As with records, we might as well have a procedural interface.
At the very least, the procedural interface will help people
to understand what's going on with the syntactic interface.

                                * * *

Here is yet another draft of the enumerations proposal.

Highlights:

- Enumerated values are symbols.

- Everything can be done within the procedural interface.

- There is also one syntactic form, which is used to define
  a finite ordered (enumerated) universe of symbols.

- The representation of a set of symbols is not specified by
  this proposal.

                                * * *

Procedural interface.

 (make-enumeration <symbol-list>) => enum-set

`make-enumeration' takes an arbitrary list of symbols and
returns the universe consisting of those symbols, considered
as a subset of itself.  The canonical ordering of the symbols
in that universe is the same as the ordering of the first
appearance of each symbol in the list that was passed to
make-enumeration.

Let enum-set range over the finite sets of symbols that can
be defined using make-enumeration.  Then we have the following
additional procedures:

 (enum-set-universe enum-set) => enum-set
 (enum-set-indexer enum-set) => procedure
 (enum-set-constructor enum-set) => procedure
 (enum-set->list enum-set) => list of symbols
 (enum-set-member? symbol enum-set) => boolean
 (enum-set-subset? enum-set enum-set) => boolean
 (enum-set=? enum-set enum-set) => boolean
 (enum-set-union enum-set enum-set) => enum-set
 (enum-set-intersection enum-set enum-set) => enum-set
 (enum-set-difference enum-set enum-set)
 (enum-set-complement enum-set) => enum-set
 (enum-set-projection enum-set enum-set) => enum-set

`enum-set-universe' returns the set of all symbols that comprise
the universe of its argument; (enum-set-universe x) is equivalent
to (enum-set-complement (enum-set-difference x x)).

`enum-set-indexer' returns a unary predicate that, given a symbol
that is in the universe, returns its 0-origin index within the
canonical ordering of the symbols in the universe; given a value
not in the universe, the unary predicate returns #f.
enum-set-indexer could be defined as follows (untested):

    (define (enum-set-indexer set)
      (let* ((symbols (enum-set->list (enum-set-universe set)))
             (cardinality (length symbols)))
        (lambda (x)
          (let ((probe (memq x symbols)))
            (if probe
                (- cardinality (length probe))
                #f)))))

`enum-set-constructor' returns a unary procedure that, given a
list of symbols that belong to the universe, returns the subset
of that universe that contains exactly the symbols in the list.
If any value in the list is not a symbol that belongs to the
universe, then the unary procedure must raise a &domain
exception.

`enum-set->list' returns a list of the symbols that belong to
its argument, in the canonical order that was specified when
define-enumeration was used to define the enumeration type.

`enum-set-member?' returns true if and only if its first argument
is an element of its second argument.

`enum-set-subset?' returns true if and only if the universe of its
first argument is a subset of the universe of its second argument
(considered as sets of symbols) and every element of its first
argument is a member of its second.

`enum-set=?' returns true if and only if its first argument is a
subset of its second and vice versa, as determined by the
enum-set-subset? procedure.  This implies that the universes of
the two sets are equal as sets of symbols, but does not imply
that they are equal as enumeration types.

`enum-set-union' takes two enumeration sets that share the same
enumeration type as universe, and returns their union.

`enum-set-intersection' takes two enumeration sets that share the same
enumeration type as universe, and returns their intersection.

`enum-set-difference' takes two enumeration sets that share the same
enumeration type as universe, and returns their difference.

`enum-set-complement' takes an enumeration set and returns
its complement with respect to its universe.

`enum-set-projection' projects its first argument into the
universe of its second, dropping any elements of its first
argument that do not belong to the universe of its second.
(If its first argument is a subset of the universe of its
second, then no elements are dropped, and the injection is
returned.)

Examples:

(define colors
  (make-enumeration '(black white purple maroon)))

(define color-index (enum-set-indexer colors))
(define make-color-set (enum-set-constructor colors))

(enum-set=?
 colors
 (enum-set-universe colors))       => #t

(color-index 'purple)              => 2

(enum-set->list
 (make-color-set
  'black 'purple))                 => (black purple)

(enum-set-member? 'white
 (make-color-set 'white 'maroon))  => #t

(enum-set-subset?
 (enum-set-complement (color))
 (color))                          => #t

(enum-set=?
 (make-color-set 'black 'maroon)
 (enum-set-complement
  (make-color-set 'white 'purple))) => #t

(enum-set-subset?
 (make-color-set 'white)
 (make-enumeration
  '(black white red green)))       => #t

(enum-set=?
  (make-color-set
   'black 'white)
  ((enum-set-constructor (make-enumeration '(black white red green)))
   'black 'white))                 => #t

(enum-set->list
 (enum-set-projection (make-enumeration '(black white red green))
                      colors))     => (black white)

                                * * *

Explicit-naming syntactic interface:

(define-enumeration <type-name> ; Mike's symbol-checking macro
  (<symbol> ...)                ; the symbols in this universe
  <constructor-syntax>)         ; macro that constructs sets

where <type-name> is an identifier that will be bound to an
uninteresting macro; <symbol> ... are the symbols that form the
universe of the enumeration (in order); <constructor-syntax> is
an identifier that will be bound to a macro that, given any
finite sequence of the <symbol> ..., possibly with duplicates,
expands into an expression that evaluates to the set of those
symbols.

(<type-name> <symbol>) checks at macro-expansion time whether
<symbol> is in the universe associated with <type-name>.  If
it is, then (<type-name> <symbol>) is equivalent to '<symbol>.
If it isn't, then an exception must be raised at macro expansion
time, whatever that means.

(<constructor-syntax> <symbol> ...) checks at macro-expansion
time whether every <symbol> ... is in the universe associated
with <type-name>.  If one or more isn't, then an exception
must be raised at macro expansion time, whatever that means.
Otherwise (<constructor-syntax> <symbol> ...) is equivalent to
((enum-set-constructor (<constructor-syntax>)) (list '<symbol> ...)).

Examples:

(define-enumeration color
  '(black white purple maroon)
  color-set)

(color black)                      => black
(color purpel)                     => <expansion-time error>
(enum-set->list (color-set))       => ()
(enum-set->list
 (color-set maroon white))         => (white maroon)

[end of proposal]