[R6RS] Re: strawman module syntax

Thu Jul 8 04:57:25 EDT 2004

This message is on the pragmatics of the module-system design.  It's
a follow-up to Matthew's original proposal, but I've tried to take
into account some of the subsequent discussion.  Since so much time
has passed, I'm repeating myself on a lot of things.  If you're bored
by my lengthy gripes, you might want to skip to the end, which has a
revised proposal.

First off, I have done a lot of hacking with the module and unit
systems in MzScheme and Scheme 48.  They are pragmatically very
similar for many uses, and, I believe, already constitute a good match
to the problem we're trying to solve.  Matthew's work on semantics and
phase separation is, IMHO, exactly what we need, and should form the
basis of the R6RS module system.  I think there's still some room for
improvement on the pragmatic side.  None of which affects the semantic
basis, though.

Here are my gripes with Matthew's proposal:

[Matthew:  Your name appears frequently---you may feel you've already
argued everything I'm asking questions about.  I would appreciate it if
you could comment on some of these issues nevertheless---I may not
have understood your position completely.]

Minor gripes:

- INCLUDE should be called something else---people would expect
  textual inclusion, when in fact it means doing textual inclusion and
  then wrapping a BEGIN around it.  (For similar reasons, I think
  BEGIN is misplaced here.)  I suggest renaming it to ... I can't
  think of anything but FILE.  Shucks.  Matthew, could you expand on
  the rationale of distinguishing between "Unix-style relative path"
  and "OS-specific absolute path" in <path-spec>?

- In MzScheme, I really hate the restriction that, in the language
  designation, I can only put a <require> and not a <require-spec>.
  Matthew---could you expand on why this is so?

- I often feel restricted by having only one module per file.  Marc
  mentioned that this restriction may also make it difficult to write
  one-file scripts.  (And, conversely, I appreciate the capability in
  the Scheme 48 module system to put several module definitions in one
  file.)

- In MzScheme, I miss the ability to have one piece of code + imports
  define several modules.  By transitivity, I miss the availability of
  interfaces as separate entities.

- Scheme 48 requires specifying whether an export is syntax or not; I
  consider this worthwhile information, which might also help the
  implementation.

Bigger gripes:

- I don't agree that components (in the sense of parametric modules)
  should be underneath the level of *this* module system.

  In MzScheme, components are implemented as units, which are dynamic
  entities, and turning a module into a component involves first
  stuffing the module body into a unit and then exporting a unit
  instantiation from the module itself.  This means turning a dynamic
  entity (the unit) into a static one (the module), which my gut tells
  me is the wrong way around.  I would rather see this kind of
  progression:

  static component -> static module -> dynamic component
  or directly
  static component -> dynamic component

  But this is mostly theory, and I'm getting ahead of myself ...

  Moreover, there's a real pragmatic problem in that, to turn a module
  into a component, I need to change its insides.  (Matthew will
  probably say that this is precisely the intention.)  This is OK for
  crack programmers who have the full source code at their disposal,
  and can perform these changes at will.  However, it falls down for
  people:

  - who have only a foggy understanding of the distinction between a
    module and a component, but do understand linking in C
  - who use third-party modules shipped in some kind of binary format,
    and need to do component things with it

  [Example: I once had a student write a networked version of Java AWT
  (in the days the AWT wasn't uncool yet).  There's no way to make
  existing AWT programs use it because they refer not just to any old
  implementation of the AWT interface, but *the official Sun AWT
  itself*.  This sucks.]

  I agree with Matthew that it's good to have a module have a defined,
  fixed semantics.  I disagree that this semantics should be attached
  to what's specified in REQUIRE exclusively---I would like that
  semantics to be fixed *in the context of a specific application*.
  (What's an "application"?  A collection of modules thrown together
  with all dependencies resolved.  Roughly.)
  This isn't all that different from the way MzScheme works in the
  real world---I can set a different PLTCOLLECTS aka CLASSPATH and
  thus change the context a program runs in.  (But Matthew has pointed
  out that he considers some of the uses of this I outlined as
  abuses.)

  To make a long story short: I think that module *interfaces*, not
  definitions, should have globally unique meanings, and that the
  implementation for any given interface should be determined at
  compile/link time.  This would mirror the way cc/ld works, and I
  think it's the way to go.

  This is not the same as turning modules into components.  I
  would still expect that any single interface in a given application
  has only one implementation.  But it leaves a possible conversion
  path from a module to a component in place---that path
  doesn't exist with MzScheme's current system, because the module
  bodies don't contain enough information.

  This would imply separating interfaces from modules, and would
  benefit from allowing several definitions per file.
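
  For illustration, a rough sketch of how this might look in the
  syntax proposed at the end of this message.  Only the form names
  (DEFINE-INTERFACE, DEFINE-MODULE, REQUIRE, CODE) and R6RS-CORE come
  from the proposal; LOGGER, CONSOLE-LOGGER, NULL-LOGGER, APP, and
  MY-APP are invented names, and lower case is used for readability:

    (define-interface logger
      (export log-message))

    ;; Two candidate implementations of the same interface, both in
    ;; one file.  A given application would link in exactly one.
    (define-module console-logger logger r6rs-core
      (code
        (define (log-message msg)
          (display msg)
          (newline))))

    (define-module null-logger logger r6rs-core
      (code
        (define (log-message msg)
          #f)))

    (define-interface app
      (export main))

    ;; The client names only the interface, never an implementation.
    (define-module my-app app r6rs-core
      (require logger)
      (code
        (define (main)
          (log-message "starting"))))

  Nothing in MY-APP says which logger it gets; under this reading,
  that choice would be made at compile/link time, much as with cc/ld.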

Revised proposal:

So---here is something that incorporates my changes.  Note that it's
syntactically closer to Scheme 48 than MzScheme, but it's really
Matthew's system semantically.  (That would be my characterization,
anyway.  He may disagree though.)

A summarizing note on the pragmatics of application organization: (I'm
avoiding the word "program.")

- An application consists of several files written in the
  configuration language whose syntax is outlined below.  (These may
  refer to other files written in other languages that are, however,
  also conceptually part of these same files.)

- Imports are of interfaces only, not implementations, and the
  references are by name.

- If any module or interface definition wants to refer to an interface
  not defined in the same file, it needs to specify a globally unique
  reference to that interface.

- A Scheme system might support some kind of interactive incremental
  loading facility for the configuration files constituting an
  application, or some cc/ld-like batch mode.  This involves some kind
  of mapping of concrete files/library collections into the global
  namespace.
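
As a sketch of the second and third points, here is how a reference
across files might be written in the syntax given below.  The path
components "net" and "http-interfaces" and all identifiers other than
the form names and R6RS-CORE are made up, and the concrete path syntax
is still undecided.  One file would define the interface:

  (define-interface http-client
    (export http-get http-post))

and a module in another file would import it by a globally unique
reference:

  (define-interface crawler-interface
    (export crawl))

  (define-module crawler crawler-interface r6rs-core
    ;; strings: hypothetical path to the defining file;
    ;; final identifier: the interface defined in that file
    (require (interface-ref "net" "http-interfaces" http-client))
    (code
      (define (crawl url)
        (http-get url))))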

Here is the syntax:

<config form> -> <interface def> | <module def>

<interface def> -> (DEFINE-INTERFACE <identifier> <interface exp>)

<interface exp> -> (EXPORT <export spec> ...)
                 | (COMPOUND-INTERFACE <identifier> ...)
                 | <interface name>

<interface name> -> <identifier> ; reference to interface in the same file
                  | <one of several global interfaces, such as R6RS-CORE>
                  | (INTERFACE-REF <string> ... <identifier>)
  ; a globally unique path specification of a file, followed by an interface
  ; identifier defined in that file

<export spec> -> <identifier>
               | (<identifier> :SYNTAX)
               | ((<identifier> ...) :SYNTAX)

<interface ref> -> 

<module def> -> (DEFINE-MODULE <identifier> ; its name
                               <interface name> ; export interface
                               <interface name> ; language designation
                  <clause> ...)
              | (DEFINE-MODULES ((<identifier> ; its name
                                  <interface name>) ; export interface
                                 ...)
                                <interface name> ; language designation
                  <clause> ...)

<clause> -> (REQUIRE <require spec> ...)
          | (CODE <command or definition> ...)
          | (FILES <abstract file name> ...)

<require spec> -> <interface name>
            | (ONLY <require spec> <identifier> ...)
                                 ; Restrict to specific ids
            | (EXCEPT <require spec> <identifier> ...)
                                 ; Remove specific ids
            | (ADD-PREFIX <require spec> <identifier>)
                                 ; Add a uniform prefix
            | (DROP-PREFIX <require spec> <identifier>)
                                 ; Remove a uniform prefix
            | (RENAME <require spec> (<identifier> <identifier>) ...)
                                 ; Rename
            | (ALIAS <require spec> (<identifier> <identifier>) ...)
                                 ; Add additional name
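
A small sketch exercising most of this grammar may help.  All names
other than the form names and R6RS-CORE are invented, and it assumes
that R6RS-CORE provides the usual R5RS bindings (DEFINE-SYNTAX,
SET-CAR!, and so on):

  (define-interface stack-procedures
    (export make-stack push! pop! stack-empty?))

  (define-interface stack-syntax
    (export (with-stack :syntax)))

  (define-interface stack
    (compound-interface stack-procedures stack-syntax))

  ;; One body of code defining two modules, one per interface.
  (define-modules ((stacks stack-procedures)
                   (stack-macros stack-syntax))
                  r6rs-core
    (code
      ;; A stack is a one-element list holding the list of items.
      (define (make-stack) (list '()))
      (define (push! s x) (set-car! s (cons x (car s))))
      (define (pop! s)
        (let ((x (caar s)))
          (set-car! s (cdar s))
          x))
      (define (stack-empty? s) (null? (car s)))
      (define-syntax with-stack
        (syntax-rules ()
          ((_ name body ...)
           (let ((name (make-stack))) body ...))))))

  (define-interface demo-interface
    (export run))

  (define-module demo demo-interface r6rs-core
    ;; Import only PUSH! and POP!, visible as STACK:PUSH! and STACK:POP!.
    (require (add-prefix (only stack-procedures push! pop!) stack:)
             stack-syntax)
    (code
      (define (run)
        (with-stack s
          (stack:push! s 1)
          (stack:pop! s)))))

Note that DEMO never imports MAKE-STACK directly; the sketch relies on
WITH-STACK referring to it hygienically from the macro's own
definition context.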

Note:

- We would still need to decide on a concrete syntax for path
  specifications, along the lines of what Java has.

- This is not any comment on the design of Chez-style modules; I think
  they should (modulo the names bound by it, plus phases) stay roughly
  the same, and be accorded library status.  Note that this proposal
  would really make Chez-style and "Mike-style" modules quite
  distinct.  (And I believe that's a good thing.)

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla