[R6RS] libraries

Wed Aug 16 11:39:14 EDT 2006

Aziz Ghuloum and I have been working on a library implementation and, in a
parallel effort, so has Andre van Tonder.  Some efficiency issues have
come to light that, in my mind, casts doubt on our plan to go with phased
libraries.

The following is from Andre:

  Perhaps I can state more clearly what my discomfort is
  with the currently intended phase semantics.  I would like to
  compile the library

  (library foo
     (import (r5rs))
     (export f)
     (define f (let ((n 0))
                 (lambda () (set! n (+ n 1)) n)))))

  to something like

  (define foo.f (unspecified))
  (define (foo msg)
     (case msg
       ((invoke) (set! foo.f
                       (let ((n 0))
                         (lambda () (set! n (+ n 1)) n)))))
       ......)))

  References to f in client code inporting foo can then simply be
  expanded to foo.x and are thereby *statically* resolved.

  ...

  But when there is a separate dynamic environment for each meta-level, then as 
  far as I can see one is forced to compile the module to something like:

  (define (foo msg env)
     (case msg
       ((invoke) (env-set! foo.f
                           (let ((n 0))
                             (lambda () (set! n (+ n 1)) n)))
                           env))
       ......)))

  and references to f in client code must now be compiled to the
  equivalent of

    (env-lookup 'foo.f env-for-current-meta-level)

  This seems rather bad to me.  One has lost static resolution of
  references and the module instead can dynamically and unpredictably
  affect multiple meta-environments.  Unless I am missing something obvious,
  it seems that the intended semantics will make static analyses
  and optimizations more difficult.

Aziz and I (and probably Andre by now) don't believe that it's quite that
bad.  We should be able to produce instead something like:

    (env-ref n env)

where n is a constant offset determined at expansion time.

Unfortunately, this still requires an extra indirect over what would be
possible with the simpler "one phase" model, and it is also enough to
inhibit some optimizations (perhaps all interlibrary optimization in less
sophisticated compilers).  Worse still, env is a variable and will end up
free in any code that references library bindings, i.e., most code.  This
means that many procedures that would not have required closures will now
require closures, which increases run-time overhead in multiple ways.

It should be possible to eliminate the extra indirects and closures by
creating a new copy of all library code each time the library is visited
or invoked, and it should be possible to recover full optimization by
recompiling each library each time it is visited or invoked, but either
would make visiting or invoking a library even more expensive than we
had envisioned.  It would also bloat the run-time image, especially when
the code for a large library like r6rs has to be copied.

These implications should have been clear to me from Matthew's "compilable
and composable modules" paper, but I've been laboring under the impression
only syntax objects are affected and that the overhead would all be paid
at expansion and library invocation time.  This is not the case, and in a
high-performance Scheme implementations, the run-time overhead would be
significant.

Kent