[R6RS] libraries

Thu Aug 17 19:45:28 EDT 2006

At Wed, 16 Aug 2006 11:39:14 -0400, dyb at cs.indiana.edu wrote:
> Unfortunately, this still requires an extra indirect over what would
> be possible with the simpler "one phase" model, [...] Worse still,
> env is a variable and will end up free in any code that references
> library bindings, i.e., most code. This means that many procedures
> that would not have required closures will now require closures,
> which increases run-time overhead in multiple ways.

I understand this. 

And, if I understood our conversation yesterday, you estimate a 10-15%
performance hit for the extra indirection and closures, right?

Does the 10-15% estimate already take into account that no indirection
or closure is needed for purely functional code that doesn't refer
(transitively) to mutable variables or syntax constants? Would all R6RS
functions be functional in this sense, so there's no penalty until a
programmer starts using a module (at phase 0) that includes a mutable
top-level variable or syntax constant?

Meanwhile, it sounds like Larceny always has an indirection and closure
anyway. Is that right? If so, does that mean there wouldn't be a
performance hit at this level for Larceny? Put another way, would
Larceny be 10-15% faster without the indirection?

> and it is also enough to
> inhibit some optimizations (perhaps all interlibrary optimization in less
> sophisticated compilers). 

I still don't understand this.

I can see how converting to `(env-ref n env)' too early in the
compilation pipeline might obscure references to bindings in a library
top level, and that would inhibit optimizations. But that just sounds
like a problem in the compiler.

In particular, I can see how viewing `library' as a macro over an
R5RS-like language would force an early conversion to `(env-ref n
env)'. In other words, a library-oblivious intermediate language is not
very good for compiling library-based programs. That makes sense to me,
since I think a top-level module form is a core construct, and not
merely syntactic sugar.
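
To make the two shapes of code concrete, here is a minimal sketch. It
assumes an environment is simply a vector of binding slots; the names
`env-ref', `lib-env', `make-sum-of-squares', and the slot numbering are
hypothetical illustrations, not any implementation's actual
representation.

```scheme
;; Hypothetical helper: an environment is just a vector of binding slots.
(define (env-ref n env) (vector-ref env n))

;; Direct form: `square' is a statically known binding, so a compiler
;; can see the callee and inline or specialize the calls.
(define (square x) (* x x))
(define (sum-of-squares a b)
  (+ (square a) (square b)))

;; Library-oblivious form: the reference to `square' has already become
;; slot 0 of a run-time `env'. Now `env' is free in the lambda body, so
;; a closure is needed, and the callee is opaque to later passes.
(define (make-sum-of-squares env)
  (lambda (a b)
    (+ ((env-ref 0 env) a) ((env-ref 0 env) b))))

(define lib-env (vector square))
(define sum-of-squares/env (make-sum-of-squares lib-env))

(display (sum-of-squares 3 4)) (newline)      ; prints 25
(display (sum-of-squares/env 3 4)) (newline)  ; prints 25
```

Both versions compute the same result; the difference is only in what
a later compiler pass can still see about the reference to `square'.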

Is there some deeper reason, though? It seems like a reference to X
from library L is statically a reference to X from library L, and so
it's available to all optimizations that a programmer should expect in
a lexically scoped language.

Finally, a question about the performance cost of not having phases. As
we've distilled it down, my interest in phasing is that if code refers
to an imported binding X, then X will definitely be bound at run time.
Consequently, you don't need an "is X defined?" check at run time when
referring to an imported X.

Doesn't the phaseless model mean that an "is defined?" check will be
needed sometimes, or is this check easy to avoid (even in less
sophisticated compilers)?
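
The check in question can be sketched as follows, assuming a variable
slot is modeled as a one-element vector holding a unique "unbound"
marker until its definition runs. All names here (`unbound',
`make-slot', `checked-ref', `unchecked-ref') are hypothetical, and the
error is replaced by a symbol so the example stays portable.

```scheme
;; Hypothetical unique marker stored in a slot before its definition runs.
(define unbound (list 'unbound))

;; A variable slot is modeled here as a one-element vector.
(define (make-slot) (vector unbound))
(define (slot-define! slot v) (vector-set! slot 0 v))

;; Checked reference: pays for an "is defined?" test on every use.
(define (checked-ref slot)
  (let ((v (vector-ref slot 0)))
    (if (eq? v unbound)
        'undefined-variable   ; stand-in for signaling an error
        v)))

;; Unchecked reference: valid only if the slot is known to be initialized.
(define (unchecked-ref slot) (vector-ref slot 0))

(define x-slot (make-slot))
(display (checked-ref x-slot)) (newline)    ; prints undefined-variable
(slot-define! x-slot 42)
(display (checked-ref x-slot)) (newline)    ; prints 42
(display (unchecked-ref x-slot)) (newline)  ; prints 42
```

If the import is guaranteed to be initialized before the referring code
runs, a compiler can emit the unchecked form; without that guarantee it
must either prove initialization some other way or fall back to the
checked form.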

Thanks,
Matthew