[R6RS] Safe/unsafe mode
dyb at cs.indiana.edu
Thu Jul 13 01:08:27 EDT 2006
> The relevance of this technique to our discussion is
> that it divides the responsibility between caller and
> callee in a way that Kent probably has not considered.
I hope you didn't place any bets on that.
> In addition to a semantics, many programmers like to
> have a mental model of how the semantics might be
> implemented. For my semantics, a simple mental model
> is that every procedure has two entry points, one for
> when it is called from safe code, and another that may
> be used (but does not have to be) when the procedure
> is called from unsafe code. Even if the second entry
> point is called from unsafe code, it may behave exactly
> the same as the safe entry point.
I'm not sure this model explains everything, but never mind that. The
correspondingly simple model for my semantics is that each identifier
exported from a standard library has two bindings, a safe one referenced
from safe code and an unsafe one referenced from unsafe code. Of course,
the unsafe one may behave exactly the same as the safe one.
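As a rough illustration of this two-binding model (a sketch only, not how
any real system implements it; the names safe-neg, unsafe-neg, bindings,
and resolve are made up for the example, and the three-argument call to
error assumes the R6RS/Chez convention):

```scheme
;; Hypothetical model only; not how any real system implements primitives.
;; Safe binding: validates its argument before negating.
(define (safe-neg x)
  (if (number? x)
      (- x)
      (error 'safe-neg "argument is not a number" x)))

;; Unsafe binding: stands in for a version that is allowed to skip the
;; check. (In portable Scheme the underlying - still checks.)
(define (unsafe-neg x)
  (- x))

;; One name, two bindings, selected by the mode of the referencing code.
(define bindings
  (list (cons '(neg . safe) safe-neg)
        (cons '(neg . unsafe) unsafe-neg)))

(define (resolve name mode)
  (cdr (assoc (cons name mode) bindings)))

;; The reference happens in safe code, so f holds the safe binding,
;; no matter where f is later called from.
(define f (resolve 'neg 'safe))
(f 5) ; => -5
```

The point of the sketch is that safety is decided where the identifier is
referenced, and the resulting value carries that decision with it.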
> Now for an example:
>   (let ((f -))
>     (declare unsafe)
>     (f '(a)))
> With my preferred semantics, and with the mental model
> I sketched above, the unsafe call to f is allowed to
> use the unsafe entry point when it calls the - procedure,
> so all bets are off. My preferred semantics allows the
> compiler optimization known as copy propagation, so the
> example above is equivalent to
>   (let ()
>     (declare unsafe)
>     (- '(a)))
My preferred semantics allows copy propagation as well, but the example
above is equivalent to:
Just to prove I'm not making all of this up as we go along, I'll make this
concrete by showing how my preferred semantics has been implemented in Chez
Scheme for the past ten years or more. You can even download Petite Chez
Scheme and follow along, if you like.
In the output of the expander, Chez Scheme identifies each primitive
reference as safe or unsafe using the syntax (\#primitive n prim),
where for historical reasons, n=2 for safe and n=3 for unsafe.
(\#primitive n prim) can be abbreviated #n%prim. A primitive reference
resulting from a reference to an import from the scheme (or r5rs,
etc.) module is expanded into #2%prim by default, except in optimize-level
3 (unsafe) code, where it is expanded into #3%prim. The \#primitive or
#n%prim syntax can also be used directly in source code.
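To make the marking step concrete, here is a toy pass over S-expressions.
It is only a sketch under stated assumptions, not the Chez Scheme
expander: known-primitives, level-for, and mark are hypothetical names,
the marked form is written (primitive n prim) rather than the real
syntax, and scoping, quote, and shadowing are ignored.

```scheme
;; Hypothetical stand-in for the set of imported primitive names.
(define known-primitives '(+ - car cdr))

;; Map an optimize-level to a marker: 3 means unsafe, anything else safe.
(define (level-for optimize-level)
  (if (= optimize-level 3) 3 2))

;; Wrap each bare primitive reference as (primitive n prim).
;; References that are already marked are left untouched.
(define (mark expr n)
  (cond ((and (symbol? expr) (memq expr known-primitives))
         (list 'primitive n expr))
        ((and (pair? expr) (eq? (car expr) 'primitive))
         expr)
        ((pair? expr)
         (cons (mark (car expr) n) (mark (cdr expr) n)))
        (else expr)))

(mark '(f (- x)) (level-for 2))
;; => (f ((primitive 2 -) x))
(mark '((primitive 3 -) (- x)) (level-for 2))
;; => ((primitive 3 -) ((primitive 2 -) x))
```

The second call shows the case of an explicit marker in source code: it
survives, while the unmarked reference picks up the default for the
current level.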
The optimizer propagates the entire \#primitive form, not just the
primitive name, as the following transcript shows.
Chez Scheme Version 7.0a
Copyright (c) 1985-2005 Cadence Research Systems
> (expand/optimize '(let () (import scheme) (let ((f +)) (f x))))
> (expand/optimize '(let ((f #2%-)) (f x)))
> (expand/optimize '(let ((f #2%-)) (#3%- (f (#3%- x)))))
(#3%- (#2%- (#3%- x)))
> (expand/optimize '(let ((f #3%-)) (f x)))
> (expand/optimize '(let ((f #3%+)) (f x)))
The only thing presently missing in Chez Scheme is a way to locally mark a
subexpression of a top-level expression (larger than an identifier) safe
or unsafe. Instead, the entire top-level expression is treated as safe or
unsafe depending on the value of the optimize-level parameter, except for
primitives explicitly marked with #2% or #3%.
> I believe that Kent's preferred semantics, if and when
> we see it, will be sensitive to the name by which a
> procedure is called. Calling a procedure by one of its
> names (e.g. -) will probably not be equivalent to calling
> the procedure by another of its names (e.g. f). Beyond
> that speculation, I will just await Kent's semantics.
It is not sensitive to the name by which a procedure is called, but rather
to whether the value is a safe or unsafe version of the primitive.
> The relevance of this to the MLton compiler is that, if
> the compiler's flow analysis can establish that, for a
> call to f, - is the only procedure that can flow to f,
> calling f will generate the same code as for a call to
> -. This is true for some Scheme compilers also, even
> (occasionally) for Twobit.
This is true in Chez Scheme as well, but whether the safe or unsafe
version of - is propagated depends on whether the original reference to -
was to the safe or unsafe version, so that the semantics is the same
whether the copy propagation occurs or not.
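A toy version of this kind of propagation can be sketched as follows.
This is an illustration under simplifying assumptions, not the Chez
Scheme optimizer: subst and propagate are hypothetical names, marked
primitives are written (primitive n prim), only a single-binding let is
handled, and shadowing and quote are ignored.

```scheme
;; Replace every occurrence of the symbol var in expr with val.
(define (subst var val expr)
  (cond ((eq? expr var) val)
        ((pair? expr)
         (cons (subst var val (car expr))
               (subst var val (cdr expr))))
        (else expr)))

;; Handle only the shape (let ((var val)) body).
(define (propagate expr)
  (let* ((binding (car (cadr expr)))
         (var (car binding))
         (val (cadr binding))
         (body (caddr expr)))
    (subst var val body)))

;; The whole marked form is copied, so the safe reference stays safe
;; even when it lands next to an unsafe one.
(propagate '(let ((f (primitive 2 -))) ((primitive 3 -) (f x))))
;; => ((primitive 3 -) ((primitive 2 -) x))
```

Because the marker travels with the value, the result means the same
thing whether or not the substitution is performed.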