[R6RS] Safe/unsafe mode question

Tue Jul 4 15:18:49 EDT 2006

Mike wrote:
> I have a really naive question: Couldn't the safe/unsafe mode
> distinction be provided via the module system?  Like so:
> 
> - In an import form, you can specify the desired safety / debug /
>   etc. level you like, and you get something like "the safe version of
>   the R6RS core" or the "unsafe version of the list library".
> 
> - In a module body, you (optionally) designate a definition to only
>   apply to a (set of) safety / debug / etc. settings.
> 
> This leaves a burden with the implementor of a module to ensure some
> kind of semantic consistency between the different variants, but that
> consistency seems already tenuous with the existing proposals.

Safety (as opposed to expressing our disapproval of
scandalous coding practices) is mostly about global
invariants and the boundaries between modules, not
about the interiors of modules.

For a simple example, consider integer->char and char->integer.
The contract for integer->char says its argument must be a
Unicode scalar value, and its result will be a character.
The contract for char->integer says its argument will be a
character, and its result will be a Unicode scalar value.

If these contracts are not checked at run time, things can
go wrong.  In safe code, the checking can and does occur on
either side of the call, depending on what the compiler can
figure out statically.  A compiler like Twobit, for example,
will call one version of integer->char if it doesn't know
anything about the argument, will call integer->char:fix if
it knows the argument is a fixnum but isn't sure it is within
range, will call integer->char:idx if it knows the argument
is a non-negative fixnum, or will call integer->char:trusted
if it knows the argument is both a fixnum and a Unicode scalar
value.  So we already have multiple versions of standard
library procedures running around, even in safe mode.
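
As a rough illustration of that layering, here is a minimal R6RS
sketch.  The names integer->char:fix, integer->char:idx, and
integer->char:trusted come from the description above, but the
bodies are assumptions made for illustration and are not Twobit's
actual code; integer->char:full is a hypothetical name for the
fully checked entry point.

```scheme
(import (rnrs))

;; Stand-in for the unchecked primitive.  In portable R6RS the
;; standard integer->char still checks its argument, so this only
;; models a version that trusts its caller.
(define (integer->char:trusted n)
  (integer->char n))

;; Argument known to be a non-negative fixnum: only the range
;; check (excluding the surrogate range) remains.
(define (integer->char:idx n)
  (if (or (<= n #xD7FF) (<= #xE000 n #x10FFFF))
      (integer->char:trusted n)
      (assertion-violation 'integer->char
                           "not a Unicode scalar value" n)))

;; Argument known to be a fixnum, but it may be negative.
(define (integer->char:fix n)
  (if (>= n 0)
      (integer->char:idx n)
      (assertion-violation 'integer->char
                           "not a Unicode scalar value" n)))

;; Nothing known about the argument: full checking.
;; (Hypothetical name; the standard name cannot be redefined here.)
(define (integer->char:full x)
  (if (fixnum? x)
      (integer->char:fix x)
      (assertion-violation 'integer->char
                           "not a Unicode scalar value" x)))
```

Each version performs only the checks that the caller (or the
compiler) has not already discharged, so the full contract is
enforced once no matter which entry point is chosen.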

In Larceny, safe code maintains the global invariant that no
character is ever created except by converting a Unicode scalar
value into a character.  Because of that invariant, the
char->integer procedure doesn't have to check to see whether
the fixnum it returns is a Unicode scalar value, and string-ref
doesn't have to check to see whether the bits that it pulls
out of a string correspond to a Unicode scalar value.

In some other system, the global invariants may be different,
so the char->integer and string-ref procedures may, in safe
mode, have to check the values they are returning.  This
checking is especially likely in systems that interoperate
with other runtime systems (e.g. C/Java/C#) whose global
invariants and representations are different from Larceny's.

The responsibilities for maintaining the global invariants
of safe mode are divided between the compiler, the caller,
and the callee.  They are divided differently in different
systems, because different systems have different global
invariants.  That is why you cannot expect to get a truly
portable semantics for safety checking based on whether you
use the safe or unsafe version of specific modules.

In Larceny, for example, if we ignore the role of the
compiler, then using the safe, full-checking version of the
character procedures together with an unsafe version of the
string procedures will be enough to ensure that every call
to string-ref results in a character that corresponds to a
Unicode scalar value.

In some other implementation of Scheme, where the most
closely corresponding global invariant is enforced in part
by string-ref, using the safe, full-checking version of
the character procedures together with an unsafe version
of the string procedures will *not* be enough to ensure
that every call to string-ref results in a character that
corresponds to a Unicode scalar value.
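
The contrast between the two kinds of system can be modelled in a
few lines of R6RS.  Everything below is a hypothetical model, not
code from Larceny or any other implementation: characters are
represented by a record (model-char), strings by vectors, and all
procedure names are invented for the illustration.

```scheme
(import (rnrs))

;; A modelled character: a tag around an exact integer.
(define-record-type model-char (fields code))

(define (scalar-value? n)
  (and (integer? n) (exact? n)
       (or (<= 0 n #xD7FF) (<= #xE000 n #x10FFFF))))

;; Safe, full-checking character constructor.
(define (safe-integer->char n)
  (if (scalar-value? n)
      (make-model-char n)
      (assertion-violation 'safe-integer->char
                           "not a Unicode scalar value" n)))

;; System A: strings hold only characters built by the safe
;; constructor, so string-ref can rely on the global invariant
;; and skip the scalar-value check.
(define (make-string-a . codes)
  (list->vector (map safe-integer->char codes)))
(define (unsafe-string-ref-a s i)
  (vector-ref s i))

;; System B: strings are raw buffers that foreign code may also
;; fill, so string-ref shares responsibility for the invariant.
(define (safe-string-ref-b buf i)
  (safe-integer->char (vector-ref buf i)))
(define (unsafe-string-ref-b buf i)
  (make-model-char (vector-ref buf i)))   ; no check at all
```

In this model, pairing safe-integer->char with
unsafe-string-ref-a never yields a bad character, while pairing
it with unsafe-string-ref-b can yield one whenever the buffer
holds a value such as #xD800.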

> I guess this would bring us closer to Kent's proposal than Will's.  It
> seems Will's objections are mostly ambiguities in Kent's proposal
> (such as whether the responsibility for arity checking is with the
> caller or callee), but my impression is that this is a matter of
> specification.

The only way for us to specify a truly portable semantics
for safety checking is for us to try to specify all of the
global invariants and to assign specific responsibilities
for enforcing those invariants to different procedures and
modules.

For us to attempt that would be foolhardy.  Given the huge
loopholes we are ignoring in the semantics of our core
language, even some that we have (recklessly) created,
the idea that we would succeed at this far more ambitious
task is laughable.  Not only would we fail, we would
alienate every user and every implementor, because our
division of labor and our global invariants would not
match the reality of any real system, and there are sound
reasons for much of the diversity seen in real systems.

Will