[R6RS] Scripts and toplevel code
Anton van Straaten
anton at appsolutions.com
Fri Aug 11 06:01:32 EDT 2006
Here's my take on the options for scripts. I'm sure it's longer than it
needs to be, but I wanted to make it available asap.
1. Summary of current proposals
The script proposal discussed in email by Kent and Mike addresses making
an R6RS program executable by prepending a Unix script header, along
with a few other details necessary for dealing with the script entry
point and command line arguments. The rationale for such scripts is the
same as that in SRFI 22, which boils down to having a standard way to
run an R6RS program, to support distribution of Scheme code that's
portable across platforms and Scheme implementations.
(Note that the Unix-specific aspect here does not interfere with making
the same scripts portable to Windows and (I believe) the Mac OSes.)
One major design decision for scripts is whether to support top-level
code in scripts, or whether to require scripts to be implemented as R6RS
libraries. Mike and Kent had settled on the library approach, with the
final proposal summarized here:
https://r6rs.scheming.org/node/324#comment-1652 . The following syntax
for a script was proposed:
<script header> <script spec> <library>
<script header> -> #!/usr/bin/env scheme-script <line break> #!r6rs
<script spec> -> (script <library name> <entry name>)
This proposal is very general and has numerous good properties (expanded
on in the Design Factors section below). However, it also results in a
minimal script being somewhat heavyweight. As I mentioned in an earlier
note, a "Hello world" script would look like this:
(script hello-world hello)
(library hello-world
  (export hello)
  (import r6rs)  ; import clause assumed
  (define (hello arguments)
    (display "Hello world!")))
To allow for less heavyweight scripts, a subsequent exchange between
Kent and me discussed supporting scripts which contain top-level code, as
an alternative to a library. This has resulted in the syntax proposed
at https://r6rs.scheming.org/node/344#comment-1680 :
#!/usr/bin/[env ]scheme-script <line break> [#!r6rs]
This allows both for more concise small scripts:
(display "Hello world!")
... as well as allowing for scripts which contain one or more libraries,
which can be invoked by top-level code.
The major difference between these two proposals centers around adding
support for top-level code. It has been proposed that the semantics for
such top-level code be equivalent to that of a library body.
2. Rationale for supporting top-level code
Currently, the only legal R6RS program consists purely of libraries
(some of which may be wrapped in scripts, per the existing proposal).
The library specification requires libraries to be explicitly named, and
to explicitly specify imports and exports. The script proposal adds
further requirements, resulting in the heaviness mentioned earlier. Not
all programs require this level of infrastructure.
One of the traditional benefits of Scheme has been a kind of scalability
along the dimension of source-code rigor: a program can start out as an
experiment in the REPL, progress perhaps via cut & paste to becoming a
relaxed script, with little structure, and later evolve into a more
rigorously structured program. This allows programs to evolve "from
scripts to programs" (a catchphrase used repeatedly by the PLT group, in
a slightly different but closely related context). This property of
Scheme is worth preserving, and worth addressing in the R6RS.
Scheme implementations will certainly provide ways of executing code
outside of libraries. If the R6RS doesn't address this possibility, then
it essentially becomes "the Scheme library specification", only
addressing the semantics of code inside R6RS libraries.
It's also worth noting that abstracting out the verbosity of the script
boilerplate is exactly the kind of thing that's often resolved with
macros in Scheme. However, in this case, that's not possible, because
the library specification prohibits macros from generating libraries.
This is an unusual situation in Scheme, and it provides an additional
reason to address the issue in the R6RS.
Finally, related to the issue of a relaxed syntax, the requirement that
all definitions precede expressions in library bodies is unnecessarily
restrictive for many kinds of lighter-weight applications, including
many scripting applications. Again, many Schemes are certain to
continue providing support for interleaved definitions and expressions
in top-level code, and if the R6RS can address this requirement in a
reasonable way, supporting it should be considered.
3. Design factors
The following covers major choices involved in a script specification
for the R6RS. The Design Rationale section of SRFI 22 is also relevant,
and not all of the points it raises will be repeated here.
The following general structure for scripts is assumed below:
<script header> <script spec> <script body>
These components are covered individually below. <script spec> is
addressed last, because it depends on the choices made for <script body>.
3.1 Script Header
The choices for <script header> are fairly constrained, and the proposal
so far (as specified above) seems fine. Making the #!r6rs specification
optional would help a little for quick & dirty scripts.
3.2 Script Body
3.2.1 Script Startup
There are two major choices for specifying how execution of a script
should be started:
(a) By executing top-level code (or code at the top level of a library).
(b) By executing a particular named procedure, with a name that is either:
(i) fixed by the specification, e.g. "main".
(ii) specified in the <script spec>, as in the current proposal.
For choice (a), a way to access the command line arguments is required,
e.g. a procedure named 'command-line'. For choice (b), command-line
arguments can be passed as ordinary arguments to the script entry procedure.
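As a minimal sketch of the two startup styles, the following combines them in one file: an entry procedure in the style of choice (b), invoked by top-level code in the style of choice (a). The name 'main' is purely illustrative, and the library names follow the published R6RS (rnrs ...) naming, which is an assumption relative to the proposals discussed here.

```scheme
#!r6rs
(import (rnrs base) (rnrs io simple) (rnrs programs))

;; Choice (b): the entry point is an ordinary procedure, and the
;; command-line arguments arrive as a list of strings.
;; The name `main` is hypothetical, not fixed by any proposal.
(define (main arguments)
  (for-each (lambda (arg) (display arg) (newline))
            arguments))

;; Choice (a): top-level code runs as soon as the script is evaluated,
;; so it has to fetch the command line itself via `command-line`.
(main (command-line))
```

With choice (b) alone, the last line would be supplied by the implementation rather than written in the script, which is what makes it possible to load such a script without running it.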
3.2.2 Syntax of Script Body
There are three major choices for the syntax of the script body:
(a) An explicit library definition, as in the current proposal. A
variation on this could support multiple explicit libraries in a script,
in which case some means is needed for selecting which library contains
the script startup code, e.g. via the <script spec>.
(b) A top-level program of some kind. This would most likely closely
resemble, or be identical to, a library body. Depending on the details,
this approach can complicate testing, debugging, and other kinds of
reuse, if merely loading a script causes it to be executed.
(c) Support both (a) and (b).
3.2.2.1 Related script body issues
If a script body is a library, it implies that other scripts and
libraries can import it. This is an advantage for debugging, testing,
and other reuse of scripts.
If a script body can contain multiple libraries, distribution of an
application as a set of libraries in a single script file becomes possible.
If a script body consists of top-level code, it raises the question of
whether import of a script by other libraries and scripts should be
supported. To support this via the existing library import mechanism,
it must be possible to treat the script body as a kind of library, which
means that at a minimum, it needs a name. One way to do this would be
to provide a name as part of <script spec>.
Alternatively, scripts containing top-level code could remain anonymous.
This would mean that they could not be invoked directly from other
Scheme code, unless procedures for that purpose are provided, such as
the 'load-script' and 'invoke-script' procedures described by Kent in:
3.2.2.2 Interleaved definitions and expressions
If the script body consists of top-level code, it could be specified to
support interleaved definitions and expressions, to provide a more
relaxed syntax, as mentioned in the rationale.
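To make the relaxed style concrete, here is a small sketch of a script body in which a definition follows an expression. A library body, which requires all definitions to come first, would reject this ordering. The import line uses the published R6RS (rnrs ...) library names as an assumption; the variable names are illustrative only.

```scheme
#!r6rs
(import (rnrs base) (rnrs io simple))

(define greeting "Hello, ")
(display greeting)          ; an expression...

(define target "world!")    ; ...followed by another definition
(display target)
(newline)
```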
Mike has pointed out that this issue is orthogonal to that of the
ability to portably execute scripts. For example, a new 'begin'-style
form could be provided to support interleaving. However, even if such a
form were provided, it could make sense to implement top-level script
bodies in terms of that form. Any conflation of concepts here seems
3.3 Script Spec
The purpose of <script spec> is to specify the name of the library
containing a script, and the procedure entry point within that library.
No <script spec> is needed if a script body consists of top-level code,
or of a single library containing top-level code, either of which,
when evaluated, causes the script to start executing, so that no other
entry point needs to be specified.
However, if a script contains multiple libraries, or if scripts are
started via a procedure rather than top-level code, then some means of
specifying the appropriate library and procedure is needed.
3.4 Other choices
There are some more minor choices, such as how to handle command line
arguments. To a large extent, such choices will be dictated by other
more major decisions, such as whether the script entry point is a
procedure. Discussion of other such minor details has been omitted.