[R6RS] I/O issues

Mon Aug 7 13:36:40 EDT 2006

Mike Sperber wrote:
> I've checked in revisions for the I/O SRFIs.  These mostly represent
> my attempt to go through them and identify open issues.

The newest version contains several enormously long lines,
which diminishes the usefulness of tools that track changes
to specific lines.

> It would be
> nice if we could make some progress on resolving them in the next
> conference.  Here's a list:
> 
> - The syntax for `transcoder' and `update-transcoder' is the way it is
>   so that extensions can be added.  Note the language saying that any
>   clause not covered by the spec is ignored, rather than causing an
>   exception to be raised.  Given that we're raising exceptions almost
>   everywhere else, maybe we should move to a procedural interface.

It is true that we have gone hog-wild about requiring standard
procedures to raise exceptions.  That makes them non-extensible,
which in turn encourages the proliferation of non-standard
procedures that do essentially the same thing as the standard
ones but add the extensions programmers find necessary.  I
believe that was unwise.  I think it would be especially unwise
to perpetuate this tendency in an io system, and even more
unwise to think no extensions will be necessary in an io system
that is still being designed less than a month before the first
public draft of R6RS.
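
To make the trade-off concrete, here is a minimal sketch in Python
rather than in the SRFI's syntax.  The names `make_transcoder`,
`KNOWN_CLAUSES`, `strict`, and the clause names are all hypothetical,
chosen only for illustration.  The lenient mode ignores clauses the
spec does not cover, so an implementation can add its own; the strict
mode raises an exception instead.

```python
# Hypothetical clause names; the real set would be whatever the spec defines.
KNOWN_CLAUSES = {"codec", "eol_style"}


def make_transcoder(strict=False, **clauses):
    """Build a transcoder description from keyword clauses.

    Lenient (default): clauses outside KNOWN_CLAUSES are silently ignored,
    leaving room for implementation-specific extensions.
    Strict: any unknown clause raises ValueError.
    """
    unknown = set(clauses) - KNOWN_CLAUSES
    if unknown and strict:
        raise ValueError("unknown transcoder clauses: " + ", ".join(sorted(unknown)))
    # Keep only the clauses this (hypothetical) spec knows about.
    return {name: value for name, value in clauses.items() if name in KNOWN_CLAUSES}
```

With the lenient behavior, a program written for an implementation
that adds an extra clause still runs elsewhere; with the strict
behavior, the same program fails on every other implementation.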

> - The eol-style issue is more complicated: Unicode, in order to
>   "resolve" the CR/LF mess, has an additional scalar value U+2028
>   meant to replace the old conventions eventually.  There are several
>   ways to deal with this:
> 
>  1. Ignore the issue.
> 
>  2. Make `get-line' accept U+2028 in addition to LF.
> 
>  3. Introduce another eol-style called "ls".  This will translate
>     CR/LF and LF to U+2028 on output.  On input, it's less
>     clear---supposedly no transcoding happens.  For the lf and crlf
>     styles, these all get transcoded to U+2028.  This means that
>     `get-line' (or whatever it ends up being called) would accept U+2028
>     as a line delimiter exclusively.
> 
>     This would probably be closest to the intentions of the Unicode
>     committee, but, I think, goes too far against common expectations
>     that LF delimits a line.  Think of (put-string "\n")---we don't
>     even have an escape for U+2028, and Latin-1 ... We might introduce
>     a replacement for `newline' (called `put-ls' or something), but
>     it's a mess.

I prefer #2.
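
As a rough illustration of what #2 means in practice, here is a
minimal sketch in Python, not the proposed Scheme API.  The helper
`get_line` and the constant `LINE_DELIMITERS` are hypothetical names,
and the "port" is just an in-memory text stream.  It assumes no
eol-style transcoding has happened, so CR is not treated specially.

```python
import io

# LF and U+2028 LINE SEPARATOR are both accepted as line delimiters.
LINE_DELIMITERS = ("\n", "\u2028")


def get_line(port):
    """Read up to the next LF or U+2028 and return the line without it.

    Returns None when the port is already at end of input.
    """
    chars = []
    while True:
        ch = port.read(1)
        if ch == "":
            # End of input: return the partial line, or None if nothing was read.
            return "".join(chars) if chars else None
        if ch in LINE_DELIMITERS:
            return "".join(chars)
        chars.append(ch)


port = io.StringIO("one\ntwo\u2028three")
lines = [get_line(port), get_line(port), get_line(port), get_line(port)]
```

Existing code that writes "\n" keeps working unchanged, and input
that already uses U+2028 is split at the same places.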

> - On the transcoder-updating issue, I suggest a compromise.  Add a
>   "settable" field to the transcoder (defaults to #f) that says
>   whether the port opened with the transcoder will support changing it
>   mid-stream.  Supply a `set-output-port-transcoder!' procedure that
>   changes the transcoder if the existing transcoder is settable.
>   (Instead of making the transcoder itself mutable, as suggested by
>   Will.)  This has the advantage that we can replace a settable
>   transcoder by a non-settable one, letting the implementation choose
>   an appropriate buffering strategy, and forcing no buffering only
>   when a settable transcoder is replaced by another settable one.

If I understand you correctly, you are proposing a side
effect to the underlying port rather than a side effect
to the transcoder.  While limiting such side effects to
ports by making ports default to immutable would be an
improvement over what was first proposed, I still don't
understand why you are opposed to side effecting the
transcoder instead of the port.

Furthermore, I still don't understand why you think a
dynamic change to the transcoder should disable buffering
on the underlying port.  Please explain.

(Perhaps I misunderstand, because your last sentence quoted
above appears to talk about settable transcoders even with
the compromise.  Please clarify.)

> - The syntax for `file-options' was originally meant to be extensible
>   by implementation-specific flags.  Should we keep it that way or
>   clamp down on the available flags?

I favor the extensibility.

> - If we restrict the construction of transcoders and file options to
>   what the report specifies, do we leave some room for procedures to
>   accept values created by system-specific facilities outside of R6RS?
>   (This is already the case for file names.)  Note that, if we do
>   restrict file options to what's specified, they make no sense for
>   readers and input ports.  In that case, should we still provide for
>   an optional file-options argument with, say, `open-file-input-port'?

We should not restrict.  We should leave room.  We should provide.

> - In what library do the condition types live?  As they're shared
>   between primitive I/O and port I/O, I suggest a separate library.

That's okay with me.

> - The &i/o-reader/writer and &i/o-port condition types provide a
>   reader, writer or port with a condition.  To what extent should we
>   require the various operations to include this in a condition?
>   Specifically, if, say, the read! procedure passed to
>   `make-simple-reader', raises an exception, should `reader-read!'
>   catch it and augment it with a &i/o-reader/writer condition?  (I'm
>   thinking not---these condition types are purely informational, but
>   in interactive operation and debugging.)

I favor simplification of the io conditions where feasible, to
bring them more in line with the rest of the simplified condition
hierarchy.  If I understand you correctly, you are saying you
can't foresee a good programmatic reason for requiring the extra
complexity.  If my understanding is correct, then I favor not
requiring the extra complexity.

Will