Chapter 20

I/O

20.1 File names

The file names in most common operating systems, despite their appearance in most cases, are not text: For example, Unix uses null-terminated byte sequences, and Windows uses null-terminated sequences of UTF-16 code units. On Unix, the textual representation of a file name depends on the locale, an environmental setting. In both cases, a file name may be an invalid encoding and thus not correspond to a string. An appropriate representation for file names that covers these cases while still offering convenient access to file-system names through strings is still an open research problem. Therefore, R⁶RS allows specifying file names as strings, but also allows an implementation to add its own representation for file names.

20.2 File options

The flags specified for file-options represent only a common subset of meaningful options on popular platforms. The file-options form does not restrict the <file-options name>s, so implementations can extend the file options by platform-specific flags.
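As a minimal sketch of the standard subset only, the following program opens the same file twice for output with the no-fail flag. The file name and the helper procedures are hypothetical and exist only for this illustration; platform-specific flags are deliberately not shown, since their names depend on the implementation.

```scheme
(import (rnrs))

;; Hypothetical scratch file name, used only for this illustration.
(define demo-path "file-options-demo.tmp")

;; no-fail: do not raise if the file already exists; truncate it instead.
(define (overwrite-file! path bytes)
  (call-with-port (open-file-output-port path (file-options no-fail))
    (lambda (port) (put-bytevector port bytes))))

;; Read the whole file back as a bytevector.
(define (read-file-bytes path)
  (call-with-port (open-file-input-port path) get-bytevector-all))

(overwrite-file! demo-path #vu8(1 2 3))
(overwrite-file! demo-path #vu8(4 5))   ; succeeds and replaces the old contents
(define final-bytes (read-file-bytes demo-path))
(delete-file demo-path)
```

With the default (file-options), the second call would instead raise an exception because the file already exists.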

20.3 End-of-line styles

The set of end-of-line styles recognized by the (rnrs io ports (6)) library is not closed because end-of-line styles other than those listed might become commonplace in the future.
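For the styles that are listed, the effect can be sketched with a transcoder that uses the crlf style. The variable names below are chosen for this illustration; the sketch assumes an implementation that applies end-of-line conversion in bytevector->string and string->bytevector as the library report describes.

```scheme
(import (rnrs))

;; UTF-8 with CR LF line endings.
(define crlf-utf8 (make-transcoder (utf-8-codec) (eol-style crlf)))

;; Input direction: the bytes for "a", CR, LF, "b" are decoded with the
;; line ending folded into a single linefeed character.
(define decoded (bytevector->string #vu8(#x61 #x0D #x0A #x62) crlf-utf8))

;; Output direction: each linefeed is written as CR LF.
(define encoded (string->bytevector "a\nb" crlf-utf8))
```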

20.4 Error-handling modes

The set of error-handling modes is not closed because implementations may support error-handling modes other than those listed.
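Two of the listed modes, replace and raise, can be contrasted in a short sketch. The input bytevector and the variable names are made up for this illustration; the byte #xFF is used because it never occurs in well-formed UTF-8.

```scheme
(import (rnrs))

;; Bytes for "a", an invalid byte, and "b".
(define bad-utf8 #vu8(#x61 #xFF #x62))

(define lenient
  (make-transcoder (utf-8-codec) (eol-style lf) (error-handling-mode replace)))
(define strict
  (make-transcoder (utf-8-codec) (eol-style lf) (error-handling-mode raise)))

;; replace: the invalid byte becomes U+FFFD and decoding continues.
(define replaced (bytevector->string bad-utf8 lenient))

;; raise: decoding signals a condition of type &i/o-decoding.
(define raised
  (guard (e ((i/o-decoding-error? e) 'decoding-error))
    (bytevector->string bad-utf8 strict)))
```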

20.5 Binary and textual ports

The plethora of widely used encodings for texts makes providing textual I/O significantly more complicated than the simple model offered by R⁵RS. In particular, realistic textual I/O should address encodings such as UTF-16 that include a header word determining the “actual” encoding of the rest of the byte stream, stateful encodings, and textual formats such as XML, which specify the encoding in a header line. Consequently, a library implementing textual I/O should support specifying an encoding upon opening a port, but should also support opening a port in “binary mode” to determine the encoding and switch to “text mode”.

In contrast, arbitrary switching between “binary mode” and “text mode” is difficult to support, as it may interfere with efficient buffering strategies, and because the semantics may be unclear in the case of stateful encodings. Consequently, the (rnrs io ports (6)) library allows switching from “binary mode” to “text mode” by converting a binary port into a textual port, but not the other way around. The transcoded-port procedure closes the binary port to preclude interference between the binary port and the textual port constructed from it. Applications that read from sources that intersperse binary and textual data should open a binary port and use either bytevector->string or the procedures from the (rnrs bytevectors (6)) library to convert the binary data to text.
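Both patterns can be sketched briefly. The two data formats below are hypothetical and invented for this illustration: in the first, a single tag byte selects the codec for the rest of the stream; in the second, a length byte precedes a run of UTF-8 bytes inside otherwise binary data. Neither procedure handles truncated input.

```scheme
(import (rnrs))

;; Binary first, then text: read the tag byte from the binary port,
;; choose a codec (1 = UTF-8, anything else = Latin-1), and convert.
(define (read-tagged-text binary-port)
  (let* ((tag (get-u8 binary-port))
         (codec (if (eqv? tag 1) (utf-8-codec) (latin-1-codec)))
         ;; transcoded-port closes binary-port; only text-port is used
         ;; from here on.
         (text-port (transcoded-port binary-port (make-transcoder codec))))
    (call-with-port text-port get-string-all)))

;; Interspersed data: stay on the binary port and decode slices.
(define (read-length-prefixed-string binary-port)
  (let ((len (get-u8 binary-port)))
    (bytevector->string (get-bytevector-n binary-port len)
                        (make-transcoder (utf-8-codec)))))

;; #xCE #xBB is the UTF-8 encoding of U+03BB (Greek small letter lambda).
(define tagged
  (read-tagged-text (open-bytevector-input-port #vu8(1 #xCE #xBB))))

(define mixed-port (open-bytevector-input-port #vu8(2 #xCE #xBB 7)))
(define mixed-text (read-length-prefixed-string mixed-port))
(define trailing-byte (get-u8 mixed-port))   ; the port is still binary
```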

The separation of binary and textual ports enables creating ports from both binary and textual sources and sinks. It also makes it necessary to provide both binary and textual versions of many procedures.

20.6 File positions

Transcoded ports do not always support the port-position and set-port-position! operations: The position of a transcoded port may not be well-defined, and may be hard to calculate even when defined, especially when transcoding is buffered.
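Portable code can therefore test for the capability before using it. The helper below is a hypothetical name introduced for this sketch; it relies only on port-has-port-position? and port-position.

```scheme
(import (rnrs))

;; Returns the position if the port can report one, and #f otherwise.
(define (port-position-or-false port)
  (and (port-has-port-position? port)
       (port-position port)))

;; A binary bytevector port supports positions: after one byte has
;; been read, the next byte to be read is at index 1.
(define binary-port (open-bytevector-input-port #vu8(10 20 30)))
(define after-one-byte
  (begin (get-u8 binary-port)
         (port-position-or-false binary-port)))

;; For a transcoded port the answer depends on the implementation,
;; so the result may be #f.
(define text-port
  (transcoded-port (open-bytevector-input-port #vu8(#x61 #x62))
                   (make-transcoder (utf-8-codec))))
(define text-position (port-position-or-false text-port))
```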

20.7 Freshness of standard ports

Each port returned by standard-input-port, standard-output-port, and standard-error-port is fresh, so it can be safely closed or converted to a textual port without risking the usability of an existing port.
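A small sketch of the conversion case follows. The procedure name is hypothetical; each call obtains its own binary port and wraps it in a textual port, which closes only that fresh binary port. The sketch flushes rather than closes the textual port, which is a cautious choice for the illustration and not a requirement.

```scheme
(import (rnrs))

;; Wrap a fresh standard output port in a textual port for one line.
(define (write-line-to-stdout text)
  (let ((port (transcoded-port (standard-output-port) (native-transcoder))))
    (put-string port text)
    (put-char port #\newline)
    (flush-output-port port)))

(write-line-to-stdout "first")
(write-line-to-stdout "second")

;; The port returned by current-output-port is unaffected.
(display "third")
(newline)
```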

20.8 Argument conventions

While the (rnrs io simple (6)) library provides mostly R⁵RS-compatible procedures for performing textual I/O, the (rnrs io ports (6)) library uses a different convention for argument ordering. In particular, the port is always the first argument. This enables the use of optional arguments for information about the data to be read or written, such as the range in a bytevector. As this convention is incompatible with the convention of (rnrs io simple (6)), corresponding procedures have different names.
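The difference can be seen side by side in a short sketch. The variable names are chosen for this illustration; the procedures themselves are those of the two libraries.

```scheme
(import (rnrs))

;; (rnrs io simple (6)): the datum comes first, the optional port last.
(define simple-style
  (call-with-string-output-port
    (lambda (port)
      (write-char #\a port)
      (display "bc" port))))

;; (rnrs io ports (6)): the port comes first, which leaves room for
;; optional start and count arguments at the end.
(define ports-style
  (call-with-string-output-port
    (lambda (port)
      (put-char port #\a)
      (put-string port "xbcx" 1 2))))   ; start 1, count 2: writes "bc"

;; The same range convention applies to bytevectors.
(define byte-range
  (call-with-bytevector-output-port
    (lambda (port)
      (put-bytevector port #vu8(0 1 2 3 4) 1 3))))   ; bytes at indices 1 to 3
```

Both textual variants produce the string "abc"; only the argument order and the availability of range arguments differ.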