Formal comment #201 (defect) bytevector aliasing severely impedes optimizations Reported by: Brad Lucier Version: 5.92 I presume that many people might want to use bytevectors as described in this report to increase computational speed and decrease memory requirements by avoiding boxing/unboxing of objects that might otherwise be boxed when held in generic vectors. At least, I don't see any other way to get this type of performance or reduce memory in this way. By having only one type of bytevector that aliases all of 32-bit integers, 64-bit integers, 32-bit IEEE 754 numbers, and 64-bit IEEE 754 numbers, optimization opportunities for compilers are severely degraded. One does not know, for example, whether storing a 64-bit IEEE double into bytevector A changes the value of a 32-bit integer read from bytevector B without actually checking whether A and B are the same objects and whether the range of indices used to access A and B overlap. This very problem has been recognized by recent C standards, which forbid such types of aliasing except by going through (char *). (The proposed R6RS bytevectors would propose a problem for more than Scheme->C compilers, however---it is a library design problem.) It could be said by analogy that the proposed R6RS libraries offer *only* a (char*) (more tamed than in C), which solves one small class of problems (how to allow semi-portable, low-level translation between data types that can be considered sequences of bytes; how to write I/O device drivers; ...) while completely ignoring a much larger and more important class of problems (allowing fast and memory- efficient access to large arrays of homogeneous numerical data). RESPONSE: Bytevectors are a general-purpose mechanism for representing sequences of bytes that may encode various things, including various sequences of numbers. It is true that bytevectors are the only standard data type of R6RS that could be used to encode sequences of unboxed numbers. It is also true that the use of unencapsulated bytevectors for this purpose would lead to aliasing between bytevectors that are used for unrelated purposes, which would reduce the effectiveness of compiler optimizations. The solution to this problem is a general-purpose idiom that represents values of an abstract data type T as instances of a sealed and opaque record type that encapsulates bytevectors and/or other general-purpose structures that encode the state of the value. Since the record type is sealed, it is impossible for non-record values or instances of other record types to alias to values of T. Since the record type is opaque, it is impossible to get at the encapsulated state except by going through the operations of T. If the operations of T that create values of T do not introduce aliasing between bytevectors, and none of the operations of T expose the encapsulated bytevectors, then aliasing between the bytevectors encapsulated within values of T will be impossible, and compilers can rely on that fact. The sealed and opaque features of R6RS records, together with R6RS bytevectors, make it possible to construct a portable reference implementation for new data types that provide fast and memory-efficient arrays of homogeneous numerical data. The design and implementation of those data types is beyond the scope of R6RS, but the R6RS provides the machinery needed to implement them effectively.