[R6RS] draft Unicode SRFI
Thu Jun 30 02:25:34 EDT 2005
Good work---many thanks!
I've got a number of comments---as usual, I'm staying silent on the
stuff I like and sticking to the criticism, hopefully constructive.
Matthew Flatt <mflatt at cs.utah.edu> writes:
> * I added the \<eol><whitespaces> and \<space> string escapes, as
> discussed on this list.
I think I probably missed \<space>---does this mean I can only
terminate a variable-length escape sequence with a space? If so, that
> * Marc's here-string and quoted symbols are in.
> To discuss:
Whatever we don't resolve here should probably go into an issues section.
> * I left octal escapes for strings intact for compatibility with C.
> (Also, I actually use them --- perhaps from spending too much time
> with UTF-8 encoding.) There's no octal for characters, though.
Did you use octal escapes to denote UTF-8 code units or actually
scalar values? (I'm still opposed.)
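To make the two readings concrete, here is a minimal sketch in Python (chosen purely for illustration; it is not part of the draft). The character U+00E9 is an assumed example, and Python's own octal escapes stand in for the ones under discussion.

```python
# Illustration only: U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is an assumed example.

# Reading 1: one octal escape denotes one Unicode scalar value.
as_scalar_value = "\351"        # a one-character string, U+00E9

# Reading 2: each octal escape denotes one UTF-8 code unit (byte).
as_code_units = b"\303\251"     # two bytes, 0xC3 0xA9

# Both spell the same character, but with a different number of escapes.
assert ord(as_scalar_value) == 0xE9
assert as_code_units.decode("utf-8") == as_scalar_value
```

Under the first reading the literal needs one escape per character; under the second it needs one per encoded byte, which ties the source text to UTF-8.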
> * I added an extension for symbols that allows any non-whitespace
> character above 127 where a <letter> is allowed. Is this too
Yes. Shouldn't we at least restrict to Unicode letters and numbers?
It seems to me we at least should exclude Unicode separators.
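As a rough sketch of what such a restriction could look like, the Python snippet below classifies characters by Unicode general category. The predicate names are hypothetical and the rule is only one possible reading of "letters and numbers", not anything the draft specifies.

```python
import unicodedata

def is_letter_or_number(ch):
    """Hypothetical predicate: general category starts with L (letter) or N (number)."""
    return unicodedata.category(ch)[0] in ("L", "N")

def is_separator(ch):
    """Hypothetical predicate: general category Zs, Zl, or Zp."""
    return unicodedata.category(ch)[0] == "Z"

# All of these are above 127; only the first two are letters or numbers.
samples = {
    "\u03bb": "GREEK SMALL LETTER LAMDA (Ll)",
    "\u0663": "ARABIC-INDIC DIGIT THREE (Nd)",
    "\u00a0": "NO-BREAK SPACE (Zs)",
    "\u2028": "LINE SEPARATOR (Zl)",
    "\u2192": "RIGHTWARDS ARROW (Sm)",
}
for ch, name in samples.items():
    print(name, is_letter_or_number(ch), is_separator(ch))
```

A rule based only on "not a separator" would still admit the arrow, while a letters-and-numbers rule would reject it, so the two restrictions are not equivalent.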
> Also, should we try to allow `->' as a symbol at this
If we talk about symbols at all, we should. (More on that below.)
- Could the SRFI please have an issues section where the things we
haven't agreed on are listed? On that list, by me, are:
o duplication and potential confusion between #\linefeed and
o alternate syntax for numerical scalar values in character and
o Anton's (Perl's) generalization of here strings
Otherwise, we'll just run around the same block on the SRFI list as
here needlessly. (We'll probably run around it anyway, but this
way, at least it doesn't drop out of the sky for the readers.
Politically, I also think we'll get into trouble with some of the
potential participants of the discussion if we don't at least agree
on what the issues are.)
- I think the delimiter issue for character literals could use an
example. Otherwise, the point may get lost on the casual reader.
- The document says "any C string literal is also a Scheme string
literal": I don't believe that's true anymore, as the \x syntax is
variable-length in C. (The sentence is literally true, I guess, but
not in a meaningful way.) As a result, I'm pretty confused on the
compatibility issue---if we're not compatible with C, we could also
make octal escapes fixed-length at least, to make the whole
scalar-value-literal issue a little less patchwork than it seems
now. Compatibility with C and Java should also be in the issues section.
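The difference between a variable-length and a fixed-length hex escape can be sketched with a small scanner. This is Python for illustration only; `scan_hex_escape` is a hypothetical helper, and the fixed length of 2 is an assumption for the example, not the draft's rule.

```python
HEX_DIGITS = "0123456789abcdefABCDEF"

def scan_hex_escape(text, start, fixed_len=None):
    """Scan hex digits beginning at text[start], just past the backslash-x.

    fixed_len=None: consume every hex digit that follows (the C-style reading).
    fixed_len=n:    consume exactly n hex digits.
    Returns (value, index of the first unconsumed character).
    """
    end = start
    while end < len(text) and text[end] in HEX_DIGITS:
        if fixed_len is not None and end - start == fixed_len:
            break
        end += 1
    if end == start or (fixed_len is not None and end - start != fixed_len):
        raise ValueError("malformed hex escape")
    return int(text[start:end], 16), end

literal = r"\x41BC"   # raw string: backslash, x, 4, 1, B, C

print(scan_hex_escape(literal, 2))               # (16828, 6): all of "41BC" is consumed
print(scan_hex_escape(literal, 2, fixed_len=2))  # (65, 4): "BC" is left as ordinary text
```

The same six characters therefore denote different strings under the two rules, which is why a variable-length escape needs some terminator if ordinary hex-digit characters may follow it.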
- I think the whole symbol-syntax issue hasn't been discussed
adequately on the list, and I think it doesn't need to be in this
SRFI. If it is, there should be an entry in the issues section, and
a rationale. Moreover, as you pointed out, there should be a
grammar for the lexical syntax.
- What are your plans wrt the reference implementation? In my mind,
we could and should provide one for most of it. I'd be happy to
- I don't understand how I could portably use the locale functionality
in my code, since the document doesn't specify a single string I
might use as a locale name. Also, the locale stuff could (and, to
my mind, should) have a reference implementation for at least some
locales from the Unicode standard. (We could bum a starting point
off Alex Shinn, I think.)
- The sentence on UnicodeData.txt should probably be expanded a little
bit and include a link to
understandable by non-Unicode-wizards.
- The section on here strings should probably refer to the scsh
manual, and possibly to the manuals of PLT Scheme and Gambit-C.
- "can be includes" -> "can be included"
- "returna" -> "return a"
- The document makes out Neil Van Dyke as an R6RS editor.
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla