[R6RS] draft Unicode SRFI

Thu Jun 30 11:15:37 EDT 2005

Matthew Flatt <mflatt at cs.utah.edu> writes:

> At Thu, 30 Jun 2005 08:24:33 +0200, Michael Sperber wrote:
>> >  * I added the \<eol><whitespaces> and \<space> string escapes, as
>> >   discussed on this list.
>> 
>> I think I probably missed \<space>---does this mean I can only
>> terminate a variable-length escape sequence with a space?
>
> No. It's so you can terminate a \<eol> sequence and continue with spaces.
> See Kent's message here for an example:
>
>  http://mailman.iro.umontreal.ca/mailman/private/r6rs/2005-June/000655.html

Ah, thanks for pointing that out.  So what about tabs?  This whole
thing just seems very kludgy and marginal to me, especially if we've
got here strings.
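
As a concrete reading of the two escapes under discussion, here is a rough
Python sketch. It assumes that \<eol><whitespaces> drops the line break
together with the indentation that follows it, and that \<space> stands for
one literal space. The function name and the `skipped` parameter are made up
for illustration; whether tabs belong in `skipped` is exactly the open
question above.

```python
def expand_line_escapes(body: str, skipped: str = " \t") -> str:
    """Expand only \\<eol><whitespaces> and \\<space>; copy everything else."""
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body):
            nxt = body[i + 1]
            if nxt == "\n":
                # \<eol><whitespaces>: drop the line break and the
                # leading whitespace of the next line (assumed behavior).
                i += 2
                while i < len(body) and body[i] in skipped:
                    i += 1
                continue
            if nxt == " ":
                # \<space>: contributes exactly one literal space and
                # thereby ends the run of skipped whitespace.
                out.append(" ")
                i += 2
                continue
        out.append(ch)
        i += 1
    return "".join(out)
```

Under these assumptions, "abc\<eol>    def" reads as "abcdef", while
"abc\<eol>    \  def" reads as "abc  def": the \<space> ends the skipped
indentation, and the space after it is kept as ordinary text.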

> I think we want Unicode symbols to be in Scheme symbols, for example.

OK.

>> It seems to me we at least should exclude Unicode separators.
>
> Separators are defined by SRFI-14 to be whitespace, right?

Ah, OK, I see what you mean: they're part of the whitespaces, so I
guess that's OK.  But I'm still worried about things like
punctuation.  I'd rather have a positive than a negative definition of
symbol constituents, anyway, for the same reasons Marc mentioned for
the symbol syntax in general when the -> thing came up.
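
The difference between the two styles of definition can be sketched in Python
with the standard `unicodedata` module. Both predicates are hypothetical and
only meant for characters above U+007F; the choice of letters, numbers and
symbols as the allowed general categories is an assumption for illustration,
not what the draft says.

```python
import unicodedata

def constituent_negative(ch: str) -> bool:
    """Negative definition: anything that is not a Unicode separator."""
    return unicodedata.category(ch)[0] != "Z"

def constituent_positive(ch: str) -> bool:
    """Positive definition: only explicitly listed general categories.

    Assumed whitelist: letters (L*), numbers (N*) and symbols (S*).
    """
    return unicodedata.category(ch)[0] in "LNS"
```

With these definitions a letter such as U+03BB and a symbol such as U+2192
pass both tests, and the line separator U+2028 fails both. A punctuation
character such as U+00AB passes the negative test but fails the positive
one, which is the kind of case the worry about punctuation is about.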

>> - Could the SRFI please have an issues section where the things we
>>   haven't agreed on are listed?
>
> Ok, I'll add that.

It occurred to me that the bar notation for symbol/identifier literals
may also have some unpleasant interaction with the mantissa-width
notation.

>> - The document says "any C string literal is also a Scheme string
>>   literal": I don't believe that's true anymore, as the \x syntax is
>>   variable-length in C.
>
> In that case, I favor changing \x, but...
>
>>  (The sentence is literally true, I guess, but
>>   not in a meaningful way.)  As a result, I'm pretty confused on the
>>   compatibility issue---if we're not compatible with C, we could also
>>   make octal escapes fixed-length at least, to make the whole
>>   scalar-value-literal issue a little less patchwork than it seems
>>   now. Compatibility with C and Java should also be in the issues
>>   section probably.
>
> ... there seems to be more support among the editors to ditch octal and
> not worry about complete compatibility with C. That's ok with me.
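
The \x problem can be made concrete with a small Python sketch. In C, a
hexadecimal escape consumes every hex digit that follows \x, so the same
characters denote different values under a variable-length rule and a
fixed-length rule. The function names and the two-digit width are
assumptions for illustration only, not what the draft specifies.

```python
HEX_DIGITS = set("0123456789abcdefABCDEF")

def read_hex_variable(text: str, start: int) -> tuple[int, int]:
    """C-style rule: consume every hex digit after \\x.

    Returns (value, index of the first unconsumed character).
    """
    end = start
    while end < len(text) and text[end] in HEX_DIGITS:
        end += 1
    if end == start:
        raise ValueError("expected at least one hex digit")
    return int(text[start:end], 16), end

def read_hex_fixed(text: str, start: int, width: int = 2) -> tuple[int, int]:
    """Fixed-length rule: consume exactly `width` hex digits (assumed width)."""
    digits = text[start:start + width]
    if len(digits) != width or any(d not in HEX_DIGITS for d in digits):
        raise ValueError("expected exactly %d hex digits" % width)
    return int(digits, 16), start + width
```

For the digits "41BC" after \x, the variable-length reader yields the single
value 0x41BC, whereas the two-digit reader yields 0x41 and leaves "BC" as
ordinary text. A literal written for one rule therefore changes meaning
under the other.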

I'm more confused than ever: I asked the question explicitly whether
compatibility with C and/or Java was important, and only Marc
replied---but we have some weird mix now that I'm not positive is
really going to make things zippy for the C/Java people.  I asked if
people actively preferred the \[xuU] notation over Gambit-C's, and the
situation is similarly confused.  Marc came closest by saying:

http://mailman.iro.umontreal.ca/mailman/private/r6rs/2005-February/000422.html

> Although Gambit has supported this notation for some time now, I'm not
> convinced it is really the best approach.  I think a syntax that is
> shared by characters and strings would be better (and have a single
> unified syntax).

The SRFI draft doesn't have a completely shared syntax, and I made a
proposal to make them consistent with Gambit-C's syntax.  I may not be
reading the discussion right.  (It's not worth keeping up the SRFI
submission.)

-- 
Cheers =8-} Mike
Friede, Völkerverständigung und überhaupt blabla