[R6RS] R6RS Unicode SRFI controversial issues

Sat Jun 18 16:43:50 EDT 2005

Marc Feeley wrote:
> The #<< is followed (to the end of line) by a keyword that must come by
> itself on a line to indicate the end of the string.  All the characters
> between #<<KEY<newline> and <newline>KEY<newline> are included verbatim
> in the string (note that the <newline> before the keyword is not
> included, so that it is possible to have multiline strings that don't
> end in a newline).
> 
> Gambit also has the #<X...X syntax of Scsh (where X is an arbitrary
> character which delimits the characters of the string).  It is
> analogous to the \verb|...| form of latex.
> 
> How do people feel about these syntaxes?  Based on your response we could:
> 
> 1) drop the here-string syntax and allow newlines in the normal string
> syntax
> 
> 2) adopt the here-string syntax and allow newlines in the normal string
> syntax
> 
> 3) adopt the here-string syntax and disallow newlines in the normal
> string syntax
> 
> My preference is option 3.

Of these options, I'm in favor of option 2.  Mixing code of various 
kinds has become very common in all sorts of contexts, what with the 
web, XML, SQL, and Schemes hosted in other language environments.  It 
would be useful for Scheme to be able to handle this well.

I'd also be open to alternative means to the same end, i.e. I'm not 
wedded to the here-string syntax.  However, it does have the benefit of 
being well-known and implemented in at least a couple of Schemes.

Re disallowing newlines in normal strings, R5RS and current 
implementations already allow them, so banning them would require a 
strong argument.  I don't find the argument that such strings are hard 
for humans to parse strong enough.

One observation about the here-string syntax as proposed above: 
Perl-style here-strings (see e.g. [1]) support code like this:

   $data = foo(<<THING, arg2, arg3);
   This is a long quoted line.
   THING

IIUC, the Scsh/Gambit approach would require something more like this:

   (set! data (foo #<<THING
   This is a long quoted line.
   THING
   arg2 arg3))

...which seems quite a bit less readable to me.  Are there any drawbacks 
to supporting something more like the following?

   (set! data (foo #<<THING arg2 arg3))
   This is a long quoted line.
   THING

Here, the end-of-string token (THING in this example) would 
presumably need to follow the rules for identifier syntax, but that 
seems OK to me - it should always be possible to construct a 
sufficiently distinct identifier.
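
To make the Perl-style reading rule concrete, here is a rough sketch of it as a source-to-source pass, written in Python purely for illustration. Everything in it is an assumption rather than a description of Scsh, Gambit, or any R6RS draft: the names expand_here_strings and quote_string are hypothetical, the key is matched with a simplified identifier pattern, the terminator must appear alone on its line with no indentation, and the pass does not notice #<< tokens that sit inside ordinary strings or comments. It also assumes newlines are allowed in normal strings (option 2), since multi-line bodies are emitted as literals containing raw newlines.

```python
import re

# Assumed token shape: "#<<" followed by an identifier-like key.
# The character class is a simplification of Scheme identifier syntax.
HERE_TOKEN = re.compile(r"#<<([A-Za-z_][A-Za-z0-9_-]*)")


def quote_string(body):
    """Wrap body in double quotes, escaping backslashes and quotes."""
    escaped = body.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def expand_here_strings(source):
    """Rewrite Perl-style here-strings into ordinary string literals.

    Each #<<KEY on a line takes its body from the lines that follow,
    up to a line consisting of KEY alone. The rest of the line that
    held the token is kept, so arguments after the token stay in place.
    """
    lines = source.split("\n")
    out = []
    i = 0
    while i < len(lines):
        line = lines[i]
        i += 1
        bodies = []
        # Several tokens on one line consume their bodies in order.
        for key in HERE_TOKEN.findall(line):
            body = []
            while i < len(lines) and lines[i] != key:
                body.append(lines[i])
                i += 1
            if i == len(lines):
                raise SyntaxError("unterminated here-string: " + key)
            i += 1  # skip the terminator line
            # The newline before the terminator is not part of the string.
            bodies.append("\n".join(body))
        pending = iter(bodies)
        out.append(HERE_TOKEN.sub(lambda m: quote_string(next(pending)), line))
    return "\n".join(out)
```

Fed the example above (without the indentation), this would turn the three lines into the single line (set! data (foo "This is a long quoted line." arg2 arg3)). A real reader would do this while tokenizing rather than as a line-based rewrite, but the order in which text is consumed would be the same.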

Anton

[1] http://www.stonehenge.com/merlyn/UnixReview/col12.html