[R6RS] comments on rough draft for hash tables
Anton van Straaten
anton at appsolutions.com
Wed Mar 8 10:33:32 EST 2006
Thanks for the quick response, Will. I've made changes in SVN in
response to all of your points, as detailed below.
> I'd prefer hash-table-get/default,
> by analogy with the name used by SRFI-69:
Done. For consistency, I've also renamed all the hash-table-get/xxx
procedures to hash-table-ref/xxx, which matches SRFI-69's convention.
> We also should have something like
> Procedure: hash-table-get/call hash-table key f => object
> Returns the value associated with key in the hash-table
> if the hash table contains key; otherwise tail-calls f on key.
> In my opinion, hash-table-get/call is more useful and more
> efficient than hash-table-get/thunk, which often requires
> the user to obtain the thunk by writing things like
> (lambda () (f key)).
>>Procedure: hash-table-contains hash-table
> This should be
> Procedure: hash-table-contains? hash-table key
> If we change /flag to /default or add the /call version,
> then we should make the corresponding change or addition
> to the update procedures.
Both done. (Any "unnecessary" procedures can be cut later.)
>>Procedure: hash-table-fold hash-table procedure init => value
>>Procedure: hash-table-for-each procedure hash-table => unspecified
> What happens if the procedure adds or deletes keys from
> the hash-table?
The issue item about "Side effects in thunks" was intended to indicate
that this is yet to be addressed. I've relabeled it as "Side effects in
higher-order procedures", and changed the language to include "This
should be addressed somehow, if only by a statement that the behavior
caused by such procedures is unspecified."
> The specification of hash-table-clear!
> implies that this is allowed.
The code in question was a leftover from earlier work. I've deleted it.
>>Procedure: eq-hash object => integer
>> Returns a hash value for object based on its location.
>>This procedure is the hash function used by make-eq-hash-table.
> No, we shouldn't guarantee that this procedure is used by
> make-eq-hash-table. In some systems it may be, but it is
> unlikely to be the same hash function in systems whose
> garbage collectors move objects.
> Hah. The implementors of MIT Scheme did not anticipate
> garbage collectors that run 100 times per second.
Removed. That behavior arose from a desire to have the reflection
procedures produce something useful and consistent with the rest of the
spec, when called on an eq or eqv hash table:
>>Procedure: hash-table-hash-function hash-table => procedure
>> Returns the hash function used by hash-table.
> Should these procedures be defined on hash tables that
> were created by calling make-eq-hash-table or
You're right that hash-table-hash-function shouldn't be, so I've added
provisional language to that effect. What about
> O(1) retrieval cannot be guaranteed for arbitrary hash
> functions. You can't even guarantee amortized O(1)
> retrieval for the system's hash functions. The best
> you can do is to guarantee average-case amortized O(1)
> retrieval for the system's hash functions.
Good point. I took that example from SRFI 69, which actually uses
language such as "Given a good hash function, this operation should have
an (amortised) complexity of O(1) with respect to the number of
associations in hash-table." This is now quoted in the issue text. I
mainly wanted to document the question of whether the proposal should
include such constraints on implementations.
>>Concurrency: R6RS does not deal with concurrency. Should
>>this proposal say any thing about it?
> We might not say anything about it, but we need to think
> about it. Any implementation that supports concurrency is
> going to have to implement some kind of mutual exclusion
> for operations that have side effects, and some will need
> a bit of mutual exclusion even for hash-table-ref.
> As specified, the updating operations are not atomic, so
> they create no new problems. (Some implementors might try
> to implement them as atomic operations, but we can let
> them learn the hard way.)
> The hash-table-fold and hash-table-for-each procedures
> already have a problem, even without concurrency.
I've added these remarks to the issue text.
More information about the R6RS