[CI] BHUNT comparison experiments

Alexander Rasin alexr at cs.brown.edu
Fri Jun 20 10:20:52 EDT 2008


>> Note that, the way I interpreted the paper, the exception table contains 
>> the whole exception tuple (rather than just price/catid, as Hideaki defined 
>> it).
>> This avoids doing the very join that is currently happening (CATID in (list 
>> of CATIDs)); instead, one can simply execute the query in parallel on 
>> whatever is read off the bump and on the exception table.  In that case, I 
>> would expect BHUNT to outperform CIs, at least when error tuples are not 
>> tightly clustered.
>
> But that would make the exceptions table even bigger!
>

I agree.  But firstly, this is how I interpret the paper: BHUNT 
relations are for column pairs, and the exception table described in the 
paper contains 5 columns.

Secondly, we already agree that BHUNT isn't suited for a large number of 
exceptions.  For a small number of scattered exceptions, however, such a 
table would beat CI.  And with a small number of exceptions (hundreds or 
thousands), this exceptions table would be trivially small.
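
To make the full-tuple variant concrete, here is a minimal sketch in Python.
It is only an illustration of the idea, not BHUNT's actual implementation.
Everything in it is an assumption made up for the example: the Row layout,
the constraint price // BAND == catid standing in for the bump, and the
helper names build and query.  Conforming rows are reached through a catid
index, while exceptions are kept as whole tuples, so the price predicate runs
directly on them and no join back on CATID is needed.

```python
from collections import defaultdict
from typing import NamedTuple


class Row(NamedTuple):
    rowid: int
    catid: int
    price: int    # assumed to be a non-negative integer
    payload: str  # stands in for the remaining columns


# Assumed toy constraint (the "bump"): a conforming row has price // BAND == catid.
BAND = 10


def conforms(row: Row) -> bool:
    return row.price // BAND == row.catid


def build(rows):
    """Split rows into a catid index over conforming rows and a list of
    exceptions that keeps the whole tuple."""
    main_by_catid = defaultdict(list)
    exceptions = []
    for row in rows:
        if conforms(row):
            main_by_catid[row.catid].append(row)
        else:
            exceptions.append(row)
    return main_by_catid, exceptions


def query(main_by_catid, exceptions, lo, hi):
    """Return all rows with lo <= price <= hi."""
    # Side 1: rewrite the price range into a catid range using the constraint.
    from_bump = [
        row
        for catid in range(lo // BAND, hi // BAND + 1)
        for row in main_by_catid.get(catid, ())
        if lo <= row.price <= hi
    ]
    # Side 2: run the same predicate directly on the full exception tuples.
    from_exceptions = [row for row in exceptions if lo <= row.price <= hi]
    # The two sides share no state, so they could be evaluated in parallel.
    return from_bump + from_exceptions
```

With only price/catid stored in the exception table, side 2 would instead
yield a list of CATIDs that has to be joined back against the base table,
which is the step the full-tuple layout is meant to avoid.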


Alex

