Re: Fw: Re: More?

From: <00prneubauer(at)>
Date: Thu Mar 05 1998 - 13:36:37 GMT

Paul J. Lukas wrote:
>	The breadth-first strategy of SWISH++ rather than the depth-
>	first one of SWISH-E is certainly better suited to web
>	indexing.

This is an interesting proposition, but I'm afraid it's not obvious to
me why this should be so.  For reasons I don't currently understand, I
appear to have missed some mail recently, so I am not sure exactly
what was under discussion here, but it appears to involve indexing
remote web sites (in addition to, or instead of) a local web site.

At first glance, a breadth-first strategy looks like it would spend
all its time listing sites and never get around to individual
pages/documents.  I think it is clear that this strategy would never
work for something like AltaVista, so I'm sure that this was not what
Paul Lukas was referring to. :-)

Still, even for a limited number of sites (e.g., one) it's not obvious
to me why breadth-first is going to be faster, more efficient or
otherwise better suited than depth-first.  Depth-first strikes me as
more obvious/natural, but I have certainly never thought about it
in any detail.  Is the efficiency dependent on how broad the breadth
is and/or how deep the depth is?  If so, at what point does it flip? 
If not, what makes one better for web indexing?  Paul L. says that
SWISH++ is "an order of magnitude faster than SWISH-E."  I am not
going to dispute the truth of this statement, but if the factor that
makes it true is a breadth-first algorithm rather than a depth-first
algorithm, then there is probably a simple explanation for why this is
so.  I don't see it at the moment.  I humbly beg enlightenment.
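
To make the question concrete, here is a minimal sketch of the two
traversal orders on a made-up link graph.  The page names, the LINKS
table and the crawl() function are all hypothetical; this is not how
SWISH-E or SWISH++ actually walk a site, only the textbook queue-versus-
stack distinction.

```python
from collections import deque

# Hypothetical link graph: page -> pages it links to (illustration only).
LINKS = {
    "/": ["/a", "/b"],
    "/a": ["/a/1", "/a/2"],
    "/b": ["/b/1"],
    "/a/1": [],
    "/a/2": ["/"],   # a link back to the top, to show why 'seen' is needed
    "/b/1": [],
}

def crawl(start, breadth_first):
    """Return the order in which pages are visited from 'start'."""
    frontier = deque([start])   # pages discovered but not yet visited
    seen = {start}              # every page ever put on the frontier
    order = []
    while frontier:
        # Queue (oldest first) gives breadth-first;
        # stack (newest first) gives depth-first.
        page = frontier.popleft() if breadth_first else frontier.pop()
        order.append(page)
        links = LINKS.get(page, [])
        if not breadth_first:
            # Push in reverse so the first link on a page is followed first.
            links = reversed(links)
        for link in links:
            if link not in seen:
                seen.add(link)
                frontier.append(link)
    return order

bfs_order = crawl("/", breadth_first=True)
dfs_order = crawl("/", breadth_first=False)
```

On this toy graph both orders visit the same six pages exactly once
each, so the visit order by itself does not change the amount of work
done here; any real speed difference would have to come from somewhere
else.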

Paul Neubauer  
For PGP Public Key send mail with subject="Send PGP Public Key" 
1024 bits -- Key ID: 3FEB993D
Key Fingerprint: 85 AA A5 91 00 49 7A 7B  23 26 F7 B8 DB 72 C9 48
Received on Thu Mar 5 05:45:13 1998