Re: 2.4.3 Refuses to Index Virtual Host

From: Bill Moseley <moseley(at)>
Date: Sat Apr 09 2005 - 16:48:15 GMT
On Sat, Apr 09, 2005 at 07:29:20AM -0700, fh oregon wrote:
> As for your second point, I'm not any kind of company - just a guy with 
> a fairly large personal web site who (as a hobby) hosts some email lists 
> and web sites for a car club and a food club.

But looks like an email/hosting service.  That's not
you? looks like a company, too.  So I'm confused.  Or are
those just names you borrowed to use on this list?

> Since the root of 
> ""  ("/CARS") is contained within the "" tree, I 
> would expect that it would be indexed on the same pass.  It would be 
> interesting to understand just how swish-e traverses the website tree - 
> in looking at the log file, it appears to be jumping around and not 
> following the directory structure as I would expect.  Kinda makes me go 
> "hmmmm".

Think about it.  All a web spider can do is follow links.  It has no
idea about your directory structure at all.  Many web sites don't even
have any directory structure -- they are all dynamically created from
a database.

There is no way for the spider to know that is
contained in's web or file space.  If you had a
spider that didn't limit itself to specific hosts then you would end up
indexing the entire Web.  If you want to index a host you have to tell
the spider to index that host.
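
To make that concrete, here is a minimal sketch in Python of a link-following
crawler with a host allow-list.  This is not swish-e's spider, just an
illustration: the host names and the in-memory page table are made up, and
the function name crawl() is hypothetical.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

# Hypothetical in-memory "web": URL -> list of hrefs found on that page.
# The host names are invented for illustration only.
FAKE_WEB = {
    "http://main.example/": ["/about.html", "http://cars.example/"],
    "http://main.example/about.html": ["/"],
    "http://cars.example/": ["/gallery.html"],
    "http://cars.example/gallery.html": [],
}


def crawl(start_url, allowed_hosts, pages=FAKE_WEB):
    """Breadth-first crawl that only follows links to allowed hosts."""
    seen = {start_url}
    queue = deque([start_url])
    order = []
    while queue:
        url = queue.popleft()
        order.append(url)
        for href in pages.get(url, []):
            # Resolve relative links against the page they were found on.
            link = urljoin(url, href)
            # The spider sees only URLs; it cannot tell that two hosts
            # share the same file space on disk.
            if urlparse(link).hostname not in allowed_hosts:
                continue
            if link not in seen:
                seen.add(link)
                queue.append(link)
    return order
```

Calling crawl("http://main.example/", {"main.example"}) visits only the two
pages on main.example, even though one of them links to cars.example.  Add
"cars.example" to the allowed set and all four pages are visited.  Note also
that the visiting order comes from the order links are discovered, not from
any directory layout.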

Post more details about what you want to do and we can help you get it
working.

Bill Moseley

Received on Sat Apr 9 09:48:31 2005