On Wed, Dec 06, 2006 at 07:16:30PM -0800, Lesley Walker wrote:
> > spider.pl default http://yoursite.to.index/ > out.txt
>
> Thanks, I hadn't read far enough to know about that "default" option. I was
> busy setting up a config file based on the minimal example - if I'd seen
> that line in the docs first I would have done that straight away.
It's shown on the first line of the first section of the spider docs. ;)
> My mission is to allow searching in some password-protected sub-sites that
> aren't linked from the main page so I think I'll have to do them each
> individually.
Are you going to include a description from the document in search
results? That kind of defeats the purpose of password protection if
people can get to the content from the search index.
> Would it make sense to maintain a separate index for each one rather than
> put it all in together with the main index, even though they're all pretty
> small?
Is that so you can limit searches to specific areas? I'm not sure it
makes much difference -- if they can be identified by the path then
you can use ExtractPath to create a metaname for searching each or all
sites. Or, you could use separate indexes. Probably doesn't matter,
although I'd probably have one index.
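
As a rough sketch of the single-index approach, assuming the protected
sub-sites each live under their own top-level directory (the names
"staff" and "members" and the metaname "site" are made up for
illustration), the config might look something like this. Check the
SWISH-CONFIG docs for the exact ExtractPath syntax before relying on it:

```
# Hypothetical layout:
#   http://yoursite.to.index/staff/...
#   http://yoursite.to.index/members/...

# Declare the metaname so it can be searched.
MetaNames site

# Take the first directory of each URL and index it under "site".
ExtractPath site regex !^https?://[^/]+/([^/]+)/.*$!$1!

# Pages that don't match the pattern (e.g. top-level pages) get "main".
ExtractPathDefault site main
```

With something like that in place, a search along the lines of
swish-e -w 'site=staff AND budget' should limit results to one area,
and leaving off the site= part searches everything in the index.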
> I think I like the idea of leaving the main site index as it is and treating
> the new bits separately.
There's a little extra overhead when searching multiple indexes, but for
a small number of records it won't make much difference.
--
Bill Moseley
moseley@hank.org
Received on Wed Dec 6 20:44:30 2006