On Wed, Dec 06, 2006 at 07:16:30PM -0800, Lesley Walker wrote:
> > spider.pl default http://yoursite.to.index/ > out.txt
> Thanks, I hadn't read far enough to know about that "default" option. I was
> busy setting up a config file based on the minimal example - if I'd seen
> that line in the docs first I would have done that straight away.
It's shown on the first line of the first section of the spider docs. ;)
> My mission is to allow searching in some password-protected sub-sites that
> aren't linked from the main page so I think I'll have to do them each
Are you going to include a description from the document in search
results? That kind of defeats the purpose of password protection if
people can get to the content from the search index.
> Would it make sense to maintain a separate index for each one rather than
> put it all in together with the main index, even though they're all pretty
Is that so you can limit searches to specific areas? I'm not sure it
makes much difference -- if the sub-sites can be identified by their
path then you can use ExtractPath to create a metaname for searching
one or all of the sites. Or, you could use separate indexes. It
probably doesn't matter, although I'd probably have one index.
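
As a rough sketch of the single-index approach, something like the
following config fragment might work. The metaname "site" and the URL
layout (one top-level directory per sub-site) are assumptions made up
for illustration, so adjust the pattern to your real paths and check
the ExtractPath entry in the SWISH-CONFIG docs before relying on it.

```
# Hypothetical layout: http://yoursite.to.index/<subsite>/...
# Declare the metaname, then index the first path segment under it.
MetaNames site
ExtractPath site regex !^http://[^/]+/([^/]+)/.*$!$1!
```

With that in place a search could be limited to one area with a query
along the lines of "site=subsite1 AND foo", or left unrestricted by
omitting the metaname.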
> I think I like the idea of leaving the main site index as it is and treating
> the new bits separately.
There's a little extra overhead searching multiple indexes, but for
a small number of records it won't make much difference.
Received on Wed Dec 6 20:44:30 2006