Re: DEFAULT_CONFIG_FILE in 2.2 question

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Sep 11 2002 - 18:21:52 GMT
At 10:56 AM 09/11/02 -0700, Jody Cleveland wrote:
>> Regardless, switch to <swishdocpath> and <swishdescription> would probably
>> fix.
>
>I'm not understanding. Switch it where?

 http://swish-e.org/2.2/docs/SWISH-RUN.html#Searching_Command_Line_Arguments

See the section on -x.
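
As a rough sketch of what that section describes: the -x format string is where the property names go. The search word "foo" and the index file name below are only placeholders, not taken from your setup.

```shell
# Print the path and the stored description for each hit, tab-separated.
# "foo" and index.swish-e are hypothetical; substitute your own query and index.
swish-e -w foo -f index.swish-e -x '<swishdocpath>\t<swishdescription>\n'
```

The description only shows up if StoreDescription was in effect for that document's type at index time.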

>> You need two config options.  I think this is all described in the
>> swish.cgi docs, too.
>> 
>> IndexContents HTML .html
>> StoreDescription HTML <body>
>
>When I index, I run swish-e -S prog -c spider.config
>In that config file, I have this:
>StoreDescription HTML <body> 200000
>DefaultContents HTML
>IndexContents HTML2 .htm .html
>IndexContents TXT .txt .conf

So all your .htm, .html are type HTML2, and .txt and .conf are type TXT,
but StoreDescription is only saving the <body> for docs of type HTML.

I'd try:

DefaultContents HTML2
IndexContents TXT2 .txt .conf
StoreDescription HTML2 <body> 200000
StoreDescription TXT2 200000

That's saying all docs are HTML2, with the exception of .txt and .conf
which are TXT2.  And then two StoreDescription lines are needed because docs
are now of type HTML2 or type TXT2.

>Which I believe I took right from the docs.

Quite possible.  I'll fix if you can point it out.

Sorry for all the confusion about the document types.  That's all due to having
two sets of parsers possible -- not to mention that we talk about HTML docs
in the general sense, and also HTML and HTML2 "types" as far as swish-e
processing is concerned.


-- 
Bill Moseley
mailto:moseley@hank.org
Received on Wed Sep 11 18:25:27 2002