Documentation not too helpful :(
Let's steer the question in this direction...
In my SWISH-E configuration file, I have
StoreDescription HTML* <body> 20000
Using spider.pl with -v 3, I can see that it's using the HTML2 parser. So shouldn't the content then be handled by my StoreDescription line? When I run swish.cgi through my browser, I don't get any text to highlight (summary, whatever you wanna call it) for anything other than HTML files. I haven't tried TXT or PPT, but PDF and DOC results don't show anything. The description isn't reported as null for them, though.
Anyway, what do I have to change, and where do I change it, to get the snapshot (summary, highlightable area, etc) to display on the search page?
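My best guess, and it is only a guess: if the filters hand the PDF and DOC contents to swish-e as plain text rather than HTML, then my HTML* line would never match them, and they'd need a TXT* line of their own, something like
# assumption: filtered PDF/DOC output gets parsed as TXT*, not HTML*
StoreDescription HTML* <body> 20000
StoreDescription TXT* 20000
The 20000 on the second line is just copied from my HTML line. Is that the right idea?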
I feel like that didn't make any sense... Thanks in advance!
-Alan
>
> From: Bill Moseley <moseley@hank.org>
> Date: 2004/06/02 Wed PM 04:04:28 EDT
> To: adivey1@cox.net
> CC: Multiple recipients of list <swish-e@sunsite.berkeley.edu>
> Subject: Re: Config files and spider.pl
>
> On Wed, Jun 02, 2004 at 12:11:13PM -0700, adivey1@cox.net wrote:
> > Where can I find documentation that'll tell me which configuration
> > file settings are overridden by using spider.pl? Obviously, IndexDir is
> > specified in the file, but for entries like StoreDescription,
> > IndexContents, and MetaNames, how do I know which ones are being read
> > and which are ignored?
>
> http://www.swish-e.org/current/docs/SWISH-CONFIG.html
>
> * Swish-e CONFIGURATION FILE
>     o Alphabetical Listing of Directives
>     o Directives that Control Swish
>     o Administrative Headers Directives
>     o Document Source Directives
>     o Document Contents Directives
>     o Directives for the File Access method only
>     o Directives for the HTTP Access Method Only
>     o Directives for the prog Access Method Only
>     o Document Filter Directives
>         + Filtering with SWISH::Filter
>         + Filtering with the FileFilter feature
> * Document Info
>
> So, they are supposed to be broken up by options that control the
> indexing vs. options that control what files are passed to swish for
> indexing.
>
> So, setting swish-e to follow symbolic links probably isn't going to
> affect spidering. But MetaNames is how the data is indexed, regardless
> of where it comes from.
>
> Technical writers always welcome.
>
> --
> Bill Moseley
> moseley@hank.org
>
>
Received on Thu Jun 3 09:25:36 2004