On Thu, Jun 17, 2004 at 10:00:12AM -0700, adivey1@cox.net wrote:
> So would the following work?
>
> (delete the DefaultContents line)
> IndexOnly HTML* .htm .html .cfm .doc .pdf .ppt
> NoContents .doc .pdf .ppt

No, not if you are using spider.pl. IndexOnly only applies when reading
the file system. When using spider.pl (i.e. -S prog) it's up to the
program to control which docs are passed to swish-e for indexing.

IIRC, NoContents will work with -S prog, but it's not really worth using
-- just have the program filter the content down to nothing (or to a
single newline, which may be required) instead of sending the entire doc
to swish-e just to be ignored.
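
As a rough sketch of the -S prog case, a config along these lines hands
document selection to the spider. The URL is a placeholder, and this
assumes the stock spider.pl run with its built-in "default" settings;
any filtering by file type would then live in the spider's own
configuration, not in IndexOnly.

```
# Sketch only: run as  swish-e -c swish.conf -S prog
# http://www.example.com/ is a placeholder URL.
IndexDir spider.pl
SwishProgParameters default http://www.example.com/
# No IndexOnly here: with -S prog the program (spider.pl) decides
# which docs reach swish-e.
```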

BTW -- a lot of sites (easily found with Google) give this example:

IndexOnly .html
NoContents .jpg .mov .gif

which won't work since swish-e only looks at .html files. That is,
swish-e checks IndexOnly before looking at NoContents.
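
For illustration, here is how that example could be written so that
NoContents gets a chance to apply. This is a sketch that follows from
the ordering described above, assuming file-system indexing, and is not
a tested config: the media extensions are listed in IndexOnly as well,
so those files pass the first check and NoContents can then index them
by file name only.

```
# Sketch only (assumes file-system indexing, not -S prog).
# List the media extensions in IndexOnly too; otherwise those files
# are skipped before NoContents is ever consulted.
IndexOnly .html .jpg .mov .gif
NoContents .jpg .mov .gif
```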
--
Bill Moseley
moseley@hank.org
Received on Thu Jun 17 23:53:14 2004