From: Bill Moseley <moseley(at)>
Date: Fri Feb 11 2005 - 19:03:24 GMT
On Fri, Feb 11, 2005 at 10:52:14AM -0800, Shaffer, Chris wrote:
> Hi...  I've gotten swish-e (using to crawl a couple of our
> intranet sites.  The filters seem to be working okay for excel.  And it
> seems to be looking at word documents.  However, (using swish.cgi), I
> don't get any descriptions for those word docs.


> Any idea where I can look?  I have no idea where to begin digging.

Sure. The spider just writes to stdout, so you can run it on a few
test docs and see what it outputs.  Do it on a file that generates
a description and then another that doesn't and compare.
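
One way to make that comparison easier, as a rough sketch: keep only the header block of each record and diff the two. The function name header_block, the $SPIDER variable, and the good.doc/bad.doc URLs are placeholders of mine, not anything that ships with swish-e. It assumes the headers end at the first blank line, as in the output further down.

```shell
# header_block: print only the header lines of one spider record read from
# stdin, stopping at the first blank line (where the document body starts).
header_block() {
    sed '/^$/q'
}

# Hypothetical usage; SPIDER and both URLs are placeholders:
#   SPIDER_QUIET=1 "$SPIDER" default http://localhost/good.doc | header_block > good.txt
#   SPIDER_QUIET=1 "$SPIDER" default http://localhost/bad.doc  | header_block > bad.txt
#   diff good.txt bad.txt
```

If the two header blocks differ in Document-Type, that is the first thing to look at.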

> StoreDescription HTML* <body> 200000

Make sure in the output that the document's Document-Type header is indeed
HTML*, which is what that StoreDescription line matches:

$ SPIDER_QUIET=1 /usr/local/lib/swish-e/ default http://localhost/apache/test.doc  | head
Path-Name: http://localhost/apache/test.doc
Content-Length: 1713
Last-Mtime: 1108148269
Document-Type: TXT*

That's saying the document is TXT*, so you would need to add another
StoreDescription line for TXT*.
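
As a sketch, the config would then carry both lines. The second line is my assumption of the form for text documents (no tag argument, and I've just reused your 200000 size), so check it against the StoreDescription docs for your version:

```
# existing line: description for documents parsed as HTML*
StoreDescription HTML* <body> 200000
# assumed addition: description for documents parsed as TXT*
StoreDescription TXT* 200000
```

You'll need to reindex after changing it for the descriptions to show up.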

Bill Moseley

Received on Fri Feb 11 11:03:25 2005