Re: Indexing XLS Files

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu Jul 17 2003 - 20:45:29 GMT
On Thu, Jul 17, 2003 at 01:32:12PM -0700, Jeffrey.Grunstein@ny.frb.org wrote:
> We're running Swish-E 2.2.1 on a Solaris 9 box.  Doing a spider crawl.
> We were trying to index Excel files but kept getting "Invalid Content Type"
> errors.  We got it to work by modifying the XLtoHTML function to include
> both application/vnd.ms-excel and application/vnd.ms-excel.

Can you give more details?  I couldn't find "Invalid Content Type" with
a quick grep of the source tree.

If possible, can you put up an example XLS document and a sample config?

> 
> But now that the XLS files are actually being indexed, only 2 words are
> indexed per file.
> 
> Does anybody know what these two words happen to be and how can I get the
> entire spreadsheet to be indexed?

Using -T indexed_words will show what words are being indexed.
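
For example, a run along these lines should dump the words as they are
indexed.  This is only a sketch: swish.conf is a placeholder name for your
own config file, and -S prog assumes the spider is being run as an external
program from that config.

```shell
# Placeholder config file name (an assumption); substitute your own.
CONFIG=swish.conf

# Only attempt the run if swish-e is actually on the PATH.
if command -v swish-e >/dev/null 2>&1; then
    # -S prog: index via an external program (assumed spider setup)
    # -c: config file to use
    # -T indexed_words: debug option that shows each word as it is indexed
    swish-e -S prog -c "$CONFIG" -T indexed_words
else
    echo "swish-e not found in PATH"
fi
```

The output can be long, so it may help to redirect it to a file and look
for the entries that belong to one of the XLS documents.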



-- 
Bill Moseley
moseley@hank.org
Received on Thu Jul 17 20:45:47 2003