On Fri, Sep 22, 2006 at 04:46:33PM -0700, Philippe A. wrote:
> FileFilter .doc ./lib/swish-e/swish_filter.pl '"%p" "%P"'
That's a lot of work to just run catdoc. Maybe just specify catdoc
as your FileFilter if all you are doing is indexing .doc files.
> TranslateCharacters :ascii7:
>
> A word spelled "montr=E9al" gets converted to "montrcal", as shown by -T
> INDEXED_WORDS.
> Adding:[7:swishdefault(1)] 'montrcal' Pos:2 Stuct:0x9 ( BODY FILE )
> Adding:[7:swishdefault(1)] 'montrcal' Pos:3 Stuct:0x9 ( BODY FILE )
I'd try catdoc from the command line and see what it's outputting.
I'm guessing from that =E9 above that your document is encoded in
Windows-1252, so you might need to tell catdoc that's the source
encoding.
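As a rough illustration of why the source encoding matters, here is a
minimal Python sketch. It is not Swish-e or catdoc code; the sample bytes
and the to_ascii7 helper are hypothetical, and the accent stripping is only
an approximation of what a 7-bit ASCII translation is expected to produce.
It shows that byte 0xE9 read as Windows-1252 is an "e" with an acute
accent, which should fold to a plain "e" rather than a "c".

```python
import unicodedata

# Hypothetical sample: "montreal" with an accented e, as raw Windows-1252
# bytes. 0xE9 is the byte behind the "=E9" seen in the quoted message.
raw = b"montr\xe9al"


def to_ascii7(text: str) -> str:
    """Approximate a 7-bit ASCII fold: decompose accented characters,
    then drop every code point that is not plain ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


# Decode using the encoding the document is assumed to be in.
decoded = raw.decode("cp1252")

print(decoded)             # the word with its accented e intact
print(to_ascii7(decoded))  # montreal
```

If the filter output comes out as anything other than the accented word at
the decoding step, the filter is probably reading or writing a different
character set than the one the index expects.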
--
Bill Moseley
moseley@hank.org
Received on Fri Sep 22 17:30:45 2006