Re: Indexing differs for 2 lines swapped in file

From: Dominique Phommahaxay <dominique.phommahaxay(at)not-real.writeme.com>
Date: Wed Oct 29 2003 - 02:04:53 GMT
> Works fine on Linux.  I tried it on Windows and it does fail to index.
> 
> I suspect that you have found a bug in the Windows version of libxml2.
> But it's one of those bugs where it probably should not do what it's
> doing, but also you probably should not be trying to parse a
> pipe-separated text file with an HTML parser.
Sure, I would certainly use swish-e within its stated requirements and limitations.

> Add to your config file:
> 
>  ParserWarnLevel 9
> 
> and you will see how many errors are generated.
Somehow I could not redirect the warnings to a file using '>'. But there are lots of errors, mostly of the following 3 types (these are the first 3 error messages):

C:/private/work/jway/BTTITLE01312003/BTTitle01312003-1.csv:10: error: htmlParseEntityRef: expecting ';'
0120415542|PAP|Statistics With Mathematica|BK&CD-ROM|TXT|ENG|Abell, Martha L./ B
                                                    ^
C:/private/work/jway/BTTITLE01312003/BTTitle01312003-1.csv:26: error: htmlParseEntityRef: no name
0940322080|HRD|India|BOOK & CD|GEN|ENG|Silvers, Robert B. (Edt)/ Epstein, Barbar
                           ^
C:/private/work/jway/BTTITLE01312003/BTTitle01312003-1.csv:34: error: htmlParseEntityRef: no name
rna/ Stemmler, Patricia/ Shotwell, Rita/ Wirth, Mirian|WILTR|PRETT||John Wiley &
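
Each of the three quoted lines contains a bare '&' ("BK&CD-ROM", "BOOK & CD", "John Wiley &"), which an HTML parser tries to read as the start of an entity reference. As a rough illustration only, the Python sketch below scans a line for such ampersands. The function name classify_ampersands and the two reason strings are hypothetical, chosen to mirror the messages above; this is not libxml2's actual logic, and a "&#" without digits is simply lumped in with "no name".

```python
import re

# '&' + name (or numeric reference) + ';' counts as a complete entity reference.
ENTITY = re.compile(r"&(#\d+|#[xX][0-9a-fA-F]+|[A-Za-z][A-Za-z0-9]*);")


def classify_ampersands(line):
    """Return (column, reason) for each '&' that is not a complete entity reference.

    Simplified assumption: a letter right after '&' looks like an entity name
    with a missing ';', anything else looks like an entity with no name.
    """
    problems = []
    for match in re.finditer(r"&", line):
        col = match.start()
        if ENTITY.match(line, col):
            continue  # well-formed reference such as &amp;
        next_char = line[col + 1:col + 2]
        if next_char.isalpha():
            problems.append((col, "expecting ';'"))
        else:
            problems.append((col, "no name"))
    return problems


if __name__ == "__main__":
    for sample in ("Statistics With Mathematica|BK&CD-ROM|TXT",
                   "India|BOOK & CD|GEN",
                   "AT&amp;T"):
        print(sample, classify_ampersands(sample))
```

Running something like this over the .csv file before indexing would give a quick count of how many lines an HTML parser is likely to complain about.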


> To fix, try setting this in your config file:
> 
>  DefaultContents TXT*
> 
> That avoids using libxml2.  Let me know if that fixes your problem.
Yes, using DefaultContents TXT* does fix the problem (J2Ee is now indexed and found).

How else can I help get this issue with libxml2 corrected?

Thanks for your time and responsiveness to all inquiries,

Dominique

Received on Wed Oct 29 02:17:07 2003