Hi,
I have noticed that when I use the libxml2 parser on my indexed files,
special characters (in my case Czech characters) are stripped.
Switching to DefaultContents HTML, together with the TranslateCharacters
directive, solved that problem.
I tried these configurations: swish-e 2.5.2 and 2.2 on Linux; 2.4.2 and
2.2 on Windows. Both operating systems behaved the same way, so I do not
think the cause is in the configuration of the computers. (Am I wrong?
On Linux I have LANG=cs_CZ;LANGUAGE=czech.)
best regards
roman
Below is some output from -T INDEXED_WORDS:
#my config
TranslateCharacters ľ®ą©»«žŽšŠťŤ zzssttzzsstt
PropCompressionLevel 6
DefaultContents HTML
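
A note on the TranslateCharacters source string above: its first six
characters look like the ISO-8859-2 bytes for žŽšŠťŤ displayed as
Windows-1250, so the directive seems to cover both encodings. The Python
sketch below only illustrates that reading and the resulting mapping. It
is not swish-e code, the helper names are hypothetical, and swish-e
itself works on single bytes rather than on Unicode strings.

```python
# Illustration only: helper names are hypothetical, not part of swish-e.
SOURCE = "ľ®ą©»«žŽšŠťŤ"   # source characters as shown in the config line
TARGET = "zzssttzzsstt"   # replacement characters from the same line


def latin2_seen_as_cp1250(text: str) -> str:
    """Encode text as ISO-8859-2, then display those bytes as Windows-1250."""
    return text.encode("iso8859_2").decode("cp1250")


def translate_characters(word: str) -> str:
    """Apply the one-to-one character mapping from the config line."""
    return word.translate(str.maketrans(SOURCE, TARGET))


if __name__ == "__main__":
    # Shows how the Latin-2 bytes of the six letters look under cp1250.
    print(latin2_seen_as_cp1250("žŽšŠťŤ"))
    # Shows the mapping applied to a word containing one of the letters.
    print(translate_characters("maršálková"))
```

With this mapping 'maršálková' becomes 'marsálková', which matches the
form that appears in the correct listing further down.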
#### debug output from Windows v2.4.2 with HTML2
Indexing Data Source: "External-Program"
Indexing "perl.exe"
External Program found: C:\PERL\BIN\/perl.exe
Adding:[1:id(10)] 'rego' Pos:1 Stuct:0x1 ( FILE )
Adding:[1:id(10)] 'rapid' Pos:2 Stuct:0x1 ( FILE )
Adding:[1:id(10)] '025669' Pos:3 Stuct:0x1 ( FILE )
Adding:[1:id(10)] '025669' Pos:2 Stuct:0x85 ( META HEAD FILE )
Adding:[1:idd(16)] '20040226' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[1:ti(11)] 'buchlovské' Pos:8 Stuct:0x85 ( META HEAD FILE )
Adding:[1:ti(11)] 'nám' Pos:9 Stuct:0x85 ( META HEAD FILE )
Adding:[1:ti(11)] 'stí' Pos:10 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'mar' Pos:13 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'álková' Pos:14 Stuct:0x85 ( META HEAD FILE )
---- this is the correct output from Linux (v2.5.2; DefaultContents HTML2);
on Windows it would be the same
Indexing Data Source: "External-Program"
Indexing "/usr/bin/perl"
External Program found: /usr/bin/perl
Adding:[1:id(10)] 'rego' Pos:1 Stuct:0x1 ( FILE )
Adding:[1:id(10)] 'rapid' Pos:2 Stuct:0x1 ( FILE )
Adding:[1:id(10)] '025669' Pos:3 Stuct:0x1 ( FILE )
Adding:[1:id(10)] '025669' Pos:2 Stuct:0x85 ( META HEAD FILE )
Adding:[1:idd(16)] '20040226' Pos:5 Stuct:0x85 ( META HEAD FILE )
Adding:[1:ti(11)] 'buchlovské' Pos:8 Stuct:0x85 ( META HEAD FILE )
Adding:[1:ti(11)] 'náměstí' Pos:9 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'marsálková' Pos:12 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'zdenka' Pos:13 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'ing' Pos:14 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'zdenka' Pos:15 Stuct:0x85 ( META HEAD FILE )
Adding:[1:au(12)] 'marsálková' Pos:16 Stuct:0x85 ( META HEAD FILE )
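
One possible reading of the first listing, offered as an assumption
rather than a confirmed diagnosis: the libxml2 parser works in UTF-8, and
characters with no ISO-8859-1 equivalent (ě, š) are lost when the text is
brought back to an 8-bit charset, while á, é and í survive because they
exist in ISO-8859-1. The Python sketch below only simulates that effect
so the split words can be reproduced. The function name is hypothetical
and nothing here is taken from the swish-e source.

```python
import unicodedata


def simulate_latin1_downconversion(word: str) -> list[str]:
    """Split a word wherever a character has no ISO-8859-1 code point.

    Assumption: each unconvertible character acts as a word break.
    """
    # Normalise to precomposed characters so 'ě' is one code point.
    composed = unicodedata.normalize("NFC", word)
    # ISO-8859-1 covers exactly the code points 0..255.
    reduced = "".join(ch if ord(ch) < 256 else " " for ch in composed)
    return reduced.split()


if __name__ == "__main__":
    for word in ("buchlovské", "náměstí", "maršálková"):
        print(word, "->", simulate_latin1_downconversion(word))
```

Under this assumption 'náměstí' splits into 'nám' and 'stí', and
'maršálková' into 'mar' and 'álková', which is the pattern seen in the
HTML2 listing from Windows, while 'buchlovské' passes through intact.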
Received on Fri Nov 19 06:43:17 2004