On Thu, Jul 03, 2003 at 02:31:21AM -0700, Tim Freedom wrote:
>
> 1. I have various mailing-lists which are archived as UTF-8 files. I didn't
> do anything special about indexing them (no conversion is needed) and
> I added the following line to the search.tt file (after <head>)
First, my knowledge of character issues is limited.
>
> <meta http-equiv="content-type" content="text/html; charset=UTF-8">
I'm not sure that would make any difference. Does it? I ask because
with the libxml2 parser, text is converted from UTF-8 to ISO-8859-1
before swish-e processes it.
> I see results - which is great, but the highlighting seems to mess things
> up. Instead of seeing my words highlighted properly, I see question
> marks with sprinkled yellow highlights (every other question mark in a
> row gets the yellow highlight). Is there any way to fix that? I use the
> default highlight method (i.e. I don't override anything in that area,
> so I believe it's using 'PhraseHighlight', which is fine). Not sure if
> you need to include 'use utf8;' in your code or what so that the various
> multibyte characters get grouped appropriately. Any ideas/solutions ?
Not really. I only recently upgraded to Perl 5.8.0, where character
encoding works better. It's been a while since I looked, but with older
Perl versions I had weird problems. For example, to tokenize the text
for highlighting I split it using the WordCharacters setting passed
back in the swish-e results header. I once had text in Perl flagged as
UTF8, and in some cases the split function was splitting in the middle
of a multi-byte character, so I ended up with invalid UTF8 strings. I
suspect that's fixed in 5.8.0.
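
To make that split problem concrete, here is a minimal sketch in Python
(not the Perl code swish.cgi uses). It assumes a hypothetical
word-character set of ASCII letters plus the Latin-1 letter range,
applied to raw UTF-8 bytes; the real WordCharacters value of any given
index may differ. The lead byte of a two-byte character then counts as
a word character while its continuation byte does not, so the split
lands inside the character. That might also explain alternating
highlighted question marks, though this is a guess, not a diagnosis.

```python
import re

# Hypothetical word-character set: ASCII letters plus bytes 0xC0-0xFF
# (the Latin-1 letter range). An assumption for illustration only.
WORD_BYTES = rb"[A-Za-z\xc0-\xff]+"


def split_bytes(raw):
    """Split raw bytes into alternating word / non-word runs."""
    return [t for t in re.split(rb"(" + WORD_BYTES + rb")", raw) if t]


def split_chars(text):
    """Split decoded text; a multi-byte character is never divided."""
    return [t for t in re.split(r"(\w+)", text) if t]


def is_valid_utf8(token):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        token.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


raw = "café au lait".encode("utf-8")            # b'caf\xc3\xa9 au lait'
byte_tokens = split_bytes(raw)                  # split on raw bytes
char_tokens = split_chars(raw.decode("utf-8"))  # decode first, then split
broken = [t for t in byte_tokens if not is_valid_utf8(t)]
# broken == [b'caf\xc3', b'\xa9 ']: the two bytes of "é" were separated
```

Decoding before tokenizing, and encoding again only on output, keeps
every token a valid string.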
Can you work up an example to demonstrate the problem? You might also
try the SimpleHighlight module instead -- that only splits on
whitespace. If that changes things then maybe that's a clue where the
problem is.
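
As a rough sketch of why a whitespace-only split is a useful test, the
Python below tokenizes raw UTF-8 bytes on ASCII whitespace and wraps
matching words in tags. It is not the SimpleHighlight module's code;
the function name, the tags, and the sample text are all made up for
illustration. Every byte of a multi-byte UTF-8 character is 0x80 or
above, so an ASCII whitespace split can never land inside one.

```python
import re


def highlight_on_whitespace(raw, query_words, open_tag=b"<b>", close_tag=b"</b>"):
    """Wrap whole whitespace-delimited tokens that match a query word.

    Hypothetical helper: works on raw UTF-8 bytes and splits only on
    ASCII whitespace, so multi-byte characters stay intact.
    """
    wanted = {w.lower() for w in query_words}  # bytes.lower() is ASCII-only
    out = []
    for token in re.split(rb"([ \t\r\n]+)", raw):
        if token.lower() in wanted:
            out.append(open_tag + token + close_tag)
        else:
            out.append(token)
    return b"".join(out)


raw = "un café noir".encode("utf-8")
result = highlight_on_whitespace(raw, ["café".encode("utf-8")])
# result.decode("utf-8") == "un <b>café</b> noir"
```

The trade-off is that a word with punctuation attached ("café,") will
not match, which is why a word-character split is normally preferred
once the encoding issue is sorted out.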
> 2. Using the 'TemplateToolkit' method - if I don't find a result for my
> search I get a red "no result" _above_ the search form. Is there any way
> to control the location of where that string gets inserted (and all other
> error strings)? I saw that it gets spewed out as STDERR and didn't follow
> it after that.
Sure, that's the entire point of using the template. In search.tt, move
the show_message line to wherever you want the message to appear:
[% WRAPPER page %]
    [% PROCESS swish_header %]
    [% title = PROCESS title %]
    [% IF ! search.results %]
        [% PROCESS show_message %]   <<<<<< move this
        [% PROCESS search_form %]
    [% ELSE %]
        [% PROCESS search_form %]
        [% PROCESS nav_bar %]
        [% PROCESS results_list %]
    [% END %]
    [% PROCESS swish_footer %]
[% END %]
--
Bill Moseley
moseley@hank.org
Received on Thu Jul 3 13:07:50 2003