Re: swish

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Thu May 31 2001 - 14:42:22 GMT
At 06:30 AM 05/31/01 -0700, Philip Mak wrote:
>On Thu, 31 May 2001, Bill Moseley wrote:
>
>> Yes.  There will be an example CGI script in the source distribution
>> that will provide some form of term highlighting, and context output
>> (showing a few words on either side of the matched term(s)).
>>
>> Don't get too excited, as there are some open issues.
>
>A suggestion for the script: Instead of just using SWISH-E's
>StoreDescription feature, how about allowing a user-defined function to be
>provided that takes the matched document name as the input, and returns
>the contents of the document (converted to plain text if necessary) to the
>script, which then hilights the right places?

It basically works that way now.  You pass the text you want highlighted to
the highlighting function.  If you wanted to insert code to read that text
from an external source, that's not a problem.
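
As a rough illustration of that design (not the script from the
distribution, which is a Perl CGI), here is a minimal Python sketch.  The
names highlight, highlight_document and fetch are made up for this example,
and it assumes the text is already plain text:

```python
import re

def highlight(text, terms, before="<b>", after="</b>"):
    """Wrap whole-word, case-insensitive matches of any term in markers."""
    words = [re.escape(t) for t in terms if t]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: before + m.group(1) + after, text)

def highlight_document(name, terms, fetch):
    """fetch is any user-supplied callable mapping a document name to its
    plain text (a file read, a database lookup, and so on)."""
    return highlight(fetch(name), terms)

# Hypothetical in-memory "store" standing in for the external source.
docs = {"doc1.txt": "Swish-e indexes documents. Searching with swish is fast."}
print(highlight_document("doc1.txt", ["swish"], docs.__getitem__))
```

Because the highlighter only ever sees a string, swapping the dictionary
lookup for a file read or a database query does not touch the highlighting
code itself.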

But the text needs to be parsed first -- for example, if you are indexing
HTML documents you would need to use something like HTML::Parser to extract
the text from the markup.
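
To give an idea of what that parsing step involves, here is a small sketch
using Python's standard html.parser module as a stand-in for Perl's
HTML::Parser.  The class and function names are invented for the example,
and skipping script and style content is my own assumption about what you
would want:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect character data, ignoring the contents of script and style."""
    SKIP = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)

def html_to_text(html):
    """Return the document text with markup removed and whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Join chunks with a space so adjacent block elements do not run together.
    return " ".join(" ".join(parser.parts).split())

print(html_to_text("<p>Hello <b>swish</b> &amp; world</p>"))
```

Joining the chunks with a space is a simplification: a word split by inline
markup (for example part of a word in bold) would come out as two words.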

And it gets more complicated.  Say you do a metaname search for swishtitle
(match words in the title only).  Then you might only want to highlight
matching words in the title instead of the body of the document.
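
One way to picture this, again only as a sketch in Python: keep the pieces
of a result in separate fields and highlight only the field that was
searched.  The field name "description" and the function names here are
hypothetical; only swishtitle comes from the discussion above:

```python
import re

def highlight(text, terms):
    """Wrap whole-word, case-insensitive matches of any term in <b> tags."""
    words = [re.escape(t) for t in terms if t]
    if not words:
        return text
    pattern = re.compile(r"\b(" + "|".join(words) + r")\b", re.IGNORECASE)
    return pattern.sub(lambda m: "<b>" + m.group(1) + "</b>", text)

def highlight_fields(fields, terms, metaname=None):
    """Highlight terms only in the field named by metaname.

    With metaname=None (an ordinary search) every field is highlighted.
    """
    return {
        name: highlight(text, terms) if metaname in (None, name) else text
        for name, text in fields.items()
    }

result = {"swishtitle": "Swish notes", "description": "All about swish"}
print(highlight_fields(result, ["swish"], metaname="swishtitle"))
```

A real query can mix metanames and plain terms, so mapping each term to the
field it applies to would need an actual query parser; this sketch assumes
one metaname for the whole query.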

And if you want to display HTML back in your results, you need to be careful
about words that match inside a link (an <a href> tag), because in that case
you might want to highlight the entire <a> text.

So to answer your question, yes, it will be easy to modify the script to
read from an external file.  But implementing that will be, as they say,
left to the reader.

>This way, people can use your example script no matter how the original
>document is stored (text file, Berkeley DB, MySQL, etc.).

Philip, are you indexing files in a MySQL database using -S prog?  If so,
I'd like to see the script you are using.  There isn't an example yet in
the swish distribution for indexing from an RDBMS.
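
For anyone wondering what such a script might look like, here is a rough
Python sketch.  It assumes the -S prog convention of writing each document
to stdout as a Path-Name header and a Content-Length header (in bytes),
then a blank line, then the content; check the documentation for your
version before relying on that.  SQLite stands in for MySQL so the example
is self-contained, and the table, column and path names are made up:

```python
import sqlite3
import sys

def format_record(path, content):
    """Build one document record: headers, a blank line, then the body.

    Content-Length counts bytes, so the body is encoded before measuring.
    """
    body = content.encode("utf-8")
    header = f"Path-Name: {path}\nContent-Length: {len(body)}\n\n"
    return header.encode("utf-8") + body

def dump_table(conn, out):
    """Write every row of the (hypothetical) documents table to out."""
    rows = conn.execute("SELECT id, body FROM documents ORDER BY id")
    for doc_id, body in rows:
        out.write(format_record(f"db/documents/{doc_id}", body))

if __name__ == "__main__":
    # In-memory database with sample rows, standing in for a real RDBMS.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, body TEXT)")
    conn.executemany(
        "INSERT INTO documents (body) VALUES (?)",
        [("first swish document",), ("second document",)],
    )
    dump_table(conn, sys.stdout.buffer)
```

The Path-Name value is whatever you want to get back in the search results,
so a key that lets the CGI script fetch the row again is a reasonable
choice.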



Bill Moseley
mailto:moseley@hank.org
Received on Thu May 31 14:42:44 2001