At 02.09.2002 07:54 +0200, you wrote:
>>1. encode the binary with base64 or the like and decode it for display
>
>That is interesting .... can you give me more details? Or give me a
>link? Once I have encoded such data, where do I have to put them? In
>properties? And after that, how can I retrieve them? Note: I'm using
>xml2 files and parser.
If you use the -S prog feature of swish-e, it is easy to put the binaries
in the index.
Just write a script that encodes the binary data and embeds it in an XML
document:
<document>
<title>a title</title>
<encoded>fhjhjhjghhgjgjgjghjg.....jghjkghjghjghjghjfghj</encoded>
<description>some useful text here</description>
..
</document>
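A minimal sketch of such a script, assuming the usual -S prog convention that the program prints header lines (Path-Name, Content-Length, optionally Document-Type), a blank line, and then the document itself. The helper names make_record and xml_escape are made up for this example, and the "XML*" document type is an assumption, so check the header names and values against the docs of your swish-e version:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use MIME::Base64 qw(encode_base64);

# Escape the characters that are special in XML text.
sub xml_escape {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Build one record: header lines, a blank line, then the XML document.
# Assumes plain byte strings, so length() counts bytes.
sub make_record {
    my ($path, $title, $description, $binary) = @_;

    # An empty second argument keeps the base64 text on a single line.
    my $encoded = encode_base64($binary, '');

    my $xml = join "\n",
        '<document>',
        '<title>' . xml_escape($title) . '</title>',
        "<encoded>$encoded</encoded>",
        '<description>' . xml_escape($description) . '</description>',
        '</document>',
        '';

    return "Path-Name: $path\n"
         . 'Content-Length: ' . length($xml) . "\n"
         . "Document-Type: XML*\n"    # assumed value, see your docs
         . "\n"
         . $xml;
}

# Hypothetical usage: emit one record per file named on the command line.
for my $path (@ARGV) {
    open my $fh, '<', $path or do { warn "skip $path: $!\n"; next };
    binmode $fh;
    my $binary = do { local $/; <$fh> };
    close $fh;
    print make_record($path, $path, "binary file $path", $binary);
}
```

For the encoded text to come back with the search results, the tag also has to be declared as a property in the swish-e config (for example with a PropertyNames line listing encoded).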
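For searching you can use the Perl module delivered with swish-e and decode
the binary:
SwishSearch($handle, "a query", 1, "title encoded description", "rank desc");
# @standard and @props are the lists of key names for the standard
# result fields and for the requested properties.
my %results;
while( @results{ @standard, @props } = SwishNext( $handle )) {
    print $results{"title"};
    # decode() stands for your own decoder, matching the encoding used
    print decode($results{"encoded"});
}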
(...or so...see docs for details)
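The decode() call in the loop above is not part of swish-e. If the data was encoded with base64, a sketch of it could look like the following, using the MIME::Base64 module that ships with Perl. The save_binary helper and its arguments are made up for this example:

```perl
use strict;
use warnings;
use MIME::Base64 qw(decode_base64);

# Turn the base64 text stored in the "encoded" property back into raw bytes.
sub decode {
    my ($encoded) = @_;
    return '' unless defined $encoded;
    return decode_base64($encoded);
}

# Hypothetical helper: write one decoded result to a file for display.
sub save_binary {
    my ($encoded, $out_path) = @_;
    open my $out, '>', $out_path or die "cannot write $out_path: $!";
    binmode $out;
    print {$out} decode($encoded);
    close $out or die "cannot close $out_path: $!";
    return $out_path;
}
```

When printing binary data straight to STDOUT (for example from a CGI script), binmode STDOUT first so the bytes are not altered on platforms that translate line endings.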
This approach has one _big_ drawback: the encoded data has to be stored in
the index file (or rather in the props file) and will blow it up. And it makes
indexing slow - the system has to process all the data and produce
properties for it.
So imho the "store the id" approach is better...
>>2. just store an id or filename which point the binary data.
>
>I have already used MySQL, but I'm wondering if it is possible to store
>data in swish-e rather than in an external archive.
It is possible and might be useful if you have small binary properties.
I was technical director of one of the biggest German search engines until
the end of last year (Infoseek Germany), and we had an image search that
showed thumbnails for the images found. We found that it was better to set
up an external data store for them, because they were too big and made
indexing extremely slow.
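A rough sketch of the "store the id" approach, assuming the index holds only a short id property and the binaries live as files in a directory outside the index. The directory, the ".bin" suffix and the function names are all made up for this example; a database lookup would work the same way:

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical location of the external store; adjust to your setup.
my $STORE_DIR = '/var/data/binaries';

# Map an id taken from a search result to a file in the external store.
# Only plain ids are accepted so a property value cannot escape the directory.
sub path_for_id {
    my ($id, $dir) = @_;
    $dir = $STORE_DIR unless defined $dir;
    die "invalid id\n" unless defined $id && $id =~ /\A[A-Za-z0-9_-]+\z/;
    return File::Spec->catfile($dir, "$id.bin");
}

# Read the binary for one id; returns undef if the file cannot be opened.
sub load_binary {
    my ($id, $dir) = @_;
    my $path = path_for_id($id, $dir);
    open my $fh, '<', $path or return undef;
    binmode $fh;
    my $data = do { local $/; <$fh> };
    close $fh;
    return $data;
}
```

In the result loop this would be used as something like load_binary($results{"id"}), where "id" is whatever property name you chose. The index then stays small, and the binaries are only read for the hits that are actually displayed.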
Greetings
Guido
>Thank you for your kind answer.
>Best Regards
>
>Cristiano Corsani
>----------------------------------------
>Biblioteca Nazionale Centrale di Firenze
>Piazza Cavalleggeri 1
>50122 Firenze
>Tel.: +39 055 24919 220
>mailto:cristiano.corsani@bncf.firenze.sbn.it
>
Received on Tue Sep 3 19:06:47 2002