RE: Document Summaries/Descriptions

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Wed Nov 15 2000 - 13:58:12 GMT
At 02:35 AM 11/15/00 -0800, jmruiz@boe.es wrote:
>So, a document may contain both title and description, right?
>I would also like the possibility that the description can be a field 
>(Metaname). What about:
>
>StoreDescription <field>|size
>
>For a field (the field may be enclosed by <>). Eg:
>
>StoreDescription <myfield>

I haven't really thought much about it, but instead of adding a new
StoreDescription parameter, could Properties be extended instead?

	Properties body:500  # save the first 500 chars of <body>

	Properties *:500     # save the first 500 chars of the document

Of course, with some document types you may end up with an incomplete
document.

OTOH, if Properties won't work in some cases, I'm not convinced this
feature has to live in swish at all; it could be handled outside.  Document
summaries are faster to access if they are stored in the index, but storing
them might come at the expense of search speed -- and searching is swish's
main job.

If you index via the file system, your CGI front-end can always open the
documents itself and show the first x characters as a summary.  If you index
with the httpd method, the spider could extract the first x characters and
save them to a local file or database, depending on your needs.  That would
work better for HTML, since you could use HTML::TreeBuilder to extract
well-formed content instead of just chopping the markup off after x
characters.
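
As a rough sketch of that last idea (not swish code, and not the
HTML::TreeBuilder approach itself): the example below uses Python's standard
html.parser module as a stand-in. It pulls the visible text out of an HTML
page and trims it to at most x characters without cutting a word in half.
The names TextExtractor and summarize_html, the 500-character default, and
the choice to skip <title>, <script> and <style> are all assumptions made
for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text; ignore <title>, <script> and <style> content."""

    SKIP = {"title", "script", "style"}
    # Tags that normally separate words when rendered.
    BLOCK = {"p", "br", "div", "li", "tr", "td",
             "h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1
        elif tag in self.BLOCK:
            self.chunks.append(" ")

    def handle_endtag(self, tag):
        if tag in self.SKIP:
            if self._skip_depth:
                self._skip_depth -= 1
        elif tag in self.BLOCK:
            self.chunks.append(" ")

    def handle_data(self, data):
        if not self._skip_depth:
            self.chunks.append(data)


def summarize_html(html, max_chars=500):
    """Return up to max_chars of visible text, cut at a word boundary."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace into single spaces.
    text = " ".join("".join(parser.chunks).split())
    if len(text) <= max_chars:
        return text
    cut = text[:max_chars]
    # If the cut landed inside a word, back up to the previous space.
    if text[max_chars] != " " and " " in cut:
        cut = cut[:cut.rfind(" ")]
    return cut.rstrip()
```

A spider or CGI front-end could call summarize_html(page, 500) once per
document and store the result in a local file or database next to the
document's path or URL.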



Bill Moseley
mailto:moseley@hank.org
Received on Wed Nov 15 13:59:44 2000