Steve Thomas wrote:
> Others might be interested in what I've been doing with swish-e and
> metadata. Primarily, what I wanted to do was to be able to create files
> of metadata, index them with swish-e, but have the search results point
> users to the actual files described by the metadata, rather than the
> metadata files.
That sounds like a reasonable solution. I have experimented with
similar things. Indexing only the metadata has its merits in many
cases. I do try to keep the metadata with the document where possible
(i.e. HTML). But separating it in some cases has definite
possibilities (non-HTML metadata is one definite plus).
> enhancements to swish-e to make all this a bit...
> 1. an option to have dc.title replace title in the index;
> 2. an option to have dc.identifier replace file name in the index;
> 3. an option to limit indexing to just the HEAD part of an html file
> 4. an option to recognise and index rdf data, in the same ways.
Excellent ideas. Have you thought about having swish-e call an external
script or program to do some of this? A custom spider might be able to
do some of those things more efficiently and flexibly (I don't think
anything inherently limits the spider to http).
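
To make the external-script idea a bit more concrete, here is a rough
sketch in Python. It reads only the HEAD of a metadata file, then lets
dc.identifier stand in for the file name and dc.title for the title,
falling back to the metadata file's own path when either is missing.
The names HeadMetaParser and describe are invented for illustration,
and how the resulting record would be handed to swish-e is deliberately
left open; nothing here assumes a particular swish-e interface.

```python
from html.parser import HTMLParser


class HeadMetaParser(HTMLParser):
    """Collect <meta name=... content=...> pairs found inside <head> only."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.done = False   # set once </head> has been seen
        self.meta = {}      # lower-cased meta name -> first content value

    def handle_starttag(self, tag, attrs):
        if self.done:
            return          # ignore everything after the HEAD
        if tag == "head":
            self.in_head = True
        elif tag == "meta" and self.in_head:
            attrs = dict(attrs)
            name = (attrs.get("name") or "").lower()
            content = attrs.get("content")
            if name and content is not None:
                self.meta.setdefault(name, content)

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False
            self.done = True


def describe(metadata_path, html_text):
    """Build a hypothetical index record from one metadata file.

    'path' and 'title' are what a search result would show; 'words' is
    the text that would be indexed. The field names are assumptions.
    """
    parser = HeadMetaParser()
    parser.feed(html_text)
    meta = parser.meta
    return {
        "path": meta.get("dc.identifier", metadata_path),
        "title": meta.get("dc.title", metadata_path),
        "words": " ".join(meta.values()),
    }
```

A script along these lines could walk a directory of metadata files and
call describe() on each one. Because the parser stops collecting at
</head>, anything in the BODY is never looked at.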
--
,David Norris
Open Server Architecture Project - http://www.opensa.org/
Dave's Web - http://www.webaugur.com/dave/
ICQ Universal Internet Number - 412039
E-Mail - dave@webaugur.com
Received on Thu Mar 9 18:40:31 2000