I have a site structured in HTML with multiple items per page, where each item's sets of information are delimited by div tags with a descriptive class attribute.
Shortened Example:
<div class="content">
  <div class="product-details">
    <div class="product-authors">
      John Doe
    </div>
  </div>
  <div class="product-details">
    <div class="product-authors">
      Jane Doe
    </div>
  </div>
</div>
Currently I am just indexing the full text of each page plus the default Swish-e properties. The source is HTML, so I assume it defaults to the HTML parser.
I would like to offer a search restricted to just the contents of the "Author" divs, for example.
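To make the goal concrete, here is a minimal Python sketch of the extraction I have in mind. It is not Swish-e code or configuration: the names AuthorExtractor and extract_authors are made up for illustration, and it relies only on the standard library's html.parser. It assumes the author text always sits inside a div whose class list includes "product-authors".

```python
from html.parser import HTMLParser


class AuthorExtractor(HTMLParser):
    """Collect the text inside every <div class="product-authors"> element.

    Hypothetical helper for illustration only; not part of Swish-e.
    """

    def __init__(self, target_class="product-authors"):
        super().__init__()
        self.target_class = target_class
        self.depth = 0      # div nesting depth inside a matching div (0 = outside)
        self.buffer = []    # text chunks of the div currently being read
        self.authors = []   # one cleaned-up string per matching div

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so <DIV> and <div> both arrive as "div".
        if tag != "div":
            return
        if self.depth:
            # Already inside a matching div: track nested divs so we know
            # which closing tag ends the author block.
            self.depth += 1
        elif self.target_class in (dict(attrs).get("class") or "").split():
            self.depth = 1
            self.buffer = []

    def handle_endtag(self, tag):
        if tag != "div" or not self.depth:
            return
        self.depth -= 1
        if self.depth == 0:
            # Collapse runs of whitespace and newlines into single spaces.
            text = " ".join("".join(self.buffer).split())
            if text:
                self.authors.append(text)

    def handle_data(self, data):
        if self.depth:
            self.buffer.append(data)


def extract_authors(html):
    """Return the author strings found in the given HTML text."""
    parser = AuthorExtractor()
    parser.feed(html)
    parser.close()
    return parser.authors
```

Fed the shortened example above, extract_authors should return the two names, "John Doe" and "Jane Doe". What I am after is the equivalent of this, done by the indexer itself so that the result is searchable as its own field.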
I've been trying to define and use a property for the Author class, but without success.
I think I need to use some combination of MetaNames in the index config file and in the search CGI script, but I've been unable to figure out the exact format to use.
I assume the index config file needs something along the lines of:
UndefinedMetaTags ignore
XMLClassAttributes class # Not supported by the HTML parser?
MetaNames swishtitle swishdocpath swishdescription div.product-authors
Is this possible? Would I have to convert to strict XHTML in order to use the XML parser, so that the class attribute can be used as a property/metatag? Or am I missing something else?
When I try the above, indexing appears to work (it reports "4 properties sorted." without any errors), but the search script returns "Unknown property name to sort by: Property 'div.authors' is not defined in index '<my index file>'" when I try to search by div.authors.
Does anyone have an example of something like this working?
Thanks for any help,
Thomas Sewell
Received on Fri Mar 12 11:00:44 2004