Re: Indexing Multiple <title>tags

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Wed Mar 29 2006 - 03:08:29 GMT
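XPath will not work within swish-e, because swish-e uses a SAX parser: it sees the document as a stream of events and never builds the tree that XPath expressions are evaluated against.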

Your best bet is to filter your input prior to indexing and actually 
change the tag name: either rename the other two titles to e.g. 'title1' 
and 'title3', or rename the real title to e.g. 'titlereal' and then use 
config options to map swishtitle to 'titlereal'. The first option is 
probably easier.
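
As a rough illustration of the first option, here is a minimal sketch in Python (not part of swish-e) that renames every <title> except the second one in document order. The function name rename_extra_titles, the keep_index parameter, and the SAMPLE document are all made up for this example; the sample only guesses at the shape of the real file, and a real TEI file with a DOCTYPE or custom entities may need a more forgiving parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the real document: three <title> elements,
# of which only the second should remain a plain <title>.
SAMPLE = (
    "<TEI.2><teiHeader><title>Birds of Nebraska</title></teiHeader>"
    "<text><body><cit><bibl><title>Birds on Our Street</title>"
    "<title>Omaha Sunday Bee</title></bibl></cit></body></text></TEI.2>"
)


def rename_extra_titles(xml_text, keep_index=2):
    """Rename every <title> except the keep_index-th (1-based, document order).

    The other titles become <title1>, <title3>, ... so that an indexer
    looking only at <title> sees a single element.
    """
    root = ET.fromstring(xml_text)
    # Materialise the list first so renaming does not disturb the iteration.
    titles = list(root.iter("title"))
    for position, elem in enumerate(titles, start=1):
        if position != keep_index:
            elem.tag = f"title{position}"
    return ET.tostring(root, encoding="unicode")


filtered = rename_extra_titles(SAMPLE)
print(filtered)
```

The idea would be to run each file through something like this before it reaches the indexer, so that only the wanted title is still called 'title' by the time swish-e parses it.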

Vijay scribbled on 3/28/06 2:11 PM:
> I am implementing a search interface for an XML-based website using SWISH-e (with libxml2). Everything works fine except for the title the user sees in the search results. The search results take their contents from all three <title> tags that I have in each of the XML pages in the collection. Is there any way to limit indexing to only the second <title> in the document?
>   The sample document is available at http://cse.unl.edu/~vbandaru/xmlsearch/vijay.xml
>   and sample search result is 
>   1 Birds of Nebraska Birds on Our Street - Feathered Choristers Who Salute the Coming Day With Notes of Joy Omaha Sunday Bee -- rank: 1000 
>   But I want the search result title to appear like this, ignoring the 1st and 3rd title contents:
>   
> 1 Birds on Our Street - Feathered Choristers Who Salute the Coming Day With Notes of Joy -- rank: 1000 
>   
> I tried using some of the directives for ignoring metanames for title, but that makes all titles disappear. I tried reading old threads, but they were not much help. Is there any structured way to access the 2nd title (as in XPath, like /TEI.2/text/body/cit/bibl/title) and index its contents?
>   The  config file is available at  http://cse.unl.edu/~vbandaru/xmlsearch/birds.conf
> Any help will be appreciated.
>   Thanks
> Vijayender Reddy Bandaru

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Tue Mar 28 19:08:33 2006