Re: [swish-e] How to search within XML tags where XML tag appears multiple times

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Thu Nov 18 2010 - 15:07:30 GMT

Tony Seddon wrote on 11/18/10 6:56 AM:
> 
> I have an XML structure like...
> 
> <name>
>   <first>
>     Tom
>   </first>
>   <last>
>     Smith
>   </last>
> </name>
> <name>
>   <first>
>     Bob
>   </first>
>   <last>
>     Jones
>   </last>
> </name>
> <name>
>   <first>
>     Jim
>   </first>
>   <last>
>     Farmer
>   </last>
> </name>
> 
> In the config I have "UndefinedMetaTags auto".
> 
> If I search for Tom Jones using...
>   swish-e -w "name=(Tom Jones)"
> It matches the XML above although it should not.
> 
> How can I devise a search that will not match Tom Jones in the above XML?
> 

You can't, at least not with that config.

"UndefinedMetaTags auto" is a brute-force config option, and should only be used
if (a) you already know what XML tags are in your collection and you want all of
them, or (b) you want to discover what tags are there in order to create a
config file.

If you instead put this in your config and reindexed:

MetaNames name
DontBumpPositionOnEndTags first last
DontBumpPositionOnStartTags first last

you could do a phrase search (note the double quotes inside the parentheses):

[karpet@pekmac:~/tmp/swish-names]$ swish-e -w 'name=("tom jones")'
# SWISH format: 2.5.8
# Search words: name=("tom jones")
# Removed stopwords:
err: no results
.
[karpet@pekmac:~/tmp/swish-names]$ swish-e -w 'name=("tom smith")'
# SWISH format: 2.5.8
# Search words: name=("tom smith")
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.008 seconds
1000 test.xml "test.xml" 185
.
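
For what it's worth, here is a rough way to picture why the phrase search
behaves like that. This is a toy Python model, not swish-e's actual indexing
code. It assumes that every start or end tag adds one to the word position
unless the tag is listed in dont_bump, and that a phrase matches only when its
words sit at consecutive positions. The function names index_positions and
phrase_match, and the gap size of one, are made up for illustration.

```python
import re

# Well-formed version of the XML from the original question.
XML = """
<name><first>Tom</first><last>Smith</last></name>
<name><first>Bob</first><last>Jones</last></name>
<name><first>Jim</first><last>Farmer</last></name>
"""

def index_positions(xml, dont_bump=()):
    """Return (word, position) pairs for a toy index.

    Assumption: a tag not listed in dont_bump leaves a one-position gap.
    A single dont_bump set stands in for both the start-tag and the
    end-tag directive, which the config above sets to the same tags.
    """
    pos = 0
    index = []
    # Each match is either a tag (group 1) or a word (group 2).
    for tag, word in re.findall(r"<\s*/?\s*([\w-]+)[^>]*>|(\w+)", xml):
        if tag:
            if tag not in dont_bump:
                pos += 1  # gap that keeps phrases from spanning this tag
        else:
            pos += 1
            index.append((word.lower(), pos))
    return index

def phrase_match(index, phrase):
    """True if the words of phrase occupy consecutive positions."""
    words = phrase.lower().split()
    if not words:
        return False
    positions = {}
    for word, pos in index:
        positions.setdefault(word, set()).add(pos)
    return any(
        all(start + i in positions.get(w, ()) for i, w in enumerate(words))
        for start in positions.get(words[0], ())
    )

default = index_positions(XML)                      # every tag bumps
relaxed = index_positions(XML, {"first", "last"})   # like the config above

print(phrase_match(relaxed, "tom smith"))  # True
print(phrase_match(relaxed, "tom jones"))  # False
print(phrase_match(default, "tom smith"))  # False
```

In this model, exempting first and last puts "tom" and "smith" next to each
other, while the name tags still leave a gap between one record and the next,
so "tom jones" never lines up as a phrase. If every tag bumps the position,
even "tom smith" fails as a phrase, which is why the two DontBumpPosition
lines are needed.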


See
http://swish-e.org/docs/swish-config.html#dontbumppositiononendtags

-- 
Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
Received on Thu Nov 18 10:07:34 2010