Re: search in xml files.

From: <dasoso(at)not-real.alumni.uv.es>
Date: Wed Apr 20 2005 - 14:04:53 GMT
  Hi all. 

   OK Bill, so after indexing, swish-e can't search for a word inside
the XML files unless the user gives the metaTag in the search string.
Instead, I have to use the -T option and post-process its output to
return the XML files that match the user's query. Is that what you
said?

   But that is hard to do with complex searches such as foo1 and (foo2 or ...
and (foo3 and foo4))


Thanks.



> On Sat, Apr 16, 2005 at 10:07:31AM -0700, dasoso@alumni.uv.es wrote:
> > > Do you have swish installed?  You can often use the "-T indexed_words"
> > > to get answers to your questions.
> 
> >    Ok Bill, but I was talking about .xml files. -w Tom works perfectly
> > with html files but doesn't return the xml files where the word
> > appears.
> 
> I was talking about XML files, too.
> 
> 
> > 1.xml
> > ::::::::::::::
> > 
> > <book>
> >    <title> Tomy Sawyer </title>
> > </book>
> 
> 
> > ::::::::::::::
> > 2.xml
> > ::::::::::::::
> > <person>
> >     <name> Tomy Smith </name>
> > </person>
> > 
> > 
> > swish-e-2.4.3> swish-e -w Sawyer
> > # SWISH format: 2.4.3
> > # Search words: Sawyer
> > # Removed stopwords:
> > # Number of hits: 1
> > # Search time: 0.002 seconds
> > # Run time: 0.025 seconds
> > 1000 /home/.../1.html "1.html" 41
> > 
> >     It only returns the html files where Sawyer appears. I would like to
> > know which xml files contain the word Sawyer. I know I can do it with -T
> > index_words_full, but only for one word.
> 
> That's not what I said.  I said use -T indexed_words -- you do that
> when indexing.
> 
>    swish-e -i 1.xml 2.xml -T indexed_words
> 
> If you can't search for what you think you can search for then you
> use -T indexed_words to make sure you are actually indexing what you
> think you are indexing.
> 
> It's like car keys.  If you can't find them, then you only need to
> learn where you put them to know how to find them.
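
For anyone wanting to repeat the check suggested above, here is a minimal sketch. It recreates the two sample files from the thread in a scratch directory and runs the same indexing and search commands quoted in the messages. The scratch directory and the `workdir` variable are illustrative assumptions, not part of the original exchange, and the swish-e calls are skipped if the binary is not on the PATH.

```shell
#!/bin/sh
# Scratch directory for the experiment (hypothetical; any empty directory works).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the two sample documents quoted in the thread.
cat > 1.xml <<'EOF'
<book>
   <title> Tomy Sawyer </title>
</book>
EOF

cat > 2.xml <<'EOF'
<person>
    <name> Tomy Smith </name>
</person>
EOF

if command -v swish-e >/dev/null 2>&1; then
    # Index both files and print the words as they are indexed,
    # using the command given in the thread.
    swish-e -i 1.xml 2.xml -T indexed_words
    # Repeat the search from the thread against the new index.
    swish-e -w Sawyer || echo "search reported no hits or an error"
else
    echo "swish-e not found; sample files are in $workdir"
fi
```

If Sawyer does not show up in the -T indexed_words output for 1.xml, the problem is on the indexing side rather than in the search string, which is the point of the car keys analogy.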
Received on Wed Apr 20 07:05:05 2005