Make sure you have the latest CVS build of swish-e. A fix went in last
week for a bug with the -r/-u options (per Bill's email of yesterday).
I was using the latest version when I ran my test.
Tilo Muetze wrote on 12/6/04 9:19 AM:
> Hi Peter,
>
> Peter Karman wrote:
>
>>it works for me, so all I can guess is that there's something amiss with
>>how it's dealing with filtered files? I suggest trying some of the
>>debugging features, with -T and -v and so forth, to try and figure out
>>what's happening. note that in my example below, the -r run says "0
>>total words" -- to indicate that it was removed, I guess.
>>
>
> [snip]
>
> Hmm, I really don't know what's going wrong here:
> <<<
> muet05@lnxsop02:/leitstand/master/bagjas$ cat test.xml
> <foo>bar </foo>
>
> muet05@lnxsop02: ~/swish-e-2.5.2-2004-12-03/src/swish-e -i test.xml
> Indexing Data Source: "File-System"
> Indexing "test.xml"
> Removing very common words...
> no words removed.
> Writing main index...
> Sorting words ...
> Sorting 1 words alphabetically
> Writing header ...
> Writing index entries ...
> Writing word text: Complete
> 1 unique word indexed.
> 4 properties sorted.
> 1 file indexed. 17 total bytes. 1 total words.
> Elapsed time: 00:00:00 CPU time: 00:00:00
> Indexing done!
>
> muet05@lnxsop02: ~/swish-e-2.5.2-2004-12-03/src/swish-e -w bar
> # SWISH format: 2.5.2
> # Search words: bar
> # Removed stopwords:
> # Number of hits: 1
> # Search time: 0.001 seconds
> # Run time: 0.005 seconds
> 1000 test.xml "test.xml" 17
> .
>
> muet05@lnxsop02: ~/swish-e-2.5.2-2004-12-03/src/swish-e -r -i test.xml
> Indexing Data Source: "File-System"
> Indexing "test.xml"
> Removing very common words...
> no words removed.
> Writing main index...
> Sorting words ...
> Sorting 1 words alphabetically
> Writing header ...
> Writing index entries ...
> Writing word text: Complete
> 1 unique word indexed.
> 4 properties sorted.
> 1 file indexed. 17 total bytes. 1 total words.
> Elapsed time: 00:00:00 CPU time: 00:00:00
> Indexing done!
>
> muet05@lnxsop02: ~/swish-e-2.5.2-2004-12-03/src/swish-e -w bar
> # SWISH format: 2.5.2
> # Search words: bar
> # Removed stopwords:
> # Number of hits: 1
> # Search time: 0.001 seconds
> # Run time: 0.005 seconds
> 1000 test.xml "test.xml" 17
> >>>
>
> Starting today I will be on a business trip for a week, so until next week
> I will not be able to do a deeper analysis of what exactly is going on here.
> Anyway, thank you very much for your help so far; it's greatly appreciated.
>
> Regards,
> Tilo
--
Peter Karman . http://www.cray.com/craydoc/ . karman(at)not-real.cray.com
Received on Mon Dec 6 07:28:14 2004