This error does not occur in all situations. It only occurs with certain files and with certain keywords. I have produced a test case that might help you, and I can send the resulting indexes to anyone who wants them.
Starting with the swish-e source code, in swish-e-2.4.2/prog-bin, I created two subdirectories that each contain a subset of the files.
I have narrowed it down to one set of files that works (test12) and another that does not (test13):
ls3122:~/swish-e-2.4.2/prog-bin # ls test12
Makefile spider.pl
ls3122:~/swish-e-2.4.2/prog-bin # ls test13
Makefile spider.pl spider.pl.in
Indexing those directories:
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -i test12 -f test12.index
Indexing Data Source: "File-System"
Indexing "test12"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 1,303 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: Complete
Writing word hash: Complete
Writing word data: Complete
1,303 unique words indexed.
4 properties sorted.
2 files indexed. 76,979 total bytes. 10,594 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -i test13 -f test13.index
Indexing Data Source: "File-System"
Indexing "test13"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 1,303 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: Complete
Writing word hash: Complete
Writing word data: Complete
1,303 unique words indexed.
4 properties sorted.
3 files indexed. 142,443 total bytes. 19,606 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -w test -f test13.index
Warning: Failed to uncompress Property. zlib uncompress returned: -5. uncompressed size: 105 buf_len: -19960
# SWISH format: 2.4.2
# Search words: test
# Removed stopwords:
err: no results
.
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -w test -f test12.index
# SWISH format: 2.4.2
# Search words: test
# Removed stopwords:
# Number of hits: 2
# Search time: 0.000 seconds
# Run time: 0.020 seconds
1000 test12/spider.pl "spider.pl" 65479
852 test12/Makefile "Makefile" 11500
.
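For reference, the -5 in the warning above matches zlib's Z_BUF_ERROR code. The following is a minimal Python sketch, not swish-e code, of one way that code can arise: decompressing a stream whose length is shorter than it should be. The sample data and the helper name try_decompress are made up for illustration, and whether swish-e hits the same condition is only an assumption. The negative buf_len in the warning does at least suggest that a length was read or computed incorrectly.

```python
import zlib

# Hypothetical sample data standing in for a stored property value.
original = b"swish-e property value " * 10
compressed = zlib.compress(original)


def try_decompress(blob):
    """Return (data, None) on success or (None, error message) on failure."""
    try:
        return zlib.decompress(blob), None
    except zlib.error as exc:
        return None, str(exc)


# The intact stream round-trips without error.
data, err = try_decompress(compressed)
print("intact:", err is None and data == original)

# Dropping the last few bytes simulates a wrong (too short) stored length.
data, err = try_decompress(compressed[:-4])
print("truncated:", err)
```

If the property length recorded in the index is wrong, the decompressor is handed the wrong number of bytes, which is the kind of failure the truncated case imitates.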
If I run hexdump on each index's .prop file and then diff the two dumps, I get this result:
ls3122:~/swish-e-2.4.2/prog-bin # diff -u test13.index.prop.txt test12.index.prop.txt
--- test13.index.prop.txt 2004-07-05 17:24:19.000000000 +0200
+++ test12.index.prop.txt 2004-07-05 17:23:36.000000000 +0200
@@ -1,9 +1,7 @@
-0000000 0000 0000 e940 d771 000f 6574 7473 3331
+0000000 0000 0000 e940 dd71 000f 6574 7473 3231
0000010 4d2f 6b61 6665 6c69 0865 0000 0000 0000
-0000020 2c00 08ec 0000 0000 4000 71e9 13ad 7400
-0000030 7365 3174 2f33 7073 6469 7265 702e 2e6c
-0000040 6e69 0008 0000 0000 0000 b8ff 0008 0000
-0000050 0000 e940 c071 0010 6574 7473 3331 732f
-0000060 6970 6564 2e72 6c70 0008 0000 0000 0000
-0000070 c7ff 0008 0000 0000 e940 ad71
-000007c
+0000020 2c00 08ec 0000 0000 4000 61e9 10ee 7400
+0000030 7365 3174 2f32 7073 6469 7265 702e 086c
+0000040 0000 0000 0000 ff00 08c7 0000 0000 4000
+0000050 61e9 00ee
+0000053
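The dump above is in hexdump's default format, which prints 16-bit words in host byte order, so on a little-endian machine each pair of bytes appears swapped. To make it easier to read, here is a small Python sketch that turns one such line back into bytes. The function name is hypothetical, and it assumes a little-endian host and full two-byte words (the odd-length final line of a dump is padded and would need extra care).

```python
def hexdump_words_to_bytes(line):
    """Convert one line of default `hexdump` output back to raw bytes.

    Assumes the first field is the offset and every other field is a
    16-bit word printed on a little-endian host.
    """
    fields = line.split()
    out = bytearray()
    for word in fields[1:]:  # skip the leading offset column
        out += int(word, 16).to_bytes(2, "little")
    return bytes(out)


first = hexdump_words_to_bytes("0000000 0000 0000 e940 d771 000f 6574 7473 3331")
second = hexdump_words_to_bytes("0000010 4d2f 6b61 6665 6c69 0865 0000 0000 0000")
print(first)
print(second)
```

Decoded this way, the first two lines of the test13 dump contain the path test13/Makefile, preceded by a byte 0x0f, which equals the length of that path (15 characters).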
The error also does not occur for every search word, even though both indexes contain exactly the same words:
swish-e -k* -f test13.index > test13.words
swish-e -k* -f test12.index > test12.words
diff test12.words test13.words
Produces no difference.
If I look for the word "make", I get no errors.
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -w make -f test13.index
1000 test13/spider.pl "spider.pl" 65479
1000 test13/spider.pl.in "spider.pl.in" 65464
645 test13/Makefile "Makefile" 11500
.
ls3122:~/swish-e-2.4.2/prog-bin # swish-e -w make -f test12.index
1000 test12/spider.pl "spider.pl" 65479
645 test12/Makefile "Makefile" 11500
So I also tried a test14 with just spider.pl and spider.pl.in:
swish-e -w test -f test14.index
1000 test14/spider.pl "spider.pl" 65479
1000 test14/spider.pl.in "spider.pl.in" 65464
And a test15 with spider.pl.in and Makefile:
swish-e -w test -f test15.index
1000 test15/spider.pl.in "spider.pl.in" 65464
852 test15/Makefile "Makefile" 11500
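The manual tests above cover the combinations of the three files one at a time. As a rough sketch of how that could be made systematic, the following Python snippet lists every subset of two or more files together with the swish-e commands to index and search it. The directory names (subset1, subset2, ...) and the function name are hypothetical. Only the -i, -f and -w options already used in this message appear, and nothing is executed.

```python
from itertools import combinations

FILES = ["Makefile", "spider.pl", "spider.pl.in"]


def plan(files, word="test", min_size=2):
    """Build one indexing and one search command per subset of files."""
    steps = []
    for size in range(min_size, len(files) + 1):
        for subset in combinations(files, size):
            name = f"subset{len(steps) + 1}"  # hypothetical directory name
            steps.append({
                "dir": name,
                "files": list(subset),
                "index_cmd": ["swish-e", "-i", name, "-f", f"{name}.index"],
                "search_cmd": ["swish-e", "-w", word, "-f", f"{name}.index"],
            })
    return steps


for step in plan(FILES):
    print(step["dir"], step["files"])
    print("  " + " ".join(step["index_cmd"]))
    print("  " + " ".join(step["search_cmd"]))
```

Each listed directory would need to be populated with copies of its files before the commands are run by hand.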
Please tell me if you have any ideas on how to approach this problem, and whether you want me to send you a tarball.
Mike
Received on Mon Jul 5 09:16:41 2004