Bug in sorting by swishdocpath in 2.2.3

From: Stephen E. Bacher <seb(at)not-real.draper.com>
Date: Mon Jan 27 2003 - 16:22:59 GMT
I'm running SWISH-E 2.2.3 under Solaris 8, using the -s flag
to specify the sort order of the search results (using text
indexing):

 -s swishdocpath asc

It seems, however, that the sorting by swishdocpath does not work
properly when the path contains components longer than 8 characters.

Here is an illustration:  I have two test directories, one
called "testjunk" and one called "testjunk2".

% cd testjunk
/home/seb1525/search/mail/test/testjunk
% more swish-e.config
IndexDir /usr/local/my2/htdocs/seb1525/mysearch/db/mail/test/testjunk
IndexFile index.swish-e 
IndexOnly .txt
UseStemming no
IndexReport 3
FollowSymlinks no
IndexPointer /seb1525/mysearch/db/mail/test/testjunk/
IndexAdmin Steve Bacher (seb@draper.com)
IndexContents TXT .txt
ReplaceRules prepend /seb1525/mysearch/db/mail/test/testjunk/
% swish-search -w 'not foo' -f index.swish-e -s swishdocpath asc
# SWISH format: 2.2.3
# Search words: not foo
# Number of hits: 10
# Search time: 0.031 seconds
# Run time: 0.147 seconds
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000001.txt "M0000001.txt" 1544
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000002.txt "M0000002.txt" 1365
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000003.txt "M0000003.txt" 1290
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000004.txt "M0000004.txt" 295
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000005.txt "M0000005.txt" 3110
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000006.txt "M0000006.txt" 448
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000007.txt "M0000007.txt" 606
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000008.txt "M0000008.txt" 760
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000009.txt "M0000009.txt" 341
1000 /seb1525/mysearch/db/mail/test/testjunk/./M0000010.txt "M0000010.txt" 1311
.
% cd ../testjunk2
/home/seb1525/search/mail/test/testjunk2
% more swish-e.config
IndexDir /usr/local/my2/htdocs/seb1525/mysearch/db/mail/test/testjunk2
IndexFile index.swish-e 
IndexOnly .txt
UseStemming no
IndexReport 3
FollowSymlinks no
IndexPointer /seb1525/mysearch/db/mail/test/testjunk2/
IndexAdmin Steve Bacher (seb@draper.com)
IndexContents TXT .txt
ReplaceRules prepend /seb1525/mysearch/db/mail/test/testjunk2/
% swish-search -w 'not foo' -f index.swish-e -s swishdocpath asc
# SWISH format: 2.2.3
# Search words: not foo
# Number of hits: 17
# Search time: 0.011 seconds
# Run time: 0.147 seconds
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000009.txt "M0000009.txt" 620
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000008.txt "M0000008.txt" 2041
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000007.txt "M0000007.txt" 2907
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000006.txt "M0000006.txt" 705
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000005.txt "M0000005.txt" 1702
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000004.txt "M0000004.txt" 735
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000003.txt "M0000003.txt" 3823
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000001.txt "M0000001.txt" 8488
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000002.txt "M0000002.txt" 3663
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000011.txt "M0000011.txt" 1284
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000017.txt "M0000017.txt" 1725
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000016.txt "M0000016.txt" 1102
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000015.txt "M0000015.txt" 5730
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000014.txt "M0000014.txt" 995
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000013.txt "M0000013.txt" 476
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000012.txt "M0000012.txt" 2084
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000010.txt "M0000010.txt" 1834
.
% 

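Notice that the "testjunk" results are sorted correctly, but the
"testjunk2" results are not: they come back in a mostly descending
order (M0000009 down to M0000003, then M0000001, M0000002, M0000011,
then M0000017 down to M0000012, and finally M0000010).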
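
For anyone who wants to check their own output for this, here is a minimal Python sketch that finds the first out-of-order pair of paths in a result listing. It is not part of SWISH-E; the function names (parse_hits, first_out_of_order) are made up for illustration, and it assumes the default result format shown above (rank, path, quoted title, size) with no spaces in the paths.

```python
import shlex


def parse_hits(output):
    """Return the document paths from result lines, in output order.

    Assumes the default format: rank, path, "title", size.
    """
    paths = []
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines, "# ..." header lines, and the terminating ".".
        if not line or line.startswith("#") or line == ".":
            continue
        fields = shlex.split(line)  # handles the quoted title field
        paths.append(fields[1])
    return paths


def first_out_of_order(paths):
    """Return the first adjacent pair that breaks ascending order, or None."""
    for a, b in zip(paths, paths[1:]):
        if a > b:
            return (a, b)
    return None


if __name__ == "__main__":
    # First three result lines from the testjunk2 run above.
    sample = """\
# SWISH format: 2.2.3
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000009.txt "M0000009.txt" 620
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000008.txt "M0000008.txt" 2041
1000 /seb1525/mysearch/db/mail/test/testjunk2/./M0000007.txt "M0000007.txt" 2907
.
"""
    print(first_out_of_order(parse_hits(sample)))
```

The comparison is a plain string comparison on the full path, which is what I would expect "-s swishdocpath asc" to produce; if SWISH-E is meant to order paths some other way, the check would need adjusting.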

It doesn't seem to have anything to do with the number of hits,
either, because much larger databases (with pathname components
under the 8-character "limit") don't have this problem.

Any clues?

Steve Bacher
Draper Laboratory
seb@draper.com
Received on Mon Jan 27 16:23:26 2003