Greetings fellow Swish-e users,
I have recently completed installation of Swish-e on an Apache server
machine with the following details:
Swish-e version - 2.4.5
Apache version - 2.0.52
I have two short questions:
1) I have noticed that indexing of PDF files seems to be limited to the
root directory. I have PDFs in the root directory and ones in a sub
directory. Only PDFs in the root directory ever appear in search
results. It is my understanding that Swish-e automatically recurses
subdirectories when indexing. Is this not also the case with indexing of
PDFs?
2) I have also noticed that Swish-e does not seem to be indexing numbers
inside of Excel or other Office files very well. When I search for a
number I know to be in an indexed file, for example 22469, the search
often yields no results.
Here are the contents of my configuration file:
IndexFile index.swish-e
IndexDir /var/www/html
IndexDir /var/www/twiki/data
FollowSymLinks yes
WordCharacters abcdefghijklmnopqrstuvwxyz0123456789.-
IgnoreFirstChar .-
IgnoreLastChar .-
BeginCharacters abcdefghijklmnopqrstuvwxyz0123456789
EndCharacters abcdefghijklmnopqrstuvwxyz0123456789
ReplaceRules remove /var/www/html
FollowSymLinks yes
IndexReport 2
IgnoreWords file: /var/www/swish-e/share/doc/swish-e/examples/conf/stopwords/english.txt
TranslateCharacters :ascii7:
BumpPositionCounterCharacters |.
IndexOnly .html .htm .doc .ppt .xls .pdf .rtf .txt .jpg .bmp .png
NoContents .jpg .gif .bmp .png .ico
FileFilter .pdf share/doc/swish-e/examples/filter-bin/_pdf2html.pl
IndexContents HTML .pdf
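For question 1, here is a minimal sketch of how the PDFs under the two IndexDir paths could be listed, so the list can be compared against what shows up in search results. It is not part of Swish-e; the function name find_files and its parameters are made up for illustration, and it only walks the filesystem with the Python standard library. It follows symlinks by default to mirror FollowSymLinks yes, which I assume is safe only if there are no symlink loops.

```python
import os


def find_files(roots, suffix=".pdf", follow_links=True):
    """Return a sorted list of paths under each root whose names end with suffix.

    The match is case-insensitive, so report.PDF is found as well.
    follow_links=True mirrors "FollowSymLinks yes" in the config above;
    a symlink loop would make the walk run away, so set it to False if unsure.
    """
    matches = []
    for root in roots:
        for dirpath, _dirnames, filenames in os.walk(root, followlinks=follow_links):
            for name in filenames:
                if name.lower().endswith(suffix):
                    matches.append(os.path.join(dirpath, name))
    return sorted(matches)


if __name__ == "__main__":
    # The two IndexDir paths from the configuration file above.
    for path in find_files(["/var/www/html", "/var/www/twiki/data"]):
        print(path)
```

Any PDF in this listing that never appears in search results would be one that the indexer skipped or that the filter failed to convert.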
Any support you can provide is greatly appreciated!
Thanks,
Peter
Received on Mon Sep 24 15:20:59 2007