I have SWISH-E 2.4.3 installed on Windows Server 2003. I've installed unrtf, xhtml, and wvware and made sure they're in my path, and I have the pp2html.pm and Rtf2html.pm filters in the Filters directory. I've also installed Parse::Excel and all other Perl modules required for the various filters.
When I set my swish.conf so it will call DirTree.pl with IndexDir and SwishProgParameters, I get lots of fun, confusing errors that I have not been able to resolve. For now I get around them by running "perl lib/swish-e/DirTree.pl Y: | swish-e -e -S prog -i stdin -c swish.conf". That's worked for the individual files, but I just started it on the whole directory.
It also hasn't been indexing all the directories in the path, even though they index fine if I specify them individually. I'm not sure yet if that's related or because there are just a lot of files (~350k docs, 19k folders, 15GB of space). I'm trying to eliminate one problem at a time.
Anyway, here are the errors:
2024 Warning - Y:/path/to/file: Use of uninitialized value in pattern match (m//) at C:/Perl/lib/IO/Handle.pm line 348.
2024 Warning - Y:/path/to/file: Use of uninitialized value in concatenation (.) or string at C:/Perl/lib/IO/Handle.pm line 358.
3112 Warning - Y:/path/to/file: Use of uninitialized value in substitution (s///) at C:/Perl/lib/HTML/Entities line 458.
3112 Warning - Y:/path/to/file: Use of uninitialized value in numeric lt (<) at C:\SWISH-E\lib\swish-e\perl/SWISH/Filters/XLtoHTML.pm line 70.
There are more, but they're all variations of these four. I think they're coming from Perl, but I don't know enough Perl to analyze or troubleshoot the code. I had to resort to trial and error to get the correct regex syntax to stop DirTree.pl from returning .css files.
I've tested a few different configurations, and the only time it fails is if I use -S prog and have IndexDir set to DirTree.pl. If I set SwishProgParameters to Y:, or to any document that errors when I try Y:, I get the errors.
I've made sure the documents convert properly with each program. I've made sure the correct filter is chosen for individual documents using swish-filter-test. I've made sure DirTree.pl properly calls the filters for individual documents by running it against them directly. I've also made sure swish-e indexes the individual documents properly by piping the results from DirTree.pl to swish-e.
This is my config:
IndexContents HTML* .htm .html .ppt .xls .rtf .doc .pdf
IndexContents TXT* .txt .log .csv
IndexContents XML* .xml
Metanames swishtitle swishdocpath
StoreDescription TXT* 1000
StoreDescription HTML* <body> 1000
StoreDescription XML* <body> 1000
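For reference, the -S prog variant that triggers the errors adds two more lines to the config above, roughly as follows. This is a sketch rather than a copy of the file: the DirTree.pl path is borrowed from the workaround command, and naming the script directly in IndexDir (instead of running it through perl) is an assumption.

```
# Sketch only: the path and the direct script invocation are assumptions
# Program that swish-e runs to produce the documents when -S prog is used
IndexDir lib/swish-e/DirTree.pl
# Arguments handed to that program: the drive (or a single document) to walk
SwishProgParameters Y:
```

With those lines in place, swish-e is started with something like "swish-e -e -S prog -c swish.conf" and no pipe, so swish-e launches DirTree.pl itself instead of reading its output from stdin.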
I can't provide links to documents or attach them directly because the system is on a closed network. I've searched through the swish-e email archives for the last couple of days but couldn't find anything that might explain the behaviour or identify a fix. Thanks for your help.
Received on Thu Aug 4 08:41:06 2005