FW: PDF indexing suddenly stopped working

From: Chad Day <CDay(at)not-real.mindshare.net>
Date: Fri Dec 02 2005 - 15:02:41 GMT
Correctly formatted this time, my apologies.

PDF indexing suddenly stopped working, and I have no idea why. ☹

From the indexing process (swish-e -c swish.conf -v 3 -S http):

retrieving http://dev.website.org/files/Joomla%20Quick%20Start.pdf?PHPSESSID=413c04013e7c3505db9a68bedf8a8951 (3)...
sleeping 1 seconds before fetching http://dev.website.org/files/Joomla%20Quick%20Start.pdf?PHPSESSID=413c04013e7c3505db9a68bedf8a8951
Now fetching [http://dev.website.org/files/Joomla%20Quick%20Start%201.0.pdf?PHPSESSID=413c04013e7c3505db9a68bedf8a8951]...Status: 200. application/pdf

$ cat swish.conf
# Example configuration file

# Tell Swish-e what to index (same as -i switch above)
IndexDir http://dev.website.org/index.php
IndexFile /usr/local/apache/htdocs/website.index 
IndexOnly .php .txt .html .htm .pdf .xml .shtml

# Index the PDF files
FileFilter .pdf /usr/X11R6/bin/pdftotext '"%p" -'

# Tell Swish-e that .txt files, and .pdf files after filtering, use the text parser.
IndexContents TXT* .txt .pdf
IndexContents XML* .xml
IndexContents HTML* .htm .html .shtml .php

PropertyNamesMaxLength 1000 swishdescription
PropertyNameAlias swishdescription body

StoreDescription TXT* 250000
Delay 1

# Otherwise, use the HTML parser
DefaultContents HTML*
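
One thing that may be worth ruling out is the PHPSESSID query string appended to the PDF URL. The sketch below is plain Python, not Swish-e code: it only shows that a suffix test such as ".pdf" gives a different answer on the full URL than on the path alone. Whether Swish-e's FileFilter and IndexOnly compare suffixes against the full URL is an assumption here, and matching_suffix is a hypothetical helper written for this illustration.

```python
from urllib.parse import urlsplit

# Suffixes taken from the IndexOnly line in swish.conf.
SUFFIXES = (".php", ".txt", ".html", ".htm", ".pdf", ".xml", ".shtml")


def matching_suffix(text):
    """Return the first configured suffix that `text` ends with, or None.

    Hypothetical helper: it mimics a naive end-of-string suffix test and is
    NOT how Swish-e is known to match suffixes.
    """
    lowered = text.lower()
    for suffix in SUFFIXES:
        if lowered.endswith(suffix):
            return suffix
    return None


# URL copied from the "retrieving" line of the indexing output.
url = ("http://dev.website.org/files/Joomla%20Quick%20Start.pdf"
       "?PHPSESSID=413c04013e7c3505db9a68bedf8a8951")

raw_match = matching_suffix(url)                  # full URL, query string included
path_match = matching_suffix(urlsplit(url).path)  # path component only

print("full URL :", raw_match)
print("path only:", path_match)
```

If the two results differ, the session ID is a plausible suspect, and spidering without session IDs in the links would be a reasonable next test.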

Any ideas? 
Chad Day
Developer
Mindshare Interactive Campaigns, LLC
202.654.0832 - www.mindshare.net 
Received on Fri Dec 2 07:02:42 2005