problems indexing PDF files when using the HTTP method

From: Chris Blackstone <cblackst(at)not-real.teacher.mail.arlington.k12.va.us>
Date: Tue Apr 17 2001 - 16:24:41 GMT
I downloaded and installed Rainer's swish-e 1.3.2, enhanced with the
filter option, from
http://www.bnmsp.de/home/rainer.scherg/

I compiled it and everything works fine (HTML and PDF files are indexed),
provided I index using the FS method.

However, when I try to index a site using the HTTP method, the PDF files
don't get indexed and, often, swish-e-filter dumps core.

To switch to the HTTP method, I make the following changes to the
.config file:

replace
	IndexDir /usr/local/www/htdocs/departments/personnel/jobs/
with
	IndexDir http://jobs.arlington.k12.va.us/index.html

comment out directives under "DIRECTIVES FOR FILESYSTEM ONLY"

set under "DIRECTIVES for HTTP METHOD ONLY"

	MaxDepth 5
	Delay 60
	SpiderDirectory /usr/local/www/cgi-bin/swish-bin/

(this SpiderDirectory setting works when I use standard swish-e to index
with both the HTTP and FS methods)
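
For reference, here is a rough sketch of how the affected part of the
.config file ends up after these edits. It uses only the directives and
values quoted above; the section-heading comments are paraphrased, and
everything else in the file (including any filter settings) is left out.

```
# Start URL replaces the filesystem path used for the FS method
IndexDir http://jobs.arlington.k12.va.us/index.html

# DIRECTIVES FOR FILESYSTEM ONLY
# (every directive in this section is commented out)

# DIRECTIVES for HTTP METHOD ONLY
MaxDepth 5
Delay 60
SpiderDirectory /usr/local/www/cgi-bin/swish-bin/
```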


Are there any other changes I should make? Has anyone encountered this
before? I'm completely stumped, and would really like to get this
working, as my school's site is significantly increasing its use of PDF
documents.

Thanks in advance for any assistance,
chris

-- 
chris blackstone  |  web services coordinator

Arlington Public Schools
1426 N. Quincy St.
Arlington, VA 22207
Phone:  703.228.6185
Fax:    703.875.9491
Pager:  703.612.3042
http://www.arlington.k12.va.us
Received on Tue Apr 17 16:27:01 2001