RE: Indexing PDF Files and HTML pages

From: <Rainer.Scherg(at)not-real.rexroth.de>
Date: Thu Sep 14 2000 - 15:43:06 GMT
You should never see a core file.
Are you using the latest version of swish-e? If not, please update.

If you are still getting a core file, please check

 - whether the core is produced by swish-e or by the filter program.

 - the swish-e config (there are some slight changes from 1.3.2 to 2.x).
   (should not produce a core in any way...)
   Check the paths to the filter scripts!
   Remember:
     A background process may have a different environment (e.g. PATH)
     than a foreground process.

 - the filter script (use an empty filter script, which only
   writes some debug information to stderr to check the input).

 - the filter program itself. What filter are you using to convert
   pdf to txt (ghostscript, xpdf, other)? Check it by invoking
   it manually.
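
As a rough illustration of the "empty filter script" idea above, here is a
minimal Python sketch. It converts nothing and only reports to stderr how it
was called. The exact arguments swish-e passes to a filter are not stated in
this thread, so the script makes no assumption about them and simply logs
whatever it receives; the name describe_invocation and the "debug-filter:"
prefix are made up for this example.

```python
#!/usr/bin/env python3
"""Stand-in filter: logs how it was called and emits no document text."""
import os
import sys


def describe_invocation(argv, environ):
    """Return debug lines describing the arguments and the environment."""
    lines = ["debug-filter: argc=%d" % len(argv)]
    for i, arg in enumerate(argv):
        lines.append("debug-filter: argv[%d]=%r" % (i, arg))
        # For arguments that look like paths, report whether the file
        # exists and is readable by the current process.
        if i > 0 and os.path.sep in arg:
            lines.append(
                "debug-filter:   exists=%s readable=%s"
                % (os.path.exists(arg), os.access(arg, os.R_OK))
            )
    lines.append("debug-filter: cwd=%s" % os.getcwd())
    # PATH may differ between a background job and an interactive shell.
    lines.append("debug-filter: PATH=%s" % environ.get("PATH", "<unset>"))
    return lines


def main():
    # Write everything to stderr; stdout stays empty on purpose.
    for line in describe_invocation(sys.argv, os.environ):
        print(line, file=sys.stderr)
    return 0


if __name__ == "__main__":
    main()
```

If the core file still appears with a stand-in like this in place, the real
converter is probably not the cause. Comparing the logged PATH from a
background run with the one from an interactive shell should also show
whether the environment differs.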


cu - rainer


-----Original Message-----
From: Chris Blackstone
[mailto:cblackst@teacher.mail.arlington.k12.va.us]
Sent: Thursday, September 14, 2000 5:25 PM
To: Rainer.Scherg@rexroth.de
Subject: Re: [SWISH-E] RE: Indexing PDF Files and HTML pages


I tried the first option, using the config file on your site
(http://www.msp.baynet.de/home/rainer.scherg/) as a guide, but it dumps
core every time.
Are there any limitations when indexing PDF files using the HTTP method?

Rainer.Scherg@rexroth.de wrote:
> 
> If you have swish-e 2.0.1 (or at least 1.3.2f2) installed,
> you can install a pdf filter into swish. Swish will be able
> to store html and pdf into one index file.
> 
> Also possible: 2 separate index processes (html and pdf) and
> a later merge of these files.
> 
> cu - rainer
> 
> -----Original Message-----
> From: Chris Blackstone
> [mailto:cblackst@teacher.mail.arlington.k12.va.us]
> Sent: Thursday, September 14, 2000 4:40 PM
> To: Multiple recipients of list
> Subject: [SWISH-E] Indexing PDF Files and HTML pages
> 
> I have the ability to index PDF files on my site, as well as HTML files,
> but the results are in separate swish files. Is there a way to index PDF
> and HTML files together, or alternately, search multiple swish indexes
> and have the results displayed on 1 page? I tried to modify the config
> file that I use to index PDF files, but I get a core dump every time.


[....]


Received on Thu Sep 14 15:43:15 2000