Hi all,
I have a problem with indexing.
I want to run everything from cron,
but the process starts and dies immediately.
It seems that the process wants a terminal for its output.
I have no problem if I run the command from the command line.
If I run this command from the command line: "swish-e -S prog -c swish.conf
-v 2 >log.sw &", not everything ends up in the file "log.sw". Some of the
output appears on the terminal. On the terminal I get output like this:
bash-2.05a$ swish-e -S prog -c swish.conf -v 2 > log.l &
[1] 16703
bash-2.05a$ ./spider.pl: Reading parameters from 'spider.conf'
Summary for: http://meneghetti.univr.it/CATDOP99.doc
DOC transformed: 1 (0.1/sec)
Total Bytes: 7,925 (660.4/sec)
Total Docs: 1 (0.1/sec)
Unique URLs: 1 (0.1/sec)
Summary for: http://www.cilea.it/Virtual_Library/bibliot/doppi/prova_pdf.pdf
PDF transformed: 1 (0.1/sec)
Total Bytes: 16,072 (1236.3/sec)
Total Docs: 1 (0.1/sec)
Unique URLs: 1 (0.1/sec)
[1]+ Done swish-e -S prog -c swish.conf -v 2 >log.sw
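One possible reading of the output above is that the spider's summary lines go to standard error, which ">log.sw" does not capture. The following is only a minimal sketch of that behaviour: fake_indexer is a hypothetical stand-in for swish-e plus spider.pl, and the file names are made up for the demonstration.

```shell
#!/bin/sh
# Hypothetical stand-in for swish-e running spider.pl: one line per stream.
fake_indexer() {
    echo "indexer output on stdout"
    echo "spider summary on stderr" >&2
}

workdir=$(mktemp -d)

# Redirecting only stdout: the stderr line does not land in the log file.
# (Here stderr is sent to a second file so the demo stays quiet.)
fake_indexer > "$workdir/stdout_only.log" 2> "$workdir/leaked.txt"

# Redirecting both streams: 2>&1 has to come after the file redirection.
fake_indexer > "$workdir/both.log" 2>&1

stdout_only_lines=$(wc -l < "$workdir/stdout_only.log" | tr -d ' ')
both_lines=$(wc -l < "$workdir/both.log" | tr -d ' ')
echo "stdout only: $stdout_only_lines line(s); both streams: $both_lines line(s)"

rm -rf "$workdir"
```

If this is what happens here, the same idea applied to the real command would be "swish-e -S prog -c swish.conf -v 2 >log.sw 2>&1 &".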
INFO:
My crontab entry is:
20 17 * * * swish-e -S prog -c swish.conf -v 2 >log.sw &
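A possible factor with an entry like this one: cron usually starts jobs in the user's home directory, so relative names such as swish.conf, ./spider.pl and log.sw are looked up there and not in the directory used on the command line. The sketch below only illustrates how a relative path depends on the current directory; both directories are temporary stand-ins created for the demonstration.

```shell
#!/bin/sh
# Sketch: ./spider.pl only resolves when the current directory holds the file.
index_dir=$(mktemp -d)   # stand-in for the directory with swish.conf and spider.pl
other_dir=$(mktemp -d)   # stand-in for the directory cron starts the job in
touch "$index_dir/spider.pl"

# From an unrelated directory the relative path is not found.
found_from_other=$(cd "$other_dir" && [ -e ./spider.pl ] && echo yes || echo no)

# After changing into the index directory it is found.
found_from_index=$(cd "$index_dir" && [ -e ./spider.pl ] && echo yes || echo no)

echo "from other dir: $found_from_other, from index dir: $found_from_index"

rm -rf "$index_dir" "$other_dir"
```

Under that assumption, the crontab command could change directory first, for example "cd /path/to/index && swish-e -S prog -c swish.conf -v 2 >log.sw", where /path/to/index is a placeholder for the real directory. The trailing "&" should not be needed under cron.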
In swish.conf I have:
# Program to read documents
IndexDir ./spider.pl
# Define the config file for the spider to use
SwishProgParameters spider.conf
# Use libxml2 for parsing documents
DefaultContents HTML*
IndexContents TXT* .txt .text
# Cache document contents in the index for context display
StoreDescription HTML <body>
StoreDescription HTML2 <body>
I haven't modified anything in spider.pl.
My spider.conf is as follows:
my %Server1 = (
base_url =>
'http://wwwbiblio.polito.it/it/documentazione/bcadoppi.html',
email => 'tajoli@cilea.it',
delay_min => .2,
max_size => 1_000_000,
max_depth => 0,
keep_alive => 1,
);
my %Server2 = (
base_url =>
'http://www.cilea.it/Virtual_Library/bibliot/doppi/doppivallisneri.html',
email => 'tajoli@cilea.it',
delay_min => .2,
max_size => 1_000_000,
max_depth => 0,
keep_alive => 1,
);
[...]
@server=( \%Server1, \%Server2, ....);
In fact, I want to index many single web pages on different sites.
I am running swish-e 2.2.3 on Linux 2.4.18-19.7.xsmp (Red Hat).
Any ideas?
Thanks to all.
Zeno Tajoli
tajoli@cilea.it
CILEA - Segrate (MI)
02 / 26995321
Received on Thu Apr 17 16:33:34 2003