On Mon, 28 Dec 1998, I wrote:
> If local filesystem space is an issue, i.e., you don't want to
> copy an entire other web site to your local filesystem as you
> index it, I'm sure it would be possible to write a slightly
> more complicated Perl script that would delete the files after
> they are indexed as the get/index cycle progresses. You'd
> probably end up doing something using the IPC::Open2 Perl module
> (see the Perl 5 "Camel" book, p. 344): open a bidirectional
> pipe to index with the -v3 option so the script could tell when
> a file has been indexed so the file could be deleted safely.
I've done just that by creating an httpindex command. You can
tell it to do nothing with the copied files, delete them as
indexing progresses (as described above) or replace them with
their descriptions extracted via the extract_description()
function in my WWW.pm Perl module.
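
To illustrate the delete-as-you-index idea, here is a minimal sketch. It is
written in Python purely for illustration and is not the httpindex code. It
assumes the indexer's verbose output contains one line per indexed file; the
line format matched below and the names parse_indexed_path and
delete_as_indexed are hypothetical, so adjust the pattern to whatever
index -v3 actually prints.

```python
import os
import re
from typing import Callable, Iterable, List, Optional

# HYPOTHETICAL format: one line per indexed file, e.g.
#   "  site/page.html (123 words)"
# The real output of `index -v3` may differ; adjust the pattern to match it.
INDEXED_LINE = re.compile(r"^\s*(?P<path>\S.*?)\s+\(\d+ words\)\s*$")


def parse_indexed_path(line: str) -> Optional[str]:
    """Return the file path reported on a verbose line, or None if no match."""
    match = INDEXED_LINE.match(line)
    return match.group("path") if match else None


def delete_as_indexed(
    lines: Iterable[str],
    parse: Callable[[str], Optional[str]] = parse_indexed_path,
) -> List[str]:
    """Delete each file as soon as the indexer reports it; return deleted paths."""
    deleted = []
    for line in lines:
        path = parse(line)
        if path is None:
            continue  # some other output line, not a per-file report
        try:
            os.remove(path)
        except FileNotFoundError:
            continue  # already gone; nothing to clean up
        deleted.append(path)
    return deleted
```

In practice, lines would be the standard output of the indexer process read
through a pipe, so each copied file is removed only after the indexer has
reported it as done.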
Hence, the functionality to index files on remote servers was
added "externally" to SWISH++; none of the C++ code had to be
modified.
I've put up SWISH++ 1.5b1:
ftp://shell3.ba.best.com/pub/pjl/software/swish++-1.5b1.tar.gz
There are also text and PDF versions of the man page for httpindex(1).
Feedback appreciated.
- Paul
Received on Tue Dec 29 16:50:00 1998