I assume the 'generated URLs' are coming from swish.cgi. What does the
command line return for the same search?
% swish-e -w shell
You probably need to configure swish.cgi with the correct base URL, and/or
experiment with the ReplaceRules config option.
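As a rough, untested sketch of the ReplaceRules route: I am assuming the
index stores each path as ./Edinburgh/... (because that is what was passed
to -i), which a browser would then resolve relative to /cgi-bin/. If that
is what the command line shows, a line like this in swish.conf should
rewrite the stored path into a full URL at index time. Check the
ReplaceRules entry in the SWISH-CONFIG docs for the exact regex syntax.

    # Assumption: stored paths begin with "./Edinburgh". Adjust the
    # pattern to whatever swish-e -w actually prints. The unescaped "."
    # matches any character, which is harmless with the ^ anchor.
    ReplaceRules regex "#^./Edinburgh#http://msmb.larc.nasa.gov/Edinburgh#"

Then re-index and check the paths from the command line before going back
to swish.cgi. The -f switch names the index file, since yours is not the
default index.swish-e:

    % ../cgi-bin/swish-e -i ./Edinburgh -c ./Edinburgh/swish.conf
    % ../cgi-bin/swish-e -w shell -f /usr/opt/MSMBweb/htdocs/Edinburgh/index.swish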
John Young scribbled on 3/15/05 1:02 PM:
> Hello,
>
> I am trying to set up Swish-e 2.4.3 on a Solaris box to
> use with a local copy of the UNIXhelp web pages from the University
> of Edinburgh (http://unixhelp.ed.ac.uk/index.html) on a non-public
> web server. The server is running Apache httpd 2.0.53.
>
> My swish.conf file looks like this:
> #
> # SWISH configuration file
> #
>
> IndexDir /usr/opt/MSMBweb/htdocs/Edinburgh
> IndexFile /usr/opt/MSMBweb/htdocs/Edinburgh/index.swish
> IndexName "Index of UNIXhelp 1.3"
> IndexDescription "This is a full index of UNIXhelp release 1.3."
> IndexPointer "http://msmb.larc.nasa.gov/cgi-bin/unixhelp_search"
> IndexAdmin "webmaster"
>
> IndexOnly .html
> IndexReport 3
> NoContents .gif .xbm .au .mov .mpg
> IgnoreLimit 70 200
> #
>
> I tried indexing the pages by doing:
>
> % ../cgi-bin/swish-e -i ./Edinburgh -c ./Edinburgh/swish.conf
>
> Which resulted in a great deal of output ending with:
> Removing very common words...
> Getting IgnoreLimit stopwords: Complete
> 13 words removed by IgnoreLimit:
> a, gov, by, of, to, larc, the, site, maintained, this, msmb, nasa, help,
> Writing main index...
> Sorting words ...
> Sorting 3,952 words alphabetically
> Writing header ...
> Writing index entries ...
> Writing word text: Complete
> Writing word hash: Complete
> Writing word data: Complete
> 3,952 unique words indexed.
> 4 properties sorted.
> 884 files indexed. 1,307,022 total bytes. 87,710 total words.
> Elapsed time: 00:00:07 CPU time: 00:00:07
> Indexing done!
>
> It seems to have worked, and searching for, say, "shell"
> produces a reasonable list of pages. But the generated URLs
> are incorrect, e.g. the first result is for
>
> "1. Using the Bourne shell to interpret a shell script"
>
> and the associated URL is
>
> http://msmb.larc.nasa.gov/cgi-bin/Edinburgh/scrpt/scrpt1.2.4.html
>
> but it *should* be
>
> http://msmb.larc.nasa.gov/Edinburgh/scrpt/scrpt1.2.4.html.
>
> I cannot figure out why "cgi-bin" is being incorrectly inserted into
> the results. I have searched the documentation on http://swish-e.org
> but I still have not found the answer. Any suggestions? Is anyone
> else on this list using the Edinburgh UNIXhelp pages with Swish-e?
>
> JY
> ------------------------------------------------------------
> John E. Young B1148/R202
> Analytical Services and Materials, Inc. (757) 864-8659
>
>
>
--
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Tue Mar 15 11:23:03 2005