Re: Having Trouble getting set up.

From: Nils Lastein <nila(at)not-real.dsr.kvl.dk>
Date: Sat Dec 11 1999 - 13:01:51 GMT
Your problem is that the website uses frames: the spider gets only as far as the frameset page and does not go on into the framed pages. Solution: since it's your own server, you have access to the raw HTML files (assuming it's not a dynamic website), so the easiest way is to use the FS (filesystem) method.

Or... point the spider at the 'content page' in your frameset and hope that it has enough links to reach the rest of the site.
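
To find which pages could serve as that starting point, something like the following might help. It is only a sketch in Python, not part of Swish-E: it lists the frame sources of a frameset page separately from its ordinary links. The sample markup and the name menu.html are made up for illustration; only sriindex.html and the base URL come from the original message.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class FrameAndLinkParser(HTMLParser):
    """Collect <frame>/<iframe> sources and <a href> targets separately."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.frame_urls = []  # pages loaded into frames
        self.link_urls = []   # ordinary hyperlinks

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag in ("frame", "iframe") and attrs.get("src"):
            self.frame_urls.append(urljoin(self.base_url, attrs["src"]))
        elif tag == "a" and attrs.get("href"):
            self.link_urls.append(urljoin(self.base_url, attrs["href"]))


def find_start_points(html, base_url):
    """Return (frame_urls, link_urls) found in one HTML document."""
    parser = FrameAndLinkParser(base_url)
    parser.feed(html)
    parser.close()
    return parser.frame_urls, parser.link_urls


# Hypothetical frameset page: it has frames but no <a href> links at all,
# so a spider that follows only links finds nothing beyond this page.
SAMPLE = """
<html><head><title>Search</title></head>
<frameset cols="20%,80%">
  <frame src="menu.html" name="menu">
  <frame src="sriindex.html" name="content">
  <noframes>Your browser does not support frames.</noframes>
</frameset></html>
"""

if __name__ == "__main__":
    base = "http://mango/~ghelms/Search/index.html"
    frames, links = find_start_points(SAMPLE, base)
    print("frame pages:", frames)
    print("ordinary links:", links)
```

Any of the frame pages it reports is a candidate for the spider's starting URL, provided that page links onward with normal anchors.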

nila


>
>I strongly suspect that I'm being denser than usual today, so I thought
>I'd float this past the more knowledgeable folks.
>
>What I'm trying to do:  run Swish-E on a UNIX System V system.
>Goal: Be able to index all the links in my divisional web pages for quick
>and easy search.
>Problem: The index doesn't appear to be spidering.
>Web site hierarchy:  Everything lives under http://mango/~ghelms/Search,
>starting with index.html.  There are a lot of frames in here; primary
>pages are things like sriindex.html, qaindex.html, etc.  QA's pages
>point off to yet another machine; this is the kind of thing I desperately
>need spidered.
>user.config file settings:  IndexDir is set to
>http://mango/~ghelms/Search/index.html
>IndexPointer is set to the same thing.
>Other settings:  I'm using HTTP; filesystem stuff has been commented
>out.  config.h is using HTTP.
>End results:  I get a (very small) list of about 10 unique words.  When
>I go look at the index, I see things like "mozilla", which isn't anywhere
>in my division's web pages.  Plus, I'd expect to see a LOT more words.
>And the only URL that shows up is the IndexDir URL and nothing else.
>
>Any ideas what brain-dead thing I forgot to do?
>
>----
>Gretchen Helms
>Project Manager: Csearch
>Excite@Home x2199 (pager: beepghelms@excitecorp.com)
>
>
Received on Sat Dec 11 04:58:37 1999