RE: Swish-e depth

From: Thomas McDonald <tom.mcdonald(at)not-real.westernsouthernlife.com>
Date: Wed Feb 19 2003 - 17:56:33 GMT
Well, I'll just answer my own question.  The problem stems from the fact
that our site uses a custom content management routine for 404 errors.
FYI, this helps search engines spider our site because we give them
'spider food', i.e., URLs that end in .htm that don't really exist, but
are checked for, and the correct content is routed to the requester.

Anyway, for the content management to work, cookies have to be enabled.
I made the following setting in the config file and everything worked:

	use_cookies	=>	1,

Voilà!
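
For anyone hitting the same problem, here is a minimal sketch of where that
setting might sit. It assumes the spidering is done with the spider.pl
program that ships with Swish-e, since use_cookies is a spider.pl option.
The file name and the email value are placeholders, and same_hosts is only
my guess at the spider.pl counterpart of the EquivalentServer line in the
config quoted below. This is not my actual file.

```perl
# SwishSpiderConfig.pl: hypothetical spider.pl configuration (a sketch only)
@servers = (
    {
        # Starting point for the crawl, same URL as IndexDir in ex.config
        base_url    => 'http://www.wslife.com/sitemap.asp',

        # Contact address the spider reports to the server (placeholder)
        email       => 'webmaster@example.com',

        # Assumed equivalent of EquivalentServer: treat this host name
        # as the same site as the one in base_url
        same_hosts  => [ 'www.westernsouthernlife.com' ],

        # Keep cookies between requests so the 404-based content
        # management routine can route the right page to the spider
        use_cookies => 1,
    },
);

1;
```

With a setup like this, the Swish-e config would point IndexDir at spider.pl
and the index would be built with -S prog rather than -S http. Treat those
details as assumptions to check against the spider.pl documentation for your
version.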

-----Original Message-----
From: swish-e@sunsite.berkeley.edu
[mailto:swish-e@sunsite.berkeley.edu]On Behalf Of Thomas McDonald
Sent: Wednesday, February 19, 2003 9:00 AM
To: Multiple recipients of list
Subject: [SWISH-E] Swish-e depth


Swish doesn't seem to be indexing my whole site.  I read the config file
instructions and they say:

MaxDepth defines how many links the spider should follow before stopping. A
value of 0 configures the spider to traverse all links. The default is
MaxDepth 5.

So I set the depth to 0 (you can see my config below), but when I searched
the index for keywords that should get hits on a particular deep page I find
that the deep page is not in the results.  For example, I searched for the
'accumulation' keyword and I didn't get the 'accumulation tool' page.

Please feel free to index our site if you want to try it.  It is a public
site.  The page link is right there on the site map, but it doesn't get
indexed:
http://www.wslife.com/operator.asp?location=home&location=planning+and+retirement&location=financial+tools&location=accumulation&location=step1

Could it have something to do with the long length of this url?  Or maybe
because it is a dynamic url?

I have run the index several times with different depths.  I started at 5
and worked my way down.  Could it be that Swish-e isn't updating my index
file?  The date on the file changes every time I rerun the index command.
Here is the index command:

swish-e.exe -c ex.config -S http

Here is my config file (ex.config):

IndexFile wsl_test.index
IndexDir http://www.wslife.com/sitemap.asp
DefaultContents HTML
StoreDescription HTML <body> 20000
EquivalentServer http://www.wslife.com http://www.westernsouthernlife.com

Regards,



Thomas McDonald
Title: Principal Consultant
Sogeti USA
4445 Lake Forest Drive
Suite 550
Cincinnati, OH 45242




Received on Wed Feb 19 17:57:27 2003