Re: Behavior of max_depth in

From: Cas Tuyn <cas.tuyn(at)>
Date: Fri Jan 12 2007 - 14:39:26 GMT

max_depth only applies to spidering. If you spider, it does not matter
whether the linked-to file is in a parent or child directory, as long
as it is on the same server domain (or a same-host or follow-links

Yes: if your start document or any other spidered document contains
links to parent directories, the spider will follow and index those.
If you only have relative links without any "../" in them, you should
stay at or below your start level.
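
To make the point concrete, here is a rough Python sketch (not the
spider's actual code) of a depth-limited crawl over a made-up link
table. It assumes depth counts link hops from the start document, which
is how I understand max_depth. The host intranet.example, the page
names, and the stay_below switch are all hypothetical, invented for
illustration.

```python
from collections import deque
from urllib.parse import urljoin, urlparse

def is_below_start(start_url, url):
    """True if url is on the start host and under the start document's directory."""
    start, target = urlparse(start_url), urlparse(url)
    base_dir = start.path.rsplit("/", 1)[0] + "/"
    return target.netloc == start.netloc and target.path.startswith(base_dir)

def crawl(start_url, links_on_page, max_depth, stay_below=False):
    """Breadth-first walk; depth is the number of link hops, not directory levels."""
    seen = {start_url}
    queue = deque([(start_url, 0)])
    while queue:
        page, depth = queue.popleft()
        if depth == max_depth:
            continue  # do not follow links from pages at the depth limit
        for href in links_on_page.get(page, []):
            url = urljoin(page, href)  # resolves "../" against the linking page
            if url in seen:
                continue
            if stay_below and not is_below_start(start_url, url):
                continue  # hypothetical filter: skip anything above the start dir
            seen.add(url)
            queue.append((url, depth + 1))
    return seen

# Hypothetical site: one page links upward with "../".
START = "http://intranet.example/docs/sub/index.html"
SITE = {
    "http://intranet.example/docs/sub/index.html": ["a.html", "../up.html"],
    "http://intranet.example/docs/sub/a.html": ["deeper/b.html"],
    "http://intranet.example/docs/up.html": ["../root.html"],
}

everything = crawl(START, SITE, max_depth=2)
below_only = crawl(START, SITE, max_depth=2, stay_below=True)
```

Without the filter, the single "../up.html" link is enough to pull in
pages above the start directory while still staying within two hops.
With the filter, only pages under /docs/sub/ are kept.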

I spider with max_depth=9, which takes 7 hours on our intranet.


On 1/12/07, andy rosbrook <> wrote:
> Hello all,
> I am curious about how the max_depth setting works in and sub
> domains. For example, if I index the url and set the
> max_depth to 2 will the spider stay within the sub folder for links or will
> it look inside
> I've done a few tests and it seems to go back up into root folders at
> certain times, I assume when it needs more links? Can anyone explain how it
> traverses the pages and if it is possible to limit the spider to only take
> links from the sub domain?
> thanks
> andy

Received on Fri Jan 12 06:39:27 2007