[swish-e] just extracting link structure, not indexing content

From: Darrell Berry <darrell(at)not-real.ku24.com>
Date: Fri Mar 09 2007 - 12:46:52 GMT

Hi -- is there a standard way to get just the *link structure*
(rather than a content index) of a site using the swish-e tools
(spider.pl, I guess)?

All I want from the output of my crawl is something like:

www.domain.tld -> www.domain.tld/help
www.domain.tld -> www.domain.tld/info
www.domain.tld/info -> www.domain.tld/info2
www.domain.tld/info -> www.domain.tld

i.e. just spidering the whole domain and showing which pages link to
which, recursively -- no content, no indexing. I can find similar
questions in the archives, but not a definitive answer. All help
appreciated.
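
To make the desired output concrete, here is a minimal standalone sketch in Python of the kind of crawl described above. It does not use swish-e or spider.pl at all; the names LinkParser, fetch_html and link_edges are hypothetical, and it makes several simplifying assumptions: only <a href> links count, only pages on the starting host are followed, and robots.txt and politeness delays are ignored.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects the href of every <a> tag; page text is ignored."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def fetch_html(url):
    """Default fetcher (an assumption): plain GET, decoded leniently."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def link_edges(start_url, fetch=fetch_html, max_pages=1000):
    """Breadth-first crawl of one host; returns unique (source, target) pairs."""
    host = urlparse(start_url).netloc
    seen = {start_url}          # pages already queued
    queue = deque([start_url])
    edges = []
    edge_set = set()            # used only to drop duplicate edges
    fetched = 0
    while queue and fetched < max_pages:
        page = queue.popleft()
        fetched += 1
        try:
            html = fetch(page)
        except Exception:
            continue            # unreachable page: skip it, keep crawling
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop any #fragment.
            target = urldefrag(urljoin(page, href)).url
            if urlparse(target).netloc != host:
                continue        # stay on the starting host
            if (page, target) not in edge_set:
                edge_set.add((page, target))
                edges.append((page, target))
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return edges
```

Printing each pair as f"{source} -> {target}" gives lines in the format shown above. The fetch argument exists so the crawl logic can be tried against a dictionary of canned pages before pointing it at a live site.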

thanks

d

- --
Ku24 Limited: Technology for Creative Business

e: darrell@ku24.com       |   m: 07947 817 564
w: http://www.ku24.com    |   t: 020 7193 0249




Received on Fri Mar 9 07:43:05 2007