Hi Cas,
I don't want to follow the subversion.tigris.org link,
just the links to the subversion dirs, which are in the
form:
<dir name="dtupdates" href="dtupdates/" />
As shown in the spider.pl output at the end of the
email ;-)
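To make it clearer which links I mean, here is a rough Perl sketch that pulls them out of the index. It is only an illustration: svn_dir_links is a made-up helper, not something spider.pl provides, and it assumes every href is relative to the base URL, as they are in my listing.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper, not part of spider.pl: collect the href of every
# <dir .../> element in a Subversion XML index and make it absolute.
# Assumes each href is relative to $base_url, as in the listing in this thread.
sub svn_dir_links {
    my ( $base_url, $xml ) = @_;

    # Make sure the base ends in a slash before appending relative hrefs.
    $base_url .= '/' unless $base_url =~ m{/$};

    my @links;
    # "<dir" followed by whitespace matches only real dir elements, so the
    # href on the <svn> root (subversion.tigris.org) and the DTD lines
    # (<!ELEMENT dir ...>, <!ATTLIST dir ...>) are ignored.
    while ( $xml =~ /<dir\s[^>]*?\bhref\s*=\s*"([^"]*)"/g ) {
        push @links, $base_url . $1;
    }
    return @links;
}

# Trimmed copy of the index from the spider.pl output.
my $index = <<'XML';
<svn version="1.3.0 (r17949)"
     href="http://subversion.tigris.org/">
  <index rev="170" path="/">
    <dir name="docs" href="docs/" />
    <dir name="dtupdates" href="dtupdates/" />
  </index>
</svn>
XML

print "$_\n" for svn_dir_links( 'http://localhost/svn/', $index );
```

Run against that trimmed index, it prints the docs/ and dtupdates/ URLs under http://localhost/svn/ and leaves the tigris.org link alone, which is the set of links I would like the spider to follow.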
Cheers,
Brian
--- Cas Tuyn <cas.tuyn@gmail.com> wrote:
> Brian,
>
> The spider stays within the start-domain localhost/svn, otherwise it
> could go on and index the whole Internet. There is a setting
> (follow-hosts or something) that allows you to say that links to
> subversion.tigris.org may be followed. Also look at same_hosts if
> these two hosts are actually equal but have a different domain (like
> www.tigris.org and tigris.org).
>
> Regards,
>
> Cas
>
>
> On 1/16/07, Brian Ling <brian_ling_gandj@yahoo.com>
> wrote:
> > Hi all,
> >
> > I've just started using swish-e so sorry if this is a
> > bit newbie.
> >
> > I want to index a subversion repository via its
> > web/apache front end, but I can't seem to get
> > spider.pl to follow the links in the default
> > subversion output.
> >
> > I'm calling the spider directly with:
> > /usr/local/lib/swish-e/spider.pl ./spider.conf
> > It finds and outputs the main subversion page (output at
> > end of mail) but doesn't follow any of the links.
> > Everything appeared to install OK. I'm on OS X 10.4.8.
> > What am I missing?
> >
> > spider.conf:
> > @servers = (
> >     {
> >         email              => 'test@test.co.uk',
> >         base_url           => 'http://localhost/svn/',
> >         same_hosts         => [ '127.0.0.1' ],
> >         use_default_config => 1,
> >         link_tags          => [qw/ a frame dir /],
> >     },
> > );
> > 1;
> >
> > output from spider.pl:
> >
> > /usr/local/lib/swish-e/spider.pl: Reading parameters from './spider.conf'
> > Path-Name: http://localhost/svn/
> > Content-Length: 1232
> > Document-Type: xml*
> >
> > <?xml version="1.0"?>
> > <?xml-stylesheet type="text/xsl"
> > href="/xslt/svnindex.xsl"?>
> > <!DOCTYPE svn [
> > <!ELEMENT svn (index)>
> > <!ATTLIST svn version CDATA #REQUIRED
> > href CDATA #REQUIRED>
> > <!ELEMENT index (updir?, (file | dir)*)>
> > <!ATTLIST index name CDATA #IMPLIED
> > path CDATA #IMPLIED
> > rev CDATA #IMPLIED>
> > <!ELEMENT updir EMPTY>
> > <!ELEMENT file EMPTY>
> > <!ATTLIST file name CDATA #REQUIRED
> > href CDATA #REQUIRED>
> > <!ELEMENT dir EMPTY>
> > <!ATTLIST dir name CDATA #REQUIRED
> > href CDATA #REQUIRED>
> > ]>
> > <svn version="1.3.0 (r17949)"
> > href="http://subversion.tigris.org/">
> > <index rev="170" path="/">
> > <dir name="SubversionNotes"
> > href="SubversionNotes/" />
> > <dir name="altirsCustomInventory"
> > href="altirsCustomInventory/" />
> > <dir name="appsMan" href="appsMan/" />
> > <dir name="artwork" href="artwork/" />
> > <dir name="bootDVD-CD" href="bootDVD-CD/" />
> > <dir name="docs" href="docs/" />
> > <dir name="dtupdates" href="dtupdates/" />
> > <dir name="localMachine" href="localMachine/" />
> > <dir name="netlogon" href="netlogon/" />
> > <dir name="tools" href="tools/" />
> > </index>
> > </svn>
> >
> > Summary for: http://localhost/svn/
> > Connection: Close: 1 (1.0/sec)
> > Total Bytes: 1,232 (1232.0/sec)
> > Total Docs: 1 (1.0/sec)
> > Unique URLs: 1 (1.0/sec)
> >
> > Thanks for any pointer,
> >
> > Brian
Received on Tue Jan 16 07:41:08 2007