Re: Swish-E and HTML documents with frames

From: Chris Humphries <ChrisJMH(at)not-real.vermilion99.freeserve.co.uk>
Date: Mon Feb 28 2000 - 12:18:37 GMT
Ron,

Another problem with frames is that pages are often sourced from a
different domain. I came across a page where virtually all the content came
from a domain other than the one which held the main page. All the pages in
the frameset were indexed, but the site also had important content reachable
only through href links, and that content could not be indexed, even when
spidering, because Swish-E rejected the links as "Wrong method or server".
Perhaps Swish-E needs an option to spider documents from other domains,
albeit not to the same depth as documents from the same domain (NetAttache
from Tympani has this feature).
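
To make the suggestion concrete, here is a rough sketch in Python of the kind of rule I have in mind. It is not Swish-E code and does not reflect how Swish-E or NetAttache actually work; the names should_follow and is_same_server and the two depth limits are made up purely for illustration. The idea is simply that a link on another server is still followed, but only up to a shallower depth than a link on the starting server.

```python
from urllib.parse import urlparse

# Hypothetical limits for illustration only; these are not Swish-E settings.
SAME_HOST_MAX_DEPTH = 5   # links followed from the start page on its own server
OTHER_HOST_MAX_DEPTH = 2  # shallower limit for pages served from elsewhere


def is_same_server(url, start_url):
    """True when both URLs share scheme, host name and port."""
    a, b = urlparse(url), urlparse(start_url)
    return (a.scheme, a.hostname, a.port) == (b.scheme, b.hostname, b.port)


def should_follow(url, start_url, depth):
    """Decide whether a link found `depth` hops from the start page is spidered."""
    # Only HTTP(S) links are candidates at all.
    if urlparse(url).scheme not in ("http", "https"):
        return False
    if is_same_server(url, start_url):
        limit = SAME_HOST_MAX_DEPTH
    else:
        limit = OTHER_HOST_MAX_DEPTH
    return depth <= limit
```

With limits like these, a frame pulled in from another domain (depth 1) and the pages it links to directly (depth 2) would be indexed, while anything further out on that domain would be skipped.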

Chris Humphries


-----Original Message-----
From:	Ron Samuel Klatchko [SMTP:rsk@brightmail.com]
Sent:	Sunday, February 27, 2000 1:40 AM
To:	Chris Humphries
Subject:	Re: [SWISH-E] Re: Swish-E and HTML documents with frames



On Sat, 26 Feb 2000, Chris Humphries wrote:
> This is very true, and if one were spidering indiscriminately, it would be
> a problem because there is probably no way of knowing that the page you had
> found *was* indirectly referenced. However, most of my indexing so far has
> been just the first page of a Web site, which means that my approach to
> reading through the frames is probably safe. Each Web site will already
> have been looked at by a human being and its basic structure understood.
>
> If you can think of a case you would like to see handled that isn't handled
> by the approach I am using, I would really appreciate it if you could
> supply a url for me to try out.


That was the only thing.  I get the feeling we're in agreement that
there's no practical solution for the problem I brought up.  But I have a
feeling that you're going to have people banging on you about this issue.
If I may put in my two cents, I'd definitely consider putting in a major
disclaimer about this.

moo
Received on Mon Feb 28 07:22:20 2000