Re: Adding files from external site - suggestions?

From: Rob de Santos AFANA <rdesantos(at)>
Date: Sun Mar 07 2004 - 13:48:07 GMT

Bill Moseley wrote:
> just fetches web pages, indexes the content and extracts out
> the links into a queue of other URLs to index.  Extracted links
> to other sites are just ignored, unless they are setup as "same_hosts"
> -- although that's more for mapping and to the
> host name.

OK, understood.  Any reason why I couldn't map
to my host?  Particularly if I set up redirection in .htaccess on my
site so that sent users to the other site's pages?

> If what you want to do is insert the content of another page 
> into the page being indexed then I'd probably use 
> filter_content to scan for the links to the other site, fetch 
> that page or pages and extract the content and add it into 
> the current page being indexed.

No, not really what I had in mind, though it *might* work.  I'm waiting
to hear from the other site's web guru to see how his pages are
structured.  If they are "dynamic", i.e. regenerated when needed, that
might complicate this.

> The extracted links are not available to the filter so you 
> would have to extract them yourself.

Shouldn't be that hard, if needed.  Redirection seems simpler though.
I'm satisfied if I can simply include the appropriate subset of pages
from the other site in my index at this stage. 
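
For what it's worth, extracting the links by hand could look roughly like
the sketch below.  It is Python rather than the spider's own Perl, it is not
wired into filter_content, and every name in it (LinkExtractor,
links_to_host, the example.org hosts) is made up for illustration only.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones.
                self.links.append(urljoin(self.base_url, value))


def links_to_host(html, base_url, host):
    """Return the absolute links in html that point at the given host."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return [u for u in parser.links if urlparse(u).hostname == host]


# Hypothetical page content and host names, for illustration only.
page = (
    '<a href="/local.html">here</a> '
    '<a href="http://other.example.org/a.html">there</a>'
)
found = links_to_host(page, "http://my.example.org/index.html", "other.example.org")
print(found)
```

The filtered list would then be the set of pages to fetch from the other
site; fetching them and merging their content is left out here.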


