Paras Fadte wrote on 03/19/2010 12:52 AM:
> Hi,
>
> Is it possible to index hyperlinks present on a webpage which would be
> referring to some other hosts ? Following is the example
>
> Example:
>
> http://mysite.com/index.html has say 3 hyperlinks viz.
> http://a.com/a.html , http://b.com/b.html , http://c.com/c.html . So
> when I index "http://mysite.com/index.html" using spider.pl
> and use swish.cgi to do a search by using "b.html" in
> search field with metaname selected as "swishdocpath" it should show a
> clickable "http://b.com/b.html" link.
>
> Is this possible in swish-e ?

Possible, but not exactly as you're describing.

In your example you are not interested in the contents of 'b.html',
only in the fact that it is registered as a document. You could instead
just tell swish-e to index the href attribute values of <a> tags (the
link names) with HTMLLinksMetaName:

http://swish-e.org/docs/swish-config.html#htmllinksmetaname

Otherwise, if you really want to index the contents of the files that
mysite links to, you could abuse the same_hosts feature:

http://swish-e.org/docs/spider.html#same_hosts

But same_hosts won't really do what you want, since it will index the
linked page under mysite.com rather than b.com.

-- 
Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com

Received on Fri Mar 19 10:07:15 2010
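
For reference, the HTMLLinksMetaName suggestion in the reply above might look roughly like the config fragment below. This is only a minimal sketch, not a tested setup: the metaname "links" is an arbitrary name chosen for illustration, and as far as I recall the directive is only honored by the libxml2-based HTML parser, so check the swish-config docs linked above against your build.

```
# Sketch only. "links" is a hypothetical metaname picked for this example.

# Declare the metaname so that searches can be limited to it.
MetaNames links

# Index the href values found in <a> tags under that metaname.
HTMLLinksMetaName links
```

With something like this in place, a search limited to that metaname (a query of the form links=(b.com)) should match pages that link to b.com. The hit returned is still the linking page, http://mysite.com/index.html, not http://b.com/b.html itself, which is why this is not exactly what the original question asked for. Whether a term such as "b.html" matches as a single word also depends on your WordCharacters settings.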