Re: spider a database

From: Bill Moseley <moseley(at)not-real.hank.org>
Date: Fri Nov 04 2005 - 20:28:11 GMT
On Fri, Nov 04, 2005 at 03:16:27PM -0500, Michael Porcaro wrote:
> Please bear with me here and thank you for your patience.  I looked at
> your link and searched around.  By searching, I assume that swish-e can
> spider databases, I wasn't really sure about this before.  I came across
> this document.  Is this the right thing to read, in order to figure out
> how to spider my dynamic pages?

Sorry, I was confused, as I thought you wanted to index docs in a
database without using HTTP.  Which is it?

If you want to index stuff in a database then search for the MySQL.pl
file in the swish-e distribution.

 http://cvs.sourceforge.net/viewcvs.py/swishe/swish-e/prog-bin/MySQL.pl?rev=1.2&view=auto
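
For illustration only, here is a rough sketch of the general idea behind
that kind of script, written in Python rather than Perl.  With the "prog"
input method (-S prog), swish-e reads documents from an external program's
standard output; each document is a few headers (Path-Name, Content-Length,
and optionally Last-Mtime and Document-Type), a blank line, and then the
content.  Everything else below is an assumption made up for the example:
the posts table, its columns, the example.com URL pattern, and the use of
an in-memory SQLite database as a stand-in for MySQL.  It is not a copy of
what MySQL.pl does.

```python
import html
import sqlite3
import sys


def format_record(path_name: str, body: str, mtime: int) -> bytes:
    """Build one document record: headers, a blank line, then the content."""
    content = body.encode("utf-8")
    headers = (
        f"Path-Name: {path_name}\n"
        f"Content-Length: {len(content)}\n"  # length in bytes, not characters
        f"Last-Mtime: {mtime}\n"
        "Document-Type: HTML*\n"
        "\n"
    )
    return headers.encode("utf-8") + content


def iter_records(conn):
    """Yield one record per row of the (hypothetical) posts table."""
    rows = conn.execute("SELECT id, title, body, mtime FROM posts ORDER BY id")
    for post_id, title, text, mtime in rows:
        page = (
            f"<html><head><title>{html.escape(title)}</title></head>"
            f"<body>{html.escape(text)}</body></html>"
        )
        # Path-Name is what search results will show; this URL pattern is invented.
        url = f"http://www.example.com/forum/index.php?showtopic={post_id}"
        yield format_record(url, page, mtime)


def demo_connection():
    """In-memory SQLite stand-in for a real MySQL connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE posts (id INTEGER, title TEXT, body TEXT, mtime INTEGER)")
    conn.execute("INSERT INTO posts VALUES (1, 'Hello', 'First post', 1131136091)")
    return conn


if __name__ == "__main__":
    for record in iter_records(demo_connection()):
        sys.stdout.buffer.write(record)
```

If that were saved as an executable file (say emit_posts.py, a made-up
name), something along the lines of "swish-e -S prog -i ./emit_posts.py"
should read its output; check the swish-e docs for the exact invocation
and config settings.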

> Also, I am confused as to where I should direct the config file to
> spider the dynamic links.  Let's say I want to spider this particular
> file:
> 
> http://www.youngcomposers.com/forum/Piano-Music-f50.html

How would the spider, or anyone for that matter, know whether that's a
static file or a dynamically generated file?

> Piano-Music-f50.html is actually a php generated file with an html
> alias, but I don't know where to direct swish-e to spider this file.

I have no idea what an html alias is in that context, but you point
the spider to the same place you would point anyone else: to its URL.


> When I spider the files under /home/yc/www/forum (my local site for
> www.youngcomposers.com), all it does is spider the files that run the
> forum, not the actual content dynamic pages, such as
> "Piano-Music-f50.html" or equivalently
> http://www.youngcomposers.com/forum/index.php?showforum=50

The term "spider" implies you are spidering your web site, most likely
with the oddly named program "spider.pl".  That would be spidering
like Google does -- by accessing your documents via the web.
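
To make that concrete, here is a minimal sketch (in Python, and not how
spider.pl itself is written) of what spidering amounts to: start at a URL,
fetch it over HTTP, pull out the links, and follow the ones on the same
host.  The function and class names are invented for the example.  Note
that nothing in it looks at the filesystem or the database; a page
generated by PHP and a static .html file are both just URLs that return
HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlsplit
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def http_fetch(url):
    """Fetch a page over HTTP, the same way a browser would."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


def crawl(start_url, fetch=http_fetch, max_pages=50):
    """Breadth-first crawl of same-host pages; returns {url: html}."""
    host = urlsplit(start_url).netloc
    queue, seen, pages = [start_url], {start_url}, {}
    while queue and len(pages) < max_pages:
        url = queue.pop(0)
        page = fetch(url)
        pages[url] = page
        parser = LinkParser()
        parser.feed(page)
        for href in parser.links:
            # Resolve relative links and drop any #fragment.
            link, _fragment = urldefrag(urljoin(url, href))
            if urlsplit(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    return pages
```

A real spider also has to deal with robots.txt, delays between requests,
errors, and content types, all of which this sketch leaves out.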

Please go back and look at the docs again.

http://swish-e.org/docs/install.html#general_configuration_and_usage

http://swish-e.org/docs/install.html#spidering_and_searching_with_a_web_form_

http://swish-e.org/docs/spider.html


> So I guess my basic question would be, what is the address of my dynamic
> files?  A very poor guess is, my database files are located here:
> 
> /var/lib/mysql/
> 
> But is this the address to spider?  Or do I spider /home/yc/www/forum
> instead?  

Maybe it's better if someone else answers that one.

-- 
Bill Moseley
moseley@hank.org

Unsubscribe from or help with the swish-e list: 
   http://swish-e.org/Discussion/

Help with Swish-e:
   http://swish-e.org/current/docs
   swish-e@sunsite.berkeley.edu
Received on Fri Nov 4 12:28:11 2005