Re: [swish-e] Using ExtractPath to Exclude Some Subdirectory from Search Result

From: Ronny Rahardjo <rrahardjo(at)not-real.gmail.com>
Date: Fri Sep 18 2009 - 18:37:07 GMT

Hi Peter,

Thanks for your help on this. I can now narrow down the issue, but I have a
few questions:

1. How can I find out which config file my runindex.bat is using? When I run
swish-e -c swish.config by hand, it complains about spider.pl (because of an
incorrect path). However, my scheduled task for runindex.bat runs just fine,
so I need to know whether it really executes spider.pl or something else.

2. Here is some of the content at my base URL that may be causing the issue,
which is why I want to exclude it from indexing using test_url:
    <div class="tabset">
       <ul>
        <li><a href="#tab1_1" class="tab active"><span>Content</span></a></li>
        <li><a href="#tab1_2" class="tab"><span>Content</span></a></li>
        <li><a href="#tab1_3" class="tab"><span>Content</span></a></li>
       </ul>
      </div>
      <!-- tab 1 tabset-one -->
      <div class="tab innovaton tab-box" id="tab1_1">
       <div class="tab-holder">
        <strong class="replace">Title</strong>
        <p><a href="news/1250.html"><u>Content</u></a></p>
      </div>
       <a href="/news/11223.html" class="read-more-btn">Read More</a>
      </div>

I think the issue is with the tabset (JavaScript), so I want to exclude it
from indexing. Could you please let me know how to exclude any <a
href="#tab"> link using test_url? Or is there another method that can
exclude them?
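
To make the question concrete, here is a rough sketch of the kind of
spider config I have in mind. It assumes the documented behaviour that
test_url is called with a URI object and the server hash, and that a false
return value skips the URL. The base_url, email, and /private/ path are
placeholders, not my real values. I am also not sure the "#tab..." fragment
is still attached to the URL when test_url is called (the spider may strip
fragments before that), so the first check may never fire:

```perl
# Spider config fragment (all values below are placeholders).
@servers = (
    {
        base_url => 'http://www.example.com/',    # hypothetical site
        email    => 'webmaster@example.com',      # hypothetical contact

        test_url => sub {
            my ( $uri, $server ) = @_;            # $uri is a URI object

            # Skip in-page tab anchors such as "#tab1_1".
            # Assumption: the fragment has not been removed already.
            my $fragment = $uri->fragment;
            return 0 if defined $fragment && $fragment =~ /^tab/;

            # Skip a whole subdirectory (hypothetical path).
            return 0 if $uri->path =~ m{^/private/};

            return 1;    # true: let the spider fetch this URL
        },
    },
);

1;
```

If test_url turns out to be the wrong place for the tab links, the path
check should at least show whether the callback is being used at all.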

Thanks.

On Fri, Sep 18, 2009 at 7:41 AM, Peter Karman <peter@peknet.com> wrote:

> Ronny Rahardjo wrote on 09/17/2009 07:07 PM:
> > I have swishe.config under my installation folder C:\SWISH-E\bin, which
> > uses IncludeConfigFile common.config.
> > Does that mean my configuration file is common.config?
>
> per your other thread, I now know you are using spider.pl. If you want
> to exclude certain URLs from being indexed, look at
>
> http://swish-e.org/docs/spider.html#test_url
>
> --
>  Peter Karman  .  http://peknet.com/  .  peter(at)not-real.peknet.com
> _______________________________________________
> Users mailing list
> Users@lists.swish-e.org
> http://lists.swish-e.org/listinfo/users
>

Received on Fri Sep 18 14:37:10 2009