[swish-e] Swish-E IgnoreMetaTags does not work

From: Dennis Gerasimenko <dgerasimenko(at)not-real.einsteinindustries.com>
Date: Tue Jul 27 2010 - 17:50:15 GMT

Hi.

I am running Swish-E 2.4.7 on RHEL5 and am trying to skip a few HTML
tags (specifically “script”, as in <script ...></script>) inside HEAD and
BODY while parsing HTML files. Despite the configuration directive
“IgnoreMetaTags script style link select”, the “script” tag is still being
parsed, which generates many errors such as:

error: Unexpected end tag : dt '<dt>' + listingName + '</dt>' +
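
For illustration, a stripped-down page of the kind that produces this
message might look like the sample below. This is a hypothetical sketch,
not one of the actual files: only the '<dt>' + listingName + '</dt>' +
line comes from the error output, and the title, variable value, and
surrounding markup are assumptions.

--- SAMPLE HTML (HYPOTHETICAL) ---

<html>
<head>
<title>Example listing page</title>
<script type="text/javascript">
// Assumed inline script: it builds markup by string concatenation,
// so literal tags such as </dt> appear inside the script body.
var listingName = "Example";
var html = '<dl>' +
    '<dt>' + listingName + '</dt>' +
    '<dd>details</dd>' +
    '</dl>';
document.write(html);
</script>
</head>
<body>
<p>Page content to index.</p>
</body>
</html>

--- END SAMPLE HTML ---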

What am I doing wrong? Here is my config file:

--- CONFIG FILE ---

# Index only HTML files (.html, .htm)
IndexOnly .html .htm

# Otherwise, use the HTML parser
DefaultContents HTML*

# Define metanames ranks
MetaNamesRank  10 title
MetaNamesRank   5 swishdefault

# Add document description to index
StoreDescription HTML* <body> 20000

# Define custom properties (meta description)
PropertyNames     metadescription
PropertyNameAlias metadescription description

# Ignore total number of words when ranking
IgnoreTotalWordCountWhenRanking no

# Ignore selected HTML tags
IgnoreMetaTags script style link select

# Define max depth
MaxDepth 6

# Define delay (seconds)
Delay 0

# Define location of the spider script
SpiderDirectory /usr/local/swish-e/lib/swish-e/

# Define temporary directory
TmpDir /var/tmp

--- END CONFIG FILE ---

Any help is appreciated. Thank you.
  
Received on Tue Jul 27 13:50:29 2010