RE: obeyRobotsNoIndex & IndexContents

From: MattO <matto(at)not-real.tellme.com>
Date: Thu May 06 2004 - 04:25:54 GMT
I think I've figured this out. The IndexContents directive needs to name the HTML2 parser rather than the HTML parser:

  IndexContents HTML2 .html

The contents need to be indexed with the HTML2 parser.

MattO
-----Original Message-----
From: MattO [mailto:matto@tellme.com] 
Sent: Wednesday, May 05, 2004 9:05 PM
To: 'swish-e@sunsite.berkeley.edu'
Subject: obeyRobotsNoIndex & IndexContents


Using SWISH-E 2.4.2 on Solaris 5.8 i386

If, in my swish-e cfg, I set:

  obeyRobotsNoIndex yes

and I have the following in my content:

  <meta name="robots" content="noindex">

during indexing I see:
  blah.html - Using DEFAULT (HTML2) parser -  (Skipped due to Robots Excluion Rule in meta tag)

Aside from the typo in the output, that's what I'd expect.
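
To see ahead of time which files the rule would skip, one could scan the content for the robots meta tag. This is only a rough sketch in Python and not how swish-e itself decides; the names `RobotsMetaParser` and `has_robots_noindex` are made up for illustration.

```python
from html.parser import HTMLParser


class RobotsMetaParser(HTMLParser):
    """Collects the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        if attr.get("name", "").strip().lower() == "robots":
            # content may hold several directives, e.g. "noindex, nofollow"
            for part in attr.get("content", "").split(","):
                self.directives.add(part.strip().lower())


def has_robots_noindex(html_text):
    """Return True if the document carries a robots meta tag with noindex."""
    parser = RobotsMetaParser()
    parser.feed(html_text)
    return "noindex" in parser.directives
```

Feeding it the text of blah.html should return True for a page that carries the meta tag shown above.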

If I then add the following directive to my cfg:

  IndexContents HTML .html

and rebuild the index, I see:

  blah.html - Using HTML parser -  (496 words)

Apologies if this is a FAQ, but can I get swish-e to apply the
"obeyRobotsNoIndex" rule first, and use "IndexContents" only if the
robots meta tag isn't present?

Thanks.

MattO
Received on Wed May 5 21:25:56 2004