Issue with indexing

From: John Kelley <john(at)not-real.atypica.com>
Date: Thu Jan 13 2005 - 15:13:20 GMT
I'm on version 2.2.3 and am finding that indexing misses terms that appear
on the page.
Attached is a single-page example.  I indexed that one page, but searching
for many of the terms that appear in it produces no results.  The
indexer reported no problems with the HTML.

Example: searching for Kiddush, which appears on the page, returns zero
results.  I'm using the HTML2 parser and direct disk indexing.

Any pointers as to what I should check for?

Here is a partial collection of my configs:
StoreDescription HTML* <body> 200000000
DefaultContents HTML*
IndexReport 3
ParserWarnLevel 3
IndexContents HTML* .htm .html .shtml .php .phtml .php3
IgnoreMetaTags script style
PropertyNamesMaxLength 200000000 swishdescription
PropertyNameAlias swishdescription body
MetaNames swishtitle summary swishdocpath
MetaNameAlias summary description overview body
IndexReport 3

Thanks,

John  

*********************************************************************
Due to deletion of content types excluded from this list by policy,
this multipart message was reduced to a single part, and from there
to a plain text message.
*********************************************************************
Received on Thu Jan 13 07:13:25 2005