Re: win2k unknown header problem

From: David L Norris <dave(at)not-real.webaugur.com>
Date: Thu Sep 26 2002 - 18:44:25 GMT
On Thu, 2002-09-26 at 11:48, Matt Kynaston wrote:
> I've got a single page (swish.php - I've replaced it with hello world for
> testing) pulled from database that includes links to everything I want
> indexed (to a depth of 1).

OK, I've set up what you describe here and can reproduce the warnings
you're seeing: a single page full of links, returned with a
"No-Contents: 1" header.  I do not see the warnings with RC1.  However,
everything does appear to be indexed in both 2.2RC1 and the current 2.2
build, and the resulting index is the same.

Bill, spider.pl shouldn't be returning any contents along with a
"No-Contents" header, right?  I think this is the problem.  spider.pl
should still extract the links unless there is a "robots nofollow"
directive, and it should return an empty document to SWISH-E if there is
a "robots nocontents" directive.
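
To make the behaviour I'm describing concrete, here is a rough sketch.
It is written in Python rather than spider.pl's Perl, and it is not how
spider.pl is actually implemented; the names RobotsAwareParser and
process_page are made up for illustration.  It only assumes the robots
directives arrive as a comma-separated <meta name="robots"> value.

```python
from html.parser import HTMLParser


class RobotsAwareParser(HTMLParser):
    """Collect <a href> targets and the tokens of <meta name="robots">.

    Hypothetical helper for illustration; not part of spider.pl or SWISH-E.
    """

    def __init__(self):
        super().__init__()
        self.links = []
        self.robots = set()

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "a" and attrs.get("href"):
            self.links.append(attrs["href"])
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            content = attrs.get("content") or ""
            # Directives are comma separated, e.g. "nocontents, nofollow".
            for token in content.split(","):
                token = token.strip().lower()
                if token:
                    self.robots.add(token)


def process_page(html):
    """Return (links_to_follow, body_to_hand_to_the_indexer)."""
    parser = RobotsAwareParser()
    parser.feed(html)
    # Links are followed unless the page says "nofollow".
    links = [] if "nofollow" in parser.robots else parser.links
    # The body is emptied when the page says "nocontents".
    body = "" if "nocontents" in parser.robots else html
    return links, body
```

Fed the index.html shown below, process_page would hand back the three
test*.html links together with an empty body, so the linked pages still
get spidered while the link page itself contributes no content.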


$ ls 
index.html  test1.html  test2.html  test3.html

$ cat index.html 
<html>
<head>
<title>This is not indexed</title>
<meta name="robots" content="nocontents">
</head>
<body>
<a href="test1.html">Test 1</a>
<a href="test2.html">Test 2</a>
<a href="test3.html">Test 3</a>
</body>
</html>

C:\SWISH-Erc>swish-e -w testing
# SWISH format: 2.2rc1
# Search words: testing
# Number of hits: 4
# Search time: 0.010 seconds
# Run time: 0.040 seconds
1000 http://daneel/temp/testing/test3.html "3 Testing SWISH-E" 136
1000 http://daneel/temp/testing/test2.html "2 Testing SWISH-E" 140
1000 http://daneel/temp/testing/test1.html "1 Testing SWISH-E" 136
200 http://daneel/temp/testing/ "" 218
.

C:\SWISH-E>swish-e -w testing
# SWISH format: 2.2
# Search words: testing
# Number of hits: 4
# Search time: 0.010 seconds
# Run time: 0.060 seconds
1000 http://daneel/temp/testing/test3.html "3 Testing SWISH-E" 136
1000 http://daneel/temp/testing/test2.html "2 Testing SWISH-E" 140
1000 http://daneel/temp/testing/test1.html "1 Testing SWISH-E" 136
200 http://daneel/temp/testing/ "" 218
.

-- 
 David Norris
  Dave's Web - http://www.webaugur.com/dave/
  Augury Net - http://home.webaugur.com/
  ICQ - 412039
Received on Thu Sep 26 18:49:22 2002