swish.cgi is not returning results

From: Peggy Eaton <peaton(at)not-real.eosdata.gsfc.nasa.gov>
Date: Fri Oct 04 2002 - 18:16:28 GMT
I'm attempting to use the swish.cgi script that comes with the 
swish-e-2.2.1 distribution, and searches are not working.  I get a 
message in the results page: "Failed to find end of results".  I've 
included all the information on various pieces of the puzzle below. 
Sorry it's so long.

I'm indexing a single index.html file:
--------------------------------------------------------------------
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
   <head>
     <title>Test Page for Swish-e Indexing</title>
         <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
         <meta name="description" content="Information, tools and 
reports for DAAC web sites">
         <link rel="StyleSheet" href="/INTERNAL/web/web.css">
   </head>

   <body>
     <h1>Test Page for Swish-e Indexing</h1>

<hr>

<ul>
<li><A HREF="/INTERNAL/web/reports/"><b>Reports</b></a><br>(Bobby, 
Compliance, Links)<br>
<li><A HREF="/INTERNAL/web/policies/"><b>Policies</b></a><br>(Web page 
requirements, Development Checklists)<br>
<ul>
<li><a href="/INTERNAL/web/policies/MODIS"><b>MDST Checklist</b></a>  <br>
</ul>
<li><A HREF="/INTERNAL/web/tools/"><b>Web Page 
Tools</b></a><br>(weblint, tidy,
bobby)<br>
<li><A HREF="/INTERNAL/web/curator/"><b>Web Information</b></a><br>(Web 
directory locations, Server aliases, Baselines, CM/Promotion info)<br>
<li><A HREF="/INTERNAL/web/templates/"><b>Web Page 
Templates</b></a><br>(Science, WHOM, and Guide web page templates)
</ul>

     <hr>
     <address><a href="mailto:peaton@daac.gsfc.nasa.gov">Peggy 
Eaton</a></address>
<!-- Created: Fri Oct  4 12:55:49 EDT 2002 -->
<!-- hhmts start -->
Last modified: Fri Oct  4 13:03:00 EDT 2002
<!-- hhmts end -->
   </body>
</html>

--------------------------------------------------------------
My config file is swish_search.conf:
daacdev2$ cat swish_search.conf
####################################################
#
# Swish-e configuration file
#####################################################

# DIRECTORIES TO INDEX

IndexDir /usr/daac/doc/internal/web

# TYPES OF DOCS TO INDEX

IndexContents HTML2 .shtml .html .htm

DefaultContents HTML2

#INDEX ONLY FILES WITH THESE EXTENSIONS

IndexOnly .html .shtml

################
# INDEX DETAILS
################

#VERBOSITY LEVEL OF FEEDBACK WHEN INDEXING

IndexReport 3

ParserWarnLevel 2

#WHERE TO PLACE THE INDEX FILE

IndexFile swish_search.index

# TYPES OF DOCS NOT TO INDEX

NoContents .doc .gif .js .pdf .php .txt .xml

###########################################################
# MetaNames for both special searching and property set up
###########################################################

#MetaNames subject

#MetaNames description

###########################################
# Properties to be returned in the results
###########################################

StoreDescription HTML <body> 200
StoreDescription XML <body> 200
StoreDescription HTML2 <body> 200
StoreDescription XML2 <body> 200

PropertyNameAlias swishdescription body

ReplaceRules replace "/usr/daac/doc/internal" "/INTERNAL"
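
For reference, the ReplaceRules line above is meant to rewrite the stored path of each indexed file. A minimal Python sketch of that substitution, assuming the rule acts as a plain substring replacement (the function name apply_replace_rule and the sample path are illustrative only, not part of swish-e):

```python
def apply_replace_rule(path, old, new):
    """Mimic `ReplaceRules replace "old" "new"` as a plain substring
    replacement (an assumption about the rule's behavior)."""
    return path.replace(old, new)


# Hypothetical full path under the IndexDir from the config above.
print(apply_replace_rule(
    "/usr/daac/doc/internal/web/index.html",
    "/usr/daac/doc/internal",
    "/INTERNAL",
))
```

When the file is indexed with a relative path such as index.html, there is nothing for the rule to match, so the stored path stays unchanged.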

---------------------------------------------------------------
Indexing command:
daacdev2$ Swish-e -c swish_search.conf -i index.html -T indexed_words properties
Indexing Data Source: "File-System"
Indexing "index.html"

Checking file "index.html"...
   index.html - Using HTML2 parser -
     Adding:[1:swishdefault(1)]   'test'   Pos:2  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:3  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'for'   Pos:4  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'swish'   Pos:5  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'e'   Pos:6  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'indexing'   Pos:7  Stuct:0x7 ( HEAD TITLE FILE )
     Adding:[1:swishdefault(1)]   'information'   Pos:11  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'tools'   Pos:12  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'and'   Pos:13  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'reports'   Pos:14  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'for'   Pos:15  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'daac'   Pos:16  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:17  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'sites'   Pos:18  Stuct:0x5 ( HEAD FILE )
     Adding:[1:swishdefault(1)]   'test'   Pos:20  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:21  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'for'   Pos:22  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'swish'   Pos:23  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'e'   Pos:24  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'indexing'   Pos:25  Stuct:0x29 ( HEADING BODY FILE )
     Adding:[1:swishdefault(1)]   'reports'   Pos:26  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'bobby'   Pos:27  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'compliance'   Pos:28  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'links'   Pos:29  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'policies'   Pos:30  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:31  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:32  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'requirements'   Pos:33  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'development'   Pos:34  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'checklists'   Pos:35  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'mdst'   Pos:36  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'checklist'   Pos:37  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:38  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:39  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'tools'   Pos:40  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'weblint'   Pos:41  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'tidy'   Pos:42  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'bobby'   Pos:43  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:44  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'information'   Pos:45  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:46  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'directory'   Pos:47  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'locations'   Pos:48  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'server'   Pos:49  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'aliases'   Pos:50  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'baselines'   Pos:51  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'cm'   Pos:52  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'promotion'   Pos:53  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'info'   Pos:54  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:55  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:56  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'templates'   Pos:57  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'science'   Pos:58  Stuct:0x49 ( EM BODY FILE )
     Adding:[1:swishdefault(1)]   'whom'   Pos:59  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'and'   Pos:60  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'guide'   Pos:61  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'web'   Pos:62  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'page'   Pos:63  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'templates'   Pos:64  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'peggy'   Pos:65  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'eaton'   Pos:66  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'last'   Pos:67  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'modified'   Pos:68  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'fri'   Pos:69  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'oct'   Pos:70  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   '4'   Pos:71  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   '13'   Pos:72  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   '03'   Pos:73  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   '00'   Pos:74  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   'edt'   Pos:75  Stuct:0x9 ( BODY FILE )
     Adding:[1:swishdefault(1)]   '2002'   Pos:76  Stuct:0x9 ( BODY FILE )
  (71 words)
           swishdocpath: 6 ( 10) S: "index.html"
             swishtitle: 7 ( 30) S: "Test Page for Swish-e Indexing"
           swishdocsize: 8 (  4) N: "0000000001316"
      swishlastmodified: 9 (  4) D: "2002-10-04 13:03:00"
       swishdescription:10 (200) S: "Test Page for Swish-e Indexing Test 
Page for Swish-e Indexing Reports (Bobby, Reports (Bobby, Compliance, 
Links) Compliance, Links) Policies (Web Policies (Web page requirements, 
Development Checklist"

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 48 words alphabetically
Writing header ...
Writing index entries ...
   Writing word text: Complete
   Writing word hash: Complete
   Writing word data: Complete
48 unique words indexed.
5 properties sorted.
1 file indexed.  1316 total bytes.  71 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
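
The Stuct values in the trace above appear to be bit flags recording where each word was found. A short Python sketch that decodes them, with bit values inferred only from this output (for example 0x5 = HEAD FILE and 0x9 = BODY FILE) and not taken from the swish-e source; STRUCT_BITS and decode_struct are illustrative names:

```python
# Bit values inferred from the -T indexed_words trace (an assumption):
# each pair is (bit, flag name), listed highest bit first to match the
# order in which the trace prints the names.
STRUCT_BITS = [
    (0x40, "EM"),
    (0x20, "HEADING"),
    (0x08, "BODY"),
    (0x04, "HEAD"),
    (0x02, "TITLE"),
    (0x01, "FILE"),
]


def decode_struct(value):
    """Return the names of the flags set in a Stuct value, highest bit first."""
    return [name for bit, name in STRUCT_BITS if value & bit]


# The five distinct values that occur in the trace.
for value in (0x7, 0x5, 0x29, 0x49, 0x9):
    print(hex(value), " ".join(decode_struct(value)))
```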
-------------------------------------------------------------
Successful command-line search:
daacdev2$ $DAACDIR/bin/Swish-e -w web -p body -f swish_search.index
# SWISH format: 2.2.1
# Search words: web
# Number of hits: 1
# Search time: 0.001 seconds
# Run time: 0.110 seconds
1000 index.html "Test Page for Swish-e Indexing" 1316 "Test Page for 
Swish-e Indexing Test Page for Swish-e Indexing Reports (Bobby, Reports 
(Bobby, Compliance, Links) Compliance, Links) Policies (Web Policies 
(Web page requirements, Development Checklist"
.
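
The output above ends with a line containing only ".", which is presumably the end-of-results marker that the "Failed to find end of results" message refers to. A minimal Python sketch for checking captured output for that marker; parse_swish_output is an illustrative name, and the sample text is an abbreviated copy of the output above:

```python
def parse_swish_output(text):
    """Split swish-e search output into header lines, result lines,
    and a flag saying whether the lone "." end marker was seen."""
    headers, results, saw_end = [], [], False
    for line in text.splitlines():
        if line.strip() == ".":
            saw_end = True
            break
        if line.startswith("#"):
            headers.append(line)
        elif line.strip():
            results.append(line)
    return headers, results, saw_end


# Abbreviated sample based on the command-line search output.
sample = """# SWISH format: 2.2.1
# Search words: web
# Number of hits: 1
1000 index.html "Test Page for Swish-e Indexing" 1316
.
"""

headers, results, saw_end = parse_swish_output(sample)
print(len(headers), len(results), saw_end)
```

Feeding it whatever the binary prints when run as the web server user would show whether the marker is missing or whether extra lines appear before the headers.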

--------------------------------------------------------------------
gdb info:
daacdev2$ gdb Swish-e
GNU gdb 5.2
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "mips-sgi-irix6.5"...
(gdb) run  -w web  -p body -f swish_search.index
Starting program: /usr/daac/dev/bin/Swish-e -w web  -p body -f swish_search.index
/usr/daac/dev
science user software
# SWISH format: 2.2.1
# Search words: web
# Number of hits: 1
# Search time: 0.001 seconds
# Run time: 0.110 seconds
1000 index.html "Test Page for Swish-e Indexing" 1316 "Test Page for 
Swish-e Indexing Test Page for Swish-e Indexing Reports (Bobby, Reports 
(Bobby, Compliance, Links) Compliance, Links) Policies (Web Policies 
(Web page requirements, Development Checklist"
.

Program exited normally.
(gdb)

---------------------------------------------------------------------
Below are the sections of code in swish.cgi that I customized:

#------------ Configuration ----------------------

     # Set these as needed for your system and your index file

     # You might want to read these in from a file (based on
     # the script name or extra path info), or use PerlSetVars
     # under mod_perl to pass in the parameters.


     # These paths are normally outside of webspace

     # This one is obvious, I hope:

     $Swish_Binary = '/usr/daac/dev/bin/Swish-e';


     # The index file can also be a reference to an array of index files.

     $Swish_Index  = '/usr/daac/doc/internal/web/swish_search.index';

     # The template file is the one supplied with this example CGI script

     $Tmpl_Path    = '/usr/daac/dev/etc/swish.tmpl';


     # Here list the properties that are defined in your index,
     # and that you want displayed with your search results
     # Comment out if not used

#    @PropertyNames   = qw/last_name first_name city phone/;


     # If you defined MetaNames in your document (to search by field)
     # specify their names here.  These will be used when generating the
     # query.
     # Comment out if not used
     # This adds a radio group on the form for limiting your search if
     # more than one MetaName is listed.

#    @MetaNames = qw/name description/;


     # The $Metaname_Default does two things.  If you set @MetaNames to
     # more than one value, this will set the default radio button selected
     # when the script first starts.
     # If you set @MetaNames to the empty list, but set $Metaname_Default
     # to a value, then this value will be used as the metaname for all
     # queries.

     $Metaname_Default = 'description';  # set the default radio button

     # If $All_Meta is set true, and @MetaNames is not the empty list, then
     # all queries must be a metaname search.
     # If $All_Meta is set false, then an additional radio button will be
     # added to allow searching without a metaname prepended to the query.
     # For HTML docs, it's typically 0; for XML, set to 1.

     $All_Meta = 0;


     $Page_Size    = 20;  # results per page

#---------- End of Configuration ----------------------
.
.
.
# use lib 'path/to/local/perl/library';
use lib qw(
         /usr/daac/dev/lib/perl
         /usr/people/peaton/perl/lib/site_perl/5.6.0/HTML
         /usr/people/peaton/perl/lib/site_perl/5.6.0/SWISH/
         /usr/people/peaton/perl/lib/site_perl/5.6.0
         /usr/people/peaton/perl/lib/site_perl/5.6.0/IP27-irix/HTML
         /usr/people/peaton/perl/lib/site_perl/5.6.0/IP27-irix/Sys
         /usr/people/peaton/perl/lib/5.6.0/IP27-irix/
);
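
The comments in the configuration section say that when @MetaNames is empty but $Metaname_Default is set, that value is used as the metaname for all queries. A hypothetical Python sketch of that rule follows; build_query and its parameters are illustrative names, the metaname=(words) form is an assumption about what the script generates, and $All_Meta is not modeled:

```python
def build_query(words, metanames=(), metaname_default=None, selected=None):
    """Prefix a query with a metaname, following the config comments.

    metanames        -- stands in for @MetaNames
    metaname_default -- stands in for $Metaname_Default
    selected         -- radio-button choice; "" means no metaname
    """
    if metanames:
        # A radio group is shown; fall back to the default if nothing is chosen.
        meta = selected if selected is not None else metaname_default
    else:
        # No radio group: the default, if set, applies to every query.
        meta = metaname_default
    return f"{meta}=({words})" if meta else words


# Settings as in the customized section: @MetaNames commented out,
# $Metaname_Default = 'description'.
print(build_query("web", (), "description"))
```

If the script behaves as its comments describe, a search for web would be sent as a description metaname query, whereas the command-line test above used a plain -w web.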

-------------------------------------------------------------
The remainder of the script is unchanged.  I followed the instructions in 
the script and downloaded the needed Perl modules from CPAN yesterday:
HTML-FillInForm-1.00
HTML-Parser-3.26
HTML-Template-2.6
SWISH-0.07
Sys-Signal-0.02
Time-HiRes-1.37

----------------------------------------------------------------
System Info:
daacdev2$ uname -a
IRIX64 daacdev2 6.5 10100655 IP27
daacdev2$ uname -R
6.5 6.5.14f
Received on Fri Oct 4 18:21:18 2002