I've included my config file and a sample of the results. Why do I get
(NULL) titles and no descriptions? I've also included a sample of the pages
I'm indexing. My search tool is basically swish.cgi modified for our site,
with no secondary sorts or date routine.

How do I stop swish-e from following every link when using the http method?
Is this even possible?
# cat ublin.config
# DIRECTIVES COMMON to HTTP and FILESYSTEM METHODS
IndexDir http://ublin.lib.buffalo.edu/webcat/bibcat/A/A/E/9
IndexFile /usr/local/bin/index.swish
IndexReport 3
IgnoreTotalWordCountWhenRanking no
MaxDepth 1
IndexComments 0
MaxDepth 5
Delay 1
TmpDir /export/home/thomasr/tmp/
StoreDescription HTML <body> 5000
Results of search:
1 (NULL) -- rank: 1000
No Content saved: Check StoreDescription setting
2 (NULL) -- rank: 944
No Content saved: Check StoreDescription setting
3 (NULL) -- rank: 914
No Content saved: Check StoreDescription setting
A sample of the html pages I'm trying to index:
<title> E/E/F/8/403 University at Buffalo Libraries Web Catalog</title> <br>
<h3>
United States. Bureau of Land Management.</h3> <br>
<h3>
BLM Wyoming fishing opportunities / United States Department of the
Interior, Bureau of Land Management.</h3> <br>
<h3>
Wyoming fishing opportunities</h3> <br>
<h3>
Title within map border: Fishing opportunities [place] Wyoming</h3> <br>
<h3>
Title within map border: Fishing opportunities in [place] Wyoming</h3>
<br>
Scales differ. <br>
[Washington, D.C.?] ; The Bureau, [1991- <br>
maps : col. ; 45 x 59 cm. or smaller, on sheets 46 x 61 cm., folded to 23
x 11 cm. <br>
Relief shown by shading. <br>
Panel title. <br>
"This brochure was developed in cooperation with the Wyoming Game and Fish
Department." <br>
Includes descriptive index to fishing areas and descriptive distance list
for each area. <br>
Text, Wyoming map showing available BLM surface maps, and col. ill. on
verso. <br>
Northeast and central -- South central -- Southwest -- Big Horn River. <br>
<br>
Fishing Wyoming Maps. <br>
<br>
Fishing Big Horn River (Wyo. and Mont.) Maps. <br>
<br>
Wyoming. Game and Fish Dept. <br>
<hr>
<br>
<A HREF="http://ublin.lib.buffalo.edu/webcat/about/about1.html">About this
page</A>
<br>
<A HREF="http://ublin.lib.buffalo.edu/holcat/E/E/F/8/403.html">Holdings: How
to find this item</A>
<br>
<A HREF="http://ublib.buffalo.edu/libraries/cgi-bin/catalogw.cgi">Search the
UB Libraries Catalog</A>
<br>
<A HREF="http://ublin.lib.buffalo.edu/marcat/E/E/F/8/403.mrc">Download this
MARC record</A>
<br>
<hr>
University at Buffalo
<br>
State University of New York
As you can see, they are essentially plain text with a few HTML tags.
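In case it helps with diagnosis, here is a rough Python sketch (not part of swish-e; the class and function names are made up for illustration) that lists which structural tags a page lacks. The sample string is a shortened stand-in for the page above. I'm only guessing that a missing <body> tag could matter to the StoreDescription setting.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Record the name of every start tag seen in a document."""

    def __init__(self):
        super().__init__()
        self.tags = set()

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase.
        self.tags.add(tag)


def missing_tags(html_text, wanted=("html", "head", "title", "body")):
    """Return the wanted tags that never appear as start tags."""
    collector = TagCollector()
    collector.feed(html_text)
    collector.close()
    return [tag for tag in wanted if tag not in collector.tags]


# Shortened stand-in for one of the catalog pages (illustrative only).
sample = """<title> E/E/F/8/403 University at Buffalo Libraries Web Catalog</title> <br>
<h3>
Wyoming fishing opportunities</h3> <br>
Scales differ. <br>
<hr>
"""

if __name__ == "__main__":
    print(missing_tags(sample))
```

Run against the shortened sample, this reports which of html, head, title and body never open, which at least shows how the pages differ from a complete HTML document.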
Thanks,
Rich
Received on Thu Jan 24 13:42:15 2002