Hi guys. I've been running swish-e 2.4.3 on a production site for a
while now with great results, but now I'm looking at using it on another
site and I can't seem to replicate the success I've had in production.
The problem I have is with PropertyNames in the config file.
I'll provide a simple example so you can tell me whether my assumptions
are wrong (probably, because we all know why you shouldn't assume).
------------------1: test.conf --->8-------------------
IndexFile test.index
DefaultContents HTML*
PropertyNames id type
MetaNames id type
-------------------8<-------------------------
My assumptions: this will save the index in a file called "test.index".
During indexing, additional information will be stored per document,
namely the properties id and type.
During searching, the user may optionally restrict the search to
particular metadata, namely the meta names id and type.
--------------2: doc.html ------->8---------------------
<h1>hello</h1>
<id>1</id>
<name>hi</name>
<type>product</type>
------------------------8<--------------------------
Okay, not the most interesting file to search, but to my thinking the
indexer should pick out the <id> and <type> tags and store their
contents as properties.
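To make that expectation concrete, here is a rough Python sketch of what
I imagine ending up in the property table for doc.html. This is only my
mental model, not how swish-e's parser actually works; the function name
expected_properties is made up for illustration.

```python
import re

# Property names copied from the PropertyNames line in test.conf.
PROPERTY_NAMES = ["id", "type"]

# Contents of doc.html from the example above.
DOC = """<h1>hello</h1>
<id>1</id>
<name>hi</name>
<type>product</type>
"""


def expected_properties(html, names):
    """Return the text of the first <name>...</name> pair for each listed name.

    Hypothetical helper: it only models what I *expect* to be stored,
    it does not call or imitate swish-e itself.
    """
    found = {}
    for name in names:
        tag = re.escape(name)
        match = re.search(r"<{0}>(.*?)</{0}>".format(tag), html, re.S)
        if match:
            found[name] = match.group(1).strip()
    return found


if __name__ == "__main__":
    print(expected_properties(DOC, PROPERTY_NAMES))
```

So I would expect id and type to show up as properties, while <name>
is not listed in PropertyNames and would only be indexed as ordinary
text.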
------------3: command ------->8---------------------
[matt@test swish-test]$ /opt/swish-e-2.4.5/bin/swish-e -c test.conf -i doc.html -T indexed_words -T properties
Indexing Data Source: "File-System"
Indexing "doc.html"
Adding:[1:swishdefault(1)] 'hello' Pos:1 Stuct:0x21 ( HEADING FILE )
Adding:[1:swishdefault(1)] '1' Pos:2 Stuct:0x1 ( FILE )
Adding:[1:swishdefault(1)] 'hi' Pos:3 Stuct:0x1 ( FILE )
Adding:[1:swishdefault(1)] 'product' Pos:4 Stuct:0x1 ( FILE )
swishdocpath: 6 ( 8) S: "doc.html"
swishdocsize: 8 ( 4) N: "71"
swishlastmodified: 9 ( 4) D: "2007-02-09 13:36:44 EST"
Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 4 words alphabetically
Writing header ...
Writing index entries ...
Writing word text: Complete
Writing word hash: Complete
Writing word data: Complete
4 unique words indexed.
6 properties sorted.
1 file indexed. 71 total bytes. 4 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
[matt@test swish-test]$
-------------------8<---------------
Here I see the file has been indexed: all the words have been indexed
and associated with the swishdefault meta name (I'm not completely sure
about that assumption). After the file was indexed, swishdocpath,
swishdocsize and swishlastmodified were stored as properties of that
document. This is where I would like the id and type properties to be
stored as well.
If someone could clear up my understanding of what's happening, that
would be fantastic. I'm not sure how I got it working on the production
site; I've even tried installing swish-e-2.4.3, with no success in
reproducing the properties I want to store.
Thank you in advance,
Matt.
Received on Thu Feb 8 22:50:33 2007