Re: returning HTML code

From: Peter Karman <peter(at)not-real.peknet.com>
Date: Tue May 31 2005 - 16:55:01 GMT
no.

swish-e ignores tags and converts entities when saving properties.

you could, however, pre-convert your tags and & characters to entities
(&lt; &gt; &amp;) in the source document in order to abuse that feature:

karman@topaz08 17% swish-e -i foo.html -c c -v3
Parsing config file 'c'
Indexing Data Source: "File-System"
Indexing "foo.html"

Checking file "foo.html"...
   foo.html - Using HTML2 parser -  (4 words)

Removing very common words...
no words removed.
Writing main index...
Sorting words ...
Sorting 3 words alphabetically
Writing header ...
Writing index entries ...
   Writing word text: Complete
   Writing word hash: Complete
   Writing word data: Complete
3 unique words indexed.
5 properties sorted.
1 file indexed.  61 total bytes.  4 total words.
Elapsed time: 00:00:00 CPU time: 00:00:00
Indexing done!
karman@topaz08 18% swish-e -w test -p swishdescription
# SWISH format: 2.4.3
# Search words: test
# Removed stopwords:
# Number of hits: 1
# Search time: 0.000 seconds
# Run time: 0.020 seconds
1000 foo.html "foo.html" 61 "<bar>my test</bar>"
.
karman@topaz08 19% cat c
StoreDescription HTML2 <body>
IndexContents HTML2 .html
karman@topaz08 20% cat foo.html
<html>
<body>
&lt;bar&gt;my test&lt;/bar&gt;
</body>
</html>
karman@topaz08 21%
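
The pre-conversion step itself is not shown in the transcript above. A minimal sketch of one way to do it, assuming Python's standard `html.escape` is an acceptable tool for the job; the function names `preconvert` and `wrap_for_indexing` are hypothetical and not part of swish-e:

```python
import html


def preconvert(fragment: str) -> str:
    """Turn &, < and > into entities so an HTML parser sees the markup as text.

    quote=False leaves " and ' alone; only &, < and > are converted.
    """
    return html.escape(fragment, quote=False)


def wrap_for_indexing(fragment: str) -> str:
    """Wrap the escaped fragment in a bare html/body skeleton (hypothetical helper)."""
    return "<html>\n<body>\n" + preconvert(fragment) + "\n</body>\n</html>\n"


if __name__ == "__main__":
    # Writes a document with the same shape as foo.html in the transcript.
    print(wrap_for_indexing("<bar>my test</bar>"), end="")
```

Redirecting the output to a file gives a document shaped like foo.html above, which can then be indexed with the same config. `html.escape` converts `&` before `<` and `>`, so an ampersand already present in the source is not double-handled by the later replacements.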


Nicholas W. Miller wrote on 05/31/2005 10:58 AM:
> Hello,
> 
> Is it possible to configure swish-e to return a page's HTML code
> instead of just the page's visible text?
> 
> Thanks,
> 
> Nick

-- 
  Peter Karman . http://peknet.com/ . peter(at)not-real.peknet.com
Received on Tue May 31 09:55:03 2005