Re: Segmentation fault while indexing with "StoreDescription"

From: Dietmar Rabich <do.not.send(at)not-real.gmx.de>
Date: Tue Apr 20 2004 - 07:11:13 GMT
Hi Jose,

It is a little bit difficult ... There are 3 paths with HTML files.

path1: 3 html files
path2: 38,278 html files
path3: 64,168 html files

The whole directory is 5,565,272 kByte unzipped, and some of the documents
are confidential.
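
For anyone who wants to compare their own corpus against the figures above, here is a minimal sketch of how such per-path counts could be gathered. It is not how the numbers in this message were produced; the function name summarize_html is hypothetical, and it counts only files ending in .html (mirroring "IndexOnly .html"), so its size total can be smaller than a whole-directory figure.

```python
import os
import sys


def summarize_html(root):
    """Return (file_count, total_kbytes) for all .html files below root."""
    count = 0
    total_bytes = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Assumption: only the .html suffix matters, as with "IndexOnly .html".
            if name.endswith(".html"):
                count += 1
                total_bytes += os.path.getsize(os.path.join(dirpath, name))
    # Integer division: sizes are reported in whole kByte (1024 bytes).
    return count, total_bytes // 1024


if __name__ == "__main__":
    # Each command-line argument is treated as one path to summarize.
    for root in sys.argv[1:]:
        count, kbytes = summarize_html(root)
        print(f"{root}: {count:,} html files, {kbytes:,} kByte")
```

Saved under any name, the script takes the directories as arguments and prints one line per path, which makes it possible to share the size of a corpus without sharing the confidential documents themselves.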

Here is an extract of the old header file:

# Swish-e format: 2.2.3
# 
# Name: ...
# Saved as: index-html.swish-e
# Counts: 501968 words, 105898 files
# Indexed on: 2004-04-16 05:09:12 CEST
# Description: ...
# Pointer: (no pointer)
# Maintained by: ...
# DocumentProperties: Enabled
# Stemming Applied: 0
# Soundex Applied: 0
# Fuzzy Indexing Mode: None
# IgnoreTotalWordCountWhenRanking: 1
# WordCharacters: ... (not changed)
# MinWordLimit: 2
# MaxWordLimit: 80
# BeginCharacters: ... (not changed)
# EndCharacters: ... (not changed)
# IgnoreFirstChar: 
# IgnoreLastChar: 

I think there are far too many files for Swish-E 2.4.2. We tried 2.4.1
too, with the same result: segmentation fault.

cu Dietmar.

> Hi Dietmar,
> 
> (I cannot contact you directly because of your email address)
> If possible, can you gzip "path1" and "path2" and make them
> available to me so I can try them?
> 
> cu
> Jose
> 
> Dietmar Rabich wrote:
> 
> >Some more information:
> >
> >Swish-E crashes in many other cases too. In each case there are many
> >documents to be indexed. Here is an example:
> >
> >..
> >Removing very common words...
> >no words removed.
> >Writing main index...
> >Sorting words ...
> >Sorting 170,500 words alphabetically
> >Writing header ...
> >Writing index entries ...
> >  Writing word text:  20%Segmentation fault
> >
> >cu Dietmar.
> >
> >  
> >
> >>I've just run into a problem while indexing HTML files. I have updated
> >>Swish-E from version 2.2.3 to 2.4.2. Indexing with the old version works
> >>fine. Now I get a "segmentation fault" message.
> >>
> >>The config file is simple:
> >>
> >>IndexDir ../../path1 ../../path2
> >>IndexOnly .html
> >>IndexReport 3
> >>IndexFile ./test.swish-e
> >>IndexContents HTML .html
> >>DefaultContents HTML
> >>StoreDescription HTML <body> 2000
> >>...

Received on Tue Apr 20 00:11:15 2004