I have libxml2.so.2.5.7 and swish-e-2.2.3 (built with libxml2) installed under my student account on a school machine (a Sun-Blade-1000 with OS 5.8). I have prepared a config file for testing with a small set of URLs, as follows:

-----------------------------test.config---------------------------------
#This is the config file number test for swish-e
IndexDir http://www.lexcotile.com/products.htm
IndexDir http://www.epicurious.com/g_gourmet/g03_qanda/tips.html
IndexDir http://www.infotile.com/asa/html/powdered.html
IndexDir http://www.infotile.com/beaumont/tips/adhesives.html
IndexFile /home1/k/ke/ken/kenyulin/bin/test.index
IndexName "Online Searching Services for AEC Product Procurements"
IndexDescription "This is an index to test a small prototype"
IndexPointer "http://ckdd.cee.uiuc.edu/research/index.html"
IndexAdmin "Ken-Yu Lin"
IndexReport 3
UseStemming no
IgnoreTotalWordCountWhenRanking no
WordCharacters abcdefghijklmnopqrstuvwxyz\&#;0123456789.@|,-'"[](~!@$%^{}_+?a'e'i'o'u'u"n~A'E'I'O'U'U"N~??
IndexComments 0
MaxDepth 2
Delay 20
DefaultContents HTML2
TmpDir /home1/k/ke/ken/kenyulin/temp/
SpiderDirectory /home1/k/ke/ken/kenyulin/bin/
-----------------------------------------------------------------------------

However, whenever I try indexing via the http method with swish-e, it always ends with "err: No unique words indexed!".

------------------------------indexing result-----------------------------
Indexing Data Source: "HTTP-Crawler"
Indexing "http://www.lexcotile.com/products.htm"
retrieving http://www.lexcotile.com/products.htm (0)...
Indexing "http://www.epicurious.com/g_gourmet/g03_qanda/tips.html"
retrieving http://www.epicurious.com/g_gourmet/g03_qanda/tips.html (0)...
Indexing "http://www.infotile.com/asa/html/powdered.html"
retrieving http://www.infotile.com/asa/html/powdered.html (0)...
Indexing "http://www.infotile.com/beaumont/tips/adhesives.html"
retrieving http://www.infotile.com/beaumont/tips/adhesives.html (0)...
Removing very common words... no words removed.
Writing main index...
err: No unique words indexed!
.
-------------------------------------------------------------------------------

But when I tried the same config file on another machine, where the libxml2 libraries were installed by someone else, it worked (that machine runs Linux with libxml2.so.2.5.7). There I could see on the screen that the HTML2 parser was used and how many words were indexed:

--------------------------------------------------------
 - Using HTML2 parser - (324 words)
Skipping http://www.infotile.com/beaumont/index.htm: Too deep.
Skipping http://www.infotile.com/beaumont/index.htm: Too deep.
Skipping http://www.infotile.com/beaumont/locations/index.htm: Too deep.
Skipping http://www.southportceramics.com.au/: Wrong method or server.
...............
--------------------------------------------------------

Has anyone experienced this before? Why is this happening? It looks as if the HTML2 parser is not found on the first machine, even though I have specified it in the config file and installed the required libxml2 libraries. Yet when I built swish-e, I did see that libxml2 was linked in. Weird.

Any help will be greatly appreciated. Thank you!

Ken-Yu Lin

Received on Mon Jun 2 16:37:08 2003
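
To compare the two runs side by side, a short Python sketch like the one below could summarize each log. It assumes the output format matches the excerpts quoted above. The function name summarize_swish_log and the dictionary keys are made up for illustration and are not part of swish-e.

```python
import re


def summarize_swish_log(text):
    """Summarize a swish-e HTTP indexing log.

    The line formats matched here are assumptions taken from the
    excerpts in this post, not from the swish-e documentation.
    """
    # Lines such as: retrieving http://host/page.html (0)...
    retrieved = re.findall(
        r"^[ \t]*retrieving (\S+) \(\d+\)\.\.\.", text, flags=re.MULTILINE
    )
    # Lines such as: - Using HTML2 parser - (324 words)
    parser_lines = re.findall(r"Using (\S+) parser - \((\d+) words\)", text)
    return {
        "retrieved": len(retrieved),
        "parsers": sorted({name for name, _ in parser_lines}),
        "words": sum(int(count) for _, count in parser_lines),
        "failed": "err: No unique words indexed!" in text,
    }


if __name__ == "__main__":
    import sys

    # Usage: python summarize.py < indexing_output.txt
    print(summarize_swish_log(sys.stdin.read()))
```

Saving the screen output from each machine to a file and feeding both through this script would show whether any parser lines and word counts were reported at all.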