java... suite

From: Jean-Michel David <jmdavid(at)not-real.capella.org>
Date: Wed Mar 06 2002 - 18:27:17 GMT
OK, some information:

I don't want to give online access now because it's on a workstation and
I don't want too much load.

The wrapper is part of our own content management engine.  I am
currently trying to extract just the Swish wrapper from the servlets
package and put it in a .jsp that will be easier to distribute and modify.

It needs a servlet runner to run (I tried JRun 3.1 on NT and Tomcat 4 on
Linux successfully) and the following libraries:

Jakarta ORO - Java RegExp
Xerces - XML Parser
Xalan - XSLT Processor
JTidy - for HTML encode/decode

Here is what the import section looks like:

<%@page import="java.io.*"%>
<%@page import="java.util.*"%>
<%@page import="org.apache.oro.text.perl.*"%>
<%@page import="org.apache.oro.text.regex.*"%>
<%@page import="org.w3c.tidy.EntityTable"%>

<%@page import="javax.xml.transform.Source"%>
<%@page import="javax.xml.transform.Result"%>
<%@page import="javax.xml.transform.stream.StreamSource"%>
<%@page import="javax.xml.transform.stream.StreamResult"%>
<%@page import="javax.xml.transform.Transformer"%>
<%@page import="javax.xml.transform.TransformerException"%>
<%@page import="javax.xml.transform.TransformerFactory"%>
<%@page import="javax.xml.transform.TransformerConfigurationException"%>

<%@page import="javax.xml.transform.Templates"%>
<%@page import="org.xml.sax.SAXException"%>

There are two utility classes (made in Capella :)):

StringUtils with the following methods: nullToValue, HTMLDecode, split and
replace (regexp based)
XMLUtils with the following method: applyXSL
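
For illustration only, here is a minimal sketch of what an applyXSL helper could look like using the javax.xml.transform classes imported above. The class name XmlUtilsSketch and the (xml, xsl) string signature are assumptions; the real XMLUtils has not been posted.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical stand-in for XMLUtils; name and signature are assumptions.
final class XmlUtilsSketch {
    // Applies the stylesheet text `xsl` to the document text `xml`
    // and returns the transformation output as a string.
    static String applyXSL(String xml, String xsl) throws TransformerException {
        Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(xsl)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xml)),
                new StreamResult(out));
        return out.toString();
    }
}
```

If the same stylesheet is applied on every request, compiling it once into a Templates object (also in the import list) and calling newTransformer() on it per request would avoid re-parsing the XSL each time.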

The main class reads a config file and puts default values in a
hashtable (swish exec path, catalog, params...)
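
A rough sketch of that config step, assuming the config file is in java.util.Properties format. The key names (swish.path, swish.catalog, swish.maxResults) and their default values are made up for the example and are not the ones used in the actual wrapper.

```java
import java.io.IOException;
import java.io.Reader;
import java.util.Hashtable;
import java.util.Properties;

// Hypothetical config loader; key names and defaults are assumptions.
final class SwishConfigSketch {
    // Starts from built-in defaults, then lets the config file override them.
    static Hashtable<String, String> load(Reader source) throws IOException {
        Hashtable<String, String> config = new Hashtable<>();
        config.put("swish.path", "swish-e");
        config.put("swish.catalog", "index.swish-e");
        config.put("swish.maxResults", "10");

        Properties fromFile = new Properties();
        fromFile.load(source);
        for (String key : fromFile.stringPropertyNames()) {
            config.put(key, fromFile.getProperty(key));
        }
        return config;
    }
}
```

Note that the Properties format treats a backslash as an escape character, so a Windows path in such a file would need doubled backslashes or forward slashes.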

I use the following method to obtain stuff from Swish (this is the
Windows version; Linux is slightly different):

// params uses the -x operator syntax of Swish, like this (in the config file):
String params =
"\"<swishrank>\\t<swishdocpath>\\t<swishtitle>\\t<swishdescription>\\t<description>\\t<swishdocsize>\\t<swishlastmodified>\\n\"";

//that way, the list of fields is dynamic and the XML too!
String[] command = new String[3];
command[0] = "cmd.exe";
command[1] = "/c";
command[2] = swishPath + " -m " + maxResults + " -b " + recOffset
    + " -f " + catalog + " -w \"" + words + "\" -x " + params;
Process process = Runtime.getRuntime().exec(command);

//Then I use a buffered reader like this:

BufferedReader input = new BufferedReader(
    new InputStreamReader(process.getInputStream()));
String line;
//iterating over the results, one line per hit
while ((line = input.readLine()) != null) {
 //parsing, building XML string (use of StringUtils)
}
//applying XSL transformation (use of XMLUtils)
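
To make the parsing step a bit more concrete, here is a sketch of how one tab-separated output line could be turned into a <result> element. ResultLineSketch, escape and toResultXml are hypothetical names, and the list of element names is assumed to come from the same config entry as params. This is not the StringUtils code itself.

```java
// Hypothetical helper; not the actual StringUtils-based parsing code.
final class ResultLineSketch {
    // Escapes the characters that would break the XML text content.
    static String escape(String text) {
        return text.replace("&", "&amp;")
                   .replace("<", "&lt;")
                   .replace(">", "&gt;");
    }

    // Pairs each tab-separated value with its element name, in order.
    // Extra values or extra names are ignored.
    static String toResultXml(String[] fieldNames, String line) {
        String[] values = line.split("\t", -1); // -1 keeps trailing empty fields
        StringBuilder xml = new StringBuilder("<result>");
        for (int i = 0; i < fieldNames.length && i < values.length; i++) {
            xml.append('<').append(fieldNames[i]).append('>')
               .append(escape(values[i]))
               .append("</").append(fieldNames[i]).append('>');
        }
        return xml.append("</result>").toString();
    }
}
```

Called with the names {"rank", "url", "title"} and the line "1000\twww.capella.org\tcapella", it returns <result><rank>1000</rank><url>www.capella.org</url><title>capella</title></result>.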

What's really good about this is that you have a low-level object that
produces an XML string, and you can modify the XSL to adjust your
layout.  The XML has the following structure:
<search>
 <page_offset>0</page_offset>
 <nresults>12</nresults>
<!--will become a .jsp-->
 <self_servlet>org.capella.ed.servlets.Search</self_servlet>
 <words>capella + other</words>
 <results>
  <result>
   <rank>1000</rank>
   <url>www.capella.org</url>
   <title>capella</title>
   <size>40000 octets</size>
   <!--more fields (dynamic)-->
  </result>
  <!--more results-->
 </results>
 <pos>
  <from>1</from>
  <to>5</to>
  <on>12</on>
 </pos>
 <nav>
  <!--prev and next page offset-->
  <prev>0</prev>
  <next>1</next>
  <!--a random number to force a reload : search.jsp?z=0.000333-->
  <z>0.000333</z>
 </nav>
</search>
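
As an illustration of the <pos> and <nav> blocks, the values in the sample above are consistent with a page size of 5 and a zero-based page offset. The sketch below assumes exactly that, and clamps prev and next to the first and last page. The class and method names are hypothetical, and the real wrapper may compute these differently.

```java
// Hypothetical paging helper; the semantics are inferred from the sample XML.
final class PagingSketch {
    // pageOffset is zero-based; total is the number of hits reported by Swish.
    static String posAndNav(int pageOffset, int pageSize, int total) {
        int from = total == 0 ? 0 : pageOffset * pageSize + 1;
        int to = Math.min((pageOffset + 1) * pageSize, total);
        int lastPage = total == 0 ? 0 : (total - 1) / pageSize;
        int prev = Math.max(pageOffset - 1, 0);
        int next = Math.min(pageOffset + 1, lastPage);
        return "<pos><from>" + from + "</from><to>" + to + "</to><on>" + total + "</on></pos>"
             + "<nav><prev>" + prev + "</prev><next>" + next + "</next></nav>";
    }
}
```

With posAndNav(0, 5, 12) this gives from 1, to 5, on 12, prev 0 and next 1, matching the sample (the <z> element is left out here).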

So, stay tuned: I'll try to package it as soon as possible, probably
next week or so, and I'll post the .jsp then.
--
--------------------------------
Jean-Michel David
President and Technical Director
Capella Technologies
jmdavid@capella.org
514-849-1494 ext. 105
866-849-9873
Received on Wed Mar 6 18:28:39 2002