
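Download the latest companion code for the book from Google Code. I recommend extracting it to the C: drive; otherwise the paths become a hassle.
Before compiling, make sure Java is installed and its environment variables are configured. Setting environment variables for Apache Ant is optional, but you do need to download a copy of it. Copy the simple batch script below into a new file named run.bat, and put that file in the C:\iWeb2\build directory: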
@Echo Off
title Hankcs's program
CD\
%~d0
CD %~dp0
SET ANT_HOME=C:\Tools\apache-ant-1.9.3
SET PATH=%JAVA_HOME%\bin;%ANT_HOME%\bin;%PATH%
SET CLASSPATH=
ant >> log.txt
Here I assume your copy of Ant is located in
C:\Tools\apache-ant-1.9.3
Then double-click run.bat to build iweb2.jar. The build output is appended to log.txt:
Buildfile: C:\iWeb2\build\build.xml
init:
[echo] ------------------- Algorithms of the Intelligent Web ----------------------
[echo]
[echo] PATH = ${env.PATH}
[echo] CLASSPATH = ${env.CLASSPATH}
[echo] java.home = C:\Program Files\Java\jdk1.7
[echo] ant.home = C:\Tools\apache-ant-1.9.3
[echo] ------------------------------------------------------
[echo]
[echo] root: C:\iWeb2
[echo] build: C:\iWeb2/build
[echo] deploy: C:\iWeb2/deploy
compile:
[javac] C:\iWeb2\build\build.xml:105: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
[javac] Compiling 1 source file to C:\iWeb2\build\bytecode
dist:
[jar] Updating jar: C:\iWeb2\build\dist\iweb2.jar
[copy] Copying 1 file to C:\iWeb2\deploy\lib
[echo] The iweb2.jar has been created and placed in C:\iWeb2/deploy/lib
BUILD SUCCESSFUL
Total time: 1 second
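You can then double-click bsc.bat under C:\iWeb2\deploy\bin to use BeanShell. The first code listing can be pasted straight into BeanShell and run as-is.
If you really dislike BeanShell, writing your own Main class is not hard either: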
package com.hankcs;

import iweb2.ch2.shell.FetchAndProcessCrawler;
import iweb2.ch2.shell.LuceneIndexer;
import iweb2.ch2.shell.MySearcher;

/**
 * @author hankcs
 */
public class Main
{
    public static void main(String[] args)
    {
        // ------------------------------------------------------
        // Collecting data and searching with Lucene
        // ------------------------------------------------------
        //
        // -- Data (default URL list)
        //
        FetchAndProcessCrawler crawler = new FetchAndProcessCrawler("C:/iWeb2/data/ch02", 5, 200);
        crawler.setDefaultUrls();
        crawler.run();
        //
        // -- Lucene
        //
        LuceneIndexer luceneIndexer = new LuceneIndexer(crawler.getRootDir());
        luceneIndexer.run();
        MySearcher oracle = new MySearcher(luceneIndexer.getLuceneDir());
        oracle.search("armstrong", 5);
    }
}
The output is the same:
Starting url group: 1, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
DEBUG: Filtered url: 'mailto: eoswald@betanews.com'
DEBUG: Filtered url: 'mailto: sfulton@betanews.com'
DEBUG: Filtered url: 'mailto:wfaries@bloomberg.net'
DEBUG: Filtered url: 'http://www.cse.gob.ni'
Finished url group: 1, urls processed in this group: 19, current depth: 0, total urls processed: 19
Starting url group: 2, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 2, urls processed in this group: 0, current depth: 0, total urls processed: 19
Starting url group: 3, current depth: 1, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 3, urls processed in this group: 0, current depth: 1, total urls processed: 19
Timer (s): [Crawler processed data] --> 0.424
Starting the indexing ... Indexing completed!
Search results using Lucene index scores:
Query: armstrong
Document Title: Lance Armstrong meets goal in painful marathon debut
Document URL: file:/c:/iWeb2/data/ch02/sport-01.html --> Relevance Score: 0.397706508636475
_______________________________________________________________________
Document Title: New York 'tour' Lance's toughest
Document URL: file:/c:/iWeb2/data/ch02/sport-03.html --> Relevance Score: 0.312822639942169
_______________________________________________________________________
Document Title: New York City Marathon
Document URL: file:/c:/iWeb2/data/ch02/sport-02.html --> Relevance Score: 0.226110160350800
_______________________________________________________________________
Process finished with exit code 0
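To run a Main class like the one above outside BeanShell, iweb2.jar and the libraries it depends on have to be on the classpath. The following is only a rough sketch and not part of the book's code. It assumes that all required jars sit in C:\iWeb2\deploy\lib (the build log only confirms that iweb2.jar is copied there), and the names ClasspathBuilder and buildClasspath are hypothetical, made up for this illustration. It joins every .jar file in that directory into a single classpath string.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of iWeb2.
class ClasspathBuilder {

    /** Joins the absolute paths of all *.jar files in libDir, sorted by name. */
    static String buildClasspath(Path libDir) throws IOException {
        List<String> jars = new ArrayList<String>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(libDir, "*.jar")) {
            for (Path jar : stream) {
                jars.add(jar.toAbsolutePath().toString());
            }
        }
        Collections.sort(jars);

        // Use the platform separator: ';' on Windows, ':' elsewhere.
        StringBuilder classpath = new StringBuilder();
        for (String jar : jars) {
            if (classpath.length() > 0) {
                classpath.append(File.pathSeparator);
            }
            classpath.append(jar);
        }
        return classpath.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass another directory as the first argument.
        Path libDir = Paths.get(args.length > 0 ? args[0] : "C:/iWeb2/deploy/lib");
        if (!Files.isDirectory(libDir)) {
            System.err.println("Not a directory: " + libDir);
            return;
        }
        System.out.println(buildClasspath(libDir));
    }
}
```

The printed string could then be passed to java -cp, together with the directory that holds your compiled Main class.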
Could you explain how to run those Java classes? I ran into a JNI error.