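Download the latest companion code for the book from Google Code. I recommend extracting it to the C: drive; otherwise the paths become a hassle.
Before compiling, make sure Java is installed and its environment variables are configured. Adding environment variables for Apache Ant is optional, but you do have to download it. Copy the no-brainer batch script below into a new file named run.bat, and put that file in the C:\iWeb2\build directory: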
@Echo Off
title Hankcs's program
CD\
%~d0
CD %~dp0
SET ANT_HOME=C:\Tools\apache-ant-1.9.3
SET PATH=%JAVA_HOME%\bin;%ANT_HOME%\bin;%PATH%
SET CLASSPATH=
ant >> log.txt
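Here I am assuming that your Ant installation lives in C:\Tools\apache-ant-1.9.3; if it is somewhere else, change ANT_HOME in the script accordingly.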
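If the build fails right away, it may be worth checking that the locations the script relies on actually exist. Here is a minimal sketch of such a check. The class name BuildPreflight and the helper missing() are made up for illustration and are not part of the book's code. The paths are the ones assumed in this post, so adjust them to your own setup.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of iWeb2: reports which required paths are absent.
class BuildPreflight {

    /** Returns the subset of the given paths that do not exist on disk. */
    static List<Path> missing(List<Path> required) {
        List<Path> result = new ArrayList<>();
        for (Path p : required) {
            if (!Files.exists(p)) {
                result.add(p);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Path> required = new ArrayList<>();
        // Locations assumed in this post; change them if yours differ.
        required.add(Paths.get("C:\\iWeb2\\build\\build.xml"));
        required.add(Paths.get("C:\\Tools\\apache-ant-1.9.3\\bin"));

        // run.bat puts %JAVA_HOME%\bin on the PATH, so JAVA_HOME has to be set.
        String javaHome = System.getenv("JAVA_HOME");
        if (javaHome == null || javaHome.isEmpty()) {
            System.out.println("JAVA_HOME is not set");
        } else {
            required.add(Paths.get(javaHome, "bin"));
        }

        for (Path p : missing(required)) {
            System.out.println("Missing: " + p);
        }
    }
}
```

If it prints nothing, the three locations were found and JAVA_HOME is set. That does not guarantee the build succeeds, but it rules out the most common path mistakes.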
Then double-click run.bat to build iweb2.jar. The build log, which the script appends to log.txt, looks like this:
Buildfile: C:\iWeb2\build\build.xml

init:
     [echo] ------------------- Algorithms of the Intelligent Web ----------------------
     [echo]
     [echo] PATH = ${env.PATH}
     [echo] CLASSPATH = ${env.CLASSPATH}
     [echo] java.home = C:\Program Files\Java\jdk1.7
     [echo] ant.home = C:\Tools\apache-ant-1.9.3
     [echo] ------------------------------------------------------
     [echo]
     [echo] root: C:\iWeb2
     [echo] build: C:\iWeb2/build
     [echo] deploy: C:\iWeb2/deploy

compile:
    [javac] C:\iWeb2\build\build.xml:105: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 1 source file to C:\iWeb2\build\bytecode

dist:
      [jar] Updating jar: C:\iWeb2\build\dist\iweb2.jar
     [copy] Copying 1 file to C:\iWeb2\deploy\lib
     [echo] The iweb2.jar has been created and placed in C:\iWeb2/deploy/lib

BUILD SUCCESSFUL
Total time: 1 second
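After that you can double-click bsc.bat under C:\iWeb2\deploy\bin to use BeanShell. The very first code snippet can be copied straight into BeanShell and run as-is.
If you really dislike BeanShell, writing your own Main class is not hard either: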
package com.hankcs;

import iweb2.ch2.shell.FetchAndProcessCrawler;
import iweb2.ch2.shell.LuceneIndexer;
import iweb2.ch2.shell.MySearcher;

/**
 * @author hankcs
 */
public class Main
{
    public static void main(String[] args)
    {
        // ------------------------------------------------------
        // Collecting data and searching with Lucene
        // ------------------------------------------------------
        //
        // -- Data (default URL list)
        //
        FetchAndProcessCrawler crawler = new FetchAndProcessCrawler("C:/iWeb2/data/ch02",5,200);
        crawler.setDefaultUrls();
        crawler.run();
        //
        // -- Lucene
        //
        LuceneIndexer luceneIndexer = new LuceneIndexer(crawler.getRootDir());
        luceneIndexer.run();
        MySearcher oracle = new MySearcher(luceneIndexer.getLuceneDir());
        oracle.search("armstrong",5);
    }
}
The output is the same:
Starting url group: 1, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
DEBUG: Filtered url: 'mailto: eoswald@betanews.com'
DEBUG: Filtered url: 'mailto: sfulton@betanews.com'
DEBUG: Filtered url: 'mailto:wfaries@bloomberg.net'
DEBUG: Filtered url: 'http://www.cse.gob.ni'
Finished url group: 1, urls processed in this group: 19, current depth: 0, total urls processed: 19
Starting url group: 2, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 2, urls processed in this group: 0, current depth: 0, total urls processed: 19
Starting url group: 3, current depth: 1, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 3, urls processed in this group: 0, current depth: 1, total urls processed: 19
Timer (s): [Crawler processed data] --> 0.424
Starting the indexing ... Indexing completed!

Search results using Lucene index scores:
Query: armstrong

Document Title: Lance Armstrong meets goal in painful marathon debut
Document URL: file:/c:/iWeb2/data/ch02/sport-01.html --> Relevance Score: 0.397706508636475
_______________________________________________________________________
Document Title: New York 'tour' Lance's toughest
Document URL: file:/c:/iWeb2/data/ch02/sport-03.html --> Relevance Score: 0.312822639942169
_______________________________________________________________________
Document Title: New York City Marathon
Document URL: file:/c:/iWeb2/data/ch02/sport-02.html --> Relevance Score: 0.226110160350800
_______________________________________________________________________

Process finished with exit code 0
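A Main class like the one above can only run if the iweb2 classes are visible to the JVM. That means iweb2.jar (which the build copies to C:\iWeb2\deploy\lib) has to be on the classpath, and I am assuming the same goes for the third-party jars it depends on, such as Lucene. If you are not sure your IDE or command line is set up that way, a small probe along these lines may help. The class name ClasspathCheck and the method isLoadable() are my own illustration and not part of iWeb2.

```java
// Hypothetical diagnostic, not part of iWeb2: checks whether classes can be loaded.
class ClasspathCheck {

    /** Returns true if the named class can be loaded, without running its static initializers. */
    static boolean isLoadable(String className) {
        try {
            Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // Either the class itself or something it needs at load time is not on the classpath.
            return false;
        }
    }

    public static void main(String[] args) {
        // The three classes imported by the Main class in this post.
        String[] needed = {
            "iweb2.ch2.shell.FetchAndProcessCrawler",
            "iweb2.ch2.shell.LuceneIndexer",
            "iweb2.ch2.shell.MySearcher"
        };
        for (String name : needed) {
            System.out.println(name + " -> " + (isLoadable(name) ? "found" : "MISSING"));
        }
    }
}
```

Run it with the same classpath you use for Main. Any line marked MISSING suggests the jar containing that class, or one of its dependencies, has not been added yet.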
Could you explain how to run those Java classes? I am getting a JNI error.