Algorithms of the Intelligent Web: Environment Setup and Using BeanShell

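Download the latest companion code for the book from Google Code. I recommend extracting it to the root of the C: drive; otherwise the paths become a hassle.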

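Before compiling, make sure Java is installed and its environment variables are configured. Setting environment variables for Apache Ant is optional, but you do have to download Ant. Copy the no-frills batch script below into a new file named run.bat and put it in the C:\iWeb2\build directory: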

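@Echo Off
REM Build iweb2.jar with Ant; the build output is appended to log.txt.
title Hankcs's program
REM Switch to the drive and directory that contain this script.
CD /D "%~dp0"
REM Adjust ANT_HOME to wherever you unpacked Apache Ant.
SET ANT_HOME=C:\Tools\apache-ant-1.9.3
REM JAVA_HOME must already point to your JDK.
SET PATH=%JAVA_HOME%\bin;%ANT_HOME%\bin;%PATH%
SET CLASSPATH=
ant >> log.txt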

Here I assume that your Ant is located at

C:\Tools\apache-ant-1.9.3
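
Before running the script, it can help to confirm that the two locations it relies on actually exist. The following is only a minimal sketch and not part of the book's code: the class name BuildPrereqCheck and its findProblems helper are hypothetical, and it assumes the same JAVA_HOME and ANT_HOME conventions as run.bat.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of the iWeb2 distribution.
class BuildPrereqCheck {

    /** Returns one human-readable message per missing prerequisite. */
    static List<String> findProblems(Map<String, String> env, String antHome) {
        List<String> problems = new ArrayList<>();

        // run.bat prepends %JAVA_HOME%\bin to PATH, so JAVA_HOME must be set.
        String javaHome = env.get("JAVA_HOME");
        if (javaHome == null || javaHome.isEmpty()) {
            problems.add("JAVA_HOME is not set");
        } else if (!new File(javaHome, "bin").isDirectory()) {
            problems.add("JAVA_HOME has no bin directory: " + javaHome);
        }

        // run.bat also prepends %ANT_HOME%\bin to PATH.
        if (!new File(antHome, "bin").isDirectory()) {
            problems.add("Ant has no bin directory: " + antHome);
        }
        return problems;
    }

    public static void main(String[] args) {
        // The default mirrors the ANT_HOME assumed in run.bat;
        // pass a different path as the first argument if yours differs.
        String antHome = args.length > 0 ? args[0] : "C:\\Tools\\apache-ant-1.9.3";
        List<String> problems = findProblems(System.getenv(), antHome);
        if (problems.isEmpty()) {
            System.out.println("Prerequisites look fine.");
        } else {
            for (String problem : problems) {
                System.out.println("Problem: " + problem);
            }
        }
    }
}
```

If it reports a problem, fix the corresponding path in run.bat or in your system environment variables before building.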

Then double-click run.bat, and iweb2.jar is built:

Buildfile: C:\iWeb2\build\build.xml

init:
     [echo] ------------------- Algorithms of the Intelligent Web ----------------------
     [echo] 
     [echo] PATH = ${env.PATH}
     [echo] CLASSPATH = ${env.CLASSPATH}
     [echo] java.home = C:\Program Files\Java\jdk1.7
     [echo] ant.home = C:\Tools\apache-ant-1.9.3
     [echo] ------------------------------------------------------
     [echo] 
     [echo] root: C:\iWeb2
     [echo] build: C:\iWeb2/build
     [echo] deploy: C:\iWeb2/deploy

compile:
    [javac] C:\iWeb2\build\build.xml:105: warning: 'includeantruntime' was not set, defaulting to build.sysclasspath=last; set to false for repeatable builds
    [javac] Compiling 1 source file to C:\iWeb2\build\bytecode

dist:
      [jar] Updating jar: C:\iWeb2\build\dist\iweb2.jar
     [copy] Copying 1 file to C:\iWeb2\deploy\lib
     [echo] The iweb2.jar has been created and placed in C:\iWeb2/deploy/lib

BUILD SUCCESSFUL
Total time: 1 second
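
To double-check the result without opening the archive by hand, the jar's contents can be inspected from Java. This is a small sketch and not part of the book's code: JarClassCounter and countClasses are hypothetical names, and the default path is taken from the build log above.

```java
import java.io.IOException;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

// Hypothetical helper for illustration; not part of the iWeb2 distribution.
class JarClassCounter {

    /** Counts the entries in the given jar whose names end with ".class". */
    static int countClasses(String jarPath) throws IOException {
        try (JarFile jar = new JarFile(jarPath)) {
            int count = 0;
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                if (entries.nextElement().getName().endsWith(".class")) {
                    count++;
                }
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        // Default location as reported by the "dist" target in the build log.
        String jarPath = args.length > 0 ? args[0] : "C:/iWeb2/deploy/lib/iweb2.jar";
        System.out.println(jarPath + " contains " + countClasses(jarPath) + " class files");
    }
}
```

An IOException here usually means the jar is missing or unreadable, in which case log.txt is the first place to look.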

You can now double-click bsc.bat under C:\iWeb2\deploy\bin to use BeanShell. The very first code snippet can be copied straight into BeanShell and run.

If you really dislike BeanShell, writing your own Main class is not hard either:

package com.hankcs;

import iweb2.ch2.shell.FetchAndProcessCrawler;
import iweb2.ch2.shell.LuceneIndexer;
import iweb2.ch2.shell.MySearcher;

/**
 * @author hankcs
 */
public class Main
{
    public static void main(String[] args)
    {
        // ------------------------------------------------------
        //   Collecting data and searching with Lucene
        // ------------------------------------------------------

        //
        // -- Data (default URL list)
        //
        // The crawl log further down reports these two arguments as maxDepth: 5 and maxDocs: 200.
        FetchAndProcessCrawler crawler = new FetchAndProcessCrawler("C:/iWeb2/data/ch02", 5, 200);
        crawler.setDefaultUrls();
        crawler.run();

        //
        // -- Lucene
        //
        // Index the crawled documents, then query the resulting index.
        LuceneIndexer luceneIndexer = new LuceneIndexer(crawler.getRootDir());
        luceneIndexer.run();

        MySearcher oracle = new MySearcher(luceneIndexer.getLuceneDir());

        oracle.search("armstrong", 5);
    }
}

The output is the same:

Starting url group: 1, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
DEBUG: Filtered url: 'mailto: eoswald@betanews.com'
DEBUG: Filtered url: 'mailto: sfulton@betanews.com'
DEBUG: Filtered url: 'mailto:wfaries@bloomberg.net'
DEBUG: Filtered url: 'http://www.cse.gob.ni'
Finished url group: 1, urls processed in this group: 19, current depth: 0, total urls processed: 19
Starting url group: 2, current depth: 0, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 2, urls processed in this group: 0, current depth: 0, total urls processed: 19
Starting url group: 3, current depth: 1, total known urls: 19, maxDepth: 5, maxDocs: 200, maxDocs per group: 50, pause between docs: 500(ms)
Finished url group: 3, urls processed in this group: 0, current depth: 1, total urls processed: 19
Timer (s): [Crawler processed data] --> 0.424
Starting the indexing ... Indexing completed! 


Search results using Lucene index scores:
Query: armstrong

Document Title: Lance Armstrong meets goal in painful marathon debut
Document URL: file:/c:/iWeb2/data/ch02/sport-01.html          -->  Relevance Score: 0.397706508636475
_______________________________________________________________________
Document Title: New York 'tour' Lance's toughest
Document URL: file:/c:/iWeb2/data/ch02/sport-03.html          -->  Relevance Score: 0.312822639942169
_______________________________________________________________________
Document Title: New York City Marathon
Document URL: file:/c:/iWeb2/data/ch02/sport-02.html          -->  Relevance Score: 0.226110160350800
_______________________________________________________________________


Process finished with exit code 0

Creative Commons Attribution-NonCommercial-ShareAlike: 码农场 » Algorithms of the Intelligent Web: Environment Setup and Using BeanShell
