Windows 7 64-bit: Installing NLTK for Python 2.7

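Installing NLTK, the toolkit used in "Natural Language Processing with Python", ran into some problems on 64-bit Windows.

My workstation runs Win7 AMD64 with 64-bit Python 2.7.6, and every Visual Studio release from VC6.0 through VS2013 is installed.

The exact Python version is Python 2.7.6 (default, Nov 10 2013, 19:24:24) [MSC v.1500 64 bit (AMD64)].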
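The version string above already says which compiler matters here: MSC v.1500 is the Visual C++ 2008 (9.0) toolchain, and "64 bit (AMD64)" means extensions must be built with its 64-bit compiler. A minimal sketch for reading the same facts from a running interpreter follows; the helper names msc_version and pointer_bits are made up for this illustration.

```python
import re
import struct
import sys

# Mapping from the "MSC v.NNNN" tag in sys.version to the Visual C++ release.
# Only the entry relevant to this post is listed.
MSC_TO_VISUAL_CPP = {1500: "Visual C++ 2008 (9.0)"}


def msc_version(version_string):
    """Return the MSC compiler number from a sys.version string, or None."""
    match = re.search(r"MSC v\.(\d+)", version_string)
    return int(match.group(1)) if match else None


def pointer_bits():
    """Return 32 or 64 depending on the running interpreter's pointer size."""
    return struct.calcsize("P") * 8


if __name__ == "__main__":
    print("interpreter bits: %d" % pointer_bits())
    msc = msc_version(sys.version)
    # On non-MSVC builds there is no MSC tag, so msc is None.
    print("built with: %s" % MSC_TO_VISUAL_CPP.get(msc, "unknown compiler"))
```

If the reported bit width is 64 and the compiler is Visual C++ 2008, the 64-bit VC 9.0 compiler discussed later in this post is the one pip will look for.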

I followed the instructions on the NLTK installation page:

Source installation (for 32-bit or 64-bit Windows)

  1. Install Python: http://www.python.org/download/releases/2.7.3/

  2. Install Numpy (optional): http://www.lfd.uci.edu/~gohlke/pythonlibs/#numpy

  3. Install Setuptools: http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11.win32-py2.7.exe

  4. Install Pip: Start>Run... c:\Python27\Scripts\easy_install pip

  5. Install PyYAML and NLTK: Start>Run... c:\Python27\Scripts\pip install pyyaml nltk

  6. Test installation: Start>All Programs>Python27>IDLE, then type import nltk

Steps 1 through 4 are straightforward, so I'll skip them. The problem is in step 5, where the command fails with this error:

Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "c:\users\admini~1\appdata\local\temp\pip_build_Administrator\pyyaml\setup.py", line 342, in <module>
    'test': test,
  File "C:\Program Files\Python27\lib\distutils\core.py", line 152, in setup
    dist.run_commands()
  File "C:\Program Files\Python27\lib\distutils\dist.py", line 953, in run_commands
    self.run_command(cmd)
  File "C:\Program Files\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "build\bdist.win-amd64\egg\setuptools\command\install.py", line 53, in run
  File "C:\Program Files\Python27\lib\distutils\command\install.py", line 563, in run
    self.run_command('build')
  File "C:\Program Files\Python27\lib\distutils\cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "C:\Program Files\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "C:\Program Files\Python27\lib\distutils\command\build.py", line 127, in run
    self.run_command(cmd_name)
  File "C:\Program Files\Python27\lib\distutils\cmd.py", line 326, in run_command
    self.distribution.run_command(command)
  File "C:\Program Files\Python27\lib\distutils\dist.py", line 972, in run_command
    cmd_obj.run()
  File "c:\users\admini~1\appdata\local\temp\pip_build_Administrator\pyyaml\setup.py", line 169, in run
    _build_ext.run(self)
  File "C:\Program Files\Python27\lib\distutils\command\build_ext.py", line 337, in run
    self.build_extensions()
  File "c:\users\admini~1\appdata\local\temp\pip_build_Administrator\pyyaml\setup.py", line 211, in build_extensions
    with_ext = self.check_extension_availability(ext)
  File "c:\users\admini~1\appdata\local\temp\pip_build_Administrator\pyyaml\setup.py", line 237, in check_extension_availability
    depends=ext.depends)
  File "C:\Program Files\Python27\lib\distutils\msvc9compiler.py", line 473, in compile
    self.initialize()
  File "C:\Program Files\Python27\lib\distutils\msvc9compiler.py", line 383, in initialize
    vc_env = query_vcvarsall(VERSION, plat_spec)
  File "C:\Program Files\Python27\lib\distutils\msvc9compiler.py", line 299, in query_vcvarsall
    raise ValueError(str(list(result.keys())))
ValueError: [u'path']
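The last frame raises ValueError(str(list(result.keys()))), so the message [u'path'] is the list of environment variables that query_vcvarsall did manage to collect. As far as I can tell, it expects four of them (include, lib, libpath and path) after running the compiler's environment script, and here only path came back. The following is a simplified sketch of that kind of check, not the distutils source; the function names and the REQUIRED_KEYS constant are assumptions made for illustration.

```python
# Variables the compiler setup is assumed to need (my reading of distutils).
REQUIRED_KEYS = ("include", "lib", "libpath", "path")


def parse_env_dump(text):
    """Collect the required variables from KEY=VALUE lines (keys lowercased)."""
    found = {}
    for line in text.splitlines():
        if "=" not in line:
            continue
        key, value = line.strip().split("=", 1)
        key = key.lower()
        if key in REQUIRED_KEYS:
            found[key] = value
    return found


def missing_keys(found):
    """Return the required variables that were not present in the dump."""
    return [key for key in REQUIRED_KEYS if key not in found]


if __name__ == "__main__":
    # An environment dump in which the compiler script set nothing useful.
    dump = "PATH=C:\\Windows\\system32\nTEMP=C:\\Temp"
    found = parse_env_dump(dump)
    print("found: %s" % sorted(found))
    print("missing: %s" % missing_keys(found))
```

Seen this way, the error means the 64-bit environment script never ran, which matches the diagnosis that follows.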

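The crux is that the script Python27/Lib/distutils/msvc9compiler.py needs to invoke the 64-bit VC2008 compiler but cannot find its environment script, so it aborts. Someone on StackOverflow has solved this problem: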

Since you're using the 64 bit version of Python, once you have installed Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1 (which installs the 64bit compiler that isn't installed when installing Visual Studio 2008 SP1 Express Edition); You need to copy the vcvars64.bat to a location where vcvarsall.bat claims it to be.

From %CSIDL_PROGRAM_FILESX86%\Microsoft Visual Studio 9.0\VC\bin\, you need to copy vcvars64.bat to amd64\vcvarsamd64.bat.

Note the amd part in the destination file name.
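The copy described in the quote can also be scripted. This is a minimal sketch, assuming the VC bin directory is passed in by the caller; copy_vcvars64 is a hypothetical helper name, and writing under Program Files will most likely need an elevated prompt.

```python
import os
import shutil


def copy_vcvars64(vc_bin_dir):
    """Copy vcvars64.bat to amd64\\vcvarsamd64.bat inside the VC bin directory.

    Returns the destination path. Raises IOError if vcvars64.bat is absent,
    which usually means the 64-bit compiler has not been installed yet.
    """
    src = os.path.join(vc_bin_dir, "vcvars64.bat")
    dst_dir = os.path.join(vc_bin_dir, "amd64")
    dst = os.path.join(dst_dir, "vcvarsamd64.bat")
    if not os.path.isfile(src):
        raise IOError("vcvars64.bat not found in %s" % vc_bin_dir)
    if not os.path.isdir(dst_dir):
        os.makedirs(dst_dir)
    shutil.copyfile(src, dst)
    return dst


# Example call (the drive letter is an assumption; adjust it to your install):
# copy_vcvars64(r"C:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin")
```

Copying rather than renaming keeps the original vcvars64.bat in place for anything else that relies on it.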

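When I originally installed Visual Studio 2008 I did not install the 64-bit compiler, so I now had to install Microsoft Windows SDK for Windows 7 and .NET Framework 3.5 SP1 separately; that package includes a 64-bit compiler. The online installer is slow from within China, so I recommend downloading the ISO instead.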

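After the installation, a lot of new files appeared under Program Files (x86)\Microsoft Visual Studio 9.0\VC\bin. One of them, vcvars64.bat, is the environment script for the 64-bit compiler. I copied vcvars64.bat to amd64\vcvarsamd64.bat under the same directory and re-ran c:\Python27\Scripts\pip install pyyaml nltk, and this time the build succeeded. The yaml.h error in the log below only makes PyYAML fall back to building without libyaml: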

Downloading/unpacking pyyaml
  Downloading PyYAML-3.11.tar.gz (248kB): 248kB downloaded
  Running setup.py egg_info for package pyyaml
Downloading/unpacking nltk
  Downloading nltk-2.0.4.tar.gz (955kB): 955kB downloaded
  Running setup.py egg_info for package nltk
    Downloading http://pypi.python.org/packages/source/d/distribute/distribute-0.6.21.tar.gz
    Extracting in c:\users\admini~1\appdata\local\temp\tmpfsknv6
    Now working in c:\users\admini~1\appdata\local\temp\tmpfsknv6\distribute-0.6.21
    Building a Distribute egg in c:\users\admini~1\appdata\local\temp\pip_build_Administrator\nltk
    c:\users\admini~1\appdata\local\temp\pip_build_Administrator\nltk\distribute-0.6.21-py2.7.egg
    warning: no previously-included files matching '*~' found anywhere in distribution
Installing collected packages: pyyaml, nltk
  Running setup.py install for pyyaml
    checking if libyaml is compilable
    D:\Program Files (x86)\Microsoft Visual Studio 9.0\VC\BIN\amd64\cl.exe /c /nologo /Ox /MD /W3 /GS- /DNDEBUG "-IC:\Program Files\Python27\include" "-IC:\Program Files\Python27\PC" /Tcbuild\temp.win-amd64-2.7\Release\check_libyaml.c /Fobuild\temp.win-amd64-2.7\Release\check_libyaml.obj
    check_libyaml.c
    build\temp.win-amd64-2.7\Release\check_libyaml.c(2) : fatal error C1083: Cannot open include file: 'yaml.h': No such file or directory
    libyaml is not found or a compiler error: forcing --without-libyaml
    (if libyaml is installed correctly, you may need to
     specify the option --include-dirs or uncomment and
     modify the parameter include_dirs in setup.cfg)
  Running setup.py install for nltk
    warning: no previously-included files matching '*~' found anywhere in distribution
Successfully installed pyyaml nltk
Cleaning up...

The command to install nltk_data is:

>>> import nltk
>>> nltk.download()

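Choosing the book collection is enough. If downloads are painfully slow from within China, an offline NLTK data package works too: once it is downloaded, unpack it into one of the nltk_data directories listed below.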

    - 'C:\\Users\\Administrator/nltk_data'
    - 'C:\\nltk_data'
    - 'D:\\nltk_data'
    - 'E:\\nltk_data'
    - 'C:\\Program Files\\Python27\\nltk_data'
    - 'C:\\Program Files\\Python27\\lib\\nltk_data'
    - 'C:\\Users\\Administrator\\AppData\\Roaming\\nltk_data'
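To see which of these directories actually exists after unpacking an offline package, a short check like the one below may help. It is a sketch under assumptions: first_existing_dir is a hypothetical helper, and the candidate list is copied by hand from the list above. With NLTK installed, nltk.data.path holds the list of directories NLTK searches on your machine and can be passed in instead.

```python
import os

# Candidate directories copied from the list above; they reflect the author's
# machine (user "Administrator") and will differ on other systems.
CANDIDATES = [
    r"C:\Users\Administrator\nltk_data",
    r"C:\nltk_data",
    r"D:\nltk_data",
    r"E:\nltk_data",
    r"C:\Program Files\Python27\nltk_data",
    r"C:\Program Files\Python27\lib\nltk_data",
    r"C:\Users\Administrator\AppData\Roaming\nltk_data",
]


def first_existing_dir(candidates):
    """Return the first path in candidates that is an existing directory."""
    for path in candidates:
        if os.path.isdir(path):
            return path
    return None


if __name__ == "__main__":
    found = first_existing_dir(CANDIDATES)
    if found is None:
        print("no nltk_data directory found; unpack the data into one of:")
        for path in CANDIDATES:
            print("  " + path)
    else:
        print("nltk_data found at " + found)
```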

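Then test it with the single line

from nltk.book import *

If it prints the list of loaded texts (shown as a screenshot in the original post), the installation is complete.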

The book's examples also use a few extension packages; the following commands download and install them automatically:

"C:\Program Files\Python27\Scripts\"easy_install NumPy

"C:\Program Files\Python27\Scripts\"easy_install matplotlib

The whole process is automatic. Python is amazing!
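A quick way to confirm that everything installed in this post can be imported is sketched below. check_modules is a hypothetical helper name; the module names are the import names of the packages mentioned above (PyYAML is imported as yaml).

```python
import importlib


def check_modules(names):
    """Try to import each module name and return a dict of name -> bool."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    results = check_modules(["yaml", "nltk", "numpy", "matplotlib"])
    for name in sorted(results):
        print("%-12s %s" % (name, "ok" if results[name] else "MISSING"))
```

Any package reported as MISSING can be reinstalled with the corresponding pip or easy_install command from this post.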

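There is only one way for an installation to succeed and countless ways for it to fail, so I'm leaving this post here as a reference.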

License: Creative Commons Attribution-NonCommercial-ShareAlike. Source: 码农场.
