Rasa_NLU_Chi Study Notes (2): A First Taste of the Fun of Training

Have the file total_word_feature_extractor_zh.dat ready and place it in the designated directory.

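The previous article in this series already covered the problems with installing everything via python setup.py install; installing the packages one at a time feels more controllable. One of the packages that has to be installed is mitie.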

Running pip install mitie fails with the following error:

  Building wheel for mitie (setup.py) ... error
  ERROR: Command errored out with exit status 1:
   command: 'd:\programs\python\python38\python.exe' -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-2kau9q3r\\mitie_ce10b05bf4a748c7b8d0ddce9b0d68b0\\setup.py'"'"'; __file__='"'"'C:\\Users\\Administrator\\AppData\\Local\\Temp\\pip-install-2kau9q3r\\mitie_ce10b05bf4a748c7b8d0ddce9b0d68b0\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' bdist_wheel -d 'C:\Users\Administrator\AppData\Local\Temp\pip-wheel-1fhfv6eq'
       cwd: C:\Users\Administrator\AppData\Local\Temp\pip-install-2kau9q3r\mitie_ce10b05bf4a748c7b8d0ddce9b0d68b0\
  Complete output (40 lines):
  running bdist_wheel
  running build
  Traceback (most recent call last):
    File "C:\Users\Administrator\AppData\Local\Temp\pip-install-2kau9q3r\mitie_ce10b05bf4a748c7b8d0ddce9b0d68b0\setup.py", line 44, in get_cmake_version
      out = subprocess.check_output(['cmake', '--version'])
    File "d:\programs\python\python38\lib\subprocess.py", line 411, in check_output
      return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
    File "d:\programs\python\python38\lib\subprocess.py", line 489, in run
      with Popen(*popenargs, **kwargs) as process:
    File "d:\programs\python\python38\lib\subprocess.py", line 854, in __init__
      self._execute_child(args, executable, preexec_fn, close_fds,
    File "d:\programs\python\python38\lib\subprocess.py", line 1307, in _execute_child
      hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
  FileNotFoundError: [WinError 2] The system cannot find the file specified.
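
The traceback ends in subprocess.check_output(['cmake', '--version']) raising FileNotFoundError, which is what Python raises when the executable itself cannot be found. A minimal sketch of the same probe that returns None instead of crashing is shown here; the helper name tool_version is made up for illustration and is not part of MITIE's setup.py.

```python
import subprocess


def tool_version(executable, flag="--version"):
    """Return the first line printed by `executable flag`.

    Returns None if the program cannot be started or exits with an error.
    Hypothetical helper for illustration; not part of MITIE's setup.py.
    """
    try:
        out = subprocess.check_output([executable, flag], stderr=subprocess.STDOUT)
    except (OSError, subprocess.CalledProcessError):
        # OSError covers FileNotFoundError, the error seen in the log above.
        return None
    lines = out.decode(errors="replace").splitlines()
    return lines[0].strip() if lines else ""


if __name__ == "__main__":
    version = tool_version("cmake")
    if version is None:
        print("cmake could not be run; it is probably not on PATH")
    else:
        print(version)
```

Running a check like this before pip install mitie makes the missing prerequisite visible without having to read a 40-line traceback.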

Installing with pip install git+https://github.com/mit-nlp/MITIE.git looked more promising. However, the install is very slow and hangs for long stretches with no output:

C:\Users\Administrator>pip install git+https://github.com/mit-nlp/MITIE.git
Looking in indexes: https://pypi.tuna.tsinghua.edu.cn/simple
Collecting git+https://github.com/mit-nlp/MITIE.git
  Cloning https://github.com/mit-nlp/MITIE.git to c:\users\administrator\appdata\local\temp\pip-req-build-bgrgbbiq
  Running command git clone -q https://github.com/mit-nlp/MITIE.git 'C:\Users\Administrator\AppData\Local\Temp\pip-req-build-bgrgbbiq'
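
Since the clone is the step that stalls, one thing that may help is a shallow clone, which fetches only the most recent commit, followed by a pip install from the local directory. The sketch below only builds and runs those two commands. It assumes git is on PATH and that the project builds from a checkout without full history; the function names are illustrative.

```python
import subprocess
import sys

MITIE_REPO = "https://github.com/mit-nlp/MITIE.git"  # URL taken from the post


def shallow_clone_command(repo_url, dest):
    """Build a `git clone` command that fetches only the latest commit."""
    return ["git", "clone", "--depth", "1", repo_url, dest]


def local_install_command(source_dir):
    """Build a pip command that installs from an already-downloaded directory."""
    return [sys.executable, "-m", "pip", "install", source_dir]


def clone_and_install(dest="MITIE"):
    """Run both steps; check=True raises CalledProcessError if either one fails."""
    subprocess.run(shallow_clone_command(MITIE_REPO, dest), check=True)
    subprocess.run(local_install_command(dest), check=True)
```

This only shortens the download. The build itself has the same prerequisites as any other install route.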

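Although the console shows no download progress, you can open the target directory and watch the folder size in its properties to confirm that the download is under way. The first attempt failed after 12 minutes, and an interrupted download cannot be resumed. Several further attempts all failed as well. So I tried downloading directly from GitHub instead, https://github.com/mit-nlp/MITIE: download and unzip the archive, open cmd in that folder, and run python setup.py install. On Windows this approach fails too: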

running install
running build
Traceback (most recent call last):
  File "setup.py", line 44, in get_cmake_version
    out = subprocess.check_output(['cmake', '--version'])
  File "D:\Programs\Python\Python38\lib\subprocess.py", line 411, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
  File "D:\Programs\Python\Python38\lib\subprocess.py", line 489, in run
    with Popen(*popenargs, **kwargs) as process:
  File "D:\Programs\Python\Python38\lib\subprocess.py", line 854, in __init__
    self._execute_child(args, executable, preexec_fn, close_fds,
  File "D:\Programs\Python\Python38\lib\subprocess.py", line 1307, in _execute_child
    hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "setup.py", line 51, in <module>
    setup(
  File "D:\Programs\Python\Python38\lib\distutils\core.py", line 148, in setup
    dist.run_commands()
  File "D:\Programs\Python\Python38\lib\distutils\dist.py", line 966, in run_commands
    self.run_command(cmd)
  File "D:\Programs\Python\Python38\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "D:\Programs\Python\Python38\lib\distutils\command\install.py", line 545, in run
    self.run_command('build')
  File "D:\Programs\Python\Python38\lib\distutils\cmd.py", line 313, in run_command
    self.distribution.run_command(command)
  File "D:\Programs\Python\Python38\lib\distutils\dist.py", line 985, in run_command
    cmd_obj.run()
  File "setup.py", line 16, in run
    if LooseVersion(self.get_cmake_version()) < '3.1.0':
  File "setup.py", line 47, in get_cmake_version
    ", ".join(e.name for e in self.extensions))
  File "D:\Programs\Python\Python38\lib\distutils\cmd.py", line 103, in __getattr__
    raise AttributeError(attr)
AttributeError: extensions
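
This second log is a chained exception: the code handling the original FileNotFoundError (cmake missing) itself fails with AttributeError: extensions, so the last line printed is not the root cause. The sketch below reproduces that pattern with stand-in functions (all names are hypothetical) and walks the chain back to the first error.

```python
def find_tool():
    # Stand-in for the failing `cmake --version` probe.
    raise FileNotFoundError("cmake not found")


def report_failure():
    # Stand-in for an error handler that has a bug of its own.
    raise AttributeError("extensions")


def build():
    try:
        find_tool()
    except OSError:
        report_failure()  # raises while the first exception is still being handled


def root_cause(exc):
    """Follow __cause__ / __context__ links back to the earliest exception."""
    seen = set()
    while id(exc) not in seen:
        seen.add(id(exc))
        earlier = exc.__cause__ or exc.__context__
        if earlier is None:
            break
        exc = earlier
    return exc


try:
    build()
except AttributeError as err:
    first = root_cause(err)
    print(type(err).__name__, "<-", type(first).__name__)
    # prints: AttributeError <- FileNotFoundError
```

When reading a traceback that contains "During handling of the above exception, another exception occurred", the topmost block is usually the one to fix first.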

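It appears that before MITIE can be installed on Windows, parts of the build environment have to be set up first, including cl from VS, boost, and cmake.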

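Before installing boost, verify the VS environment variable: add the VS tools path E:\Program Files (x86)\Microsoft Visual Studio\2019\Community\VC\Tools\MSVC\14.28.29333\bin\Hostx64\x64 to the environment variables (PATH), open cmd in that directory, and type 'cl'. If no error is reported, the configuration works.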
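
The manual check can also be scripted. The sketch below uses shutil.which to look for an executable either on the current PATH or in one given directory. find_executable and prepend_to_path are illustrative helper names, and the MSVC directory should be replaced with the one on your own machine.

```python
import os
import shutil


def find_executable(name, directory=None):
    """Return the full path of `name`, or None if it cannot be found.

    With `directory` given, that directory is searched instead of PATH
    (on Windows the current directory may also be consulted).
    """
    return shutil.which(name, path=directory)


def prepend_to_path(directory):
    """Put `directory` at the front of PATH for this process only."""
    os.environ["PATH"] = directory + os.pathsep + os.environ.get("PATH", "")


if __name__ == "__main__":
    for tool in ("cl", "cmake"):
        location = find_executable(tool)
        print(f"{tool}: {location or 'not found on PATH'}")
```

Note that prepend_to_path affects only the running Python process and the child processes it starts. A permanent change still has to be made in the Windows environment variable settings.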

Installing boost: cd into E:\develop-environment\boost_1_67_0\tools\build and run bootstrap.bat. This still fails, with the message "Failed to bootstrap the build engine Please consult bootstrap.log for furter diagnostics."

So I went to the boost website and found a newer version, boost_1_75_0. Running bootstrap.bat from E:\develop-environment\boost_1_75_0\tools\build succeeded.

Next, cd into E:\项目备份\Rasa_NLU_Chi\MITIE-master\examples and run python setup.py install; this time it goes through.

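If you run train.py from PyCharm, the run parameters have to be set correctly: -c ../sample_configs/config_jieba_mitie_sklearn.yml --data ../data/examples/rasa/demo-rasa_zh.json --path models --project Rasa_NLU_Chi. The models directory sits at the same level as train.py, and ../ refers to the parent directory of train.py. Because Windows paths differ from Linux paths, the path entries in the relevant places of the config file also have to be changed.

[Figure: screenshot of the modified path entries in the config file]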
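
To make the run parameters concrete, here is a sketch of how such an argument string is split and how the ../ paths resolve against the directory that holds train.py. The parser is a stand-in that only mirrors the four flags quoted in the run parameters; it is not Rasa NLU's actual argument parser, and the script directory /work/Rasa_NLU_Chi/rasa_nlu is a hypothetical location.

```python
import argparse
import posixpath
import shlex

RUN_ARGS = (
    "-c ../sample_configs/config_jieba_mitie_sklearn.yml "
    "--data ../data/examples/rasa/demo-rasa_zh.json "
    "--path models --project Rasa_NLU_Chi"
)


def build_parser():
    """Stand-in parser exposing the same four flags as the run configuration."""
    parser = argparse.ArgumentParser()
    parser.add_argument("-c", dest="config", required=True)
    parser.add_argument("--data", required=True)
    parser.add_argument("--path", default="models")
    parser.add_argument("--project", default=None)
    return parser


def resolve(script_dir, relative):
    """Join `relative` onto `script_dir` and collapse any `..` components."""
    return posixpath.normpath(posixpath.join(script_dir, relative))


if __name__ == "__main__":
    args = build_parser().parse_args(shlex.split(RUN_ARGS))
    script_dir = "/work/Rasa_NLU_Chi/rasa_nlu"  # hypothetical folder containing train.py
    for name in ("config", "data", "path"):
        print(name, "->", resolve(script_dir, getattr(args, name)))
```

Relative paths are resolved against the working directory of the run, not against the script file as such. The example therefore assumes the working directory is the folder containing train.py, which is typically what a PyCharm run configuration uses unless changed.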

Once the changes are made, the wonderful journey of training can begin. The run output is as follows:

Training to recognize 2 labels: 'food', 'disease'
Part I: train segmenter
words in dictionary: 200000
num features: 271
now do training
C:           20
epsilon:     0.01
num threads: 1
cache size:  5
max iterations: 2000
loss per missed segment:  3
C: 20   loss: 3 	0.444444
C: 35   loss: 3 	0.444444
C: 20   loss: 4.5 	0.555556
C: 5   loss: 3 	0.444444
C: 20   loss: 1.5 	0.444444
C: 20   loss: 6 	0.555556
C: 20   loss: 5.25 	0.555556
C: 21.5   loss: 4.65 	0.555556
C: 16.9684   loss: 4.72073 	0.555556
C: 18.2577   loss: 4.43072 	0.555556
C: 18.2131   loss: 4.55681 	0.555556
C: 20   loss: 4.4 	0.555556
C: 20.9694   loss: 4.47547 	0.555556
best C: 20
best loss: 4.5
num feats in chunker model: 4095
train: precision, recall, f1-score: 1 1 1 
Part I: elapsed time: 3 seconds.

Part II: train segment classifier
now do training
num training samples: 9
C: 200   f-score: 1
C: 400   f-score: 1
C: 300   f-score: 1
C: 100   f-score: 1
C: 0.01   f-score: 1
C: 50.005   f-score: 1
C: 25.0075   f-score: 1
C: 12.5088   f-score: 1
C: 6.25938   f-score: 1
C: 3.13469   f-score: 1
C: 1.57234   f-score: 1
C: 0.791172   f-score: 1
C: 0.400586   f-score: 1
best C: 0.791172
test on train: 
3 0 
0 6 

overall accuracy: 1
Part II: elapsed time: 3 seconds.
df.number_of_classes(): 2
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
Fitting 2 folds for each of 6 candidates, totalling 12 fits
[Parallel(n_jobs=1)]: Done  12 out of  12 | elapsed:    0.0s finished
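
The log can be sanity-checked with a few lines of Python. The sketch below parses the Part I search rows and recomputes the overall accuracy from the Part II confusion matrix. It only reads the log and does not reproduce MITIE's search. The third column of the Part I rows is unlabelled in the output; treating it as a score to maximise is an assumption, chosen because it is consistent with the reported best C: 20 and best loss: 4.5.

```python
import re

# Part I search rows copied from the log: C, loss, and an unlabelled third column.
PART1_ROWS = """
C: 20   loss: 3     0.444444
C: 35   loss: 3     0.444444
C: 20   loss: 4.5   0.555556
C: 5   loss: 3      0.444444
C: 20   loss: 1.5   0.444444
C: 20   loss: 6     0.555556
C: 20   loss: 5.25  0.555556
C: 21.5   loss: 4.65    0.555556
C: 16.9684   loss: 4.72073  0.555556
C: 18.2577   loss: 4.43072  0.555556
C: 18.2131   loss: 4.55681  0.555556
C: 20   loss: 4.4   0.555556
C: 20.9694   loss: 4.47547  0.555556
"""

# Confusion matrix printed under "test on train" in Part II.
CONFUSION = [[3, 0], [0, 6]]

ROW = re.compile(r"C:\s*([\d.]+)\s+loss:\s*([\d.]+)\s+([\d.]+)")


def parse_rows(text):
    """Return a list of (C, loss, score) tuples as floats."""
    return [tuple(float(v) for v in match) for match in ROW.findall(text)]


def best_row(rows):
    """Pick the row with the highest third column; max() keeps the first of any ties."""
    return max(rows, key=lambda row: row[2])


def accuracy(matrix):
    """Overall accuracy = correctly classified (diagonal) / all samples."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


if __name__ == "__main__":
    rows = parse_rows(PART1_ROWS)
    print(len(rows), best_row(rows))  # 13 (20.0, 4.5, 0.555556)
    print(accuracy(CONFUSION))        # 1.0
```

The confusion matrix sums to 9, matching "num training samples: 9", and every sample lies on the diagonal, which is why the overall accuracy is 1. A perfect score on nine training samples says little about how the model will behave on unseen text.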


Reposted from blog.csdn.net/dragon_T1985/article/details/114303311