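Introduction

gcForest is billed as a forest-based algorithm that can serve as an alternative to deep neural networks and that addresses some drawbacks of conventional deep learning, such as the large number of hyperparameters. Introductory articles of this kind are plentiful in the community and on CSDN; here is one well-written summary of its usage for reference: gcForest算法原理及Python实现 (gcForest: Algorithm Principles and Python Implementation)
This article is mainly about the pitfalls I ran into. If you hit similar problems, feel free to borrow my approach, and corrections are welcome if anything here is wrong. If it solves your problem, I would also appreciate a like.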

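Resources

GitHub source project (includes example demos)
Paper on arXiv
Demo project on GitHub, a wall surface defect detection classifier: this project compares several algorithms, including gcForest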
…

My setup

macOS Mojave / Windows 10 x86 / Windows 10 x64
Python 3.6.8 / Python 3.7.3 (the author says Python 3.5 is supported; I used 3.6 and 3.7)

Third-party dependencies

pip3 install scikit-image
pip3 install scipy
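
Before going further, it can help to confirm that these dependencies are importable from the interpreter you plan to use. The sketch below is only an illustration: missing_modules is a hypothetical helper name, and skimage is the import name that the scikit-image package installs.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# scikit-image is imported as "skimage"; scipy is imported as "scipy".
missing = missing_modules(["skimage", "scipy"])
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```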

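Problems encountered when installing and using gcForest

Installation

1. Download the project from GitHub directly or use git clone: download link
2. I downloaded the archive directly, so I simply unzipped it to get the whole project folder.
3. (1) You can open the whole project in PyCharm and try the built-in demo (a small demo on the MNIST handwritten digits); I will skip that here.
(2) Copy the entire gcforest folder from the lib folder into your Python site-packages directory. (I put it in an Anaconda virtual environment, so my path is … ▸ anaconda3 ▸ envs ▸ gcforest ▸ lib ▸ python3.6 ▸ site-packages ▸ gcforest)
4. After that you can import it directly: from gcforest.gcforest import GCForest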
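
If you are not sure where site-packages lives for the active interpreter, the standard library can report it. The following is a minimal sketch rather than part of gcForest: site_packages_dir and add_to_path are hypothetical helper names, and the "lib" argument assumes you run the script from the unzipped project root. Adding the lib folder to sys.path is an alternative to copying files, useful if you prefer not to touch site-packages.

```python
import os
import sys
import sysconfig


def site_packages_dir():
    """Return the site-packages directory of the running interpreter."""
    return sysconfig.get_paths()["purelib"]


def add_to_path(lib_dir):
    """Put lib_dir at the front of sys.path so packages inside it can be imported."""
    lib_dir = os.path.abspath(lib_dir)
    if lib_dir not in sys.path:
        sys.path.insert(0, lib_dir)
    return lib_dir


# Where the gcforest folder would be copied to:
print(site_packages_dir())

# Alternative (assumption: "lib" is the lib folder of the unzipped project):
# add_to_path("lib")
# from gcforest.gcforest import GCForest
```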

Errors

NameError: name 'basestring' is not defined

Traceback (most recent call last):
  File "demo_Defect-Detection-Classifier.py", line 110, in <module>
    X_train_enc = gc.fit_transform(X_train, y_train)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/gcforest.py", line 31, in fit_transform
    self.fg.fit_transform(X_train, y_train, X_test, y_test, train_config)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/fgnet.py", line 54, in fit_transform
    layer.fit_transform(train_config)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/layers/fg_win_layer.py", line 81, in fit_transform
    if np.all(self.check_top_cache(phases, ti)):
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/layers/base_layer.py", line 50, in check_top_cache
    top = self.data_cache.get(phase, top_name, ignore_no_exist=True)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/data_cache.py", line 83, in get
    assert isinstance(data_name, basestring), "data_name={}, type(data_name)={}".format(data_name, type(data_name))
NameError: name 'basestring' is not defined

Solution
For any error of this kind, look at the last frame of the traceback. Here it is:
File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/data_cache.py", line 83, in get assert isinstance(data_name, basestring), "data_name={}, type(data_name)={}".format(data_name, type(data_name))
Open data_cache.py in the installed package, find that line, replace basestring with str (basestring exists only in Python 2), and save the file.
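
If you would rather keep the file usable on both Python 2 and Python 3, a small compatibility alias is a common alternative to a plain search and replace. This is only a sketch under that assumption: string_types and check_data_name are hypothetical names used for illustration, not names from the gcForest source; only the assertion message mirrors the line in the traceback.

```python
# Define an alias that resolves to basestring on Python 2 and str on Python 3.
try:
    string_types = basestring  # Python 2 only
except NameError:
    string_types = str  # Python 3: basestring no longer exists


def check_data_name(data_name):
    """Hypothetical stand-in for the assertion that failed in data_cache.py."""
    assert isinstance(data_name, string_types), \
        "data_name={}, type(data_name)={}".format(data_name, type(data_name))
    return data_name


check_data_name("X")  # passes on both Python 2 and Python 3
```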

TypeError: 'float' object cannot be interpreted as an integer

Traceback (most recent call last):
  File "demo_Defect-Detection-Classifier.py", line 110, in <module>
    X_train_enc = gc.fit_transform(X_train, y_train)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/gcforest.py", line 31, in fit_transform
    self.fg.fit_transform(X_train, y_train, X_test, y_test, train_config)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/fgnet.py", line 54, in fit_transform
    layer.fit_transform(train_config)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/layers/fg_win_layer.py", line 92, in fit_transform
    X_win = get_windows(X, self.win_x, self.win_y, self.stride_x, self.stride_y, self.pad_x, self.pad_y)
  File "/Users/shin/anaconda3/envs/gcforest/lib/python3.6/site-packages/gcforest/utils/win_utils.py", line 59, in get_windows
    X_win = X_win.reshape((int(n), int(nh), nw, nc))
TypeError: 'float' object cannot be interpreted as an integer

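Cause
This error occurs because Python 3 does not accept a float where an integer is required, for example as a dimension passed to reshape. A common source of such floats is the / operator, which always returns a float in Python 3.
Solution
Again, open the file named in the last frame of the traceback, find the offending statement, and cast every argument that may be a float to int with int() before the call.
In my case the line is X_win = X_win.reshape((int(n), int(nh), nw, nc)). I had already cast n and nh, but setting a breakpoint shows that nw is also a float here, so it needs to be cast to int as well.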
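
The same TypeError can be reproduced without NumPy, which makes the cause easier to see. The sketch below is an assumption about how such a float typically arises (true division when counting sliding-window positions); window_count and the numbers 28, 8 and 2 are hypothetical and are not taken from win_utils.py.

```python
def window_count(size, win, stride):
    """Number of sliding-window positions along one axis, without padding."""
    # Floor division keeps the result an int on Python 3.
    return (size - win) // stride + 1


# True division returns a float on Python 3, even when the result is whole.
nw_float = (28 - 8) / 2 + 1  # 11.0

try:
    range(nw_float)  # range, like reshape, requires integer arguments
except TypeError as exc:
    print(exc)  # 'float' object cannot be interpreted as an integer

nw = int(nw_float)            # fix 1: cast before the call
nw_alt = window_count(28, 8, 2)  # fix 2: avoid the float in the first place
print(nw, nw_alt)
```

For non-negative values int() and // give the same result, so either fix works here; casting at the call site simply changes fewer lines of the library.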

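IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

The traceback is omitted. The fix is similar to the previous one: the cause is a float-typed value being used as an index, so cast it to int.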
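
A minimal reproduction, assuming NumPy is installed, shows both the error and the cast that resolves it. The array and index values here are made up for illustration and do not come from the gcForest source.

```python
import numpy as np

a = np.arange(10)
idx = 6 / 2  # 3.0 on Python 3, because / is true division

try:
    a[idx]  # a float is not a valid NumPy index
except IndexError as exc:
    message = str(exc)
    print(message)

value = a[int(idx)]  # casting the index to int resolves the error
print(value)
```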

Deep Forest: source-code problems encountered when using gcForest, explained and recorded