1. Introduction to LightGBM
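The arrival of xgboost let hyperparameter tuners move on from traditional machine learning algorithms such as RF, GBM, SVM, and LASSO. LightGBM, a newer boosting framework from Microsoft, goes a step further and is gradually starting to take over xgboost's position as the leading open-source framework.
Compared with xgboost, LightGBM trains much faster while reaching comparable accuracy. On usability and features it first seemed to lag behind xgboost: I did not find important features such as cv and early stopping on my first pass, although the Python package does provide both (lightgbm.cv, and early stopping in lightgbm.train). Some language interfaces also still need work (as of 2018-05-12, the only API I saw in the official wiki was the Python one).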
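2. Installing LightGBM
See the official wiki: https://lightgbm.readthedocs.io/en/latest/Installation-Guide.html# and ApacheCN's Chinese translation of it: http://lightgbm.apachecn.org/zh/2.0.11/index.html . The installation went as follows.
First check whether cmake is installed on the system: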
yuhuiliu@sinclab-desktop:~$ sudo apt-get install cmake
[sudo] password for yuhuiliu:
Reading package lists... Done
Building dependency tree
Reading state information... Done
cmake is already the newest version (3.5.1-1ubuntu3).
The following packages were automatically installed and are no longer required:
  bbswitch-dkms dkms lib32gcc1 libc6-i386 libjansson4 libvdpau1 libxnvctrl0 mesa-vdpau-drivers screen-resolution-extra vdpau-driver-all xserver-xorg-legacy
Use 'sudo apt autoremove' to remove them.
0 upgraded, 0 newly installed, 0 to remove and 8 not upgraded.
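The same prerequisite check can be scripted. The sketch below is only an illustration: the helper name missing_tools and the tool list (git, cmake, make) are my assumptions, chosen because those are the commands the build steps in this post use. It only looks the tools up on PATH and installs nothing.

```python
import shutil


def missing_tools(tools=("git", "cmake", "make")):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All build tools found.")
```

An empty list means the build can be attempted. Anything that is listed would still have to be installed with the system package manager, as apt-get does above.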
Since apt reports that cmake is already installed, run the following commands:
git clone --recursive https://github.com/Microsoft/LightGBM ; cd LightGBM
mkdir build ; cd build
cmake ..
make -j4
Once LightGBM has compiled, follow https://stackoverflow.com/questions/44212706/why-importerror-no-module-named-lightgbm (which I had read earlier): go into LightGBM/python-package and run setup.py:
yuhuiliu@sinclab-desktop:~/LightGBM/python-package$ sudo python3 setup.py install running install creating compile/include creating compile/include/LightGBM copying ../include/LightGBM/tree_learner.h -> ./compile/include/LightGBM copying ../include/LightGBM/application.h -> ./compile/include/LightGBM copying ../include/LightGBM/metric.h -> ./compile/include/LightGBM copying ../include/LightGBM/export.h -> ./compile/include/LightGBM copying ../include/LightGBM/feature_group.h -> ./compile/include/LightGBM creating compile/include/LightGBM/utils copying ../include/LightGBM/utils/random.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/threading.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/file_io.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/openmp_wrapper.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/log.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/array_args.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/common.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/text_reader.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/utils/pipeline_reader.h -> ./compile/include/LightGBM/utils copying ../include/LightGBM/prediction_early_stop.h -> ./compile/include/LightGBM copying ../include/LightGBM/tree.h -> ./compile/include/LightGBM copying ../include/LightGBM/objective_function.h -> ./compile/include/LightGBM copying ../include/LightGBM/c_api.h -> ./compile/include/LightGBM copying ../include/LightGBM/network.h -> ./compile/include/LightGBM copying ../include/LightGBM/lightgbm_R.h -> ./compile/include/LightGBM copying ../include/LightGBM/R_object_helper.h -> ./compile/include/LightGBM copying ../include/LightGBM/config.h -> ./compile/include/LightGBM copying ../include/LightGBM/json11.hpp -> ./compile/include/LightGBM copying ../include/LightGBM/dataset.h -> ./compile/include/LightGBM 
copying ../include/LightGBM/meta.h -> ./compile/include/LightGBM copying ../include/LightGBM/bin.h -> ./compile/include/LightGBM copying ../include/LightGBM/boosting.h -> ./compile/include/LightGBM copying ../include/LightGBM/dataset_loader.h -> ./compile/include/LightGBM creating compile/src creating compile/src/metric copying ../src/metric/map_metric.hpp -> ./compile/src/metric copying ../src/metric/binary_metric.hpp -> ./compile/src/metric copying ../src/metric/metric.cpp -> ./compile/src/metric copying ../src/metric/regression_metric.hpp -> ./compile/src/metric copying ../src/metric/xentropy_metric.hpp -> ./compile/src/metric copying ../src/metric/multiclass_metric.hpp -> ./compile/src/metric copying ../src/metric/dcg_calculator.cpp -> ./compile/src/metric copying ../src/metric/rank_metric.hpp -> ./compile/src/metric copying ../src/c_api.cpp -> ./compile/src creating compile/src/treelearner copying ../src/treelearner/gpu_tree_learner.h -> ./compile/src/treelearner copying ../src/treelearner/split_info.hpp -> ./compile/src/treelearner copying ../src/treelearner/leaf_splits.hpp -> ./compile/src/treelearner copying ../src/treelearner/feature_parallel_tree_learner.cpp -> ./compile/src/treelearner copying ../src/treelearner/data_partition.hpp -> ./compile/src/treelearner copying ../src/treelearner/serial_tree_learner.cpp -> ./compile/src/treelearner copying ../src/treelearner/voting_parallel_tree_learner.cpp -> ./compile/src/treelearner copying ../src/treelearner/feature_histogram.hpp -> ./compile/src/treelearner copying ../src/treelearner/gpu_tree_learner.cpp -> ./compile/src/treelearner copying ../src/treelearner/serial_tree_learner.h -> ./compile/src/treelearner creating compile/src/treelearner/ocl copying ../src/treelearner/ocl/histogram64.cl -> ./compile/src/treelearner/ocl copying ../src/treelearner/ocl/histogram16.cl -> ./compile/src/treelearner/ocl copying ../src/treelearner/ocl/histogram256.cl -> ./compile/src/treelearner/ocl copying 
../src/treelearner/parallel_tree_learner.h -> ./compile/src/treelearner copying ../src/treelearner/data_parallel_tree_learner.cpp -> ./compile/src/treelearner copying ../src/treelearner/tree_learner.cpp -> ./compile/src/treelearner creating compile/src/network copying ../src/network/linker_topo.cpp -> ./compile/src/network copying ../src/network/socket_wrapper.hpp -> ./compile/src/network copying ../src/network/linkers_mpi.cpp -> ./compile/src/network copying ../src/network/linkers_socket.cpp -> ./compile/src/network copying ../src/network/linkers.h -> ./compile/src/network copying ../src/network/network.cpp -> ./compile/src/network copying ../src/main.cpp -> ./compile/src creating compile/src/boosting copying ../src/boosting/score_updater.hpp -> ./compile/src/boosting copying ../src/boosting/goss.hpp -> ./compile/src/boosting copying ../src/boosting/dart.hpp -> ./compile/src/boosting copying ../src/boosting/gbdt_model_text.cpp -> ./compile/src/boosting copying ../src/boosting/boosting.cpp -> ./compile/src/boosting copying ../src/boosting/rf.hpp -> ./compile/src/boosting copying ../src/boosting/prediction_early_stop.cpp -> ./compile/src/boosting copying ../src/boosting/gbdt_prediction.cpp -> ./compile/src/boosting copying ../src/boosting/gbdt.h -> ./compile/src/boosting copying ../src/boosting/gbdt.cpp -> ./compile/src/boosting creating compile/src/application copying ../src/application/predictor.hpp -> ./compile/src/application copying ../src/application/application.cpp -> ./compile/src/application copying ../src/lightgbm_R.cpp -> ./compile/src creating compile/src/objective copying ../src/objective/xentropy_objective.hpp -> ./compile/src/objective copying ../src/objective/regression_objective.hpp -> ./compile/src/objective copying ../src/objective/objective_function.cpp -> ./compile/src/objective copying ../src/objective/binary_objective.hpp -> ./compile/src/objective copying ../src/objective/rank_objective.hpp -> ./compile/src/objective copying 
../src/objective/multiclass_objective.hpp -> ./compile/src/objective creating compile/src/io copying ../src/io/parser.hpp -> ./compile/src/io copying ../src/io/parser.cpp -> ./compile/src/io copying ../src/io/dense_nbits_bin.hpp -> ./compile/src/io copying ../src/io/dense_bin.hpp -> ./compile/src/io copying ../src/io/ordered_sparse_bin.hpp -> ./compile/src/io copying ../src/io/tree.cpp -> ./compile/src/io copying ../src/io/sparse_bin.hpp -> ./compile/src/io copying ../src/io/file_io.cpp -> ./compile/src/io copying ../src/io/metadata.cpp -> ./compile/src/io copying ../src/io/dataset.cpp -> ./compile/src/io copying ../src/io/config.cpp -> ./compile/src/io copying ../src/io/json11.cpp -> ./compile/src/io copying ../src/io/bin.cpp -> ./compile/src/io copying ../src/io/dataset_loader.cpp -> ./compile/src/io copying ../windows/LightGBM.sln -> ./compile/windows copying ../windows/LightGBM.vcxproj -> ./compile/windows copying ../CMakeLists.txt -> ./compile/ copying ../LICENSE -> ./ INFO:LightGBM:Starting to compile the library. INFO:LightGBM:Starting to compile with CMake. 
running build running build_py INFO:root:Generating grammar tables from /usr/lib/python3.5/lib2to3/Grammar.txt INFO:root:Generating grammar tables from /usr/lib/python3.5/lib2to3/PatternGrammar.txt running egg_info writing dependency_links to lightgbm.egg-info/dependency_links.txt writing top-level names to lightgbm.egg-info/top_level.txt writing lightgbm.egg-info/PKG-INFO writing requirements to lightgbm.egg-info/requires.txt reading manifest file 'lightgbm.egg-info/SOURCES.txt' reading manifest template 'MANIFEST.in' no previously-included directories found matching 'build' warning: no files found matching '*.txt' warning: no files found matching '*.so' under directory 'lightgbm' warning: no files found matching '*.dll' under directory 'compile/Release' warning: no files found matching '*' under directory 'compile/compute' warning: no files found matching 'LightGBM.vcxproj.filters' under directory 'compile/windows' warning: no files found matching '*.dll' under directory 'compile/windows/x64/DLL' warning: no previously-included files matching '*.py[co]' found anywhere in distribution writing manifest file 'lightgbm.egg-info/SOURCES.txt' running install_lib creating /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/VERSION.txt -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/__init__.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/plotting.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/engine.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/compat.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/callback.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/libpath.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/basic.py -> /usr/local/lib/python3.5/dist-packages/lightgbm copying build/lib/lightgbm/sklearn.py -> 
/usr/local/lib/python3.5/dist-packages/lightgbm INFO:root:Installing lib_lightgbm from: ['../lib_lightgbm.so', 'compile/lib_lightgbm.so'] copying ../lib_lightgbm.so -> /usr/local/lib/python3.5/dist-packages/lightgbm byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/__init__.py to __init__.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/plotting.py to plotting.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/engine.py to engine.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/compat.py to compat.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/callback.py to callback.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/libpath.py to libpath.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/basic.py to basic.cpython-35.pyc byte-compiling /usr/local/lib/python3.5/dist-packages/lightgbm/sklearn.py to sklearn.cpython-35.pyc running install_egg_info Copying lightgbm.egg-info to /usr/local/lib/python3.5/dist-packages/lightgbm-2.1.1.egg-info running install_scripts
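According to the guide in the wiki, the setup should be complete at this point and the lightgbm library should be ready to use.
Sadly, conda list | grep lig does not show any lightgbm package:
yuhuiliu@sinclab-desktop:~$ conda list |grep lig
yuhuiliu@sinclab-desktop:~$
Next, try importing it in a Python session: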
yuhuiliu@sinclab-desktop:~$ python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import lightgbm as lgb
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ModuleNotFoundError: No module named 'lightgbm'
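Sure enough, there is no lightgbm module.
Frustratingly, I deleted the build/ directory and rebuilt several times, and each time the 'lightgbm' package was still missing. I also checked the cmake version. The second listing below is from sinc-server, a workstation where the setup succeeded, shown for comparison: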
yuhuiliu@sinclab-desktop:~$ sudo apt-get install cmake
[sudo] password for yuhuiliu:
Reading package lists... Done
Building dependency tree
Reading state information... Done
cmake is already the newest version (3.5.1-1ubuntu3).
yuhuiliu@sinc-server:~$ dpkg -l |grep cmak
ii  cmake       3.5.1-1ubuntu1  amd64  cross-platform, open-source make system
ii  cmake-data  3.5.1-1ubuntu1  all    CMake data files (modules, templates and documentation)
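When a package installs without errors but cannot be imported, it helps to ask the interpreter itself where it is looking. In the logs above, setup.py copied files into /usr/local/lib/python3.5/dist-packages, while the python prompt reports Anaconda's Python 3.6.4. The following is a minimal diagnostic sketch. The helper name locate_module is hypothetical, and it uses only the standard library.

```python
import importlib.util
import sys


def locate_module(name):
    """Return the file the current interpreter would import `name` from,
    or None if this interpreter cannot find the module."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # Which Python binary is actually running, and which version is it?
    print("interpreter:", sys.executable)
    print("version:", sys.version.split()[0])
    # None here means this interpreter cannot see lightgbm.
    print("lightgbm found at:", locate_module("lightgbm"))
```

Running it once with python and once with python3 should show whether the two commands refer to different interpreters with different site-packages directories.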
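At this point it is worth changing approach. Note that the install log above copied the package into /usr/local/lib/python3.5/dist-packages, which belongs to the system Python 3.5 invoked by sudo python3, whereas python on this machine is Anaconda's Python 3.6.4. That most likely explains why neither conda list nor import can see it. The Python Package Introduction in the wiki (http://lightgbm.apachecn.org/zh/2.0.11/Python-Intro.html#id1) shows that lightgbm is available from the pip package index, so we can install it with pip: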
yuhuiliu@sinclab-desktop:~$ pip install lightgbm
The directory '/home/yuhuiliu/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/yuhuiliu/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting lightgbm
Downloading https://files.pythonhosted.org/packages/bf/01/45e209af10fd16537df0c5d8a5474c286554c3eaf9ddb0ce04113f1e8506/lightgbm-2.1.1-py2.py3-none-manylinux1_x86_64.whl (711kB)
100% |████████████████████████████████| 716kB 709kB/s
Requirement already satisfied: numpy in ./anaconda3/lib/python3.6/site-packages (from lightgbm)
Requirement already satisfied: scikit-learn in ./anaconda3/lib/python3.6/site-packages (from lightgbm)
Requirement already satisfied: scipy in ./anaconda3/lib/python3.6/site-packages (from lightgbm)
Installing collected packages: lightgbm
Successfully installed lightgbm-2.1.1
You are using pip version 9.0.1, however version 10.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
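One way to make sure pip installs into the interpreter that will later run import lightgbm is to invoke pip as a module of that interpreter. The sketch below only builds the command. The helper name pip_install_command is hypothetical, and actually running the install is left as a commented-out step.

```python
import sys


def pip_install_command(package, python=sys.executable):
    """Build a pip install command bound to one specific Python interpreter."""
    return [python, "-m", "pip", "install", package]


if __name__ == "__main__":
    cmd = pip_install_command("lightgbm")
    print(" ".join(cmd))
    # To actually perform the install, uncomment the next two lines:
    # import subprocess
    # subprocess.run(cmd, check=True)
```

Because the command starts with sys.executable, the package lands in the site-packages of the interpreter running the script, not in whichever pip happens to come first on PATH.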
Check again:
yuhuiliu@sinclab-desktop:~$ conda list |grep lig
lightgbm                  2.1.1                     <pip>
lightgbm is now installed successfully:
yuhuiliu@sinclab-desktop:~$ python
Python 3.6.4 |Anaconda, Inc.| (default, Jan 16 2018, 18:10:19)
[GCC 7.2.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import lightgbm as lgb
>>>
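A clean import shows that the module can be found, but not that the compiled library works. As a slightly stronger check, here is a small smoke test that trains a tiny regression model on synthetic data. This is a sketch under assumptions: the function name smoke_test, the random data, and the parameter values are all made up for illustration. It returns None if lightgbm or numpy cannot be loaded.

```python
def smoke_test(n_rows=200, n_rounds=5):
    """Train a tiny LightGBM regressor on synthetic data and return its
    predictions, or None if lightgbm (or numpy) cannot be loaded."""
    try:
        import numpy as np
        import lightgbm as lgb
    except (ImportError, OSError):
        # OSError covers a shared library that is present but fails to load.
        return None

    rng = np.random.RandomState(0)
    X = rng.rand(n_rows, 4)                   # four synthetic features
    y = X[:, 0] + 0.1 * rng.rand(n_rows)      # target depends on feature 0

    train_set = lgb.Dataset(X, label=y)
    params = {"objective": "regression", "verbose": -1}
    booster = lgb.train(params, train_set, num_boost_round=n_rounds)
    return booster.predict(X)


if __name__ == "__main__":
    preds = smoke_test()
    if preds is None:
        print("lightgbm is not usable from this interpreter")
    else:
        print("trained OK, number of predictions:", len(preds))
```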