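MindSpore Detailed Tutorial

I. References

Zhang Xiaobai: A quick hands-on with MindSpore V1.0 (for Ubuntu 18.04)

Zhang Xiaobai: How to install MindSpore V1.0 from source on Ubuntu 18.04

Key documentation

MindSpore Operator Crowdsourcing - Wiki - Gitee.com

MindSpore Documentation - Gitee.com

II. Terminology

  • NNACL: a high-performance kernel computing library that supports several convolution optimization algorithms, including Sliding Window, Im2Col+GEMM, and Winograd.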

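III. Important notes

  • MindSpore can run a whole network on different hardware platforms, but it does not support running different partitions of the same network on different hardware platforms. This differs from TensorFlow's heterogeneous graph-partition execution mode.

  • The default number of build threads in build.sh is 8. On a less powerful build machine this can cause build errors; add -j{threads} to the command to reduce the thread count, e.g. bash build.sh -e ascend -j4

  • Testing shows that MindSpore versions and CANN versions are not strictly one-to-one: a single CANN version may support several MindSpore versions.

    Test device: Ascend 710 server;

    CANN version: CANN 5.0.4;

    Package whl Binary
    MS1.9.0 × ×
    MS1.8.1 ×
    MS1.8.0 ×
    MS1.7.1 × WARNING
    MS1.7.0 × WARNING
    MS1.6.2 ×
  • Both the CPU build and the Ascend build of the MindSpore package need to be aligned with the CANN version.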

IV. Background

1. MindSpore architecture

2. Heterogeneous computing in MindSpore

Adding a new hardware backend (device target) to MindSpore

How to add a new hardware backend to MindSpore? Quickly set up a test environment!

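MindSpore supports heterogeneous compute: besides the Ascend NPU, built on Huawei's in-house Da Vinci architecture, it can also run operators on CPUs (e.g. MKLDNN) and GPUs (e.g. CUDA kernels).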

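Note: MindSpore can run a whole network on different hardware platforms, but it does not support running different partitions of the same network on different hardware platforms. This differs from TensorFlow's heterogeneous graph-partition execution mode.

In summary, every operator in the model must be supported by the chosen backend hardware. For example, to run the whole network on an Ascend device, all operators in the model must be supported on Ascend. Partial support is not enough, and heterogeneous execution across backends is not supported (you cannot run some operators on the CPU and the rest on Ascend).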

3. Distributed training on GPU with MindSpore

MindSpore Made Easy, In-Depth Series: Model training with GPU distributed parallel training

[AI Engineering] 05: Distributed training of the ResNet-50 model with MindSpore in practice

4. End-to-end LSTM deployment with MindSpore Lite

5. Automatic differentiation

6. Automatic parallelism

7. Target backends supported by MindSpore

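# mindspore/akg/python/akg/utils/validation_check.py
# CCE, CUDA and LLVM are target-name constants defined elsewhere in AKG (not shown in this excerpt).
def get_backend(target_):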
    target = target_.split()[0]
    if target == CCE:
        return "Ascend"
    elif target == CUDA:
        return "GPU"
    elif target == LLVM:
        return "CPU"
    return "UNKNOWN"
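
To try the mapping above outside the AKG source tree, here is a minimal self-contained sketch. The string values assigned to CCE, CUDA and LLVM are assumptions made for illustration; the excerpt does not show how AKG defines them, and the example target strings are hypothetical.

```python
# Assumed values for the target-name constants (not shown in the excerpt above).
CCE = "cce"
CUDA = "cuda"
LLVM = "llvm"


def get_backend(target_):
    """Map a target string such as "cuda -arch=sm_70" to a backend name."""
    # Only the first whitespace-separated token identifies the target kind.
    target = target_.split()[0]
    if target == CCE:
        return "Ascend"
    if target == CUDA:
        return "GPU"
    if target == LLVM:
        return "CPU"
    return "UNKNOWN"


# Hypothetical target strings, used only to exercise each branch.
for example in ("cce", "cuda -arch=sm_70", "llvm", "opencl"):
    print(example, "->", get_backend(example))
```

Anything after the first token is ignored, so extra target options do not change the result.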

V. MindSpore Serving

Deploying an inference service with MindSpore Serving

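VI. Custom operator development

1. GPU operator development

End-to-end GPU operator development guide.pdf

End-to-end development guide for GPU heterogeneous operators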

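2. CPU operator development

Open Source Summer: End-to-end development guide for MindSpore CPU operators

3. Operator front-end definition

[04. Operator front-end definition - Wiki - Gitee.com](https://gitee.com/david-he91/mindspore/wikis/MindSpore算子众智/Ascend计算算子/Ascend计算算子接入指南/04. 算子前端定义)

4. Example: the BartlettWindow operator

GPU operator implementation BartlettWindow

[Crowdsourcing][Compute, GPU development] BartlettWindow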

VII. MindIR intermediate representation

1. Reference material

MindIR intermediate representation

How to read IR files

2. Bibliography

[1] C. Click and M. Paleczny. A simple graph-based intermediate representation. SIGPLAN Not., 30:35–49, March 1995.

[2] Roland Leißa, Marcel Köster, and Sebastian Hack. A graph-based higher-order intermediate representation. In Proceedings of the 13th Annual IEEE/ACM International Symposium on Code Generation and Optimization, pages 202–212. IEEE Computer Society, 2015.

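3. Important notes

After several compiler optimization passes, nodes may have been transformed (for example by operator splitting or operator fusion), so a node's source-code call-stack information may not map one-to-one to the script. Treat it only as an aid.

4. Background

An intermediate representation (IR) is the program representation that sits between the source language and the target language during compilation, so that the compiler can analyze and optimize the program. IR design therefore has to weigh how hard it is to convert from the source language to the target language, as well as the ease of use and performance of program analysis and optimization.

MindIR is a graph-based functional IR whose core purpose is to serve the automatic differentiation transformation. Automatic differentiation uses a transformation method based on a functional programming framework, so the IR adopts semantics close to ANF (A-normal form). It also borrows from the designs of Sea of Nodes [1] and Thorin [2] and uses a representation based on an explicit dependency graph.

When a model written in MindSpore runs in graph mode, context.set_context(mode=context.GRAPH_MODE), and the configuration sets context.set_context(save_graphs=True), the run outputs intermediate files generated during graph compilation. These are called IR files.

There are currently three main IR file formats:

  • IR files ending in .ir: text format. A fairly intuitive, readable text description of the model structure that can be opened directly in a text editor.

  • IR files ending in .dat: text format. A more rigorously defined description of the model structure than the .ir format, with richer content. It can also be opened directly in a text editor.

  • IR files ending in .dot: graphical format. Describes the topology between nodes. graphviz can take this file as input and generate an image so that users can inspect the model structure visually. For models with many operators, the visualization component MindInsight is recommended for viewing the computational graph.

    A .dot file can be converted to an image format with graphviz for viewing. For example, the command to convert dot to png is dot -Tpng *.dot -o *.png

For small networks, the more intuitive graphical format is recommended. For large networks, the more efficient text format is recommended.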
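
When a directory holds many .dot files, it can be convenient to generate one dot -Tpng command per file. The following is a minimal sketch of that idea, not part of MindSpore: the helper names and the ir_files directory are hypothetical, and it assumes the graphviz dot executable is on PATH when render_all is called.

```python
import shutil
import subprocess
from pathlib import Path


def dot_to_png_commands(ir_dir):
    """Build one `dot -Tpng <input> -o <output>` command per .dot file in ir_dir."""
    commands = []
    for dot_file in sorted(Path(ir_dir).glob("*.dot")):
        png_file = dot_file.with_suffix(".png")
        commands.append(["dot", "-Tpng", str(dot_file), "-o", str(png_file)])
    return commands


def render_all(ir_dir):
    """Run every command; needs the graphviz `dot` executable on PATH."""
    if shutil.which("dot") is None:
        raise RuntimeError("graphviz 'dot' was not found on PATH")
    for command in dot_to_png_commands(ir_dir):
        subprocess.run(command, check=True)


if __name__ == "__main__":
    ir_dir = Path("ir_files")  # hypothetical directory holding saved IR files
    if ir_dir.is_dir():
        for command in dot_to_png_commands(ir_dir):
            print(" ".join(command))
```

Building the command list separately from running it makes it easy to preview what would be executed before rendering a large graph.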

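4.1 How to save IR

In the training script train.py, add the following set_context calls. When the training script runs, MindSpore automatically stores the IR files produced during compilation in the specified path.

from mindspore import context

if __name__ == "__main__":
    # Run in graph mode and save the IR files produced during compilation
    context.set_context(mode=context.GRAPH_MODE)
    context.set_context(save_graphs=True, save_graphs_path="path/to/ir/files")

4.2 IR files from each compilation stage

After running the training command, the following files are generated in the specified path. IR files whose names start with a number and an underscore are output while ME compiles the graph; each pipeline stage saves the computational graph once.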

.
├──00_parse_0000.dot
├──00_parse_0001.ir
├──00_parse_0002.dat
├──01_symbol_resolve_0003.dot
├──01_symbol_resolve_0004.ir
├──01_symbol_resolve_0005.dat
├──02_combine_like_graphs_0006.dot
├──02_combine_like_graphs_0007.ir
├──02_combine_like_graphs_0008.dat
├──03_inference_opt_prepare_0009.dot
├──03_inference_opt_prepare_0010.ir
├──03_inference_opt_prepare_0011.dat
├──04_abstract_specialize_0012.dot
├──04_abstract_specialize_0013.ir
├──04_abstract_specialize_0014.dat
...
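  1. The parse stage parses the entry construct function;
  2. The symbol_resolve stage recursively resolves the other functions and objects that the entry function references directly or indirectly;
  3. The abstract_specialize (graph evaluate) stage infers the data type and shape of every node from the input information;
  4. The optimize stage mainly performs hardware-independent optimizations; automatic differentiation and automatic parallelism are also expanded in this stage;
  5. The validate stage checks the compiled computational graph;
  6. The task_emit stage passes the computational graph to the backend for further processing;
  7. The execute stage executes the computational graph.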
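
To see which formats were saved for each stage, the file names can be grouped by their stage prefix. This is a minimal sketch, assuming the naming pattern visible in the listing above (two-digit stage index, stage name, four-digit sequence number, extension); the pattern is inferred from those names rather than taken from MindSpore documentation, and group_by_stage is a hypothetical helper.

```python
import re
from collections import defaultdict

# Pattern inferred from the listing: <stage index>_<stage name>_<sequence>.<ext>
IR_NAME = re.compile(r"^(\d{2})_(.+)_(\d{4})\.(ir|dat|dot)$")


def group_by_stage(filenames):
    """Return {(stage_index, stage_name): [extensions]} for names that match."""
    stages = defaultdict(list)
    for name in filenames:
        match = IR_NAME.match(name)
        if match is None:
            continue  # skip files that are not pipeline IR dumps
        index, stage, _sequence, ext = match.groups()
        stages[(int(index), stage)].append(ext)
    return dict(sorted(stages.items()))


names = [
    "00_parse_0000.dot", "00_parse_0001.ir", "00_parse_0002.dat",
    "01_symbol_resolve_0003.dot", "01_symbol_resolve_0004.ir",
    "04_abstract_specialize_0014.dat",
]
for (index, stage), exts in group_by_stage(names).items():
    print(index, stage, exts)
```

In practice the list of names could come from os.listdir on the directory passed as save_graphs_path.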

VIII. MindSpore ModelZoo

IX. FAQ

Q:libmindspore_common.so: undefined symbol: PyExc_ValueError

MindSpore installation: import mindspore reports ImportError lib/libmindspore_backend.so undefined symbol

root@root:/home/ascend310_single_op_sample/build# ./tensor_add_sample
./tensor_add_sample: symbol lookup error: /data/YOYOFile/output-ascend/mindspore_ascend-1.8.2-linux_aarch64/lib/libmindspore_common.so: undefined symbol: PyExc_ValueError
root@44ff3bb0ad1d:/data/YOYOFile/Downloads/ascend310_single_op_sample# ./tensor_add_sample
./tensor_add_sample: symbol lookup error: /data/YOYOFile/output-ascend19/mindspore_ascend-1.9.1-linux_aarch64/lib/libmindspore_common.so: undefined symbol: PyExc_ValueError
root@44ff3bb0ad1d:/data/YOYOFile/mindspore# python -c "import mindspore;mindspore.run_check()"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/__init__.py", line 19, in <module>
    from mindspore import common, dataset, mindrecord, train, log
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/common/__init__.py", line 17, in <module>
    from mindspore.common import dtype
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/common/dtype.py", line 23, in <module>
    from mindspore._c_expression import typing
ImportError: /root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/lib/libmindspore_common.so: undefined symbol: _ZN9mindspore5trace15DumpSourceLinesB5cxx11EPNS_7AnfNodeE
root@44ff3bb0ad1d:/data/YOYOFile/Downloads# python -c "import mindspore;mindspore.run_check()"
[WARNING] ME(3898:281473105558080,MainProcess):2022-12-29-06:48:32.687.133 [mindspore/run_check/_check_version.py:296] MindSpore version 1.9.0 and Ascend AI software package (Ascend Data Center Solution)version 1.81 does not match, the version of software package expect one of ['1.83'], please reference to the match info on: https://www.mindspore.cn/install
[ERROR] ME(3898,ffff90787a40,python):2022-12-29-06:48:32.877.015 [mindspore/ccsrc/runtime/hardware/device_context_manager.cc:46] LoadDynamicLib] Load dynamic library libmindspore_ascend failed, returns [/root/miniconda3/envs/ms18/lib/python3.9/site-packages/mindspore/lib/plugin/libmindspore_ascend.so: undefined symbol: MsprofRegisterCallback].
Segmentation fault (core dumped)
(ms18) root@44ff3bb0ad1d:/data/YOYOFile/Downloads# python -c "import mindspore;mindspore.run_check()"
[WARNING] ME(5911:281473596037696,MainProcess):2022-12-29-06:53:15.411.655 [mindspore/run_check/_check_version.py:306] MindSpore version 1.7.1 and "hccl" wheel package version 1.80 does not match, reference to the match info on: https://www.mindspore.cn/install
...
...
...
ImportError: cannot import name 'get_L1_info' from partially initialized module 'tbe.common.buildcfg' (most likely due to a circular import) (/root/miniconda3/envs/ms18/lib/python3.9/site-packages/tbe/common/buildcfg/__init__.py)
Segmentation fault (core dumped)
root@44ff3bb0ad1d:/data/YOYOFile/Downloads/ascend310_single_op_sample# ./tensor_add_sample
...
...
...
ImportError: cannot import name 'get_L1_info' from partially initialized module 'tbe.common.buildcfg' (most likely due to a circular import) (/usr/local/python3.9.2/lib/python3.9/site-packages/tbe/common/buildcfg/__init__.py)
[ERROR] ME(10365,ffffa095b010,tensor_add_sample):2022-12-29-07:03:39.769.291 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:136] BuildAirModel] Call aclgrphBuildInitialize fail.
[ERROR] ME(10365,ffffa095b010,tensor_add_sample):2022-12-29-07:03:39.769.391 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:228] operator()] Convert model from MindIR to OM failed
[ERROR] ME(10365,ffffa095b010,tensor_add_sample):2022-12-29-07:03:39.769.410 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:140] ChildProcess] Child process process failed
[WARNING] ME(10323,ffff9fba0f50,tensor_add_sample):2022-12-29-07:03:39.832.702 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:227] HeartbeatThreadFuncInner] Peer stopped
[ERROR] ME(10323,ffffa095b010,tensor_add_sample):2022-12-29-07:03:39.833.743 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:208] operator()] Receive result model from child process failed
[ERROR] ME(10323,ffffa095b010,tensor_add_sample):2022-12-29-07:03:39.833.782 [mindspore/ccsrc/cxx_api/model/model_converter_utils/multi_process.cc:118] ParentProcess] Parent process process failed
[ERROR] ME(10323,ffffa095b010,tensor_add_sample):2022-12-29-07:03:40.833.964 [mindspore/ccsrc/cxx_api/model/acl/model_converter.cc:242] LoadMindIR] Convert MindIR model to OM model failed
[ERROR] ME(10323,ffffa095b010,tensor_add_sample):2022-12-29-07:03:40.833.992 [mindspore/ccsrc/cxx_api/model/acl/acl_model.cc:79] Build] Load MindIR failed.
Build model failed.
Cause:
The MindSpore version does not match the CANN version
MindSpore version: 1.8.2
CANN version: 5.0.4

Solution:
Align the versions: https://www.mindspore.cn/versions

Q:libmindspore.so: undefined symbol: _ZN2ge8Operator21DynamicOutputRegisterEPKcjS2_b

root@44ff3bb0ad1d:/data/YOYOFile/Downloads/ascend310_single_op_sample# ./tensor_add_sample
./tensor_add_sample: symbol lookup error: /data/YOYOFile/Downloads/mindspore_ascend-1.9.0-linux_aarch64/lib/libmindspore.so: undefined symbol: _ZN2ge8Operator21DynamicOutputRegisterEPKcjS2_b

CANN version: 5.0.4
MindSpore version: 1.9.0
root@44ff3bb0ad1d:/data/YOYOFile/Downloads/ascend310_single_op_sample# ./tensor_add_sample
[WARNING] GE_ADPT(55906,ffff78c7f010,tensor_add_sample):2022-12-28-14:42:14.556.150 [mindspore/ccsrc/transform/graph_ir/graph_runner.cc:55] NewSession] no GE client, return nullptr!
[WARNING] GE_ADPT(55906,ffff78c7f010,tensor_add_sample):2022-12-28-14:42:14.556.190 [mindspore/ccsrc/transform/graph_ir/df_graph_manager.cc:157] SetGeSession] You are adding a empty Ge Session
[WARNING] GE_ADPT(55906,ffff78c7f010,tensor_add_sample):2022-12-28-14:42:14.556.202 [mindspore/ccsrc/transform/graph_ir/graph_runner.cc:55] NewSession] no GE client, return nullptr!
[WARNING] GE_ADPT(55906,ffff78c7f010,tensor_add_sample):2022-12-28-14:42:14.556.213 [mindspore/ccsrc/transform/graph_ir/graph_runner.cc:70] GraphRunner] graph runner sess_ is nullptr!
3
5
7
9

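CANN version: 5.0.4
MindSpore version: 1.7.0, 1.7.1
Cause:
The MindSpore version does not match the CANN version
Testing shows that MindSpore versions and CANN versions are not strictly one-to-one: a single CANN version may support several MindSpore versions
In testing, CANN 5.0.4 supports MindSpore 1.8.1, 1.8.0, and 1.6.2; it also supports MindSpore 1.7.0 and 1.7.1, but with WARNING messages; it does not support MindSpore 1.9

Solution:
Align the versions: https://www.mindspore.cn/versions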
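
The test results reported in this FAQ can be kept as a small lookup table for quick reference. The sketch below encodes only what this article reports for CANN 5.0.4; it is not an official compatibility matrix, and the names CANN_5_0_4_RESULTS and result_for are hypothetical. For authoritative pairings, use the versions page linked above.

```python
# Results reported in this article for CANN 5.0.4 (not an official matrix).
CANN_5_0_4_RESULTS = {
    "1.9.0": "unsupported",       # undefined symbol errors
    "1.8.2": "unsupported",       # reported in the first FAQ entry
    "1.8.1": "ok",
    "1.8.0": "ok",
    "1.7.1": "ok_with_warnings",
    "1.7.0": "ok_with_warnings",
    "1.6.2": "ok",
}


def result_for(mindspore_version):
    """Return the reported result, or "untested" for versions not covered here."""
    return CANN_5_0_4_RESULTS.get(mindspore_version, "untested")


for version in ("1.9.0", "1.8.1", "1.7.0", "2.0.0"):
    print(version, "->", result_for(version))
```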

Q:Load dynamic library libmindspore_ascend failed,xxx,libmindspore_ascend.so: undefined symbol

MindSpore installation: import mindspore reports ImportError lib/libmindspore_backend.so undefined symbol

[WARNING] ME(14012:281473224216768,MainProcess):2022-12-27-03:38:17.214.201 [mindspore/run_check/_check_version.py:296] MindSpore version 1.9.0 and Ascend AI software package (Ascend Data Center Solution)version 1.80 does not match, the version of software package expect one of ['1.83'], please reference to the match info on: https://www.mindspore.cn/install
[ERROR] ME(14012,ffff978b10c0,python):2022-12-27-03:38:17.385.372 [mindspore/ccsrc/runtime/hardware/device_context_manager.cc:46] LoadDynamicLib] Load dynamic library libmindspore_ascend failed, returns [/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/lib/plugin/libmindspore_ascend.so: undefined symbol: MsprofRegisterCallback].
Segmentation fault (core dumped)
[WARNING] ME(67272:281473573595840,MainProcess):2022-12-26-06:54:49.521.689 [mindspore/run_check/_check_version.py:289] MindSpore version 1.8.1 and Ascend AI software package (Ascend Data Center Solution)version 1.80 does not match, the version of software package expect one of ['1.82'], please reference to the match info on: https://www.mindspore.cn/install
[WARNING] ME(67272:281473573595840,MainProcess):2022-12-26-06:54:49.630.068 [mindspore/run_check/_check_version.py:380] Can not find ccec_compiler(need by mindspore-ascend), please check if you have set env PATH, you can reference to the installation guidelines https://www.mindspore.cn/install
[WARNING] ME(67272:281473573595840,MainProcess):2022-12-26-06:54:49.630.313 [mindspore/run_check/_check_version.py:391] Can not find driver so(need by mindspore-ascend), please check if you have set env LD_LIBRARY_PATH, you can reference to the installation guidelines https://www.mindspore.cn/install
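Cause:
The MindSpore version does not match the CANN (Ascend software package) version

The installed CANN version is 5.0.4, which corresponds to Ascend software package version 1.80 and does not match MindSpore 1.9
To check the Ascend software package version, see /usr/local/Ascend/ascend-toolkit/latest/opp/version.info

Option 1:
Downgrade MindSpore

Option 2:
Upgrade CANN, which also upgrades the Ascend software package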
Q:Required program tclsh not found

Installing tclsh

CMake Error at cmake/check_requirements.cmake:13 (message):
  Required program tclsh not found, please install the package and try
  building MindSpore again.
Call Stack (most recent call first):
  cmake/check_requirements.cmake:70 (find_required_program)
  CMakeLists.txt:13 (include)
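Cause:
tclsh is not installed

Option 1:
sudo apt-get install tcl -y

Option 2 (build from source):
# Download the source tarball from:
# http://www.tcl.tk/software/tcltk/downloadnow85.html

# Extract it (assumes the tarball was saved to /tmp)
cd /tmp
tar -xzvf tcl8.5.19-src.tar.gz

# Configure
cd /tmp/tcl8.5.19/unix
./configure --prefix=/usr/local/tcl

# Build and install
make -j8
make install

# Create a symbolic link
ln -s /usr/local/tcl/bin/tclsh8.5 /usr/bin/tclsh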

Q:CMake Error: The current CMakeCache.txt directory xxx/CMakeCache.txt is different than the directory xxx where CMakeCache.txt was created.

Other similar problems can be solved the same way

pkg name:openssl,openssl
openssl config hash: cd9481f957e0f1cdfb5490048524b3da
_FIND:/data/YOYOFile/mindspore/build/mindspore/.mslib/openssl_1.1.1k_cd9481f957e0f1cdfb5490048524b3da
not find ssl in path: /data/YOYOFile/mindspore/build/mindspore/.mslib/openssl_1.1.1k_cd9481f957e0f1cdfb5490048524b3da/lib
download:  , openssl , https://gitee.com/mirrors/openssl/repository/archive/OpenSSL_1_1_1k.tar.gz
-- Populating openssl
CMake Error: The current CMakeCache.txt directory /data/YOYOFile/mindspore/build/mindspore/_deps/openssl-subbuild/CMakeCache.txt is different than the directory /home/yoyo/MyDocuments/mindspore/build/mindspore/_deps/openssl-subbuild where CMakeCache.txt was created. This may result in binaries being created in the wrong place. If you are not sure, reedit the CMakeCache.txt
CMake Error at /usr/local/share/cmake-3.23/Modules/FetchContent.cmake:1076 (message):
  CMake step for openssl failed: 1
Call Stack (most recent call first):
  /usr/local/share/cmake-3.23/Modules/FetchContent.cmake:1217:EVAL:2 (__FetchContent_directPopulate)
  /usr/local/share/cmake-3.23/Modules/FetchContent.cmake:1217 (cmake_language)
  cmake/utils.cmake:79 (FetchContent_Populate)
  cmake/utils.cmake:287 (__download_pkg)
  cmake/external_libs/openssl.cmake:83 (mindspore_add_pkg)
  cmake/mind_expression.cmake:26 (include)
  CMakeLists.txt:73 (include)
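Cause:
The CMakeCache.txt file conflicts with the current build directory: it was created under a different path

Solution:
Delete the stale CMakeCache.txt in the current build directory named in the error message, then rebuild
rm /data/YOYOFile/mindspore/build/mindspore/_deps/openssl-subbuild/CMakeCache.txt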
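
If a source tree was moved or copied, more than one sub-build may carry a stale cache. The sketch below lists every CMakeCache.txt whose recorded CMAKE_CACHEFILE_DIR entry no longer matches the directory the file actually lives in. It is an illustration under stated assumptions, not part of the MindSpore build: find_stale_caches and the build directory name are hypothetical, and the script only reports files, leaving deletion to you.

```python
import os


def find_stale_caches(build_dir):
    """Return CMakeCache.txt paths whose recorded directory differs from their location."""
    stale = []
    for root, _dirs, files in os.walk(build_dir):
        if "CMakeCache.txt" not in files:
            continue
        cache_path = os.path.join(root, "CMakeCache.txt")
        recorded = None
        with open(cache_path, encoding="utf-8", errors="replace") as handle:
            for line in handle:
                # CMake records the creation directory as CMAKE_CACHEFILE_DIR:INTERNAL=<dir>
                if line.startswith("CMAKE_CACHEFILE_DIR:"):
                    recorded = line.split("=", 1)[1].strip()
                    break
        if recorded is not None and os.path.realpath(recorded) != os.path.realpath(root):
            stale.append(cache_path)
    return stale


if __name__ == "__main__":
    # "build" is a placeholder for the MindSpore build directory.
    for path in find_stale_caches("build"):
        print("stale cache:", path)
```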

Q:git lfs found, but lfs files is not downloaded, you can perform the following steps

-- Warning: /home/yoyo/MyDocuments/mindspore/akg/prebuild/x86_64/analyze_align_dynamic.cc.o is not a valid binary file.
-- Warning: git lfs found, but lfs files is not downloaded, you can perform the following steps:
            1. Download files tracked by git lfs, executing the following commands:
               cd /home/yoyo/MyDocuments/mindspore/akg
               git lfs pull
            2. Re-compile the source codes
Cause:
The large files tracked by git lfs were not downloaded and must be fetched manually

Solution:
cd /home/yoyo/MyDocuments/mindspore/akg
git lfs pull

Q:MindSpore relies on the 3 whl packages of "te", "topi" and "hccl" in the "fwkacllib" folder of the Ascend AI software package (Ascend Data Center Solution), please check whether they are installed correctly or not

[WARNING] ME(54846:281473061026496,MainProcess):2022-12-28-01:26:28.408.98 [mindspore/run_check/_check_version.py:302] CheckFailed:  ("No module named 'hccl'",)
MindSpore relies on the 3 whl packages of "te", "topi" and "hccl" in the "fwkacllib" folder of the Ascend AI software package (Ascend Data Center Solution), please check whether they are installed correctly or not, reference to the match info on: https://www.mindspore.cn/install
[WARNING] ME(54846:281473061026496,MainProcess):2022-12-28-01:26:28.411.36 [mindspore/run_check/_check_version.py:305] Please pay attention to the above warning, countdown: 3
[WARNING] ME(54846:281473061026496,MainProcess):2022-12-28-01:26:29.422.96 [mindspore/run_check/_check_version.py:305] Please pay attention to the above warning, countdown: 2
[WARNING] ME(54846:281473061026496,MainProcess):2022-12-28-01:26:30.435.65 [mindspore/run_check/_check_version.py:305] Please pay attention to the above warning, countdown: 1
/root/miniconda3/envs/ms19/lib/python3.9/site-packages/numpy/core/getlimits.py:500: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/root/miniconda3/envs/ms19/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float64'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
/root/miniconda3/envs/ms19/lib/python3.9/site-packages/numpy/core/getlimits.py:500: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  setattr(self, word, getattr(machar, word).flat[0])
/root/miniconda3/envs/ms19/lib/python3.9/site-packages/numpy/core/getlimits.py:89: UserWarning: The value of the smallest subnormal for <class 'numpy.float32'> type is zero.
  return self._float_to_str(self.smallest_subnormal)
Traceback (most recent call last):
  File "/data/YOYOFile/PyProject/20221226/test_add_aot.py", line 3, in <module>
    import mindspore as ms
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/__init__.py", line 18, in <module>
    from . import common, dataset, mindrecord, train, log
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/__init__.py", line 44, in <module>
    from .engine import *
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/engine/__init__.py", line 25, in <module>
    from ..callback import DSCallback, WaitedDSCallback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/callback/__init__.py", line 16, in <module>
    from .ds_callback import DSCallback, WaitedDSCallback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/callback/ds_callback.py", line 20, in <module>
    from mindspore.train.callback import Callback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/__init__.py", line 20, in <module>
    from .model import Model
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/model.py", line 24, in <module>
    from .serialization import save_checkpoint
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/serialization.py", line 38, in <module>
    import mindspore.nn as nn
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/__init__.py", line 20, in <module>
    from . import layer, loss, optim, metrics, wrap, grad, probability, sparse, dynamic_lr,\
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/layer/__init__.py", line 20, in <module>
    from . import activation, normalization, container, conv, basic, embedding, pooling, image, quant, math, \
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/layer/activation.py", line 23, in <module>
    from mindspore.ops import functional as F
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/__init__.py", line 29, in <module>
    from . import composite, operations, functional
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/composite/__init__.py", line 23, in <module>
    from .base import GradOperation, _Grad, HyperMap, Map, MultitypeFuncGraph, add_flags, \
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/composite/base.py", line 28, in <module>
    from ..operations import _grad_ops
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/operations/__init__.py", line 91, in <module>
    from . import _quant_ops
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/operations/_quant_ops.py", line 26, in <module>
    import mindspore.ops._op_impl._custom_op
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/__init__.py", line 17, in <module>
    from .batchnorm_fold import _batchnorm_fold_tbe
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold.py", line 20, in <module>
    from te import tvm
ImportError: cannot import name 'tvm' from 'te' (/root/miniconda3/envs/ms19/lib/python3.9/site-packages/te/__init__.py)
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/__init__.py", line 18, in <module>
    from . import common, dataset, mindrecord, train, log
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/__init__.py", line 44, in <module>
    from .engine import *
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/engine/__init__.py", line 25, in <module>
    from ..callback import DSCallback, WaitedDSCallback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/callback/__init__.py", line 16, in <module>
    from .ds_callback import DSCallback, WaitedDSCallback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/dataset/callback/ds_callback.py", line 20, in <module>
    from mindspore.train.callback import Callback
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/__init__.py", line 20, in <module>
    from .model import Model
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/model.py", line 24, in <module>
    from .serialization import save_checkpoint
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/train/serialization.py", line 38, in <module>
    import mindspore.nn as nn
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/__init__.py", line 20, in <module>
    from . import layer, loss, optim, metrics, wrap, grad, probability, sparse, dynamic_lr,\
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/layer/__init__.py", line 20, in <module>
    from . import activation, normalization, container, conv, basic, embedding, pooling, image, quant, math, \
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/nn/layer/activation.py", line 23, in <module>
    from mindspore.ops import functional as F
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/__init__.py", line 29, in <module>
    from . import composite, operations, functional
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/composite/__init__.py", line 23, in <module>
    from .base import GradOperation, _Grad, HyperMap, Map, MultitypeFuncGraph, add_flags, \
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/composite/base.py", line 28, in <module>
    from ..operations import _grad_ops
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/operations/__init__.py", line 91, in <module>
    from . import _quant_ops
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/operations/_quant_ops.py", line 26, in <module>
    import mindspore.ops._op_impl._custom_op
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/__init__.py", line 17, in <module>
    from .batchnorm_fold import _batchnorm_fold_tbe
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/_op_impl/_custom_op/batchnorm_fold.py", line 20, in <module>
    from te import tvm
ImportError: cannot import name 'tvm' from 'te' (/root/miniconda3/envs/ms19/lib/python3.9/site-packages/te/__init__.py)
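Cause:
The whl packages shipped with the Ascend AI processor software have not been installed in the Anaconda virtual environment

Solution:
Install the whl packages shipped with the Ascend AI processor software in the Anaconda virtual environment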

pip install sympy
pip install /usr/local/Ascend/ascend-toolkit/latest/lib64/topi-*-py3-none-any.whl
pip install /usr/local/Ascend/ascend-toolkit/latest/lib64/te-*-py3-none-any.whl
pip install /usr/local/Ascend/ascend-toolkit/latest/lib64/hccl-*-py3-none-any.whl
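
After installing, a quick way to confirm that the active environment can locate the three packages named in the warning is to look them up without importing them. This is a minimal sketch: missing_modules is a hypothetical helper, and it only checks that each package can be found, not that its version matches MindSpore.

```python
import importlib.util

# The three packages named in the MindSpore warning above.
REQUIRED = ("te", "topi", "hccl")


def missing_modules(names=REQUIRED):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print("missing:", missing_modules())
```

Run it with the Python interpreter of the same conda environment that runs MindSpore, since each environment has its own site-packages.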

Q:Ascend error occurred, error message: EE9999: Inner Error!...rtStreamCreateWithFlags execute failed, reason=[feature not support]

[CRITICAL] DEVICE(61636,ffff821b9ac0,python):2022-12-28-01:35:47.099.789 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:1164] InitDevice] Call rtStreamCreate, ret[207000]
[CRITICAL] DEVICE(61636,ffff821b9ac0,python):2022-12-28-01:35:47.100.156 [mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:357] Init] Ascend error occurred, error message: EE9999: Inner Error!
        Unsupport flags, flags=4[FUNC:StreamCreate][FILE:api_error.cc][LINE:258]
        rtStreamCreateWithFlags execute failed, reason=[feature not support][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:45]

First error scene API: mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:1164 InitDevice] Call rtStreamCreate, ret[207000]
Traceback (most recent call last):
  File "/data/YOYOFile/PyProject/20221226/test_add_aot.py", line 14, in <module>
    output = op(ms.Tensor(x0), ms.Tensor(x1))
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/primitive.py", line 280, in __call__
    return _run_op(self, self.name, args)
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/common/api.py", line 61, in wrapper
    results = fn(*arg, **kwargs)
  File "/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/primitive.py", line 719, in _run_op
    output = real_run_op(obj, op_name, args)
RuntimeError: mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:357 Init] Ascend error occurred, error message: EE9999: Inner Error!
        Unsupport flags, flags=4[FUNC:StreamCreate][FILE:api_error.cc][LINE:258]
        rtStreamCreateWithFlags execute failed, reason=[feature not support][FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:45]

First error scene API: mindspore/ccsrc/runtime/device/ascend/ascend_kernel_runtime.cc:1164 InitDevice] Call rtStreamCreate, ret[207000]
Cause:
Older MindSpore versions do not support newer features. MindSpore 1.6 (Ascend build) does not yet support aot-type custom operators

Solution:
Upgrade CANN and upgrade MindSpore

Q:ImportError: cannot import name 'ms_kernel' from 'mindspore.ops'

    from mindspore.ops import ms_kernel
ImportError: cannot import name 'ms_kernel' from 'mindspore.ops' (/root/miniconda3/envs/ms19/lib/python3.9/site-packages/mindspore/ops/__init__.py)
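Cause:
Older MindSpore versions do not support newer features. MindSpore 1.6 (Ascend build) does not yet support hybrid-type custom operators, so the ms_kernel decorator cannot be used

Solution:
Upgrade CANN and upgrade MindSpore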

Q:could not find a recent enough copy of autoconf.

download:  , ompi , https://gitee.com/mirrors/ompi/repository/archive/v4.0.3.tar.gz
-- Populating ompi
-- Configuring done
-- Generating done
-- Build files have been written to: /data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-subbuild
[ 11%] Creating directories for 'ompi-populate'
[ 22%] Performing download step (download, verify and extract) for 'ompi-populate'
-- Downloading...
   dst='/data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-subbuild/ompi-populate-prefix/src/v4.0.3.tar.gz'
   timeout='none'
   inactivity timeout='none'
-- Using src='https://gitee.com/mirrors/ompi/repository/archive/v4.0.3.tar.gz'
-- verifying file...
       file='/data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-subbuild/ompi-populate-prefix/src/v4.0.3.tar.gz'
-- Downloading... done
-- extracting...
     src='/data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-subbuild/ompi-populate-prefix/src/v4.0.3.tar.gz'
     dst='/data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-src'
-- extracting... [tar xfz]
-- extracting... [analysis]
-- extracting... [rename]
-- extracting... [clean up]
-- extracting... done
[ 33%] No update step for 'ompi-populate'
[ 44%] No patch step for 'ompi-populate'
[ 55%] No configure step for 'ompi-populate'
[ 66%] No build step for 'ompi-populate'
[ 77%] No install step for 'ompi-populate'
[ 88%] No test step for 'ompi-populate'
[100%] Completed 'ompi-populate'
[100%] Built target ompi-populate
ompi_SOURCE_DIR : /data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-src
Open MPI autogen (buckle up!)

1. Checking tool versions

   Searching for autoconf
  autoconf not found

=================================================================
I could not find a recent enough copy of autoconf.
I need at least 2.69, but only found the following versions:

    autoconf:

I am gonna abort.  :-(

Please make sure you are using at least the following versions of the
tools:

    GNU Autoconf: 2.69
    GNU Automake: 1.12.2
    GNU Libtool: 2.4.2
=================================================================
CMake Error at cmake/utils.cmake:181 (message):
  error! when ./autogen.pl in
  /data/YOYOFile/MyDocuments/mindspore/build/mindspore/_deps/ompi-src
Call Stack (most recent call first):
  cmake/utils.cmake:399 (__exec_cmd)
  cmake/external_libs/ompi.cmake:10 (mindspore_add_pkg)
  cmake/mind_expression.cmake:48 (include)
  CMakeLists.txt:73 (include)
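Cause:
A dependency package is missing

Solution:
Install the missing dependency packages, whichever ones the build asks for
The same approach applies when other dependency packages are missing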

sudo apt-get install autoconf
sudo apt-get install automake
sudo apt-get install libtool
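
Before starting a long build, it can help to check up front which build tools are missing. The sketch below looks each tool up on PATH. The executable names are assumptions: on Ubuntu the libtool package is assumed to provide libtoolize, and tclsh is included because an earlier FAQ entry needs it. missing_tools is a hypothetical helper.

```python
import shutil

# Executable names are assumptions; adjust them for your distribution.
TOOLS = ("autoconf", "automake", "libtoolize", "tclsh")


def missing_tools(names=TOOLS):
    """Return the tool names that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


print("missing tools:", missing_tools())
```

An empty list means every listed tool was found; it does not check the minimum versions that the Open MPI autogen step asks for.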

Reposted from blog.csdn.net/m0_37605642/article/details/128992003