failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION: an illegal instruction was encountered


Category: Enterprise Development · Published 2023-07-21 03:03:16

Error message:
Error polling for event status: failed to query event: CUDA_ERROR_ILLEGAL_INSTRUCTION: an illegal instruction was encountered
2022-03-24 23:32:13.170887: F tensorflow/core/common_runtime/gpu/gpu_event_mgr.cc:273] Unexpected Event status: 1

Situation:
I defined a custom loss function:

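import tensorflow as tf

def amp_loss(y_true, y_pred):
    """Loss on the amplitude-frequency (magnitude) response."""
    # tf.squeeze first removes the axis=1 dimension, because tf.signal.rfft computes the
    # 1-dimensional discrete Fourier transform of a real-valued signal over the
    # inner-most dimension of input.
    # Note: called without an axis argument, tf.squeeze removes every size-1 dimension.
    # tf.signal.rfft computes the DFT.
    # tf.math.abs takes the magnitude of the complex spectrum.
    # tf.expand_dims restores a size-1 dimension (here as the last axis).
    amplitude_true = tf.expand_dims(tf.math.abs(tf.signal.rfft(tf.squeeze(y_true))), -1)
    amplitude_pred = tf.expand_dims(tf.math.abs(tf.signal.rfft(tf.squeeze(y_pred))), -1)

    amplitude_loss = tf.math.reduce_mean(tf.math.square(amplitude_true - amplitude_pred))

    return amplitude_loss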
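To make clear what this loss measures, here is a minimal pure-Python sketch of the same computation for a single 1-D signal, with no TensorFlow or GPU involved. The helper names rfft_magnitudes and amp_loss_reference are hypothetical and not part of the original code, and the naive O(n^2) DFT is only meant for checking small inputs by hand.

```python
import cmath


def rfft_magnitudes(signal):
    """Magnitudes of the first n//2 + 1 DFT bins of a real 1-D signal.

    Naive O(n^2) DFT; these are the bins a real-input FFT such as
    tf.signal.rfft returns for its default fft_length.
    """
    n = len(signal)
    magnitudes = []
    for k in range(n // 2 + 1):
        bin_value = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        magnitudes.append(abs(bin_value))
    return magnitudes


def amp_loss_reference(y_true, y_pred):
    """Mean squared difference between the two amplitude spectra."""
    mags_true = rfft_magnitudes(y_true)
    mags_pred = rfft_magnitudes(y_pred)
    squared = [(a - b) ** 2 for a, b in zip(mags_true, mags_pred)]
    return sum(squared) / len(squared)


impulse = [1.0, 0.0, 0.0, 0.0]   # flat spectrum: every bin has magnitude 1
constant = [1.0, 1.0, 1.0, 1.0]  # all energy in the DC bin (magnitude 4)
print(rfft_magnitudes(impulse))
print(amp_loss_reference(impulse, constant))
```

For these two signals the spectra are roughly [1, 1, 1] and [4, 0, 0], so the loss comes out at about 11/3. Because the loss only compares magnitudes, two signals that differ only in phase (for example, a shifted copy) give a loss of zero.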

The error appears during model.fit. My guess is that some of the arithmetic operations in the loss function are not supported by my CUDA setup.

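1. Check whether the cuDNN version is the problem
My laptop runs the GPU build of TensorFlow 2.2, and cuDNN is 7.6.5 as far as I recall.

The same code runs fine on AWS SageMaker Studio Lab.
That environment has tensorflow-gpu 2.6.2 and cudnn=8.2.1.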
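Since the two environments differ in both TensorFlow and cuDNN versions, it helps to print what each installation was built against. The following is a minimal sketch, assuming TensorFlow 2.3 or later for tf.sysconfig.get_build_info(); on older releases such as 2.2 that function is absent, so the sketch falls back to reporting only the TensorFlow version. The helper name describe_tf_build is hypothetical.

```python
def describe_tf_build():
    """Collect the TensorFlow version and the CUDA/cuDNN versions it was built with."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return {"tensorflow": None}

    info = {"tensorflow": tf.__version__}
    # get_build_info() only exists on newer releases (assumed 2.3+).
    get_build_info = getattr(tf.sysconfig, "get_build_info", None)
    if get_build_info is not None:
        build = get_build_info()
        # CPU-only builds may not carry these keys, hence .get().
        info["cuda"] = build.get("cuda_version")
        info["cudnn"] = build.get("cudnn_version")
    return info


print(describe_tf_build())
```

These are the versions the TensorFlow binary was compiled against, which may differ from the CUDA and cuDNN libraries actually installed on the machine. Running the snippet in both environments gives a quick side-by-side comparison.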

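2. Try running on the CPU
Add the following before import tensorflow as tf:

import os

# Run on the CPU: hide all GPUs from CUDA
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

I also tried

with tf.device('/cpu:0'):
    ...  # build and train the model inside this block

to force the model to run on the CPU. In my tests this ran without problems.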
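CUDA_VISIBLE_DEVICES is read when the CUDA runtime initializes, which is why the variables are set before TensorFlow is imported; setting them later in the same process may have no effect. A minimal sketch of a guard around the environment-variable approach follows. The helper name force_cpu is hypothetical, and it uses only the standard library.

```python
import os
import sys


def force_cpu():
    """Hide all GPUs from CUDA; return False if TensorFlow was already imported."""
    already_imported = "tensorflow" in sys.modules
    os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
    os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
    if already_imported:
        print("Warning: tensorflow was imported before force_cpu(); "
              "the setting may not take effect in this process.")
    return not already_imported


in_time = force_cpu()
# import tensorflow as tf  # import only after the variables are set
```

After the import, tf.config.list_physical_devices('GPU') should return an empty list if the GPUs were hidden successfully.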


Reposted from blog.csdn.net/aa2962985/article/details/123720909
