[Pitfall Notes] A Roundup of Mysterious Bugs

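1. Preface

A running list of the mysterious bugs I have run into. It is updated over time: whenever one gets bad enough that I need to vent, I add a bit more.

2. Environment Setup

2.1 PyTorch installed via Conda cannot be used

The error message is: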

`ImportError: /home/wsl/anaconda3/envs/testTorch/lib/python3.11/site-packages/torch/lib/libtorch_cpu.so: undefined symbol: iJIT_NotifyEvent`

Solution: reinstall mkl with pip, for example:

`pip install mkl==2024.0`
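To confirm whether the pin above actually took effect, a small check like the following can help. This is a minimal sketch, not part of the original fix: the helper names `installed_mkl_version` and `mkl_needs_pin` are made up for illustration, and it assumes mkl was installed through pip (an mkl that exists only as a conda package may not be visible to `importlib.metadata`, in which case the script reports it as missing).

```python
from importlib import metadata

# The series pinned by the pip command above (assumption: any 2024.0.x build is fine).
PINNED_SERIES = ["2024", "0"]


def installed_mkl_version():
    """Return the pip-visible mkl version string, or None if pip does not know about mkl."""
    try:
        return metadata.version("mkl")
    except metadata.PackageNotFoundError:
        return None


def mkl_needs_pin(installed):
    """Return True when mkl is missing or is not in the pinned 2024.0 series."""
    if installed is None:
        return True
    return installed.split(".")[:2] != PINNED_SERIES


if __name__ == "__main__":
    version = installed_mkl_version()
    if mkl_needs_pin(version):
        print(f"mkl is {version!r}; consider: pip install mkl==2024.0")
    else:
        print(f"mkl {version} matches the pinned series")
```

Run it inside the affected environment before importing torch again.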

Verdict: even the PyTorch team no longer recommends installing via conda, so you might as well switch to uv.

2.2 CUDA-accelerated attention raises an error

The error message is:

attn_output=torch.nn.functional.scaled_dot_product_attention(
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: cuDNN Frontend error: [cudnn_frontend] Error: No execution plans support the graph.

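Everything imports fine, yet the moment attention is computed it throws.

Solution: this is a PyTorch 2.5.0 problem. Avoid that release; switching to a nearby version resolves it.

Reference issue: RuntimeError: cuDNN Frontend error: [cudnn_frontend] Error: No execution plans support the graph. · Issue #9704 · huggingface/diffusers

Verdict: the nastiest one of the bunch. From now on, pick one PyTorch version and stick with it to the end instead of hopping around.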
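Since the fix is simply "do not use 2.5.0", a startup guard can catch the bad version before the attention call does. The sketch below is illustrative only: `AFFECTED_RELEASE`, `base_version`, and `is_affected` are hypothetical names, and the check encodes nothing more than the claim above that 2.5.0 is the release to avoid.

```python
# The release blamed above; nearby releases are assumed to be fine, per the text.
AFFECTED_RELEASE = "2.5.0"


def base_version(version):
    """Drop the local build suffix, e.g. '2.5.0+cu124' becomes '2.5.0'."""
    return version.split("+", 1)[0]


def is_affected(version):
    """Return True when the given torch version string is exactly the blamed release."""
    return base_version(version) == AFFECTED_RELEASE


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
    else:
        if is_affected(torch.__version__):
            print(f"torch {torch.__version__} is the release to avoid; install a nearby version")
        else:
            print(f"torch {torch.__version__} is not the release flagged here")
```

Dropping this at the top of a training or inference script turns a confusing cuDNN error into an explicit version message.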

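2.3 flash_attention installation errors

This is a summary of the pitfalls in the install process rather than a list of individual errors.

First, Windows is not officially recommended (if you are willing to tinker, look for community-provided wheels), so I ran it under WSL.

Do not install flash_attention directly with the command below unless you have a beefy CPU and plenty of RAM:

`pip install flash-attn --no-build-isolation`

This pulls the source and compiles it on the spot, which takes a long time and can run out of memory and fail the install. Downloading a prebuilt wheel is the better route.

Get the latest prebuilt package from Release v2.8.3 · Dao-AILab/flash-attention, and make sure to pick the build that matches your environment.

For example, I downloaded flash_attn-2.8.3+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl, which requires the following: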

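| Condition | Required value | How to check |
| --- | --- | --- |
| flash_attn-2.8.3 | The flash_attention version is 2.8.3 | N/A |
| cu12 | CUDA 12.x (12.4, 12.6, etc.). Although the tag says cu12, it may not work with CUDA 12.4: this wheel failed for me on CUDA 12.4 and worked on CUDA 12.6. Install a slightly newer CUDA whenever your driver supports it. | `python -c "import torch; print(torch.version.cuda)"`. If importing flash_attn fails at the end, the CUDA version may be too low. Run `nvidia-smi` to see the highest CUDA version the driver supports, update the driver if needed, then `pip uninstall torch torchvision torchaudio flash-attn` and install a torch build for a newer CUDA, e.g. move from `torch2.6+cu124` to `torch2.6+cu126`. |
| PyTorch version (torch2.6) | PyTorch 2.6.x | `python -c "import torch; print(torch.__version__)"` |
| cxx11abiFALSE | The cxx11abi flag must be FALSE, not TRUE. This is easy to overlook, so do not pick the wrong one. | Most official PyTorch wheels default to `cxx11abiFALSE`, but it varies by build, so check with `python -c "import torch; print(torch.compiled_with_cxx11_abi())"` |
| cp312 | Python 3.12 | `python --version` must report `3.12.x` (not 3.10, 3.11, 3.13, etc.) |
| linux_x86_64 | Linux on x86_64 | `uname -s` should print `Linux`; `uname -m` should print `x86_64` |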
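The conditions in the table can also be checked mechanically by parsing the wheel filename and comparing it with the running environment. The following is a rough sketch under stated assumptions: the regular expression only targets filenames shaped like the example above, and `parse_wheel_name`, `find_mismatches`, and `current_env` are hypothetical helpers, not part of flash-attention or PyTorch. The only torch calls used are the ones already listed in the table.

```python
import platform
import re
import sys

# Matches names shaped like the example wheel; other naming schemes are not handled.
WHEEL_PATTERN = re.compile(
    r"flash_attn-(?P<flash_attn>[\d.]+)"
    r"\+cu(?P<cuda_major>\d+)"
    r"torch(?P<torch>[\d.]+)"
    r"cxx11abi(?P<cxx11abi>TRUE|FALSE)"
    r"-(?P<python_tag>cp\d+)-cp\d+"
    r"-(?P<platform>\w+)\.whl$"
)


def parse_wheel_name(filename):
    """Split a flash_attn wheel filename into the tags listed in the table."""
    match = WHEEL_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unrecognized flash_attn wheel name: {filename}")
    return match.groupdict()


def find_mismatches(wheel, env):
    """Return a list of human-readable mismatches between wheel tags and environment."""
    problems = []
    if env["cuda"].split(".")[0] != wheel["cuda_major"]:
        problems.append(f"CUDA {env['cuda'] or 'none'} vs wheel cu{wheel['cuda_major']}")
    # Compare only major.minor, ignoring any local suffix such as '+cu126'.
    torch_series = ".".join(env["torch"].split("+")[0].split(".")[:2])
    if torch_series != wheel["torch"]:
        problems.append(f"torch {env['torch']} vs wheel torch{wheel['torch']}")
    if env["cxx11abi"] != (wheel["cxx11abi"] == "TRUE"):
        problems.append(f"cxx11abi {env['cxx11abi']} vs wheel {wheel['cxx11abi']}")
    if env["python_tag"] != wheel["python_tag"]:
        problems.append(f"Python {env['python_tag']} vs wheel {wheel['python_tag']}")
    if env["platform"] != wheel["platform"]:
        problems.append(f"platform {env['platform']} vs wheel {wheel['platform']}")
    return problems


def current_env():
    """Collect the same facts from the running interpreter (requires torch)."""
    import torch

    return {
        "cuda": torch.version.cuda or "",
        "torch": str(torch.__version__),
        "cxx11abi": torch.compiled_with_cxx11_abi(),
        "python_tag": f"cp{sys.version_info.major}{sys.version_info.minor}",
        "platform": f"{platform.system().lower()}_{platform.machine()}",
    }
```

With torch installed, `find_mismatches(parse_wheel_name(name), current_env())` returns an empty list when every tag lines up. Note that it cannot detect the CUDA 12.4 versus 12.6 problem described in the table, because both count as cu12.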

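Pick the wheel that satisfies every condition. After installing torch and the CUDA driver, place flash_attn-2.8.3+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl in the current directory and run `pip install flash_attn-2.8.3+cu12torch2.6cxx11abiFALSE-cp312-cp312-linux_x86_64.whl` for a quick install.

If every condition is met and the import still fails after installation, the CUDA version is most likely too low: uninstall torch … flash-attn again, then install a torch build for a newer CUDA.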

3. Claude Code

3.1 Error 400 when using it after configuring the API

The error message is:

`⎿ API Error: 400"InvokeModelWithResponseStream: operation error Bedrock Runtime: InvokeModelWithResponseStream, https response error StatusCode: 400,`

Solution: add the environment variable CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS="1".
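If you would rather not change your shell profile, one option is to set the variable only for the launched process. This is a minimal sketch: it assumes the CLI is available on PATH as `claude`, and `env_with_flag` and `launch` are hypothetical helper names introduced here for illustration.

```python
import os
import shutil
import subprocess

FLAG = "CLAUDE_CODE_DISABLE_EXPERIMENTAL_BETAS"


def env_with_flag(base=None):
    """Return a copy of the environment with the flag set to "1"; the input is not mutated."""
    env = dict(os.environ if base is None else base)
    env[FLAG] = "1"
    return env


def launch(command="claude"):
    """Start the CLI with the flag set (assumes the executable name is `claude`)."""
    if shutil.which(command) is None:
        raise FileNotFoundError(f"{command!r} was not found on PATH")
    return subprocess.call([command], env=env_with_flag())
```

Calling `launch()` starts the tool with the flag applied for that session only; exporting the variable in your shell profile has the same effect permanently.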

https://blog.sheep0.top/2026/02/06/【排雷】神秘bug排雷总结/

Author

Sheep0

Published

February 6, 2026
