Official GitHub repository CI

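The upstream CI starts an Ubuntu RISC-V Docker container with GCC 14 and Python 3.12, but the job has been failing for a long time and nobody has fixed it. Config file link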

x86 cross

酸鸽 packaged python-pytorch 2.6.0 for Arch Linux RISC-V; patch file link

I installed it on a K1 to try it out, and it does not work out of the box.

Python 3.13.11 (main, Dec 17 2025, 07:50:59) [GCC 15.2.1 20251112] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import torch
Traceback (most recent call last):
  File "<python-input-0>", line 1, in <module>
    import torch
  File "/usr/lib/python3.13/site-packages/torch/__init__.py", line 405, in <module>
    from torch._C import *  # noqa: F403
    ^^^^^^^^^^^^^^^^^^^^^^
ImportError: libprotobuf.so.29.2.0: cannot open shared object file: No such file or directory

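A quick check showed that the libprotobuf currently shipped by Arch Linux is version 31. Even after I created a symlink under the old name, the library still could not be found, so I tried rebuilding the package.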
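
One way to see which shared libraries a binary cannot resolve is to run ldd on it and pick out the "not found" entries. The following is a minimal sketch rather than something taken from the original build: the helper name missing_libs and the sample text are made up for illustration, and the only assumption about ldd is its usual "name => not found" line format.

```python
def missing_libs(ldd_output: str) -> list[str]:
    """Return the library names that ldd reports as 'not found'."""
    missing = []
    for line in ldd_output.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if sep and target.strip() == "not found":
            missing.append(name)
    return missing


# Hypothetical ldd output, shaped like what a stale build might print.
SAMPLE = (
    "\tlibc10.so => /usr/lib/python3.13/site-packages/torch/lib/libc10.so (0x0000003f80000000)\n"
    "\tlibprotobuf.so.29.2.0 => not found\n"
    "\tlibc.so.6 => /usr/lib/libc.so.6 (0x0000003f7f000000)\n"
)

print(missing_libs(SAMPLE))
```

In practice the input would come from something like subprocess.run(["ldd", path], capture_output=True, text=True).stdout, where path points at the extension module that fails to import.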

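Since an earlier version had already been packaged successfully, why not reuse that work? So I read through the Arch Linux packaging guide. The asp tool it mentions is no longer available, so I used pkgctl repo clone --protocol=https python-pytorch to fetch the latest official Arch Linux PKGBUILD and then modified it following 酸鸽's riscv64 patch.
The main changes were disabling CUDA and removing some package dependencies that do not exist on RISC-V.
After letting it run for a whole day while I slacked off, I found that -march=rv64gc had been parsed incorrectly.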

copying torch/utils/data/datapipes/datapipe.pyi -> build/lib.linux-riscv64-cpython-313/torch/utils/data/datapipes
running build_ext
-- Building with NumPy bindings
-- Not using cuDNN
-- Not using CUDA
-- Not using XPU
-- Not using MKLDNN
-- Not using NCCL
-- Building with distributed package: 
  -- USE_TENSORPIPE=True
  -- USE_GLOO=True
  -- USE_MPI=False
-- Not using ITT
Copying functorch._C from functorch/functorch.so to /build/python-pytorch/src/pytorch/build/lib.linux-riscv64-cpython-313/functorch/_C.cpython-313-riscv64-linux-gnu.so
copying functorch/functorch.so -> /build/python-pytorch/src/pytorch/build/lib.linux-riscv64-cpython-313/functorch/_C.cpython-313-riscv64-linux-gnu.so
building 'torch._C' extension
creating build/temp.linux-riscv64-cpython-313/torch/csrc
-march=rv64gc -mabi=lp64d -O2 -pipe -fno-plt -fexceptions -Wp,-D_FORTIFY_SOURCE=3 -Wformat -Werror=format-security -fstack-clash-protection -fno-omit-frame-pointer -fPIC -I/usr/include/python3.13 -c torch/csrc/stub.c -o build/temp.linux-riscv64-cpython-313/torch/csrc/stub.o -Wall -Wextra -Wno-strict-overflow -Wno-unused-parameter -Wno-missing-field-initializers -Wno-unknown-pragmas -fno-strict-aliasing
error: command '-march=rv64gc' failed: No such file or directory

ERROR Backend subprocess exited when trying to invoke build_wheel
==> ERROR: A failure occurred in build().
    Aborting...
==> ERROR: Build failed, check /var/lib/archbuild/extra-riscv64/xyenchi/build

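So I removed the part of the PKGBUILD that passed the flags to cmake through add_definition, set CFLAGS and CXXFLAGS instead, and reran the build.
It still looked as if CFLAGS was being misinterpreted. Possibly the original CC did not exist, so I tried changing it to /usr/bin/gcc.
Switching to /usr/bin/gcc worked quite well, except that the ROCm-related .so files could not be found. ROCm does not seem to support RISC-V, yet the older version was packaged successfully. I tried disabling it.
With this PKGBUILD, pytorch 2.9.1 can be built, but the corresponding patches still have to be pulled from the official Arch Linux GitLab repository.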
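
The message "command '-march=rv64gc' failed: No such file or directory" in the log above is what one would expect if the compiler part of the command line were empty, so that the first flag ended up in the position of the program name. Below is a simplified model of that failure mode, not the actual setuptools code; compile_argv is a hypothetical helper, and the flag string is shortened from the log.

```python
import shlex


def compile_argv(cc: str, cflags: str, source: str) -> list[str]:
    """Assemble a compiler command line the way a naive build script might."""
    return shlex.split(cc) + shlex.split(cflags) + ["-c", source]


flags = "-march=rv64gc -mabi=lp64d -O2"

# With CC empty, the first flag lands in argv[0], the slot that the
# operating system treats as the program to execute.
broken = compile_argv("", flags, "torch/csrc/stub.c")

# With an explicit compiler path, argv[0] is the compiler again.
fixed = compile_argv("/usr/bin/gcc", flags, "torch/csrc/stub.c")

print(broken[0])
print(fixed[0])
```

Under this model, pointing CC at a real compiler such as /usr/bin/gcc pushes the flags back into argument position.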

K1 native

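The official documentation uses a conda-based script, but I could not find a conda that works out of the box on this machine. I don't really understand why, but never mind.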

source .venv/bin/activate
pip install -U pip setuptools wheel
pip install -r requirements.txt
export USE_CUDA=0
export USE_NCCL=0
export USE_DISTRIBUTED=0
export USE_MPI=0
export USE_GLOO=0
export USE_MKLDNN=0
export BUILD_TEST=0
export BUILD_CAFFE2=0
python setup.py develop

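I set up a Python virtual environment, disabled the submodules that only work on x86, and started the build. After a full day of slacking off, it successfully ran out of memory (OOM).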

FAILED: [code=1] caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp.o
/usr/bin/c++ -DAT_PER_OPERATOR_HEADERS -DCAFFE2_BUILD_MAIN_LIB -DCPUINFO_SUPPORTED_PLATFORM=1 -DFMT_HEADER_ONLY=1 -DHAVE_MALLOC_USABLE_SIZE=1 -DHAVE_MMAP=1 -DHAVE_POSIX_FALLOCATE=1 -DHAVE_SHM_OPEN=1 -DHAVE_SHM_UNLINK=1 -DMINIZ_DISABLE_ZIP_READER_CRC32_CHECKS -DONNXIFI_ENABLE_EXT=1 -DONNX_ML=1 -DONNX_NAMESPACE=onnx_torch -DUSE_EXTERNAL_MZCRC -D_FILE_OFFSET_BITS=64 -Dtorch_cpu_EXPORTS -I/home/xyenchi/pytorch/build/aten/src -I/home/xyenchi/pytorch/aten/src -I/home/xyenchi/pytorch/build -I/home/xyenchi/pytorch -I/home/xyenchi/pytorch/nlohmann -I/home/xyenchi/pytorch/moodycamel -I/home/xyenchi/pytorch/torch/csrc/api -I/home/xyenchi/pytorch/torch/csrc/api/include -I/home/xyenchi/pytorch/caffe2/aten/src/TH -I/home/xyenchi/pytorch/build/caffe2/aten/src/TH -I/home/xyenchi/pytorch/build/caffe2/aten/src -I/home/xyenchi/pytorch/build/caffe2/../aten/src -I/home/xyenchi/pytorch/torch/csrc -I/home/xyenchi/pytorch/torch/headeronly -I/home/xyenchi/pytorch/third_party/miniz-3.0.2 -I/home/xyenchi/pytorch/third_party/kineto/libkineto/include -I/home/xyenchi/pytorch/third_party/cpp-httplib -I/home/xyenchi/pytorch/aten/src/ATen/.. -I/home/xyenchi/pytorch/c10/.. 
-I/home/xyenchi/pytorch/third_party/cpuinfo/include -I/home/xyenchi/pytorch/third_party/FP16/include -I/home/xyenchi/pytorch/third_party/fmt/include -I/home/xyenchi/pytorch/third_party/onnx -I/home/xyenchi/pytorch/build/third_party/onnx -I/home/xyenchi/pytorch/third_party/flatbuffers/include -isystem /home/xyenchi/pytorch/third_party/protobuf/src -isystem /home/xyenchi/pytorch/cmake/../third_party/eigen -isystem /home/xyenchi/pytorch/INTERFACE -isystem /home/xyenchi/pytorch/third_party/nlohmann/include -isystem /home/xyenchi/pytorch/third_party/concurrentqueue -isystem /home/xyenchi/pytorch/build/include -fvisibility-inlines-hidden -DNDEBUG -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -DC10_NODEPRECATED -Wall -Wextra -Werror=return-type -Werror=non-virtual-dtor -Werror=range-loop-construct -Werror=bool-operation -Wnarrowing -Wno-missing-field-initializers -Wno-unknown-pragmas -Wno-unused-parameter -Wno-strict-overflow -Wno-strict-aliasing -Wno-stringop-overflow -Wsuggest-override -Wno-psabi -Wno-error=old-style-cast -faligned-new -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-dangling-reference -Wno-error=dangling-reference -Wno-stringop-overflow -O3 -DNDEBUG -DNDEBUG -fPIC -fdiagnostics-color=always -Wall -Wextra -Wdeprecated -Wunused -Wno-unused-parameter -Wno-missing-field-initializers -Wno-array-bounds -Wno-unknown-pragmas -Wno-strict-overflow -Wno-strict-aliasing -Wredundant-move -Wno-interference-size -Wno-maybe-uninitialized -fvisibility=hidden -fopenmp -MD -MT caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp.o -MF caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp.o.d -o caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp.o -c /home/xyenchi/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp
c++: fatal error: Killed signal terminated program cc1plus
compilation terminated.
ninja: build stopped: subcommand failed.
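
"Killed signal terminated program cc1plus" means the compiler process was killed from outside; on a memory-constrained board this is usually the kernel's OOM killer. When the log is long, a small script can list the affected build steps. This is a rough sketch that assumes the ninja log layout shown above (a FAILED: line followed later by the "Killed signal" line); killed_targets is a hypothetical helper, and the sample is shortened from the log.

```python
import re

FAILED_RE = re.compile(r"^FAILED: (?:\[code=\d+\] )?(\S+)")
KILLED_RE = re.compile(r"fatal error: Killed signal terminated program (\S+)")


def killed_targets(log: str) -> list[tuple[str, str]]:
    """Return (target, program) pairs for build steps whose compiler was killed."""
    results = []
    current = None  # target named by the most recent FAILED: line
    for line in log.splitlines():
        failed = FAILED_RE.match(line)
        if failed:
            current = failed.group(1)
            continue
        killed = KILLED_RE.search(line)
        if killed and current is not None:
            results.append((current, killed.group(1)))
            current = None
    return results


# Shortened version of the log above; the long compiler command is abbreviated.
SAMPLE = (
    "FAILED: [code=1] caffe2/CMakeFiles/torch_cpu.dir/__/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp.o\n"
    "/usr/bin/c++ -O3 -c /home/xyenchi/pytorch/build/aten/src/ATen/RegisterCompositeExplicitAutograd_0.cpp\n"
    "c++: fatal error: Killed signal terminated program cc1plus\n"
    "compilation terminated.\n"
)

for target, program in killed_targets(SAMPLE):
    print(program, "was killed while building", target)
```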

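After turning off various optimizations and parallelism, another day of building still ended in OOM.
Time to try a different GCC version.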

export CC=gcc-13
export CXX=g++-13

# Architecture-specific backends
export USE_CUDA=0
export USE_ROCM=0
export USE_HIP=0
export USE_TRITON=0
export USE_NCCL=0

# Distributed
export USE_DISTRIBUTED=0
export USE_GLOO=0
export USE_TENSORPIPE=0
export USE_MPI=0

# Performance components
export USE_MKLDNN=0
export USE_OPENMP=0

# Build scope
export BUILD_TEST=0
export BUILD_CAFFE2=0

# Memory protection: one job at a time, light optimization, no debug info
export MAX_JOBS=1
export CFLAGS="-O1 -g0"
export CXXFLAGS="-O1 -g0"
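
MAX_JOBS=1 is the most conservative setting. On a machine with more memory, the job count could instead be derived from the memory that is actually available. The sketch below illustrates the idea; both helper names are hypothetical, and the 4 GiB budget per compile job is an arbitrary assumption, not a figure measured during this build.

```python
def mem_available_kib(meminfo_text: str) -> int:
    """Extract MemAvailable (in KiB) from the text of /proc/meminfo."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    raise ValueError("MemAvailable not found")


def pick_max_jobs(available_kib: int, cpus: int, gib_per_job: float = 4.0) -> int:
    """Choose a job count bounded by both the CPU count and available memory."""
    # gib_per_job is an assumed budget; tune it for the actual build.
    by_memory = int(available_kib / (gib_per_job * 1024 * 1024))
    return max(1, min(cpus, by_memory))


# Example: 8 GiB available on an 8-core board allows 2 jobs at 4 GiB each.
print(pick_max_jobs(8 * 1024 * 1024, cpus=8))
```

To use it for real, pass the contents of /proc/meminfo to mem_available_kib and os.cpu_count() as cpus, then export the result as MAX_JOBS. This only bounds parallelism; it does not help when a single compile job alone needs more memory than the machine has.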

After dropping to GCC 13 and turning off the various optimizations, the build finally went through.
Warm congratulations.