Is ROCm no longer supported by 0.9.x? #374

Lookforworld · 2023-12-07T20:14:23Z

@ghostplant
I have been trying to run ROCm on WSL without finding a good way to do it, until I came across this project and saw a silver lining. I want to try version 0.9.x, but I can't find a wheel that supports ROCm. With version 0.3.x installed, BACKEND=c-rocm reports that the GPU cannot be found, and BACKEND=c-rocm_win64 fails with this error:

/home/root001/miniconda3/lib/python3.11/site-packages/antares_core/backends/c-rocm_win64/../../graph_evaluator/run_graph.cpp:14:29: error: ‘memalign’ was not declared in this scope
14 | void data_ptr = (void)memalign(256, length);

I don't know where to start fixing this. Is there an official guide that describes the correct steps? 😒

ghostplant · 2023-12-08T01:36:21Z

0.9.x is a new implementation that works on Windows even without WSL, while the old features are all kept, so you can still follow "Path 1" to run ROCm custom kernels. The problem you hit looks like a C++ compatibility issue; let me fix it.

ghostplant · 2023-12-08T02:50:23Z

BTW, would you consider switching to DirectX on Windows instead of ROCm on Windows?

Lookforworld · 2023-12-08T03:09:34Z

@ghostplant
Thanks for your reply.
I have successfully installed ROCm and the AMD SDK on Windows and compiled llama.cpp there, but I have not been able to use ROCm in WSL. Many of the libraries I want to use do not support Windows or DirectX. Installing Antares through "Path 1" always gives me 0.3.x rather than 0.9.x. I would like to see whether 0.9.x works; what should I do?
My device is gfx1100 (7900 XTX).

ghostplant · 2023-12-08T06:01:30Z

I just pushed a commit to fix the error you reported.
Did you install the AMD SDK released for Win32, or HIP released for Linux inside WSL?

Lookforworld · 2023-12-08T07:53:06Z

@ghostplant
Thanks!
My HIP in WSL:

HIP version: 5.7.31921-d1770ee1b
AMD clang version 17.0.0 (https://github.com/RadeonOpenCompute/llvm-project roc-5.7.0 23352 d1e13c532a947d0cbfc94759c00dcf152294aa13)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/rocm-5.7.0/llvm/bin

Installing Antares through "Path 1" always gives me 0.3.x rather than 0.9.x. How do I install 0.9.x if I want to use it with ROCm?

ghostplant · 2023-12-08T08:08:35Z

v0.9.x is a re-implementation with no WSL dependency. It may take time to merge many of the features from v0.3.x. I suggest using DirectX with v0.9.x for now, since the two are similar in performance.

Lookforworld · 2023-12-08T08:21:14Z

@ghostplant
Thanks very much!
How can I install the fixed version now?

ghostplant · 2023-12-08T08:33:51Z

The fix has just been published to PyPI. Install antares >= 0.3.24.0 with pip to get it: https://pypi.org/project/antares/#files
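
One way to confirm that the environment actually picked up a release with the fix is to compare the installed version against 0.3.24.0. The following is a minimal sketch, not part of Antares: the helper names are hypothetical, and it assumes the version string is purely numeric and dot-separated (as in 0.3.24.1).

```python
from importlib import metadata

# First release named in this thread as containing the fix.
MIN_FIXED = (0, 3, 24, 0)


def parse_version(text):
    """Turn a dotted numeric version such as '0.3.24.1' into a 4-tuple of ints."""
    parts = [int(part) for part in text.split(".")]
    # Pad short versions ('0.9.1' -> (0, 9, 1, 0)) so tuple comparison is fair.
    return tuple(parts + [0] * (4 - len(parts)))


def has_fix(version_text, minimum=MIN_FIXED):
    """Return True when version_text is at least the minimum fixed release."""
    return parse_version(version_text) >= minimum


def installed_antares_version():
    """Return the installed antares version string, or None if it is missing."""
    try:
        return metadata.version("antares")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_antares_version()
    if found is None:
        print("antares is not installed in this environment")
    else:
        print(found, "includes the fix" if has_fix(found) else "predates the fix")
```

Run it with the same Python interpreter that runs antares (here, the miniconda one inside WSL), since each environment has its own installed packages.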

Lookforworld · 2023-12-08T09:04:32Z

Thanks for your kind reply! I'll try it!

Lookforworld · 2023-12-08T13:33:18Z

@ghostplant
There's a new error 😢:

 >> Backend = c-rocm_win64, Python PID = 1251, Task = lang.generic;

// ---------------------------------------------------------------------------
// GLOBALS: input0:float32[524288], input1:float32[524288] -> output0:float32[524288]
// BACKEND: c-rocm_win64 (default)
// CONFIG: null
// COMPUTE_V1: - einstein_v2("output0[N] = input0[N] + input1[N]", input_dict={"input0": {"dtype": "float32", "shape": [1024 * 512]}, "input1": {"dtype": "float32", "shape": [1024 * 512]}})


// ---------------------------------------------------------------------------
// LOCAL: template_op_kernel0 -- input0:float32[524288], input1:float32[524288] -> output0:float32[524288]

#include <hip/hip_runtime.h>
#include <hip/hip_fp16.h>

#ifndef __ROCM_COMMON_MACRO__
#define __ROCM_COMMON_MACRO__

#define __ITEM_0_OF__(v) (v).x
#define __ITEM_1_OF__(v) (v).y
#define __ITEM_2_OF__(v) (v).z
#define __ITEM_3_OF__(v) (v).w

#define __STORE_ITEM_0__(t, out, ido, in, idi) *(t*)(out + ido) = *(t*)(in + idi)
#define __STORE_ITEM_1__(t, out, ido, in, idi)
#define __STORE_ITEM_2__(t, out, ido, in, idi)
#define __STORE_ITEM_3__(t, out, ido, in, idi)

#define __AMDGFX__ gfx1100

__forceinline__ __device__ __half hmax(const __half &a, const __half &b) { return a > b ? a : b; }
__forceinline__ __device__ __half hmin(const __half &a, const __half &b) { return a < b ? a : b; }

#endif


extern "C" __global__ __launch_bounds__(1) void template_op_kernel0(float* __restrict__ input0, float* __restrict__ input1, float* __restrict__ output0) {
  // [thread_extent] blockIdx.x = 524288
  // [thread_extent] threadIdx.x = 1
  output0[(((int)blockIdx.x))] = (input0[(((int)blockIdx.x))] + input1[(((int)blockIdx.x))]);
}

// ---------------------------------------------------------------------------

[EvalAgent] Evaluating Modules .. (for backend = c-rocm_win64)
+ /opt/rocm/bin/hipcc /tmp/.antares-module-tempfile.cu --amdgpu-target=gfx1100 --genco -Wno-ignored-attributes -O2 -o /tmp/.antares-module-tempfile.cu.out
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang++: error: no such file or directory: '/tmp/.antares-module-tempfile.cu'
clang++: error: no input files
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to execute command: sh -c 'wsl.exe sh -cx "/opt/rocm/bin/hipcc /tmp/.antares-module-tempfile.cu --amdgpu-target=gfx1100 --genco -Wno-ignored-attributes -O2 -o /tmp/.antares-module-tempfile.cu.out 1>&2"'


[EvalAgent] Results = {}

[Antares] Incorrect compute kernel from evaluator.

But the file is in the right place. Is that because of permissions? Or is there something else wrong?

ghostplant · 2023-12-09T10:09:02Z

Can you try this version: https://files.pythonhosted.org/packages/cb/fe/5fef007100d8beaa64113d1da466a057db656ef5e0731140883bfc0ca05e/antares-0.3.24.1-py3-none-manylinux1_x86_64.whl

Lookforworld · 2023-12-10T08:24:05Z

@ghostplant
The error is not fixed. This time there is no .cu file at that path. 😢

ghostplant · 2023-12-10T08:48:28Z

Can you attach the new error logs?

Lookforworld · 2023-12-10T10:48:06Z

@ghostplant
Okay, here are the logs:

 >> Backend = c-rocm_win64, Python PID = 450, Task = lang.generic;

// ---------------------------------------------------------------------------
// GLOBALS: input0:float32[524288], input1:float32[524288] -> output0:float32[524288]
// BACKEND: c-rocm_win64 (default)
// CONFIG: null
// COMPUTE_V1: - einstein_v2("output0[N] = input0[N] + input1[N]", input_dict={"input0": {"dtype": "float32", "shape": [1024 * 512]}, "input1": {"dtype": "float32", "shape": [1024 * 512]}})


// ---------------------------------------------------------------------------
// LOCAL: template_op_kernel0 -- input0:float32[524288], input1:float32[524288] -> output0:float32[524288]

#include <hip/hip_runtime.h>
#include <hip/hip_fp16.h>

#ifndef __ROCM_COMMON_MACRO__
#define __ROCM_COMMON_MACRO__

#define __ITEM_0_OF__(v) (v).x
#define __ITEM_1_OF__(v) (v).y
#define __ITEM_2_OF__(v) (v).z
#define __ITEM_3_OF__(v) (v).w

#define __STORE_ITEM_0__(t, out, ido, in, idi) *(t*)(out + ido) = *(t*)(in + idi)
#define __STORE_ITEM_1__(t, out, ido, in, idi)
#define __STORE_ITEM_2__(t, out, ido, in, idi)
#define __STORE_ITEM_3__(t, out, ido, in, idi)

#define __AMDGFX__ gfx1100

__forceinline__ __device__ __half hmax(const __half &a, const __half &b) { return a > b ? a : b; }
__forceinline__ __device__ __half hmin(const __half &a, const __half &b) { return a < b ? a : b; }

#endif


extern "C" __global__ __launch_bounds__(1) void template_op_kernel0(float* __restrict__ input0, float* __restrict__ input1, float* __restrict__ output0) {
  // [thread_extent] blockIdx.x = 524288
  // [thread_extent] threadIdx.x = 1
  output0[(((int)blockIdx.x))] = (input0[(((int)blockIdx.x))] + input1[(((int)blockIdx.x))]);
}

// ---------------------------------------------------------------------------

[EvalAgent] Evaluating Modules .. (for backend = c-rocm_win64)
+ /opt/rocm/bin/hipcc /mnt/c/Users/Modys/AppData/Local/Temp/.antares-module-tempfile.cu --amdgpu-target=gfx1100 --genco -Wno-ignored-attributes -O2 -o .antares-module-tempfile.cu.out
Warning: The --amdgpu-target option has been deprecated and will be removed in the future.  Use --offload-arch instead.
clang++: error: no such file or directory: '/mnt/c/Users/Modys/AppData/Local/Temp/.antares-module-tempfile.cu'
clang++: error: no input files
terminate called after throwing an instance of 'std::runtime_error'
  what():  Failed to execute command: sh -c 'wsl.exe sh -cx "/opt/rocm/bin/hipcc $TMP/.antares-module-tempfile.cu --amdgpu-target=gfx1100 --genco -Wno-ignored-attributes -O2 -o .antares-module-tempfile.cu.out 1>&2"'


[EvalAgent] Results = {}

[Antares] Incorrect compute kernel from evaluator.

ghostplant · 2023-12-11T02:19:44Z

Unfortunately, I cannot reproduce this. Could you help debug why these two lines fail to create the file at C:\Users\Modys\AppData\Local\Temp\.antares-module-tempfile.cu? If the file is created, then /mnt/c/Users/Modys/AppData/Local/Temp/.antares-module-tempfile.cu should be visible inside WSL rather than reported as not found.

You can run vi $(antares pwd)/../backends/c-rocm_win64/include/backend.hpp to edit the file in place. Once you save the changes, it is recompiled automatically the next time any antares compilation request runs.
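
To check the path mapping described above independently of Antares, a small script run inside WSL can translate the Windows temp path and test whether the file is visible. This is only a sketch: the function names are hypothetical, and it assumes WSL mounts Windows drives under the default /mnt/<drive letter> root (the automount root can be changed in the WSL configuration). WSL also provides a wslpath utility that performs the same translation.

```python
from pathlib import Path, PureWindowsPath


def windows_to_wsl_path(windows_path):
    """Map an absolute drive-letter Windows path onto WSL's default /mnt mount."""
    win = PureWindowsPath(windows_path)
    # Require something like 'C:\\...'; reject UNC and relative paths.
    if len(win.drive) != 2 or not win.root:
        raise ValueError(f"expected an absolute drive-letter path, got {windows_path!r}")
    drive_letter = win.drive[0].lower()
    # parts[0] is the anchor (for example 'C:\\'); the rest are path components.
    return "/mnt/" + "/".join([drive_letter, *win.parts[1:]])


def visible_in_wsl(windows_path):
    """Return True if the translated path exists (meaningful only inside WSL)."""
    return Path(windows_to_wsl_path(windows_path)).exists()


if __name__ == "__main__":
    # Path taken from the error log in this thread.
    target = r"C:\Users\Modys\AppData\Local\Temp\.antares-module-tempfile.cu"
    print(windows_to_wsl_path(target))
    print("visible inside WSL:", visible_in_wsl(target))
```

If the script prints the expected /mnt/c/... path but reports the file as not visible, the file was probably never written on the Windows side, or was written to a different temp directory than the one hipcc is given.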

Lookforworld · 2023-12-11T05:05:33Z

@ghostplant
Ok!
Thank you very much! If I manage to fix it, I will let you know.
