58 Commits (3a65d561a70f60d2c67f607d2b00a944c7c427ed)

Author SHA1 Message Date
Seunghoon Lee 9e91ac6310
therock use v2 instead of v2-staging 2025-11-20 16:12:23 +09:00
vladmandic d0ac508a59 lint
Signed-off-by: vladmandic <mandic00@live.com>
2025-11-17 20:12:47 -05:00
Seunghoon Lee 674a8f097d
windows rocm install devel 2025-11-17 13:04:16 +09:00
Seunghoon Lee 57e8da7a36
windows gfx120x disable MIOpen 2025-11-03 00:37:59 +09:00
Seunghoon Lee b0e147a459
handle different sizes of hipDeviceProp_t 2025-10-31 16:45:18 +09:00
Vladimir Mandic e972141917 linting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 08:33:58 -04:00
Vladimir Mandic c8ca5cd75c load rocm.py only when needed
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-10-18 08:31:28 -04:00
Seunghoon Lee a36916616f
windows catch rocm detection failure 2025-10-18 12:00:30 +09:00
Seunghoon Lee 552c223569
use driver library, more checks for windows rocm 2025-10-18 01:17:30 +09:00
Disty0 6b67a9d0c4 SDNQ add check_mats to matmul 2025-09-30 01:58:13 +03:00
Vladimir Mandic c661497b87 linting
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2025-09-28 18:09:09 -04:00
Disty0 4a70e82b0c ROCm always use numpy on cholesky 2025-09-28 14:05:20 +03:00
Disty0 a47959b114 move ROCm Windows hijacks outside of torch install 2025-09-28 13:33:15 +03:00
Seunghoon Lee 06b14b070b
hijack torch.linalg.cholesky() 2025-09-28 18:43:37 +09:00
Seunghoon Lee 9c19e8a9b8
return correct item 2025-09-28 18:34:10 +09:00
Seunghoon Lee 6e8abf8dc3
hijack cholesky decomposition for therock pytorch 2025-09-28 18:03:02 +09:00
Seunghoon Lee 0de3732884
prioritize python package check over PATH 2025-09-28 11:31:43 +09:00
Seunghoon Lee 35b1c6b07d
windows install rocm if not installed 2025-09-28 05:39:41 +09:00
Seunghoon Lee 059392afc7
clean up 2025-09-27 17:51:59 +09:00
Disty0 503a178794 Don't load rocm hsa on non wsl envs and remove unused lib hijacks 2025-09-27 11:39:24 +03:00
Seunghoon Lee 579b1f3175
do not load global amdhip64, let pytorch load it 2025-09-27 17:01:14 +09:00
Seunghoon Lee d52d84eda5
clean up 2025-09-27 16:49:29 +09:00
Luca Beltrame f00486c8f3
Fix HIP library detection
The code unconditionally checked `lib`, but on modern Linux distributions, 64-bit binaries are under `lib64` instead.

Note that this might be slightly different on Debian-based distributions, but as I don't have one I can't test it.
2025-09-27 09:23:32 +02:00
Seunghoon Lee e3e41298d9
remove unused import 2025-09-27 11:29:55 +09:00
Seunghoon Lee 8d2aedd924
hotfix 2025-09-27 11:17:15 +09:00
Seunghoon Lee d63bc05b6c
update rocm.py to detect rocm-sdk packages 2025-09-27 11:12:13 +09:00
Disty0 4b76049aed fix basicsr and gfpgan 2025-09-10 19:13:08 +03:00
Disty0 266c9c0d3d Move Zluda Triton flash atten hijack to Triton Flash attention option 2025-09-03 16:16:41 +03:00
Disty0 d99777616b ROCm fix Segfault and remove the conflicting hipblaslt override 2025-08-29 19:03:49 +03:00
Disty0 74b6edf2df revert gfx1101 2025-06-11 19:25:05 +03:00
Disty0 6aa5c08fb0 Cleanup and update changelog 2025-06-11 16:03:33 +03:00
Disty0 71be3c7d45 ROCm don't override gfx with gfx1100 and gfx1101 + rocm 6.4 2025-06-11 15:47:25 +03:00
Disty0 df6b13ea47 Don't set gfx override with RX 9000 and above 2025-06-11 15:09:03 +03:00
Disty0 bd2d9d1677 Python 3.13 support 2025-06-09 22:58:08 +03:00
Disty0 748bdbf437 Update rocm flash attention repo with navi rotary fix 2025-04-08 00:15:57 +03:00
Seunghoon Lee 0eaa2c0378
rocm wsl better arch detection 2025-04-01 21:14:02 +09:00
Disty0 dc551bfd85 Set blaslt_tensile_libpath to empty string instead of None 2025-02-15 18:04:42 +03:00
Seunghoon Lee 13e9f4f3e1
flash attn triton is already merged into main 2025-02-14 23:21:56 +09:00
Disty0 f94196bcd1 Rename ROCm Flash atten hijack to CK Flash atten and enable AOTriton memory and flash atten by default 2025-02-13 22:01:06 +03:00
Seunghoon Lee 389bb78b8e
Fix bug. 2025-01-11 23:41:17 +09:00
Seunghoon Lee e43d8e9448
check hipblaslt availability in windows 2025-01-11 23:36:51 +09:00
Seunghoon Lee d597d5912d
use bitmasking for agent detection 2024-10-26 13:40:54 +09:00
Vladimir Mandic e0d702a3dd add gfx autodetect options
Signed-off-by: Vladimir Mandic <mandic00@live.com>
2024-10-07 11:38:10 -04:00
Seunghoon Lee 470ea10e95
fix hipinfo 2024-09-27 10:24:55 +09:00
Seunghoon Lee b74af75622
add more known devices 2024-09-27 09:42:13 +09:00
Seunghoon Lee 96922d70a7
rocm&zluda handle apu 2024-09-26 13:34:11 +09:00
Seunghoon Lee 1395f5bf9e
add triton backend flash-attn (experimental) 2024-09-24 12:53:18 +09:00
Seunghoon Lee e246e55734
rocm install flash-attn if needed 2024-09-24 12:27:43 +09:00
Seunghoon Lee 9c4213e4a8
tcmalloc experiment 2024-08-09 16:11:12 +09:00
Seunghoon Lee 864263f570
accurate wsl check 2024-08-09 10:38:28 +09:00
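
Note on commit f00486c8f3 ("Fix HIP library detection"): its message says the code checked only `lib`, while many 64-bit Linux distributions place libraries under `lib64`. The sketch below illustrates that idea and is not the repository's actual code. The function name `find_hip_library`, the candidate order, the file name `libamdhip64.so`, and the `/opt/rocm` root are all assumptions made for illustration.

```python
from pathlib import Path
from typing import Optional, Sequence


def find_hip_library(
    rocm_root: "str | Path",
    lib_dirs: Sequence[str] = ("lib64", "lib"),  # assumed order: lib64 first, then lib
    name: str = "libamdhip64.so",                # assumed library file name
) -> Optional[Path]:
    """Return the first existing HIP library path under rocm_root, or None.

    Hypothetical helper: it probes each candidate directory in order
    instead of assuming the library always lives in `lib`.
    """
    root = Path(rocm_root)
    for lib_dir in lib_dirs:
        candidate = root / lib_dir / name
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    # "/opt/rocm" is an assumed install root, used only as an example.
    print(find_hip_library("/opt/rocm"))
```

Because the commit message notes that Debian-based layouts may differ and were not tested, the candidate directories are a parameter here, so a caller could pass extra entries through `lib_dirs` rather than editing the function.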