Commit Graph

11 Commits (3a65d561a70f60d2c67f607d2b00a944c7c427ed)

Author SHA1 Message Date
Disty0 f4ee9c7052 Add Flex attention 2025-11-09 00:14:38 +03:00
Disty0 2bbbb684cc Rename CK Flash attention to just Flash attention 2025-11-08 23:24:40 +03:00
Disty0 a93715e0da Don't expose AMD Triton Flash Atten for non AMD 2025-11-08 23:20:55 +03:00
Disty0 93797dff8e ROCm enable Dynamic atten only for RDNA2 and older GPUs 2025-11-08 22:53:26 +03:00
Vladimir Mandic 56026c4e61 refactor attention handling (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-11-08 10:55:41 -05:00
Disty0 c51552af90 Enable triton flash atten option for rocm linux too 2025-09-03 16:26:00 +03:00
Disty0 266c9c0d3d Move Zluda Triton flash atten hijack to Triton Flash attention option 2025-09-03 16:16:41 +03:00
Vladimir Mandic 9e6928410f offloading improvements (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-08-25 10:16:38 -04:00
Disty0 6597a9111f Fix offload detection with CPU only envs 2025-08-16 16:12:43 +03:00
Vladimir Mandic b120778c4f lint (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-07-31 17:33:44 -04:00
Vladimir Mandic fa44521ea3 offload-never and offload-always per-module and new highvram profile (Signed-off-by: Vladimir Mandic <mandic00@live.com>) 2025-07-31 11:40:24 -04:00