The app and the installers: https://www.patreon.com/posts/120193330
Check the screenshots below to see how to use it
Currently the app works very well with 4-bit quantization and runs very fast
I am looking into lowering VRAM usage even further, for example by adding CPU offloading and other optimizations if possible
Previously we were lacking Triton, but it now works perfectly
My installer installs into a completely isolated, clean Python 3.10 venv
You can see the entire app and installer source code
If you get a Triton error, make sure to delete your Triton cache after installing the app, like below:
C:\Users\Furkan\.triton
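If you prefer not to delete the folder by hand, here is a minimal Python sketch that does the same thing. It assumes the Triton cache is the `.triton` folder inside your user home folder, as in the path above; `clear_triton_cache` is a hypothetical helper name and is not part of the app or the installer.

```python
import shutil
from pathlib import Path


def clear_triton_cache(cache_dir=None):
    """Delete the Triton cache folder; return True if something was removed."""
    # Assumption: the cache is the ".triton" folder under the user's home
    # folder. Pass cache_dir explicitly if yours is somewhere else.
    target = Path(cache_dir) if cache_dir is not None else Path.home() / ".triton"
    if not target.is_dir():
        return False  # nothing to delete
    shutil.rmtree(target)  # removes the folder and everything inside it
    return True


if __name__ == "__main__":
    removed = clear_triton_cache()
    print("Triton cache removed" if removed else "No Triton cache found")
```

Run it with the same Python that the installer set up, then start the app again so Triton can rebuild its cache.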
Source link