

Same. I’m running Qwen3.6-35B-A3B-FP8 (Qwen3.6-35B-A3B-UD-IQ4_XS.gguf) via the turboquant fork of llama.cpp with a few tweaked memory settings, and I get like 40 tokens / second – nothing that required special insight on my part just following the instructions I saw on a youtube video I found via !LocalLLaMA@sh.itjust.works and asking claude to help me through the installation.
AI has no economic moat. There’s nothing stopping anyone from running LLMs locally.





https://www.amazon.com/dp/B0BV8H8HVD with linux mint installed.