Bewildered enthusiasts decry memory price increases of 100% or more — the AI RAM squeeze is finally starting to hit PC builders where it hurts

themachinestops@lemmy.dbzer0.com · 2 days ago

Bewildered enthusiasts decry memory price increases of 100% or more — the AI RAM squeeze is finally starting to hit PC builders where it hurts

brucethemoose@lemmy.world · 15 hours ago

4GB VRAM

Mmmmm… I would wait a few days, and try a GGUF quantization of Kimi Linear once its better supported: https://huggingface.co/moonshotai/Kimi-Linear-48B-A3B-Instruct

Otherwise you can mess with Qwen 3 VL now, in the native llama.cpp UI. But be aware that Qwen is pretty sycophantic like ChatGPT: https://huggingface.co/unsloth/Qwen3-VL-30B-A3B-Instruct-GGUF/blob/main/Qwen3-VL-30B-A3B-Instruct-UD-Q4_K_XL.gguf

If you’re interested, I can work out an optimal launch command. But to be blunt, with that setup, you’re kinda better off using free LLM APIs with a local chat UI.