TheCornCollector
I’m just here for the moral superiority.🌱
Mainly interested in FOSS
Currently in uni and working part-time as a developer and system administrator.
PC Specs
CPU: 7800X3D
GPU: 7900XTX
Memory: 64GB
System: Arch
- 3 Posts
- 4 Comments
TheCornCollector@piefed.zip OP to LocalLLaMA@sh.itjust.works • Qwen3.6-35B-A3B released (English)
0 · 11 days ago

I’ve been using it for the past few days and the output quality seems to be on par with or slightly better than 3.5 27B. The biggest issue is that token usage has exploded with this revision: it can easily reason for 20k-25k tokens on a question where the Qwen3.5 models used 10k. Since it runs more than three times faster, it still finishes earlier than the 27B, but I won’t have any context/VRAM left to ask multiple questions.
Artificial Analysis has similar findings.
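To make the context math concrete, here’s a rough back-of-the-envelope sketch in Python; the 32k window and per-turn overhead are illustrative assumptions, not measurements from my setup:

```python
# Back-of-the-envelope context budget; all numbers here are assumptions
# for illustration, not measurements.
context_window = 32_768      # assumed context length that fits in VRAM
reasoning_tokens = 25_000    # upper end of what I've seen per question
prompt_and_answer = 2_000    # assumed room for the prompt + final answer

remaining = context_window - (reasoning_tokens + prompt_and_answer)
print(f"Tokens left for a follow-up question: {remaining}")  # ~5.8k
```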

TheCornCollector@piefed.zip OP to LocalLLaMA@sh.itjust.works • Qwen3.6-35B-A3B released (English)
0 · 16 days ago

I agree with the other commenters’ suggestions; I just wanted to add that I personally run llama.cpp directly with the built-in llama-server. For a single-user server this seems to work great and is almost always at the forefront of model support.
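If anyone wants to poke at it once llama-server is running, it exposes an OpenAI-compatible HTTP API. A minimal Python sketch (the default 127.0.0.1:8080 address is an assumption; match it to your --host/--port flags):

```python
# Minimal sketch of querying a local llama-server over its OpenAI-compatible
# HTTP API. The address assumes the default host/port; adjust as needed.
import requests

resp = requests.post(
    "http://127.0.0.1:8080/v1/chat/completions",
    json={
        "model": "local",  # llama-server serves whichever model it was started with
        "messages": [{"role": "user", "content": "Summarize what an MoE model is."}],
        "max_tokens": 256,
    },
    timeout=300,
)
print(resp.json()["choices"][0]["message"]["content"])
```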
TheCornCollector@piefed.zip OP to LocalLLaMA@sh.itjust.works • Qwen3.6-35B-A3B released (English)
0 · 9 days ago

I’m running it with the UD_Q4_K_XL quant on a 24GB 7900XTX at ~120-130* tokens/s. Since it’s an MoE model, CPU inference with 32GB of RAM should be doable, but I won’t make any promises on speed.

*Edit: I had a configuration issue in my llama.cpp setup that limited it to 85 tk/s, but that was user error on my part.
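For anyone who’d rather drive it from Python than run llama-server, a rough sketch using the llama-cpp-python bindings is below; the filename and settings are placeholders/assumptions, not my actual config:

```python
# Sketch of CPU-only (or partially offloaded) inference through the
# llama-cpp-python bindings, just to show the offload knob -- not how I run it
# myself (I use llama-server directly). The filename is a hypothetical placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="Qwen3.6-35B-A3B-UD-Q4_K_XL.gguf",  # hypothetical local GGUF path
    n_gpu_layers=0,   # 0 = pure CPU inference; raise it to offload layers to the GPU
    n_ctx=16384,      # context length; larger contexts need more RAM/VRAM
)

out = llm("Explain mixture-of-experts in one paragraph.", max_tokens=256)
print(out["choices"][0]["text"])
```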



Ah, I don’t know anything about Windows. I’m using Linux, and both the latest ROCm (7.2.2) and latest Vulkan (26.0.5) packages work without issues for combined gaming and AI. For reference, my reported numbers were with Vulkan at zero context.