The Qwen3.5 models are still the best local models I’ve used, so I’m excited to see how this updated version performs.

  • FrankLaskey@lemmy.ml
    11 days ago

    Yes, I did see that as well. That does seem to be the real Achilles' heel here. I'll have to try it myself to see how much it exacerbates context-size limitations, given that I'd be running it on a single 24 GB VRAM GPU. I wonder if adjusting the reasoning-effort parameter could make a difference without hurting quality too much?
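
    For what it's worth, a minimal sketch of what that experiment might look like, assuming the model is served behind an OpenAI-compatible endpoint (as llama.cpp's llama-server and similar local runners provide): the OpenAI chat-completions schema includes a `reasoning_effort` field taking "low", "medium", or "high". The model name and endpoint below are placeholders, and whether a given local backend and model actually honor the field is backend-dependent.

    ```python
    import json

    # Hypothetical request body for a local OpenAI-compatible server;
    # you would POST this to e.g. http://localhost:8080/v1/chat/completions.
    payload = {
        "model": "local-model",  # placeholder; depends on your server config
        "messages": [
            {"role": "user", "content": "Explain KV-cache memory use."}
        ],
        # Lowering effort may shorten reasoning traces and so leave more
        # of the context window (and VRAM for KV cache) for the task itself.
        "reasoning_effort": "low",  # "low" | "medium" | "high"
        "max_tokens": 512,
    }

    body = json.dumps(payload)
    ```

    Comparing output quality and token counts across the three effort levels on the same prompts would show whether the trade-off is worthwhile.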