Which open models are actually good at agentic coding?

hok@lemmy.dbzer0.com · 5 days ago

Which open models are actually good at agentic coding?

SuspciousCarrot78@lemmy.world · edit-2 1 day ago

For “fun”, I’ve been using Qwen3.6 27B, Gpt 5.4 mini and Claude to audit (one file at a time) my code. The workflow is

I flag issues
Claude writes the probe spec
Qwen Audits (OR via Roo)
GPT Audits (OR via Roo)
I review
Claude and I consolidate the bug reports
We run that past GPT in Codex
It tests / replicates it against code base
I review the output
Claude and I prioritise what needs fixing, what can be deferred and what can be ignored
I / we create the ticket with staged gates
GPT spins up a sandbox
Run gate 1
Fix bug 1
Smoke test against sandboxed
I review and discuss
Iterate
Again
Smoke test passes or we pivot to diff fix.
Once happy, back port and snap shot
Update ticket index and ticket itself with what we did, what worked, what was out of scope
Exfil new code to main, manually test again.
if all good, back up, archive (3-2-1).

Its been my experience (so far) that Qwen 3.6 27B is very capable in uncovering bugs, sometimes finding issues the others miss. Paradoxically, it’s not much cheaper to call via OR that GPT because it tends to skew verbose.

I may trial the 27B as the “hands” for a run or 2 (Qwen 3.6 35B has been unreliable for me via OR) to see how it does. Tight leash.

PS: This approach may be …overkill. I’m not a great code monkey, but I’m pretty decent at engineering, QA, and project management. I’m leveraging my skills, and this flow may not suit you. So, YMMV.