I have spent a few days tweaking this setup to attain these results:
| Model | Prompt (tok/s) | Generation (tok/s) |
|---|---|---|
| gemma-26b-moe | 8.9 | 6.4 |
| qwen3.5-4b-no-think | 21.5 | 8.4 |
Although modest, it is great for local parsing and analysis of my self-hosted homelab data, where sending logs to external APIs is not desirable.
Typical workflows:
- Log analysis: Piping `journalctl` output to the API for error triage and root-cause hypothesis generation.
- Configuration synthesis: Generating AdGuard Home rewrite rules, nginx location blocks, or `fstab` entries based on defined parameters.
- Troubleshooting constraints: Querying for failure modes specific to the local topology (e.g., NFS mount failures over a 1 Gbps unmanaged switch, Tailscale DERP routing behind CGNAT).
- Alert context: Correlating Beszel/Uptime Kuma notifications with service-specific knowledge (e.g., “mediabox CPU spike while SabNZBd is extracting”).
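For the log-analysis workflow above, a minimal sketch of wrapping `journalctl` output in a request body. This assumes a llama.cpp-style OpenAI-compatible `/v1/chat/completions` endpoint on localhost; the URL, prompt wording, and helper name are illustrative, not the exact setup:

```python
import json

# Assumed endpoint: llama.cpp's llama-server exposes an
# OpenAI-compatible /v1/chat/completions route (default port 8080).
API_URL = "http://localhost:8080/v1/chat/completions"

def build_triage_payload(log_text: str, model: str = "qwen3.5-4b-no-think") -> dict:
    """Wrap raw journalctl output in a chat-completions request body."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "You are a log triage assistant. Propose probable root causes."},
            {"role": "user",
             "content": "Triage these journal entries:\n\n" + log_text},
        ],
        "temperature": 0.2,  # keep the analysis mostly deterministic
    }

# Example usage (hypothetical triage.py reading piped logs from stdin,
# then POSTing the payload with urllib.request or curl):
#   journalctl -b -p err --no-pager | python triage.py
```

Since the endpoint speaks the OpenAI wire format, the same payload works unchanged if the backing model is swapped out for gemma-26b-moe.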
Do you have 2*16GB or 32GB?
```
Memory:
  System RAM: total: 32 GiB available: 31.19 GiB used: 8.76 GiB (28.1%)
  Array-1: capacity: 64 GiB slots: 4 modules: 2 EC: None
  Device-1: ChannelA-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s actual: 2666 MT/s
  Device-2: ChannelA-DIMM1 type: no module installed
  Device-3: ChannelB-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s actual: 2666 MT/s
  Device-4: ChannelB-DIMM1 type: no module installed
```

I was going to say dual channel improves iGPU performance, but that's not really a factor here, is it? Is there a reason why you can't upgrade the CPU, though? I think you're stuck with AVX2 on that chip.


