I have spent a few days tweaking this setup to attain these results:

Model                 Prompt (tok/s)   Generation (tok/s)
gemma-26b-moe         8.9              6.4
qwen3.5-4b-no-think   21.5             8.4

Although modest, these speeds are plenty for local parsing and analysis of my self-hosted homelab data, where sending logs to external APIs is not desirable.

Typical workflows:

  • Log analysis: Piping journalctl output to the API for error triage and root cause hypothesis generation.
  • Configuration synthesis: Generating AdGuard Home rewrite rules, nginx location blocks, or fstab entries based on defined parameters.
  • Troubleshooting constraints: Querying for failure modes specific to the local topology (e.g., NFS mount failures over a 1 Gbps unmanaged switch, Tailscale DERP routing behind CGNAT).
  • Alert context: Correlating Beszel/Uptime Kuma notifications with service-specific knowledge (e.g., “mediabox CPU spike while SabNZBd is extracting”).
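The log-analysis workflow above can be sketched roughly like this, assuming an OpenAI-compatible chat endpoint on localhost (llama.cpp's llama-server, Ollama, and similar all expose one); the URL, port, and model name below are placeholders for whatever your own setup serves:

```python
import json
import subprocess  # used to shell out to journalctl on the host
import urllib.request

# Hypothetical endpoint and model name -- adjust to your local server.
API_URL = "http://localhost:8080/v1/chat/completions"
MODEL = "qwen3.5-4b-no-think"

def build_triage_payload(log_text: str, max_lines: int = 200) -> dict:
    """Wrap the last `max_lines` of log output in an error-triage prompt."""
    tail = "\n".join(log_text.splitlines()[-max_lines:])
    return {
        "model": MODEL,
        "messages": [
            {"role": "system",
             "content": "You are a sysadmin. Identify errors in these logs "
                        "and propose a root-cause hypothesis."},
            {"role": "user", "content": tail},
        ],
        "temperature": 0.2,
    }

def triage(unit: str) -> str:
    """Fetch recent journalctl output for a systemd unit and ask the local model to triage it."""
    logs = subprocess.run(
        ["journalctl", "-u", unit, "-n", "200", "--no-pager"],
        capture_output=True, text=True, check=True,
    ).stdout
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_triage_payload(logs)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# Usage (on a machine with systemd and a local server running):
#   print(triage("nginx"))
```

Keeping the tail of the log under a line cap matters at these generation speeds: at ~20 tok/s prompt processing, a few thousand lines of journalctl output would stall the response noticeably.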
    • variety4me@lemmy.zip (OP) · 3 days ago
      Memory:
        System RAM: total: 32 GiB available: 31.19 GiB used: 8.76 GiB (28.1%)
        Array-1: capacity: 64 GiB slots: 4 modules: 2 EC: None
        Device-1: ChannelA-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s
          actual: 2666 MT/s
        Device-2: ChannelA-DIMM1 type: no module installed
        Device-3: ChannelB-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s
          actual: 2666 MT/s
        Device-4: ChannelB-DIMM1 type: no module installed
      
      • SuspciousCarrot78@lemmy.world · 1 day ago

        I was going to say dual channel improves iGPU performance, but that’s not really a factor here, is it? Is there a reason why you can’t upgrade the CPU, though? I think you’re stuck with AVX2 on that chip.