I have spent a few days tweaking this setup to attain these results:

Model                 Prompt (tok/s)   Generation (tok/s)
gemma-26b-moe         8.9              6.4
qwen3.5-4b-no-think   21.5             8.4

Although modest, these speeds are great for local parsing and analysis of my self-hosted homelab data, where sending logs to external APIs is not desirable.

Typical workflows:

  • Log analysis: Piping journalctl output to the API for error triage and root cause hypothesis generation.
  • Configuration synthesis: Generating AdGuard Home rewrite rules, nginx location blocks, or fstab entries based on defined parameters.
  • Troubleshooting constraints: Querying for failure modes specific to the local topology (e.g., NFS mount failures over a 1 Gbps unmanaged switch, Tailscale DERP routing behind CGNAT).
  • Alert context: Correlating Beszel/Uptime Kuma notifications with service-specific knowledge (e.g., “mediabox CPU spike while SabNZBd is extracting”).
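The log-analysis workflow above can be sketched as follows. This assumes a local llama.cpp-style server exposing the OpenAI-compatible `/v1/chat/completions` endpoint; the URL, port, and prompt wording are illustrative, not taken from the post:

```python
import json
import urllib.request

# Hypothetical endpoint for a local OpenAI-compatible server (adjust to taste).
API_URL = "http://localhost:8080/v1/chat/completions"

def build_triage_request(log_text: str,
                         model: str = "qwen3.5-4b-no-think") -> urllib.request.Request:
    """Wrap raw journalctl output in a chat-completions request body."""
    body = {
        "model": model,
        "messages": [{
            "role": "user",
            "content": "Triage these errors and suggest likely root causes:\n"
                       + log_text,
        }],
        "temperature": 0.2,  # keep triage output relatively deterministic
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode(),
        headers={"Content-Type": "application/json"},
    )

# Build (but do not send) a request from a sample journal line;
# sending it is just urllib.request.urlopen(req).
req = build_triage_request("nfs: server mediabox not responding, timed out")
print(req.get_full_url())
```

Piping `journalctl -p err -S -1h -o cat` output into `build_triage_request` and posting it covers the first workflow; the same request shape works for the config-synthesis prompts.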
  • variety4me@lemmy.zip (OP) · 4 days ago
    Memory:
      System RAM: total: 32 GiB available: 31.19 GiB used: 8.76 GiB (28.1%)
      Array-1: capacity: 64 GiB slots: 4 modules: 2 EC: None
      Device-1: ChannelA-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s
        actual: 2666 MT/s
      Device-2: ChannelA-DIMM1 type: no module installed
      Device-3: ChannelB-DIMM0 type: DDR4 size: 16 GiB speed: spec: 2667 MT/s
        actual: 2666 MT/s
      Device-4: ChannelB-DIMM1 type: no module installed
    
    • SuspciousCarrot78@lemmy.world · 2 days ago

      I was going to say dual channel improves iGPU performance, but that’s not really a factor here, is it? Is there a reason why you can’t upgrade the CPU, though? I think you’re stuck with AVX2 on that chip.