Better yet, download Qwen 3.5/3.6 along with a “raw” notepad like Mikupad, and try it yourself:
https://huggingface.co/ubergarm/Qwen3.6-27B-GGUF
https://github.com/lmg-anon/mikupad
One might observe:
Chat formatting, and how janky the “thinking” block is.
How words are broken up into tokens, not characters.
How particularly funky that gets with numbers.
Precisely how sampling “randomizes” the answers by visualizing “all possible answers” with the logprobs display.
And, thus, precisely how and why carb counting in ChatGPT fails, yet a measly local LLM on a desktop/phone could get it right with a little tooling or adjustment.
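To make the sampling point concrete, here is a rough Python sketch of what a logprobs display is showing. The logprob numbers are made up for illustration (they are not from any real model), and the function names are my own. It assumes plain temperature sampling over the listed candidates; real backends typically layer top-k/top-p and other filters on top.

```python
import math
import random

# Hypothetical logprobs for the next digit token after "Carbs: 4".
# Made-up numbers, not output from any real model.
logprobs = {"5": -0.4, "2": -1.6, "8": -2.3, "0": -3.0}

def probabilities(logprobs, temperature=1.0):
    """Turn logprobs into a normalized distribution, rescaled by temperature."""
    scaled = {tok: lp / temperature for tok, lp in logprobs.items()}
    top = max(scaled.values())  # subtract the max for numerical stability
    weights = {tok: math.exp(s - top) for tok, s in scaled.items()}
    total = sum(weights.values())
    return {tok: w / total for tok, w in weights.items()}

def sample(logprobs, temperature=1.0, rng=random):
    """Pick one token; temperature 0 always takes the most likely one."""
    if temperature == 0:
        return max(logprobs, key=logprobs.get)
    probs = probabilities(logprobs, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    draws = [sample(logprobs, 1.0, rng) for _ in range(1000)]
    for tok in logprobs:
        print(tok, draws.count(tok) / 1000)
    print("greedy:", sample(logprobs, temperature=0))
```

At temperature 1.0 the less likely digits still get picked a fair share of the time, which is how the same question can come back with a different number on each try. Dropping the temperature to 0 is the "adjustment": the top candidate wins every time.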
This is exactly what OpenAI/Anthropic don’t want you to do. They want users dumb and tethered, as with a cloud subscription or social media platform. Not cognizant of how the tools they are peddling as magic lamps actually work. And why, and how, they’re often stupid.
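As for the "little tooling" route to carb counting, one way it could look: the model only names the foods and portions, and ordinary code does the lookup and the arithmetic. This is a minimal sketch under that assumption. The table, its values, and the function name are placeholders of mine, not a real nutrition database or any particular tool-calling API.

```python
# Placeholder carbs per 100 g. Illustrative values only; a real setup would
# pull these from an actual nutrition database.
CARBS_PER_100G = {
    "cooked white rice": 28.0,
    "apple": 14.0,
    "banana": 23.0,
}

def total_carbs(items):
    """items: list of (food name, grams). Returns total carbs in grams."""
    total = 0.0
    for name, grams in items:
        if name not in CARBS_PER_100G:
            # Fail loudly rather than let anything guess a number.
            raise KeyError(f"unknown food: {name!r}")
        total += CARBS_PER_100G[name] * grams / 100.0
    return round(total, 1)

if __name__ == "__main__":
    # In practice this list would be the structured output of the LLM.
    meal = [("cooked white rice", 150), ("apple", 120)]
    print(total_carbs(meal))
```

The design point is that no digit of the final number is ever sampled token by token, so the tokenization and sampling quirks above never touch the math.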