
Are you running it locally with llama.cpp? If so, is it working without any tweaking of the chat template? The tool calls fail for me when using the default chat template, however it seems to work a whole lot better with this: https://huggingface.co/Qwen/Qwen3.5-35B-A3B/discussions/9#69...
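For reference, llama-server can load a replacement chat template from a file instead of the one embedded in the GGUF. A minimal invocation might look like the sketch below; the model and template filenames are placeholders, and the template file would contain the fixed Jinja template from the linked discussion:

```shell
# Launch llama-server with an external chat template.
# --jinja enables the Jinja template engine for chat formatting;
# --chat-template-file overrides the template baked into the GGUF.
# Paths below are hypothetical placeholders.
llama-server \
  -m ./Qwen3.5-35B-A3B-Q6_K.gguf \
  --jinja \
  --chat-template-file ./fixed-template.jinja
```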


I’ve been running it via llama-server with no issues, using the latest Bartowski 6-bit quant.


Bartowski? Like Chuck Bartowski from the TV show?


Different one. Bartowski is a minor celebrity in the local LLM world, together with Unsloth.


What's the selling point of these quants vs the Unsloth ones?


Sometimes Unsloth has broken quants for a particular model, sometimes none at all, and there are subtle differences in behavior between the two.


Thanks, I'll check his quants.


Have you tried the '--jinja' flag in llama-server?


Yes, it fails too. I’m using the Unsloth Q4_K_M quant. Devstral2 Small fails in a similar way; I fixed that by using a comparable template I found for it. Maybe it’s the quants that are broken; I guess I need to redownload them.



