Glitcher: Glitch-Token Discovery and ASR Validation
Quick Answer
Glitcher is an open-source Python toolkit for discovering under-trained tokens in large language model tokenizers and quantifying how badly they degrade model behavior. It implements three discovery strategies — embedding-norm thresholding, gradient-guided mining, and full-vocabulary scans — plus a multi-attempt Attack Success Rate validator. It is aimed at ML engineers, security researchers, and red teamers auditing a model's vocabulary surface, and is actively maintained.
Glitcher is an open-source Python toolkit for finding the under-trained tokens that hide in modern LLM tokenizers and measuring how reliably they break model behavior. It is the runtime behind the Glitcher v2.0 research note and the full-vocabulary follow-on, and it sits in the security pillar of this site as a measurement tool rather than an exploit kit.
What it does
Glitcher implements three families of glitch-token discovery and a validation harness on top of them:
- Embedding-norm thresholding — flags vocabulary entries whose embedding norms fall outside the expected distribution, a fast first pass that surfaces obvious under-trained candidates.
- Gradient-guided behavioral mining — uses gradient signal from a target prompt to walk the vocabulary toward tokens that induce specific failure modes.
- Brute-force full-vocabulary scans — exhaustively probes every token in the tokenizer, the slowest but most complete strategy and the one used to produce the census in the follow-on paper.
- Multi-attempt ASR validation — scores each candidate token by Attack Success Rate across stochastic samples rather than a single greedy decode. This is the harness whose earlier determinism bug is documented in the methodology audit; the patched validator uses temperature sampling and reveals a continuous distribution of soft glitches that misbehave intermittently, instead of the bimodal 0%/100% artifact that greedy decoding produced.
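The multi-attempt scoring idea can be sketched in a few lines. This is a hedged illustration, not Glitcher's actual API: `attack_success_rate` and the `probe` stub are hypothetical names, and the 30% failure rate is an invented stand-in for a soft glitch.

```python
import random

def attack_success_rate(probe, n_attempts=20, temperature=1.0, seed=0):
    """Score one candidate token by the fraction of stochastic probes that
    elicit a failure, instead of trusting a single greedy decode.

    `probe` is a callable standing in for a real model query; here it is a
    hypothetical stub, not Glitcher's actual interface.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    failures = sum(probe(rng, temperature) for _ in range(n_attempts))
    return failures / n_attempts

# A "soft glitch" that misbehaves intermittently: fails on roughly 30% of
# temperature-sampled runs. A greedy (temperature ~ 0) decode collapses this
# to one deterministic outcome, which is what produced the bimodal
# 0%/100% artifact the methodology audit describes.
soft_glitch = lambda rng, temp: rng.random() < 0.3

print(attack_success_rate(soft_glitch, n_attempts=1000))
```

With enough attempts the score converges on the token's true intermittent failure rate, which is exactly the continuous distribution the patched validator surfaces.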
The intended use cases are tokenizer audits before model release, red-team baselining of an open-weights model's vocabulary surface, and reproducing or extending the published census results.
Who it's for
ML engineers and platform owners who need to know what their tokenizer ships, security researchers studying input attack surface against LLMs, and red teamers building a baseline against a specific target. Glitcher is a research-grade measurement tool; it is not a jailbreak generator and it is not a managed service. If you are looking for a turnkey production scanner, this is not it.
How to use it
The published runs were validated against meta-llama/Llama-3.2-1B-Instruct on an RTX 5090 in bfloat16. The toolkit is model-agnostic in design but has only been exercised end-to-end against that target in the public audit. A typical first run is a vocabulary scan against your own tokenizer and target model, executed inside a research environment with no production traffic on the same host.
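As a feel for what the fast first pass does before you reach for the real CLI, here is a minimal norm-thresholding sketch over a toy embedding matrix. The function name, the z-score cutoff, and the synthetic data are all illustrative assumptions, not Glitcher's defaults; consult the README for the actual commands.

```python
import numpy as np

def flag_low_norm_tokens(embeddings, z_cutoff=-4.0):
    """Flag vocabulary rows whose embedding L2 norm sits far below the
    vocabulary-wide distribution -- the fast first pass for surfacing
    under-trained candidates. The cutoff is an illustrative choice."""
    norms = np.linalg.norm(embeddings, axis=1)
    z = (norms - norms.mean()) / norms.std()
    return np.where(z < z_cutoff)[0]

# Toy embedding matrix: 1000 well-trained rows plus a few near-zero
# (under-trained) rows planted at known indices.
rng = np.random.default_rng(0)
emb = rng.normal(0.0, 0.02, size=(1000, 64))
emb[[7, 421, 900]] *= 0.01  # simulate under-trained tokens

print(flag_low_norm_tokens(emb))
```

On a real model you would pull the input embedding matrix from the checkpoint instead of generating one; the thresholding step itself is the same.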
See the repository README for the full CLI surface, configuration options, and reproduction recipes for the published experiments. The README is the source of truth for flags and install steps; this page is the editorial overview.
Status and roadmap
Glitcher is active. The most recent work, in early 2026, was the methodology audit that corrected the ASR validator's determinism bug and added temperature-sampled multi-attempt probes; those patches have been prepared for the repository. Known limitations: only one model family has been validated end-to-end, and the gradient-guided strategy is sensitive to the choice of seed prompt. Roadmap items are tracked in the repository rather than here.
Source and license
The canonical source lives at the GitHub repository linked above. The repository's license file is the authoritative statement of redistribution terms; this page does not assert a license that has not been confirmed.
Responsible use. Glitcher demonstrates a class of LLM input weakness — under-trained tokens that confound filtering, logging, and extraction pipelines. It is a tool for auditing models you operate or are authorized to test, not for probing third-party endpoints. Use it against research deployments, not production systems or someone else's API.
Related research
- Glitcher v2.0 research note — the originating paper documenting the toolkit's design.
- Full-vocabulary census follow-on — the exhaustive scan results and ASR distribution.
- ASR validation methodology audit — the determinism-bug correction and the case for temperature-sampled ASR.
- What are glitch tokens? — threat-model framing for readers who are new to the concept.
FAQ
Do I need a GPU to run Glitcher?
Full-vocabulary scans on a modern 128K-token tokenizer assume a CUDA-capable GPU; a single consumer card such as an RTX 5090 in bfloat16 is sufficient for the 1B-parameter target used in the published audit. CPU-only runs are feasible for small probe sets but are too slow to be practical for census-scale work.
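The back-of-envelope arithmetic behind that answer: a 1B-parameter model in bfloat16 costs 2 bytes per parameter for the weights alone, and a census over a 128K-entry tokenizer multiplies out quickly once each token gets multiple ASR attempts. The attempts-per-token figure below is illustrative, not the audit's actual setting.

```python
# Rough sizing for the published setup; estimates, not measured figures.
params = 1_000_000_000
bytes_per_param = 2               # bfloat16
weights_gib = params * bytes_per_param / 2**30

vocab = 128_000
attempts_per_token = 20           # illustrative multi-attempt ASR budget
total_probes = vocab * attempts_per_token

print(f"weights alone: ~{weights_gib:.1f} GiB")
print(f"census-scale probe count: {total_probes:,}")
```

The weights fit comfortably on a single consumer card; the probe count is what makes CPU-only census runs impractical.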
Is Glitcher safe to run against a production model?
Run it against a research or staging deployment. The probes are content-safe by design but they deliberately exercise the edge-case behavior of the tokenizer and model, which is exactly the surface you do not want hitting production traffic paths or shared logging pipelines. Treat the output as red-team data and keep it off live request paths.