Research-Grade MT Evaluation
The MT Eval Harness is a language-agnostic evaluation framework for machine translation, purpose-built for low-resource languages where commercial metrics fall short. It produces standardized JSON reports with chrF++, BLEU, exact match, and semantic validation scores.
The harness is designed as a companion to i18n-rosetta: translation methods developed and validated inside the harness can be exported as rosetta-compatible plugins, creating a direct pipeline from research to production i18n.
Zero language-specific dependencies. Bring your own corpus and your own translation provider: the harness evaluates anything that produces text.
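To make the "anything that produces text" idea concrete, here is a minimal sketch in plain Python. It is not the harness's actual API: the `Provider` type, the `evaluate` function, and the report field names are all hypothetical, chosen only for illustration. It computes exact match alone; chrF++ and BLEU would typically come from a metrics library such as sacreBLEU.

```python
import json
from typing import Callable, Iterable, Tuple

# Hypothetical provider type (an assumption, not the harness's interface):
# any callable that maps source text to translated text.
Provider = Callable[[str], str]


def evaluate(provider: Provider, corpus: Iterable[Tuple[str, str]]) -> dict:
    """Score a provider on (source, reference) pairs using exact match only."""
    pairs = list(corpus)
    hypotheses = [provider(src) for src, _ in pairs]
    # Count hypotheses identical to their reference after trimming whitespace.
    matches = sum(
        hyp.strip() == ref.strip() for hyp, (_, ref) in zip(hypotheses, pairs)
    )
    # Field names are illustrative, not the harness's real report schema.
    return {
        "segments": len(pairs),
        "exact_match": matches / len(pairs) if pairs else 0.0,
    }


def toy_provider(text: str) -> str:
    """Lookup table standing in for a real translation system."""
    table = {"hello": "bonjour", "thank you": "merci"}
    return table.get(text, text)


corpus = [("hello", "bonjour"), ("thank you", "merci"), ("goodbye", "au revoir")]
report = evaluate(toy_provider, corpus)
print(json.dumps(report, indent=2))
```

Because the provider is just a callable, the same function would accept a wrapper around an API client, a local model, or a rule-based system without any language-specific code.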