

An automatic metric compares the translation hypothesis with at least one reference translation produced by humans and, rarely, also uses the source text translated by the machine translation system.

Here is an example of a French-to-English translation:

Source text: Le chat dort dans la cuisine donc tu devrais cuisiner ailleurs.

Translation hypothesis (generated by machine translation): The cat sleeps in the kitchen so cook somewhere else.

Reference translation: The cat is sleeping in the kitchen, so you should cook somewhere else.

The translation hypothesis and the reference translation are both translations of the same source text. The objective of the automatic metric is to yield a score that can be interpreted as a distance between the translation hypothesis and the reference translation. The smaller the distance, the closer the system is to generating a translation of human quality. The absolute score returned by a metric is usually not interpretable on its own. It is almost always used to rank machine translation systems: a system with a better score is considered a better system.
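To make the idea of a metric as a distance more concrete, here is a minimal sketch in Python. It is not BLEU or any established metric: it is a toy word-level edit distance, normalized by the reference length, chosen only for illustration. The function names, the whitespace tokenization, and the second system (`system_b`) with its output are assumptions made up for this example.

```python
def edit_distance(hyp_tokens, ref_tokens):
    """Word-level Levenshtein distance between two token lists."""
    # prev[j] holds the distance between the hypothesis prefix processed
    # so far and the first j reference tokens.
    prev = list(range(len(ref_tokens) + 1))
    for i, hyp_tok in enumerate(hyp_tokens, start=1):
        curr = [i]
        for j, ref_tok in enumerate(ref_tokens, start=1):
            cost = 0 if hyp_tok == ref_tok else 1
            curr.append(min(
                prev[j] + 1,         # drop a hypothesis token
                curr[j - 1] + 1,     # add a missing reference token
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


def toy_distance(hypothesis, reference):
    """Toy metric (an assumption, not a standard one): lower is better."""
    # Naive tokenization: lowercase and split on whitespace, so
    # punctuation stays attached to words.
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    return edit_distance(hyp, ref) / max(len(ref), 1)


reference = "The cat is sleeping in the kitchen, so you should cook somewhere else."

# system_a reuses the hypothesis from the example above;
# system_b is a hypothetical, deliberately poor system.
hypotheses = {
    "system_a": "The cat sleeps in the kitchen so cook somewhere else.",
    "system_b": "The cat kitchen cook.",
}

# Rank systems from smallest to largest distance to the reference.
ranking = sorted(hypotheses, key=lambda name: toy_distance(hypotheses[name], reference))

for name in ranking:
    print(name, round(toy_distance(hypotheses[name], reference), 3))
```

The individual values mean little in isolation; what the sketch shows is the ranking, where the system whose output is closer to the reference comes first.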

