
Weighting Machine Translation Quality with Cost of Correction

2009/07/17

Machine translation (MT) was one topic yesterday as I enjoyed lunch with Ben Cornelius and Ray Flournoy of Adobe Systems. Of course, a key criterion for evaluating these tools is the quality of the translated output. Most people would say it is the most important criterion. Speed, cost, and integration with other tools are also significant.

However, evaluation of any tool should take the intended application into account. Some applications can use MT output directly; many more require review and editing of the output before publication. Manual review and editing introduce labor costs and delays, which can be significant.

Therefore, when evaluating MT tools, we should look at the cost of operating the tool plus the cost of post-editing. Clearly, the optimum is 100% quality with no post-edits required. But this is not usually the case…

Not all editing tasks are the same. Some edits are easy to make and cheap to fix; others are labor intensive. The entries that require editing may be obvious and therefore easy to find, or they may be subtle and require more intensive scrutiny to identify.

The post-editing needed by machine translation output will follow a pattern that varies with the MT engine (and its rules, training, etc.). (Human authors also have writing patterns that require particular classes of edits.) Typographic and terminology-substitution errors may be easy to address. Some grammar and style errors may be more costly. Consistency, flow, and the relationships among sequences of sentences may be harder yet.

This suggests an interesting criterion for evaluating tools: the joint editing productivity and total operational cost of using the MT tool. An MT product that generates text needing edits that are both easy to find and easy to fix could have a very low total cost. Another tool, producing higher-quality linguistic output, might still be less productive if post-editing is difficult.

A good metric for MT tools would assign each class of error a weight proportional to the cost of fixing it. A document could have 100 typos and still be cheaper to ready for publication than a document with only a few consistency or contextual errors that require thought and consideration to address.
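Such a metric could be sketched in a few lines of code. The error classes and per-class weights below are illustrative assumptions, not measured post-editing costs:

```python
# A minimal sketch of a cost-weighted MT error metric.
# Weights are hypothetical (say, minutes of editor time per instance).
EDIT_COST = {
    "typo": 0.25,
    "terminology": 0.5,
    "grammar": 1.0,
    "style": 1.5,
    "consistency": 5.0,
    "context": 8.0,
}

def weighted_edit_cost(error_counts):
    """Estimated post-editing cost for a document, given a
    count of errors in each class."""
    return sum(EDIT_COST[cls] * n for cls, n in error_counts.items())

# A document with 100 typos...
doc_a = {"typo": 100}
# ...versus one with only a few consistency/contextual errors.
doc_b = {"consistency": 3, "context": 2}

print(weighted_edit_cost(doc_a))  # 25.0
print(weighted_edit_cost(doc_b))  # 31.0
```

Under these assumed weights the typo-riddled document is cheaper to ready for publication than the one with five hard errors, which is exactly the effect an unweighted error count would miss.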

This metric would also help with process configuration. For example, if I have to produce both Mexican and Iberian Spanish translations based on English source material, I have several options.

If “>-MT->” represents a machine translation step, and “>-PE->” represents a post-edit step:

Option A (Simple MT, then PE):
  Step 1: en >-MT-> mx
          en >-MT-> es
  Step 2: mx >-PE-> mx2
          es >-PE-> es2

Option B (mx to es):
  Step 1: en >-MT-> mx
          mx >-MT-> es
  Step 2: mx >-PE-> mx2
          es >-PE-> es2

Option C (mx post-edit to es):
  Step 1: en >-MT-> mx
          mx >-PE-> mx2
  Step 2: mx2 >-PE-> es
          es >-PE-> es2

Option D (es to mx):
  Step 1: en >-MT-> es
          es >-MT-> mx
  Step 2: es >-PE-> es2
          mx >-PE-> mx2

Option E (es post-edit to mx):
  Step 1: en >-MT-> es
          es >-PE-> es2
  Step 2: es2 >-PE-> mx
          mx >-PE-> mx2
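One way to compare these options is to model each workflow as a sequence of steps and sum a cost per step. The per-step costs below are purely hypothetical placeholders; in practice they would come from the tool's operating cost plus the weighted post-editing measure described above:

```python
# Sketch: comparing workflow options by total cost.
# (kind, source, target) triples mirror the ">-MT->" / ">-PE->" notation.
OPTIONS = {
    "A": [("MT", "en", "mx"), ("MT", "en", "es"),
          ("PE", "mx", "mx2"), ("PE", "es", "es2")],
    "C": [("MT", "en", "mx"), ("PE", "mx", "mx2"),
          ("PE", "mx2", "es"), ("PE", "es", "es2")],
}

def step_cost(kind, src, tgt):
    """Assumed cost per step, in arbitrary units (hypothetical values)."""
    if kind == "MT":
        return 1.0                 # machine time is cheap
    if (src, tgt) == ("mx2", "es"):
        return 3.0                 # adapting already-edited mx: lighter PE
    return 5.0                     # full post-edit of raw MT output

def total_cost(option):
    return sum(step_cost(*step) for step in OPTIONS[option])

for name in OPTIONS:
    print(name, total_cost(name))  # A 12.0 / C 14.0 under these assumptions
```

Changing the assumed step costs can flip the ranking, which is the point: the cheapest workflow depends on measured editing effort, not on the MT quality score alone.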

The most effective scenario is the one requiring the least editing effort. This may not correlate with unweighted measurements of each machine translator's linguistic quality.

When I mentioned this, Ben recalled a ProMT demo that the three of us attended recently. ProMT machine translation has a nice feature for managing the placeholders used to represent program variables.

Here is an example sentence with two placeholders represented by an identifier in curly brackets.
“The file {0} contains {1} words.”

The filename and word count would be substituted at run-time for {0} and {1} respectively.

Many machine translation tools segment the text between the placeholders rather than treating the placeholders as part of the syntax of the sentence. As a result, the placeholders are not properly positioned in the translated output. The problem is exacerbated by tools that convert markup tags to placeholders.

For example, according to Alex Yaneshevsky of ProMT, Idiom WorldServer converts:
<i>My name</i> <u> is </u> <b>Alex</b>
into:
{1}My name{2} {3} is {4} {5}Alex{6}

The resulting translation gives:
{1}{2}Меня зовут Alex{6}{3}{4}{5}

Even if you don’t read Russian, you can see that “Alex” should retain placeholders as “{5}Alex{6}”.

Post-editors must remove the original placeholders from where they are positioned in the text and insert placeholders into the correct locations. This would be a significant cost consideration for either software or markup localization.
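A post-editing workflow might at least flag segments whose translation lost a placeholder outright. Here is a hedged sketch of such a check (illustrative only; this is not a feature of WorldServer, ProMT, or any particular tool):

```python
import re

# Matches numbered placeholders of the form {0}, {1}, ...
PLACEHOLDER = re.compile(r"\{\d+\}")

def placeholders_intact(source, target):
    """True if the target contains exactly the same multiset of {n}
    placeholders as the source. Position is NOT checked, so a present
    but misplaced placeholder (like {5} in the example above) passes."""
    return sorted(PLACEHOLDER.findall(source)) == \
           sorted(PLACEHOLDER.findall(target))

src  = "{1}My name{2} {3} is {4} {5}Alex{6}"
good = "{1}{2}Меня зовут {5}Alex{6}{3}{4}"
bad  = "{1}{2}Меня зовут Alex{6}{3}{4}"   # {5} dropped entirely

print(placeholders_intact(src, good))  # True
print(placeholders_intact(src, bad))   # False
```

Catching misplaced (rather than missing) placeholders would require checking which text each pair encloses, which is much harder — and is exactly the manual work this check cannot replace.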

ProMT treats the placeholders as part of the sentence, resulting in better placement in the output. This simplifies post-editing and improves productivity.

(I am not commenting on ProMT translation quality. For this scenario their output significantly reduces post-editing cost.)

Ideally, machine translation would deliver 100% quality. When it does not, evaluating the combination of machine translation and post-editing effort is a more useful measure than selecting tools or configuring workflows on quality metrics alone. Higher linguistic quality might even be irrelevant if the remaining errors are harder for the human post-editor to find and fix.
