Columbia Technology | Ventures

Columbia Computer Science PhD Makes Computer Translation More Meaningful

Computer Science Ph.D. candidate Kristen Parton is working hard to help people understand each other better by making computers translate foreign languages more accurately.
Her research focuses on how machine translation (MT) affects cross-lingual search.
“For instance, if an English speaker wants to find out what Arabic newspapers are saying about the new Egyptian President Mohamed Morsy, she could use cross-lingual search to retrieve Arabic news articles about him, then translate them into English using MT,” explains Parton. “My dissertation explores ways to intelligently combine MT and search, and then automatically post-edit the translated results, so that the user gets more relevant, easier to understand information.”
As part of this research, Parton co-authored the paper “Can Automatic Post-Editing make MT more Meaningful?”, which won the Best Paper Award at EAMT 2012 (the Conference of the European Association for Machine Translation). Her co-authors include Nizar Habash, a senior research scientist at the Center for Computational Learning Systems and adjunct associate professor of computer science, and Parton’s advisor, Kathleen McKeown, Columbia Engineering’s vice dean of research and the Henry and Gertrude Rothschild Professor of Computer Science. The research was done in collaboration with two MT researchers from Cambridge University, Gonzalo Iglesias and Adrià de Gispert, who are also co-authors.
While the quality of MT systems has improved considerably in the past decade – to the point that you can now read web pages in other languages relatively easily with Google Translate or Bing Translator – Parton says there is much work left to do.
“Translation is a very difficult problem in artificial intelligence,” she says. “Even human translators make mistakes.”
Despite the combination of intelligent algorithms, powerful cloud computing and growing amounts of training data on the web, MT systems still make translation errors. To mitigate this, human post-editors are often employed to correct machine-translated documents.
“Our goal in this paper was to build automatic post-editors to detect and correct MT errors, focusing on those errors that most affect the translation's meaning, such as words that the computer didn't translate at all or mistranslated names.”
The image at right shows three types of edits that Parton’s automatic post-editors are designed to handle.
“In the first example, the MT system did not translate a verb, but rather deleted it entirely. Note that without the verb ‘discusses’, the translated English sentence has a very different meaning than the original Arabic sentence,” Parton explains. “The post-editor automatically detected the deletion, found the correct translation for it, and edited the translation to include the missing verb.
“In the second example, a name was mistranslated as the word ‘either’, so that if you read the translation, you would never know that the sentence was supposed to have a name in it. In the third example,” she continues, “a name was simply mistranslated. In both cases, the post-editor can automatically detect and correct the mistranslations.”

—Story by Jeff Ballinger 



Article from Columbia Engineering News.