Software for CAT

From B-Ob8ungen

It took me a long time to start experimenting with CAT (computer-assisted translation; see the explanation in Wikipedia). I was afraid that

  1. the software would be very expensive
  2. I would have to invest a lot of time before getting any results

On the page Comparison of CAT software I shall answer whether these fears were justified.

The basics

Put simply: the text you want (or have) to translate is divided into segments. A segment is usually a sentence, so a segment ends wherever a full stop is found. The segments you have translated go into a translation memory. Whenever the program later detects a similar sentence in the rest of the text (you can specify how closely the two must match), it offers you the stored translation for reuse.
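The idea of segments and a translation memory can be sketched in a few lines of Python. This is only an illustration, not how any of the programs discussed here works internally: the function names, the sample sentences and the threshold of 0.75 are my own assumptions, and the similarity measure is the one from Python's standard difflib module.

```python
import re
from difflib import SequenceMatcher


def split_into_segments(text):
    """Split text into segments; a segment ends after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def best_match(segment, memory, threshold=0.75):
    """Return (source, translation, score) for the most similar stored
    segment, or None if no stored segment reaches the threshold."""
    best = None
    for source, translation in memory.items():
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if score >= threshold and (best is None or score > best[2]):
            best = (source, translation, score)
    return best


# Hypothetical translation memory: source segment -> stored translation.
memory = {
    "What happened to you that day?": "Was ist Ihnen an diesem Tag passiert?",
}

text = "What happened to you that night? Did you report it to the police?"
for segment in split_into_segments(text):
    # The first segment resembles the stored one; the second does not.
    print(segment, "->", best_match(segment, memory))
```

Raising the threshold makes the program offer fewer but closer matches; lowering it produces more suggestions that need more editing.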

In addition to the translation memory you can build your own terminology collection (a dictionary or glossary). Depending on the software you use, these words or phrases are then offered to you for inclusion as you translate.

For professionals, the expensive programs also offer the administration of translation jobs. For instance, you can

  • have different people involved with translation, proof-reading etc.
  • set deadlines for the jobs and see detailed reports on the work done

Best known programs

The best-known program, widely used in industry and almost a must for professional translators, is Trados. According to the English Wikipedia it covers more than 70% of the market. A single licence costs more than €700.

A German blog called roxomatic presents some details on MemoQ, across and OmegaT. I, too, will deal with these three programs, since some of them can be obtained for free.

Only in September 2012 did I take a look at the Google Translator Toolkit, to find out whether it is suitable for joint efforts on certain translation projects.

The material for testing

I decided to start the experiment when I received several interviews in Turkish related to hate crimes (this page is related to hate crimes in Turkey). I was asked to translate them into English. The interviewer had asked more or less the same questions, and the interviewees seemed to have used similar terminology.

My second idea was to develop a system for the Human Rights Foundation of Turkey. Its documentation centre publishes daily reports that are translated into English. The articles in these reports often have the same titles and standard formulations, which should make them ideal for CAT software. In addition, some people were translating these texts (from either Turkish or English) into German, and they might benefit from CAT as well.

General problems

The various CAT programs understand different kinds of files. If you work with OmegaT, for instance, you have to convert your files to *.odt (the OpenOffice extension) before you can add them to what all these programs call a project. You could also work with simple text files (*.txt) or web pages (*.html), but the result would always be an OpenOffice file, and therefore you definitely need OpenOffice to work with OmegaT.

With other programs that run on Windows, many files for translation carry invisible formatting that is not meant to end up in the target text. This is certainly a problem with MemoQ.

Thirdly, the dictionaries again use different formats, and it is not easy to import and export them from one application to another. In across, several steps are needed to add items to the glossary, and if you miss an important element, your entries will not show up.
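When both formats are plain text, a small script can sometimes bridge the gap. The following is a minimal sketch under assumptions of my own: one program exports the glossary as comma-separated values (source term, translation, optional comment) and the other expects one tab-separated entry per line. The function name and the sample entries are hypothetical, and the real export and import formats of the programs named above may differ.

```python
import csv
import io


def csv_to_tab_separated(csv_text):
    """Turn comma-separated glossary rows into tab-separated lines."""
    lines = []
    for row in csv.reader(io.StringIO(csv_text)):
        fields = [field.strip() for field in row]
        # Skip rows that lack a source term or a translation.
        if len(fields) < 2 or not fields[0] or not fields[1]:
            continue
        # Keep source, translation and an optional comment; drop the rest.
        lines.append("\t".join(fields[:3]).rstrip("\t"))
    return "".join(line + "\n" for line in lines)


# Hypothetical export; the last row has no source term and is skipped.
exported = 'hate crime,Hassverbrechen,noun\n"daily report",Tagesbericht\n,missing source\n'
print(csv_to_tab_separated(exported), end="")
```

Incomplete rows are dropped silently here; in practice it would be safer to report them, since a missing element is exactly what makes entries disappear.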

Depending on the languages you are using, the glossaries cause additional trouble. If you translate from English to German, for instance, you

  1. should not add articles to the nouns
  2. have to make additional entries for all possible inflected forms, in particular for adjectives

For example, the English expression different needs to be given as verschieden, verschiedene, verschiedenes and verschiedenen; otherwise the program will warn you that an expression was not translated. A second problem arises if you have a separate entry for differ (possible translation: anders sein) but only the word different appears in the text: you are in trouble again, because the program warns that you did not translate differ. In any case, the entry for differ should use the conjugated form sind anders rather than the infinitive anders sein, just as differs corresponds to ist anders.
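Both problems can be reproduced with a very naive terminology check. How the actual programs perform this check is not known to me; the sketch below only assumes that they look for glossary terms in the source segment and for one of the listed translations in the target segment. The function names, the whole_words switch and the sample sentences are hypothetical.

```python
import re


def contains(text, phrase, whole_words):
    """True if phrase occurs in text, optionally only as whole words."""
    pattern = re.escape(phrase)
    if whole_words:
        pattern = r"\b" + pattern + r"\b"
    return re.search(pattern, text, flags=re.IGNORECASE) is not None


def missing_terms(source, target, glossary, whole_words=True):
    """List glossary terms found in source for which none of the listed
    translations is found in target."""
    missing = []
    for term, forms in glossary.items():
        in_source = contains(source, term, whole_words)
        in_target = any(contains(target, form, whole_words) for form in forms)
        if in_source and not in_target:
            missing.append(term)
    return missing


source = "The witnesses gave different accounts."
target = "Die Zeugen machten verschiedene Angaben."

# Only the base form is listed: the inflected form in the target is not
# recognised, so the check warns about "different".
one_form = {"different": ["verschieden"]}
print(missing_terms(source, target, one_form))

# All inflected forms are listed: no warning.
all_forms = {
    "different": ["verschieden", "verschiedene", "verschiedenes", "verschiedenen"],
}
print(missing_terms(source, target, all_forms))

# A check that matches parts of words finds "differ" inside "different"
# and then warns that "differ" was not translated.
with_differ = dict(all_forms, differ=["anders sein", "sind anders", "ist anders"])
print(missing_terms(source, target, with_differ, whole_words=False))
```

With whole-word matching the last warning disappears, but then every inflected form really has to be entered separately, which is the first problem again.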

Further reading