Using speech recognition software to translate and to practise interpreting from text

Updated: Mar 26

I'm always keen on exploiting technologies to become more efficient in my translation work. With the arrival of my baby girl nearly three months ago, typing with both hands has become a luxury. My daughter constantly demands cuddling, which I am happy to oblige. However, I needed to find a way to cuddle her without slowing down on my translation assignments. Therefore, I thought of giving speech recognition technology a try.

Before I begin to share my experience, I should note that this entire article has been written with speech recognition technology. I am well aware that there are many types of speech recognition tools on the market. Some are free and others are quite pricey. I bought Dragon Naturally Speaking because it is installed in the computer lab where I work, so I was able to try it before making the purchase. I should also stress that this article is not an attempt to advertise any speech recognition tool. It is just my personal experience of harnessing speech recognition technology in my translation work and of how it has helped me practise sight interpreting.

December has usually been quiet work-wise in the last few years, but in 2019 I had an extremely busy December. I took on many translation jobs while babysitting, and I desperately needed to find a way to type with both my hands occupied. I turned to speech recognition software. The free tools online, at least the ones I found, require you to dictate into a designated field and then copy and paste the text. Dragon Naturally Speaking (DNS), however, works wherever you wish to type. I chose British English as the default language as I was told that my accent is more British than it is American (although the software did not agree, as it turned out). It took me a while to train the software to get used to my accent.

My first assignment with DNS was a medical translation job with MemoQ. My laptop has an Intel Core i7 processor, 16GB RAM and Windows 10; in other words, it is quite fast. I also have a screen extension (as do many translators). I was happy with the speed at which DNS kept up with my dictation. The text production was also quite accurate. The only issue I had was that the tool is sometimes not particularly adept at picking up the beginning of sentences. For example, if I say "I had an extremely busy December", it would give me "had an extremely busy December" or, very rarely, just "busy December". It was a learning curve for me to enunciate every word so that the tool would turn what I said into text accurately, without me having to go back and correct the sentences. Saying the punctuation aloud also took some getting used to, so what I really said in the previous example was "I had an extremely busy December full stop". The first few translation assignments were an experiment and I was noticeably slower than when I was typing. I did not despair and kept using DNS for subsequent assignments. Now I am much faster.

I was very pleased with the range of vocabulary DNS is able to produce without error. Most medical terms were produced instantly without me having to correct them. Some simple words can be tricky, though. When I said 'sight translation', for example, it gave me 'site translation', but to be fair, you cannot expect the machine to do everything for you. My favourite functionality is that you get to train the tool to recognise your pronunciation of certain words. If it does not pick up the way you pronounce a particular word, you can train the tool to do so. As a non-native English speaker, I find it very helpful.

Babysitting can be demanding and time-consuming, but I still need to find time to keep up my interpreting practice, as regular practice sessions are a life-long endeavour and a must in this profession. I have not been able to practise as much as I would like to, but at the end of the day, it is the effectiveness of the practice sessions, rather than the time you spend, that matters. With a speech recognition tool in place, I have decided to shift the focus of my interpreting practice sessions in 2020 from regular simultaneous interpreting to interpreting from text, otherwise known as sight translation. With substantial training on DNS (which I'm still doing), I will eventually get my translation jobs done much more quickly while practising sight translation - two birds with one stone.

Medical translation is my niche, as 80% of my translation work is medical. However, with interpreting, every assignment I take on tends to be in a different field. With speech recognition software, medical translation has become much more fun, as I have to dictate the names of diseases, conditions, drugs and medical procedures. This has become excellent practice for me, as the terms are difficult to pronounce and I had never bothered to learn their pronunciation. When you get them wrong, so does the tool. I am having to look up the pronunciation of terms like 'rituximab', 'bend