
Using speech recognition software to translate while practising interpreting from text

Updated: Mar 26, 2021

I'm always keen to exploit technology to become more efficient in my translation work. With the arrival of my baby girl nearly three months ago, typing with both hands has become a luxury. My daughter constantly demands cuddling, and I am happy to oblige. However, I needed to find a way to cuddle her without slowing down on my translation assignments, so I decided to give speech recognition technology a try.


Before I share my experience, I should note that this entire article has been written with speech recognition technology. I am well aware that there are many types of speech recognition tools on the market. Some are free and others are quite pricey. I bought Dragon Naturally Speaking because it is installed in the computer lab where I work, so I was able to try it before making the purchase. I should also stress that this article is not an attempt to advertise any speech recognition tool. It is just my personal experience of harnessing speech recognition technology in my translation work and how it has helped me practise sight interpreting.


December has usually been quiet work-wise in the last few years, but in 2019, I had an extremely busy December. I took on many translation jobs while babysitting, and I desperately needed to find a way to type with both my hands occupied. I turned to speech recognition software. The free tools online, at least the ones I found, require you to dictate into a designated field and then copy and paste the text elsewhere. Dragon Naturally Speaking (DNS), however, works wherever you wish to type. I chose British English as the default language as I was told that my accent is more British than it is American (although the software did not agree with me, as it turns out), and it took me a while to train the software to get used to my accent.

My first assignment with DNS was a medical translation job with MemoQ. My laptop has an Intel Core i7 processor, 16GB RAM and Windows 10; in other words, it is quite fast. I also have a screen extension (as do many translators). I was happy with the speed at which DNS kept up with my dictation. The text production was also quite accurate. The only issue I had was that the tool is sometimes not particularly adept at picking up the beginning of sentences. For example, if I say "I had an extremely busy December", it would give me "had an extremely busy December" or, very rarely, just "busy December". It was a learning curve for me to enunciate every word so that the tool would turn what I said into text accurately, without me having to go back and correct the sentences. It also took me some getting used to saying the punctuation aloud, so what I really said in the previous example was "I had an extremely busy December full stop". The first few translation assignments were an experiment and I was noticeably slower than when I was typing. I did not despair and kept using DNS for subsequent assignments. Now I am much faster.


I was very pleased with the range of vocabulary DNS is able to produce without error. Most medical terms were produced instantly without me having to correct them. Some simple words can be tricky, though. When I said 'sight translation', for example, it gave me 'site translation', but to be fair, you cannot expect the machine to do everything for you. My favourite functionality is that you get to train the tool to recognise your pronunciation of certain words. If it does not pick up the way you pronounce a particular word, you can train the tool to do so. As a non-native English speaker, I find it very helpful.





Babysitting can be demanding and time-consuming, but I still need to find time to keep up my interpreting practice, as regular practice sessions are a life-long endeavour and a must in this profession. I have not been able to practise as much as I would like to, but at the end of the day, it is the effectiveness of the practice sessions, rather than the time you spend, that matters. With a speech recognition tool in place, I have decided to shift the focus of my interpreting practice sessions in 2020 from regular simultaneous interpreting to interpreting from text, otherwise known as sight translation. With substantial training on DNS (which I'm still doing), I will eventually get my translation jobs done much more quickly while practising sight translation - two birds with one stone.


Medical translation is my niche, as 80% of my translation work is medical. With interpreting, however, every assignment I take on tends to be in a different field. With speech recognition software, medical translation has become much more fun, as I have to dictate the names of diseases, conditions, drugs and medical procedures. This has become excellent practice for me, as the terms are difficult to pronounce and I had never bothered to learn their pronunciation. When you get them wrong, so does the tool. I am having to look up the pronunciation of terms like 'rituximab', 'bendamustine' or 'mifepristone'. It may take me some more time initially, but in the long run, this will make me a much better 'medical speaker' in both languages I work with, which I'm sure will help me with interpreting at medical conferences.




Sight translation is a skill that is playing an increasingly essential role in the interpreting profession. In public service interpreting, and in medical appointments in particular, I am often asked to interpret informed consent forms or other documents to the patient from text. I am almost never given the document beforehand and I am required to interpret on the spot. Sight translation also takes place in conference interpreting, where the interpreter is often given the script of the speech at the very last minute, leaving him/her very little time to look up terms and concepts in a speech in which every word has been nitpicked and polished a thousand times. The interpreter has to interpret from the given text while staying vigilant throughout the process in case the speaker decides to deviate from the script, which they almost always do. It is such an important skill that one would be foolish to neglect honing it.


Speech recognition technology has given me the perfect means to practise interpreting from text while earning an income at the same time. I am convinced that with substantial training, I will be much quicker at dictating than I am at typing (which would be an amazing achievement as I touch-type very quickly) in all the fields in which I translate. It will also prompt me to find quicker and more accurate translation/interpreting solutions so I can do more in less time and spend more time with my daughter.
