Sunday, July 6, 2014

Translating typos with Google search

By now it's probably fairly evident that I love Google's services, and I thought I'd highlight a particularly useful feature of Google search.

When translating documents upwards of 10,000 characters (10+ pages long), the chance of hitting a typo somewhere is far from small. On numerous occasions I have banged my head over a frustrating translation in which a word simply would not make sense in context.

My usual procedure is to double-tap Ctrl+C to bring up GoldenDict, into which I've plugged an offline version of the indispensable WWWJDIC. This resolves 90% of my queries immediately. If it fails, I alt-tab to my browser, where Weblio is waiting with a whole host of dictionaries (including WWWJDIC and the Life Sciences dictionary by Kyoto University). Weblio catches the remaining 9.9% or so, and also tends to give some very nice technical usage examples.

If a word fails at this point, frustration sets in as I slice and dice it to see whether the supposed word is actually two or more words strung together, but even this will not help if the word is a typo.
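For the curious, the slicing and dicing can be sketched in a few lines of Python. This is only a toy under stated assumptions: the `segmentations` function and the three-entry `lexicon` are made up for illustration and are not part of GoldenDict, WWWJDIC or Weblio. It simply tries every split point and keeps the splits where every piece is a known entry.

```python
def segmentations(word, lexicon):
    """Return every way to split `word` into pieces that all appear in `lexicon`."""
    if not word:
        return [[]]  # one way to split the empty string: no pieces
    results = []
    for i in range(1, len(word) + 1):
        head = word[:i]
        if head in lexicon:
            # Recurse on the remainder and prepend the matched head.
            for rest in segmentations(word[i:], lexicon):
                results.append([head] + rest)
    return results


# Hypothetical mini-lexicon standing in for a real dictionary.
lexicon = {"胚葉", "体", "形成"}
print(segmentations("胚葉体形成", lexicon))  # [['胚葉', '体', '形成']]
```

An empty result means no combination of known entries spells the word, which is roughly the point at which I start suspecting a typo or a term the dictionaries simply don't have.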

When all of that fails, there are few options left, but a Google search is often invaluable: it not only provides some nice usage examples, but sometimes even finds a definition in some obscure internet glossary. In the case of typos, however, Google will automatically search using the "correct" spelling. At this point, the fact that the search term is actually a typo becomes clear, and we can start again from the beginning with the correct term*. Lovely! Imagine trying to work this out for yourself with paper dictionaries, back before computers were on hand. Eugh.

A word of warning, though. Google loves common search terms, and it isn't necessarily obvious whether it is overriding your search terms because of a typo or because it is just prioritising what it thinks you want to read about.

*Today I came across "胚葉体形成" in a source document. It can be split into "胚葉" (germ layer), "体" (body) and "形成" (formation), and I initially translated it as "germ layer formation". That did not sit well in context, though, and while dropping "body" improved the flow, I wasn't happy about leaving it out: "body" could have referred to cells in a germ layer. I decided to Google the term to see if there was a useful precedent, only to be bombarded by results for the similar-looking "胚様体形成". At this point I realised that "胚様" and "胚葉" are homophones, and that this might be a typo. Sure enough, typing "はいよう" into Microsoft Japanese IME puts "胚葉" at the top of the candidate suggestion list. "胚様体形成" translates to "embryoid body formation", which fit the context better.
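The homophone check above could in principle be done by hand before ever reaching for Google. Below is a rough Python sketch of that idea, not how Google or the IME actually works: the `READINGS` table and the `homophone_variants` function are hypothetical, and a real reading table would have to come from a dictionary or IME data. It swaps each known word in a term for any other word with the same reading.

```python
# Toy reading table (hypothetical); real data would come from a dictionary.
READINGS = {"胚葉": "はいよう", "胚様": "はいよう", "形成": "けいせい"}


def homophone_variants(term, readings):
    """Return copies of `term` with a known word swapped for a same-reading word."""
    variants = []
    for word, reading in readings.items():
        if word not in term:
            continue
        for other, other_reading in readings.items():
            # Only swap in a different word that is pronounced identically.
            if other != word and other_reading == reading:
                variants.append(term.replace(word, other))
    return variants


print(homophone_variants("胚葉体形成", READINGS))  # ['胚様体形成']
```

Each variant is then a candidate to run back through the dictionaries, or through a search, to see whether it is the term the author actually meant.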

