Tag Archives: spellchecking

Excel spellchecking – watch out for false negatives

When doing a spell check in Excel [1], you should be aware that it accepts any word as correct as long as the word contains at least one character that is outside the code page of the language being checked.

So if you happen to misspell nämlich as namlich in German, Excel correctly flags your typo. However, if you are unlucky enough to spell it as nămlich, Excel happily absolves you. If this example seems contrived to you, think about foreign names. Excel leaves you on your own there, even if the correct form is in your spelling dictionary.
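If you want to find the words that such a check would silently wave through, you can scan the text for them yourself. Here is a minimal Python sketch. It assumes that Windows-1252 (`cp1252`) is the code page that matters for German, which is my guess rather than anything Excel documents, and the function name `words_outside_code_page` is made up for illustration.

```python
import re


def words_outside_code_page(text, code_page="cp1252"):
    """Return the words in text that contain a character the code page cannot encode."""
    suspicious = []
    # \w+ matches runs of Unicode letters, digits and underscores.
    for word in re.findall(r"\w+", text):
        try:
            word.encode(code_page)
        except UnicodeEncodeError:
            # At least one character has no slot in the code page,
            # so this word deserves a manual look.
            suspicious.append(word)
    return suspicious


# \u00e4 is a-umlaut (inside cp1252), \u0103 is a-breve (outside it).
sample = "n\u00e4mlich namlich n\u0103mlich"
print(words_outside_code_page(sample))
```

Of the three spellings, only the one with the a-breve is reported. The plain misspelling namlich is not listed here because, as described above, the spellchecker itself catches that one.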

As long as you are aware of the issue, it shouldn’t be a major problem – of course you never rely on spell-checking only – or do you?

While I do use spell-checking as a safety net (a double one, in fact, by running my output through two spell-checking engines whenever possible), my primary means of living up to my own zero error tolerance is careful proofreading after completion. I pace it with a text-to-speech program that reads the text out to me, which provides two major benefits: it prevents me from reading too fast, and it alerts me to particularly surreptitious typos (long words, many consonants) by tripping over their pronunciation.

  1. Checked with Excel 2010. Let me know in the comments if you have seen the issue in later versions.

How do you check your final output for errors?

Over the years, I have had the privilege of reviewing the output of hundreds of fellow technical writers and translators. Many of them were gifted professionals; some had a level of technical knowledge or linguistic skill that I can only hope to attain some day.

But one area where I had to adjust my expectations with growing experience was the incidence of typos, or more generally gaffes of any kind that are so immediately obvious that they can only be attributed to insufficient final QA before delivery rather than to a lack of knowledge.

After doing some woefully unscientific sampling, I now tip my hat in respect when I have to correct fewer than one such error per 1000 words. I am quite satisfied when I see fewer than one per 500 words, and grow increasingly disappointed only when the error rate climbs above that mark.
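To make those thresholds concrete, here is a small Python sketch of the arithmetic. The function name and the verdict labels are mine, chosen purely for illustration.

```python
def rate_review(error_count, word_count):
    """Classify a text by its obvious errors per 1000 words."""
    if word_count <= 0:
        raise ValueError("word_count must be positive")
    errors_per_1000 = error_count * 1000 / word_count
    if errors_per_1000 < 1:   # fewer than one error per 1000 words
        return "hat tip"
    if errors_per_1000 < 2:   # fewer than one error per 500 words
        return "satisfied"
    return "disappointed"


print(rate_review(3, 4000))   # 0.75 errors per 1000 words
print(rate_review(6, 4000))   # 1.5 errors per 1000 words
print(rate_review(12, 4000))  # 3 errors per 1000 words
```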

So what do I do myself to keep my self-respect in that regard – and to earn the respect of anyone checking my work?

I always re-read a sentence as soon as I have completed it; in fact, I have it read out to me by a text-to-speech (TTS) application while following along on screen.

Text-to-speech has progressed amazingly over the last decade, and there are now remarkably natural-sounding TTS voices available for almost any language, along with fairly capable utilities to harness them. And reading a text while simultaneously listening to it does wonders for my proofreading efficiency, as it engages not just one but two senses in the QA process.

After finishing a job, I re-read the full text, again with TTS assistance. If I make changes, I make sure to fully re-read (and listen to) every sentence I have changed.

Finally, I run the text through at least one, but in most cases two, spellcheckers, usually Microsoft Office and the best available dictionary for the Hunspell engine (there are usually several available for any language). Hunspell is used in LibreOffice and newer incarnations of Adobe’s Creative Suite, among other applications. When I once compared the performance of the two engines on a word list of around one million word forms, I was amazed by how little they overlapped in terms of false positives and, worse, false negatives.
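A comparison of that kind can be sketched in a few lines of Python. The sketch assumes that each engine is wrapped in a function that returns True when it accepts a word, and that you have a word list labelled as correct or misspelled. The names `compare_checkers`, `engine_a` and `engine_b` and the toy data are hypothetical stand-ins, not the real Office or Hunspell interfaces. A false positive here is a correct word that an engine flags; a false negative is a misspelled word that it accepts.

```python
def compare_checkers(labelled_words, accepts_a, accepts_b):
    """Compare two spellcheckers on words labelled correct (True) or misspelled (False)."""

    def errors(accepts):
        # Correct words the engine rejects, and misspellings it lets through.
        false_pos = {w for w, ok in labelled_words.items() if ok and not accepts(w)}
        false_neg = {w for w, ok in labelled_words.items() if not ok and accepts(w)}
        return false_pos, false_neg

    fp_a, fn_a = errors(accepts_a)
    fp_b, fn_b = errors(accepts_b)
    return {
        "false_positives_shared": fp_a & fp_b,
        "false_positives_only_a": fp_a - fp_b,
        "false_positives_only_b": fp_b - fp_a,
        "false_negatives_shared": fn_a & fn_b,
        "false_negatives_only_a": fn_a - fn_b,
        "false_negatives_only_b": fn_b - fn_a,
    }


# Toy data: two correct words and two typos.
labelled = {"receive": True, "recieve": False, "proofread": True, "proofraed": False}


def engine_a(word):
    # Hypothetical engine A: misses one typo and flags a correct word.
    return word in {"receive", "recieve"}


def engine_b(word):
    # Hypothetical engine B: misses the other typo.
    return word in {"receive", "proofread", "proofraed"}


for name, words in compare_checkers(labelled, engine_a, engine_b).items():
    print(name, sorted(words))
```

In this toy run the two engines share no false negatives, so each typo is caught by at least one of them. That is the effect I am after when I run a text through both.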

I also take every opportunity to have my work proofread by a peer, as it is usually more efficient to proofread someone else’s work than one’s own (though the gap can be closed with training and by letting one’s own text “lie” for some time before re-reading it).

For all the trouble I am going to, I am afraid I cannot yet report that my error rate has plummeted to zero. The last time I had it checked, I was averaging between one error per 2000 words (on bad days) and one error per 5000 words (on good days). But I’m working on it.

What QA steps do you take to make your clients happy and to avoid those embarrassing moments when you open up an old translation for reference (or retrieve a TM match that definitely cannot be blamed on anyone else) and the first thing that jumps out at you is a glaring typo?