WRITING FOR TRANSLATION
GETTING STARTED: Guide
October/November 2009

Planning and Writing for Translation
Optimizing the Source Using Translation Memory
Elements of Style for Machine Translation
Optimized MT for Higher Translation Quality
Controlled Authoring to Improve Localization
Getting Started: Writing for Translation
   Believe it or not, setting out to write lyrically beautiful copy for a manual or even the web may not be the most straightforward way to get to clear translation. These authors have some better ideas. Barb Sichel begins this Getting Started Guide with an overview on planning and writing for translation, and then Joseph Campo offers the findings from a project using a translation tool to find already-translated phrases to write the original copy. Ken Clark gives a short guide on writing for machine translation (MT), and Lori Thicke outlines why MT allows for quality translation in the first place. Ultan Ó Broin finishes things with a discussion on controlled authoring.
   The Editors

Planning and Writing for Translation
Barb Sichel
   Barb Sichel, director of business development at International Language Services, Inc., has over 25 years of sales, marketing and management experience.

Optimizing the Source Using Translation Memory
Joseph Campo
   Joseph Campo, a senior technical writer at Dassault Systèmes SolidWorks Corporation in Concord, Massachusetts, has ten years of technical writing experience.

Elements of Style for Machine Translation
Ken Clark
   Ken Clark, CEO of 1-800-Translate, worked previously as a journalist, screenwriter and speech writer for Japanese and American government officials.

Optimized MT for Higher Translation Quality
Lori Thicke
   Lori Thicke is cofounder and general manager of Lexcelera (formerly Eurotexte), established in 1986, as well as cofounder of Translators Without Borders.

Controlled Authoring to Improve Localization
Ultan Ó Broin
   Ultan Ó Broin, MultiLingual editorial board member and Blogos contributor, works for Oracle in Ireland. He has an MSc from Trinity College Dublin.

Editor-in-Chief, Publisher: Donna Parrish
Managing Editor: Laurel Wagers
Assistant Editor: Katie Botkin
Proofreader: Jim Healey
News: Kendra Gray
Illustrator: Doug Jones
Production: Sandy Compton
Editorial Board: Jeff Allen, Julieta Coirini, Bill Hall, Aki Ito, Nancy A. Locke, Ultan Ó Broin, Angelika Zerfaß
Advertising Director: Jennifer Del Carlo
Advertising: Kevin Watson, Bonnie Merrell
Webmaster: Aric Spence
Technical Analyst: Curtis Booker
Data Administrator: Cecilia Spence
Assistant: Shannon Abromeit
Subscriptions: Terri Jadick
Special Projects: Bernie Nova

Advertising
208-263-8178
Subscriptions, customer service, back issues
Submissions
Editorial guidelines are available at
Reprints

This guide is published as a supplement to MultiLingual, the magazine about language technology, localization, web globalization and international software development. It may be downloaded at


      page 2                                                                                                              The Guide From MultiLingual
Planning and Writing for Translation
BARB SICHEL

   Documents and online communications are translated to achieve specific objectives. Your goal may be to execute a global communication plan, meet regulatory requirements, avoid liability or drive revenue by addressing target audiences in their native language. Whatever the outcome, you will need clear communication of a single message across all of the languages involved to get there.
   Lately, cost considerations have become just as important as the accuracy of the translation. Consequently, writing for successful translation today involves planning your project so that you can convey your message within a reasonable budget.

Message and scope
   First, and most obviously, decide what you need to communicate, and communicate it as simply and directly as possible. Determine what is most relevant to your target audience and what you must translate to achieve your particular objectives.
   Take the time to think your project through from the perspective of the recipient, and do some research if you don’t know the recipient’s perspective. Translating everything you publish in English may not maximize the return on your translation investment. You might not have the luxury of translating every one of your product data sheets, for example, so focusing on product line summary brochures instead may be less costly. If it is beyond your budget to translate your entire 200-page employee manual, perhaps you can focus on only those critical policies most needed to protect your firm’s interests.
   Some projects, such as catalog or website translations, may warrant the creation of abbreviated or revised versions for target audiences. Sections dealing with customer support or how to locate a sales representative, for instance, may need modification so that they are relevant in the geographic locale in which they will be used. Other types of projects require translation of ancillary materials that may not immediately come to mind. Technical documentation for large-scale industrial equipment may also involve translating warning labels and software user interfaces. Again, to save money, perhaps you can omit a section such as the corresponding parts list. If your customers can’t order parts in Japanese by calling your customer service line, why provide a Japanese parts list?
   Understanding the intent and full scope of your project will enable you to plan your budget and work with your translator to determine the correct order in which to proceed. A phased implementation may be easiest to manage while allowing you to complete the highest priority requirements first.

Layout
   For printed materials, properly planning your layout even before you start writing copy can greatly influence the ease and cost of managing your project. Quite literally, it pays to understand which factors affect the cost and quality of your translation. Then you can craft your presentation to achieve the desired outcome within your allotted budget.
   A few things to consider are the choice of desktop publishing application and layout. If this is going to be a printed document with color plates, you might look at whether enough room is left for text expansion to accommodate any graphics. Text will expand in some languages and may contract in others. This has implications for the font sizes and page margins you select, as well as graphics. Chinese characters that need to be reduced to a 6-point font in order to fit on a page will be illegible. Also, check the graphics accessibility. Don’t plan to embed words into layer upon layer of graphics. Your translator may not be able to access them for translation at all or may be able to do so only at great expense to you. Plan to place your text labels beneath graphics rather than inside of them. Text must be “live,” that is, accessible independently of the graphics in order to be translated and reinserted in the same position.
   The same concept applies to screen shots. Unless you translate your software first and provide new screen shots, the English copy locked within your graphics cannot be accessed for translation. If you must use preexisting graphics, your translator may be able to recommend solutions such as a reference table so that the reader can still understand your message.
   Too often, project costs are unnecessarily high or the quality of the finished translation is compromised because translation was never considered when a document was originally created.

Your copy
   Simple, straightforward text is easiest to translate. Say what you mean as concisely as possible. Word count is a key factor in the cost of your translation, so, if possible, keep sentences short and limited to a single idea. If English copy already exists for your pending translations, review and revise the content. Formal copy style with correct grammar, spelling and punctuation will be most easily understood by your translator. Consider also your audience’s education level and communication style and then select the appropriate tone. Instructions to a physician prescribing medication should be written differently than instructions to the patient taking the medication.
   Avoid words with double meanings and references or metaphors that may not make sense in other cultures. Don’t rely on buzzwords, abbreviations, industry jargon, colloquial expression or humor. Create standardized text whenever possible. If you can reuse blocks of copy from one document to the next, you will save time and money on your translations and
ensure consistency across all of your written and online communications.
   If your content is highly technical in nature or your industry-specific terms are prone to multiple meanings, supply your translator with reference material or glossaries for key terms. Links to websites or product catalogs can minimize the need for research during the translation process.
   Some copy may not translate well or may translate into some languages but not others. Be particularly aware of this if you are creating ad copy or marketing materials. It is worthwhile to check with your translator early, before you have invested heavily in developing graphics or a tagline to accompany your corporate logo. Choosing the right words and the right images or colors for your presentation may make the difference between a seamless translation and one that falls completely flat with your target audience.
   Acronyms should be avoided. The problem in trying to translate an acronym is that once you translate the theme word, the letters change and they no longer cross-reference to the supporting ideas you want to convey in your target languages. A native-speaking translator is a good resource for spotting things that won’t play well with your target audience. Basic localization, gearing your translated document to a particular country, region or target audience, is usually part of any well-executed translation project. Extensive localization, to the point of creative strategizing, however, is a specialized skill beyond the scope of typical translation projects. If you suspect your project requires an unusual amount of attention, check with your translator.
   Provide only fully proofread, final copy for translation. Drafts are fine for budgetary quotes, but works-in-progress are unsuitable for translation and will leave yours prone to errors, inconsistencies and higher costs. If you intend to update documents later with new product models or next year’s catalog, the level of attention you devote to tracking changes and version control now will be well worth your effort.

Formatting
   Locate your source files for older documents. This includes all of the desktop publishing and accompanying graphics files. Are they with your graphics design firm or archived somewhere within your organization? Your translator may not be able to replicate your formatting and graphics without them. Provide files to your translator in the same format you would like to receive back.
   PDFs are fine for reference, but depending on the size of your document and the application used, having the source files available may significantly impact the time to quote your project, the cost of your project and the appearance of the final output. If you are working from hard copies or scanned documents, manual processes will have to be employed that will similarly affect your project.
   Given the source files, most translation firms can replicate standard file formats, even for software code. How you present content for translation impacts cost, timeline and the ease of implementing your project. If you do any cutting and pasting at your end, have your translator provide a “post-format review.” This ensures proper text flow and the overall quality of your presentation before you print or post it on the internet. Costs for this service are usually nominal and can prevent potential embarrassment.
   Formatting foreign character sets on your own can be a challenge, even for an experienced graphics person, and you may not have the right tool set. Languages such as Arabic that read right to left require special software versions and the ability to reorient everything on a page. It is best not to attempt this on your own.
   If you are translating software for user interfaces, handheld LCD screens or similar uses, be prepared to answer questions about your ability to handle foreign character sets, space limitations and other factors that specifically affect these types of projects.
   If you need to resize short translations to fit an ad or label, ask for an Adobe Illustrator EPS file that has been “outlined.” This provides the best of both worlds. It is locked down like a graphic to eliminate the possibility of introducing errors during formatting, but leaves flexibility for resizing. You can format the text to meet your needs, even for a character set that you may not have installed.
   Lastly, use the right application for your project. Some applications play well with the automated tools employed by translation firms while others require a lot of manual manipulation.
   Microsoft Word works fine for short documents, but FrameMaker may be a better choice for large manuals. If you use charts, live embedded links or manually inserted multiple carriage returns, the level of difficulty in working with your files for translation will increase, and this will impact your cost.

Timelines
   Translation is a meticulous, skilled process similar in nature to technical writing. Though you provide the concept and the source files, your translator must take time to fully comprehend your meaning and find the best way to replicate the tone and content in his or her native tongue. Often there is research involved or requests for you to provide clarification.
   Your project involves much more than merely translation. Numerous details are involved in preparing your files for translation, gaining commitment from the best qualified translators, proofreading, formatting and ensuring proper quality control. For multiple language translations, managing your project becomes even more complex. If you make a single change, it needs to be disseminated across teams of translators and proofreaders for each language.
   Allow realistic timelines for your projects to be completed. A simple brochure may take several days, while a 300-page manual may take several weeks. Advise your project manager in advance if you must meet a specific deadline so that your project can be managed accordingly.

Partnering with a vendor
   Since the quality of the translations you publish reflects on you and your organization, establishing a comfortable working relationship with your vendor is essential.
   Carefully crafted branding strategies can be derailed in an instant by sloppy or inaccurate work. Even a single poorly chosen word can alter your intended meaning. And just imagine your customer purchasing a piece of equipment only to find that the documentation doesn’t make sense or that the table of contents doesn’t match the order of the text. You will rely on your translation vendor to provide you with accurate translations that are audience appropriate and delivered, print ready, within the specified timeframe. You should also educate yourself as to their quality processes and experience level with projects similar to yours so that you can move forward with full confidence.
   While there is no single industry certification for translations, there are third parties such as TÜV or the American Translators Association that provide quality testing and auditing. It is perfectly acceptable to ask for credentials. In many cases, your own in-house quality policies or regulatory requirements demand that you do.

Optimizing the Source Using Translation Memory
JOSEPH CAMPO

   How many times have you written something and known that you wrote something similar, but can’t remember where it was or how it was written? If you could only find that text and replicate it, you would save money and time for your translation team by reusing already-translated text strings and would produce more consistent documentation. This article describes a pilot project that tested a potential solution to this issue using translation memory (TM).
   I hypothesized that if our technical writers could tie our authoring process into an English TM that contains already-translated text strings, we could find existing English text strings, reuse them on new topics and lower our translation costs. In effect, the documentation team would pretranslate their new English documentation to maximize matches against existing English text strings before sending topics to the translators who use the same TM.
   We would use the English (source language) TM to improve the quality of fuzzy matches and reduce the number of words. Fuzzy matches indicate a percentage match of new or changed text against existing already-translated text. A higher percentage fuzzy match means the text string more closely matches existing translated text. The higher the percentage fuzzy match, the lower the cost to translate the text. Totally new text strings are the most expensive to translate, so I tried to reduce new words used. Because we translate into 12 languages, there is a great potential for cost savings.
   After approval of the pilot project, I worked with my manager to schedule two months of project time. The translation team manager provided me with a TM tool license (Trados, in my case) and I was ready to start the project after several days of training.

Project design
   We use RoboHelp HTML to create online help and deliver multiple compiled help files (chms). I chose the main SolidWorks help to use in the pilot because it is our largest chm, with approximately 2,000 topics. I went back in time and created an English TM.
   I collected 73 new and 39 changed topics that documentation had actually sent to the translation team during the SolidWorks 2007 development cycle. I used the Analyze tool in Workbench to obtain an original estimate of a full-cost translation. I also obtained an original estimate for a full-cost translation for the same topics from our outsourcing localization vendor, using German as the target language.
   A dual monitor setup was essential to this project. On my right monitor, I opened Workbench and ran topics individually through the English TM to pretranslate them. On the left monitor, I opened the original HTML topic that had been sent to translation. When I ran a topic through Workbench, it provided a percentage match of the new text against the existing TM on a string-by-string basis. I used these suggestions to change the English source text in HTML on the left monitor and to improve the percentage of fuzzy match. I also paid close attention to reducing the number of new words.
   After pretranslating each topic, I used the Analyze tool in Workbench to gauge and record the amount of savings for each topic. When I completed pretranslating all the topics, I calculated the costs and savings using the research data. I also obtained a translation cost estimate from our outsourcing localization vendor for the now pretranslated topics.

Results
   In both Table 1 and Table 2, results show a modest reduction in per-language translation costs when comparing the original cost estimates to the cost estimates after using Trados to research the TM (post-Trados).

                    Original cost    Post-Trados project cost    Savings
   New topics       $4,807.64        $4,301.79                   $505.85 (10.5%)
   Changed topics   $1,957.97        $1,554.78                   $403.19 (20.6%)
   Grand total      $6,765.61        $5,856.57                   $909.04 (13.4%)

   Table 1: Cost estimate, vendor full-cost translation (includes translation, review and layout/DTP).

                    Original cost    Post-Trados project cost    Savings
   New topics       $4,463.57        $3,837.79                   $625.78 (14.0%)
   Changed topics   $1,322.25        $1,049.49                   $272.76 (20.6%)
   Grand total      $5,785.82        $4,887.28                   $898.54 (15.5%)

   Table 2: Cost estimate, Trados Workbench Analyze tool full-cost translation (includes translation, review and layout/DTP).
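
   The savings columns in Tables 1 and 2 follow from simple arithmetic: savings is the original estimate minus the post-Trados estimate, and the percentage is that difference divided by the original estimate. The Python sketch below shows one way to check the figures. The row values are copied from the tables; the function and variable names are illustrative only, and the grand totals are recomputed from the two rows rather than copied.

```python
from decimal import Decimal

# Row figures copied from Table 1 (vendor estimate) and Table 2 (Analyze tool).
# Each entry is (original cost, post-Trados project cost) in US dollars.
TABLE_1 = {
    "New topics": (Decimal("4807.64"), Decimal("4301.79")),
    "Changed topics": (Decimal("1957.97"), Decimal("1554.78")),
}
TABLE_2 = {
    "New topics": (Decimal("4463.57"), Decimal("3837.79")),
    "Changed topics": (Decimal("1322.25"), Decimal("1049.49")),
}


def savings(original, post):
    """Return (amount saved, percent saved rounded to one decimal place)."""
    saved = original - post
    percent = (saved / original * 100).quantize(Decimal("0.1"))
    return saved, percent


def summarize(rows):
    """Compute savings per row plus a grand total summed from the rows."""
    report = {name: savings(orig, post) for name, (orig, post) in rows.items()}
    total_original = sum(orig for orig, _ in rows.values())
    total_post = sum(post for _, post in rows.values())
    report["Grand total"] = savings(total_original, total_post)
    return report


if __name__ == "__main__":
    for title, table in (("Table 1", TABLE_1), ("Table 2", TABLE_2)):
        print(title)
        for name, (saved, percent) in summarize(table).items():
            print(f"  {name}: ${saved} ({percent}%)")
```

   Decimal arithmetic is used instead of floats so that the dollar amounts subtract exactly, which makes a mismatch between a row and a printed total easy to spot.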


   Details — new topics: I created charts to
display the percentages of fuzzy matches
in the original versus the post-Trados
topics. The post-Trados topics (Figure 1)
showed an increase in the number of 100%
matches and a decrease in the number
of No Matches. In terms of percentages,
there was also an increase in the number
of 50%-74% fuzzy matches. The remaining
fuzzy match ranges were approximately
equal to or less than the percentages of
the original new topics.
   • Total words reduced by 10% (2,028 words).
   • 100% match increased by 439 words.
   • No match reduced by 1,613 words.
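
   The word counts above come from the Trados Analyze tool, which sorts each text string into a match range and totals the words per range. As a rough illustration of that idea only, the Python sketch below scores each new string against a small in-memory TM and tallies words by range. It rests on several assumptions: the similarity score uses the standard library's difflib.SequenceMatcher as a stand-in, since the article does not describe how Trados scores matches; only the 100%, 50%-74% and No match ranges are named in the article, so the other ranges are hypothetical; and the sample strings are invented.

```python
from difflib import SequenceMatcher

# Match ranges as (lower bound, label), highest first. Only "100%", "50%-74%"
# and "No match" appear in the article; the other ranges are assumptions.
BANDS = [
    (100, "100%"),
    (95, "95%-99%"),
    (85, "85%-94%"),
    (75, "75%-84%"),
    (50, "50%-74%"),
]


def best_match_percent(segment, memory):
    """Highest similarity (0-100) between a segment and any stored segment.

    The score is truncated, so only an identical string reaches 100.
    """
    if not memory:
        return 0
    return max(
        int(SequenceMatcher(None, segment, stored).ratio() * 100)
        for stored in memory
    )


def band_for(percent):
    """Map a similarity percentage to its match-range label."""
    for lower_bound, label in BANDS:
        if percent >= lower_bound:
            return label
    return "No match"


def analyze(segments, memory):
    """Count the words of each segment under the range of its best match."""
    counts = {}
    for segment in segments:
        label = band_for(best_match_percent(segment, memory))
        counts[label] = counts.get(label, 0) + len(segment.split())
    return counts


if __name__ == "__main__":
    # Invented example strings, not taken from the SolidWorks help.
    memory = ["Click OK to save the file.", "Select a face on the model."]
    new_topic = [
        "Click OK to save the file.",
        "Click OK to save the part.",
        "Rebuild the assembly.",
    ]
    for label, words in analyze(new_topic, memory).items():
        print(f"{label}: {words} words")
```

   Rewording a new string so that it lands in a higher range, or deleting words that add nothing, is the kind of change the pretranslation step in this pilot was aiming for.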
   Details — changed topics: Changed topics
are existing topics with changes to already-
translated text. These charts revealed a
trend similar to that for new topics. In terms
of percentages, the post-Trados changed
topics showed an increase in the number of
100% matches of about 10%. The remain-
ing fuzzy match ranges were less than the
percentages of the original changed topics.
Overall, there is a greater percentage of
100% matches and a smaller percentage of
no matches compared with the new topics.

Figure 1: Fuzzy matches in original versus post-Trados topics.

Analysis
   The cost estimates are within acceptable deviations that permit me to say
that the Analyze tool results are defensible. I discussed the deviation with
a senior employee in our research department. Standard deviations are
complex to calculate and vary based on many parameters. When I provided the
deviations, particularly for the new topics, the research employee felt that
the 3.5% difference was within an acceptable deviation range (Table 3).
   I then met with the translation manager to discuss the difference in
costs between our outsourcing localization vendor and the Analyze tool. The
translation manager confirmed that translation costs will vary depending on
the vendor, the language, and the services provided. Having multiple
variables makes it impossible to provide an exact cost estimate to fit all
situations.
   The translation manager informed me that only about 50% of our outsourced
translation items require full-cost translation. She suggested we apply a
different cost metric to the other 50% of our outsourced translation items;
this metric is called raw translation, which includes translation of the
text only. The savings in percent are similar to full-cost translation using
the Analyze tool. Notably, for new topics, raw translation saved 14.1% while
full-cost translation saved 14% (Table 4).
   For the purposes of this pilot, I decided to split the difference between
the outsourcing localization vendor savings of 10.5% and the Analyze tool
full-cost translation savings of 14% and to use an estimated savings of
12.2% for new topics. This seemed like a reasonable compromise.

Savings projection
   Outsourcing costs for a typical release vary from $100,000 minimum to
$400,000 maximum, depending on how many new products and services requiring
localization are added to our suite of products, their length, and the
number of languages supported. If the process were applied to all new
documentation that we send to translation for outsourced localization, an
estimated cost savings of 12.2% (between $12,200 and $48,800) would be
achieved (Table 5).

                    Original cost    Post-Trados project cost    Savings
New topics          $2,409.98        $2,071.03                   $338.95 (14.1%)
Changed topics      $  815.93        $  629.67                   $186.26 (22.8%)
Grand total         $3,225.91        $2,700.70                   $525.21 (16.3%)

Table 4: Cost estimate — Trados Workbench Analyze tool
raw translation (includes translation only).

                 New topics                Changed topics            Grand total
                 Savings     Deviation     Savings     Deviation     Savings     Deviation
Vendor           10.5%                     20.6%                     13.4%
                             33.3%                     0%                        17.9%
Analyze tool     14.0%                     20.6%                     15.8%

Table 3: Full-cost savings comparison/deviation between vendor and Analyze tool.
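
The deviation and projection figures above can be checked with a few lines of arithmetic. This sketch assumes that the deviations in Table 3 are relative differences between the vendor and Analyze tool percentages, which reproduces the published values but is not stated in the text. The function and variable names are illustrative.

```python
def relative_deviation(vendor_pct: float, tool_pct: float) -> float:
    """Percentage by which the tool's savings estimate exceeds the vendor's."""
    return (tool_pct - vendor_pct) / vendor_pct * 100


# (vendor savings %, Analyze tool savings %) as listed in Table 3.
table_3 = {
    "New topics": (10.5, 14.0),
    "Changed topics": (20.6, 20.6),
    "Grand total": (13.4, 15.8),
}
deviations = {
    row: round(relative_deviation(vendor, tool), 1)
    for row, (vendor, tool) in table_3.items()
}

# Splitting the difference for new topics gives 12.25; the text uses 12.2%.
midpoint = (10.5 + 14.0) / 2
rate_pct = 12.2

# Outsourcing costs for a release range from $100,000 to $400,000.
projected = tuple(round(cost * rate_pct / 100) for cost in (100_000, 400_000))

print(deviations)
print(midpoint, projected)
```

The 3.5% difference discussed with the research department is the gap in percentage points for new topics (14.0 minus 10.5); the 33.3% in Table 3 is the same gap expressed relative to the vendor figure.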

page 6                                                                                                                 The Guide From MultiLingual

   Changed documentation is typically localized by our in-house translation
team, so for us there will be no “cost savings” per se. However, the
translation team would experience a time savings of between 20.6% and 22.8%
because of the increased quality and number of matches as well as the
reduced word count.

               Outsourced localization    SolidWorks translation
               cost savings               team time savings
New            $12,200 to $48,800         n/a
Changed        n/a                        20.6% to 22.8%

Table 5: Annual estimated savings if Trados
is implemented for all new and changed documentation.

Pilot project conclusions
   Using a TM tool is viable in pretranslation only if we consider its value
in increasing the consistency and quality of documentation. I could not
justify using the tool on just a cost-savings basis alone.
   Savings were achieved by both reuse of existing text and aggressive
word-count reduction. However, the anticipated translation savings only
partially offset the cost of the skilled writer’s time in editing. I spent
approximately 30 minutes per topic using my TM tool.
   For the 73 new topics, I spent approximately 37 writer hours. Using $50
cost per writer hour, I spent $1,850 in time to achieve only $625 savings in
outsourcing costs. Labor costs were triple the savings achieved, for one
language. Actual cost savings are only achieved when factoring in that we
translate into 12 languages. Savings = total cost savings ($7,500) – time
spent ($1,850) = $5,650. If the TM tool were used to only search for
reusable text (no word reduction), the results would be even less impressive
(estimated 2.4% savings in outsourcing costs).

Beyond the case study: related research
   I queried translation experts as to whether any similar projects had been
undertaken. Authoring memory tools have been around for over ten years. An
article by Jeff Allen in 1999 discussed how authoring memory could be used
in conjunction with controlled language to aid in translation
(www.transref.org/default.asp?docsrc=/u-articles/allen2.asp). The new
Author-it product, for example, uses fuzzy logic matching within a content
management system. I contacted Nabil Freij, president of GlobalVision, and
according to him, this pilot project was a unique approach. The key to
reducing localization costs is to reduce word count. Some companies are
implementing controlled English to reduce word count, increase the 100%
matches, and also to transition to machine translation (MT). According to
Freij, “MT engines can perform better under restricted and controlled
vocabulary.” In his experience, “most tech pub writers do not want to deal
with localization issues during the authoring stage, they are simply too
overworked. . . . We often can’t get them to edit their work, let alone
reduce the word count or make it consistent.”
   I have been following the progress of the SDLX AuthorAssistant (SDLXAA)
product, which seems similar to my pilot project. SDLXAA lets writers write,
then runs the document against a TM to offer suggestions for improved
matches and reuse. According to Sue Blaisdell, information architect at
Avaya, “with AuthorAssistant, you can connect to TMs for your project, and
it will display 100% and fuzzy matches to the writer. It also gives the
writers insight into the way that changes they make in their English content
affect the localization costs.”
   This pilot project indicates that translation cost savings can be
achieved using TM, but at a cost in labor and time. With usage, writers
would become more proficient with the system and save time. Your company
would have to be ready to absorb license and time costs. If you are going
through a major restructuring of your documentation, perhaps upgrading to
XML, this might be the perfect time to examine your documentation in detail
with your translation costs in mind.
   One benefit I found was that while using my TM tool, I was fully focused
on reducing word count because I kept translation as my main focus. Word
reduction is hard to achieve in normal writing mode because the technical
writer is normally not thinking about it. According to Freij, “verbosity is
the enemy. It pays to be concise and straight to the point, eliminating
unnecessary text when localization is imminent. When writing technical
documents, remember that simplicity is also very much desired by the
end-user.” It is also important that your TM be as clean as possible.
   What writers need is a TM tool that runs side-by-side with an authoring
application and can semi-automatically offer suggestions on how to better
match new text to the existing TM. The development of SDLXAA and Author-it’s
new application should give us hope that tools are becoming available to
bring technical writers and translators closer together to achieve cost
savings by leveraging valuable memory resources.
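
The pilot's cost arithmetic (73 new topics, about 30 minutes each, $50 per writer hour, $625 saved per language, 12 languages) can be laid out as a short calculation. This is a sketch with illustrative names, and it assumes, as the $7,500 total implies, that every language saves the same amount.

```python
import math

NEW_TOPICS = 73
MINUTES_PER_TOPIC = 30        # approximate editing time per topic
WRITER_RATE = 50              # dollars per writer hour
SAVINGS_PER_LANGUAGE = 625    # outsourcing savings for one language, in dollars
LANGUAGES = 12

# 73 topics at half an hour each is 36.5 hours, rounded up to 37.
writer_hours = math.ceil(NEW_TOPICS * MINUTES_PER_TOPIC / 60)
labor_cost = writer_hours * WRITER_RATE


def net_savings(languages: int) -> int:
    """Outsourcing savings across the given number of languages, minus writer time."""
    return SAVINGS_PER_LANGUAGE * languages - labor_cost


# Smallest number of languages at which savings cover the writer's time.
break_even_languages = math.ceil(labor_cost / SAVINGS_PER_LANGUAGE)

print(writer_hours, labor_cost)
print(net_savings(1), net_savings(LANGUAGES), break_even_languages)
```

Under these assumptions the editing effort loses money for a single language and pays for itself from the third language onward.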


Elements of Style
For Machine Translation
KEN CLARK

   We have entered the Machine Translation Age. Demand for human translation
is still increasing dramatically — or was until this year — but the vast
majority of the world’s translation is now done by computer. And the vast
majority of machine translation (MT) transactions is completed using free
online translation tools such as Babel Fish or Google.
   The result usually leaves much to be desired, and there’s not much you
can do about it when you are translating someone else’s content,
particularly if you don’t understand the source language. But you can
dramatically improve translation of content you write yourself and share
with others in a foreign language, without using the special software and
workflows of powerhouse automated translation systems. Just a few simple
writing tricks can make a dramatic difference in MT quality. It’s not
controlled language, but language control.
   Writing clearly, whether for man or machine, is always a struggle (at
least for me), and the dim machine minds of the translation tools are
unforgiving when it comes to bad composition in source. Unlike us humans, MT
tools have no sense of context, no appreciation of an author’s intent and
definitely no sense of humor.
   With Strunk and White’s The Elements of Style as inspiration, here’s an
abbreviated guide to good English style for improved MT.
   • Use short sentences. Keep it simple. Cut the clauses. Ditch the
sentence fragments. Simple sentences and grammatical structure
(subject-verb-object) are the only way to go.
   • Avoid ambiguity, as in “I saw her duck.” Well, which is it? A duck that
quacks that belongs to her? Or was she avoiding a flying object? Look for
multiple meanings when proofing. Good luck. If you don’t find it, your MT
tool may just do it for you.
   • Remove extra words. Editing out unessential phrases and extra words
will make for a simpler, better translation. Since the algorithms have fewer
translation variables to wrestle with and better style with fewer words in
the translation, it will also be more accurate.
   • Don’t remove necessary words, and don’t go too far with editing. In
English we drop a lot of words when we write, especially when writing
informally. Keep those articles, prepositions, pronouns and so on where the
machine can find them. English speakers are able to fill in the blanks and
fully understand — not so when the reader is a translation engine.
   • Misspelling does not compute. A misspelled word will not translate —
end of story, end of translation.
   • Ditto on punctuation. One accidental period can completely change the
meaning of a sentence and trash your translation. Spell-checking and
proofreading after you write and before you translate are pretty basic
quality assurance steps.
   • Slang is so like, whatever. No slang and no jokes for MT. Irony is the
first thing lost in MT. Stay earnest and formal. That’s why pithy headlines
and snappy newspaper copy so often translate badly with these tools. Rule of
thumb: Good MT style comes in one flavor . . . plain vanilla.
   • Use “Do not translate” coding. Some MT tools will allow you to place
code around a word or phrase, which allows the word to pass through the
engine without getting translated.
   • Check your translation. So after closely studying and applying all
these rules before translation, how can you know if your MT output makes any
sense? Translate the output back using an MT tool. That reverse translation
may help you to spot the most glaring errors. Recast those problem sentences
in English and see if the back translation gets any clearer. Don’t expect
miracles here. But it may be some comfort to know that the original
translation is better than the back translation.
   • Keep source and target together. No garbage in the MT tool, less
garbage out. But garbage there will be. That’s why we like to keep a copy of
the source with the target translation to create a bilingual output so that
those errors can be spotted and corrected later if need be.
   • Identify MT. Avoid blame by giving credit. Letting people know that you
used a machine to communicate with them allows them to read with caution,
and keeps them from feeling they’ve been short-changed on a real
translation.

On the writers’ craft
   Using a little bit of discipline to prepare content for MT extends the
functionality of these tools for people busily engaged with others in
multiple languages.
   Writing for MT, just like writing for human translation, is good writing
practice. Translation has a way of highlighting communication errors that
are invisible or ignored in just a single language. The simplicity and
clarity of expression demanded by MT tools would meet the approval of Strunk
and White, I hope.
   Their book, The Elements of Style, prized for its focus on clear, concise
language, has been a source of guidance for writers, copyeditors and college
students for half a century. To commemorate the 50-year edition published
this spring, The New York Times commentators consign it to the dustbin of
history in the Room for Debate blog (http://
04/24/happy-birthday-strunk-and-white).
   I’ve still got a dog-eared, ratty, old copy on my desk, where it shall
remain.
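
The "Check your translation" tip can be turned into a rough screening routine. The sketch below is only an illustration: it does not call any real MT service, the two "engines" are stand-in functions, and the word-level similarity from Python's difflib and the 0.9 threshold are assumptions chosen for the demonstration.

```python
from difflib import SequenceMatcher
from typing import Callable

Translator = Callable[[str], str]


def round_trip_score(source: str, to_target: Translator, to_source: Translator) -> float:
    """Word-level similarity (0..1) between a sentence and its back translation."""
    back = to_source(to_target(source))
    return SequenceMatcher(None, source.lower().split(), back.lower().split()).ratio()


def flag_for_rewrite(sentences, to_target, to_source, threshold=0.9):
    """Return the sentences whose back translation falls below the threshold."""
    return [s for s in sentences
            if round_trip_score(s, to_target, to_source) < threshold]


# Stand-in "engines" so the sketch runs without any MT service:
# the forward step is lossless, the reverse step drops articles.
def fake_to_target(text: str) -> str:
    return text.upper()


def fake_to_source(text: str) -> str:
    return " ".join(w for w in text.split() if w.lower() not in {"a", "an", "the"})


sentences = ["Close the valve before you open the cover.", "Save your work."]
print(flag_for_rewrite(sentences, fake_to_target, fake_to_source))
```

In practice the two callables would wrap whatever MT tool is in use. A low score is only a hint that a sentence may need recasting, since the back translation is usually worse than the original translation.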

Optimized MT for
Higher Translation Quality
LORI THICKE

   Everyone knows machine translation (MT) has enormous potential for
dramatically reducing translation cost and increasing speed. But who thinks
of MT as a way to improve quality?
   My company has been ISO 9001-certified for the last decade, and our quest
for quality has unexpectedly led us to MT. Along the way we’ve developed and
tested a number of different processes for MT and discovered that correctly
optimized MT can actually improve quality — and for less cost and with
higher rates of productivity. Under the right conditions, MT actually breaks
those compromises we’ve come to accept in the traditional localization
paradigm. You may want price, speed and quality, but here’s the kicker: you
only get to pick two out of three.

[Figure: The current localization paradigm. Corner labels: Speed, Price, Quality.]

   MT can offer all three. However, the truth is that for most people,
quality MT is still an oxymoron. And who could blame them?

MT: always five years from perfection
   Just about any of us with an internet connection has had first-hand
experience with MT. We have probably used SYSTRAN to translate an e-mail or
ProMT to give us the gist of a web page. We may have conversed with someone
in another language via Google’s translation center, read Wikipedia in Thai
thanks to Asia Online or solved an IT problem using Microsoft’s
automatically translated knowledge base.
   Along the way, MT will have amused us with its inadvertent twisting of
human language.
   Most people would agree that “out-of-the-box” MT is far from what it is
supposed to be: fully automatic quality translation (FAQT). This has been
the promise held out to our industry since the very first MT system
translated 49 Russian sentences into English using a 250-word vocabulary and
six grammar rules. Fifty years later we’re still waiting. As Hans
Fenstermacher says, “MT has been five years from perfection since 1952.”
   It could be that our overwrought expectations for MT partially explain
the slow uptake of MT by the translation industry. Against the benchmark of
FAQT, MT is sure to disappoint.
   For those resigned to the lack of quality with unoptimized MT, there’s
always the unfortunately named FAUT (fully automatic useful translation).
FAUT is essentially “gisting” translation, which is a more or less accurate
approximation of the source text.
   Today, gisting is overwhelmingly the use to which MT is being applied and
accounts for even more words translated than by humans. If the claim that MT
translates more than humans seems outrageous, consider that an estimated 30
million e-mails are translated by MT every day.
   For internauts, instantaneous gisting (gist-in-time) provides a basic
understanding of an e-mail or a website. In the corporate space, gisting is
used for legal discovery, for patent or technology searches, or to identify
parts of larger corpora that merit being translated by a human. But how much
gisting do we humans really need? Not much, as it turns out: for all the
profusion of free, software-as-a-service and off-the-shelf MT solutions,
commercial translations, which need more than gisting quality, are by and
large assured by humans. For the vast majority of corporate needs, MT is
staying on the shelf.
   If FAQT is still “five years away” and FAUT is simply not that useful
after all, the question is whether to wait for MT to catch up to our
aspirations for it or to invest in processes that can optimize the MT we
have today.

How MT improves quality
   Once we stop waiting for quality MT to emerge fully clothed from the
loins of a research and development lab somewhere, we can start to see MT
for what it is: an efficient solution that can assist human translators by
taking out a large part of the drudgery in translation.
   The reality we are seeing every day is that for technical translations
ranging from software to manuals to catalogs, quality MT is achievable. But
like any relationship, you have to work at it. In fact, correctly optimized
MT — that’s the “working at it” part — paired with human post-editors can
actually improve quality. How could this be possible?
   In the first place, correctly customized MT (customizing MT engines is a
skill in itself) removes terminological inconsistencies. If the source
document always uses the same term, so will the MT engine. This resolves the
real problem of teams of translators working on the same project but
employing divergent terminology. Across a large project, MT can also ensure
a more consistent tone, with fewer stylistic discrepancies. Furthermore, MT
removes that human element of non-quality: omissions. Enforced, validated
terminology, consistency and completeness are MT’s strengths. But what about
mistranslations? There’s no question that MT delivers more than its fair
share of sentences that mangle the meaning of the source text.
   This is where the post-editors come in. Working on a bitext format, a
post-editor correcting MT output will frequently scrutinize texts more
carefully than a reviewer working on human output. On large-volume
localization projects, T + E + P (translate + edit + proof) as a process may
be interpreted differently by different language service providers. T + E +
P on a million-word project may consist of T + a sampling review of 10-20.
The source text may or may not be consulted at the same time.


   MT affords you no such luxury. Because MT can and does go completely off
the rails from time to time, each and every segment must be examined in
bitext format and approved or rewritten by a human post-editor. If only
every translation received that type of attention!
   This process for review and correction, if properly managed, should not
only catch and fix the errors, but should also yield an accounting of what
changes need to be made to the MT engine itself. This goes to the heart of
any good quality system, such as ISO 9001: ensuring quality at the source —
that is, catching errors at the beginning rather than correcting them
downstream — and, crucially, instituting processes for continuous
improvement.
   Correcting systematic errors and then feeding these corrections back into
the MT engine is what we call “the Virtuous Circle of MT Quality.” This,
too, is an integral part of the optimization process.

What quality do you need?
   But what quality is good enough? Any good process defines its quality
expectations up front, and working with MT is no exception.
   MT quality has been measured by the wrong yardsticks to the detriment of
the elegant solution that MT can be when matched to the type of result
needed. The question is not whether MT is “better” than a human translation
on a given text. Rather, the question is what quality is necessary for a
particular project and what process — human only, human + translation memory
(TM), human + TM + MT — will best allow you to achieve that exact level of
quality.
   For software, quality may be defined as accurate, understandable and
rapid enough for simship. For a catalog, correct terminology on each of
thousands of items is essential. For courseware, the material needs to
promote learning. For a knowledge base, customers need to be able to resolve
their problems without further recourse to the help desk staff.
   Since MT allows you to calibrate the human effort (linguistic training,
post-editing) that you put into achieving the quality levels you need,
setting quality requirements in advance is an essential step. The example of
online help and knowledge bases above demonstrates the importance of
customer-defined quality. It’s well known that human reviewers will often
designate only extremely high quality as acceptable. However, when the
choice is between an imperfect translation
means no information, service or customer satisfaction at all.”
   Customers also report that support articles translated by MT are just
about as effective in solving their problems as human localized content and
at a price far below what human translations would cost.
   This is not about depriving translators of work. Human translations would
not have been economically feasible for the hundreds of thousands of
knowledge base articles in various languages — including Chinese, Japanese,
Portuguese, French, German and Spanish — that Microsoft publishes online.
This would have required an initial outlay of approximately $30 million per
language, according to Microsoft itself, not including weekly updates.
Instead, Microsoft chose its own hybrid MT system to translate content that
would otherwise not have been translated. Measuring the results, the company
found that across all languages, MT helped solve customer problems on
average 23% of the time. This figure may seem low, but it’s only slightly
below the success rate of 29% for human translation.
   Microsoft concluded at a presentation to the 11th Machine Translation
Summit in Copenhagen, Denmark, in September 2007 that “customer satisfaction
numbers for machine translated articles is comparable to and sometimes
exceeds original English!”

[Figure: Five factors influence increasing MT quality.]

Optimizing MT
   Regardless of the quality level MT is to achieve — publishable quality or
simply understandable quality — unoptimized MT is just not up to the job.
While some sentences coming out of untrained MT engines may be stunningly
good, others will be pure
   The 2008 version of the ISO 9001 standard      and no translation (information available       gibberish. And without effective training,
introduces the idea of customer-defined           only in the original language), customers       there is no way to ensure that the terminol-
quality to the international norm. This is an     themselves weigh in heavily in favor of raw     ogy you want will be consistently applied
important distinction to make. Accuracy,          — that is, fully automatic — MT.                by the MT engine.
consistency of style, correct terminology,           Don DePalma of the Common Sense                 Training, then, is the secret sauce of good
spelling and punctuation, and completeness        Advisory says, “Whether it’s FAQT, FAUT,        MT, even more important than what system
are all inarguably elements of a quality trans-   or perfectly rendered output, the biggest       you choose, whether rule-based or statisti-
lation. But how much quality is required for a    decision that companies will have to make       cal (see sidebar). This is also one of the areas
given situation? “Doesn’t read like a transla-    about machine translation is whether any        that requires the greatest investment.
tion,” for example, is the type of quality that   of those are a worse alternative than no           For statistical machine translation (SMT)
a marketing translation would need to have        translation at all. Given the enormous vol-     systems, this training involves not only
in buckets. We may not have a specific metric     umes of content that companies and gov-         extensive corpora of bitext (think in terms
for defining marketing quality, but we sure       ernment should make available for other         of millions of segments), but also glossa-
know when it’s not there! But what about          markets, for me and many of the organiza-       ries and monolingual texts. The more the
software? A catalog? E-learning courseware?       tions that we talk to, the quality question     better. Imagine Steven Spielberg’s little
A knowledge base? This is where the quality       is ultimately a non-issue. What we call the     alien, ET, saying “Need more data.” That’s
question begins to get more nuanced.              ‘zero translation’ option of doing nothing      SMT in a nutshell.


   However, and this is a big however, the data must be good, clean data, following the garbage-in-garbage-out truism. As Microsoft says, “we can never have enough clean, parallel data.” And it must be domain- and client-specific data: there is no point training the system on EU corpora if you’re a car manufacturer.
   In rule-based machine translation (RBMT), this training is even more specific, involving data mining to create domain-specific dictionaries for terminology entries, including “Do Not Translates” and graphic user interface strings. This expert training of the engine creates the grammatically coded glossaries that will do the work of imprinting in-house terminology on the system. This is actually trickier than it sounds and requires a linguist trained in MT’s idiosyncrasies to avoid inadvertently creating errors and making the output worse, rather than better. This can occur when terms are coded incorrectly (a verb as a noun) but also when they are coded correctly without taking into consideration how the system will react to exceptions. If training the engine is the sine qua non of quality MT, it is also one of the greatest barriers because relatively few linguists know how to correctly tune MT systems, and few resources exist to tell them how to do it.
   Upstream of the actual MT processing, another activity is important to optimizing MT output: controlled authoring, or language control of the source content. Long, convoluted sentences do not lend themselves to MT, no matter how well trained the system is. Authoring guidelines specify, for example, that technical writers use short, simple, declarative sentences, employ the active and not the passive voice, avoid parenthetical expressions in the middle of a sentence and so on. And while we humans may understand text that is rife with grammatical errors, no MT system will.
   Where the source text already exists or where in-house documentation teams are resistant to applying the principles of controlled language for authoring, there is another solution. Using automatic normalization or running source text through a QA program may bring a noticeable improvement to the ability of your MT engine to understand and translate your text.
   TM leveraging is another step in MT optimization. Even a well-trained MT engine is no replacement for the human translations contained in TMs, assuming they’re of high quality. It’s important therefore to develop the processes that will increase TM leveraging. Testing will provide information on the level of fuzzy match that should be discarded in favor of MT segments. However, it’s usually useful to make sure that new MT segments are identified as such to distinguish them from validated TM segments.
   The capacity of MT to function as a standalone solution will depend on the quality required and on how well the engine is optimized through stringent training, ongoing maintenance, controlled authoring and so on. But for publishable quality, human post-editors are essential.
   In this regard, MT can be seen as just another tool in the translator’s toolkit, much like any CAT tool, albeit one that’s more complex and expensive to set up. In optimizing MT, post-editors need to be trained in post-editing techniques, and they need to know what level of quality is expected. Besides post-editing, other post-production optimization techniques include use of QA tools, automatic post-editing through regular expressions, text normalization, updating of the TMs and so on. And above all, it is essential that there be ongoing tuning of the engine with new and modified terminology and error corrections in a continuous, virtuous cycle of feedback and improvement.
   If all these processes, from pre-production to post-production, are instituted to optimize MT output, what kind of quality can be expected? Recently one of our clients, a major software publisher, noted in the report “Leveraging a crisis for innovation (or never let a good crisis go to waste)” that “contrary to all expectations, using MT in [our company] has improved the translation quality . . . with the reviewer commenting ‘It was nearly 9; it was the best translation of courseware I ever read.’”
   It has long been believed that buyers of translation services must compromise. In the traditional localization paradigm, if you want speed and quality, you have to compromise on price; if you want speed and price, you have to compromise on quality.
   MT is often associated with a compromise of quality in favor of cost and turnaround improvements. However, the reality is that correctly optimized MT can break these compromises by offering faster throughput, lower costs and higher quality. But you have to work at it.

Sidebar: Rule-based versus statistical MT
   There are two major streams in MT technology: rule-based MT (RBMT) and statistical MT (SMT). These two methods, espoused by various MT technology vendors, represent two different routes to the same place.
   The earliest systems were rule-based, among them SYSTRAN. For the development of RBMT systems (SYSTRAN, ProMT, Lucy), various languages were broken down into their parts of speech and grammatical rules were hard coded, along with dictionaries. An RBMT system would never say un noir chat but un chat noir, coded, as it is, with the knowledge that adjectives follow nouns in French. Exceptions such as une vieille dame would also be coded in the system.
   SMT, on the other hand (Google, Asia Online), uses an algorithm to parse vast numbers of bilingual sentences (preferably in the millions) in order to extrapolate relationships, including word order. Un chat noir would appear as the translation of a black cat if the system had seen that pairing in the training phase. However, blissfully ignorant of the rules of grammar (with the exception of Asia Online), SMT would be likely to incorrectly translate a green cat as un vert chat because it wouldn’t have encountered any green cats (unless trained on Dr. Seuss).
   Both RBMT and SMT systems have their advantages and disadvantages. Both are capable of delivering accurate, fluid sentences, depending on how they were trained. Both can also deliver utter gibberish, again depending on how they were trained. RBMT wins the day when you don’t have millions of words of training corpora; SMT is the victor when it comes to adding a new language pair, a major multiyear undertaking when preparing an RBMT system. Hybrid systems such as SYSTRAN’s are capable of bridging the gap between RBMT and SMT.
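
The authoring guidelines mentioned in the article (short sentences, active rather than passive voice, no parenthetical expressions in mid-sentence) can be checked mechanically, at least in a rough way. The following is a minimal sketch in Python, not a description of any commercial tool. The 25-word limit, the rule names and the regular expressions are illustrative assumptions, and the passive-voice cue is deliberately naive, so it will produce false positives.

```python
import re

MAX_WORDS = 25  # assumed threshold; the article only asks for "short" sentences

# Naive passive-voice cue: a form of "to be" followed by a word ending in -ed or -en.
PASSIVE_CUE = re.compile(
    r"\b(?:is|are|was|were|be|been|being)\s+\w+(?:ed|en)\b", re.IGNORECASE
)
# A parenthetical followed by further words, i.e. one that sits mid-sentence.
MID_PARENTHETICAL = re.compile(r"\([^)]*\)\s*\w")


def check_sentence(sentence: str) -> list[str]:
    """Return the names of the illustrative rules that the sentence breaks."""
    issues = []
    if len(sentence.split()) > MAX_WORDS:
        issues.append("too long")
    if PASSIVE_CUE.search(sentence):
        issues.append("possible passive voice")
    if MID_PARENTHETICAL.search(sentence):
        issues.append("parenthetical in mid-sentence")
    return issues


def check_text(text: str) -> dict[str, list[str]]:
    """Split text on sentence-final punctuation and report flagged sentences."""
    report = {}
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        issues = check_sentence(sentence)
        if issues:
            report[sentence] = issues
    return report
```

A writer or editor could run `check_text` over a draft and review only the flagged sentences before the text goes to the MT engine. Real controlled authoring tools rely on linguistic analysis rather than pattern matching, so this sketch only illustrates the idea of automatable rules.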

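Automatic post-editing through regular expressions, one of the post-production techniques listed in the article, might look something like the sketch below. The rules are hypothetical examples of systematic errors in English MT output (doubled spaces, a space before punctuation, a repeated word, a non-approved term); an actual rule set would come from reviewing the output of a particular engine.

```python
import re

# Hypothetical (pattern, replacement) rules, applied in order.
RULES = [
    (re.compile(r"[ \t]{2,}"), " "),           # collapse runs of spaces or tabs
    (re.compile(r"\s+([,.;:!?])"), r"\1"),     # remove whitespace before punctuation
    (re.compile(r"\b(\w+)\s+\1\b"), r"\1"),    # drop an immediately repeated word
    (re.compile(r"\be-mail\b"), "email"),      # assumed in-house term preference
]


def post_edit(segment: str) -> str:
    """Apply every rule to one MT output segment and return the cleaned text."""
    for pattern, replacement in RULES:
        segment = pattern.sub(replacement, segment)
    return segment.strip()


print(post_edit("Click  the the button , then send an e-mail ."))
```

Rules of this kind fix only errors that recur in a predictable form. Corrections that keep recurring are better fed back into the engine's terminology and training data, in line with the cycle of feedback and improvement described above.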

                  Controlled Authoring
                  to Improve Localization
                  ULTAN Ó BROIN

Controlled authoring, broadly speaking, is the process of applying a set of predefined style, grammar, punctuation rules and approved terminology to content (documentation or software) during its development. Many companies offer some form of guidance to their content developers, either through tools or more ad hoc means, of course, so this may not seem at all remarkable. In the last few years, however, innovations in linguistic processing technology and its commoditization indicate that controlled authoring holds great potential for anyone seeking a tool-driven approach to maximizing returns from the localization process. This has particularly paved the way for the adoption of cost-effective machine translation (MT). Controlled authoring and languages are complex, so this article concentrates on the localization-related aspects of introducing controlled authoring into an organization that must localize its content.
   Controlled authoring itself is frequently conflated with other parts of the overall content delivery process, notably that of content management, a separate but contributing function. Isolating the nontechnical essence of controlled authoring is made all the more difficult by the range and interplay of tool functionality offered by various vendors. Whereas seemingly subtle distinctions do not always make a great deal of sense from an overall business process engineering viewpoint, it’s important to understand from a localization perspective just how controlled authoring technology works. For example, if the storage of objects in the content management system (CMS) allows reuse at a level higher than the translation memory (TM) segmentation does, localization savings are limited. It may be more helpful from a business requirements position to regard controlled authoring as an information quality process that consists of many different parts: data mining for rule and terminology research and creation, new terminology harvesting and rule development, reuse management, reporting on quality, and so on rather than purely approved rule and term application during the actual text-editing phase.

Controlled languages
   It’s not uncommon for organizations to have no serious control over their content style rules and terminology or to rely on manual processes, combining in-house guidelines with the commonly applicable rules and recommendations of sources such as The Chicago Manual of Style, while working off spreadsheets of terms and applying simple checks for consistency and using human editing to meet their “quality” requirements. For some this is acceptable; however, it is hardly a scalable, enforceable or measurable process. We’ve all seen the waste of many possible opportunities for localization efficiencies (let’s save the content development efficiencies for another audience) because manual enforcement and voluntary uptake of authoring guidelines allow for a good deal of subjectivity in interpretation and application. Controlled authoring is much more objective as the selection, application and enforcement of such guidance is programmatic. The application of rules “controls” the authoring, so to speak, allowing content developers to avail of the rules directly through the authoring user interface: looking up alternative words, phrases and terms, reusing already written phrases, harvesting and storing new ones, and reporting on the content’s compliance with the rules immediately or afterwards.
   The origins of the controlled language concept are far from the needs of modern-day localization, rather being designed to improve comprehension of the source language by simplifying matters for nonnative speakers of English (“human orientation”) or computers (“machine orientation”). Often, these nonnative readers worked in the maintenance and service field, an audience targeted by probably the best-known iteration of a controlled language: ASD-STE100 Simplified (Technical) English. The genesis of the controlled language concept can be traced back to Ogden’s Basic English from the 1930s and established over the years through such developments as Caterpillar Technical English, Nortel Standard English, the Plain English Campaign, GM’s Controlled Automotive Service Language, Global English and so on.
   The introduction of structured authoring through SGML and later XML, along with more innovations in linguistic processing and database storage, allowed for the development and application of targeted rules to meet customer requirements driven by content type and market. This is reflected in the ability to now apply a controlled authoring process through plug-ins for common authoring tools such as Microsoft Word, PTC Arbortext Editor and Adobe FrameMaker.
   The use of an approved set of terminology, where each term has only one meaning in that context (consider the different translations for the out-of-context word job, for example), and clear and enforceable authoring rules allow writers to achieve a high degree of consistency in the source texts they create, not only in the words and terms they use, but how they use them. Consistency in constructing phrases, along with eliminating complexity, ambiguity and verbosity, is the key to maximizing TM use and MT potential (and large efficiencies on the production side).
   What might these controlled language rules entail? Well, the number can vary, from as few as 10 to between 50 and 100, but the rules typically might relate to standardized spelling, length of sentence, number of clauses, use of active versus passive, simplifying tenses, rules for noun phrases, modifiers, syntactic cues, past participles, gerunds, avoidance of Latin phrases, slang and so on. I recommend John R. Kohl’s The Global English Style Guide if you need a valuable starting point and reference material for possible rules as well as Sharon O’Brien’s “Controlling Controlled English” paper ( .info/CLT-2003-Obrien .pdf) for recommendations on the rules central to content intended for MT.
   Naturally, the rules vary by content type and audience. Gerunds may be acceptable in headings, but not main text without qualification; delimiters may not be required


on software strings but are required on messages or documentation, and so on. Thus, solutions that allow for forms of semantic checking have an advantage.

Controlled authoring solution: which one?
   Commercially available controlled authoring technology solutions range from the more sophisticated, scalable technology-based solutions based on advanced linguistic processing to less complex content management-based offerings, “methodologies” and combinations of same. Possible options include acrolinx IQ, IAI CLAT, Author-it, Smart MAXit, Boeing Simplified English Checker, SDL AuthorAssistant, Shufra and more. For those interested in researching a controlled authoring option, possible sources of information are IDC reports, ELDA, International Journal of Language and Documentation, CLAW proceedings, Localisation Research Centre publications, DCU papers from Sharon O’Brien and MA research by Patrick Cadwell ( Theses/PatrickCadwell_Thesis.pdf), the publications of Jeff Allen ( controlled language), the MT, Localization Professional, and Information Quality groups on LinkedIn, and so on.
   The decision as to which controlled authoring solution to adopt is driven by business requirements. Localization teams should ensure they’re involved in the identification of these, so come armed with facts and figures for the business case. Initial business requirements when applied to a range of solution possibilities might look something like Figure 1.

Requirement                                                                         Solution 1   Solution 2   Solution 3
NLP-level verification of terms, grammar and style according to our requirements
Prompting of writer to reuse existing segments from CMS
Scalability (multiple users, concurrent users, performance)
Easy maintenance of rules by existing, in-house resources
Percentage of existing rules from style guide that can be automated
Integration with existing translation glossaries and exchange formats
New terminology harvesting
Basic rule set supports translation memory requirements
Basic rule set supports machine translation readiness
Automatic reporting on quality in batch and single mode
Interactive quality assurance through editing environment
Allows prioritization of rules for grandfathering of content
Supports multiple rules and terms by content type
Plug-ins and integrations for existing authoring tools
Automatic indexing capability
Customer references include MT and TM savings
Open standards or proprietary architecture
Established user group and conference
Global 24 x 7 support

Figure 1: Business requirements can be weighed against a variety of solutions.

   Requirements vary by organization, naturally. Prioritize and weight each point before making a decision among competing alternatives.

Benefits to localization
   The clear benefits of controlled authoring in the localization space are derived from the improved source quality making content easier to translate by humans and machine. It’s a fundamental recognition that the basic internationalization concept of assuring translatability and high-quality source content results in greater savings accruing to the organization at the localization stage than trying to continually negotiate lower prices with vendors or praying for a quantum leap in translation technology to turn garbage source into localized gospel. Efficiencies are magnified in a one-to-many relationship as the number of languages translated increases.
   Controlled authored content is consistently expressed in an understandable way. This results in translators not needing clarifications, maximizing TM matches, eliminating the need for terminology creation after localization starts, and providing texts more easily processed by MT, cutting down on post-editing needs and recalibrations. Volumes, too, are generally smaller, reducing cost and time-to-localize per se.
   Bear in mind, however, that these savings are a function of the rules created, as well as how and when the texts are translated. Overarching internationalization rules also affect the efficiencies, as does the technical review of the source text by domain experts. If a switch should be documented as being “off” instead of “on,” then don’t expect controlled authoring to eliminate any language version testing issues.
   Do you need controlled authoring technology in order to use MT? The simple answer is “no.” But if you need a scalable approach to ensure your source text meets realistic MT business requirements by providing easily processed source text that minimizes the need for post-editing, thus making MT cost-effective, then controlled authoring technology is a must-have. Moving past the “writing for translation guidelines” approach is the way to go here.

The business case for controlled authoring: the big picture
   It’s often said that the biggest risk to the introduction of controlled authoring, other than the cost (nontrivial even at the best of times), is political. The term controlled authoring itself must be found guilty on all counts of contributing to the problem of user acceptance as it conjures up images of mass layoffs, stilted, boring texts, loss of control by authors, inducing an immediate negative reaction, mostly based on understandable fear and ignorance, frequently exacerbated by a belief


that controlled authoring can somehow offer automatic creation of content, and a narrow focus on just localization benefits. There are ways of dealing with these issues too, beyond the scope of this article.
   In general then, beyond the clear hard sell on the TM and MT front, localization cost and time-to-market savings, localization teams can emphasize the quality aspects of the source content for native speakers too: superior user experience, consistent terminology, fewer support calls, improved accessibility and so on. Leverage the global user experience, not just the localized one. Controlled authoring solutions also exist for Japanese, German and other languages, so do not assume it is an English-only concept now, whatever its origins.

Introduction and changing processes: localization’s role
   Introducing controlled authoring requires a serious management decision on timing and, not least, the provision of a significant budget. However, research suggests that pilot projects are key, both to develop the process and to achieve maximum buy-in from the stakeholders in it, as is training that relies less on computer science and linguistics and more on content development approaches. In general, localization groups might consider the following when faced with the opportunity to introduce controlled authoring:
   • Identify a localization strategy for TM and MT tools and how controlled authoring business requirements fit into this.
   • Help kick-start the controlled authoring process of adoption and pilot projects by providing rules and terminology already harvested to the implementers of the technology.
   • Because the creation of rules and terminology is central to controlled authoring and to the impact on localization tools, it is critical that localization groups remain visible and active as stakeholders in their development and maintenance over time.
   • Recognize the best kind of texts: large volumes of structured, technical, procedural texts such as software and user assistance strings or online documentation. These texts require a consistent user experience between components. Seeking a controlled authoring solution for a few thousand words of marketing material would not be a strong business case!
   • Prioritize rules. Decide which ones are more important to you than others. Aim for automatable and therefore measurable ones. For example, a rule called “one strong idea per sentence” is not automatable, whereas repeating the noun instead of switching it for a pronoun or checking for the passive voice is.
   • Look for leverage points between localization and authoring teams. Many rules for localization maximization are obviously ones that should be applied to text even if it was never intended for localization in the first place. Other, more “severe” MT rules may not be optimal for the source language, depending on the user experience required. For example, text intended for mobile applications may be fragmented, clipped, dropping articles and so on for user experience reasons. Be prepared for compromise. Err on the side of user experience trumping localization unless it’s a complete showstopper.
   • One particular challenge to the introduction of controlled authoring can also come from localization groups themselves: the disruption of TM match rates for previously localized content. This requires careful management. Solutions include the introduction of controlled authoring on new content yet to be localized, phased introductions based on content that is going to change anyway, grandfathering of content that has shown little change over years, or a reassessment as to how a one-time hit on localization assets results in longer term cost savings, time-to-market improvements and quality uptake.
   • Localization group input to the rule creation process must be matched by an evaluation of the localized source output, too, iteratively maximizing returns through the fine-tuning of rules. An MT pilot makes a fine adjunct to a controlled authoring pilot. Provide content development teams with feedback, both qualitative and quantitative.

Acknowledgements
   Publicly available sources from the following were used in this article: Patrick Cadwell (DCU), Sharon O’Brien (DCU), Jeff Allen, Uwe Muegge, Jon Kohl (SAS), Andres Heuberger (ForeignExchange Translations) and Fred Hollowood (Symantec).
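
   The advice above to prioritize automatable, and therefore measurable, rules can be illustrated with a small sketch. The patterns, rule names and sample sentences below are hypothetical and deliberately crude; they are not taken from any of the products named in this article, which rely on far deeper linguistic analysis. The point is only that a rule such as "avoid the passive voice" or "repeat the noun instead of using a pronoun" can be checked mechanically and turned into a number.

```python
import re

# Hypothetical, deliberately crude patterns for illustration only.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
# A form of "to be" followed by a word ending in -ed or a few irregular participles.
PASSIVE = re.compile(
    rf"\b{BE_FORMS}\s+(?:\w+ed|done|made|sent|shown|written|given|taken)\b",
    re.IGNORECASE,
)
# Pronouns that a "repeat the noun" rule might flag.
PRONOUN = re.compile(r"\b(?:it|they|them|this|these|those)\b", re.IGNORECASE)


def check_sentence(sentence: str) -> list[str]:
    """Return the names of the illustrative rules that the sentence violates."""
    violations = []
    if PASSIVE.search(sentence):
        violations.append("passive-voice")
    if PRONOUN.search(sentence):
        violations.append("pronoun-instead-of-noun")
    return violations


def violation_rate(sentences: list[str]) -> float:
    """Share of sentences with at least one violation: a measurable score."""
    if not sentences:
        return 0.0
    flagged = sum(1 for s in sentences if check_sentence(s))
    return flagged / len(sentences)


sample = [
    "The file is saved by the application.",
    "Click Save to store the file.",
    "Open the dialog and close it.",
]
for s in sample:
    print(s, "->", check_sentence(s))
print("violation rate:", round(violation_rate(sample), 2))
```

   A rule like "one strong idea per sentence" has no comparable pattern to match, which is why it is hard to automate. A score such as the violation rate could be tracked across pilot iterations as one form of quantitative feedback for content development teams.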



An invitation to subscribe to MultiLingual

   This guide is a component of the magazine MultiLingual. The ever-growing easy international access to information, services and goods underscores the importance of language and culture awareness. What issues are involved in reaching an international audience? Are there technologies to help? Who provides services in this area? Where do I start?
   Savvy people in today’s world use MultiLingual to answer these questions and to help them discover what other questions they should be asking.
   MultiLingual’s eight issues a year are filled with news, technical developments and language information for people who are interested in the role of language, technology and translation in our twenty-first-century world. A ninth issue, the Resource Directory and Index, provides listings of companies in the language industry and an index to the previous year’s content.
   Two issues each year include Getting Started Guides such as this one, which are primers for moving into new territories both geographically and professionally.
   The magazine itself covers a multitude of topics.

Translation
   How are translation tools changing the art and science of communicating ideas and information between speakers of different languages? Translators are vital to the development of international and localized software. Those who specialize in technical documents, such as manuals for computer hardware and software, industrial equipment and medical products, use sophisticated tools along with professional expertise to translate complex text clearly and precisely. Translators and people who use translation services track new developments through articles and news items in MultiLingual.

Language technology
   From multiple keyboard layouts and input methods to Unicode-enabled operating systems, language-specific encodings and systems that recognize your handwriting or your speech in any language, language technology is changing day by day. And this technology is also changing the way in which people communicate on a personal level, changing the requirements for international software and changing how business is done all over the world.
   MultiLingual is your source for the best information and insight into these developments and how they will affect you and your business.

Global web
   Every website is a global website, and even a site designed for one country may require several languages to be effective. Experienced web professionals explain how to create a site that works for users everywhere, how to attract those users to your site and how to keep the site current. Whether you use the internet and worldwide web for e-mail, for purchasing services, for promoting your business or for conducting fully international e-commerce, you’ll benefit from the information and ideas in each issue of MultiLingual.

Managing content
   How do you track all the words and the changes that occur in a multilingual website? How do you know who’s doing what and where? How do you respond to customers and vendors in a prompt manner and in their own languages? The growing and changing field of content management and global management systems (CMS and GMS), customer relations management (CRM) and other management disciplines is increasingly important as systems become more complex. Leaders in the development of these systems explain how they work and how they work together.

Internationalization
   Making software ready for the international market requires more than just a good idea. How does an international developer prepare a product for multiple locales? Will the pictures and colors you select for a user interface in France be suitable for users in Brazil? Elements such as date and currency formats sound like simple components, but developers who ignore the many international variants find that their products may be unusable. You’ll find sound ideas and practical help in every issue.

Localization
   How can you make your product look and feel as if it were built in another country for users of that language and culture? How do you choose a localization service vendor? Developers and localizers offer their ideas and relate their experiences with practical advice that will save you time and money in your localization projects.

And there’s much more
   Authors with in-depth knowledge summarize changes in the language industry and explain its financial side, describe the challenges of computing in various languages, explain and update encoding schemes, and evaluate software and systems. Other articles focus on particular countries or regions; specific languages; translation and localization training programs; the uses of language technology in specific industries: a wide array of current topics from the world of multilingual computing.
   If you are interested in reaching an international audience in the best way possible, you need to read MultiLingual.

Subscribe to MultiLingual at

