Something for Nothing

					Getting Something from Translation Memory
By Addison P Phillips
Director, Globalization Architecture, webMethods, Inc.


There is an obsession in the software business with doing the minimum amount of work. The
archetype is Larry Wall, inventor of Perl, who famously said:
A truly great computer programmer is lazy, impatient and full of hubris. Laziness drives one to work very hard to
avoid future work for a future self. Impatience has the same endgame.
So it should be no surprise that software localization is also rife with technologies and
products designed to reward laziness. Machine translation (MT), text alignment, terminology
seeding, portals, GMS, and translation memory: very hard work indeed in support of avoiding
“future work for a future self.”
The potential benefits of computer assisted translation (CAT) tools are said to range from
improved consistency to reduced cycle time and translation cost. Of course, following Wall's
dictum, many of us are impatient and would like to see these rewards quickly.
Translation vendors, especially the sophisticated multilingual firms, appear to be harvesting
some value from these tools. In “The Last Word” of MultiLingual Computing & Technology #65
Volume 15 Issue 5, Henri Broekmate suggests that vendors should use a wide variety of tools to
provide incremental increases in efficiency. He also suggests that most companies not only do
but also should outsource the technology aspects of localization.
The question for software companies (as consumers of translation services), though, is whether
adopting one or more tools delivers appropriate benefits; whether and which tool strategies they
should adopt to receive these benefits; and what costs are associated with these benefits. Is there
a return on an investment in translation memory? And are we ever going to get a weekend at the
beach courtesy of our tools?
Many localization professionals are afraid of the ridicule and internal fallout that might come
from talking concretely about their tools investment and particularly about failure, so it is
difficult to tell if one's experience is typical. The yardsticks applied to "regular" software
development tools are said not to apply to the localization process. With that in mind, this article
examines the costs and rewards of adopting tools as experienced at webMethods. In particular it
focuses on webMethods’ use of TRADOS and SDLX.
In 2003 webMethods adopted the SDLX Translation Suite as the standard in-house translation
tool in an effort to control costs and help ensure future process scalability. This is a personal
story rather than a product review. It is, in some ways, a success story, but not the success we
expected.
The globalization team experienced many issues in adopting these tools and adapting processes
to them. There were certain expectations going in. How well were these expectations met? By
examining webMethods' experiences, perhaps others can develop a realistic assessment of what a
tools strategy's costs and benefits are and whether tools vendors are delivering on their promises.
webMethods' adoption of SDLX was driven by the evolution of our localization program. More
products were being localized. These products were more complex and going into more
languages. Budgets and staffing were more or less constant. Schedules, meanwhile, were
growing shorter. We needed to become more efficient in order to deliver more product with the
same (or less) amount of work.
Anecdotal Evidence
By September 2003 the webMethods globalization team was nearing completion of its first
substantial project using a new tools approach based on SDLX. We had reached a point where
we were seriously questioning our choice of approach and tools, not to mention our sanity. The
engineering and project management teams were experiencing a letdown. Asking fellow
professionals about our experience, we got some responses that bothered us:
“What did you expect? Everyone knows you have to handle the files, usually individually, that
way.”
“Your engineers are overqualified: we find that localization engineers just accept this sort of
thing, while mainstream engineers don’t understand why it doesn’t work. Localization folks just
know it doesn’t work and accept it.”
“Tools work best in homogenous environments: lots of big documents. They just can’t handle
these kinds of issues, so we just use them for traditional doc sets.”
“Why are you messing with the memory? Just pay the vendors to deal with it.”
webMethods' Localization Requirements
webMethods makes enterprise software. As a software company ourselves, we think we
understand what expectations a product's customers should have and what to expect from our
own software and our software vendors.
webMethods is also a pretty typical localization consumer, both in terms of the localization
activities and in terms of evolution as a global company. webMethods has a centralized
localization group that manages the projects; a testing group that certifies our products for non-
English platforms as well as the localized products; and a small engineering team that prepares
translation kits and builds the localized products.
webMethods products are complex to localize and test. For this reason and because the company
simships (releases localized product on the same day as the English), webMethods has
historically engineered translation kits internally. The company localizes all of its major product
lines, a policy driven by success of the localized products in Europe and Asia, as well as
awareness by upper management of the strong linkage between the modest localization budget
and success in these markets. The results speak for themselves: by 2003, 43% of webMethods
revenue came from overseas, compared with just 20% when I joined the company in 2001. In
2003, webMethods was named the fastest-growing software company in America.
To help make simultaneous shipment possible, while preserving flexibility, webMethods uses
"language packs" to deliver localized software. Language packs are localized add-ons to the base
release, which is a "global binary". One of the benefits of this separation is that it allows the
localization process to run in parallel up to the release date without being directly tied to the core
English product's build and test cycle.
webMethods products consist of a typical mix of materials: Java and C++ message catalogs
(generally using Java’s ListResourceBundle format); HTML help files (single sourced from
Frame); an online interface (originally HTML with a proprietary server-side scripting language
called “DSP”, but transitioning to JavaServer Page technology and JSR168 portlets); and
traditional documentation (mostly Frame). In recent years there has been growth in other formats
(notably the advent of JSP in place of the proprietary DSP as a tag language and of the W3C
standard SVG for graphics in our business process modeling and visualization tools).
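To make the message-catalog and language-pack mechanics concrete, here is a minimal sketch; the
class name, keys, and messages are invented for illustration and are not webMethods' actual
resources. Each class would live in its own source file.

    import java.util.ListResourceBundle;

    // Hypothetical English catalog shipped in the "global binary"; the keys and
    // strings are invented for this example.
    public class ServerMessages extends ListResourceBundle {
        protected Object[][] getContents() {
            return new Object[][] {
                { "server.start", "Server started on port {0}." },
                { "server.stop",  "Server stopped." },
            };
        }
    }

    // A language pack delivers a sibling class such as ServerMessages_ja with the
    // same keys and translated values. At run time the product calls
    //     ResourceBundle.getBundle("ServerMessages", locale)
    // and the standard fallback mechanism finds the language pack if it is installed.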
Expectations
webMethods expected translation memory tools to provide scalability in the localization process
by filtering out material that had been previously translated and unchanged between releases, as
well as cutting down on the amount of first-time or raw translation for material that had been
slightly modified. We wanted to apply glossaries and style guides consistently and to reduce the
cycle time on our translation and our review processes by reducing the amount of material
involved and eliminating review of fixed material.
The benefit in the translation cycle is partially offset by the need for trained localization
engineers to process the source material and generate the target language materials at the end of
the project.
Basically translation memory works by importing the source material and breaking it into
individual "translation units" or "segments" of text. New material is translated. Slightly modified
material is edited. Repetitive segments or text that hasn't changed from a previous translation is
recycled. The resulting segments are then exported back to the original format to form the target
language materials. These results can be formatted, engineered, and quality checked.
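At its core, the memory is just a lookup from source segments to approved target segments. The
sketch below (class and method names invented) shows the exact-match case that produces "100%
match" leverage; real products layer fuzzy matching, formatting placeholders, and context
attributes on top of this idea.

    import java.util.HashMap;
    import java.util.Map;

    // Minimal sketch of exact-match leverage. Names are illustrative only.
    public class SimpleMemory {
        private final Map<String, String> units = new HashMap<>();

        // Record an approved translation for a source segment.
        public void add(String source, String target) {
            units.put(source, target);
        }

        // A "100% match": return the stored translation, or null if the segment
        // is new and must go to a translator.
        public String leverage(String source) {
            return units.get(source);
        }
    }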
For this to be effective, several things must happen.
First, the import and export must handle the source file formats properly and produce results that
are consistent from one run to the next.
coming out of the export should function in a manner consistent with the source language.
Second, the segmentation of the text must be linguistically sound and match actual text usage. A
piece of HTML like "<p>This is an <emph>emphasized</emph> item.</p>" should form a
single segment, not three, and the emphasized phrase in the source should be the emphasized
phrase in the target language, even if the word order changes (or results in two emphasized
phrases).
Third, the matching of segments in memory to content should be accurate. If no one is reviewing
the "100% match" segments, they had best be correct. There should not be segments that produce
lower match levels that haven't changed. The final, good quality translations should be the ones
recycled.
The memory should produce results that are contextual and be able to deal with similar (or
matching) segments in differing contexts accurately.
Finally, the amount of effort to produce these results should be less than the cost and time saved
by using the tools and should decline over time.
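Returning to the HTML example under the second requirement, the usual way to meet it is to reduce
inline markup to placeholders so that the sentence remains a single segment. A rough sketch
follows; the tag list and placeholder syntax are invented, and a real tool would also record what
each placeholder stands for so it can be restored on export.

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Reduce inline tags to numbered placeholders so that
    // "This is an <emph>emphasized</emph> item." stays one segment and the
    // translator can move the emphasis wherever the target syntax requires.
    public class InlineTagMasker {
        private static final Pattern INLINE = Pattern.compile("</?(emph|b|i|span)[^>]*>");

        public static String mask(String segment) {
            Matcher m = INLINE.matcher(segment);
            StringBuffer out = new StringBuffer();
            int n = 0;
            while (m.find()) {
                m.appendReplacement(out, "{" + (n++) + "}");
            }
            m.appendTail(out);
            return out.toString();  // "This is an {0}emphasized{1} item."
        }
    }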


TRADOS
At first, to achieve these results, webMethods used separate strategies for each type of file and
standardized on TRADOS for our translation memory needs. Our selection of TRADOS was
based purely on its position as the market leader, without regard for the technology. The
globalization team felt that the benefits from using the leading technology would overcome any
technical issues involved in the adoption of the tool and that continued leadership would dilute
those issues over time. Judging from discussions with other companies, this is actually a fairly
common attitude toward tool selection. Coupled with a sort of “can do” attitude on the part of
localization engineers, it means the tool can usually be made to work. It wasn’t a bad choice,
but it didn’t meet the expectations set out above.
This is partially explained by how webMethods used TRADOS as compared to our perception of
the product's design goals. Many localization consumers, such as webMethods, pay their vendors
to run the tool, rather than running tools themselves: they only care about the cost savings and
consistency benefits that they gain from the tool, not the actual operational details. The delivery
at the end includes the memory files, but the localization consumer doesn’t open or maintain the
files. The features and design of the products reflect the most common usage patterns of
translation vendors and translators, rather than the requirements of the ultimate consumer.
A common pattern is that each particular file is pre-processed in a vendor-specific way by one of
the vendor’s engineers (and with minimal time investment). A loss of some leverage between
releases is looked at as a necessary evil. That is because producing consistent segmentation and
linguistically sound segmentation at the same time requires intervention in the source files. High
accuracy is more costly because it requires more work by the localization engineers. Perfect
accuracy can consume much of the cost savings, since engineering time is expensive compared
to translation cost. Post-processing removes the specific “tweaking” that was done to the source
files in order to create the target language files. The resulting memory is very sensitive to this
processing, so the memory isn't really portable because the processing applied to the source is
idiosyncratic, even though it is still “a TRADOS memory” or “an SDLX memory.”
We experienced problems implementing TRADOS because we were not prepared for the sheer
effort involved in implementing this kind of processing. TRADOS reliably processed our
traditional Word and Frame-based documentation, but not our Web content, especially files that
used JavaScript. Our server-side scripted "DSP" pages, of which we had about 800 in one
particular product, required enormous, painstaking engineering effort in order to produce
linguistically sound segmentation that didn't break file functionality. On our startup projects for
each product we got essentially no leverage (the files contained only modest repetition), and the
files took a lot more engineering time and effort to produce. Our follow-on projects required less
effort, but not substantially less.
The main problem we faced was maintaining segmentation, the critical process by which a
translation memory tool identifies a specific chunk of text to be recycled. We had to write
extensive preprocessor tools to “protect” embedded scripting and fool TRADOS's segmentation
engine into recognizing single segments of text in various special situations. The resulting target
language files still had to be hand groomed by an engineer and individually tested.
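The kind of pre-processor we mean can be reduced to something like the sketch below: embedded
scripts are hidden behind opaque tokens that the segmentation engine will leave alone, and a
matching post-processor restores them after translation. The class name and token syntax are
invented for illustration.

    import java.util.ArrayList;
    import java.util.List;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    // Hide embedded <script> blocks behind opaque tokens before handing the page
    // to the TM tool, then restore them unchanged in the translated output.
    public class ScriptProtector {
        private static final Pattern SCRIPT = Pattern.compile("(?is)<script\\b.*?</script>");
        private final List<String> saved = new ArrayList<>();

        public String protect(String page) {
            Matcher m = SCRIPT.matcher(page);
            StringBuffer out = new StringBuffer();
            while (m.find()) {
                saved.add(m.group());
                m.appendReplacement(out, "__SCRIPT_" + (saved.size() - 1) + "__");
            }
            m.appendTail(out);
            return out.toString();
        }

        public String restore(String translatedPage) {
            for (int i = 0; i < saved.size(); i++) {
                translatedPage = translatedPage.replace("__SCRIPT_" + i + "__", saved.get(i));
            }
            return translatedPage;
        }
    }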
The localization engineering team at this time consisted of three engineers, two of whom were
TRADOS experts and none of whom had less than six years of experience doing localization.
We had the then-latest version and we took training and used consulting services from TRADOS.
For a single-language product involving six components, our engineering schedules ran far
beyond traditional straight translation of the source files, jeopardizing the release date of the
project. Based on our lack of success, experience in “previous lives”, and the relative success of
our backup plan, we chose to follow four distinct processes for our files:
   •   Pay vendors to run TRADOS on our traditional documentation.
   •   Use a specialized vendor (Mercurius International) with proprietary tools (Termslink) to
       process on-line materials.
   •   Translate the help source as traditional documentation (and then rebuild the localized
       help from it).
   •   Build our own custom tools for the Java and C files, tools that transformed our message
       files into XML structures very similar to XLIFF (sketched after this list). These last were
       capable of trivial “in-context 100% matching”, but not terminology management or fuzzy
       matching.
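A rough sketch of what that extraction step looks like follows; the element names are illustrative
rather than webMethods' actual schema. Each resource key rides along as the translation unit's id,
which is what makes the trivial "in-context 100% matching" possible when a later version of the
same catalog is merged against it.

    import java.util.Enumeration;
    import java.util.ResourceBundle;

    // Turn every key/value pair of a message catalog into an XLIFF-like translation
    // unit, carrying the resource key as the unit id for key-by-key matching.
    public class CatalogExtractor {
        public static String toXml(ResourceBundle bundle) {
            StringBuilder xml = new StringBuilder("<file>\n");
            Enumeration<String> keys = bundle.getKeys();
            while (keys.hasMoreElements()) {
                String key = keys.nextElement();
                xml.append("  <trans-unit id=\"").append(key).append("\">\n")
                   .append("    <source>").append(escape(bundle.getString(key))).append("</source>\n")
                   .append("    <target/>\n")
                   .append("  </trans-unit>\n");
            }
            return xml.append("</file>").toString();
        }

        private static String escape(String s) {
            return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        }
    }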
TRADOS, in other words, was reduced to a small part of our localization activity and outsourced.
Experience varied by product and component, but 30% to 80% leverage was not too uncommon
for second- and third-generation materials.
The Second Round: SDLX
This process served webMethods well for a number of years, but continued demand for
localization, addition of products, and schedule acceleration began to put a strain on the team. In
part this is because webMethods does “moving target” translation in order to achieve
simultaneous shipment. "Moving target" is when you localize the software source files in parallel
with the software development process and only institute a “resource freeze” with the core
development team on a date relatively close to release (so that the resources are changing while
you work). During the course of a typical release, the same files might be processed for
translation four or more times and the software must be cycled on a fairly rigid schedule
(synchronized with check builds of the core product and the schedule of the localization QA
team). Matching and leveraging are crucial to the success of such a project, but turnaround time
on the pre- and post-translation processing by localization engineers is the real driver.
Success at delivering more products more frequently led us to question our tools approach. We
identified that we would be processing several new file formats for which we didn't have a ready
process. Our translation focus changed to include more software and on-line content and less
traditional hardcopy documentation. And we started to experience increased demand for other
languages. We needed to increase the efficiency of our engineering and translation processes and
we began to look about for alternatives.
One of the choices we made was to replace TRADOS with SDLX. SDLX is a product of SDL
Desktop Products, an adjunct to the localization vendor based in Maidenhead, England. We went
back and read the review in MultiLingual Computing & Technology for the most recent version
and started writing to or talking with various current and former associates who had used it (and
other tools).
Trial Balloons
SDL offers a 30-day download of the latest version. This is a fully functional “Pro” license that
can do everything that the full product can. The localization team installed several copies of the
product and began to evaluate how well SDLX would address our needs in each product area.
Two parts were very simple. First, SDLX imported our existing TRADOS memories and
leveraged a very large percentage of several test documents. Then, with a small modification to
our software tools, SDLX happily imported the XML files for our software resources, aligned
them, and performed seamless leveraging.
The proprietary DSP and HTML files took a little more effort. We have hundreds of these files to
process, and we chose some relatively good (we thought) examples. SDLX seemed to handle a
small sample in a straightforward manner.
We decided that we would proceed to purchase the product. Realizing that there would be
significant work building initial memory and figuring out difficulties with the product, we
planned to implement SDLX in our process using a non-critical project. webMethods was doing
the initial work to add Simplified Chinese to our language mix. This work was done on already-
shipping products, without the added complexity of simship or product development
coordination and during a coincidental lull in other activities.
Using SDLX
Using SDLX begins by creating a “project.” The engineer selects the files to process, which must
all reside in the same directory. The tool provides a variety of “filters,” which are software
components that know the file format, identify text that is available for translation and break it
into segments. Because file formats are somewhat variable, each SDLX filter provides a myriad
of options that allow you to choose exactly how to handle the files in the project. For example, the
XML filter allows you to specify which tags (elements) contain segments and which attributes
are translated.
The greatest variety of filters is for online web content. There is a plain HTML filter, plus
several designed for server-side scripting languages (such as PHP or ASP). The oddly named
WBF filter is the most versatile and is the one webMethods used to process our DSP (and later
JSP) pages.
The various filters each have issues related to how input is converted into segments for
translation and back. Some of the filters try to correct mismatched pairs of tags (for example
matching every "<p>" with a "</p>") — including, oddly, the server-side scripting filter. Each of
the filters handles embedded JavaScript differently.
Editing and Translating
The translator's tool is a Windows program that shows a grid of source and target language
segments side-by-side. Colored segment numbers indicate whether a target language segment is
a 100% match, partial match, locked, and so forth.
The translators also have a large window in which they can see concordances with the memory
or with a TermBase (glossary). The translation environment is easy to use and integrated with the
other SDLX tools. The files (called “ITDs”) are basically Microsoft Access database files.
There are various versions of the translation suite. A key selling point is the availability of a free
version, called SDLX Lite. In order to use this tool, you must purchase the “Elite” version of the
suite (only ITDs produced using Elite can be read by Lite). If your vendors are willing to
purchase “Standard” or “Pro” licenses for themselves, you won’t need a special license version
to prepare translation kits.
Handling Embedded Formatting
The translation tool allows translators to control the way that formatting is interpreted as well. In
most cases, SDLX replaces embedded formatting information with visual cues. The translator
can move the formatting around using a special mode called “Format Painter.” With TRADOS
we frequently found that embedded formatting caused a segmentation break, which makes it
more difficult to create a good translation. With SDLX we found that segmentation problems of
that sort were much reduced.
Another thing that can occur in a segment is replacement text (such as variables in a software
message or server-side scripts in an HTML page). Such an item is represented in SDLX as a
small red vertical bar. As with formatting, the “red bars” can be moved using the Format Painter.
In theory, the contextual information embedded in these tags or markers is displayed as a tool tip
when the mouse is hovered over the bar.
Although it sounds like a useful feature, it generally turns out not to be very valuable. The mouse
has to be hovered precisely over a small target for a long time. And since the information
presented there can't be controlled and is interpreted by the filter, it frequently doesn't convey
what the red bar actually represents. There is no provision for comments in the translation kit or
in an item. The Format Painter is similarly painful to use—the translator often can't see what
kind of formatting they are moving around.
Translators hate these features and are especially annoyed by the need to hover the mouse. The
result is a larger number of support calls from translators to the engineers or lead linguist (or a
need to peek at the source).
Once a segment has been translated and formatted and placed into a translation memory, you
don't have to perform the painting task on that segment again. This is great if you start with
SDLX from scratch. Unfortunately, aligning files or importing existing translation memory, even
though they contain the correctly reordered replacement variable information in the translated
segments, loses much of this information; and if you do much with alignment and import, you’ll
want to have a linguist go through and correct segments that contain multiple replacement
variables and other formatting.
The "Lite" version is fully capable for editing files. The Elite license needed to care for and feed
this free editor isn’t too costly, but it can be a distraction if you don’t remember to use the right
license to build your translation kits. In addition, the free editor can’t maintain TMs or
TermBases, so you’ll have to do all of that work using a Pro or Elite license if you go this route.
Generally it isn't too difficult to persuade vendors to acquire the tool for themselves.
No matter which version you use, there is no way to put comments to the translators into the
translation kit or to associate comments with a segment of text. This means that there are often a
lot more questions about context and meaning than with our older tools. It is also difficult to tell
the context of segments in the editor. There is a “preview” option, but it doesn’t work with
server-side scripting languages like JSP, PHP, or ASP.
Similar-looking segments often have very different contexts, and some must not be translated. At
first engineers had to lock these segments by hand (one at a time, so don't have too many).
Later, as we wrote more complex preprocessor tools, we found we could add additional markers
into our files that we could then instruct the filter to treat as an indication of non-translatable text.
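A small illustration of the marker approach (the <dnt> tag name and the identifier pattern are
invented): a pre-processor wraps strings that must never be translated, and the filter is then
configured to treat the marker as non-translatable.

    import java.util.regex.Pattern;

    // Wrap anything that looks like a dotted internal identifier in a marker tag
    // that the filter is told to treat as non-translatable text.
    public class DoNotTranslateMarker {
        private static final Pattern IDENTIFIER =
            Pattern.compile("\\b[a-z]+(?:\\.[A-Za-z0-9_]+){2,}\\b");

        public static String mark(String text) {
            return IDENTIFIER.matcher(text).replaceAll("<dnt>$0</dnt>");
        }
    }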
One of the key features we looked at when selecting SDLX was the ability to do multiple
translations. Frequently in software there are identical short strings that have different contexts
and thus must be translated differently. For example, one string might be an internal identifier
and thus must not be translated, while another is a display string in a menu (acting as a verb) and
a third is a label in a dialog (acting as a noun or adjective). The ability to differentiate different
uses of the same source string is critical to getting a quality software translation.
To make this work in SDLX we used the additional attributes that our software extraction tools
produced about the source strings. These provided tokens or markers that identified each instance
of a string and allowed us to have separate entries in the TM database for each. The translator
could then produce a separate translation for a new segment or “borrow” one of the existing
translations. SDLX can also use these tokens to leverage like-to-like across versions of the same
product, reducing the need for translator intervention.
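Conceptually, this amounts to keying the memory on the context token as well as on the source
text, with a fallback to a plain source-text match when no token-specific entry exists. A minimal
sketch, with invented names:

    import java.util.HashMap;
    import java.util.Map;

    // Context-sensitive matching: identical English strings can carry different
    // translations per context token (a menu verb vs. a dialog label, for example).
    public class ContextMemory {
        private final Map<String, String> byContext = new HashMap<>();
        private final Map<String, String> bySource = new HashMap<>();

        public void add(String token, String source, String target) {
            byContext.put(token + "\u0000" + source, target);
            bySource.put(source, target);
        }

        // Prefer the translation recorded for this exact context; otherwise offer
        // whatever exists for the bare source string so the linguist can confirm
        // it or "borrow" and adapt it.
        public String leverage(String token, String source) {
            String exact = byContext.get(token + "\u0000" + source);
            return exact != null ? exact : bySource.get(source);
        }
    }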
Using this capability was difficult, because we found that different versions of SDLX varied in
their support for maintaining the all-important attributes used to distinguish separate translations.
As we encountered bugs in the various filters, we began to spend more and more time interacting
with SDL in an attempt to get fixes for the issues we encountered.
Support from SDL
Like all software, translation memory products have bugs. Unfortunately, obtaining support and
fixes for bugs turned out to be more difficult than we felt it should have been. As software
vendors ourselves, we expected our vendors to leap on bugs and provide fixes. We expected that
future fixes would not be accompanied by regression on previous bugs and that we could get
fixes quickly.
This isn't exactly our perception of the support we received. For example, fixes to one set of
minor filter problems were delivered in a build of SDLX that broke attribute matching. Since our
software translation relied on XML attributes for context, this forced our engineers to run two
different versions of SDLX to process different file types: one with fixes that supported our on-
line documentation and a different (unpatched) version for our software files and their critical
attributes. Since there can’t be two copies of SDLX on the same machine, we needed at least
twice the number of computers for basic file processing.
As we progressed through a series of projects, we encountered more issues, some major and
some minor, with the various filters, imports, and ancillary software. SDL sent us new builds, but
these appeared not to incorporate or "roll up" fixes we had previously accumulated. In other
cases we received builds that broke different functionality. Some bugs were never fixed, despite
being acknowledged by SDL and some were fixed in the next version of SDLX (which we were
asked to purchase) rather than in the code base we were using. We never received a version of
SDLX in which all of the bugs were fixed at the same time. To save time and get projects done,
our engineers began to write small programs to modify the source files to overcome these
obstacles.
The need to run multiple versions of SDLX, which required us to switch computers frequently,
made another aspect of using SDLX particularly agonizing: for a translation “memory” product,
it sure didn’t remember things very well! Each time we went to process a set of files, we had to
recreate the project from scratch, setting all of the settings on the filters again, as if for the first
time. Each setting had to be reset for each item.
This might work well at a vendor site or for post-release file processing, since there may not be a
need to process a set of files more than once. But when there is a need to process many different
collections of the same types of files on a regular basis, it is a bit of a headache. Our engineers
maintain lengthy “cheatsheets” with the list of instructions for processing each collection of files
in order to ensure that they don't miss a critical detail that would require starting over. Many of
these cheatsheets run to several pages of closely spaced text.
Later our engineers wrote sets of tools that were integrated with Ant build scripts to automate
much of the pre- and post-processing. But SDLX neither supports automation nor offers any
feature for remembering common or even the most recent configuration.
Managing Large File Collections
When translating a website or other products that feature many small files of the same general
type, one nice feature is the ability to “batch” files together into larger translation kits. For
example, one of our products has 855 small JSP files that form page fragments. Creating 855
separate translation files (.ITD) would be a major engineering and translation management
headache. SDLX solves this problem by allowing you to “glue” certain file types together for the
purposes of translation and then split the target language versions of the files apart at the end of
the process.
This is an incredible time saver. It has a few flaws, though.
Scripts (both server-side languages such as JSP and client-side scripts such as JavaScript) are
extracted from the files for separate handling (because the processing for these parts of the files
generally differs from that of the parent file). Unfortunately, each (and every) script is extracted
into a separate ITD file. So if you have "file1.html" and it has two scripts in it, you have three
ITD files come out. Then if you have "file2.html" with two more scripts, you now have six files
(or five files if you "glue" the HTML files together).
Once again our engineers were back writing pre-processors to extract the scripting material so
that we didn't end up with 856 ITD files—one with the glued HTML interface and 855 script
ITDs.
In addition, the scripts themselves were not handled very cleanly. Each line of the script was a
separate segment. Exposing JavaScript code to translation means testing the translated scripts or
a great deal of preparatory work locking segments when preparing the translation kit.
In fact this became a recurrent theme. Each file type has its own pre- and post-processor to
ensure highly normalized files for SDLX.
So a major portion of our SDLX tools strategy is devoted to maintaining a suite of tools to
process each type of file in order to get the required results from our translation memory:
consistent, linguistically-sound, accurate, contextual translations with a declining level of effort.
Our Java-based tools now include a variety of tokenizers and text processors controlled by an
Ant script. We built a process that allows us to create a file list for a particular project. Our Ant
script then checks files out of source control, renames, pre-processes and generally sets up all of
the materials for SDLX. Then the localization engineer can swiftly process each type of file
using the appropriate filter, creating the translation kit.
Each ITD file must still be opened by the engineer and reviewed before being sent to translators.
In many cases, using fuzzy matching produces results that must be corrected by hand, since a
98% match may still break software. Most software files use 100% match rather than fuzzy
matching, which means that our leverage has not improved versus our older simple string-
matching tools.
The offsetting benefit is that the string segments can be used as a TermBase for documentation,
improving the documentation's consistency with the final user interface.
There are also tradeoffs to increased efficiency. Getting all of the code (especially JavaScript or
JSP code) out of segments is too difficult, so there is some risk that the translation process will
break the code.
Although SDLX’s choice of MS-Access as a file format is a limitation for large projects, there
are alternatives for segmenting memory files or using Microsoft's SQL Server for storage.
In case it isn't clear, webMethods has never experienced a problem with the actual memory. All
of the bugs and issues webMethods encountered were in the filters, import/export of files, and
the alignment tools. We never experienced a problem with the memory processing or consistency
of the segmentation engine itself.
Perceptions
webMethods' perception of TRADOS and SDLX as products after having used them was quite
negative. The engineers felt that the software was unreliable and that there were many bugs or
shortcomings that required the engineers to develop workarounds. Using these workarounds
made the products seem arbitrarily arcane and difficult.
Fixes were often difficult to obtain — sometimes even an acknowledgement that an issue was a
bug was hard to come by, despite hand-built examples that clearly demonstrated the failings of the
product. Eventually we gave up attempting to get fixes and developed workarounds that relied on
intervention with the content. We settled on a specific product build whose bugs were well
known.
With SDLX, the filters in particular were the problem. There are many to choose from, each
with shortcomings of one kind or another. Unlike the core product, these appear to have been
engineered by small teams or individual engineers and for specific purposes.
For projects with the magnitude of ours, problems with the filters translated into engineering
time, working on scripts and tools to manipulate the source and target content. SDLX did offer
the alternative of writing our own filters in C++ or Visual Basic using the software development
kit, but these, of course, are not supported by SDL and might not be much more portable. With a
perception that future bugs would be blamed on our own filter, we instead chose to work around
the issues.
What concerned us greatly was that the problems we encountered were all obvious issues that
any SDLX (or TRADOS, for that matter) user should encounter when processing the same type
of content. We were often told that we were the first to encounter a particular issue, which
suggests that other users may have given up holding memory tool vendors to normal software
standards.
For example, most Frame documents have a style called "Default ¶ Font", where the style name
contains the pilcrow (“paragraph”) character as shown. SDLX mismapped that character in target
language MIF files in which the target language used a different character encoding than the
source files, such as Japanese or Chinese. The style name was unrecognized by Frame post-
translation. We could hardly have been the first customers to encounter that problem, since
nearly all Frame documents have that style, but that is what we were assured. Ultimately, we had
to write processors to handle character-mapping issues in the MIF filter, and apparently this is
how everyone does it. Really, it should have worked out of the box.
Bugs reappeared in later versions of a particular filter or build. SDL did not appear to maintain
source code “bookmarks” that allow a particular version to be patched, and adjustments to fix
one customer’s problems seemed to adversely affect another customer's implementation.
The promise of CAT is so obvious that a suite like SDLX should inspire enthusiasm in its users.
The actual quality, and the perception of tool performance that flows from that quality, made it
difficult for engineers and project managers to trust in and accept the tool. It wasn't
comforting that we were asked to purchase a follow-on release before an attempt would be made
to address some of our bugs, some nearly a year old at the time, and our dealings with SDL in
attempts to get bugs solved or obtain workarounds were often confrontational or quite bitter.
Eventually we were able to settle into a routine in which we felt that we obtained some level of
return on our tools investment. The benefit of having a single file format and using state-of-the-
art memory and terminology management began to shine forth again.
Perhaps familiarity has blinded globalization professionals to the shortcomings of our tools and
their vendors. Perhaps these tools have continued to evolve into more graceful solutions over the
past year. It seems more likely that the primary requirement for a localization engineer is
familiarity with Perl or Python, though.
Results
After a year of experience, the translation process is about as efficient as it was before adopting a
new suite of tools. The engineering effort, in terms of person-hours, is about the same to
accomplish the same amount of translation work. Version-on-version leverage for software and
traditional documentation is not improved. webMethods did gain leverage on Web content,
where no leverage was realized in the past, but this content tends to change more frequently. The
overall cost savings on translation is marginal.
There is a much more consistent approach for the translation process. All of the files are in a
single format and can be leveraged across projects and products. The localization engineers do
have to spend a lot of time on manual tasks in the translation kit and the need to maintain
processing tools for each file type is a hidden cost. Still, the tools strategy overall clearly benefits
consistency and repeatability in the process. No tool, though, seems to provide a breakthrough in
cost or speed.
It does seem that webMethods’ use of tools is on the aggressive “bleeding edge” of what can be
done. Smaller projects with more consistent inputs would probably deliver much better results.
Our experience is that we have achieved much more by regularizing the formats used by core
product teams and improving the internationalization and “localizability” of products than via the
use of tools in the localization process. These changes do eventually “trickle down” to where a
TRADOS or an SDLX can take advantage of them in ways that do improve the cycle time for
translation and engineering.
Overall the choice between using and not using tools is still a no-brainer: without tools
webMethods could not achieve the level of localized product delivery that it does with the
limited resources it applies to the task. However, the benefits appear to be limited and require an
investment to harvest them consistently.

				