					                          The Nuts and Bolts;
       How to use XLIFF Vocabularies, and other Open
       Standards for XML Translation and Localization


                 The OASIS Symposium on the Future of XML Vocabularies

                                     26 April, 2005


                          Bryan Schnabel, Tektronix, Inc.

1   February 1, 2010
    The thing about language translation. . .
     If you are involved in translation, you know just how tricky it
        can be
          – It’s never the easiest part of the operation
          – Some of these topics might really be useful for you



     If you’re not involved in translation, view this presentation
        as an OASIS success story
          – The interaction between different XML vocabularies is an interesting
            concept




2    February 1, 2010                     Confidential
    I will tell my story in parts
     Part I: Good things
          – Why XLIFF is good
          – Why Open Standards are vital
          – Why diversity is good

     Part II: Sketch a quick example (case study)
     Part III: How does XLIFF work?
          – Nuts
          – Bolts

     Part IV: Demonstrations: interacting XML vocabularies
          – Technical Documentation and XLIFF
          – SVG and XLIFF
          – Online Help and XLIFF




    Part I

    The Goodness: XLIFF, Open Standards, and
     Diversity are each good. And they are
     related. . .




    In some ways, the relationship is hierarchical or
    layered




    In some ways it’s linear




    XLIFF is good
     XML Localization Interchange File Format
    (from the charter, http://www.oasis-open.org/committees/xliff/charter.php)
    The purpose of the OASIS XLIFF TC is to define, through XML
      vocabularies, an extensible specification for the
      interchange of localization information. The specification
      will provide the ability to mark up and capture localizable data
      and interoperate with different processes or phases without
      loss of information. The vocabularies will be tool-neutral,
      support the localization-related aspects of internationalization
      and the entire localization process.
    ...


    So, why is XLIFF good?
     It’s an open standard
     It has been developed by people from all the important
        disciplines in translation
          – Tool makers
          – Translation and localization vendors
          – End users
                   Technical communicators and technical developers
                   Software developers


     Being an open standard, and having been developed by people from a
        wide range of disciplines, XLIFF vocabularies reflect diversity




    Open Standards are vital
     A company’s information is one of its most valuable resources
     Proprietary tools and formats place information at risk
          – The information is difficult to access without the proprietary tool
          – Tool makers are not bound to maintain their tool in ways consistent with
            the requirements of a company

     Open standards facilitate “future-proofing”
     Open standards provide industry standard vocabularies to
        exchange information (peer to peer; vendor to client; apple to
        orange)




     After all. . .
     “Successful business integration relies on agreement
       between parties on the vocabularies that define the
       messages they exchange, and the understanding of
       both the syntax and semantics of the exchanges.”

     William Cox
     Jishnu Mukerji
     Symposium Co-Chairs



     * from The OASIS Symposium on the Future of XML Vocabularies,
        http://www.oasis-open.org/events/symposium_2005/about.php

     Part II

     To set the stage for the next parts, here’s an
      example: a case study

                                         Part III and Part IV
                                         will be framed in the
                                         context of this case
                                         study (mostly)




     Tektronix
      The Tektronix Technical Communications department is typical of
         many publishing groups
           – Staff of about 40 people
                    tech writers, editors, graphic artists, production people, managers, 2
                     developers, 1 localization project manager

           – Many thousands of pages per year
                    paper manuals, CDs and DVDs, online help, manuals on the web (PDF), marketing
                     product datasheets (HTML), important internal documents




For a long time we just published paper, lots of paper

      We got by rather well, for 17 years, using mostly Interleaf




      [Images: User manual, TDS3000 Reference, Service manual, Reference card]




 Translating with Interleaf was complicated

      [Diagram: the Interleaf translation workflow]
      Change log: changes during translation cycle
      Translation provider – extraneous activity
           – Text extraction and graphics extraction from the Ileaf source doc ($$)
           – Template
           – Build Ileaf docs (DTP) ($$$ to $$$$), yielding translated Ileaf docs
           – Build Frame docs (DTP) ($$$ to $$$$), yielding translated Frame docs
      Translators (Word in Trados) ($$)
      Translated documents were usually not maintained


     An unplanned “what if” happened
      Interleaf was purchased by BroadVision
      BroadVision’s roadmap for Interleaf departed from our
         requirements




     We had to make a choice
     We chose to develop a new system based on open
      standards, starting with XML




     The price of change
      “Rescue” Interleaf files
           – Develop a means to transform Interleaf files to XML
           – Develop the means to extract embedded Interleaf graphics (preserving
             fonts and vectors)



      Select XML tools


      Develop XML applications


      Training and implementation




     Why it was worth the price
      Future proof
           – If any XML tool evolves away from Tektronix’ requirements, just get a
             new tool.
           – Data needs no “rescue operation”

      One source can be published in many formats
           – PDF
           – HTML
           – Online help

      Single sourcing is enabled in a non-proprietary way
      Translation becomes less problematic




     Let’s explore this point further:

      Translation becomes less problematic




     XML-based translation (b2b XML transaction)

      Transform XML content
            – From authoring DTD
            – To XLIFF Schema
            [Diagram: English XML file → XML (XLIFF Schema)]

      Translation provider translates content

      Transform XML content
            – From XLIFF Schema
            – To authoring DTD
            [Diagram: XML (XLIFF Schema) → Translated XML file]

      Translated documents are maintained (become assets)
      Remember, the translation vendors helped write XLIFF; they are quite
         comfortable with it
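
The first transform on this slide (authoring DTD to XLIFF) can be sketched in a few lines. The deck itself uses XSLT for this step; the Python below is only an illustration of the idea, not the Tektronix implementation. The authoring element names (`chapter`, `title`, `para`) and the file name are hypothetical, the XLIFF namespace declaration is omitted for brevity, and inline markup and mixed content are ignored.

```python
import xml.etree.ElementTree as ET

# Hypothetical authoring-DTD fragment; element names are illustrative only.
AUTHORING_XML = """<chapter>
  <title>Logical links</title>
  <para>Assignments must always be unique.</para>
</chapter>"""


def authoring_to_xliff(xml_text, source_lang="en", target_lang="zh"):
    """Wrap the text of each text-bearing element in an XLIFF 1.x style trans-unit."""
    root = ET.fromstring(xml_text)
    xliff = ET.Element("xliff", {"version": "1.1"})
    file_el = ET.SubElement(xliff, "file", {
        "original": "chapter.xml",  # hypothetical file name
        "source-language": source_lang,
        "target-language": target_lang,
        "datatype": "xml",
    })
    body = ET.SubElement(file_el, "body")
    for index, element in enumerate(root.iter()):
        text = (element.text or "").strip()
        if not text:
            continue  # structural element with no translatable text
        unit = ET.SubElement(body, "trans-unit", {
            "id": f"u{index}",
            "resname": element.tag,  # remember which authoring element this was
        })
        ET.SubElement(unit, "source").text = text
        # The target starts as a copy of the source, flagged for translation.
        target = ET.SubElement(unit, "target", {"state": "needs-translation"})
        target.text = text
    return xliff


xliff_doc = authoring_to_xliff(AUTHORING_XML)
print(ET.tostring(xliff_doc, encoding="unicode"))
```

Recording the authoring element name in `resname` is one way to keep enough structure for the return trip; the XLIFF snippets later in the deck show the same attribute in use.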
     Part III

     The nuts and bolts: here’s a brief look at how
      XLIFF works.




     XLIFF and XML Vocabularies
      The XLIFF document stores bilingual source-target strings and
         metadata
      The XLIFF document preserves the structure from the original
         documents
      The XLIFF document can contain metadata to aid in other
         aspects of the translation
           – Word count
           – Translation memory
           – Alternate translations
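
As a rough illustration of the metadata listed above, the sketch below attaches a word count and a translation-memory suggestion to a single trans-unit. The element names (`count-group`, `count`, `alt-trans`) follow my reading of XLIFF 1.x, and the attribute values chosen here are assumptions to check against the specification. The `memory` dictionary and its pseudo-translated entry are hypothetical.

```python
import xml.etree.ElementTree as ET


def add_metadata(unit, memory):
    """Attach a word count and any translation-memory suggestion to a trans-unit."""
    source_text = unit.findtext("source", default="")

    # Word count, stored in a count-group (attribute values are assumptions).
    group = ET.SubElement(unit, "count-group", {"name": "word-count"})
    count = ET.SubElement(group, "count", {"count-type": "total", "unit": "word"})
    count.text = str(len(source_text.split()))

    # Alternate translation taken from a (hypothetical) translation memory.
    suggestion = memory.get(source_text)
    if suggestion is not None:
        alt = ET.SubElement(unit, "alt-trans", {"match-quality": "100%"})
        ET.SubElement(alt, "source").text = source_text
        ET.SubElement(alt, "target").text = suggestion
    return unit


unit = ET.fromstring('<trans-unit id="u1"><source>Make final edits</source></trans-unit>')
memory = {"Make final edits": "[zh] Make final edits"}  # pseudo-translated placeholder
add_metadata(unit, memory)
print(ET.tostring(unit, encoding="unicode"))
```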




     What is XLIFF like to the user?
      At Tektronix, for example, it goes like this:




     Browse XML Repository for English file




     Make final edits




                                        The writer uses an
                                        “authoring” DTD to
                                        do their work




     Select “Translation Tools” (custom feature)




 An XLIFF file is created from the English XML and ftp’d to the
     translation vendor (along with the image files and a composed
     PDF)
     Snip of an XLIFF file: prepared for the vendor
<trans-unit resname="Para" tek:trk="n c 7" id="n-c-7">
  <source xml:lang="EN">Assignments between logical links and protocols
   must always be unique. This is especially important if your measurement
   includes a protocol stack that contains both MTP and LAPx parameters.
   You can check for uniqueness in the <g id="guilabel-x-d0e318"
   xmrk:ancs="5" ctype="x guilabel">Short View</g> pane.
  </source>
  <target state="needs-translation" xml:lang="ZH">Assignments between
   logical links and protocols must always be unique. This is especially
   important if your measurement includes a protocol stack that contains
   both MTP and LAPx parameters. You can check for uniqueness in the <g
   id="guilabel-x-d0e318" xmrk:ancs="5" ctype="x guilabel">Short
   View</g> pane.
  </target>
</trans-unit>

      All hierarchical information is preserved in the XLIFF file
      Text is presented as Source and Target elements in English;
         state attribute set to “needs-translation”
     Vendor translates the target; changes state
<trans-unit resname="Para" tek:trk="n c 7" id="n-c-7">
   <source xml:lang="EN">Assignments between logical links and protocols
   must always be unique. This is especially important if your measurement
   includes a protocol stack that contains both MTP and LAPx parameters.
   You can check for uniqueness in the <g id="guilabel-x-d0e318"
   xmrk:ancs="5" ctype="x guilabel">Short View</g> pane.
   </source>
   <target state="needs-review-translation" xml:lang="ZH">逻辑链接和协
   议之间的分配必须总是唯一的。如果测量中存在同时包含 MTP 和 LAPx 参
   数的协议堆栈,那么这一点至关重要。您可以在 <g
   id="guilabel-x-d0e123" xmrk:ancs="5" ctype="x-guilabel">Short View(短视图)</g>
   窗格中检查其唯一性。
   </target>
</trans-unit>
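
When a translated file comes back, two things are worth checking per trans-unit: the `state` of the target, and whether the inline `<g>` elements still carry the ids they had in the source. The sketch below is one possible check, not part of the workflow described in the deck. It uses a shortened unit without the `tek:` and `xmrk:` attributes, because their namespace declarations are not shown on the slides.

```python
import xml.etree.ElementTree as ET

# Shortened trans-unit; the target's inline id is filled in per test case.
UNIT_TEMPLATE = """<trans-unit resname="Para" id="n-c-7">
  <source xml:lang="EN">You can check for uniqueness in the
    <g id="guilabel-x-d0e318" ctype="x-guilabel">Short View</g> pane.</source>
  <target state="needs-review-translation" xml:lang="ZH">您可以在
    <g id="{target_g_id}" ctype="x-guilabel">Short View(短视图)</g> 窗格中检查其唯一性。</target>
</trans-unit>"""


def review_unit(unit_xml):
    """Report the target state and whether inline <g> ids match the source."""
    unit = ET.fromstring(unit_xml)
    source, target = unit.find("source"), unit.find("target")
    source_ids = sorted(g.get("id") for g in source.iter("g"))
    target_ids = sorted(g.get("id") for g in target.iter("g"))
    return {
        "state": target.get("state"),
        "inline_ids_match": source_ids == target_ids,
    }


same = review_unit(UNIT_TEMPLATE.format(target_g_id="guilabel-x-d0e318"))
changed = review_unit(UNIT_TEMPLATE.format(target_g_id="guilabel-x-d0e123"))
print(same)
print(changed)
```

A mismatch does not have to be an error, but it is the kind of difference that could stop inline markup from being mapped back to the authoring DTD on import.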




     Translated “pack” is ftp’d back to Tektronix




        Contents:
        • Composed Chinese PDF file (showpages)
        • Translated Image files
        • Translated English / Chinese XLIFF file
     Import the translated XLIFF file




        • XLIFF is transformed back into authoring DTD
        • Attributes and hierarchy are preserved
        • Translated file is checked into XML repository

     Translated file can go on its merry way . . .




     PDF file is automatically created for print




     Other media from the same translated XLIFF
                                             Chinese Online Help
                                             generated from
                                             translated XLIFF file




      Chinese HTML
      generated from
      translated XLIFF file


     Part IV

     The demonstrations
      Technical Documentation and XLIFF
        – xliffRoundTrip – Tool that converts any XML to XLIFF, and back



      SVG and XLIFF
           – SVG to XLIFF to SVG – Tool for translating text in image files



      Online Help and XLIFF
           – Translate a document via XLIFF, and publish each language as an
             online help system




     xliffRoundTrip tool: any doctype will do
      Complete round trip for ANY well-formed XML instance and
         XLIFF. . .

      Any XML instance → xml2xliff.xsl → Valid XLIFF file → Translate XLIFF file
         → xliff2xml.xsl → Translated (same) XML instance




     * xliffRoundTrip tool consists of two XSL files I developed for the XLIFF
       Technical Committee, and a Java application that processes the XML. I
       offer them free to the public under a GNU license at
       https://sourceforge.net/projects/xliffroundtrip/
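
The round-trip idea can be illustrated independently of the actual tool. The Python sketch below is not xliffRoundTrip and does not reproduce its XSL; it only shows the principle under simplifying assumptions: every text-bearing element becomes a trans-unit keyed by its position in the document, and the translated targets are written back to the same positions. Element tails, inline markup, and the XLIFF header are ignored. The sample document is hypothetical.

```python
import copy
import xml.etree.ElementTree as ET


def to_xliff_body(root):
    """Create one trans-unit per text-bearing element, keyed by document position."""
    body = ET.Element("body")
    for index, element in enumerate(root.iter()):
        if element.text and element.text.strip():
            unit = ET.SubElement(body, "trans-unit", {"id": f"u{index}"})
            ET.SubElement(unit, "source").text = element.text
            ET.SubElement(unit, "target").text = element.text
    return body


def from_xliff_body(root, body):
    """Return a copy of root with each target written back to its original position."""
    targets = {u.get("id"): u.findtext("target") for u in body.iter("trans-unit")}
    result = copy.deepcopy(root)
    for index, element in enumerate(result.iter()):
        key = f"u{index}"
        if key in targets:
            element.text = targets[key]
    return result


# Hypothetical input document; any well-formed XML would do.
original = ET.fromstring(
    '<manual rev="B"><title>User manual</title>'
    '<note type="tip">Read this first.</note></manual>'
)
body = to_xliff_body(original)
for target in body.iter("target"):
    target.text = target.text.upper()  # stand-in for the translator's work
round_tripped = from_xliff_body(original, body)
print(ET.tostring(round_tripped, encoding="unicode"))
```

Because only text is replaced, element names and attributes come back unchanged, which is what makes the result the "same" doctype as the input.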


     How about that Demo?

     Let’s break away from
     PowerPoint for a few
     minutes for a live
     demonstration of
     XLIFF and Technical
     Documentation




     SVG to XLIFF to SVG
     The generic roundtrip can be a good starting point, but sometimes
        complex document types require doctype-specific coding. Still, open
        standards should be embraced.

     Using open standards, in this demo we will:

      View an English SVG file in a text editor, the browser, and in
           Adobe Illustrator
          Transform the SVG to XLIFF using only standards (XSLT)
          Translate the XLIFF to some mythical language
          Transform the XLIFF back to SVG
          View the mythical language SVG file in the browser and AI
          Look “under the hood” at the code

     Why do we need more than just the generic
     tool?


   SVG instance → perl “well former” → svg2xliff.xsl → Valid XLIFF file
      → Translate XLIFF file → xliff2svg.xsl → perl “back former”
      → Translated SVG instance


  Namespace challenges
  Less-than-well-formed instances from some SVG tool makers
  Judgment call: given the complexity of an SVG file, the generic
      xliffRoundTrip Tool creates a “difficult-to-read” file (lots of XML “goo”
      which makes use of vocabularies to model the image geometry)
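
The namespace challenge is easy to reproduce. In the hypothetical SVG below, a lookup for `text` without the SVG namespace finds nothing, while a namespace-qualified lookup picks out just the translatable strings and leaves the geometry alone. This is a Python illustration of the issue, not the svg2xliff.xsl logic.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# Hypothetical SVG file: one shape (geometry only) and two labels.
SVG = """<svg xmlns="http://www.w3.org/2000/svg" width="200" height="60">
  <rect x="0" y="0" width="200" height="60"/>
  <text x="10" y="20">Service manual</text>
  <text x="10" y="40"><tspan>Reference card</tspan></text>
</svg>"""


def translatable_strings(svg_text):
    """Collect label text from <text> and <tspan>, skipping geometry elements."""
    root = ET.fromstring(svg_text)
    strings = []
    for tag in ("text", "tspan"):
        # Elements live in the SVG namespace, so the tag must be qualified.
        for element in root.iter(f"{{{SVG_NS}}}{tag}"):
            if element.text and element.text.strip():
                strings.append(element.text.strip())
    return strings


unqualified_hits = len(list(ET.fromstring(SVG).iter("text")))
labels = translatable_strings(SVG)
print(unqualified_hits, labels)
```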




     How about that Demo?

     Let’s break away from
     PowerPoint for a few
     minutes for a live
     demonstration of
     XLIFF and Graphics




     Compiling translated XML to online help
     Using mostly open standards (exception: the help compiler), we
         will:


      View an English XML file in the Epic Editor
      Transform the XML to XLIFF using open standards (XSLT)
      Translate the XLIFF to Chinese and German
      Transform the XLIFF to XHTML
      Compile the XHTML into online help (alas, with the MS HTML
           Help compiler, which is free but not an open standard)
      View the Chinese and German online help




     Open standards produce non-roundtrip


      English XML file → 2 XLIFF files → Translate to Chinese and German
         → 2 translated XLIFF files → Online Help

      From the 2 translated XLIFF files:
           – XSLT 2.0: Transform to XHTML files
           – XSLT: Create the configuration files drawn from the translated
             XLIFF files
                    Project file
                    Index file
                    TOC file

      Difference: the starting point and ending point are not the
         same doctype
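
A non-roundtrip transform reads the translated targets and writes a different vocabulary. The deck does this with XSLT 2.0; the Python sketch below only illustrates the shape of the step. The `TAG_MAP` from `resname` values to XHTML elements and the two sample trans-units are hypothetical, and the XHTML namespace declaration and `head` element are left out for brevity.

```python
import xml.etree.ElementTree as ET

# Hypothetical translated XLIFF body (German targets).
XLIFF = """<body>
  <trans-unit id="u1" resname="title">
    <source>Settings</source><target xml:lang="de">Einstellungen</target>
  </trans-unit>
  <trans-unit id="u2" resname="para">
    <source>Press OK.</source><target xml:lang="de">OK drücken.</target>
  </trans-unit>
</body>"""

TAG_MAP = {"title": "h1", "para": "p"}  # hypothetical resname-to-XHTML mapping


def xliff_to_xhtml(xliff_text):
    """Build a simple XHTML page from the target text of each trans-unit."""
    body = ET.fromstring(xliff_text)
    html = ET.Element("html")
    html_body = ET.SubElement(html, "body")
    for unit in body.iter("trans-unit"):
        tag = TAG_MAP.get(unit.get("resname"), "p")  # default to a paragraph
        ET.SubElement(html_body, tag).text = unit.findtext("target")
    return html


page = xliff_to_xhtml(XLIFF)
print(ET.tostring(page, encoding="unicode"))
```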




     How about that Demo?

     Let’s break away from
     PowerPoint for a few
     minutes for a live
     demonstration of
     XLIFF and Online Help




     Questions . . .



