Cultural Elements in Internet Software Localization

Valentina Dagienė, Tatjana Jevsikova

Institute of Mathematics and Informatics
Internet Software

   The term “Internet software” is used here
    as a general term for:
     software used to access Internet resources
      (usually on the client side),
     web-based applications (server side).
Culture
   Provides the context in which the world is understood:
    rules for behavior, communication, interaction and
   Multilevel onion-like models, e.g.: basic assumptions and
    values, with resultant behavioral norms, attitudes and
    beliefs which manifest themselves in systems and
    institutions as well as behavioral patterns and non-
    behavioral items.
   There is a relation between a user’s culture and software
    usability. Software can influence culture as well (this
    applies especially to Internet software).
Software Localization
 Software localization is the adaptation of
  software to a particular cultural
  environment (locale).
 Unfortunately, it is still usually reduced to
  “language translation”.
 Localized software must look and feel as if
  it had been made for the target
  language and culture.
Solving Culture-sensitive Issues
 At software production time: making the software
  language- and culture-neutral and suitable
  for localization (internationalization).
 After software production time: modifying
  the original code at localization time.
Aim of the Presentation
   To look at software elements that are based on
    culture and cultural conventions (a kind of
    “reflection” of cultural dimensions).
   To classify and discuss the software elements
    most important for successful cultural
    portability, based on an analysis of related
    normative documents and more than 10 years
    of experience in software localization.
   A topic of studies by G. Hofstede, F. Trompenaars, E.
    Hall. The cultural dimensions identified by Hofstede offer
    a way to structure culture according to the five
       Power Distance.
       Individualism vs. Collectivism.
       Masculinity vs. Femininity.
       Uncertainty Avoidance.
       Long-term vs. Short-term Orientation.
   These are categories that organize general cultural data.
    Speaking about software, we can look at software
    elements that are based on culture and cultural
Structure of Cultural Elements in
Possible Users of the Classification

   Researchers: to evaluate the level of
    internationalization of the original software,
    check the user-friendliness of localized software.
   Software developers: to develop better-
    internationalized software.
   Localizers: to adapt more cultural elements to
    the target culture and detect internationalization
Formal Definition of Cultural
   International standard on procedures for
    registration of cultural elements (ISO/IEC 15897)
    defines locale as
    “the definition of the subset of a user’s information
       technology environment that depends on language,
       territory, or other cultural customs”.
   A locale is usually identified by language, using a
    two-letter language code (ISO 639-1), and by
    territory, using a two-letter territory code (ISO
POSIX Locale Categories
Set of Formal Definitions of Cultural
Conventions (FDCC), ISO/IEC 14652

 format of postal addresses;
 information on measurement system;
 format of writing personal names;
 format for telephone numbers and other
  telephone information.
International standard on procedures for
registration of cultural elements (ISO/IEC 15897)
   Specifies the procedures to be followed in
    preparing, publishing and maintaining a register of
    cultural specifications for computer use.
   First six clauses coincide with POSIX locale
   Additional information: national or cultural
    Information Technology terminology; personal
    naming rules; inflection; hyphenation; spelling;
    numbering; coding of national entities;
    identification of persons and organizations;
    electronic mail addresses; keyboard layout; man-
    machine dialogue, etc.
Unicode CLDR
(more than 100 locales registered)
 Date and time formats;
 Number and currency formats;
 Measurement system;
 Collation specification (sorting, searching,
 Translated names for languages,
  territories, scripts, timezones, and
 Script and characters used by a language.
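
A minimal sketch of how locale data of the kind listed above (number and date formats) might drive formatting. The `LocaleConventions` class and the two-entry `CONVENTIONS` table are hypothetical and typed in by hand for illustration; they are not read from CLDR, and a real program would load such values from a locale database.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class LocaleConventions:
    """Hand-written stand-in for a real locale record (assumption)."""
    decimal_sep: str
    group_sep: str
    date_pattern: str  # strftime pattern


# Hypothetical table for illustration only, not actual CLDR data.
CONVENTIONS = {
    "en_US": LocaleConventions(".", ",", "%m/%d/%Y"),
    "lt_LT": LocaleConventions(",", " ", "%Y-%m-%d"),
}


def format_number(value: float, locale_id: str) -> str:
    conv = CONVENTIONS[locale_id]
    # Format with fixed separators first, then swap both in a single pass
    # so that one replacement cannot clobber the other.
    text = f"{value:,.2f}"
    swap = {",": conv.group_sep, ".": conv.decimal_sep}
    return "".join(swap.get(ch, ch) for ch in text)


def format_date(day: date, locale_id: str) -> str:
    return day.strftime(CONVENTIONS[locale_id].date_pattern)


if __name__ == "__main__":
    for loc in CONVENTIONS:
        print(loc, format_number(1234567.891, loc), format_date(date(2024, 1, 31), loc))
```

The point of the sketch is only that the program code stays the same while the locale record changes.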
Locale Implementation in Software
Locale Defined Elements
(red rectangles)
Language-driven elements:
Alphabets and Names
   Names (identifiers of various objects in Internet
    software, e.g. files, logins, passwords,
    domains...) are not only used by computers, but
    also by humans.
   Names in a native language and script are
    easier to:
     devise,
     memorize,
     guess,
     understand,
     manipulate,
     correct, etc.
Restricting names to letters of the English
alphabet (in outdated software):
   Forces users to avoid some or all letters of their
    native alphabet, while allowing foreign
     Most languages, even those using the Latin script, have
      some extra letters:
         e.g., å, š, ž, …

     Some English letters are not used in most
      languages that use the Latin script:
         usually q, w, and x.

     Makes it impossible to use characters of non-Latin
The main reasons why international
characters are not used in names today:

   External: some restrictions on the characters
    allowed in names still exist in today’s
   Internal: earlier experience of such restrictions
    discourages users from using national
    characters in names, even where such
    usage is technically possible.
Login Name
   Used in many web-based applications (virtual
    learning environments, e-mail clients, instant
    messengers, etc.).
   Characters:
     Usually only underscores, numbers, and letters from
      the basic Latin alphabet are accepted.
   Some systems use the login name not only for
    internal identification but also for addressing the
    user in the system.
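
The difference between the usual ASCII-only rule and a rule that accepts native letters can be sketched as follows. The function names are hypothetical, and the Unicode-aware rule shown here (Unicode letters, digits and underscore after NFC normalization) is only one possible policy, not a recommendation taken from the slides.

```python
import re
import unicodedata

# The restrictive rule described above: underscores, digits, basic Latin letters.
ASCII_LOGIN = re.compile(r"[A-Za-z0-9_]+")
# For str patterns, \w also matches letters and digits from other alphabets.
UNICODE_LOGIN = re.compile(r"\w+")


def is_valid_ascii_login(name: str) -> bool:
    return ASCII_LOGIN.fullmatch(name) is not None


def is_valid_unicode_login(name: str) -> bool:
    # Normalize to NFC so that "s" + combining caron is treated as "š".
    normalized = unicodedata.normalize("NFC", name)
    return UNICODE_LOGIN.fullmatch(normalized) is not None


if __name__ == "__main__":
    for candidate in ("jonas_1", "šarūnas", "jonas petraitis"):
        print(candidate, is_valid_ascii_login(candidate), is_valid_unicode_login(candidate))
```

With the ASCII rule a user named Šarūnas has to misspell his own name; with the Unicode-aware rule he does not.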
Personal Name
   Today, practically all software allows all letters of
    the alphabet in a person's first and last name (surname),
    so a user should not have to change or misspell his/her real
    name to register in the system.
   However, in telecommunications many users avoid using
    their native alphabet and write their names with spelling
       For example, the share of incorrectly written names of Skype
        users varies from 10% to 90% depending on the language.
       Such widespread “illiteracy” may be caused by previous experience
        with outdated software or by the influence of present restrictions on login
Password
   Used in software that performs user’s
    authorization (virtual learning environments, e-
    mail clients, instant messengers, etc.).
   Usually may be composed from letters and
   Many programs still restrict the set of letters to
    the ASCII alphabet.
   Restricting the character set available for
    passwords reduces their security.
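
The effect of the character set on password strength can be checked with simple arithmetic: a random password of length L over an alphabet of N symbols has L · log2(N) bits of entropy. The sketch below assumes randomly chosen lowercase letters only; the figure of 35 letters is an illustrative assumption (26 basic Latin letters plus the 9 additional Lithuanian letters ą, č, ę, ė, į, š, ų, ū, ž).

```python
import math


def password_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits of a uniformly random password."""
    return length * math.log2(alphabet_size)


if __name__ == "__main__":
    length = 8
    ascii_only = password_bits(26, length)       # basic Latin lowercase letters
    with_national = password_bits(35, length)    # plus 9 Lithuanian letters (assumption)
    print(f"ASCII letters only:    {ascii_only:.1f} bits")
    print(f"with national letters: {with_national:.1f} bits")
    # Every extra bit doubles the number of candidate passwords.
    print(f"search space grows by a factor of {2 ** (with_national - ascii_only):.1f}")
```

Under these assumptions an eight-letter password gains a little over three bits, that is, roughly a tenfold larger search space, simply because the native letters are allowed.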
Passwords (an example)

   A user usually assumes that “letters” in this
    context means the letters of his/her native language,
    not only those of the English alphabet.
File/Folder Names for:
   Storing documents on a local computer:
      No technical problems in today’s operating systems.
   Exchanging documents between computers by removable storage
      Works well as long as the same 8-bit encoding is used in both
   Sending documents as parts of e-mail messages or as their
    attachments, or directly by instant messengers:
      No technical problems. Before sending, names are encoded in UTF-8
        as ASCII-only escape sequences (of the form %FF); after receiving,
        they are decoded back.
   Storing web pages or other web content on a server:
      Solved in theory, by the same method as for sending by e-mail.
   Using inside applications:
      A duty of developer to provide user-friendly names for visible
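
The encoding of a file name into UTF-8 “%FF” sequences and back can be sketched with Python's standard `urllib.parse` module. The wrapper function names and the sample file name are hypothetical; real mail and web software may apply additional rules.

```python
from urllib.parse import quote, unquote


def encode_file_name(name: str) -> str:
    # Each non-ASCII character becomes the %XX form of its UTF-8 bytes.
    return quote(name, safe="")


def decode_file_name(encoded: str) -> str:
    return unquote(encoded)


if __name__ == "__main__":
    original = "ąžuolas.txt"  # sample Lithuanian file name (assumption)
    encoded = encode_file_name(original)
    print(encoded)                    # %C4%85%C5%BEuolas.txt
    print(decode_file_name(encoded))  # ąžuolas.txt
```

The encoded form contains ASCII characters only, so it survives transports that cannot carry national letters, and decoding restores the original name exactly.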
Domain names
   Until 2003: letters of the Basic Latin alphabet (26
    letters), digits, hyphen.
   2003: documents on using international
    characters in domain names were issued (RFC
    3490, RFC 3491, RFC 3492):
      International characters (represented in Unicode) are
       converted to an ASCII string (Punycode), which is
       converted back to Unicode characters before
       it is shown to the user:
    räksmörgå ↔

      Problem: homographs (different characters that
       look alike) can be used to imitate a name.
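
The Unicode-to-ASCII conversion of a domain label, and the homograph problem, can be sketched with Python's built-in `idna` codec, which implements the 2003 IDNA documents cited above. The helper names and the sample labels (`räksmörgås`, and `paypal` written with a Cyrillic “а”) are chosen here for illustration only.

```python
def to_ascii_label(label: str) -> str:
    """Unicode label -> ASCII form with the 'xn--' prefix (Punycode)."""
    return label.encode("idna").decode("ascii")


def to_unicode_label(ascii_label: str) -> str:
    """ASCII form -> Unicode label, as a browser would display it."""
    return ascii_label.encode("ascii").decode("idna")


if __name__ == "__main__":
    ascii_form = to_ascii_label("räksmörgås")
    print(ascii_form, "->", to_unicode_label(ascii_form))

    # Homographs: Latin "a" and Cyrillic "а" (U+0430) look the same on
    # screen but yield different ASCII names, i.e. different domains.
    latin = "paypal"
    mixed = "p\u0430ypal"
    print(to_ascii_label(latin), "vs", to_ascii_label(mixed))
```

Comparing the ASCII forms, rather than the displayed Unicode forms, is one way for software to notice that two look-alike names are in fact different.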
Domain names in browsers
Semantically-expressed elements:
Matching of plural and singular forms
 English – 2 forms:
1 object, 2 objects, 10 objects
 Lithuanian, Polish, Russian ... – 3 forms:

   Some European languages, e.g.
    Slovenian, Maltese – 4 forms
Plural and Singular Forms in Other Languages
 No. of   No. of      Languages (examples)
 forms    languages
 1        12          Georgian, Japanese, Korean, Vietnamese,
                      Turkish, etc.
 2        46          Dutch, English, German, Norwegian,
                      Swedish, Estonian, Finnish, Greek,
                      Hebrew, Italian, Portuguese, Spanish, etc.
 3        14          Slovak, Czech, Polish, Lithuanian,
                      Russian, Romanian, etc.
 4        2           Slovenian, Maltese
 5        0
 6        1
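
Choosing the right plural form requires a per-language rule, not just a singular/plural switch. The sketch below contrasts the two-form English rule with the three-form Lithuanian rule as it is commonly written in gettext catalogs; the function names and the word lists are illustrative assumptions.

```python
def english_plural_index(n: int) -> int:
    # Two forms: "1 object", otherwise "objects".
    return 0 if n == 1 else 1


def lithuanian_plural_index(n: int) -> int:
    # Three forms: 1, 21, 31... / 2-9, 22-29... / 0, 10-20, 30...
    if n % 10 == 1 and n % 100 != 11:
        return 0
    if n % 10 >= 2 and not (10 <= n % 100 < 20):
        return 1
    return 2


PLURAL_RULES = {"en": english_plural_index, "lt": lithuanian_plural_index}
FORMS = {
    "en": ("object", "objects"),
    "lt": ("objektas", "objektai", "objektų"),
}


def count_objects(n: int, lang: str) -> str:
    return f"{n} {FORMS[lang][PLURAL_RULES[lang](n)]}"


if __name__ == "__main__":
    for n in (1, 2, 10, 11, 21, 22):
        print(count_objects(n, "en"), "|", count_objects(n, "lt"))
```

Note that 21 takes the same form as 1 in Lithuanian, which a simple “n == 1” test written for English would get wrong.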
Grammatical Name Forms

 In inflective languages (Lithuanian,
  Finnish, Polish, etc.) names in dialog
  windows may appear in various cases.
 'Hello, Jonas' (in English) will be
  'Sveikas, Jonai' (in Lithuanian).
 “%S is logged in”, where %S is a user name:
 English:
     John is logged in.
     Mary is logged in.

   Lithuanian (and many other languages):
     John yra prisijungęs.
     Mary yra prisijungusi.
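
One way to cope with the example above is to keep a separate template per grammatical variant instead of a single “%S is logged in” string. The sketch below is a hypothetical design: the `TEMPLATES` table, the `"m"`/`"f"` keys and the function name are assumptions, and it presumes the application knows the user's grammatical gender.

```python
# One template per language and gender; English simply repeats the same text.
TEMPLATES = {
    "en": {"m": "{name} is logged in.", "f": "{name} is logged in."},
    "lt": {"m": "{name} yra prisijungęs.", "f": "{name} yra prisijungusi."},
}


def logged_in_message(lang: str, name: str, gender: str) -> str:
    """Pick the template matching the language and the user's gender."""
    return TEMPLATES[lang][gender].format(name=name)


if __name__ == "__main__":
    print(logged_in_message("en", "John", "m"))
    print(logged_in_message("lt", "John", "m"))
    print(logged_in_message("lt", "Mary", "f"))
```

This handles gender agreement only; putting the name itself into another case (as in 'Sveikas, Jonai') would need a component that generates grammatical forms.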
Human-sensitive Elements
   Usually not defined by national or international
    standards (normative documents).
   Depend on deep cultural habits and on the cultural
    conventions of a country or of its historical regions.
   They can also depend on the individual and
    should be adaptable to a person’s habits.
   They are difficult to express in a formal way (e.g.
    to include in a formal locale definition).
Some Examples
   Icons/Metaphors.
   Images, photos.
   Colour meaning.
   Usage of sounds and videos.
   Examples.
   Jokes and analogies.
   Political statements.
   Navigation scheme.
   Page layout.
   ...
Colour-Culture Chart
(Boor & Russo, 1993)
Color     China        Japan      Egypt        France        USA
Red       Happiness    Anger      Death        Aristocracy   Danger
                       Danger                                Stop
Blue      Heavens      Villainy   Virtue       Freedom       Masculine
          Clouds                  Faith        Peace
Green     Ming         Future     Fertility    Criminality   Safety
          Dynasty      Youth      Strength                   Go
          Heavens      Energy
Yellow    Birth        Grace      Happiness    Temporary     Cowardice
          Wealth       Nobility   Prosperity                 Temporary
White     Death        Death      Joy          Neutrality    Purity
Icons Example: Home Function

MS Internet
Mozilla Firefox
   The elements mentioned above are more difficult to
    implement in Internet software than in
    stand-alone software:
     they are deeply “grown” into the program,
     Internet software has many links with other software.
   Requirements:
     flexibly adaptable to software and other cultural
     flexibly fitting to each other;
     flexibly chosen by the user (multiple choices).
Existing Ways of Solution
   Cultural Web Spider: a tool designed to extract information on
    culture-specific web page design elements (cultural
    markers) from the HTML and CSS code of websites in a
    particular country domain; it could help to create a
    prototyping tool for the “look and feel” of cultural interface
    design (Kondratova I., Goldfarb I., Gervais R., Fournier L., 2005).
   Many researchers confirm the importance of the cultural
    dimensions identified by Hofstede. They are used to create
    recommendations for website navigation schemes and
    content presentation (Marcus A., Gould E.W., and others).
Existing Ways of Solution
   Recent research on incorporating cultural dimensions
    into global software includes attempts to create culturally
    adaptive software by applying AI mechanisms.
   It has also been proposed to incorporate culture into the
    user model in order to implement adaptable
    personalization mechanisms, assigning a Hofstede value
    to each cultural dimension according to the user’s
    birthplace, countries of current and former residence,
    languages, sex, age, political orientation and education
    level (Reinecke K. et al., 2007).
   Existing shortcomings in software internationalization can be
    explained by the lack of categories included in formal locale
    definitions and by the lack of compatibility between different locale models.
   Although the list of cultural elements developed here is limited, we hope
    it will draw more attention to the complex set of cultural
    elements when designing, localizing and testing Internet software that
    is localized or intended for localization.
   During Internet software development, special attention should be
    paid not only to the generalized set of elements defined in existing
    locale models, but also to:
     the ability to use international characters in object names
      (logins, files, domains, passwords);
     the ability to include a component that generates a language’s
      grammatical forms;
     reducing the use of parameters in localizable strings, because the
      rules for composing words and phrases differ between cultures.
   A direction for future work could be some formalization of the
    human-sensitive elements used in software.