Wintec, New Zealand
Hunt, T. (2013). Cost effective software internationalisation. Journal of Applied Computing and Information Technology, 17(1). Retrieved May 6, 2021 from http://www.citrenz.ac.nz/jacit/JACIT1701/2013Hunt_Internationalisation.html
This paper describes the design and implementation of a method for allowing the user interface of a software application to be translated by the end user into any other language. It is proposed that if used by the software industry this technique will increase the availability of software to minority groups. The research involved the modification of an existing email application for children (www.mifrenz.com) by providing a tool for parents to modify the pre-installed translations created using automated translation tools. Standard software internationalisation techniques available with modern programming languages were extensively used. This work resulted in a fully implemented product that has been sold in 11 countries and has confirmed usage in Dutch, French, German, Spanish, Norwegian, Russian, Swedish, and English. It is concluded that the advent of automated translation tools, combined with giving the end user the ability to modify the translation as described in this paper, means that it is now possible for all software to be delivered with any interface language at minimal cost.
Internationalisation, translation, email, children
The term 'Global village' (McLuhan, 1962) refers to how technology has brought people from around the world into contact with each other. However, this village speaks many languages and, unfortunately, not all languages are represented in the user interface of all software products. Efficiencies of scale often drive the languages in which a user interface is available, and this can lead to a much reduced choice for users of minority languages, especially if the type of product is itself a niche market (Simultrans, 2013).
The smaller the population that speaks a given language, the less likely it is that the interface of a particular software product will be available in that language. Also, the smaller the market for such a product, the less incentive there is for the software developer to make the effort to translate the interface to a particular language. This results in the limited availability of niche products in minority languages (Simultrans, 2013).
In a survey (McMurtrey, Downey, Zeltmann, & Friedman, 2008) of skills seen as critical for entry level IT professionals, Globalisation was ranked 7th, the lowest ranking. Another survey (Lethbridge, 2000) asked 'What knowledge is important to a software professional?' but did not mention the skill sets of Internationalisation and Localisation. This is despite the fact that current programming textbooks, for example Liang (2011), cover these subjects.
These skills include how to easily switch between different languages but also allowing for different date, time, number and currency formats. However, because the creation of each new interface language involves a significant cost, it is normal for a product to have a limited number of translations available (Simultrans, 2013). For those who do not use one of the commonly translated languages, this will obviously limit the choice of available software applications.
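As a minimal sketch of the locale-dependent formatting mentioned above, the following Java fragment formats the same number and currency amount for several locales using the standard `java.text.NumberFormat` class. The class name `LocaleFormatSketch` and the sample amount are illustrative assumptions and do not come from Mifrenz.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical demo class: shows how one value is rendered differently per locale.
final class LocaleFormatSketch {

    // Formats a number using the grouping and decimal separators of the given locale.
    static String formatNumber(double value, Locale locale) {
        return NumberFormat.getNumberInstance(locale).format(value);
    }

    // Formats an amount as currency using the conventions of the given locale.
    static String formatCurrency(double value, Locale locale) {
        return NumberFormat.getCurrencyInstance(locale).format(value);
    }

    public static void main(String[] args) {
        double amount = 1234567.891; // sample value, chosen only for illustration
        Locale[] locales = {Locale.US, Locale.GERMANY, Locale.FRANCE};
        for (Locale locale : locales) {
            System.out.println(locale.getDisplayName(Locale.ENGLISH) + ": "
                    + formatNumber(amount, locale) + " | "
                    + formatCurrency(amount, locale));
        }
    }
}
```

The exact currency output can vary between Java versions because the underlying locale data differs, so the sketch prints the results rather than assuming them.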
The main technique used by developers to internationalise software is to create language resource files which contain a list of the words that need to be displayed. A separate resource file is created for each language and packaged with the software: if a particular resource file has not been created, the user cannot select that language.
As stated earlier, a reason for not creating resource files is likely to be the cost. Crowdsourcing (Howe, 2006), a practice of getting a task performed by a group of people that offer their services, offers the potential to have foreign translations created at a lower cost.
This paper describes a user customisation method for getting translations created that takes its inspiration from Crowdsourcing. Briefly, an initial translation of the software's graphical user interface is created using an automated translation tool, and this far from perfect translation is made available for software application users to modify and share their translations with other users of the software. Critically the cost is zero.
This paper describes the design changes made to Mifrenz (Hunt, 2008), an email application for children, to allow new interface translations to be created by end users. Users who have the ability to create a translation of the user interface are by definition not the same users who need a translation to use the software. Potential translators will require some motivation for them to spend time creating the translation.
In the case of Mifrenz, it is expected that a parent or school teacher might create a translation for the children in their care and hopefully share the translation with others.
According to a 2003 study (U.S. Department of Commerce, 2003) of Internet use by US citizens, 60% of children aged 3 to 17 years used email or instant messaging. A later study by Zickuhr (2010) reported that 73% of child Internet users aged 12-17 used email. A study (Lee & Chae, 2007) of Korean children aged 10 to 12 in 2007 reported that the girls used email for an average of 3 minutes per day and boys 2 minutes.
Sim, MacFarlane, and Read (2006) noted that despite the large market for games in the UK, "... there is no clearly established methodology for evaluating software for children" (p. 236). They go on to note that software for children is often associated with education and fun. Robertson (1994) also comments on "The lack of attention to educational software..." and suggests the need for "... multidisciplinary design teams, each with individual needs, and including children on testing teams" (p. 257).
The software industry has been aware for many years of the Internationalisation and Localisation techniques highlighted, for example, in the book by Uren, Howard and Perinotti (1993). The subject has matured to the point that it is covered in teaching textbooks, for example in the popular Java textbook by Liang (2011). It has been noted (Karat & Karat, 1996) that the basic terms of internationalisation, nationalisation and customisation of user interfaces may need to be clarified, before summarising "that the internationalization issues that we should enable adaptation for include language, culture and standards" (p. 40).
The topic is still, however, the subject of current discussion in the literature. Khaddam and Vanderdonckt (2011) have recently reported on their 'Flippable User Interface for Internationalization' which they claim is a method for solving the problem of Arabic and Hebrew languages reading from right to left instead of the left to right of Western languages. However, it should be noted that this feature already exists in the Java programming language. Peng, Yang and Zhu (2009) address the need for reengineering existing code that has been written in ANSI format to the preferred Unicode for ease of internationalization. They claim to have successfully implemented a Chinese version of a 'large financial analysis system' by using 'lexical analysis to convert the code automatically'. The requirement to reengineer existing software is also addressed by Wang, Zhang, Xie, Mei and Sun (2009) who describe a technique for finding the "need-to-translate constant strings" (p. 358). They state that the current methods used in tools such as the Eclipse Integrated Development Environment (IDE) risk the introduction of bugs as around 50% of the strings they find do not need to be translated and in some cases these strings are used in SQL statements causing runtime errors.
Rößling (2006) has designed a Java Package that recreates some of the most commonly used Java Swing (graphical user interface) components to deal with internationalisation issues such as different text but also formatting of numbers and different icons. Although this addresses the idea of changing the icon, something that is not often discussed, it does not offer a full solution to the internationalisation problem, particularly as only some of the Java Swing components have been internationalized.
With the advent of Web Services, services such as that offered by Google Translate (Google, 2012) provide an accessible method (for those with internet access) of automated translation. Strings can be translated before the software is shipped or even just when a string is required. However, the quality of such translation is still not as good as that which can be performed by a human translator (Stefansson, 2012).
The low availability of multi-lingual products in niche markets is highlighted by a recent survey of children's email applications (Hunt, 2011). Hunt reports that of the eighteen email applications aimed at children, four were just adult products with parental controls added on (Table 1, pp. 34-5). Of those that had an interface specifically designed for children only one seemed (at the time of writing) to offer multi-lingual support with French, German, Japanese, Spanish, Dutch, Swedish, Portuguese and Italian languages offered (Table 1, pp. 34-35).
Mifrenz (Hunt, 2008) is an email application for children that can be described as Microsoft Outlook for kids. Mifrenz was written in the Java programming language and developed using the Netbeans IDE (NetBeans Community, 2013). The 'write once, run anywhere' Java catch phrase is implemented with both Mac OSX and MS Windows installs available. Mifrenz (Hunt, 2008) connects to an email server for sending and retrieving emails using the SMTP and IMAP protocols respectively. Parents can set up a list of contacts that their children are allowed to send emails to and receive emails from.
This work involves the refinement of an existing software application with an Object Oriented design. The work builds on the successful use of the Evolutionary Prototype methodology (Davis, 1992) where each new version is a fully working implementation. This methodology particularly suits the author as it "acknowledges that we do not understand all the requirements and builds only those that are well understood" (Davis, 1992, p. 73). In addition, the traditional roles involved in software development were all filled by the same person except for the role of child end-user.
As seen above, it is possible to use a language translation server to convert strings automatically and so create a translated interface. This can be done either by a just-in-time approach using an API (Google, 2012) or in advance of the software being shipped, using either an API or a more manual approach through a webpage, for example Google Translate (Google, 2013). This author does not consider that automated translation alone is a valid method for providing a translated user interface. Therefore, in this work, the idea of providing the ability for the end user to change the automated translation, or indeed, create a new translation, was investigated.
The Java programming language provides the concepts of Locales and Resource Bundles and these were used extensively. A Locale object is used to represent a particular language and country while the resource bundle contains the translated words or phrases. When the code needs to display a text string to the user, the code finds the appropriate string from the resource file (via an instance of the PropertyResourceBundle class) by supplying a key. For example, if the code needs to display the English phrase 'Add Child', instead of the string 'Add Child' being directly displayed, the key 'ADD_CHILD' is used to retrieve the string 'Add Child' from the English language resource file. Figure 1 shows a small section of the English resource file.
When a phrase from a different language is required, the same key is used but the different translation is retrieved as shown in Figure 2 that uses the resource file for the Urdu translation.
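The key-based lookup described above can be sketched as follows. To keep the example self-contained, the two bundles are built in memory with `PropertyResourceBundle` from short strings. A real application would more likely load `.properties` files with `ResourceBundle.getBundle`. The class name `KeyLookupSketch` and the German string 'Kind hinzufügen' are illustrative assumptions (German is used here only as a stand-in for a second language) and are not taken from the Mifrenz resource files.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.PropertyResourceBundle;
import java.util.ResourceBundle;

// Hypothetical demo class: one key, two languages, same lookup code.
final class KeyLookupSketch {

    private static final Map<Locale, ResourceBundle> BUNDLES = new HashMap<>();

    static {
        try {
            // Each string stands in for the content of one language resource file.
            BUNDLES.put(Locale.ENGLISH,
                    new PropertyResourceBundle(new StringReader("ADD_CHILD=Add Child\n")));
            // \u00fc is the properties-file escape for the letter u with umlaut.
            BUNDLES.put(Locale.GERMAN,
                    new PropertyResourceBundle(new StringReader("ADD_CHILD=Kind hinzuf\\u00fcgen\n")));
        } catch (IOException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Returns the display text for a key in the requested language.
    static String text(Locale locale, String key) {
        return BUNDLES.get(locale).getString(key);
    }

    public static void main(String[] args) {
        // The calling code never contains the display string itself, only the key.
        System.out.println(text(Locale.ENGLISH, "ADD_CHILD"));
        System.out.println(text(Locale.GERMAN, "ADD_CHILD"));
    }
}
```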
The availability of automated translation services gives the opportunity for a starting point for translation. These services were used to create very basic translations, as for the Urdu language shown. It is acknowledged that these translations are far from perfect but they make a good starting point for a human to work from.
The Java resource bundle architecture provides a very useful 'fall back' facility for dealing with missing translations. When code attempts to find the string from a resource file for a key that does not exist, it will attempt to obtain the string from another resource file. The resource file chosen will be the one that is of the same base language but with no country specified. If it still fails to find the matching key it will then revert to the default language file where the key should exist, but the string returned will be in English rather than the required language. Having said that, it is possible that the key/string pair was never added to even the default language file and this will cause a runtime error. To prevent any such runtime errors causing problems, all such lookups need to be wrapped with exception handling. In the case of Mifrenz (Hunt, 2008), once these exceptions are caught, a generic string is returned to the calling code.
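The fall-back behaviour and the exception handling can be illustrated with a short sketch. It writes three small resource files to a temporary directory, loads the bundle for German as spoken in Austria, and wraps each lookup so that a missing key yields a generic string. The base name `Messages`, the placeholder `???`, the class name and the sample strings are all assumptions made for illustration. The paper does not state what generic string Mifrenz returns.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Locale;
import java.util.MissingResourceException;
import java.util.ResourceBundle;

// Hypothetical demo class for resource bundle fall-back.
final class FallbackSketch {

    // Assumed placeholder returned when no resource file contains the key.
    static final String GENERIC_TEXT = "???";

    // Writes a default file, a German file and an (empty) Austrian German file.
    static Path writeDemoFiles() {
        try {
            Path dir = Files.createTempDirectory("bundles");
            write(dir, "Messages.properties", "ADD_CHILD=Add Child\nSAVE=Save\n");
            write(dir, "Messages_de.properties", "ADD_CHILD=Kind hinzuf\\u00fcgen\n");
            write(dir, "Messages_de_AT.properties", "# no Austria-specific strings yet\n");
            return dir;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    private static void write(Path dir, String name, String content) throws IOException {
        Files.write(dir.resolve(name), content.getBytes(StandardCharsets.ISO_8859_1));
    }

    // Loads the bundle for a locale from the given directory.
    static ResourceBundle load(Path dir, Locale locale) {
        try {
            ClassLoader loader = new URLClassLoader(new URL[] {dir.toUri().toURL()});
            return ResourceBundle.getBundle("Messages", locale, loader);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Wraps the lookup so a missing key never surfaces as a runtime error.
    static String lookup(ResourceBundle bundle, String key) {
        try {
            return bundle.getString(key);
        } catch (MissingResourceException e) {
            return GENERIC_TEXT;
        }
    }

    public static void main(String[] args) {
        ResourceBundle bundle = load(writeDemoFiles(), Locale.forLanguageTag("de-AT"));
        System.out.println(lookup(bundle, "ADD_CHILD"));   // found in the German file
        System.out.println(lookup(bundle, "SAVE"));        // falls back to the default file
        System.out.println(lookup(bundle, "NO_SUCH_KEY")); // not in any file: generic string
    }
}
```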
Figure 3 shows an annotated screen shot of the Graphical User Interface (GUI) that a parent or administrator can use to create a new language resource file.
Referring to the labels in Figure 3, label 1 shows a Combo box that allows the user to choose the 'Original' language that is displayed in column A of the table. The concept of original language is a list of language translations that have been approved by the software vendor, as opposed to translations that have been created by the end user or some other third party. The original language therefore provides a reference back to the correct meaning of the word or phrase that should be displayed. For example, the translation for the phrase 'Add Child' will be displayed where the developer originally expected 'Add Child' to be displayed, for example, on a button to add a Child user. Once the user has selected the original language he or she can then select the language on which to base the new translation. In Figure 3, the user has chosen the Estonian language (label 2). The current Estonian language file is now used to display the text in column B. Assuming that this translation was the original translation supplied with the software, it is likely to be of dubious quality as it was created by an automatic translation tool. However, it will be a good starting point and is also copied to column C to prevent the user having to retype the majority of the text. Labels 3 and 4 show the Combo boxes that allow the user to select the language and country of the new translation that they are creating. In this case, they have chosen to create a translation for the Estonian language as spoken in Estonia. Pressing the Save button will cause a new language translation file to be created for Estonian/Estonia.
After the file has been saved, it is available for sharing with other users. Label 5 shows the Combo box (after it has been refreshed) from which the new translation can be selected. Selecting the Share button will send a copy of the file to the company's server where it can be checked and made available to other users. Label 6 shows the Combo box which lists translation files on the company's web server that are available for importing; in this case a file for the Afrikaans language is displayed.
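The Save step can be sketched as writing the edited key/string pairs to a properties file whose name encodes the chosen language and country. This is only an assumption about how such a step might look. The base name `Messages`, the class and method names, and the Estonian sample string 'Lisa laps' are hypothetical and are not taken from the Mifrenz source code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Properties;

// Hypothetical demo class: saves user-edited translations as a resource file.
final class TranslationFileSketch {

    // Builds a file name such as Messages_et_EE.properties from the locale.
    static String fileName(String baseName, Locale locale) {
        StringBuilder name = new StringBuilder(baseName);
        if (!locale.getLanguage().isEmpty()) {
            name.append('_').append(locale.getLanguage());
        }
        if (!locale.getCountry().isEmpty()) {
            name.append('_').append(locale.getCountry());
        }
        return name.append(".properties").toString();
    }

    // Writes the edited strings (column C in the description above) to disk.
    static Path save(Path dir, String baseName, Locale locale, Map<String, String> edited) {
        Properties props = new Properties();
        props.putAll(edited);
        Path file = dir.resolve(fileName(baseName, locale));
        try (OutputStream out = Files.newOutputStream(file)) {
            // Properties.store escapes non-Latin-1 characters as \\uXXXX sequences.
            props.store(out, "User translation for " + locale.toLanguageTag());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }

    // Reads a saved translation file back, e.g. to refresh the list of languages.
    static Properties load(Path file) {
        Properties props = new Properties();
        try (InputStream in = Files.newInputStream(file)) {
            props.load(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        Map<String, String> edited = new LinkedHashMap<>();
        edited.put("ADD_CHILD", "Lisa laps"); // illustrative Estonian text
        Path dir = Paths.get(System.getProperty("java.io.tmpdir"));
        Path file = save(dir, "Messages", Locale.forLanguageTag("et-EE"), edited);
        System.out.println(file.getFileName() + " -> " + load(file).getProperty("ADD_CHILD"));
    }
}
```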
Applications that allow users to type a significant amount of information often contain a spell checker. Spell checkers rely on a dictionary file of the correct language being used. The popular word processor programme Microsoft Word (Microsoft Corporation, 2013a) provides for the ability to install additional dictionaries (Microsoft Corporation, 2013b) as the user requires: presumably to avoid an otherwise wasteful use of computer resources installing every available dictionary. Mifrenz (Hunt, 2008) provides a similar feature where the user is provided with the ability (label 7) to download from the company website any new dictionary files that become available.
As mentioned in the literature review, creating a multi-lingual application requires more thought than just providing multiple language files.
The word for the same meaning is likely to be a different length in different languages. For example, this can be seen when the English phrase 'Create a new email' is translated into German (by an automated tool) to give the phrase 'Erstellen Sie eine neue E-Mail'. The difference in string length means that the size of the component that is used to display the phrase to the user may either be too big or too small, resulting in a non-optimum display layout. It is therefore advisable to use a layout tool such as the Java layout manager called Free Design that dynamically changes the size of components at runtime.
Some languages display text from right to left across the page rather than from left to right. Through the use of selecting the appropriate Locale, Java automatically displays the text in the correct manner and even automatically reverses the direction of component layout.
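A small sketch of how the reading direction for a Locale can be queried in Java is given below, using the standard `java.awt.ComponentOrientation` class. The class name `OrientationSketch` and the choice of sample languages are illustrative assumptions.

```java
import java.awt.ComponentOrientation;
import java.util.Locale;

// Hypothetical demo class: asks Java which direction a language is written in.
final class OrientationSketch {

    // Returns true when text for the locale runs from right to left.
    static boolean isRightToLeft(Locale locale) {
        return !ComponentOrientation.getOrientation(locale).isLeftToRight();
    }

    public static void main(String[] args) {
        String[] tags = {"en", "ar", "ur"}; // English, Arabic, Urdu
        for (String tag : tags) {
            Locale locale = Locale.forLanguageTag(tag);
            System.out.println(locale.getDisplayLanguage(Locale.ENGLISH)
                    + " right-to-left: " + isRightToLeft(locale));
        }
    }
}
```

In a Swing application, one common way to apply the result is to pass the orientation to `applyComponentOrientation` on the top-level container, so that layout managers that honour orientation mirror their components.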
Sound files containing spoken words are another source of translation difficulty. Rather than try to allow the user to record new sound files, it may be easier to just do without such files altogether. An example of how this can be done is the replacement of a file that says 'You have new mail', with a file that plays the sound of someone knocking at a door.
Design of the child interface focused on the idea of what can be left out, compared with email applications for adults. As well as attempting to make the application less complicated, it also enabled the main graphical components to be much larger than could otherwise be achieved.
The techniques described have been successfully implemented into the Mifrenz (Hunt, 2008) email application available for download from http://www.mifrenz.com. Figure 4 shows the main child interface using a Simplified Chinese translation. The initial automated translation has been checked and modified by a native Chinese writer.
As agreed with the users, during the trial period, data such as user country and language is collected. Although the sample is small, of the 63 purchases made at the time of writing, 46 of the users were known to use the software with the English interface and 6 users used a non-English interface (1 of each of the following: Dutch, French, German, Russian, Spanish and Swedish). The language being used by 11 of the users could not be determined. A further 1 user used the Norwegian interface but did not go on to purchase the software.
Figure 5 shows the country location of users who have purchased the software. As can be seen the large majority (88%, excluding Canada and unknown countries) of the users are from predominantly English speaking countries. These statistics highlight the commercial pressure for software developers to concentrate their efforts on producing software with an English interface and possibly neglect to develop user interfaces for other languages.
Of course these sales may reflect the fact that the Mifrenz website is only available in English and Chinese (not that there are any sales from China) and so has not reached the attention of many non-English speakers. Website analytics shows this is indeed the case with 95% of all visits in 2012 being from users with their internet browser software set to the English language. At this stage it is unknown if any of the users have modified the interface language.
Users of Mifrenz (Hunt, 2008) are given the option of using the software without purchasing it. In return they agree to allow the collection of data concerning which features of the software they use. This data will be analysed to determine which of the implemented features are actually used, in effect giving usability study feedback.
Although the author believes that all of the string constants in the Mifrenz (Hunt, 2008) code have been translated, there is still the possibility that some have not been discovered and so could appear to the user in English rather than the chosen language. This is an on-going issue with any software project if the code is still being actively developed and maintained. In the case of Mifrenz (Hunt, 2008), increasingly this work is being performed as part of student research. It would therefore be very useful to employ an automatic technique to find and list any string constants that should be but have not yet been translated. The author made an initial evaluation of the Netbeans Internationalization (NetBeans Community, 2013) wizard but found that it introduced more errors than it solved and so this work is unlikely to be straightforward.
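To make the idea of automatically listing untranslated string constants concrete, a deliberately naive sketch is given below. It scans a single line of Java source with regular expressions, ignores literals that are already passed to `getString` as resource keys, and reports the remaining literals that contain at least one letter. This is not the technique of Wang et al. (2009) or the NetBeans wizard. The class name, the regular expressions and the filtering rule are assumptions, and the output would still need human review because many literals (for example SQL statements) must not be translated.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: lists string literals that may still need translating.
final class LiteralScanSketch {

    // Matches a double-quoted Java string literal, allowing escaped characters.
    private static final Pattern LITERAL =
            Pattern.compile("\"((?:[^\"\\\\]|\\\\.)*)\"");

    // Matches literals already used as resource keys, e.g. getString("ADD_CHILD").
    private static final Pattern KEY_LOOKUP =
            Pattern.compile("getString\\(\\s*\"(?:[^\"\\\\]|\\\\.)*\"\\s*\\)");

    // Returns the literals on one source line that are candidates for translation.
    static List<String> candidates(String sourceLine) {
        // Remove key lookups first so their keys are not reported.
        String remaining = KEY_LOOKUP.matcher(sourceLine).replaceAll("");
        List<String> found = new ArrayList<>();
        Matcher matcher = LITERAL.matcher(remaining);
        while (matcher.find()) {
            String text = matcher.group(1);
            // Skip literals without letters, such as "" or "123".
            if (text.chars().anyMatch(Character::isLetter)) {
                found.add(text);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        System.out.println(candidates("button.setText(\"Add Child\");"));
        System.out.println(candidates("button.setText(bundle.getString(\"ADD_CHILD\"));"));
    }
}
```

The sketch does not understand comments, text spanning several lines, or the purpose of a literal, which is one reason a production tool for this task is unlikely to be straightforward.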
Full Internationalization of a software product should look beyond just string translation. Although the current version of Mifrenz (Hunt, 2008) allows users to choose different 'skins' (colour schemes), there is probably benefit in allowing users to have finer control over the colour of screen objects to allow for a range of cultural preferences. In addition, although the images used in Information Technology are relatively culturally neutral, it may be that providing a selection of icons, or allowing users to use their own icons, for the various GUI components would also meet a cultural preference.
Determining how many users actually use the ability to customise the translated interface would shed insight into whether this feature is useful, or just something that sounds like a good idea but is not actually of real benefit. If it turns out that the feature is not used, it may be that automatic translation has reached a state that is suitable for this type of software application.
The cost of developing multi-lingual interfaces is likely to deter many software developers and companies from committing resources to language translation when the majority of software users are from English speaking countries. This presents a major obstacle to the accessibility of software, particularly for users from minority interest groups who also speak a language that is not catered for. Although automated translation services are now available and improving, they are still not sufficiently robust to provide a quality solution. This work has presented a cost effective (interested users make the changes without receiving payment) solution to the problem that utilizes user customisation for the development of any interface for a software product. It is hoped that this ultimately increases the availability of software to minority groups of users. The methods and techniques presented here have been demonstrated to work in a commercial product. No esoteric techniques have been used in this work and so the results have a wide ranging application within the software development industry where users are motivated to improve the availability of a software product in a given language.
The author would like to thank Wintec and Mifrenz Ltd for permission to publish this work.
Davis, A. M. (1992). Operational Prototyping: A new Development Approach.
Google. (2012, 04 20).
Google. (2013, 02 01).
Howe, J. (2006, 6).
Hunt, T. D. (2008). Mifrenz: Safe email for Children.
Hunt, T. D. (2011). Any language you choose: internationalization of a children's email application.
Karat, J., & Karat, C.-M. (1996). Perspectives on design and internationalization.
Khaddam, I., & Vanderdonckt, J. (2011). Flippable user interfaces for internationalization.
Lee, S.-J., & Chae, Y.-G. (2007). Children's Internet Use in a Family Context: Influence on Family Relationships and Parental Mediation.
Lethbridge, T. C. (2000). What knowledge is important to a software professional.
Liang, Y. D. (2011).
McLuhan, M. (1962).
McMurtrey, M. E., Downey, J. P., Zeltmann, S. M., & Friedman, W. H. (2008). Critical Skill Sets of Entry-Level IT Professionals: An Empirical Examination of Perceptions from Field Personnel.
Microsoft Corporation. (2013a, 02 01).
Microsoft Corporation. (2013b, 02 01).
NetBeans Community. (2013, 1 31).
Peng, W., Yang, X., & Zhu, F. (2009). Automation technique of software internationalization and localization based on lexical analysis.
Robertson, J. W. (1994). Usability and children's software: A user-centered design methodology.
Rößling, G. (2006). Translator: A Package for Internationalization for Java-based Applications and GUIs.
Sim, G., MacFarlane, S., & Read, J. (2006). All work and no play: Measuring fun, usability, and learning in software for children.
Stefansson, I. (2012, 12 1).
U.S. Department of Commerce, Economics and Statistics Administration, U.S. CENSUS BUREAU. (2003).
Uren, E., Howard, R., & Perinotti, T. (1993).
Wang, X., Zhang, L., Xie, T., Mei, H., & Sun, J. (2009). Locating Need-to-Translate Constant Strings for Software Internationalization.
Zickuhr, K. (2010, 12 16).