docassemble allows you to write a single interview that asks questions differently depending on the user’s language and locale. It also allows Unicode to be used in user-facing text, user input, and documents. With these features, docassemble should be fully usable in languages other than English.
Configuration
There are two different areas of docassemble where language translation can happen:
- In interviews, the active language can be set by the interview logic. An interview can operate in different languages depending on the user’s answer to a question. To support a multi-lingual interview, you can write separate YAML blocks containing the user-facing text for each language. Or, your YAML can reference a translation table in XLSX or XLIFF format that translates phrases from a base language into another language.
- In addition to the interview interface, docassemble has various administrative pages, such as the login screen, the user profile screen, and others. Each phrase that the docassemble system uses on these screens is translatable through a system phrase translation file, which can be in YAML, XLSX, or XLIFF format.
In order for your docassemble server to be able to support a language, you must edit the words directive in the Configuration to point to a system phrase translation file. Without this, your interviews will not be able to translate phrases like “Continue,” “Back,” or “Sign in or sign up to save answers,” and the system will fall back to using English.
When a user arrives at an administrative page, docassemble will try to set the active language to what the user expects.
- If the user is logged in and a language is configured in the user’s Profile, this language will be used. Note that by default, an end user cannot configure their own language; usually you would do this on their behalf.
- If the user has already visited the docassemble server in the same browser session, docassemble will use whatever language has been stored in the session. For example, if the user uses an interview that calls set_language(), that language will be stored in the session, and the user will see that language if they subsequently navigate to other pages of the server.
- If the user’s browser is configured to request a particular language, and that language is a language for which docassemble has translations through the words directive in the configuration, this language will be used.
- If none of the above methods can provide a language, the default language specified in the language directive of the configuration is used.
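The order of precedence in the list above can be sketched as a small Python function. This is only an illustration of the rules as described, not docassemble's actual implementation; the function name and its parameters are hypothetical.

```python
from typing import Iterable, Optional


def pick_admin_page_language(
    profile_language: Optional[str],
    session_language: Optional[str],
    browser_languages: Iterable[str],
    translated_languages: Iterable[str],
    default_language: str,
) -> str:
    """Return the first language selected by the rules described above.

    All names here are hypothetical; this is not docassemble's own code.
    """
    # 1. A language saved in the logged-in user's profile wins.
    if profile_language:
        return profile_language
    # 2. Otherwise use the language stored in the browser session, if any.
    if session_language:
        return session_language
    # 3. Otherwise use the first browser-requested language for which
    #    system phrase translations exist (via the words directive).
    available = set(translated_languages)
    for code in browser_languages:
        if code in available:
            return code
    # 4. Fall back to the language directive of the configuration.
    return default_language
```

For example, a visitor with no profile or session language whose browser requests German and then Spanish, on a server that only has Spanish and French system phrases, would be served Spanish.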
When a user arrives at an interview, however, docassemble does not follow the same rules because the interview might not support all languages, even if there is a system phrase translation file for the user’s language. Instead, the language defaults to the language set in the configuration, and the logic of your interview should call set_language() in an initial block if a different language is needed.
The value of language in the configuration must be a lowercase two-character ISO-639-1 code or three-character ISO-639-3 code. For example, English is en, Spanish is es, French is fr, and Arabic is ar.
Another configuration setting is locale, which primarily controls the default formatting of currency and numeric values. The value of locale must be a locale name without the language prefix, such as US.utf8 or DE.utf8. Any locale you use must be installed in the operating system of the server. See the other os locales configuration directive for information about how to install locales in the operating system of your server. By default, only the en_US.UTF-8 UTF-8 locale (the locale for the United States) is installed, so you will have problems if you try to use other locales in your interviews.
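To illustrate what “without the language prefix” means, the sketch below joins a language code and a locale value into a full operating-system locale name. It assumes the two values are combined with an underscore; the helper name is hypothetical and is not part of docassemble.

```python
def full_locale_name(language: str, locale_suffix: str) -> str:
    """Combine a language code and a prefix-less locale into one name.

    Hypothetical helper: assumes the full name is language + '_' + locale.
    """
    return f"{language}_{locale_suffix}"


# The language en with the locale US.utf8 gives the full name en_US.utf8.
print(full_locale_name("en", "US.utf8"))
```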
Within interviews, the functions set_language() and set_locale() will change the active language, locale, and dialect. (The dialect is relevant only for the text-to-speech feature, which is controlled by the special variable speak_text.)
If you write functions that need to know the current language or locale, use the get_language() and get_locale() functions from the docassemble.base.util module. There is also a get_dialect() function for retrieving the dialect.
The language and locale settings have the following effects:
- If you have a translations block in your interview, then whenever docassemble processes a phrase (text that you can mark up with Mako templating), it will use the appropriate translation of that phrase if a translation for the phrase is present in one of the Excel files referenced in the translations block. For more information about creating these Excel files, see the documentation for the Download an interview phrase translation file utility.
- Built-in words and phrases from the “core” docassemble code will be translated into the active language. Whenever docassemble prints such a word or phrase, it calls the word() function from the docassemble.base.util module. Calling word('Login') will look up the word Login in a translation table. If the word() function finds the word Login in the translation table for the active language, it will return the translated value. If it does not find a translation, it will return Login. For more information about how word() works, see functions. For information on how to define translations for a server, see the words directive in the configuration. For information on downloading a complete list of these phrases so that you can translate them into another language, see the documentation for the utility called Translate system phrases into another language.
- When docassemble looks for a question or code block that defines a variable, it first tries questions and code blocks for which the language modifier is set to the active language (either explicitly or by operation of the default language). If docassemble does not find any such questions or code blocks, it looks for ones that do not have the language modifier set. This means that if your interview only uses one language, you do not need to worry about setting the language modifier. If you are using the translations block to translate all of the phrases in your interview, you probably will not need to use the language modifier on question blocks, but you will need to use it on your sections block if you have one.
- Some functions have language-specific responses, such as today() in the docassemble.base.util module, which returns today’s date in a readable format such as “October 31, 2015” (for language en) or “31 octobre 2015” (for language fr). However, not all languages are supported by the babel.dates package; you may need to fall back to a different language. To configure how this works, set the babel dates map directive in the Configuration.
- When docassemble highlights terms in a question (see initial blocks), it will only highlight words specified in terms blocks for which the language of the term matches the language of the question. Or, if the question does not have the language modifier set, docassemble will look for a terms block that does not have language set.
- When docassemble displays interview help text, it will only display the content of interview help blocks for which the language modifier is the same as the language of the question. Or, if the question does not have the language modifier set, docassemble will look for an interview help block that does not have the language modifier set.
- If you have defined default text for various “screen parts” (such as pre, post, and submit) using the metadata block and you defined values for multiple languages, docassemble will use the value for the current language.
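The fallback behavior described for word() above (return the translation if one exists for the active language, otherwise return the original phrase) can be sketched in plain Python. The table contents and the function name below are made up for illustration; in an interview you would call word() from docassemble.base.util rather than writing your own lookup.

```python
from typing import Dict

# Hypothetical translation tables, keyed by language code.
TRANSLATIONS: Dict[str, Dict[str, str]] = {
    "es": {"Login": "Iniciar sesión", "Continue": "Continuar"},
}


def lookup_word(phrase: str, language: str) -> str:
    """Return the translation of phrase for language, or phrase itself.

    Illustrative stand-in for the lookup described above, not the real word().
    """
    return TRANSLATIONS.get(language, {}).get(phrase, phrase)


print(lookup_word("Login", "es"))  # translated, because an entry exists
print(lookup_word("Login", "fr"))  # no French table, so the phrase is returned as-is
```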
On pages other than interviews, the language that is used for translation purposes is determined as follows:
- If the user is logged in, the user has a “profile” that includes a field called language. If the language field in the user’s profile is defined, this language will be used for translations. When a user first registers, this language field is blank. You can use logic in an interview to set it to something. In interviews, you can read this profile by calling get_user_info() and change it by calling set_user_info().
- If the user is not logged in, or the user does not have a language defined in their profile, docassemble will look for a URL parameter lang. For example, if you have a web site and you want to put a link on that web site to direct the user to log in to your docassemble server, you could use a URL like https://docassemble.example.com/user/sign-in?lang=es and the user will see a login screen that is in Spanish. Other URLs with which you might want to use the lang parameter are /user/register and /list.
- If there is no lang parameter in the URL, docassemble will look for an Accept-Language header in the request from the user’s browser. This is typically set by the user’s web browser to whatever language is defined in the web browser settings. So if the user is a German speaker who is using a web browser that is set up to use the language de, then the language de will be used for translations. In interviews, you can read this language by calling language_from_browser().
- If a language cannot be found in the Accept-Language header, the language defined by the language directive in the Configuration is used.
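As a rough illustration of the four steps above, the following sketch parses an Accept-Language header and applies the same order of precedence. It is a simplified model under stated assumptions, not docassemble's code: the function names are hypothetical, regional subtags such as de-AT are reduced to de, and no check is made that translations exist for the chosen language.

```python
from typing import List, Optional


def languages_from_header(accept_language: str) -> List[str]:
    """Parse an Accept-Language header into language codes, best first."""
    weighted = []
    for position, item in enumerate(accept_language.split(",")):
        parts = item.strip().split(";")
        tag = parts[0].strip()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a tag with no q parameter has the highest weight
        for param in parts[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        # Keep only the primary subtag, so "de-AT" is treated as "de".
        weighted.append((-quality, position, tag.split("-")[0].lower()))
    # Sort by descending quality, then by original position in the header.
    return [code for _, _, code in sorted(weighted)]


def pick_page_language(
    profile_language: Optional[str],
    url_lang: Optional[str],
    accept_language: Optional[str],
    default_language: str,
) -> str:
    """Apply the four steps above in order (hypothetical helper)."""
    if profile_language:
        return profile_language
    if url_lang:
        return url_lang
    candidates = languages_from_header(accept_language or "")
    if candidates:
        return candidates[0]
    return default_language
```

Under this sketch, a request to /user/sign-in?lang=es from a browser set to German resolves to es for a user who is not logged in, because the URL parameter is consulted before the header.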
Best practices for single-language interviews
If your interview only works in one language, do not set the language modifier for any blocks, do not use default language, and do not call set_language() or set_locale(). Instead, simply make sure that the default language and locale in the configuration are set to the correct values.
Best practices for multi-language interviews
If you have an interview that needs to function in multiple languages, you will need to have initial code that calls set_language(). docassemble does not remember the active language from one screen to the next, but the initial code will make sure that it is always set to the correct value.
Note that in such an initial block, process_action() is called after set_language(). By default, actions are processed at the very beginning of your interview logic, before your YAML’s initial and mandatory blocks are processed. However, if you have an initial block that needs to define some “ground rules,” such as setting the operative language, you need to explicitly call process_action() in your initial block so that the “ground rules” are defined before actions are processed.
Note that when a user is logged in, they have a user profile, and a language field is part of their user profile. The value of this field will determine what language is used when a user logs in and visits a page like /interviews. When a user registers, this field is left unset, and users do not have the ability to change their language. You can set the language field in the user profile during an interview, using code like this:
If you want to avoid asking the user for their language in situations where the language field of the user profile has already been defined, you can use a code block to set user.language without asking a question:
You can also avoid asking the user for their language by assuming that they speak whatever language their web browser is set up to use, which you can determine by calling language_from_browser().
docassemble does not handle language selection automatically because there is no one-size-fits-all solution. For example, suppose a user’s primary language was French, but you had some interviews on your system that were only available in English and German. You would want to give the user a chance to select whether they saw the interview in English or German. Handling language selection in the interview logic allows interview developers to customize the way that the applicable language is determined.
The reason for using an initial block to set the language is that an interview session can have multiple users, who might speak different languages. For example, you might have a legal advice interview where the user may be Spanish-speaking but the advocate may be English-speaking. Here is an interview where the user object is different depending on whether the active user is the client or the advocate:
If you are writing an interview that offers multiple language options and you are using the language modifier or the default language, you may want to break out your interview into different files:
- code.yml - for language-independent initial blocks, interview logic, questions, and code blocks.
- en.yml - for English-language questions and terms, with default language: en as the first line.
- es.yml - for Spanish-language questions and terms, with default language: es as the first line.
- interview.yml - the main file, which simply includes the above three files.
Below is an example of a multi-language interview that asks the user for a language, then asks for a number, then makes a statement about the number. The interview is split into the four files listed above, and all files reside in a folder called bestnumber within the data/questions folder.
The contents of code.yml are:
The contents of en.yml are:
The contents of es.yml are:
Finally, the contents of interview.yml are:
Working with third-party translators
While it is generally a good thing that docassemble allows you to write complicated questions that make heavy use of Mako templating, Markdown, and embedded Python code, a downside is that all of this “code” can make the translation process more complicated. Translators may be confused by all of the code, even when you give them an interview phrase translation file in Excel format. They may ask you to convert your code to Microsoft Word, putting a great deal of conversion work on you. Or they may translate what you give them incorrectly, for example by translating variable names when they shouldn’t.
You may be tempted to change the way that you code interviews to optimize for the needs of the tech-phobic translators you hire. For example, if you have a single question that has ten different variations, you may decide to split this into ten separate questions so that the translator has an easier time translating. Or you might decide to remove Mako templating and substitute generic language that is easier to translate.
However, there is no reason that you should have to make your interview less functional or less maintainable just because the translators you tried to hire were confused by “code.” The solution is to find a different translator. Although hiring a translator other than the “lowest bidder” could increase the out-of-pocket costs of your project, you should think about the net cost of your project, including your own time, and think about costs and benefits in the long term.
While there are many translators who will be confused by having to “translate around code,” there are also many translators for whom “translating around code” is not a problem at all. Companies like Morningside Translations regularly handle technical translations and provide quality control to ensure that the translators do not disturb embedded Python. Shop around before concluding that you have to “dumb down” your interview to make it translatable.
Creating documents in languages other than English
If your interview uses the docx template file feature, you can prepare separate DOCX files for each language and then use code to select which file to use. For example:
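A minimal sketch of the file-selection logic, written as a standalone Python function so that it can run outside of an interview. The function name and the stem parameter are hypothetical; within an interview, the language argument would typically come from get_language().

```python
def template_filename(language: str, stem: str = "letter") -> str:
    """Build a per-language DOCX filename such as letter_en.docx.

    Hypothetical helper; in an interview, pass in the result of get_language().
    """
    return f"{stem}_{language}.docx"


print(template_filename("en"))
print(template_filename("es"))
```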
This will use the file letter_en.docx if the language is English, and letter_es.docx if the language is Spanish, etc. This is primarily useful if you are using the translations block for translating phrases and you are not using the language modifier on your questions.
The documents feature that allows documents to be created from Markdown text with Mako templating supports languages other than English to the extent that RTF, Pandoc, and LaTeX do. LaTeX has support for internationalization, and the default LaTeX template will load either the polyglossia package or the babel package, depending on what is available. The language used by LaTeX can be set using the metadata entries lang and mainlang in the attachment specification. For some languages, you may need to write your own templates in order to enable fonts that support your language.
Customizing based on language and locale
The language-specific functions, many of which are used internally by docassemble object methods, can all be overridden with your own versions. You can write special functions that should be used depending on which language is the active language (as set by set_language()). For example, there is an internal function your() that when called as your('apple') returns 'your apple'. The default language is English, and there are no definitions for any other languages, so if you want to use a language other than English, you will need to write alternatives. For more information about how to do this, see the sections on language-specific functions and simple language functions.
While many functions depend on the current language, there are a few that depend on the locale. One is nice_number() (one of the language-specific functions that can be overridden). When the given number is not among the numbers that should be converted to a word (see update_nice_numbers()), it is formatted according to the locale. For example, in locale en_US (English, United States), nice_number(6242235.4) becomes '6,242,235.4', but in locale es_ES (Spanish, Spain), it becomes '6.242.235,4'.
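The difference between the two locales comes down to which characters are used for grouping and for the decimal point. The sketch below reproduces the two outputs quoted above using a small hard-coded table; it does not use the operating system's locale data or docassemble's own formatting code, and the names in it are hypothetical.

```python
# Hypothetical table of (thousands separator, decimal separator) per locale.
SEPARATORS = {
    "en_US": (",", "."),
    "es_ES": (".", ","),
}


def format_for_locale(value: float, locale_name: str) -> str:
    """Format value with the separators assumed for locale_name."""
    group_sep, decimal_sep = SEPARATORS[locale_name]
    text = format(value, ",")  # US-style grouping, e.g. '6,242,235.4'
    # Swap separators through a placeholder so the two replacements
    # do not interfere with one another.
    return (
        text.replace(",", "\0")
        .replace(".", decimal_sep)
        .replace("\0", group_sep)
    )


print(format_for_locale(6242235.4, "en_US"))
print(format_for_locale(6242235.4, "es_ES"))
```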
Other locale-dependent functions are currency() and currency_symbol(). In locale en_US (English, United States), currency(101.34) returns '$101.34', but in locale es_ES (Spanish, Spain), it returns '101,34 EUR'. In locale en_US (English, United States), currency_symbol() returns '$', but in locale es_ES (Spanish, Spain), it returns 'EUR'.
The docassemble features that work with currencies are complicated
because the way that currency is represented depends on locale, and
you might want to support more than one locale, but Python’s locale
module assumes that the locale setting is server-wide.
The currency() and currency_symbol() functions are both language-specific functions, which means that you can substitute your own functions in their place in order to have full control over currency formatting. For example, if you include the following in a Python module, then whenever the active language is French, the currency symbol will be € by default and the currency() function will return the number followed by €, instead of the currency symbol followed by the number.
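A sketch of what such a module might look like is shown below. The names fr_currency and fr_currency_symbol are made up for this example, and the decimals and symbol keyword parameters are an assumption about what currency() accepts. Registration is assumed to go through update_language_function() from docassemble.base.util, called with a language, a function name, and a replacement function; the import is guarded so that the formatting logic can be tried outside a docassemble server.

```python
def fr_currency_symbol() -> str:
    """Default currency symbol when the active language is French."""
    return "€"


def fr_currency(value, decimals=True, symbol=None) -> str:
    """Return the number followed by the symbol, e.g. '101,34 €'.

    The decimals and symbol parameters are assumptions about the
    keyword arguments that callers may pass.
    """
    if symbol is None:
        symbol = fr_currency_symbol()
    if decimals:
        number = f"{float(value):,.2f}"
    else:
        number = f"{round(float(value)):,}"
    # French convention: space-grouped thousands and a decimal comma.
    number = number.replace(",", " ").replace(".", ",")
    return f"{number} {symbol}"


# Register the overrides only when running inside docassemble.
# The call signature (language, function name, function) is an assumption.
try:
    from docassemble.base.util import update_language_function
except ImportError:
    update_language_function = None

if update_language_function is not None:
    update_language_function("fr", "currency_symbol", fr_currency_symbol)
    update_language_function("fr", "currency", fr_currency)
```

With these definitions, fr_currency(101.34) yields the amount with a decimal comma followed by the euro sign, which matches the behavior described above.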
You can also override the default currency() and currency_symbol() functions using the catch-all language '*' instead of 'fr'. You could use set_locale() (without update_locale()) in an initial block and use get_locale() in your currency() and currency_symbol() functions to do different things depending on the locale.
If you have a specific currency symbol that you want to use on your server, you can set the currency symbol directive in the configuration. This will have a global effect. Suppose you set this in the configuration:
This is equivalent to doing: