The Kermit Project | Now hosted by Panix.com, New York City USA • kermit@kermitproject.org • …since 1981
Frank da Cruz
15 October 2013
Last update: Fri Feb 5 20:24:53 2016 Eastern USA time
Beginning in C-Kermit 9.0.304 Dev.06, C-Kermit has some internationalization features in its command language. These are based on the POSIX locale APIs and definitions. For starters, these features are available only in C-Kermit for Unix; let's see how they work out before trying to add them to C-Kermit for VMS.
    $ locale
    LANG=en_US.ISO8859-1
    LC_CTYPE="en_US.ISO8859-1"
    LC_NUMERIC="en_US.ISO8859-1"
    LC_TIME="en_US.ISO8859-1"
    LC_COLLATE="en_US.ISO8859-1"
    LC_MONETARY="en_US.ISO8859-1"
    LC_MESSAGES="en_US.ISO8859-1"
    LC_ALL=en_US.ISO8859-1
    $
The locale is selected in Unix by environment variables and in VMS by logical names, which are approximately the same thing: variables that can be queried (or set) by any program, or by the user at the Unix shell or VMS DCL prompt. The variables that control localization are:
Table 1. Locale Variables

    Variable      Description
    LC_COLLATE    Used in sorting and collating.
    LC_CTYPE      Controls character set handling.
    LC_MONETARY   Controls display and input of money amounts.
    LC_NUMERIC    Controls display and input of numbers.
    LC_TIME       Controls display and input of dates and times.
    LC_MESSAGES   Controls language and character set of messages.
    LANG          Specifies the language to use.
    LC_ALL        Used to set all the above at once.
These are set to values that specify language, country, and character set. The character set you choose should be the one that your terminal window or emulator uses (see this list for Kermit). Here are some examples:
Table 2. Locale Examples

    Locale             Description
    en_US.US-ASCII     English, United States, US ASCII character set
    es_DO.ISO8859-1    Spanish, Dominican Republic, ISO Latin Alphabet 1
    de_AT.UTF-8        German, Austria, Unicode UTF-8
    ru_SU.KOI8-R       Russian, Soviet Union, KOI8-R character set
    pt_BR.ISO8859-1    Portuguese, Brazil, ISO Latin Alphabet 1
    C                  The locale that disables all locale processing
    POSIX              Synonym for C
The format of these strings is:
Table 3. Locale String Elements

    Item                                    Defined in
    1. 2-letter language code (lowercase)   ISO 639-1
    2. Underscore character (_)
    3. 2-letter country code (uppercase)    ISO 3166-1
    4. Period character (.)
    5. Character encoding                   MIME character set names
Unfortunately there is no definitive list of character-set names, and in fact, no standard spelling for them. For example, the name used for ISO 8859-1 Latin Alphabet 1 might be ISO8859-1 or ISO88591 or ISO-8859-1. Furthermore, the character-set name (and the preceding period) can be omitted, in which case the computer's own default character set is used.
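The structure in Table 3 can be illustrated with a small shell function that splits a locale name into its three parts. This is only a sketch: parse_locale is a made-up name, it is not part of C-Kermit or of any standard utility, and it ignores extensions such as an @modifier suffix that some systems allow.

```shell
# parse_locale NAME: split language_COUNTRY.charset into its parts.
# Hypothetical helper for illustration only.
parse_locale() {
    name=$1
    case $name in
        C|POSIX)
            # These names are not built from language/country/charset.
            echo "language= country= charset="
            return 0 ;;
    esac
    charset=
    case $name in
        # Everything after the first period is the character set.
        *.*) charset=${name#*.}; name=${name%%.*} ;;
    esac
    country=
    case $name in
        # Everything after the first underscore is the country code.
        *_*) country=${name#*_}; name=${name%%_*} ;;
    esac
    echo "language=$name country=$country charset=$charset"
}

parse_locale de_AT.UTF-8    # prints language=de country=AT charset=UTF-8
parse_locale pt_BR          # prints language=pt country=BR charset=
```

The second call shows the case where the character-set name is omitted: the charset field comes back empty, and the system default applies.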
Different computers support different sets of locales. In Unix, “locale -a” lists all the locales supported on the computer where the command is given. Even when a locale is supported, the scope of the support can vary. For example, it might cover dates and times but not messages or numbers.
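Because the spelling of character-set names varies, a literal comparison against the output of “locale -a” can miss a locale that is in fact installed. The following sketch compares names after converting them to lowercase and removing hyphens. The function names normalize_locale and locale_in_list are invented for this illustration, and the assumption that case and hyphens are the only differences might not hold on every system.

```shell
# normalize_locale NAME: reduce a locale name to a comparable form
# (lowercase, hyphens removed), so that pt_BR.ISO8859-1, pt_BR.iso88591
# and pt_BR.ISO-8859-1 all compare equal. Hypothetical helper.
normalize_locale() {
    printf '%s\n' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/-//g'
}

# locale_in_list NAME: read locale names (one per line) on standard input
# and succeed if NAME is among them, ignoring case and hyphens.
locale_in_list() {
    wanted=$(normalize_locale "$1")
    while IFS= read -r candidate; do
        if [ "$(normalize_locale "$candidate")" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}
```

A typical use would be to feed it the system's own list:

    locale -a | locale_in_list pt_BR.ISO8859-1 && echo "supported"

Even a match only means that the locale exists; as noted above, it says nothing about which categories (dates, messages, numbers) it actually covers.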
Read more about locales here.
To establish an environment that you want to use all the time, put a command like this:
    export LC_ALL=pt_BR.ISO8859-1

in your shell profile, whose name depends on which shell it is: .profile, .bash_profile, .login, etc. Some shells might not support this construct, in which case you'll need to do:
    LC_ALL=pt_BR.ISO8859-1
    export LC_ALL

For the exact format and spelling of the value (the part on the right side of the equals (=) sign), consult the links in Table 3 and then type “locale -a” to see what's available on your computer. If you wish, you can also define the variables separately:
    export LANG=pt_BR.ISO8859-1
    export LC_TIME=en_US.ISO8859-1

If you would like to switch among different locales on the same computer, you can define aliases if your shell supports them, as bash does:
    alias es="export LC_ALL=es_ES.ISO8859-1"
    alias fr="export LC_ALL=fr_FR.ISO8859-1"
    alias de="export LC_ALL=de_DE.ISO8859-1"
    alias en="export LC_ALL=en_US.ISO8859-1"

(In bash, these would go in your .bashrc file; every shell has its own conventions.) Then just type "es" at the shell prompt to switch to Spanish, "fr" to switch to French, etc.
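When LC_ALL, individual LC_ variables, and LANG are all in play, it can be unclear which value applies to a given category. The sketch below assumes the usual POSIX precedence: LC_ALL wins if it is set, then the category's own variable, then LANG. The function name effective_locale is hypothetical, and the final fallback to "C" is an assumption, since the default when nothing is set is left to the implementation.

```shell
# effective_locale CATEGORY: print the locale name that applies to one
# category (for example LC_TIME). Hypothetical helper; CATEGORY must be
# a plain variable name because it is expanded with eval.
effective_locale() {
    category=$1
    eval "specific=\${$category:-}"
    if [ -n "${LC_ALL:-}" ]; then
        echo "$LC_ALL"        # LC_ALL overrides everything else
    elif [ -n "$specific" ]; then
        echo "$specific"      # then the category's own variable
    elif [ -n "${LANG:-}" ]; then
        echo "$LANG"          # then LANG as the general fallback
    else
        echo "C"              # assumed default when nothing is set
    fi
}

# Run the demonstration in a subshell so the caller's settings survive.
(
    unset LC_ALL LC_NUMERIC
    LANG=pt_BR.ISO8859-1
    LC_TIME=en_US.ISO8859-1
    effective_locale LC_TIME       # prints en_US.ISO8859-1
    effective_locale LC_NUMERIC    # prints pt_BR.ISO8859-1
)
```

Some shells, bash among them, may print a warning on standard error when a locale variable is assigned a locale that is not installed; the function's output is not affected. Note also that the aliases shown earlier set LC_ALL, so they override any separately defined LC_TIME or LANG.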
C-Kermit 9.0.304 has been changed to inherit the user's prevailing locale unless told not to by:
    kermit --nolocale
    K_NOLOCALE=1 kermit

If you want to use a particular locale even though Kermit has been started with kermit --nolocale or K_NOLOCALE=1, you can use the SET LOCALE command to choose the desired locale.
Once C-Kermit has started, you can use the SHOW LOCALE command to see what locale C-Kermit is using:
    C-Kermit> show locale
    Locale enabled:
     LC_COLLATE="C"
     LC_CTYPE="de_DE.ISO8859-1"
     LC_MONETARY="de_DE.ISO8859-1"
     LC_MESSAGES="es_ES.ISO8859-1"
     LC_NUMERIC="de_DE.ISO8859-1"
     LANG="de_DE.ISO8859-1"
    C-Kermit>

You can also change the locale in C-Kermit with the SET LOCALE command:
    C-Kermit> set locale fr_FR.ISO8859-1
    C-Kermit>

The SET LOCALE command sets LC_ALL (i.e. all the other LC_ items) to the given locale. Note that some or all of this operation can fail, for example if the underlying locale database does not have messages for the given language, even though it might have day and month names. Or it might not have them in the given character set. You can check the results of SET LOCALE by giving a SHOW LOCALE command afterwards.
SET LOCALE with no argument disables locale processing (it is the same as selecting the “C” locale).
There is no way in C-Kermit to set different locale items separately (e.g. German day names, Finnish month names, and French messages) but this could be added if there was any demand.
    C-Kermit>set locale
    C-Kermit>echo \fupper(Grüße aus Köln)
    GRüßE AUS KöLN
    C-Kermit> set locale de_DE.ISO8859-1
    C-Kermit>echo \fupper(Grüße aus Köln)
    GRÜßE AUS KÖLN
    C-Kermit>

When no locale is set, the umlauts (ü and ö) are not capitalized. With a German locale, they are capitalized properly.
For UTF-8 encodings, the most likely outcome is that \fupper() and friends work the same way they do when C-Kermit has disabled locales (GRüßE AUS KöLN).
Here are the first two locale-aware features in Kermit:
    dd monthname yyyy hh:mm:ss

where monthname is the full name of the month in the given locale (this is the layout that \fcvtdate() produces with format code 6 in the example that follows), for example:
    C-Kermit>set locale es_ES.ISO8859-1
    C-Kermit>for \%i 1 12 1 { echo \fcvtdate(2016-\%i-01,6) }
    1 enero 2016 00:00:00
    1 febrero 2016 00:00:00
    1 marzo 2016 00:00:00
    1 abril 2016 00:00:00
    1 mayo 2016 00:00:00
    1 junio 2016 00:00:00
    1 julio 2016 00:00:00
    1 agosto 2016 00:00:00
    1 septiembre 2016 00:00:00
    1 octubre 2016 00:00:00
    1 noviembre 2016 00:00:00
    1 diciembre 2016 00:00:00
    C-Kermit>for \%i 0 6 1 { echo \%i. \fcvtdate(20160801,\%i) }
    0. 20160801 00:00:00
    1. 2016-ago-01 00:00:00
    2. 01-ago-2016 00:00:00
    3. 20160801000000
    4. Mon Aug 1 00:00:00 2016
    5. 2016:08:01:00:00:00
    6. 1 agosto 2016 00:00:00
    C-Kermit>
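For comparison, roughly the same layout can be produced at the shell prompt with the standard date command, which draws its month names from the same locale data. This is a sketch, not a description of how C-Kermit does it internally; long_date is a hypothetical helper, and if the requested locale is not installed, date typically falls back to English names without complaint.

```shell
# long_date LOCALE: print the current date and time in the layout
# "dd monthname yyyy hh:mm:ss" using the month names of LOCALE.
# Hypothetical helper; LOCALE should be a name listed by "locale -a".
long_date() {
    # %d is zero-padded, so strip one leading zero to match the layout above.
    LC_ALL=$1 date '+%d %B %Y %H:%M:%S' | sed 's/^0//'
}

long_date C
```

Calling it as long_date es_ES.ISO8859-1 on a system that has that locale should give Spanish month names, provided the terminal uses the same character set.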
Here's a little script that illustrates locale switching. The “touch /x” command is intended to elicit an access-denied error message:
    define xx {                    # Macro to change locale and try some things.
        echo \%2
        .country := \fupper(\%1)
        if equ \%1 "en" .country = US
        set locale \%1_\m(country).ISO8859-1
        echo \v(year) \fmonthname(1,0) \fdayname(1,0)
        touch /x
        echo "-------------------"
    }
    xx fr French
    xx es Spanish
    xx de German
    xx it Italian
    xx pt Portuguese
    xx en English
    exit
And here are the results:
    French
    2013 janvier Lundi
    ?TOUCH /x: Autorisation refusée
    -------------------
    Spanish
    2013 enero lunes
    ?TOUCH /x: Permiso denegado
    -------------------
    German
    2013 Januar Montag
    ?TOUCH /x: Zugriff verboten
    -------------------
    Italian
    2013 Gennaio Lunedì
    ?TOUCH /x: Permission denied
    -------------------
    Portuguese
    2013 Janeiro Segunda Feira
    ?TOUCH /x: Permission denied
    -------------------
    English
    2013 January Monday
    ?TOUCH /x: Permission denied
This illustrates how locale support can vary. Notice that although the Italian and Portuguese locales know the day and month names, they do not have localized messages.