Subject Re: Programmatically Changing Language Driver
From Gerald Lightsey <glightsey1@cox.net>
Date Wed, 18 Oct 2006 23:54:42 -0700
Newsgroups dbase.native-tables

On Wed, 18 Oct 2006 16:05:32 -0700, in the dbase.native-tables group,
Ken Mayer [dBVIPS] said...
> If this works as well as Glede says, this would be quite useful for folk
> to work with as a dUFLP tool, hint, hint ...

If you do include it in the dUFLP, you may want to include the caveats
offered by Rick Fillman when I posted a form I was using to select the
tables and the Language Driver to accomplish this same task a few years
ago while Rick was still with dBASE Inc.  I lost my saved newsgroup
records from back then when dBI elected to stop|start|reformat these
newsgroups so I do not have the original post.

I still have my forms from then and ldWarn.txt that I included as a
result of Rick's input.  ldWarn.txt displays below.

Gerald

Changing dBASE table language drivers entails possible risks of data
corruption, particularly if the table is further used in an environment
where the table language driver is different from the global language
driver.  The user of this form must accept responsibility for its use.

Note from the author:

This form will change the Language Driver ID byte in the header of the
.DBF table you select to the Language Driver ID you desire.  By default
it will also remove the byte in the header that indicates to dBASE a
table has a Production Index attached.  If a Production Index does
indeed exist the program writes out a .TXT file with the name of the
table and the file extension .RND.  This file can be used to completely
reindex the table once the Language Driver conversion is completed.
Simply issue the command
DO MYFILE.RND in the Command window to accomplish the task.
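
As a rough illustration of what such a header patch involves, here is a
minimal Python sketch.  It is not the form's actual code (the form is
written in dBASE).  It assumes the commonly documented .DBF header
layout, in which offset 28 holds the production index flag and offset 29
holds the Language Driver ID.  The function name and the LDID value
passed in are hypothetical; try it only on a copy of a table.

```python
# Sketch only: offsets are assumptions based on the commonly documented
# .DBF header layout, not taken from the original form.
MDX_FLAG_OFFSET = 28   # assumed: production index (.MDX) flag byte
LDID_OFFSET = 29       # assumed: Language Driver ID byte


def patch_dbf_header(path, new_ldid, clear_mdx_flag=True):
    """Overwrite the LDID byte and optionally clear the production index flag.

    Returns a tuple (old_ldid, had_production_index).
    """
    if not 0 <= new_ldid <= 255:
        raise ValueError("LDID must fit in one byte")
    with open(path, "r+b") as f:
        header = f.read(32)
        if len(header) < 32:
            raise ValueError("file is too short to be a .DBF table")
        old_mdx = header[MDX_FLAG_OFFSET]
        old_ldid = header[LDID_OFFSET]
        # Write the new Language Driver ID in place.
        f.seek(LDID_OFFSET)
        f.write(bytes([new_ldid]))
        if clear_mdx_flag:
            # Tell dBASE that no production index is attached.
            f.seek(MDX_FLAG_OFFSET)
            f.write(b"\x00")
    return old_ldid, old_mdx != 0
```

The return value reports the previous LDID and whether a production
index was flagged, which is the point at which a reindex script would
need to be written out.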

The subject of Language Drivers in dBASE is very complex.  The obvious
need for Language Drivers arises out of the many languages in use around
the world.  As explained below by excerpts from dBASE Help files,
further considerations also arise because Microsoft Windows handles
languages differently than DOS.  This form was developed as an answer to my
own needs to convert some of my tables originally created in Windows
with dBASE Windows compatible language drivers to a language driver
compatible with dBASE DOS.  I wanted to use the tables and build
compatible .MDX files that could be used interchangeably by current
dBASE versions using DOS and Windows.

My own needs required that I be able to import data into dBASE tables
using the command APPEND FROM MYFILE.TXT TYPE DELIMITED and that the
underlying binary code be EXACTLY THE SAME as the binary character code
in the source text file.  Also I needed to be sure that when I
subsequently exported data using the command COPY TO MYFILE2.TXT TYPE
DELIMITED the binary character code would be identical to the table data
being copied and also TRUE TO THE ORIGINAL SOURCE, MYFILE.TXT.  There
are two ways to accomplish this.  One is to make sure that the TABLE
LANGUAGE DRIVER matches the GLOBAL LANGUAGE DRIVER.  The second way to
accomplish it, in my experience, is to SET LDCONVERT OFF.  SET LDCONVERT
OFF, however, affects only CERTAIN operations when table and global
language drivers are mismatched.  Therefore, continuing to work with
mismatched table and global language drivers entails considerable risk
of data corruption.  Since the user's continued use of language drivers
is beyond my control, the user must accept the risk of using this
program.  It will only continue if that risk is accepted.

Gerald Lightsey May 20, 2003

About character sets

In the early 1980s, the developers of the IBM PC created an ordered list
of symbols known as the IBM extended character set. This list contained
all the classic ASCII 7-bit characters, together with various
mathematical symbols, line and box drawing characters, and some accented
characters.

While this was adequate for certain English-speaking countries, it was
insufficient for most other countries. For example, there are accented
characters in various European languages that are not included in the
IBM extended character set. Therefore, a number of other character sets
were developed. Each character set, including the original IBM extended
character set, is contained in a code page. Each code page is designed
for a particular country or group of countries, and each is identified
by a three-digit number.

Some examples of the code pages supported by MS-DOS are:

        437     English and some Western European languages
        850     Most Western European languages
        852     Many Eastern European languages
        860     Portuguese
        863     Canadian French
        865     Nordic languages

These are known as OEM code pages (for Original Equipment Manufacturer).
The classic IBM extended character set is contained in OEM code page 437
and is the default code page for the United States. DOS considers code
page 850 to be the default for most European countries. Code page 850
contains all the letters (but not all the symbols) of code pages 437,
860, 863, and 865; consequently, many of the box-drawing and line-
drawing characters contained in these code pages are omitted to make
room for accented characters in code page 850.

Each character in a code page is identified with a number; this number
(which can be decimal or hexadecimal) is known as a code point. For
example, the code point of the numeric character "4" is 52 (decimal) or
34 (hexadecimal) in code pages 437 and 850.

The Windows environment uses its own character set, which is generally
known as the ANSI character set. Although this character set shares many
characters in common with the OEM code pages, it omits most of the line-
drawing characters and mathematical symbols that these code pages offer.
Furthermore, even characters shared in common between an OEM code page
and the ANSI character set often have different code point numbers.
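
The difference in code points is easy to see with a short Python
sketch.  This is only an illustration: it assumes that Python's "cp437"
and "cp850" codecs match the OEM code pages described above and that
"cp1252" stands in for the Windows ANSI character set.  The helper name
is hypothetical.

```python
# Assumption: Python's "cp1252" codec stands in for the Windows ANSI set.
def code_points(ch, codecs=("cp437", "cp850", "cp1252")):
    """Return {codec name: single-byte code point} for one character."""
    return {name: ch.encode(name)[0] for name in codecs}


if __name__ == "__main__":
    # The digit has the same code point everywhere; accented letters do not.
    for ch in "4\u00e9\u00f1\u00d6":          # 4, e-acute, n-tilde, O-umlaut
        print(ascii(ch), code_points(ch))
```

The digit "4" comes back as 52 in all three character sets, while
e-acute is 130 in both OEM code pages but 233 in the ANSI set.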

The global language driver determines the character set used by dBASE
Plus. If you have another product already installed on your system that
uses the Borland Database Engine (BDE), your current language driver is
unchanged when you install dBASE Plus. If no BDE language driver setting
is detected, however, dBASE Plus installs the ANSI language driver by
default.

About language drivers

dBASE Plus uses language drivers to specify which character set to use
and which language rules apply to that character set. For example, the
Canadian French language driver uses a character set that is identical
to code page 863, while the default driver for the United States uses a
character set that is identical to code page 437. It is important to
understand that dBASE Plus uses these internal code pages instead of the
code pages supplied by the operating system.

dBASE language drivers contain tables that define or control the
following for a particular character set:

        Alphabetic characters
        Rules for upper- and lowercase
        Collation (sort order) used in sorting or indexing
        String comparisons (=, <, >, <=, >=)
        Soundex values (values that represent phonetic matches when
          exact spellings are not known)
        Rules for translation between OEM and ANSI character sets

dBASE Plus identifies each driver with a character string known as an
internal name. For example, the internal name of the German driver for
code page 850 is DB850DE0, and the internal name of the Finnish language
driver is DB437FI0. The following table lists some of the European
language drivers available in dBASE Plus.

Language or country     Code page    Internal name
Portuguese/Brazil       850          DB850PT0
Portuguese/Portugal     860          DB860PT0
Danish                  865          DB865DA0
Finnish                 437          DB437FI0
French/Canada           850          DB850CF0
French/Canada           863          DB863CF1
German                  437          DB437DE0
Italian                 437          DB437IT0
Netherlands             437          DB437NL0
Norway                  865          DB865NO0
Spanish                 437          DB437ES1
Spanish                 ANSI         DBWINES0
Swedish                 437          DB437SV0
English/UK              437          DB437UK0
English/UK              850          DB850UK0
English/USA             437          DB437US0
English/USA             ANSI         DBWINUS0
W. European             ANSI         DBWINWE0

When dBASE Plus converts data from OEM to ANSI, and vice versa, most
alphabetic characters that exist in both an OEM code page and the ANSI
character set are converted without problem. Most of the extended
graphic symbols in an OEM code page cannot be represented in the ANSI
character set at all. When such a discrepancy exists, dBASE, like other
standard Windows applications, makes a guess at the nearest character,
but data loss can occur.
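
A minimal Python sketch of why the loss occurs follows.  It assumes
Python's "cp437" and "cp1252" codecs as stand-ins for an OEM code page
and the ANSI character set, and it substitutes "?" where dBASE would
guess at the nearest character, so it only counts the characters that
have no exact equivalent.  The function name is hypothetical.

```python
def oem_to_ansi(data, oem="cp437", ansi="cp1252"):
    """Convert OEM bytes to ANSI bytes.

    Returns (converted bytes, number of characters with no ANSI equivalent).
    Unlike dBASE, this sketch writes "?" instead of guessing a near match.
    """
    out = bytearray()
    lost = 0
    for ch in data.decode(oem):
        try:
            out += ch.encode(ansi)
        except UnicodeEncodeError:
            out += b"?"      # no exact equivalent in the ANSI set
            lost += 1
    return bytes(out), lost


if __name__ == "__main__":
    print(oem_to_ansi(b"K\x99NIG"))    # accented letter: converts cleanly
    print(oem_to_ansi(b"\xcd\xcd"))    # box-drawing characters: lost
```

The accented letter survives with a different code point, while the
box-drawing characters cannot be represented at all.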

Using global language drivers

Each time you start dBASE Plus, a language driver is activated
automatically. This is known as the global language driver. This setting
applies to reading and writing of files, table creation, table-
independent character operations and the output of commands and
functions. For example, the global language driver governs FOR and WHILE
expression evaluations.

dBASE Plus normally chooses the global language driver from the dBASE
Plus Language Driver setting in the BDEADMIN.EXE Utility. Optionally,
you can also specify a global driver in your PLUS.ini file with an
ldriver key. When there is no PLUS.ini entry for a language driver, the
setting in the BDEADMIN Utility determines the global language driver.
When you place a valid driver entry in PLUS.ini it overrides the setting
in the BDEADMIN Utility except when creating tables. dBASE Plus will
always set the new table's language according to the global language
driver specified in the BDE Administrator Utility.

To set the ldriver option in PLUS.ini:

1.        Close dBASE Plus if it is running.

2.        Open the PLUS.ini file (normally located in your Plus\Bin
        directory) and enter one of the following in the
        [CommandSettings] section:

        ldriver = WINDOWS

        or

        ldriver = <internal driver name>

        For example, the internal name of a European Spanish language
        driver for code page 437 is DB437ES1; to install this driver,
        insert the following setting:

        ldriver = DB437ES1

3.        Save your changes and restart dBASE Plus.

Use ldriver = WINDOWS to maximize compatibility with the operating
system locale.

Use ldriver = <internal driver name> to specify a Borland language
driver and maximize compatibility with legacy applications. For legacy
applications, matching the global language driver to the one previously
in effect will help ensure compatible character handling and processing
of data in the legacy tables.

Using table language drivers

dBASE Plus assigns a language driver to a table automatically when you
create it. This assignment is recorded in the LDID, a 1-byte identifier
in the file header region. When you create a table from scratch, dBASE
Plus always assigns the current global language driver to the LDID. When
you create a table file from another table file, either the global
language driver or the language driver of the original table is assigned
to the LDID of the new table. Which language driver is assigned depends
on the command you use to create the file, as shown in the following
table:

Assigns global driver              Assigns original table driver
CREATE                             COPY FILE
CREATE...FROM                      COPY STRUCTURE
CREATE...STRUCTURE EXTENDED        COPY...STRUCTURE EXTENDED
                                   COPY TABLE
                                   COPY TO
                                   IMPORT
                                   JOIN
                                   MODIFY STRUCTURE
                                   RENAME TABLE
                                   SORT
                                   TOTAL

For example, the following commands open a table file and create a new
one with the LDID set to the current global language driver:

use CLIENTS     // LDID specifies a language driver other than the
                // global language driver

copy to CLIENTS2 structure extended  // LDID of CLIENTS2.DBF matches the
                                     // language driver of CLIENTS.DBF
use CLIENTS2 exclusive
create NEWCLIENT from CLIENTS2       // Create a new table with the
                                     // global LDID

The following commands open a table file and create a sorted table file
with an LDID set to the original table language driver:

use CLIENTS                   // LDID specifies a nonglobal language
                              // driver

sort on LASTNAME to CUSTOMER  // LDID of CUSTOMER is the same driver
                              // as that of CLIENTS

Using table language drivers versus global language drivers

When the language driver of a table differs from the current global
language driver, the table language driver is loaded into memory
automatically when you open the table. Thereafter, the table language
driver is respected by some commands, while the global language driver
is respected by others.

All commands that have nothing to do with a table use the global
language drivers. The following table shows the general rules when
operations are performed on table data where the table language driver
differs from the global language driver.

Table driver                              Global driver
INDEX ON command expressions              SET FILTER command expression
FOR scope expression of INDEX ON          FOR and WHILE expressions for
  command                                   every command except INDEX ON
SET KEY range checking expression         SET RELATION TO expression
SORT command expressions
Secondary matches for expressions in
  LOOKUP( ), FIND, SEEK, and SEEK( )
  with EXACT set OFF
Secondary matches for SET RELATION TO
  expression with EXACT set OFF
  (uses the driver of the child table)

For example, when you create a table file with the German language
driver, an LDID identifier is written to the header region of the file.
If the global language driver is set to English and you open the table
in dBASE Plus, it notes the discrepancy between the table's and the
system's language rules. If you create an index with INDEX ON, the
logical order of the index obeys the language driver of the table:

use VOLK                     // Created with the German language driver
index on NAMEN tag DIENAMEN  // Orders records in the German way

By contrast, if you create a filter with SET FILTER, the filtering
condition obeys the global language driver:

use VOLK
set filter to NAMEN = "KONIG"  // Excludes records with "KÖNIG" in NAMEN

LDCONVERT and ANSI drivers

dBASE automatically converts the contents of character fields and text
memos from the character set indicated by the table's language driver to
the global character set used by dBASE when reading the data (and vice
versa when writing).  Performance is best when the table character set
matches the global character set; you can turn LDCHECK ON to receive
messages about potential performance problems.

If you choose an ANSI language driver, then no conversion is required
for display under Windows, which may improve UI performance, but this
may be offset by database conversion if you are using old OEM tables.  
If you create new ANSI tables then this is the best situation of all.

You can set LDCONVERT OFF if you need to retrieve binary data from a
character field of a table whose character set doesn't match the global
character set, but it is strongly recommended that you leave this set to
the default ON.  There is no performance cost when the table matches the
global, and if they don't match you will see garbage if it's OFF, and
may put garbage back in the file for other users.
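
The "garbage" effect can be sketched in a few lines of Python.  This is
an analogy rather than dBASE's actual mechanism: it assumes a table
written with an OEM driver (Python's "cp437") and a global ANSI driver
(Python's "cp1252"), and the function name and convert parameter are
hypothetical stand-ins for the LDCONVERT setting.

```python
def read_field(raw, table_codec, global_codec, convert=True):
    """Interpret raw field bytes the way a reader would see them.

    convert=True  : bytes are translated from the table character set
                    (roughly what LDCONVERT ON provides).
    convert=False : bytes are taken as if they were already in the global
                    character set (roughly LDCONVERT OFF).
    """
    codec = table_codec if convert else global_codec
    return raw.decode(codec, errors="replace")


if __name__ == "__main__":
    stored = b"K\x99NIG"   # O-umlaut stored as code point 153 (OEM)
    print(ascii(read_field(stored, "cp437", "cp1252", convert=True)))
    print(ascii(read_field(stored, "cp437", "cp1252", convert=False)))
```

With conversion the field reads as the intended name; without it, code
point 153 is shown as whatever the global character set assigns to that
value, and writing the text back would store the wrong bytes.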

The user of this form must accept responsibility for its use.