Chapter 20
Chapter 20
Database
Replication
In the 1970s, the mainframe computer was the main
instrument used in the delivery of data to the enterprise. Databases were centralized, and
clients were merely dumb terminals. This paradigm, however, met its partial demise because
it was expensive and unfriendly to the user.
In the 1980s, the local area network (LAN) came into
being, and data was distributed among groups of users tied into a common network. This
reduced development costs for some, but fragmented the data into smaller databases.
Organizational data was spread out over multiple locations, which meant much data entry
effort was duplicated and groups did not communicate efficiently.
The 1990s has brought the need for organizations to
communicate on a much larger scale. Wide area networks (WANs) provide a means for
communicating among individuals. The speed and reliability of WANs, however, are not
generally fast enough to allow for constant connection to databases located in other
cities or countries. It is necessary to have databases located locally that communicate
with one another.
The 90s have also brought about the widespread use
of laptop PCs. More and more workers are performing their daily chores off-line. These
individuals want access to data contained on WANs, but are unable to attach economically
from cars, airplanes, hotels, and client offices.
The purpose of this chapter is to show you how to
facilitate the environment in which we now work on a daily basis. You learn about creating
databases that can be copied to other sites. You then learn how to coordinate the changes
made to these databases among users at different sites. You learn about database
replication.
What Is
Database Replication?
When we refer to database replication, we are
talking about the act of creating copies of a database and coordinating the changes made
to the data among all copies. The original database is referred to as the Design Master.
Each copy of the database is referred to as a replica. The combination of the Design
Master and all the replicas of the Design Master is referred to as the replica set. The
act of creating the components of the replica set, and keeping the data contained in it
synchronized, is referred to as database replication.
By performing database replication, you permit users
to work on the data that is most convenient for them to use. This is important in large
organizations with offices in multiple sites, or among organizations with a significant
population of remote or mobile users.
The Microsoft Jet engine allows for several ways to
perform database replication. This includes the use of the Windows 95 Briefcase and the
Microsoft Access Replication Manager, and through programming using Data Access Objects
(DAO). The lesson today focuses on the use of DAO to perform replication.
Why Use
Database Replication?
There are numerous reasons why you may want to
consider using database replication in your Visual Basic 5 database application. If you
work in a large organization, you may need to deploy your application over a wide area
network environment. This typically requires you to keep the main copy of the database,
the Design Master, at the central office, and create replica sets across all the other
offices.
You may also need to build an application for use by
remote users. An example of this might be a customer contact management system for your
sales staff. Each salesperson could have a replica of the Design Master to review and
update while visiting clients. All the salespeople could then update all the changes they
make to the Design Master. In turn, each salesperson could receive all changes made by all
other members of the sales force to the Design Master. This is referred to as
synchronizing the data.
Generally, to back up a database, the data files
must be closed to all users. This is sometimes not practical, however, or even possible.
Database replication can be used in this situation to make a replica of the original
database, without having to close any files or hinder user access to the data contained in
the database.
You might also want to use database replication to
create a static database for reporting. In many applications, such as financial
applications, data changes constantly. Mass confusion reigns if users create reports that
differ each time they are generated. By using replication, you can create an unchanging
copy of the data to a separate database that users can then use for reporting and
analysis.
When Is
Database Replication Not a Good Idea?
Though database replication can be an invaluable
tool, there are scenarios where it should not be deployed. For example, you may not want
to deploy replication when you are delivering data in an intranet environment. Before
deploying a typical Visual Basic 5 database application in a large organization (for
example, an application with a front-end located on a user workstation and the data on a
separate server), you may want to test the performance of a database application that uses
a Web browser as the front end. This can greatly reduce the maintenance required for the
application and the deployment time to individuals.
You do not want to use replication in applications
that are heavily transaction-oriented. For example, an airline would not want to use
replication for a reservation system. It makes little sense for users to work with a copy
of a database that is unreliable, and therefore unusable, the second after the data is
replicated. (Many cynical travelers believe, however, that airlines do use two reservation
systems--one for passengers and one for baggage.)
You also do not want to use replication in a system
where data accuracy is extremely important, such as emergency response systems. In
databases used by law enforcement or fire departments, for example, you might not be able
to replicate data fast enough to be of value to the user. If, for example, a bank is
robbed in Columbus, Ohio, and the criminal is fleeing towards Cincinnati, you may not have
the time to perform the replication so that the police force in Cincinnati has a
description of the criminals. Additionally, the mode of data transfer used in the
synchronization may not be operating due to circumstances beyond your control.
Making a
Database Replicable
The focus of this exercise is to turn an existing
database into a Design Master. We use the REPLMAST.MDB database that shipped on
the CD that came with this text as our original database. Please locate this database in
the \\TYSDBVB5\SOURCE\DATA directory now and place it into the directory you want
to use for this project.
Before we begin, let's open the REPLMAST.MDB
file using the Visual Data Manager (Visdata). This can be done by selecting Add Ins |
Visual Data Manager from the Visual Basic 5 menu. When Visdata loads, select File | Open
Database | Microsoft Access and locate REPLMAST.MDB. Your screen should resemble
Figure 20.1.
Figure 20.1. The REPLMAST.MDB
database before it becomes a Design Master.
Note that there are nine tables in this database. You may also recognize this as a copy of
the guideS5.MDB database that we used in previous lessons.
Now select Utility | Preferences | Include System
Tables. This displays all of the system tables for this database in the Database window.
Your screen should look like Figure 20.2.
Select the Authors table and open the Fields
property. Notice that there are five fields defined for this table. Open the same property
for the guideSales table. Use Figure 20.3 as a reference.
Finally, open the Properties object in the Database
window. Take a look at the properties that currently exist for this database. Your screen
should look similar to Figure 20.4.
The purpose of this quick exercise was to show you
what tables and fields exist within the database. You now create a Visual Basic project
that turns the REPLMAST.MDB database into a Design Master. After that, you return
to Visdata and view the changes made to this database as a result of becoming a Design
Master.
Figure 20.2. The REPLMAST.MDB
database and system tables.
Figure 20.3. The fields of the Authors and guideSales
tables.
Figure 20.4. Database properties before the Design
Master is created.
Creating the Design Master
Start Visual Basic 5 and begin a Standard EXE
project. Add a command button to a form. Set its name property to cmdCreateMaster and its
Caption property to &Create Master. Your form should look similar to Figure 20.5.
Figure 20.5. The main
form of REPLDEMO.VBP.
Save the form as REPLDEMO.FRM and the project as REPLDEMO.VBP.
NOTE: Make sure that you have set the
Microsoft DAO 3.5 object library before performing the exercises in this chapter. This can
be done by selecting Project from the main menu, then choosing Preferences. Find the
option for the object library in the dialog that appears and then press OK.
Now, double-click the command button and enter the
code from Listing 20.1 in its Click event.
Listing
20.1. Visual Basic code for the Create Master command button.
Private Sub cmdCreateMaster_Click()
Dim dbMaster As Database
Dim repProperty As Property
`Open the database in exclusive mode
Set dbMaster =
OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb", True)
`Create and set the replicable property
Set repProperty = dbMaster.CreateProperty("Replicable",
dbText, "T")
dbMaster.Properties.Append repProperty
dbMaster.Properties("Replicable") = "T"
`Display a message box
MsgBox "You have created a Design Master!"
End Sub |
This code opens the REPLMAST.MDB exclusively, creates the Replicable property and
appends it to the database, and then sets the Replicable property to T. Please
note that you must first create this property because it does not exist in a standard
database.
NOTE: Always make a backup copy of your
database before converting it into a Design Master. Once the Design Master is created and
data changes are made, destroy the copy. Later today you will see that making and using
backup copies of the Design Master is dangerous business.
Add a second command button and name it cmdExit, and
use E&xit as the caption. Enter the code from Listing 20.2 into the Click
event of this project.
Listing
20.2. The cmdExit_Click event.
Private Sub cmdExit_Click()
Unload Me
End Sub
Run the project and click the Create Master button. You should see a message box when the
Design Master is created. See Figure 20.6.
Figure 20.6. Confirmation
that the Design Master has been created.
You have created the Design Master. You did not create a new file; rather, you modified
the existing file. Don't try to perform this operation on this same file a second time. A
file can be made a Design Master only once.
Select the Exit button to close the project.
What Happens
to a Database When You Make It Replicable?
The simple routine you wrote and executed in the
preceding example made quite a few changes to the REPLMASTER.MDB database. This
section explores these changes in detail.
Fields Added
to a Replicated Database
Open the Visual Data Manager (Visdata) and load the REPLMAST.MDB
database. Open the guideSales table and then expand the fields. Your screen should look
like Figure 20.7. Compare Figure 20.7 and Figure 20.3 to find the fields that were added.
The following three fields are added to each table
when the Design Master is created:
- s_Generation
- s_GUID
- s_Lineage
Figure 20.7. Fields added
when the Design Master is created.
The s_Generation field identifies records that have been changed. All records start out
with a number 1 in this field. This number changes to 0 when the record is modified in any
way. During synchronization between members of the replica set (discussed later in this
chapter) only the records with a 0 in this field are transferred. This speeds the
synchronization process by requiring the transmission of only the records that were
actually changed.
The s_GUID field is a 16-bit GUID field that serves
as a unique identifier for each record. This number remains the same for each record
across all members of the replica set.
The s_Lineage field contains the name of the last
replica member to update a record. This field is not readable by the users of the
database.
We discuss these fields as we make changes to the
database.
System
Tables Added to a Replicated Database
With the REPLMAST.MDB database still open,
let's take a look at the system tables that now exist. For comparison, refer back to
Figure 20.2 to see a listing of the tables that existed before the creation of the Design
Master.
As you can see, many new tables have been added to
the REPLMAST.MDB database. The purpose of these tables is to keep track of
synchronization activities to ensure that members of the replica set are updated properly.
For a complete description of the tables added, look at Visual Basic guides Online and
search using the phrase "Replication System Tables." Then choose the
"Changes to Your Database" topic.
NOTE: You cannot change the information
contained in most of the system tables that are added when a Design Master is created. The
Microsoft Jet engine makes most necessary changes during the synchronization process.
At this point, let's just explore the MSysReplicas
table by opening it. This table contains information on each member of the replica set. At
this point, there is only one member in this set (see Figure 20.8). In the exercise on
creating replicas later in this chapter, this table gains a record for each replica of the
Design Master that is created.
Figure 20.8. The
MSysReplicas table when the Design Master is first created.
Properties Added to the Replicated Database
The creation of the Design Master added properties
to the database. Open the Properties object in the Database window. Your screen should
look similar to Figure 20.9.
Notice that a property named Replicable now exists
and has a value of T. This means that replicas can now be made of this database.
Also note that a property called ReplicaID was
added. As you might expect, this is the unique identifier for this database. Each replica
receives its own ReplicaID as it is created.
Figure 20.9. Database
Properties after the Design Master is created.
A property called DesignMasterID was also created. This property identifies the Design
Master of the replica set. Notice that the DesignMasterID and the ReplicaID for this
database are the same.
For Microsoft Jet version 3.5, the ReplicableBool
property is new. This property performs the same function as the Replicable property, but
uses a Boolean data type where the Replicable property uses a TEXT data type. Note, that
the value of the property is set to True.
The final property added to the database was
LastUpdated. This field stores the ID of the last member of the replica set to update the
database.
Properties
Added to a Replicated Table
Open the table properties for any table in the REPLMAST.MDB
database. Notice that fields were added to each table during the creation of the Design
Master. See Figure 20.10.
The Replicable and ReplicableBool properties serve
the same function for the table as for database properties. When these values are set to T,
it indicates that the table can be replicated.
Figure 20.10. Table
properties after the Design Master is created.
Physical Size Changes Made to a Database When It Becomes
Replicable
If you're thinking that the addition of these
tables, fields, and properties to the Design Master will increase the size of your
database, you're correct. Approximately 28 bytes are added to each record contained to
allow for the replication feature. This is not much in itself, but when you consider all
the tables in a typical application, and all the records in each table, it can add up to
something significant.
Let's perform some mathematical calculations. Say
that you have a database with five tables--a main table and four validation tables. Let's
say there are 100,000 records in the main table, and 1,000 records in each of the four
validation tables. Adding replication functionality adds 2,912,000 bytes ([100,000 +
4,000] x 28) to the total size of each member of your replica set. As you can see, the
numbers can add up quickly!
In addition to the increase for each record,
replication adds many new tables, each of which takes up hard drive space. The space
requirements of these tables vary dramatically depending on the frequency of
synchronization, the number of members in the replica set, and the number of conflicts and
errors encountered during the synchronization process.
In addition to the physical hard drive space you
consume, remember that you are using up fields in each table to track replication
information. The Microsoft Jet engine allows for 255 fields in a table, including the
replication fields. Although it is extremely rare to have tables with 255 fields, it is
possible.
NOTE: If you have a table in your database
that is approaching 255 fields in size, you should probably be more concerned about
database normalization than you are with the number of fields consumed by replication.
Please refer to Chapter 16, "Database Normalization," for a complete discussion
of database normalization issues.
The Effect
of Database Replication on AutoNumber Fields
A typical AutoNumber field is incremented by 1 each
time a record is added. When a database is made replicable, these fields become random
numbers. Let's look at a quick example.
Open the database AUTONUMB.MDB found in the
\\TYSDBVB5\SOURCE\DATA directory on the CD that shipped with this guide. Now open
the tblSupervisors table as a Dynaset. Your screen should look similar to Figure 20.11.
Figure 20.11. The AUTONUMB.MDB
file before it becomes replicable.
Insert a new record and watch how the ID field increments by 1. Now you can return to the
Visual Basic 5 project REPLDEMO.VBP and modify the cmdCreateMaster Click
event by substituting AUTONUMB.MDB for REPLMAST.MDB. Run the
project and make the AUTONUMB.MDB database replicable.
Now open the database AUTONUMB.MDB in
Visdata. Open the tblSupervisors table and notice what happens to the AutoNumber field
when you add a new record. A random number has been inserted in the AutoNumber field. (See
Figure 20.12.)
Figure 20.12. The
AutoNumber field becomes random after the Design Master is created.
NOTE: The effects of database replication are
not the only reason not to use AutoNumber fields in your application. The use of an
AutoNumber, or Counter, field as a primary key in a data table should raise a red flag for
the developer, indicating that the database is not properly constructed or normalized.
AutoNumber fields should be used sparingly, if at all.
Creating
Replicas
Copies of the Design Master are referred to as
replicas. We now modify the REPLDEMO.VBP project to create a copy of the REPLMAST.MDB
file.
If you need to, start Visual Basic 5 and load the REPLDEMO.VBP
project. Add another command button to your form and name it cmdMakeReplica; insert the
caption &Make Replica.
Next, insert the code from Listing 20.3 into the cmdMakeReplica_Click
event.
Listing
20.3. The Visual Basic code to make a replica.
Private Sub cmdMakeReplica_Click()
Dim dbMaster As Database
`Open the database in exclusive mode
Set dbMaster =
OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb", True)
dbMaster.MakeReplica "c:\tysdbvb5\source\data\copy.mdb",
"Replica of " & "dbMaster"
dbMaster.Close
MsgBox "You have created a copy of your database"
End Sub |
This code first opens the database REPLMAST.MDB (our Design Master), and then
uses the MakeReplica method to create a new member of the replica set named COPY.MDB.
NOTE: Create the COPY.MDB file only
once. Trying to create another replica named COPY.MDB causes the program to fail.
NOTE: Always make a backup copy of a database
before you create a replica. This should be done whether you are creating a copy of the
Design Master, or another replica.
Save your project and execute it. Select the Make
Replica button to create the new database.
NOTE: You can't depend on the traditional
backup and restore methodology to safeguard a Design Master. Changes occur to the Design
Master during the synchronization process. Restoring a backup from a tape drive might
insert a database that is out of synch, and that might not be able to perform
synchronization with other members of the replica set. It is a far better practice to use
replication to create a backup copy that can be made the Design Master in case the
original is corrupted.
Select Exit when COPY.MDB is created. Open
your new replica in Visdata. Explore the properties of the new replica. Notice that you
have all of the same tables.
Now open the MSysReplicas table. When we first
looked at this table in the Design Master, there was only one entry. Now there are two.
Also note that the Description field for the new record is the same description you added
in the MakeReplica method you executed earlier. (See Figure 12.13.)
Figure 20.13. The
MSysReplicas table after creation of a replica.
As you can see, it is quite easy to make a replica. A replica can be made out of any
member of the replica set. For example, you could now create a third member of the set
from either REPLMAST.MDB or COPY.MDB.
Synchronizing
Members of a Replica Set
The act of making data in all members of the replica
set identical is referred to as synchronizing data. In this exercise we make data changes
to the Design Master and the replica you created in the previous exercise, and then
perform a synchronization to apply the data changes to the other member of the replica
set.
Open COPY.MDB in Visdata. Next, open the
Authors table. Add a few records to this table (make them up). Take note of how the
s_Generation field resets to zero when you add a record. The zero tells the Jet engine
that the record is ready to be copied during the next synchronization.
Also make a change to any existing record in this
table. Notice how the 1 in the s_Generation field also changes to zero. Again, this record
is marked to be synchronized. Your screen should look similar to Figure 20.14.
Figure 20.14. Changes to
records cause the s_Generation field to be set to 0.
Open the REPLMAST.MDB database in Visdata, and open the guideSales table. Make a
change to the first record. When we perform the synchronization, notice how changes get
updated in both members of the replica set.
Now close Visdata and open the REPLDEMO.VBP
project in Visual Basic 5. Add one more command button to the form. Name this button
cmdSynch, and set its caption to &Synchronize. Enter the code from Listing 20.4 into
the cmdSynch_Click event.
Listing
20.4. Code to perform a bidirectional synchronization.
Private Sub cmdSynch_Click()
Dim dbMaster As Database
`Open the database
Set dbMaster =
OpenDatabase("c:\tysdbvb5\source\data\replmast.mdb")
dbMaster.Synchronize "c:\tysdbvb5\source\data\copy.mdb"
MsgBox "The synchronization is complete."
End Sub |
This code uses the Synchronize method to copy changes from REPLMAST.MDB
to COPY.MDB, and vice versa.
Run the project and click the Synchronize button.
You receive a dialog box notifying you when the synchronization is complete. Stop the
program by selecting Exit.
NOTE: It is a good practice to compact your
database (repair it first, if necessary) before you perform a synchronization. This
ensures that you are not replicating potentially damaged records that might propagate
throughout the entire replica set.
Now open Visdata once more and load the COPY.MDB
database. Look first at the guideSales table and notice that it now reflects the data
change you made previously in the REPLMAST.MDB database. Open the Authors table.
Notice how the s_Generation field has been updated for the new and the changed records.
This is illustrated in Figure 20.15.
Figure 20.15. Data after
synchronization. Notice that the s_Generation field has a new value.
The s_Generation field is incremented by 1 each time a record is changed and a
synchronization is performed. The replica keeps track of the last record sent to a
particular member of the replica set, and only sends records with record numbers that are
greater than the last record sent, and of course, all records with an s_Generation value
of zero.
Open the REPLMAST.MDB file and its
guideSales table. Notice that the s_Generation field was updated on the record that was
changed.
The
Synchronize Method
In the preceding example, we used the Synchronize
method to perform a bidirectional synchronization of data. This two-way synchronization is
the default implementation of this method. The Synchronize method can also be
used to import information from another database, export changes to another database, and
even synchronize with databases over the Internet.
The structure of the Synchronize statement
is
Database.Synchronize pathname, exchange
where pathname is a string value naming the
destination of the replication, and exchange is one of dbRepExportChanges,
dbRepImportChanges, dbRepImpExpChanges, or dbRepSyncInternet.
Use the dbRepExportChanges to send changes to
another database without receiving updates from that database. Use dbRepImportChanges to
bring in changes from another replica set member without sending out any changes. If you
enter no exchange value, or use dbRepImpExpChanges, data flows both ways during a
synchronization. Finally, use dbRepSyncInternet to perform a synchronization over the
Internet.
NOTE: You need the Microsoft Office 97
Developer Edition if you want to perform data synchronization over the Internet.
NOTE: Be aware that the .MDB format
is used by the Microsoft Jet database engine. The Microsoft Jet engine is used by both
Visual Basic and Microsoft Access. It is common practice by Access developers to store
data, forms, reports, and queries in the same .MDB file. When you synchronize,
changes to forms or reports contained within the database are also synchronized.
Resolving
Synchronization Conflicts
Data conflicts are quite common among members of a
replica set. They can occur when the same record gets changed in different replicas in
between synchronizations. This means that two different users might see two different
values for the record. How does the Microsoft Jet engine know which value should be used?
Better yet, how does it know which value to use and distribute throughout the entire
replica set?
The logic that the Microsoft Jet engine uses to
resolve synchronization conflicts is simple and consistent. The replica set member that
changes the record the greatest number of times wins the conflict. If this number is equal
for all the replica members being synchronized, the Microsoft Jet engine selects the
record from the table with the lowest ReplicaID.
As you remember, the s_Lineage field stores the
number of changes to a record. This is the field that the Microsoft Jet engine examines to
determine which replica set member wins the conflict.
Load COPY.MDB into Visdata and open the
Authors table. Change the first record by changing the name of the Author in the first
record from "Smith, John" to "Smith, Copy." Now open the REPLMAST.MDB
database in Visdata, load the Authors table, and change the Name field of the first record
to "Smith, Curtis." Now save the record and close the table. Reopen the table
and change the DOB (Date of Birth) field to 9/2/64. Save the record and close the table.
You have now changed the first record of the Authors
table of COPY.MDB database once, and the same record in the REPLMAST.MDB
database twice. In a synchronization, which change do you think prevails?
To find out, close Visdata and load the REPLDEMO.VBP
project. Run the project and click the Synchronize button. When you are informed that the
synchronization is complete, close the project by pressing Exit.
Return to Visdata and load COPY.MDB. Open
the Authors table and notice that the first record is updated based upon the values that
were entered into the REPLMAST.MDB database. That is to say, the Microsoft Jet
engine knows that this record changed more times in the REPLMAST.MDB files than
in COPY.MDB, and therefore chooses that record as the one to use in the
synchronization.
But what happened to the change made in the COPY.MDB
file? To find out, close the Authors table, and you notice that a new table was added to
this database during the synchronization process, the Authors_Conflict table. Open this
table and you find a record with the single change. Your screen should look similar to
Figure 20.16.
Open the REPLMAST.MDB database in Visdata.
Notice in the Database window that the Authors_Conflict table does not exist. The error
table created by a synchronization conflict is stored only in the table that lost the
conflict. Open the Authors table, and you should see that both changes made to the first
record were preserved.
Figure 20.16. The
Authors_Conflict table.
Errors That May Occur During Replication
Along with record conflicts, more serious errors can
occur during synchronization. There are several actions that may cause an error during
synchronization. For example, you can implement table-level validation rules after
replicas have been created. This is not bad in itself, but an error occurs during
synchronization if you try to replicate the rule and if a member of the replica set has
entered and saved data that violates the rule.
This same type of error may occur if you change the
primary key of a table. You could try replicating this change only to find that you
receive an error when a replica has two equal values in two separate records in the field
you tried to create as the primary key.
In both cases, you are performing serious design
changes in mid-stream. You should therefore be careful and limit the design changes you
make to members of a replica set.
An error may also occur when one replica set member
deletes an entry from a validation table that has been used by another member in updating
a master record. You receive an error when you try to import the master record into the
replica set that deleted the validation table entry. Each member by itself doesn't violate
referential integrity rules, but when combined, they do so in grand style. To avoid this
situation, make validation tables read-only to all but the Design Master whenever
possible.
NOTE: Try to limit users to read-only access
to validation tables in a replicated environment.
NOTE: Try to avoid using cascading updates
and cascading deletes in your application when replication is used. These features make it
easy for you to cause a large number of synchronization errors.
You might also receive a synchronization error when
you try to update a record that is locked by another user in a multiuser environment. An
entry is written to the MSysErrors system table when you encounter such an error. To avoid
this problem, it is best to have all users locked out of a database during
synchronization.
You might also receive an error if you add a new
table to your database and use the same name that another replica used for a different
table. To avoid this, all members of the replica set need to communicate all database
changes.
In summary, synchronization errors can occur as a
result of design changes, as a result of violation of referential integrity rules on a
consolidated basis, or as a result of record locking by users of a replica set member. You
can avoid most of these errors by completing development before replication begins, by
securing validation tables whenever possible, and by locking the replica members involved
in a synchronization.
NOTE: Errors encountered during
synchronization are stored in the MSysErrors table. This table is replicated during the
synchronization process. Therefore, try to correct all encountered errors before they are
passed to other members of the replica set.
Replication
Topologies
When you implement database replication in your
application, you most likely will make more than one replica of the original Design
Master. When you do, you will be faced with the logistical question of how and when to
update replica set members.
You need to implement a schema for the order in
which data updates get dispersed throughout the replica set. The design of the order in
which replica set members get updated is referred to as the replication topology. We cover
the various topologies in this section. It is important, however, to note that there is no
universal best topology. You need to investigate the needs of your application's users
thoroughly before you can decide on which topology to implement.
The most commonly used topology implemented in
database replication is the star topology. In the star topology, there is one central
database, usually the Design Master, with which all members of the replica set perform a
synchronization. No replication occurs directly between members of the replica set. As an
example, let's assume you created a replica set with one Design Master (DM) and four
replicas (A, B, C, D). To begin, A first synchronizes with DM. Next, B synchronizes with
DM, then C with DM, and D with DM. A, B, C, and D don't talk to one another directly, but
pass all data changes through DM.
The star topology is the simplest topology to
implement. It doesn't require a strict synchronization order be maintained. Replica A
could synchronize after B, and C could synchronize before B. This is therefore a good
topology to use when you are working with a large number of replicas, such as in a sales
force automation application. Users can synchronize in this topology without having to
worry about when other members of the replica set synchronize.
There are two drawbacks to the star topology,
however, of which you should be aware. First of all, the central database with which all
replicas synchronize serves as a single point of failure. If this database is down, no one
can talk to anyone else. You should therefore be prepared to move one of the replica set
members into the central role if necessary. Remember, though, that use of a backup is not
recommended as a means of safeguarding a database in a replicated environment.
The other problem with this topology is that it
permits some replicas to synchronize infrequently, or not at all. This is actually a very
common problem in contact management databases, because some users don't see the need for
sharing their entries with other members of the replica set, or just don't get around to
performing the synchronization.
NOTE: It is not realistic to believe that
humans can stick to a strict replication schedule. Or that they will voluntarily perform a
synchronization if it is difficult. If implemented in an end-user application,
synchronization must be made extremely easy to use, or it will not be used.
A linear topology can also be used for
synchronization. In this topology, replica A synchronizes with B, then B synchronizes with
C, and then C synchronizes with D. To restart the process, D would synchronize with C, and
then C with B, and finally B with A.
A ring topology is similar to a linear topology,
except, the reverse track is not performed. In this scenario, replica A synchronizes with
B, B synchronizes with C, and then C synchronizes with D. Replica D then restarts the
process by synchronizing with A, and then A synchronizes with B, and so on.
The linear and ring topologies are good in that they
do not have a single point of failure. They are bad in that the synchronization can be
stopped, or delayed if one member goes down. Also, the transfer to other members of the
replica set is slower. In a linear topology, a change to C would have to go first to D,
then back to C, and then to B before it is sent to A. This is a total of four
synchronizations.
The fourth topology that can be used in a replicated
database structure is referred to as the fully connected topology. In this scenario,
replica A synchronizes directly with B, C, and D; replica B synchronizes directly with A,
C, and D; replica C synchronizes directly with A, B, and D; and D synchronizes directly
with A, B, and C. This topology requires the greatest amount of work, and should be used
in applications that require constant availability of data.
NOTE: You might want to reconsider the use of
database replication in your application if you are using the fully connected topology to
guarantee data availability. Web-enabled applications with centralized data may be a
better solution.
The topology you ultimately choose for your
application depends on the timeline requirement of data. If this is unknown, start with
the star topology and make changes as necessary.
Keeping
Database Elements from Replicating
There might be some data tables that you do not want
to replicate to other members of a replica set. This might be the case with data that is
highly sensitive in nature, or data that is of little value to other replicas. For
example, you might want to replicate general employee information to remote offices of
your organization, but you might not want to distribute payroll information outside of the
main office. Or, you might not want to replicate a table of office supply vendors used by
your California office to your office in Vermont.
In the following example we create the KeepLocal
property for the Authors table of a new database named KEEPLOC.MDB. This file can
be found in the \\TYSDBVB5\SOURCE\DATA directory on the CD that shipped with this
text. We then convert this database into a Design Master and make a replica named COPYKL.MDB.
This replica does not have the Authors table as part of its object collection.
Start this exercise by loading the REPLDEMO.VBP
project into Visual Basic 5. Add a command button to the form REPLDEMO.FRM. Set
the Name property of this button to cmdKeepLocal, and its Caption to &Keep Local. Now
add the code in Listing 20.5 to the cmdKeepLocal_Click event.
Listing
20.5. The Visual Basic 5 code to keep a table object from replicating.
Private Sub cmdKeepLocal_Click()
Dim dbMaster As Database
Dim LocalProperty As Property
Dim KeepTab As Object
Dim repProperty As Property
`Open the database in exclusive mode
Set dbMaster =
OpenDatabase("c:\tysdbvb5\source\data\keeploc.mdb", True)
Set KeepTab = dbMaster.TableDefs("Authors")
Set LocalProperty = dbMaster.CreateProperty("KeepLocal",
dbText, "T")
KeepTab.Properties.Append LocalProperty
KeepTab.Properties("Keeplocal") = "T"
MsgBox "The Authors table is set to not replicate"
`Create and set the replicable property
Set repProperty = dbMaster.CreateProperty("Replicable",
dbText, "T")
dbMaster.Properties.Append repProperty
dbMaster.Properties("Replicable") = "T"
`Display a message box
MsgBox "You have created a Design Master out of KEEPLOC.MDB!"
dbMaster.MakeReplica "c:\tysdbvb5\source\data\copykl.mdb",
"Replica of " &
Â"dbMaster"
dbMaster.Close
MsgBox "You have created a copy of KEEPLOC.MDB"
End Sub |
This code first opens our database, KEEPLOC.MDB, and sets the KeepLocal property
of the Authors table to T. Note that the KeepLocal property must be set before
the Design Master is created. The program then turns KEEPLOC.MDB into a Design
Master, and creates a replica named COPYKL.MDB.
Now run the application and select the Keep Local
Command button. You are prompted with a Message Box when the KeepLocal property is set to T
for the Authors table, when the KEEPLOC.MDB database is converted into a Design
Master, and when the COPYKL.MDB file is created. Finally, select Exit to unload
the project.
After the program is completed, open the KEEPLOC.MDB
database in Visdata. Expand the Authors table object in the Database Window and then
expand the Properties of the Authors table. Notice that the KeepLocal property is set to T.
This is illustrated in Figure 20.17. Open the guideSales table. Notice that there isn't a
KeepLocal property.
Figure 20.17. To prevent
this table from replicating, the KeepLocal property was created
and set to T.
Now open the COPYKL.MDB file in Visdata.
Notice that this database does not have an Authors table. You have successfully made a
replica of the Design Master and excluded a table!
NOTE: Objects created after a replica is
created do not flow to other members of the replica set. You must first set their
Replicable property to T to replicate them.
Summary
In database replication terminology, the main, or
central, database is referred to as the Design Master. A copy of the Design Master is
referred to as the replica. The combination of the Design Master and all replicas is
referred to as the replica set. Database replication is the process of synchronizing data
so that it is the same across all members of the replica set.
Database replication is a good tool to use in the
development of systems deployed across a WAN or to remote users. Replication can also be
used to make copies of databases that cannot be shut down. Replication is also good for
creating reporting databases.
Do not use database replication when a centralized
data storage facility can be used, such as a Web-enabled application. Also, don't use
replication in heavily transaction-oriented applications, or in applications where
up-to-the-minute accuracy is of paramount importance.
Tables, fields, and properties are added to a
database when it is made a Design Master. The addition of these items is necessary to
track changes to data and to facilitate the synchronization between members of the replica
set. These additions, however, consume additional hard drive space.
Creating and changing the Replicable property of a
database to T creates a Design Master. Once the Design Master is created, you can
use the MakeReplica method to make copies of it. Finally, you use the Synchronize
method to replicate data changes to members of the replica set. Data synchronization is
the act of copying data changes from one member of a replica set to another.
The Synchronize method can be used to
import data changes, export data changes, perform "two-way" data changes, and
even perform data exchanges over the Internet.
Synchronization errors occur when two members of a
replica set try to synchronize records that each has changed. Errors can also occur during
the synchronization process when design changes are made to a database but violated by
replicas prior to synchronization of the changes. Violation of referential integrity can
be encountered by replicas that add records to a database that uses validation records
deleted in another replica. Record locking in a multiuser environment can also cause
synchronization errors.
There are four topologies for the synchronization of
replicas. These are the star, linear, ring, and fully connected topologies. The star
topology is the most common, but like all the other topologies, has certain strengths and
weaknesses.
There may be times when you do not want to replicate
objects contained in one database to other members of the replica set. If such is the
case, use the KeepLocal method before you create the Design Master. This method
keeps the object from being copied to other replica set members.
Quiz
- 1. Define database replication.
2. Cite examples of applications that can make good use of database replication.
3. Cite examples of systems in which database replication should not be used.
4. What fields are added to all data tables when a database is turned into a Design
Master?
5. What properties are added to the database during the creation of the Design
Master to indicate that it can be replicated?
6. How much hard drive space is consumed by a database when it is turned into a
Design Master?
7. What happens to an AutoNumber field when a database is turned into a Design
Master?
8. What method do you use to create a copy of a Design Master?
9. What is the logic that the Microsoft Jet engine uses to resolve synchronization
conflicts?
10. What topologies can be used for database synchronization? Which topology is the
most commonly used?
11. What method do you use to keep database objects from replicating to other
members of a replica set?
Exercises
Design an implementation strategy for the rollout of
a database application that you built to track and deliver your company's employee
information. This application needs to be installed at your corporate office in
Cincinnati, and then delivered to offices in Chicago, Los Angeles, and New York. Use the
following information as you design your strategy:
- The database is named EMPLOYEE.MDB and has
four tables: EmployeeMaster, EmergencyInfo, Education, and SalaryInfo.
- All payroll is performed in the Cincinnati office.
- Updates need to be made at each site, and shared with
all other sites. The order in which updates are made to the database is not important.
- Cincinnati is the largest office. Chicago is the
second largest, and is three times the size of the Los Angeles or the New York offices.
Include the following items as part of your
implementation plan:
- Names of the tables to be distributed to each site
- Backup methodology
- Synchronization topology
- Code to keep the payroll information (SalaryInfo)
from replicating
- Code to create the Design Master
- Code to create the Chicago replica
- Code to synchronize your Chicago and your Cincinnati
databases
-