In the last few years I have been active with data replication solutions in the Oracle realm as you may know. This data replication field is one that has many angels, so there is something new to learn every day and sometimes there even are really new possibilities!
The first and most familiar form of the data replication forms is ‘physical data replication’, also known as ‘Standby Database‘.
In this form of replication, both source and target database are binary identical. Changes are propagated by copying the archived redo logfile from the source database to the environment for the standby database lives. Most often this is another server, preferably in another building in another town, far enough away to not be struck by the same havoc.
There are basically 3 ways to accomplish this;
- Use Oracle Data Guard (in Enterprise Edition Oracle database)
- Use Dbvisit Standby (in all Oracle database Editions)
- Write your own scripting (not recommended in any case)
The second and more emerging form of data replication is ‘Logical Data Replication’.
In this form of replication, there is not real relationship between the source and the target database, other than that the target database houses data coming from the source database. They can live on different systems, be from different database version, a different operating system or even be from a different vendor.
Data is harvested from the source database, converted and copied over to the target database / system. On the target system this data is being applied, in the native speech of the the target database.
There are a few ways to accomplish this, but basically every vendor has the same technique. It is more a matter of pricing, basically.
- Oracle Golden Gate (expensive, complex)
- Dell Shareplex (somewhat expensive)
- IBM Infosphere (ComPlex, expensive)
- Dbvisit Replicate (easy, affordable)
So, having discussed this, as this is not new, why this blogpost?
A Standby database is more or less closed. You can open it occasionally to query some data, but that interrupts the apply-process.
On-line data replication does what it says, you have an active database, where data is continuously added. This way you can, for example query, the same data on two sources to spread load.
The case I mean to discuss is the following:
“I have 10 source database and I want one target database (ah, presto, on-line data replication) and I want to backup 5 tables from each source to the target database (again, on-line data replication, but wait, backup?) so I can easily copy back specific data to the source (eeeuhm, yes…) whenever a user messes up the source tables (aï…) and I want the target to be update each day at 23:00 (so… okay!)
This reeks after somewhat of a hybrid approach!
We cannot do regular on-line data replication, for this is aimed at being real-time.
And we cannot leverage Standby database, since it needs to be centralized in one database and not 10. Next to that it would take some administration to open up the standby database in read-only mode, take the copy, and close the database again.
- By “delaying” the application of changes to the data, we could make sure the requested tables are only updated from 23:00 on;
- We can combine the 50 tables (10 databases x 5 tables) in one single target database, since it is a logical approach to the matter;
- We can easily restore or copy back corrupted data, since both the source and the target database remain continuously open.
Using Dbvisit Replicate, having this kind of protection for your “logical test-cases”, what this company was doing to require this solution, is really affordable.
It can help in dynamically and quickly resetting specific data-sets or test-cases while remaining much more flexible than creating scripts to reset a specific data-set or test-case! And, of course, there are many more ways to use this neat feature…