High availability (HA) is perhaps one of the most important features of any fully featured database management system. It refers to the ability to minimize system downtime and continue normal service operation in the event of a hardware, operating system, service, or network failure. This ability is critical in network computing, where services must be available 24/7, which is why it is a must-have enterprise-level feature.
The CUBRID HA feature has a shared-nothing architecture and is based on transaction log replication, which guarantees 100% database consistency between the master and slave database servers. This differs from MySQL Replication, which is "asynchronous and does not guarantee that data from the master has been replicated to the slaves" (see MySQL HA). To get synchronous replication in MySQL, users have to rely on external solutions, either purchased commercially or used under open source license terms. Because HA is built into the all-in-one CUBRID Database, it is easier for users to provide continuous availability and 100% data consistency.
There are two ways to configure CUBRID HA to keep databases synchronized across multiple server systems:
1. Database Server Duplication
Figure 1 shows a common HA configuration in which the database server is duplicated (shown as node A and node B in the figure). If the master database server (node A) fails, requests fail over to the standby server (node B), which becomes the master server. Once the failed server (the previous master) is recovered, it becomes the slave database server.
2. Broker Duplication
The Broker is CUBRID-specific middleware and part of CUBRID's 3-tier architecture. It relays communication between the Database Server and external applications, whether CUBRID Manager or the JDBC and PHP connectors. It provides functions including connection pooling, monitoring, and log tracing and analysis.
Figure 2 shows another HA configuration, in which the broker is duplicated (shown as broker b1 and broker b2 in the figure). If broker b1 fails, requests fail over to broker b2. Once broker b1 is recovered, requests fail back to it. Such a configuration is only possible in database systems with a 3-tier architecture, such as CUBRID.
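The two failover patterns above can be pictured with a small Python model. This is only an illustrative sketch, not CUBRID code: the `Node` class and the `fail_over` and `pick_broker` helpers are hypothetical names invented for the example, and real CUBRID failover is handled inside the server and broker processes.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str  # "master", "slave", or "failed" (labels assumed for this sketch)


def fail_over(nodes):
    """Promote the first slave if no master is left (Figure 1 scenario)."""
    if any(n.role == "master" for n in nodes):
        return
    for n in nodes:
        if n.role == "slave":
            n.role = "master"
            return


def pick_broker(brokers, is_alive):
    """Return the first reachable broker in priority order (Figure 2 scenario)."""
    for broker in brokers:
        if is_alive(broker):
            return broker
    raise ConnectionError("no broker is reachable")


# Database server duplication: node A fails, node B takes over,
# and node A rejoins as a slave once it is recovered.
node_a, node_b = Node("A", "master"), Node("B", "slave")
node_a.role = "failed"
fail_over([node_a, node_b])
node_a.role = "slave"

# Broker duplication: requests move to b2 while b1 is down,
# then return to b1 because it is always tried first.
alive = {"b1": False, "b2": True}
during_outage = pick_broker(["b1", "b2"], alive.get)
alive["b1"] = True
after_recovery = pick_broker(["b1", "b2"], alive.get)
```

Note the asymmetry the text describes: a recovered database node comes back as a slave, while a recovered broker b1 receives requests again (failback), which the model gets for free by always trying brokers in the same priority order.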
CUBRID HA Development History
The CUBRID HA feature was first implemented in CUBRID version 2.0 on top of the Linux Heartbeat daemon, which lets clients learn about the presence or disappearance of peer processes on other machines. However, as an external package, Linux Heartbeat was not optimized for CUBRID and was therefore difficult to maintain.
Therefore, in CUBRID 2.2 the development team dropped support for the Linux Heartbeat package and implemented a native CUBRID Heartbeat component with an emphasis on reliability, integrity, and ease of use. The CUBRID Heartbeat feature is included in the cub_master process. It exchanges heartbeat messages with the cub_master processes of other nodes and executes failover on the standby server when a failure is detected. It also regularly monitors the availability of the HA-related processes (cub_server, copylogdb, applylogdb). Thus, since version 2.2, the CUBRID HA feature has been developed entirely in-house.
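The general idea of heartbeat-based failure detection can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the cub_master implementation: the `HeartbeatMonitor` class, its method names, and the 3-second timeout are all hypothetical.

```python
class HeartbeatMonitor:
    """Tracks when each peer was last heard from (illustrative only)."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds of silence tolerated (assumed value)
        self.last_seen = {}     # peer name -> time of last heartbeat

    def record(self, peer, now):
        """Note that a heartbeat message from `peer` arrived at time `now`."""
        self.last_seen[peer] = now

    def failed_peers(self, now):
        """Return peers whose last heartbeat is older than the timeout."""
        return sorted(
            peer for peer, seen in self.last_seen.items()
            if now - seen > self.timeout
        )


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record("node_a", now=0.0)
monitor.record("node_b", now=0.0)
monitor.record("node_b", now=4.0)          # node_b keeps sending heartbeats
suspects = monitor.failed_peers(now=5.0)   # node_a has been silent for 5 s
```

In this model a standby node would start failover once the master appears in `suspects`. Passing the clock value in explicitly keeps the example deterministic; a real monitor would read the system clock.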
Today, NHN Corporation, the largest CUBRID user, runs 70% of all its CUBRID deployments with HA mode enabled.
HA Improvements in CUBRID 8.4.0
Improvements to the CUBRID HA feature in CUBRID 2008 R4.0 (version 8.4.0) follow three major directions: Convenience, Stability, and Functionality.
Convenience
Because high availability is one of the most complex parts of any database system, DBAs report many issues related to the effort required to configure and deploy it and, later, to analyze the logs. CUBRID HA in version 8.4.0 is therefore meant to be easy to configure, administer, and use, so that even a DBA with one month of experience can enable HA support for their CUBRID database servers.
Below is what the CUBRID HA Development Team has already completed for CUBRID 8.4.0.
- Revised the configuration scripts so that they are easier to edit and reconfigure.
- Revised and extended the CUBRID HA utilities with additional commands and tools for managing the HA feature.
- Revised all error codes and messages to make them more comprehensible.
- Improved the HA monitoring tool, a feature frequently requested by DBAs.
- Improved the representation of HA logs so that the output is readable.
- Improved the manual and documentation to make learning and day-to-day work easier for DBAs.
Stability
To increase the stability of the CUBRID HA feature, the HA Dev Team has performed intensive code refactoring, cleanup, and quality assurance. The internal algorithms have also been modified and improved, especially for log duplication. For instance, see Figure 3.
CUBRID HA provides three copy modes for transaction log duplication:
- Sync mode (most stable mode with no data loss)
- Async mode (fast, but data may be lost)
- Semi-sync mode (mid-fast and no data loss)
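One way to picture the difference between the three modes is to ask when the master may acknowledge a commit. The sketch below assumes the usual meaning of these terms in replication systems generally; it is not taken from the CUBRID source, and the exact conditions CUBRID applies in each mode may differ. The `CopyMode` enum and `can_acknowledge_commit` function are hypothetical names.

```python
from enum import Enum


class CopyMode(Enum):
    SYNC = "sync"
    SEMI_SYNC = "semi-sync"
    ASYNC = "async"


def can_acknowledge_commit(mode, received_by_slave, flushed_on_slave):
    """Decide whether a commit may be reported as done (assumed semantics)."""
    if mode is CopyMode.ASYNC:
        # Assumption: async never waits for the slave, so it is fast,
        # but the latest log records may be missing after a failover.
        return True
    if mode is CopyMode.SEMI_SYNC:
        # Assumption: semi-sync waits until the slave has received the log.
        return received_by_slave
    # Assumption: sync also waits until the slave has stored the log on disk.
    return received_by_slave and flushed_on_slave


# The log has reached the slave but has not been written to disk there yet.
async_ok = can_acknowledge_commit(CopyMode.ASYNC, True, False)
semi_ok = can_acknowledge_commit(CopyMode.SEMI_SYNC, True, False)
sync_ok = can_acknowledge_commit(CopyMode.SYNC, True, False)
```

Under these assumptions the trade-off in the list is visible directly: the more the master waits for, the slower the commit and the smaller the window in which data can be lost.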
Functionality
Most important of all is functionality. The HA feature in the CUBRID 8.4.0 release will provide the following new features:
- New PHRO broker mode: PHRO (preferred host read only) is another RO mode, in which the broker looks up the prioritized hosts before other hosts, so that DBAs can easily control (configure) the connection order. As a result, there will be four broker modes: RW (read write), RO (read only), SO (slave only), and PHRO (preferred host read only).
- New M:S:R configuration: CUBRID 8.4.0 will support not only M:S (Master:Slave) but also M:S with multiple replicas. For Web services with heavy read loads, this configuration will generate much lower network and disk I/O load than a single master with multiple slaves does.
- Multiple slaves on a single server: This is best for small Web services whose primary goal is to lower cost. This configuration will make it possible to gather all (or some) slave databases on a single host, so a single physical machine can be allocated to multiple slave databases, each of which is replicated separately from its own master database.
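The connection order described for the PHRO broker mode in the list above can be sketched as a simple reordering of the host list. This is a minimal illustration, not the broker's actual logic: the `connection_order` function and the host names are hypothetical.

```python
def connection_order(hosts, preferred):
    """Put the DBA's preferred hosts first, then the remaining hosts."""
    first = [h for h in preferred if h in hosts]  # keep the DBA's priority order
    rest = [h for h in hosts if h not in first]   # keep the original order
    return first + rest


# Hypothetical host names: the DBA prefers replica2, then replica1.
hosts = ["master", "slave", "replica1", "replica2"]
order = connection_order(hosts, preferred=["replica2", "replica1"])
print(order)
```

A read-only client in this model would try the hosts in `order` from left to right and fall back to the non-preferred hosts only when every preferred host is unavailable.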
Conclusion
Currently, CUBRID HA is known for its stability and reliability. The new version, however, will be even more powerful and much easier to use, so that even a DBA with one month of experience can enable HA support for their CUBRID database servers. In addition, the CUBRID HA feature will best fit the needs of enterprise services that require 100% data consistency between master and slave database servers as well as continuous availability.
"The CUBRID HA is the most distinguishing and key feature of CUBRID for Enterprises. Therefore, we will continue improving it, and provide a reliable and easy to use HA solution to ensure the data availability and consistency throughout all data servers," says Wonhee Jeon, a leader of the CUBRID HA Development Team.
To learn more about other new features coming in CUBRID 8.4.0, read Roadmap: What to Expect in CUBRID 8.4.0.