Database transactions and distributed computing
A system that supports reliable transactions will:
- Keep the data consistent at all times, including when failures occur
- Provide isolation among all applications and users accessing the database
In 1983, transactional database properties were summarized by the acronym ACID:
- Atomic: all or nothing for a set of operations; either all are accepted or all are rejected
- Consistent: time consistency, semantic consistency and constraint consistency; certainly the most flexible property of the four
- Isolated: reads are isolated from transactions that are still being written. There are several isolation levels. The most important requirement is to avoid dirty reads, and the commonly accepted isolation level is “Read committed”. However, it is not sufficient for all cases, especially in compliance contexts
- Durable: once a transaction has been committed, its data will persist whatever happens (e.g. a crash just after the transaction was committed)
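As an illustration of the "all or nothing" property, here is a minimal sketch using Python's standard sqlite3 module on an in-memory database. The accounts table, the account names and the "no negative balance" rule are assumptions made up for this example; they do not come from any particular product.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` between two accounts as one atomic unit of work."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Hypothetical business rule: no account may go negative.
            (low,) = conn.execute("SELECT MIN(balance) FROM accounts").fetchone()
            if low < 0:
                raise ValueError("insufficient funds")
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a {name: balance} dictionary."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Illustrative setup: two accounts in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()
```

A transfer that would break the rule is rolled back as a whole: neither the debit nor the credit remains visible afterwards, which is the behaviour the Atomic bullet describes.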
In the world of Relational Database Management Systems (RDBMS), any serious solution complies with the ACID properties. Most implementations are not distributed, which greatly simplifies the implementation of ACID properties.
Before addressing the ACID issue for distributed databases, we must introduce the notions of eventual consistency and strong consistency, which are part of the distributed database vocabulary.
Distributed databases fall into two main categories for implementing transactions. The first category focuses on performance and simplicity, at the price of no guarantee of consistency over time. The second category accepts a trade-off in performance (small when efficiently implemented) to get certainty that a transaction has actually been written (per-row Atomicity) and will remain written (Durability). Eventually consistent systems do not guarantee that all readers see the same value for a piece of data while it is being changed (updated or deleted).
- Eventually consistent systems, by nature, give up ACID properties
- Strongly consistent systems are able to implement per-row ACID properties
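The difference between the two categories can be pictured with a toy in-memory model. This is only a sketch under simplifying assumptions: the ReplicatedStore class, its synchronous flag and its sync method are hypothetical names invented for the illustration, and no real database replicates data this naively.

```python
class ReplicatedStore:
    """Toy key-value store with one primary and several in-memory replicas."""

    def __init__(self, replicas=2, synchronous=True):
        self.synchronous = synchronous
        self.primary = {}
        self.replicas = [{} for _ in range(replicas)]
        self.pending = []  # writes not yet applied to the replicas

    def write(self, key, value):
        self.primary[key] = value
        if self.synchronous:
            # Strong consistency: return only after every replica has the write.
            for replica in self.replicas:
                replica[key] = value
        else:
            # Eventual consistency: return immediately, replicate later.
            self.pending.append((key, value))

    def sync(self):
        """Apply queued writes to all replicas (stands in for background replication)."""
        for key, value in self.pending:
            for replica in self.replicas:
                replica[key] = value
        self.pending.clear()

    def read(self, key, replica=0):
        """Read from one replica, as a client routed to that node would."""
        return self.replicas[replica].get(key)
```

With synchronous=False, a read issued right after a write can still return the old value (or nothing) until sync has run; with synchronous=True, every replica answers with the new value as soon as the write returns, at the cost of a slower write.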
HBase implements strong consistency combined with replication, which means that a transaction is acknowledged only once it has actually been applied to all replicas. Data is not visible as long as the transaction is not committed, and it is visible and consistent to every reader once committed. However, a transaction in HBase does not have exactly the same properties as a transaction in an ACID RDBMS.
Is HBase ACID?
The straightforward answer is no because, at the very least, HBase does not provide atomicity for a set of operations spanning several rows. On the other hand, HBase claims to be ACID per row. In the end, this question is rather theoretical. Scaled Risk solves a very practical equation:
HBase + Scaled Risk = Transactional system for finance
- Any single-table atomicity issue can actually be solved with HBase
- Any multi-table problem can actually be addressed with per-row atomicity
- Scaled Risk is able to isolate multi-row transactions for reads
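To make "per-row atomicity" concrete, the sketch below models a table in which every row is protected by its own lock, so all column changes to one row are applied together or not at all. It is a toy in-memory model, not the HBase client API and not Scaled Risk's mechanism; the RowStore class, its method names and the "trade-1" row key are hypothetical. The conditional update is similar in spirit to the check-and-mutate operations HBase offers on a single row.

```python
import threading


class RowStore:
    """Toy in-memory table where each row is guarded by its own lock."""

    def __init__(self):
        self._rows = {}
        self._locks = {}
        self._guard = threading.Lock()  # protects creation of rows and row locks

    def _lock_for(self, row_key):
        with self._guard:
            self._rows.setdefault(row_key, {})
            return self._locks.setdefault(row_key, threading.Lock())

    def put(self, row_key, columns):
        """Apply all column updates to one row atomically."""
        with self._lock_for(row_key):
            self._rows[row_key].update(columns)

    def check_and_put(self, row_key, column, expected, columns):
        """Update the row only if `column` currently holds `expected`."""
        with self._lock_for(row_key):
            if self._rows[row_key].get(column) != expected:
                return False
            self._rows[row_key].update(columns)
            return True

    def get(self, row_key):
        """Return a snapshot copy of one row."""
        with self._lock_for(row_key):
            return dict(self._rows[row_key])
```

A reader of a single row never observes a half-applied update in this model. Nothing here coordinates two different rows, which is the gap a multi-row isolation layer has to fill.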
ACID properties do not actually work on a traditional RDBMS when inserting, updating or deleting 1 billion rows in a single transaction.
Read more and discover how Scaled Risk extends HBase core features to enable off-the-shelf transactional Big Data for finance professionals by downloading our complete white paper on Transactional Big Data for Finance.