A transaction is an abstraction that represents an isolated unit of work and guarantees its atomicity, consistency, isolation, and durability, properties essential in database systems.
Banking apps, for example, must not debit Alice's account unless they also credit Bob's: allowing one half of a transfer to take effect without the other would violate atomicity.
ACID properties
The ACID properties (Atomicity, Consistency, Isolation, Durability) define the guarantees that make database transactions reliable, for business applications and applications in general. Together they ensure that committed changes to data remain intact even in the event of a system failure or crash.
Atomicity ensures that transactions are treated as single, indivisible units: their operations either all succeed or all fail, avoiding partial results such as a debit from one account that is never matched by the corresponding credit to another.
Consistency ensures that a transaction takes the database from one valid state to another, so invariants hold before and after it runs and data is never corrupted. Isolation ensures that concurrent transactions do not interfere with each other, and durability ensures that changes made by a committed transaction survive system failures; for instance, after a transfer between Alice and Bob, both should see balances that reflect the completed transfer even if the server crashes immediately afterwards.
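To make atomicity concrete, here is a minimal sketch using Python's standard sqlite3 module; the accounts table, names, and amounts are invented for illustration. The "with conn:" block opens a transaction that commits only if both updates succeed and rolls back otherwise.

    import sqlite3

    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 0)")
    conn.commit()

    def transfer(conn, src, dst, amount):
        try:
            with conn:  # opens a transaction; commits on success, rolls back on error
                conn.execute(
                    "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                    (amount, src))
                # If this second statement fails, the debit above is undone too.
                conn.execute(
                    "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                    (amount, dst))
        except sqlite3.Error:
            print("transfer aborted; no partial update was applied")

    transfer(conn, "alice", "bob", 40)
    print(conn.execute("SELECT * FROM accounts").fetchall())
    # [('alice', 60), ('bob', 40)]

Either both balances change or neither does; the database never exposes a state in which Alice has been debited but Bob has not been credited.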
Transaction processing
Transaction processing is the practice of grouping updates to a computer system into indivisible units of work, so that hardware or software errors cannot leave an operation half-finished, which would cost time and money to repair. It also helps ensure that only valid, complete changes reach database records.
Transaction processing begins with entering transaction data into a system. This can be done manually, as when employees log their hours in a time-tracking system, or automatically, as when a point-of-sale system records customer purchases.
Next, the transaction is processed. This may involve validating the data, checking for errors, and updating databases or other systems. Each transaction should also be recorded in a log (sometimes called a journal) to support recovery: the log makes it possible to redo committed work or undo incomplete work after a failure, preserving atomicity. The same record also helps prevent a transaction from being applied twice in the future.
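The sketch below illustrates the logging idea in Python; the journal file format, transaction ids, and apply callback are all hypothetical. Each transaction is appended to the journal with a unique id, so a recovery pass can replay the log while skipping ids it has already seen, avoiding duplicate application.

    import json

    # Hypothetical append-only journal: one JSON record per line.
    def log_transaction(journal_path, txn_id, operation):
        with open(journal_path, "a") as journal:
            journal.write(json.dumps({"id": txn_id, "op": operation}) + "\n")

    def recover(journal_path, apply):
        """Replay the journal after a crash, applying each transaction once."""
        seen = set()
        with open(journal_path) as journal:
            for line in journal:
                record = json.loads(line)
                if record["id"] in seen:
                    continue  # duplicate entry: already applied
                seen.add(record["id"])
                apply(record["op"])

    log_transaction("journal.log", "txn-001", {"debit": "alice", "amount": 40})
    recover("journal.log", apply=print)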
Two-phase commit
Two-phase commit (2PC) is an industry-standard protocol designed to maintain the ACID properties of transactions in distributed systems. The protocol involves communication between local and remote resource managers, with the node where the transaction originated typically acting as coordinator and the others as participants.
At the outset of a two-phase commit, the coordinator directs each participant to prepare by writing a durable copy of its transaction results to stable storage. Once a participant has done so, it notifies the coordinator that it is ready to commit.
In the second phase, the coordinator instructs every participant either to commit or to roll back, and all of them take the same action; if any participant failed to prepare or did not respond, the coordinator orders a rollback everywhere. This ensures that a distributed transaction either commits on every server it touched or aborts on all of them, eliminating situations in which one server credits a customer while another never records the matching debit.
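A toy coordinator might look like the following sketch; the Participant class and its prepare/commit/abort methods are invented stand-ins for real resource managers, and timeouts are not modeled.

    class Participant:
        """Hypothetical stand-in for a remote resource manager."""
        def __init__(self, name, can_commit=True):
            self.name, self.can_commit = name, can_commit
        def prepare(self):
            # Phase 1: durably record the pending result, then vote.
            return self.can_commit
        def commit(self):
            print(f"{self.name}: committed")
        def abort(self):
            print(f"{self.name}: rolled back")

    def two_phase_commit(participants):
        # Phase 1 (prepare): every participant records its result and votes.
        votes = [p.prepare() for p in participants]
        if all(votes):
            # Phase 2 (commit): all voted yes, so tell everyone to commit.
            for p in participants:
                p.commit()
            return True
        # Any "no" vote (or, in a real system, a timeout) aborts everywhere.
        for p in participants:
            p.abort()
        return False

    two_phase_commit([Participant("bank-a"), Participant("bank-b")])
    two_phase_commit([Participant("bank-a"), Participant("bank-b", can_commit=False)])

In the second call, bank-b votes no, so both participants roll back and neither side applies a partial result.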
CAP theorem
The CAP theorem is an essential concept for understanding distributed systems. It states that a distributed system can guarantee at most two of three properties (consistency, availability, and partition tolerance), a tradeoff database designers and architects must weigh when creating distributed systems.
Consistency here means that all nodes in the system see the same value at the same time: every read reflects the most recent write. Without it, different clients can observe contradictory data; for instance, a collaborative video editor that reads a stale value could show one editor footage that is missing another editor's changes.
Availability refers to a system's ability to respond to every request, which is essential so users can keep working. Partition tolerance, the third property, means the system continues to operate even when network failures prevent some nodes from communicating. During a partition, availability comes into tension with consistency: a node cut off from its peers can either answer with possibly stale data, preserving availability, or refuse to answer until the partition heals, preserving consistency. The CAP theorem says a system cannot guarantee both while a partition lasts, so designers must decide which property to favor based on the application's needs rather than trying to achieve both at once.
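The choice can be sketched as follows; the Replica class, its mode flag, and the partition flag are hypothetical. A CP-style replica refuses reads while partitioned, while an AP-style replica keeps answering with whatever value it last saw.

    class Replica:
        """Hypothetical replica illustrating the CP vs. AP choice during a partition."""
        def __init__(self, mode):
            self.mode = mode          # "CP" favors consistency, "AP" favors availability
            self.value = "v1"         # last value replicated to this node
            self.partitioned = False  # True when cut off from the primary

        def read(self):
            if self.partitioned and self.mode == "CP":
                # Consistency first: refuse to serve a possibly stale value.
                raise TimeoutError("unavailable until the partition heals")
            # Availability first: always answer, even if the value is stale.
            return self.value

    ap = Replica(mode="AP")
    ap.partitioned = True
    print(ap.read())            # "v1": possibly stale, but available

    cp = Replica(mode="CP")
    cp.partitioned = True
    try:
        cp.read()
    except TimeoutError as err:
        print(err)              # consistent, but not available during the partition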