Monday, February 24, 2020

Service Architecture for OLTP --- Part 4. SAGA Patterns in Micoservice for OLTP


This post covers the SAGA programming patterns in the OLTP application architecture explained in this series.

First revisit the requirements for Web Services with or without OLTP:
  • Large scale on online request serving:
    High throughput and concurrency with reasonable latency for request serving.
  • Large scale on data in the system:
    The states of the system will need to be distributed (partitioned).
An Ecommerce website is a typical example of such Web Service with strong OLTP requirement on certain user operations, such as checkout a shopping cart. Consider the following scenarios:
  • The Website has large number of active products for sale. Products are randomly grouped into DB shards to avoid hot spots.
  • A popular product may be concurrently checked out by large number of user sessions. 
  • The sequence of operations for the final step of checkout:
    1. Check and deduct inventories of all products in checkout cart.
    2. Call payment service to secure payment.
    3. Generate an order.

Naive Implementation


Wrap the whole sequence of operations into a single DB transaction or a global transaction due to DB sharding and/or payment service is invoked thru RPC. This implementation likely will not satisfy the requirements on concurrent checkout on the same product, because an in-cart product's row in inventories table will be locked for the duration of the transaction, which prevents other checkout operation on the same product from making progress.
Global transaction will exacerbate the problem as overhead of network messaging and 2PC implementation will make the transaction duration even longer.

SAGA Pattern


The intuition is to break a long transaction to shorter ones. Specifically separate out short transactions to encapsulate operations that are potentially contentious (between different concurrent transactions). The example would have at least two transactions:
  • Transaction 1: Reserve inventory for a product in checkout cart:
    1. Check and deduct inventories of the product.
    2. create a reservation for the product and the checkout cart: {prodId, cartId, quantity}. This could be a new row in the checkoutReservation table.
  • Transaction 2 is performed after transaction 1 is performed on every product in the cart. The input is the cart with only products that transaction 1 succeeds on.
    1. Call payment service to secure payment.
    2. Generate an order.
    3. Remove reservations for all reserved products.
Note that transaction 2 only access data within the scope of current checkout session, and hence without any contention from other checkout sessions (assuming DB locks at record level). Transaction 1 needs to be a short transaction as there is contention. Hence we should make it a local transaction --- keep Inventory record and its corresponding reservation record in the same DB shard. We will also need to have a compensating transaction for Transaction 1 to rollback Transaction 1 if Transaction 2 fails.
  • Comp_trans 1: compensating transaction of Transaction 1.
    1. Remove reservation for the pair of {prodId, cartId}.
    2. Add back inventory of the product.
We will go thru a more realistic SAGA sequence for this example in part 5 (Workflow), where a separate order creation step is added to request processing part of the flow. For now, let us look at the state transition diagram of this over simplified SAGA sequence.

The dash line separates "Synchronous Request Processing" from "Async Processing" that happens outside the context of request processing and logically after server returns response to user. The "Async Processing" part of the state transition will need to be handled by a workflow (more explained in the "Workflow" post).

SAGA has the following properties:
  • The sequence of multiple transactions of the SAGA pattern does not provide serializable isolation on the whole sequence as a single transaction could. This implies the following constraints on the use cases:
    • The SAGA sequence of operations needs to be logically isolated already in the application. In other words, there is no other writers on the same data set during the progress of the sequence. In the above example, transaction 2 operates only on data within the scope of the checkout session and not shared by other checkout sessions.
    • Read requests can tolerate some transient states during the execution of the SAGA sequence. In the above example, at the state of "order in progress", inventory is deducted but payment is in progress, the application need to be able to reconcile the "inconsistent" state when user queries the state of the order.
  • Both transaction and compensating transaction need to be idempotent. This is an important requirement imposed by distributed workflow processing with inherent issue of network partitioning (for example, network partitioning between data stores and workflow engine).
  • Transactions in the SAGA sequence will need to be automatically orchestrated in both the context of request processing and async processing. Compensating transactions will also need to be added to the sequence for orchestration in case of rollback.
  • Compensating transactions will have to eventually succeed albeit with retries (as indicated in the above state transition), or else the system will be in an "inconsistent" state, which may require human intervention. 

Idempotent Transaction


The key technique to implement idempotency is to:
  • Always generate new record in DB, which is associated with pre-existing ID.
  • If the transaction intends to update a record, also generate above-mentioned new record to represent the occurrence of the update.
How this works is pretty straightforward --- you always check whether a record with that ID already exists before you generate a new one.

SQL Example for Transaction to Reserve Inventory


The following SQL code example is tested on MariaDB (a fork of MySQL after the Oracle acquisition).
Database schema for tables inventories and reservedInv, which are both sharded by prodId to guarantee local transaction for reserving inventory.

CREATE TABLE inventories
(
  prodId          VARCHAR(20),        # Unique ID for the record
  num             INT,                
  PRIMARY KEY     (prodId)            # Make the id the primary key
);

CREATE TABLE reservedInv
(
  prodId          VARCHAR(20),  
  cartId          VARCHAR(20),  
  num             INT,                # number reserved
  PRIMARY KEY     (prodId, cartId)    # Make the id the primary key
);

The transaction to reserve inventory demonstrates the technique to implement idempotency and use read lock.

Delimiter //
Create Procedure reserve_inventory(
    IN prod VARCHAR(20),
    IN cart VARCHAR(20),
    OUT status VARCHAR(20)
)
BEGIN
START TRANSACTION;
# check whether the prodId is already reserved by the cartId to guarantee idempotency
SELECT COUNT(1) INTO @reserved FROM reservedInv WHERE prodId=prod AND cartId=cart;
IF @reserved > 0 THEN
    COMMIT;
    SET status = 'NO OP';
ELSE
    # use select ... for update to explicitly lock read
    SELECT num into @invNum from inventories WHERE prodId = prod FOR UPDATE; 
    IF @invNum > 0 THEN
        UPDATE inventories SET num = num -1 WHERE prodId = prod;
        INSERT INTO reservedInv (prodId, cartId, num) VALUES (prod, cart, 1);
        COMMIT;
        SET status = 'RESERVED';
    ELSE
        ROLLBACK;
        SET status = 'ABORTED';
    END IF;
END IF;
END //
Delimiter ;

CALL reserve_inventory('p1', 'c1', @status); SELECT @status;

The SQL for the compensating transaction should be straight forward based on the above.

No comments:

Post a Comment

Post your comments

Decentralized Database over DLT