
[tm-rg] BTP, WS-BA and compensations



Sorry I wasn't able to attend the call: I'm trying to clear the decks to become more actively involved in this group.

Just to correct one misunderstanding on BTP. (I write as one of the co-authors of the OASIS BTP specification.)

BTP allows complete freedom to implementers of services participating in a business transaction: if they wish to use a DO-COMPENSATE model, then they can do so. If they wish to use a PROVISIONAL-FINAL model, or a VALIDATE-DO, then they can do so. These three models form a spectrum of participant patterns for business transactions (i.e. transactions that link related state-changing operations in autonomous applications or services).

BTP permits this by sending one of two messages to participants or sub-coordinators: CONFIRM or CANCEL. Each of these messages can be faulted, if a technical failure or business event has prevented or superseded delivering on the promise of an earlier PREPARED message. The application has complete freedom to define its actions prior to sending PREPARED, and in reaction to either CONFIRM or CANCEL. The "application" could be a generic resource manager operating under conventional ACID rules; it could be a business service that treats CANCEL as "charge 2.5% for the cancelled reservation" and CONFIRM as "charge 100%". Any application or business rules can apply.
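As a rough illustration of that freedom, here is a minimal Python sketch of a participant that attaches its own business rules to CONFIRM and CANCEL. It is not the BTP message binding: the class, method and attribute names (ReservationParticipant, Fault, expired and so on) are hypothetical, and the 2.5% fee is simply the example figure used above.

```python
class Fault(Exception):
    """Raised when the promise made by an earlier PREPARED can no longer be kept."""


class ReservationParticipant:
    """Hypothetical service: CONFIRM charges 100%, CANCEL charges a 2.5% fee."""

    CANCEL_FEE_RATE = 0.025

    def __init__(self, price):
        self.price = price
        self.prepared = False
        self.expired = False   # stands in for any technical failure or business event
        self.charged = None

    def prepare(self):
        # Whatever the application does before sending PREPARED is its own business.
        self.prepared = True
        return "PREPARED"

    def confirm(self):
        self._check_promise()
        self.charged = self.price
        return "CONFIRMED"

    def cancel(self):
        self._check_promise()
        self.charged = round(self.price * self.CANCEL_FEE_RATE, 2)
        return "CANCELLED"

    def _check_promise(self):
        # Either message can be faulted if the PREPARED promise no longer holds.
        if not self.prepared or self.expired:
            raise Fault("promise made by PREPARED no longer holds")
```

A generic resource manager under ACID rules would fit the same shape, with commit and rollback behind confirm() and cancel().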

In our company, Choreology, we have developed the pretty firm view that the canonical model is PROVISIONAL-FINAL (where some kind of application resource reservation algorithm is used to put a service into a pending state with respect to a request). Examples from the business world that use this pattern, very naturally, are quote-to-order, or reserve credit (authorize), do trade (trigger payment). The service transits from a tentative or provisional state to a final state (either confirmed or cancelled) under the control of the business transaction coordination service, using a distributed coordination protocol like BTP. 

One way or the other, this implies the use of some kind of application-level lock (often as simple as a pending status flag). However, and this can be of critical value in some applications, we can allow observation of provisional states. It may be very useful to see potential consumption of a resource, including in deciding when to allow reservation of more of that resource. A probabilistic approach to reservation can allow more sophisticated use of inventory. The use of an application-determined locking strategy can tune the competing demands of concurrency and accuracy in a very knowledgeable way. This frees systems from the rigorous, but limiting, constraints imposed by general-purpose data-level resource managers such as DBMS and message queues. 
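A small Python sketch may make the idea of an observable, application-level lock concrete. Everything in it is an assumption made for illustration: the Inventory class, the confirm_probability parameter and the admission rule are hypothetical, not part of BTP or of any product.

```python
class Inventory:
    """Hypothetical stock holder that tracks provisional (pending) reservations."""

    def __init__(self, stock, confirm_probability=1.0):
        self.stock = stock
        # Assumed fraction of pending reservations that eventually get confirmed.
        self.confirm_probability = confirm_probability
        self.pending = {}   # reservation id -> quantity; the "pending status flag"

    def provisional_total(self):
        # Provisional state is deliberately visible to other callers.
        return sum(self.pending.values())

    def expected_consumption(self):
        return self.provisional_total() * self.confirm_probability

    def reserve(self, res_id, qty):
        # Probabilistic admission: weigh pending reservations by their
        # likelihood of confirmation instead of locking the stock outright.
        if self.expected_consumption() + qty > self.stock:
            return False
        self.pending[res_id] = qty
        return True

    def confirm(self, res_id):
        qty = self.pending[res_id]
        if qty > self.stock:
            # Over-reservation caught up with us: the confirmation is faulted.
            raise RuntimeError("over-reserved: confirmation must be faulted")
        del self.pending[res_id]
        self.stock -= qty

    def cancel(self, res_id):
        self.pending.pop(res_id, None)
```

With confirm_probability set to 1.0 this degenerates into a strict reservation scheme; lower values trade accuracy for concurrency, which is exactly the tuning decision the application is best placed to make.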

Both VALIDATE-DO and DO-COMPENSATE are afflicted to differing degrees by "promise degradation" over time. When a system goes prepared with respect to a business transaction it promises to obey the final decision of the coordinating application. These two patterns create wobbly promises.

(Parenthetically: coordinating applications should be able to decide that some participants are to be confirmed, while others are to be cancelled: what BTP calls a Cohesion, or cohesive business transaction, where the uniform outcome rule of classic atomic transactions is relaxed. This allows several viable outcomes, from the most desirable to the minimally acceptable, to be created. "All-or-nothing" is not a natural feature of many business transactions.)
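A Cohesion can be sketched in a few lines of Python. This is an illustration only, assuming every participant has already gone prepared; the Quote class, the complete_cohesion function and the "cheapest quote wins" rule are hypothetical choices, not BTP definitions.

```python
class Quote:
    """Hypothetical prepared participant offering a price."""

    def __init__(self, supplier, price):
        self.supplier = supplier
        self.price = price
        self.state = "prepared"

    def confirm(self):
        self.state = "confirmed"

    def cancel(self):
        self.state = "cancelled"


def complete_cohesion(participants, choose):
    """Confirm the participants picked by `choose`, cancel all the others."""
    confirm_ids = {id(p) for p in choose(participants)}
    for p in participants:
        if id(p) in confirm_ids:
            p.confirm()
        else:
            p.cancel()
    return {p.supplier: p.state for p in participants}


def cheapest(participants):
    # Business rule owned by the coordinating application.
    return [min(participants, key=lambda p: p.price)]
```

Calling complete_cohesion(quotes, cheapest) confirms one supplier and cancels the rest, an outcome that a uniform all-or-nothing rule cannot express.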

VALIDATE-DO: a valid potential operation now may become an invalid operation in the future. This may lead to recoil: the attempted confirmation must be faulted. This is the familiar pattern of optimistic concurrency control, which can create relatively localized damage (failure of an interaction between the two parties, breach of a globally uniform outcome in a single business transaction).

DO-COMPENSATE: much worse. If one's "provisional" action is identical to one's "confirmed" action, then the system which enacts DO really enacts it. And when I say "really" I mean: right the way through to physical actions such as manufacture-to-order, inventory picking/movement, invoice generation, payments etc. The "effect explosion" gets worse and worse, the longer you leave the newly unleashed business process running. COMPENSATE operations then become very complex indeed: they must accommodate a plethora of time-dependent states, affecting numerous internal systems and potentially external organizations. This effect explosion problem can lead to system designers breaking natural business transactions in two: first do a quote, then do an order. The recoverable correlation offered by the simplifying transaction abstraction is then lost.

WS-BusinessActivity (first edition, August 2002, revised 2004) from BEA/IBM/Microsoft is very similar to BTP (not surprising, as it post-dates the fundamental conceptual work of BTP in 2001). But it suffers a particular defect, which is small, but very limiting. This defect arises from its original defining use-case: BPEL Long-running Transactions (LRT).

BPEL processes are formed of nested scopes. Scopes can be defined to be "reversible" (compensatable). Service invocations from such scopes could use WS-BA to distribute the implied transactional context to remote service operations, allowing them in turn to become "reversible". Reversibility is expressed in the ability of a BPEL scope to have a compensation handler. But BPEL scopes (and transitively, WS-BA-infected services) cannot define a "confirmation" or "finalization" handler. This means that BPEL can only employ one model on the BT participant spectrum: DO-COMPENSATE. My hope would be that this limitation will be lifted in a future version of BPEL, allowing scopes to be defined as "contingent", implying presence of both confirm and cancel ("compensation") handlers. (Another way of putting this is: currently BPEL uses the Open Nested Transaction model, which has not gained wide practical support because of the problems I have outlined with its implications for participant behaviour.)

In WS-BA this limitation is expressed in the fact that a CLOSE message (akin to a BTP CONFIRM) cannot be faulted: it is assumed that the only effect of CLOSE is to drop the compensation handler (in transaction terms, to logically delete the log entry which allows recoverable memory of the business transaction to be retained), and it is further assumed that failure to logically delete the log is a non-fatal technical error, that can be passed over silently, and cleaned up by some form of GC.

We have proposed to the authors of WS-BA that this restriction be lifted. This is a very simple change to WS-BA, but a liberating one. If CLOSE can be faulted, then the full spectrum of BT participant patterns can be used, as the service writer desires. And it should be noted: each, relatively autonomous, service drawn into a coordinated business transaction may choose to use a different participant model, concurrently. It should also be noted that the BPEL LRT (do-compensate) model simply becomes a particular (perfectly valid) use of a more general facility.
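The effect of the proposed change can be sketched in Python. This is not the WS-BA message set: CloseFault, the two participant classes and close_all are hypothetical names, and the sketch assumes the proposed behaviour in which CLOSE may be faulted.

```python
class CloseFault(Exception):
    """Hypothetical fault raised in response to CLOSE (the proposed extension)."""


class DoCompensateParticipant:
    """BPEL LRT style: the work is already done, CLOSE only drops the undo record."""

    def __init__(self):
        self.state = "completed"
        self.compensation_log = ["undo order"]

    def close(self):
        self.compensation_log.clear()
        self.state = "closed"


class ProvisionalFinalParticipant:
    """CLOSE performs the final step, so it needs a way to report failure."""

    def __init__(self, still_valid=True):
        self.state = "provisional"
        self.still_valid = still_valid

    def close(self):
        if not self.still_valid:
            raise CloseFault("reservation lapsed")
        self.state = "final"


def close_all(participants):
    """Send CLOSE to every participant and record which ones faulted."""
    results = {}
    for name, participant in participants.items():
        try:
            participant.close()
            results[name] = "closed"
        except CloseFault as fault:
            results[name] = f"faulted: {fault}"
    return results
```

Both participant models coexist under one coordinator here, and the do-compensate participant is simply the case that never raises the fault.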

The following diagram shows the change required to the WS-BA protocol.

[Diagram not preserved.]

The explicit recognition of the effect of time on "options" or "reservations" in BTP is another useful feature, which we would like to see introduced into any other protocol (such as WS-BA), that is intended for use in application-level coordination.

At some point in the near future we intend to publish our currently private feedback to the authors of WS-Coordination, WS-AtomicTransaction and WS-BusinessActivity, and we will post the link to this group, inter alia.

I hope that the group has a productive meeting in Hawaii.

Alastair

Alastair J. Green
CTO, Choreology Ltd
68 Lombard St, London EC3V 9LJ
+44 870 739 0050




-----Original Message-----
From: Torsten Steinbach [mailto:torsten@data-grid.org]
Sent: 27 May 2004 16:16
To: tm-rg@ggf.org; Andrew.Simpson@comlab.ox.ac.uk; djp@comlab.ox.ac.uk
Cc: manfred_oevers@uk.ibm.com
Subject: [tm-rg] Telcon notes 27 may 2004


Hello everyone,

Attached please find the minutes of today's telcon.
Btw, I put stuff like this to our GridForge project as well:
http://forge.gridforum.org/docman2/ViewCategory.php?group_id=140&category_id=726

Torsten.