IRAC 4.5 Transactions; IRAC 4.6 Robustness and Restoration; IRAC 4.7 Common External Form

4.5 Transactions

There are several important observations to be made about the definition of TRANSACTION in the list of definitions. The first observation emphasizes how this definition extends the normal database concept of transaction. The term transaction as used here relies on the concept of "an identified set of operations". The method of identification is not specified, but it is not necessarily limited to simple sequentiality. It will be necessary to allow the execution of code in a tool between the operations of the set without that code being included in the transaction. A mechanism which allowed one to define an atomic operation C as "A" followed by "B" would be insufficient; the operations must be allowable as part of an algorithm. That is, the binding of operations to transactions must be possible at run-time, as opposed to only earlier.

Transactions are most clearly understood in terms of implicit consistency constraints. A system may provide built-in support for some constraints (that is, for explicit constraints), for example requiring that all objects of a particular type have a particular relationship. However, there may be implicit constraints in any system which are only understood by the user or program writer.

For example, in a banking system, there is a constraint that the total amount of money in the system be constant.

Transactions provide a means to support the preservation of these constraints in the face of concurrently executing programs and program and system failure. A transaction provides a means for a program to assert that, within some meaning of consistency understood by that program, "if the data are consistent at the start of the transaction, and there are no interfering concurrent transactions, then the data are consistent at the end of the transaction". The transaction mechanism must prevent failures or concurrent transactions from interfering with this consistency. This aspect is known as isolation.
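The banking example can be sketched in a few lines. What follows is only an illustration in Python, not part of the PCIS requirements: the `transfer` function, the `accounts` dictionary and the snapshot-based abort are all assumptions chosen for brevity. It shows the implicit constraint (a constant total) being broken part-way through the identified set of operations and then re-established, either by completion or by abort.

```python
import copy


class InsufficientFunds(Exception):
    """Raised when a transfer would overdraw the source account."""


def transfer(accounts, src, dst, amount):
    """Move `amount` from src to dst so that the effect is all or nothing."""
    snapshot = copy.deepcopy(accounts)   # recovery data kept for a possible abort
    try:
        accounts[src] -= amount          # the total is temporarily wrong here
        if accounts[src] < 0:
            raise InsufficientFunds(src)
        accounts[dst] += amount          # the total is correct again
    except Exception:
        accounts.clear()
        accounts.update(snapshot)        # abort: restore the consistent state
        raise


accounts = {"alice": 100, "bob": 50}
total_before = sum(accounts.values())

transfer(accounts, "alice", "bob", 30)       # completes
try:
    transfer(accounts, "alice", "bob", 500)  # fails and is rolled back
except InsufficientFunds:
    pass

total_after = sum(accounts.values())
```

Only the program knows that "the total is constant" is the constraint; the mechanism merely guarantees that either all of the identified operations take effect or none do.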

Another key feature of transactions is that they appear atomic from outside the transaction. This sense of "all or nothing", when provided by the system underlying the tools, represents a powerful asset for an IPSE. Simple locking gives some assistance in preventing clashes between concurrent programs. To gain consistency as described above using locking alone, however, a tool has to define its own internal notion of a transaction, support it itself, and hold its locks until the end of that transaction to prevent interference by other programs. Even then, locking gives no assistance in the event of program or system failure, since the system does not know which state is consistent. The granularity of locks is left as a decision for the PCIS designer.

4.5A Transaction Mechanism. The PCIS shall support a transaction mechanism. The effect of running transactions concurrently shall be as if the concurrent transactions were run serially.

The definition of TRANSACTION is concerned with the consistency of the transaction when it is run on its own, and not with the effects of other concurrent actions. In the terminology of [Gray76], this requirement implies that the PCIS shall provide degree 3 consistency (between concurrent transactions).

The requirements should not be read to imply that lower degrees of consistency are precluded. Indeed, the notion that some of the operations of a process may be outside the identified set of operations may be seen, in some ways, as analogous to providing for lower degrees of consistency, though, in that case, only for some of the operations of the process.
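The serial-equivalence property of Requirement 4.5A can be illustrated with two transactions whose effects do not commute. The sketch below is a hypothetical Python example: the single coarse lock and the names `run_transaction` and `shared` are assumptions made for illustration, not a proposed PCIS mechanism. Whatever the thread scheduling, the final value should be one of the two values that a serial execution could produce.

```python
import threading

shared = {"value": 1}
lock = threading.Lock()   # coarse lock: one transaction at a time


def run_transaction(update):
    """Read the shared value and write back update(value) as one unit."""
    with lock:                             # held until the transaction ends
        current = shared["value"]          # read
        shared["value"] = update(current)  # write


t1 = threading.Thread(target=run_transaction, args=(lambda v: v + 10,))
t2 = threading.Thread(target=run_transaction, args=(lambda v: v * 2,))
t1.start()
t2.start()
t1.join()
t2.join()

# The only outcomes a serial execution can produce: T1 then T2, or T2 then T1.
serial_outcomes = {(1 + 10) * 2, (1 * 2) + 10}
result = shared["value"]
```

Without the lock, an interleaving in which both transactions read the initial value could lose one of the updates, which corresponds to no serial order.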

4.5B Nested Transactions. The PCIS shall support nesting of transactions.

It is envisaged that the transaction mechanism will often be used by individual tools to ensure the integrity of data which they manipulate. However, it is also envisaged that tools may be composed together and cooperate with one another, using various means of control integration. The PCIS must support use of transactions by such a composed tool. This then implies a need to support nesting of transactions.

For example, suppose that a tool starts a transaction. It may subsequently decide to either commit or abort the effects of its operation when it terminates the transaction. During the transaction it may activate a second tool to carry out a part of its function. It should be possible for the operation of the second tool to be specified as part of the identified set of the transaction. If so identified, then any transactions started by the second tool are said to be nested and the transaction of the first tool is said to be the outer transaction. The operation of the second tool must be unaffected by the existence of the outer transaction, except that the commit operation of the nested transaction must be reversible should the outer transaction be aborted.

The nesting of transactions is described in [Moss81].
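The behaviour described for nested transactions can be sketched as follows. This is a minimal Python illustration under stated assumptions: the `Store` class, its snapshot stack and the method names `start`, `commit` and `abort` are hypothetical and are not taken from [Moss81] or from any PCIS design. It shows a nested commit being reversed when the outer transaction is aborted.

```python
class Store:
    """Toy object store with nested transactions based on snapshots."""

    def __init__(self):
        self.data = {}
        self._snapshots = []   # one saved state per open transaction

    def start(self):
        self._snapshots.append(dict(self.data))

    def commit(self):
        # Committing a nested transaction discards only its own snapshot;
        # the snapshot of the enclosing transaction still allows an undo.
        self._snapshots.pop()

    def abort(self):
        self.data = self._snapshots.pop()


store = Store()
store.data["library"] = "v1"

store.start()                    # outer transaction (first tool)
store.data["library"] = "v2"
store.start()                    # nested transaction (second tool)
store.data["report"] = "ok"
store.commit()                   # the second tool commits its work
after_nested_commit = dict(store.data)

store.abort()                    # the first tool aborts the outer transaction
after_outer_abort = dict(store.data)
```

The second tool runs exactly as it would outside any transaction; only the permanence of its commit depends on the fate of the outer transaction.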

4.5C Transaction Control. The PCIS shall support facilities to start, commit and abort transactions.

It would be preferable for a tool to be able to use some operations that are not part of a transaction while that transaction is in progress. For example, a compilation might be done as part of a transaction. If the compilation failed, a simple abort of that transaction would restore the program library to its original state. However, it is essential that the error listing file be written by operations that are not part of the transaction, since otherwise the listing file would be destroyed by the abort.
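The compilation example can be sketched as follows. This is a hypothetical Python illustration: the `Library` class, the `compile_unit` function and the way a failure is detected are assumptions made purely to show the shape of the behaviour. The program library is restored on abort, while the error listing, written outside the transaction, survives.

```python
class Library:
    def __init__(self):
        self.units = {}     # program library: protected by the transaction
        self.listing = []   # error listing: written outside the transaction


def compile_unit(lib, name, source):
    """Compile `source` into the library; return True on commit, False on abort."""
    saved = dict(lib.units)                        # start of transaction
    try:
        lib.units[name] = "partial object code"    # inside the transaction
        if "error" in source:
            # Deliberately NOT part of the transaction, so it survives an abort.
            lib.listing.append(f"{name}: syntax error")
            raise ValueError("compilation failed")
        lib.units[name] = "object code"
        return True                                # commit
    except ValueError:
        lib.units = saved                          # abort: restore the library
        return False


lib = Library()
first = compile_unit(lib, "main", "procedure Main is begin null; end;")
second = compile_unit(lib, "util", "this source contains an error")
```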

Some operations might not be undoable, for example, output sent directly to a line printer. Such operations are not eligible for inclusion within a transaction. Whether such things as the data displayed on a screen are restored to their previous state on abort is left as a PCIS designer's choice.

The "start transaction" and "end transaction" operations mentioned in the rationale to 4.5D are intended to be discrete operations, not side effects or optional parts of some other operations. Specifically, making the span of a transaction always identical to that of the process executing it, by adding a boolean input parameter TRANSACTION to the "start process" operation and providing no other way to start a transaction, is not a good approach. It is thought to make the "grain" or "size" of a transaction too large and to be an unnecessary restriction. A given process should be able to perform several transactions within its lifetime. These may be serial in time, nested or overlapping.

As indicated in the example above, it is expected that a transaction would span a complete compilation. In fact it would be quite reasonable, in a PSE, for a set of compilations and a system build followed by a system test all to be carried out within a single transaction. Transactions may indeed entail storage of large amounts of backup or recovery data, and ultimately this will reach limits such as machine or disc limits. However, it is not mandated that transactions span separate user or batch job login sessions.

4.5D Identification of Transaction Operations. The PCIS shall support a mechanism for identifying which operations and nested transactions (that can be aborted) are to be a part of a transaction.

It must be possible to exclude some operations that occur between the start and end of a transaction from the effects of the transaction mechanism. Sequentiality is not necessarily the means of identifying the set of included operations. The word "identifying" is specifically chosen without specifying how. This leads to the concept of a transaction involving an "identified set" of operations, without stipulating how the set is identified or who identifies it; that is a choice for the PCIS designer. For example, all operations between calls on PCIS functions provided for the purpose (that is, "start transaction" and "end transaction") may be deemed to be identified, or alternatively, operations may have an extra parameter to indicate whether they are to be considered as part of the set, or some mixture of these.
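One possible mixture of the two identification schemes is sketched below. The Python is illustrative only: the `Session` class, the `in_transaction` parameter and the undo log are hypothetical names, and a PCIS designer is free to choose a quite different mechanism. Operations between "start transaction" and "end transaction" are included by default, and an extra parameter allows an individual operation to be excluded.

```python
class Session:
    def __init__(self):
        self.data = {}
        self._undo = None   # undo log; None when no transaction is open

    def start_transaction(self):
        self._undo = []

    def write(self, key, value, in_transaction=True):
        # Record undo information only for operations in the identified set.
        if self._undo is not None and in_transaction:
            self._undo.append((key, key in self.data, self.data.get(key)))
        self.data[key] = value

    def end_transaction(self, commit):
        if not commit:
            # Abort: undo the identified operations in reverse order.
            for key, existed, old in reversed(self._undo):
                if existed:
                    self.data[key] = old
                else:
                    self.data.pop(key, None)
        self._undo = None


s = Session()
s.write("a", 1)                            # before any transaction
s.start_transaction()
s.write("a", 2)                            # identified: undone on abort
s.write("log", "x", in_transaction=False)  # excluded: survives the abort
s.write("b", 3)                            # identified: undone on abort
s.end_transaction(commit=False)
```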

Note that the concept of an "identified set" of operations is somewhat of an extension to the normal database concept of a transaction.

4.5E Transaction Granularity. The PCIS shall support transactions which range in length from very few operations to a very large number of operations efficiently.

Transactions may be used in an IPSE in the classical way, to perform a small number of operations atomically, or in a way particular to an IPSE, to perform a very large number of operations (for instance, an Ada compilation). The requirement is to support both forms of transaction efficiently, rather than in terms of other PCIS facilities with large inherent overhead. For example, it must be possible to start multiple transactions within a given PCIS process. It must be possible for the transactions to be serial in time, overlapping in time, or nested. Multiple transactions within a PCIS process are deemed necessary because a given program may need to perform multiple small atomic operations. This should be supported efficiently. Similarly, nested transactions within a PCIS process are deemed necessary because a transaction itself may need to perform multiple small atomic operations.

Concurrent programming languages allow multiple threads of control within a PCIS process. It should be possible for each thread in a multi-thread program to exploit the transaction mechanism. Transactions that overlap in time allow multiple threads of control to exploit the transaction mechanism such that each thread of control can be executing a transaction in parallel.

4.5F Program Independence. The PCIS shall support the activation of a program as a transaction where that program may not have been written to execute as a transaction.

This requirement is in addition to other means of specifying what operations are to be a part of the "identified set". In this case, the whole execution of the program is part of the transaction. Consideration must be given to providing the invoking process with visibility into the transaction, and to delaying any transaction abort past process termination. Such means allow the invoking process to, for example, analyze the causes of failed executions, before the transaction abort eliminates the evidence.
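The idea of delaying the abort so that the invoking process can examine a failed execution can be sketched as follows. This is a hypothetical Python illustration: `run_as_transaction`, the `inspect_failure` callback and the dictionary standing in for the object base are all assumptions. The activated program knows nothing about transactions.

```python
def run_as_transaction(store, program, inspect_failure=None):
    """Run `program(store)` as a single transaction; return True on commit."""
    saved = dict(store)                          # start of transaction
    try:
        program(store)                           # the whole execution is included
    except Exception as exc:
        if inspect_failure is not None:
            # The abort is delayed until the invoker has seen the evidence.
            inspect_failure(dict(store), exc)
        store.clear()
        store.update(saved)                      # abort
        return False
    return True                                  # commit


def faulty_tool(store):
    """A tool written with no knowledge of transactions."""
    store["library"] = "v2"
    store["tmp"] = "half done"
    raise RuntimeError("tool failed")


store = {"library": "v1"}
evidence = []
ok = run_as_transaction(
    store, faulty_tool, lambda state, exc: evidence.append((state, str(exc)))
)
```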

4.5G Program Execution. The PCIS shall support the execution of a program within a transaction as part of that transaction.

This is to allow tool composition so that the activated program, within the context of a transaction, acts in the same manner as a called subprogram within the same transaction.

4.5H Resource Failure. Failure of resources affecting a transaction which is in progress should have the effect of aborting that transaction.

The requirement is to do as well as is possible. Clearly there will be some failures with which the PCIS cannot cope.

4.5I Long Transactions. The PCIS shall support the automation, coordination and control of activities in long term projects.

While long transactions are still an area for experimentation, the automation which they imply for large development processes is clearly necessary. Also, some specific facilities to be provided for their support can be identified. It is likely that most of those facilities are already identified in other requirements. This requirement is a specific direction to consider whether there have been any significant omissions.

4.6 Robustness and Restoration

The PCIS shall support facilities which ensure the robustness of data and the ability to restore data represented in the Object Management System. The facilities shall include at least those required to support the backup and archiving capabilities provided by modern operating systems.

The reader is referred to the definitions of ARCHIVE and BACKUP in the glossary.

Projects tend to have amounts of data that are large in terms of the capacity of existing storage units. Moreover, the need for reliability makes it necessary to keep redundant copies of all data. The OMS must be able to divide the data that constitutes its objects among several different storage units. The economics of the cost of a storage unit versus its speed make it necessary for the OMS to support several different kinds of storage, in order to provide reasonable access at acceptable cost.

Most users moving to PCIS implementations will currently be using "modern operating systems". It is important that a PCIS implementation should provide backup and archive facilities that are not significantly worse than those to which they are accustomed.

4.7 Common External Form

4.7A Representation. The PCIS shall specify a representation on external media of data that can be represented in the OMS; this representation is to be known as the Common External Form.

The PCIS definition is likely to use built-in data types, including character, date, integer and string. Since there will be a number of PCIS implementations on a range of hardware, in order to meet the goal of interoperability, it will be necessary to define a single format for data transfer. The data can then be translated into the external form on the originating machine and can be translated back to the correct internal form on another. It is desirable to have a single external form regardless of the actual transmission medium, and this may limit the choice of formats. For example, some communication systems can handle only subsets or variants of ASCII characters.
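The translation to and from a single external form can be sketched as follows. The format shown is purely hypothetical and is not the Common External Form: the type tags, the use of JSON and the function names are assumptions made for illustration. Integers are written as decimal text to avoid dependence on machine word size, dates are written in ISO 8601 form, and the output is restricted to ASCII so that it can pass through limited communication systems.

```python
import json
from datetime import date


def to_external(value):
    """Tag a built-in value with its type and render it as ASCII-safe text."""
    if isinstance(value, bool):
        raise TypeError("booleans are not covered by this sketch")
    if isinstance(value, int):
        return {"type": "integer", "value": str(value)}
    if isinstance(value, date):
        return {"type": "date", "value": value.isoformat()}
    if isinstance(value, str):
        kind = "character" if len(value) == 1 else "string"
        return {"type": kind, "value": value}
    raise TypeError(f"no external form for {type(value).__name__}")


def from_external(item):
    """Translate one tagged item back to the local internal form."""
    kind, text = item["type"], item["value"]
    if kind == "integer":
        return int(text)
    if kind == "date":
        return date.fromisoformat(text)
    if kind in ("character", "string"):
        return text
    raise ValueError(f"unknown external type {kind!r}")


def encode(values):
    # ensure_ascii=True escapes every non-ASCII character in the output.
    return json.dumps([to_external(v) for v in values], ensure_ascii=True)


def decode(text):
    return [from_external(item) for item in json.loads(text)]


original = [42, date(1990, 5, 1), "A", "café"]
external = encode(original)    # text suitable for any ASCII-only medium
restored = decode(external)
```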

Bulk data cannot, by implication, be infallibly transferred between PCIS implementations on different architectures using only the facilities within the PCIS.

4.7B Export. The PCIS shall support the transfer of data from the OMS of Section 4 to external media in the Common External Form. All information (including relationships and attribute values) in the part of the OMS transferred shall be preserved on the external medium in the Common External Form.

The successful transfer of all information in an OMS starts with the capture of that information in the Common External Form. This capture must include not only the objects and their attributes, but also the particular attribute values, the OMS structure as represented by relationships, and all of the related typing information. This requirement also points out that the OMS transfer is not limited to the whole OMS. It must be possible to transport either large or small portions of the information represented in the OMS.
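Exporting a portion of the OMS together with its typing information and relationships can be sketched as follows. The Python below is a hypothetical illustration: the in-memory layout of `oms`, the object and relationship names, and the JSON output are assumptions, not a proposed Common External Form. One design choice is assumed here, namely that only relationships with both ends inside the exported portion are written.

```python
import json

# A toy object base: typed objects with attributes, plus named relationships.
oms = {
    "objects": {
        "src1": {"type": "source", "attributes": {"name": "main.ada"}},
        "obj1": {"type": "object_code", "attributes": {"size": 2048}},
        "doc1": {"type": "document", "attributes": {"title": "Design notes"}},
    },
    "relationships": [
        ("obj1", "compiled_from", "src1"),
        ("doc1", "describes", "src1"),
    ],
}


def export_subset(oms, ids):
    """Write the selected objects, their types and their relationships."""
    ids = set(ids)
    objects = {i: oms["objects"][i] for i in sorted(ids)}
    # Keep only relationships whose two ends are both being exported.
    relationships = [
        list(r) for r in oms["relationships"] if r[0] in ids and r[2] in ids
    ]
    types = sorted({obj["type"] for obj in objects.values()})
    return json.dumps(
        {"types": types, "objects": objects, "relationships": relationships},
        ensure_ascii=True,
        sort_keys=True,
    )


exported = json.loads(export_subset(oms, ["src1", "obj1"]))
```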

4.7C Import. The PCIS shall support the transfer of data from an external medium in the Common External Form to the OMS of Section 4. The PCIS shall preserve all information on such a transfer, except where this is not possible because of different representations of data on the systems involved.

This requirement complements Requirement 4.7B (Export). The goal of importation is to represent the transferred information in such a way that a transported tool could use it successfully to achieve its purpose in the new IPSE.

Problems can arise where the internal form on some machine cannot represent the full range of the external form. For example, the floating point number representation on one machine may have greater range or precision capability than on another machine.
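The floating point case can be illustrated by an importing system that stores only 32-bit values. The sketch below is a Python illustration under that assumption; the function name and the three status strings are hypothetical. It uses the standard `struct` module to round a value to single precision and reports whether the information was preserved.

```python
import struct


def import_as_single_precision(value):
    """Return (stored_value, status) for a target limited to 32-bit floats."""
    try:
        # Round-trip through the 32-bit IEEE 754 representation.
        stored = struct.unpack("<f", struct.pack("<f", value))[0]
    except OverflowError:
        return None, "out of range"      # the magnitude cannot be represented
    if stored != value:
        return stored, "precision lost"  # representable only approximately
    return stored, "exact"


exact = import_as_single_precision(0.5)
rounded = import_as_single_precision(0.1)
too_large = import_as_single_precision(1e300)
```

An importing PCIS would have to decide, for each such case, whether to reject the transfer, store the approximation, or record that information was lost.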

4.7D Data Exchange. The PCIS shall support the exchange of data between PCIS environments, and between PCIS and non-PCIS environments.

This requirement can be seen as a combination of Requirement 4.7B (Export) and Requirement 4.7C (Import).

It does, however, make explicit the need to transfer parts of the object base from one environment to another, either through an intermediate medium (for example, an archive tape) or over a communication line. Furthermore, it does not restrict the exchange to other PCIS environments.
