.. highlight:: none

.. index::
   pair: protocol inheritance; design

.. _design-protocol:

Protocol inheritance
====================

.. mps:prefix:: design.mps.protocol


Introduction
------------

:mps:tag:`intro` This document explains the design of the support for
class inheritance in MPS.

:mps:tag:`readership` This document is intended for any MPS developer.


Purpose
-------

:mps:tag:`purpose.code-maintain` The purpose of the protocol
inheritance design is to ensure that the MPS code base can make use of
the benefits of object-oriented class inheritance to maximize code
reuse, minimize code maintenance and minimize the use of boilerplate
code.

:mps:tag:`purpose.related` For related discussion, see
`mail.tony.1998-08-28.16-26`_, `mail.tony.1998-09-01.11-38`_,
`mail.tony.1998-10-06.11-03`_ and other messages in the same threads.

.. _mail.tony.1998-10-06.11-03: https://info.ravenbrook.com/project/mps/mail/1998/10/06/11-03/0.txt
.. _mail.tony.1998-09-01.11-38: https://info.ravenbrook.com/project/mps/mail/1998/09/01/11-38/0.txt
.. _mail.tony.1998-08-28.16-26: https://info.ravenbrook.com/project/mps/mail/1998/08/28/16-26/0.txt


Requirements
------------

:mps:tag:`req.implicit` The object system should provide a means for
classes to inherit the methods of their direct superclasses implicitly
for all functions in the protocol without having to write any explicit
code for each inherited function.

:mps:tag:`req.override` There must additionally be a way for classes
to override the methods of their superclasses.

:mps:tag:`req.next-method` As a result of :mps:ref:`.req.implicit`,
classes cannot make static assumptions about methods used by direct
superclasses. The object system must provide a means for classes to
extend (not just replace) the behaviour of protocol functions, such as
a mechanism for invoking the "next-method".

:mps:tag:`req.ideal.extend` The object system must provide a standard
way for classes to implement the protocol supported by their
superclass and additionally add new methods of their own which can be
specialized by subclasses.

:mps:tag:`req.ideal.multiple-inheritance` The object system should
support multiple inheritance such that sub-protocols can be "mixed in"
with several classes which do not themselves support identical
protocols.


Overview
--------

:mps:tag:`overview.inst` The key concept in the design is the
relationship between an "instance" and its "class". Every structure
that participates in the protocol system begins with an
:c:type:`InstStruct` structure that contains a pointer to an
:c:type:`InstClassStruct` that describes it, like this::

      instance           class

    .----------.      .----------.
    |  class   |----->|  class   |
    ------------      ------------
    |   ...    |      |   sig    |
    ------------      ------------
    |   ...    |      |   name   |
    ------------      ------------
    |   ...    |      |superclass|
    ------------      ------------
    |          |      |   ...    |

:mps:tag:`overview.prefix` We make use of the fact that we can cast
between structures with common prefixes, or between structures and
their first members, to provide dynamic typing and subtyping (see
[Kernighan_1988]_, A.8.3).

:mps:tag:`overview.method` The :c:type:`InstClassStruct` is itself at
the start of a class structure, which contains pointers to functions
that can be called to manipulate the instance as an abstract data
type. We refer to these functions as "methods" to distinguish them
from functions not involved in the object-oriented protocol. The macro
``Method`` is provided for calling methods.

:mps:tag:`overview.subclass` An instance structure can be extended by
using it as the first field of another structure, and by overriding
its class pointer with a pointer to a "subclass" that provides
different behavior.

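
The common-prefix technique described above can be illustrated in
plain C. The following is only a minimal sketch, not MPS code: every
name in it (``DemoInst``, ``DemoClass``, ``DemoRange``, ``demoSize``
and so on) is hypothetical, and the class objects are statically
initialized purely for brevity.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical stand-ins for InstStruct and InstClassStruct. */
    typedef struct DemoClassStruct *DemoClass;
    typedef struct DemoInstStruct *DemoInst;

    typedef struct DemoClassStruct {
      const char *name;            /* class name, for debugging */
      DemoClass superclass;        /* NULL at the root */
      int (*size)(DemoInst inst);  /* a "method": a function pointer */
    } DemoClassStruct;

    typedef struct DemoInstStruct {
      DemoClass klass;             /* every instance starts with its class */
    } DemoInstStruct;

    /* A subtype embeds the instance structure as its first field. */
    typedef struct DemoRangeStruct {
      DemoInstStruct instStruct;   /* common prefix: must come first */
      int base, limit;
    } DemoRangeStruct, *DemoRange;

    static int demoInstSize(DemoInst inst)
    {
      (void)inst;
      return 0;
    }

    static int demoRangeSize(DemoInst inst)
    {
      /* Downcast: valid only because inst is the first member of a
         DemoRangeStruct. */
      DemoRange range = (DemoRange)inst;
      return range->limit - range->base;
    }

    static DemoClassStruct demoInstClass =
      { "DemoInst", NULL, demoInstSize };
    static DemoClassStruct demoRangeClass =
      { "DemoRange", &demoInstClass, demoRangeSize };

    /* Dispatch through the class pointer held in the instance. */
    static int demoSize(DemoInst inst)
    {
      return inst->klass->size(inst);
    }

    static int demoRangeExample(void)
    {
      DemoRangeStruct rangeStruct;
      rangeStruct.instStruct.klass = &demoRangeClass;
      rangeStruct.base = 10;
      rangeStruct.limit = 42;
      /* A structure and its first member share an address. */
      assert((void *)&rangeStruct == (void *)&rangeStruct.instStruct);
      /* Upcast by taking the address of the first member. */
      return demoSize(&rangeStruct.instStruct);
    }

In this sketch ``demoRangeExample()`` reaches ``demoRangeSize`` through
the class pointer and returns ``limit - base``. The downcast in
``demoRangeSize`` is unchecked here, unlike the checked conversions the
design provides.
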
:mps:tag:`overview.inherit` Classes inherit the methods from their
superclasses when they are initialized, so by default they have the
same methods as the class from which they inherit. Methods on the
superclass can be re-used, providing polymorphism.

:mps:tag:`overview.inherit.specialize` Classes may specialize the
behaviour of their superclass. They do this by overriding methods or
other fields in the class object.

:mps:tag:`overview.mixin` Groups of related overrides are provided by
"mixins", and this provides a limited form of multiple inheritance.

:mps:tag:`overview.extend` Classes may extend the protocols supported
by their superclasses by adding new fields for methods or other data.
Extending a class creates a new kind of class.

:mps:tag:`overview.kind` Classes are themselves instance objects, and
have classes of their own. A class of a class is referred to as a
"kind", but is not otherwise special. Classes which share the same set
of methods (or other class fields) are instances of the same kind. If
a class is extended, it becomes a member of a different kind. Kinds
allow subtype checking to be applied to classes as well as instances,
to determine whether methods are available.

::

      instance             class            kind
     (e.g. CBS)    (e.g. CBSClass)  (e.g. LandClassClass)
    .----------.      .----------.      .----------.
    |  class   |----->|  class   |----->|  class   |-->InstClassClass
    ------------      ------------      ------------
    |   ...    |      |   sig    |      |   sig    |
    ------------      ------------      ------------
    |   ...    |      |   name   |      |   name   |
    ------------      ------------      ------------
    |   ...    |      |superclass|-.    |superclass|-->InstClassClass
    ------------      ------------ |    ------------
    |          |      |   ...    | |    |   ...    |
                                   |
                                   |
                        LandClass<-'

:mps:tag:`overview.sig.inherit` Instances (and therefore classes) will
contain signatures. Classes must not specialize (override) the
signatures they inherit from their superclasses, as they are used to
check the actual type (not sub- or supertype) of the object they're
in.

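
Inheritance at initialization time, followed by specialization, can
be sketched in plain C as follows. This is an illustrative assumption
about the general shape of the technique, not the code generated by
the MPS macros: the ``Shape`` and ``Square`` classes and all function
names are hypothetical, and the sketch takes no lock, so it is not
thread-safe.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>

    /* Hypothetical class structure: not the MPS InstClassStruct. */
    typedef struct ShapeClassStruct *ShapeClass;

    typedef struct ShapeClassStruct {
      const char *name;         /* class name, for debugging */
      ShapeClass superclass;    /* NULL at the root of the hierarchy */
      int (*dimensions)(void);  /* method inherited unchanged below */
      int (*sides)(void);       /* method overridden below */
    } ShapeClassStruct;

    static int shapeDimensions(void) { return 2; }
    static int shapeSides(void) { return 0; }
    static int squareSides(void) { return 4; }

    /* Initialize klass as a fresh copy of the root class. */
    static void shapeClassInit(ShapeClass klass)
    {
      assert(klass != NULL);
      klass->name = "Shape";
      klass->superclass = NULL;
      klass->dimensions = shapeDimensions;
      klass->sides = shapeSides;
    }

    /* Return the canonical root class, initializing it on first use. */
    static ShapeClass shapeClassGet(void)
    {
      static ShapeClassStruct classStruct;
      static int initialized = 0;
      if (!initialized) {
        shapeClassInit(&classStruct);
        initialized = 1;
      }
      return &classStruct;
    }

    /* Initialize klass as a fresh copy of the subclass. */
    static void squareClassInit(ShapeClass klass)
    {
      shapeClassInit(klass);                /* inherit every field */
      klass->name = "Square";
      klass->superclass = shapeClassGet();  /* canonical parent */
      klass->sides = squareSides;           /* specialize one method */
    }

    /* Return the canonical subclass, initializing it on first use. */
    static ShapeClass squareClassGet(void)
    {
      static ShapeClassStruct classStruct;
      static int initialized = 0;
      if (!initialized) {
        squareClassInit(&classStruct);
        initialized = 1;
      }
      return &classStruct;
    }

Because the subclass initializer first runs the parent initializer on
its own storage, any method it does not assign afterwards (here
``dimensions``) is inherited without further code, while ``sides`` is
overridden.
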
:mps:tag:`overview.sig.extend` When extending an instance or class, it
is normal policy for the new structure to include a new signature as
the last field.

:mps:tag:`overview.superclass` Each class contains a ``superclass``
field. This enables classes to call "next-method".

:mps:tag:`overview.next-method` A specialized method in a class can
make use of an overridden method from a superclass using the
:c:type:`NextMethod` macro, statically naming the superclass.

:mps:tag:`overview.next-method.dynamic` It is possible to write a
method which does not statically know its superclass, and call the
next method by extracting a class from one of its arguments using
``ClassOfPoly`` and finding the superclass using ``SuperclassPoly``.
Debug pool mixins do this. However, this is not fully general, and
combining such methods is likely to cause infinite recursion. Take
care!

:mps:tag:`overview.access` Classes must be initialized by calls to
functions, since there is no way to express overrides statically in
C89. :c:macro:`DEFINE_CLASS` defines an "ensure" function that
initializes and returns the canonical copy of the class. The canonical
copy may reside in static storage, but no MPS code may refer to that
static storage by name.

:mps:tag:`overview.init` In addition to the "ensure" function, each
class must provide an "init" function, which initialises its argument
as a fresh copy of the class. This allows subclasses to derive their
methods and other fields from superclasses.

:mps:tag:`overview.naming` There are some strict naming conventions
which must be followed when defining and using classes. The use is
obligatory because it is assumed by the macros which support the
definition and inheritance mechanism. For every kind ``Foo``, we
insist upon the following naming conventions:

* ``Foo`` names a type that points to a :c:type:`FooStruct`.
* :c:type:`FooStruct` is the type of the instance structure, the first
  field of which is the structure it inherits from (ultimately an
  :c:type:`InstStruct`).
* :c:type:`FooClass` names the type that points to a
  :c:type:`FooClassStruct`.
* :c:type:`FooClassStruct` names the structure for the class pointed
  to by :c:type:`FooStruct`, containing the methods that operate on
  ``Foo``.


Interface
---------

Class declaration
.................

.. c:macro:: DECLARE_CLASS(kind, className)

:mps:tag:`if.declare-class` Class declaration is performed by the
macro :c:macro:`DECLARE_CLASS`, which declares the existence of the
class definition elsewhere. It is intended for use in headers.

Class definition
................

.. c:macro:: DEFINE_CLASS(kind, className, var)

:mps:tag:`if.define-class` Class definition is performed by the macro
:c:macro:`DEFINE_CLASS`. A call to the macro must be followed by a
function body of initialization code. The parameter ``className`` is
used to name the class being defined. The parameter ``var`` is used to
name a local variable, defined by the macro, whose type is that of
classes of kind ``kind``; it refers to the canonical storage for the
class being defined. This variable may be used in the initialization
code. (The macro doesn't just pick a name implicitly because of the
danger of a name clash with other names used by the programmer.) A
call to the macro defines the ensure function for the class along with
some static storage for the canonical class object, and some other
things to ensure the class gets initialized at most once.

Class access
............

.. c:macro:: CLASS(className)

:mps:tag:`if.class` To get the canonical class object, use the
:c:macro:`CLASS` macro, e.g. ``CLASS(Land)``.

Single inheritance
..................

.. c:macro:: INHERIT_CLASS(this, className, parentName)

:mps:tag:`if.inheritance` Class inheritance details must be provided
in the class initialization code (see :mps:ref:`.if.define-class`).
Inheritance is performed by the macro :c:macro:`INHERIT_CLASS`. A call
to this macro will make the class being defined a direct subclass of
``parentName`` by ensuring that all the fields of the embedded parent
class (pointed to by the ``this`` argument) are initialized as the
parent class, and setting the superclass field of ``this`` to be the
canonical parent class object. The parameter ``this`` must be the same
kind as ``parentName``.

Specialization
..............

:mps:tag:`if.specialize` Fields in the class structure must be
assigned explicitly in the class initialization code (see
:mps:ref:`.if.define-class`). This must happen *after* inheritance
details are given (see :mps:ref:`.if.inheritance`), so that overrides
work.

Extension
.........

:mps:tag:`if.extend` To extend the protocol when defining a new class,
a new type must be defined for the class structure. This must embed
the structure for the primarily inherited class as the first field of
the structure. Extension fields in the class structure must be
assigned explicitly in the class initialization code (see
:mps:ref:`.if.define-class`). This should be done *after* the
inheritance details are given for consistency with
:mps:ref:`.if.inheritance`. This is, in fact, how all the useful
classes extend ``Inst``.

:mps:tag:`if.extend.kind` In addition, a class must be defined for the
new kind of class. This is just an unspecialized subclass of the kind
of the class being specialized by the extension. For example::

    typedef struct LandClassStruct {
      InstClassStruct instClass;  /* inherited class */
      LandInsertMethod insert;
      ...
    } LandClassStruct;

    DEFINE_CLASS(Inst, LandClass, class)
    {
      INHERIT_CLASS(class, LandClass, InstClass);
    }

    DEFINE_CLASS(Land, Land, class)
    {
      INHERIT_CLASS(&class->instClass, Land, Inst);
      class->insert = landInsert;
      ...
    }

Methods
.......

.. c:macro:: Method(kind, inst, meth)

:mps:tag:`if.method` To call a method on an instance of a class, use
the ``Method`` macro to retrieve the method.
This macro may assert if the class is not of the kind requested. For
example, to call the ``insert`` method on ``land``::

    res = Method(Land, land, insert)(rangeReturn, land, range);

.. c:macro:: NextMethod(kind, className, meth)

:mps:tag:`if.next-method` To call a method from a superclass of a
class, use the :c:type:`NextMethod` macro to retrieve the method. This
macro may assert if the superclass is not of the kind requested. For
example, the function to split AMS segments wants to split the
segments they are based on, so does::

    res = NextMethod(Seg, AMSSeg, split)(seg, segHi, base, mid, limit);

Conversion
..........

.. c:macro:: IsA(className, inst)

:mps:tag:`if.isa` Returns non-zero iff ``inst`` is an instance of the
class ``className`` or of any of its subclasses.

.. c:macro:: MustBeA(className, inst)

:mps:tag:`if.must-be-a` To convert the C type of an instance to that
of a compatible class (the class of the actual object or any
superclass), use the ``MustBeA`` macro. In hot varieties this macro
performs a fast dynamic type check and will assert if the class is not
compatible. It is like C++ "dynamic_cast" with an assert. In cool
varieties, the class check method is called on the object. For
example, in a specialized Land method in the CBS class::

    static Res cbsInsert(Range rangeReturn, Land land, Range range)
    {
      CBS cbs = MustBeA(CBS, land);
      ...

.. c:macro:: MustBeA_CRITICAL(className, inst)

:mps:tag:`if.must-be-a.critical` When a type check is too expensive in
hot varieties, use ``MustBeA_CRITICAL`` in place of ``MustBeA``. This
only performs the check in cool varieties. Compare with
:c:macro:`AVER_CRITICAL`.

.. c:macro:: CouldBeA(className, inst)

:mps:tag:`if.could-be-a` To make an unsafe conversion equivalent to
``MustBeA``, use the ``CouldBeA`` macro. This is in effect a simple
pointer cast, but it expresses the intention of class compatibility in
the source code.
It is mainly intended for use when initializing an object, when a
class compatibility check would fail, when checking an object, or in
debugging code such as describe methods, where asserting is
inappropriate. It is intended to be equivalent to the C++
``static_cast``, although since this is C there is no actual static
checking, so in fact it's more like ``reinterpret_cast``.

Introspection
.............

:mps:tag:`introspect.c-lang` The design includes a number of
introspection functions for dynamically examining class relationships.
These functions are polymorphic and accept arbitrary subclasses of
:c:type:`InstClass`. C doesn't support such polymorphism. So although
these have the semantics of functions (and could be implemented as
functions in another language with compatible calling conventions)
they are actually implemented as macros. The macros are named as
function-style macros despite the fact that this arguably contravenes
guide.impl.c.macro.method. The justification for this is that this
design is intended to promote the use of polymorphism, and it breaks
the abstraction for the users to need to be aware of what can and
can't be expressed directly in C function syntax. These functions all
have names ending in ``Poly`` to identify them as polymorphic
functions.

.. c:macro:: SuperclassPoly(kind, class)

:mps:tag:`if.superclass-poly` An introspection function which returns
the direct superclass of class object ``class`` as a class of kind
``kind``. This may assert if the superclass is not (a subtype of) the
kind requested.

.. c:macro:: ClassOfPoly(kind, inst)

:mps:tag:`if.class-of-poly` An introspection function which returns
the class of which ``inst`` is a direct instance, as a class of kind
``kind``. This may assert if the class is not (a subtype of) the kind
requested.

.. c:macro:: SetClassOfPoly(inst, class)

:mps:tag:`if.set-class-of-poly` An initialization function that sets
the class of ``inst`` to be ``class``.
This is intended only for use in initialization functions, to
specialize the instance once its fields have been initialized. Each
Init function should call its superclass init, finally reaching
InstInit, and then, once it has set up its fields, use SetClassOfPoly
to set the class and check the instance with its check method. Compare
with `design.mps.sig`_.

.. _`design.mps.sig`: sig

.. c:macro:: IsSubclass(sub, super)

:mps:tag:`if.is-subclass` An introspection function which returns a
:c:type:`Bool` indicating whether ``sub`` is a subclass of ``super``.
That is, it is a predicate for testing subclass relationships.

Protocol guidelines
...................

:mps:tag:`guide.fail` When designing an extensible method which might
fail, the design must permit the correct implementation of the
failure-case code. Typically, a failure might occur in any method in
the chain. Each method is responsible for correctly propagating
failure information supplied by superclass methods and for managing
its own failures. This is not really different from the general MPS
convention for unwinding on error paths. It implies that the design of
a class must include an anti-method for each method that changes the
state of an instance (e.g. by allocating memory) to allow the state to
be reverted in case of a failure. See :mps:ref:`.example.fail` below.

Example
.......

:mps:tag:`example.inheritance` The following example class definition
shows both inheritance and specialization. It shows the definition of
the class ``RankBuf``, which inherits from :c:type:`SegBuf` of kind
:c:type:`Buffer` and has specialized ``varargs`` and ``init`` methods.
::

    DEFINE_CLASS(Buffer, RankBuf, class)
    {
      INHERIT_CLASS(class, RankBuf, SegBuf);
      class->varargs = rankBufVarargs;
      class->init = rankBufInit;
    }

:mps:tag:`example.extension` The following (hypothetical) example
class definition shows inheritance, specialization and also extension.
It shows the definition of the class ``EPDLDebugPool``, which inherits
from ``EPDLPool`` of kind :c:type:`Pool`, but also implements a method
for checking properties of the pool. ::

    typedef struct EPDLDebugPoolClassStruct {
      EPDLPoolClassStruct epdl;
      DebugPoolCheckMethod check;
      Sig sig;
    } EPDLDebugPoolClassStruct;

    typedef EPDLDebugPoolClassStruct *EPDLDebugPoolClass;

    DEFINE_CLASS(Inst, EPDLDebugPoolClass, class)
    {
      INHERIT_CLASS(class, EPDLPoolClass, InstClass);
    }

    DEFINE_CLASS(EPDLDebugPool, EPDLDebugPool, class)
    {
      INHERIT_CLASS(&class->epdl, EPDLDebugPool, EPDLPoolClass);
      class->check = EPDLDebugCheck;
      class->sig = EPDLDebugSig;
    }

:mps:tag:`example.fail` The following example shows the implementation
of failure-case code for an "init" method, making use of the "finish"
anti-method to clean up after a subsequent failure. ::

    static Res AMSSegInit(Seg seg, Pool pool,
                          Addr base, Size size,
                          ArgList args)
    {
      AMS ams = MustBeA(AMSPool, pool);
      Arena arena = PoolArena(pool);
      AMSSeg amsseg;
      Res res;

      /* Initialize the superclass fields first via next-method call */
      res = NextMethod(Seg, AMSSeg, init)(seg, pool, base, size, args);
      if (res != ResOK)
        goto failNextMethod;
      amsseg = CouldBeA(AMSSeg, seg);

      amsseg->grains = size >> ams->grainShift;
      amsseg->freeGrains = amsseg->grains;
      amsseg->oldGrains = (Count)0;
      amsseg->newGrains = (Count)0;
      amsseg->marksChanged = FALSE; /* */
      amsseg->ambiguousFixes = FALSE;

      res = amsCreateTables(ams, &amsseg->allocTable,
                            &amsseg->nongreyTable, &amsseg->nonwhiteTable,
                            arena, amsseg->grains);
      if (res != ResOK)
        goto failCreateTables;

      /* start off using firstFree, see */
      amsseg->allocTableInUse = FALSE;
      amsseg->firstFree = 0;
      amsseg->colourTablesInUse = FALSE;

      amsseg->ams = ams;
      RingInit(&amsseg->segRing);
      RingAppend((ams->allocRing)(ams, SegRankSet(seg), size),
                 &amsseg->segRing);

      SetClassOfPoly(seg, CLASS(AMSSeg));
      amsseg->sig = AMSSegSig;
      AVERC(AMSSeg, amsseg);

      return ResOK;

    failCreateTables:
      NextMethod(Seg, AMSSeg, finish)(seg);
    failNextMethod:
      AVER(res != ResOK);
      return res;
    }


Implementation
--------------

:mps:tag:`impl.define-class.lock` The :c:macro:`DEFINE_CLASS` macro
ensures that each class is initialized at most once (even in
multi-threaded programs) by claiming the global recursive lock (see
design.mps.thread-safety.arch.global.recursive_).

.. _design.mps.thread-safety.arch.global.recursive: thread-safety.html#design.mps.thread-safety.arch.global.recursive

:mps:tag:`impl.derived-names` The :c:func:`DEFINE_CLASS()` macro
derives some additional names from the class name as part of its
implementation. These should not appear in the source code, but it may
be useful to know about this for debugging purposes. For each class
definition for class :c:type:`SomeClass` of kind ``SomeKind``, the
macro defines the following:

* ``extern SomeKind SomeClassGet(void);``

  The class ensure function. See :mps:ref:`.overview.naming`. This
  function handles local static storage for the canonical class object
  and a guardian to ensure the storage is initialized at most once.
  This function is invoked by the :c:macro:`CLASS` macro
  (:mps:ref:`.if.class`).

* ``static void SomeClassInit(SomeKind);``

  A function called by :c:func:`SomeClassGet()`. All the class
  initialization code is actually in this function.

:mps:tag:`impl.subclass` The subclass test :mps:ref:`.if.is-subclass`
is implemented using an array of superclasses [Cohen_1991]_ giving a
fast constant-time test. (RB_ tried an approach using prime factors
[Gibbs_2004]_ but found that they overflowed in 32-bits too easily to
be useful.) Each class is assigned a "level" which is the distance
from the root of the class hierarchy. The :c:type:`InstClass`
structure contains an array of class ids indexed by level,
representing the inheritance of this class. A class is a subclass of
another if and only if the superclass id is present in the array at
the superclass level.
The level is statically defined using enum constants, and the id is
the address of the canonical class object, so the test is fast and
simple.

.. _RB: https://www.ravenbrook.com/consultants/rb/


Common instance methods
-----------------------

:mps:tag:`method` These methods are available on all instances.

.. c:type:: void (*FinishMethod)(Inst inst)

:mps:tag:`method.finish` The ``finish`` method should finish the
instance data structure (releasing any resources that were acquired by
the instance during its lifetime) and then call its superclass method
via the :c:func:`NextMethod()` macro.

.. c:type:: Res (*DescribeMethod)(Inst inst, mps_lib_FILE *stream, Count depth)

:mps:tag:`method.describe` The ``describe`` method should print out a
description of the instance to ``stream`` (by calling
:c:func:`WriteF()`).


References
----------

.. [Cohen_1991] "Type-Extension Type Tests Can Be Performed In
   Constant Time"; Norman H Cohen; IBM Thomas J Watson Research
   Center; ACM Transactions on Programming Languages and Systems,
   Vol. 13 No. 4, pp. 626-629; 1991-10.

.. [Gibbs_2004] Michael Gibbs, Bjarne Stroustrup. 2004. "Fast Dynamic
   Casting".

.. [Kernighan_1988] Brian W. Kernighan, Dennis M. Ritchie. 1988. "The
   C Programming Language, 2nd Edition".