
[Solved]: What is the difference between Abstract Data Types and objects?

Problem Detail: 

An answer on Programmers.SE characterizes an essay by Cook (Objects are not ADTs) as saying

  • Objects behave like a characteristic function over the values of a type, rather than as an algebra. Objects use procedural abstraction rather than type abstraction

  • ADTs usually have a unique implementation in a program. When one's language has modules, it's possible to have multiple implementations of an ADT, but they can't usually interoperate.

It seems to me that, in Cook's essay, it just happens to be the case that for the specific example of a set, an object can be viewed as a characteristic function. I don't think that objects, in general, can be viewed as characteristic functions.
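For the set example specifically, the characteristic-function reading can be sketched in Java roughly as follows. This is only an illustration: the names IntSet, Sets, empty, insert, union, and evens are hypothetical and not copied from Cook's essay, and the sketch does not settle whether objects in general reduce to characteristic functions.

```java
// Hypothetical sketch: a set object whose only observation is membership.
interface IntSet {
    boolean contains(int i); // the characteristic function of the set
}

final class Sets {
    private Sets() {}

    static IntSet empty() {
        return i -> false;
    }

    static IntSet insert(IntSet s, int n) {
        return i -> i == n || s.contains(i);
    }

    // Works for any two IntSet values, however each one is implemented.
    static IntSet union(IntSet a, IntSet b) {
        return i -> a.contains(i) || b.contains(i);
    }

    // An infinite set: there is no stored list or tree of elements at all.
    static IntSet evens() {
        return i -> i % 2 == 0;
    }
}
```

Here every set is nothing but its membership test, which is why the set example lends itself so well to the characteristic-function view.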

Also, Aldrich's paper The power of interoperability: Why objects are inevitable¹ suggests

Cook's definition essentially identifies dynamic dispatch as the most important characteristic of object

agreeing with this and with Alan Kay when he said

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things.

However, these companion lecture slides to Aldrich's paper suggest that Java classes are ADTs while Java interfaces are objects -- and indeed using interfaces "objects" can inter-operate (one of the key features of OOP as given by one of the bullet points above).

My questions are

  1. Am I correct to say that characteristic functions are not a key feature of objects and that Frank Shearar is mistaken?

  2. Are data that talk to each other through Java interfaces examples of objects even though they don't use dynamic dispatch? Why? (My understanding is that dynamic dispatch is more flexible, and that interfaces are a step towards Objective-C/Smalltalk/Erlang style messaging.)

  3. Is the dependency inversion principle related to the distinction between ADTs and objects? (See the Wikipedia page or The Talking Objects: A Tale About Message-Oriented Programming.) Although I'm new to the concept, I understand that it involves adding interfaces between "layers" of a program (see the diagram on the Wikipedia page).

  4. Please provide any other examples/clarifications of the distinction between objects and ADTs, if you want.

¹ This paper (published in 2013) is easy reading and summarizes Cook's 2009 paper with examples in Java. I highly recommend at least skimming it, not to answer this question, but just because it's a good paper.

Asked By : LMZ

Answered By : LMZ

Google brought up a similar question with an answer that I think is very good. I've quoted it below.

There's another distinction lurking here that is explained in the Cook essay I linked.

Objects are not the only way to implement abstraction. Not everything is an object. Objects implement something which some people call procedural data abstraction. Abstract data types implement a different form of abstraction.

A key difference appears when you consider binary methods/functions. With procedural data abstraction (objects), you might write something like this for an Int set interface:

    interface IntSet {
      void unionWith(IntSet s);
      ...
    }

Now consider two implementations of IntSet, say one that's backed by lists and one that's backed by a more efficient binary tree structure:

    class ListIntSet implements IntSet {
      public void unionWith(IntSet s){ ... }
    }

    class BSTIntSet implements IntSet {
      public void unionWith(IntSet s){ ... }
    }

Notice that unionWith must take an IntSet argument. Not the more specific type like ListIntSet or BSTIntSet. This means that the BSTIntSet implementation cannot assume that its input is a BSTIntSet and use that fact to give an efficient implementation. (It could use some run time type information to check it and use a more efficient algorithm if it is, but it still could be passed a ListIntSet and have to fall back to a less efficient algorithm).

Compare this to ADTs, where you may write something more like the following in a signature or header file:

    typedef struct IntSetStruct *IntSetType;
    void union(IntSetType s1, IntSetType s2);

We program against this interface. Notably, the type is left abstract. You don't get to know what it is. Then we have a BST implementation that provides a concrete type and operations:

    struct IntSetStruct {
      int value;
      struct IntSetStruct* left;
      struct IntSetStruct* right;
    };

    void union(IntSetType s1, IntSetType s2){ ... }

Now union actually knows the concrete representations of both s1 and s2, so it can exploit this for an efficient implementation. We can also write a list backed implementation and choose to link with that instead.

I've written C(ish) syntax, but you should look at e.g. Standard ML for abstract data types done properly (where you can e.g. actually use more than one implementation of an ADT in the same program roughly by qualifying the types: BSTImpl.IntSetStruct and ListImpl.IntSetStruct, say)

The converse of this is that procedural data abstraction (objects) allows you to easily introduce new implementations that work with your old ones, e.g. you can write your own custom LoggingIntSet implementation and union it with a BSTIntSet. But this is a trade-off: you lose informative types for binary methods! Often you end up having to expose more functionality and implementation details in your interface than you would with an ADT implementation. Now I feel like I'm just retyping the Cook essay, so really, read it!
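Both sides of this trade-off can be sketched in Java. The sketch below is an assumption-laden illustration, not code from the quoted answer: the methods contains() and elements() are added to the interface so that the example can run, java.util.TreeSet stands in for a hand-written binary search tree, and the LoggingIntSet wrapper and InteropDemo class are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.TreeSet;

// Assumption: contains() and elements() are additions for illustration;
// the quoted interface only shows unionWith.
interface IntSet {
    boolean contains(int i);
    Iterable<Integer> elements();
    void unionWith(IntSet s);
}

class BSTIntSet implements IntSet {
    // java.util.TreeSet stands in for a hand-written binary search tree.
    private final TreeSet<Integer> tree = new TreeSet<>();

    BSTIntSet(int... values) {
        for (int v : values) tree.add(v);
    }

    @Override public boolean contains(int i) { return tree.contains(i); }

    @Override public Iterable<Integer> elements() {
        return Collections.unmodifiableSet(tree);
    }

    @Override public void unionWith(IntSet s) {
        if (s instanceof BSTIntSet) {
            // Run-time type check succeeded: use the other tree directly.
            tree.addAll(((BSTIntSet) s).tree);
        } else {
            // Unknown implementation: only the interface is available.
            for (int v : s.elements()) tree.add(v);
        }
    }
}

// A new implementation written later, without touching BSTIntSet.
class LoggingIntSet implements IntSet {
    private final IntSet inner;
    private final List<String> log = new ArrayList<>();

    LoggingIntSet(IntSet inner) { this.inner = inner; }

    List<String> log() { return Collections.unmodifiableList(log); }

    @Override public boolean contains(int i) {
        log.add("contains(" + i + ")");
        return inner.contains(i);
    }

    @Override public Iterable<Integer> elements() {
        log.add("elements()");
        return inner.elements();
    }

    @Override public void unionWith(IntSet s) {
        log.add("unionWith");
        inner.unionWith(s);
    }
}

class InteropDemo {
    public static void main(String[] args) {
        IntSet bst = new BSTIntSet(1, 2);
        LoggingIntSet logged = new LoggingIntSet(new BSTIntSet(3));
        bst.unionWith(logged); // interoperates, but only via the fallback path
        System.out.println(bst.contains(3) + " " + logged.log());
    }
}
```

The union of a BSTIntSet with a LoggingIntSet works without any change to BSTIntSet, but it cannot use the representation-aware branch, even though the wrapped set happens to be a tree.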

I would like to add an example to this.

Cook suggests that an example of an abstract data type is a module in C. Indeed, modules in C involve information hiding, since there are public functions that are exported through a header file, and static (private) functions that don't. Additionally, often there are constructors (e.g. list_new()) and observers (e.g. list_getListHead()).

A key point of what makes, say, a list module called LIST_MODULE_SINGLY_LINKED an ADT is that the functions of the module (e.g. list_getListHead()) assume that the data being input has been created by the constructor of LIST_MODULE_SINGLY_LINKED, as opposed to any "equivalent" implementation of a list (e.g. LIST_MODULE_DYNAMIC_ARRAY). This means that the functions of LIST_MODULE_SINGLY_LINKED can assume, in their implementation, a particular representation (e.g. a singly linked list).

LIST_MODULE_SINGLY_LINKED cannot inter-operate with LIST_MODULE_DYNAMIC_ARRAY: we can't feed data created with, say, the constructor of LIST_MODULE_DYNAMIC_ARRAY to an observer of LIST_MODULE_SINGLY_LINKED, because LIST_MODULE_SINGLY_LINKED assumes a representation for a list (as opposed to an object, which only assumes a behaviour).

This is analogous to the way that two different groups from abstract algebra cannot interoperate (that is, you can't take the product of an element of one group with an element of another group). This is because a group's product is only defined on, and closed over, the elements of that group. However, if we can prove that two different groups are in fact subgroups of another group G, then we can use the product of G to combine two elements, one from each of the two groups.

Comparing ADTs and objects

  • Cook ties the difference between ADTs and objects partially to the expression problem. Roughly speaking, ADTs are coupled with generic functions, which are often implemented in functional programming languages, while objects are coupled with Java "objects" accessed through interfaces. For the purposes of this text, a generic function is a function that takes in some arguments ARGS and a type TYPE (pre-condition); based on TYPE it selects the appropriate function and evaluates it with ARGS (post-condition). Both generic functions and objects implement polymorphism, but with generic functions, the programmer KNOWS which function will be executed by the generic function without looking at the code of the generic function. With objects, on the other hand, the programmer does not know how the object will handle the arguments unless the programmer looks at the code of the object.

  • Usually the expression problem is thought of in terms of "do I have lots of representations?" vs. "do I have lots of functions with few representations?". In the first case one should organize code by representation (as is most common, especially in Java). In the second case one should organize code by functions (i.e. having a single generic function handle multiple representations).

  • If you organize your code by representation, then, if you want to add extra functionality, you are forced to add the functionality to every representation of the object; in this sense adding functionality is not "additive". If you organize your code by functionality, then, if you want to add an extra representation, you are forced to add the representation to every function; in this sense adding representations is not "additive".
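The two ways of organizing code can be sketched side by side with a small, hypothetical example (Shape, Circle, Rect, and ShapeFunctions are invented for illustration and do not come from Cook's essay). Here area is organized by representation, while perimeter is organized by function.

```java
// Organized by representation: each class carries its own behaviour.
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double radius;

    Circle(double radius) { this.radius = radius; }

    @Override public double area() { return Math.PI * radius * radius; }
}

final class Rect implements Shape {
    final double width;
    final double height;

    Rect(double width, double height) {
        this.width = width;
        this.height = height;
    }

    @Override public double area() { return width * height; }
}

// Organized by function: one function handles every known representation.
final class ShapeFunctions {
    private ShapeFunctions() {}

    static double perimeter(Shape s) {
        if (s instanceof Circle) {
            return 2 * Math.PI * ((Circle) s).radius;
        } else if (s instanceof Rect) {
            Rect r = (Rect) s;
            return 2 * (r.width + r.height);
        }
        // A representation added later lands here until this function is edited.
        throw new IllegalArgumentException("unknown representation: " + s);
    }
}
```

Adding a new shape only requires a new class as far as area is concerned, but perimeter (and every function written in that style) must be edited. Conversely, adding a new function in the style of perimeter touches no existing class, whereas adding a new method to Shape forces a change in every implementing class.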

Advantage of ADTs over objects

  • Adding functionality is additive

  • Possible to leverage knowledge of the representation of an ADT for performance, or to prove that the ADT will guarantee some postcondition given a precondition. This means that programming with ADTs is about doing the right things in the right order (chaining together preconditions and postconditions towards a "goal" postcondition).

Advantages of objects over ADTs

  • Adding representations is additive

  • Objects can inter-operate

  • It's possible to specify pre/post conditions for an object, and chain these together as is the case with ADTs. In this case, the advantages of objects are that (1) it's easy to change representations without changing the interface and (2) objects can inter-operate. However, this defeats the purpose of OOP in the sense of Smalltalk (see the section "Addendum: Alan Kay's version of OOP").

Dynamic dispatch is key to OOP

It should be apparent now that dynamic dispatch (i.e. late binding) is essential for object-oriented programming. This is so that it's possible to define procedures in a generic way that doesn't assume a particular representation. To be concrete: object-oriented programming is easy in Python, because it's possible to write the methods of an object in a way that doesn't assume a particular representation. This is why Python doesn't need interfaces like Java.

In Java, classes are ADTs. However, a class accessed through the interface it implements is an object.
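A minimal sketch of this distinction, using hypothetical types (Counter, IntCounter, and TallyCounter are invented for illustration and are not taken from Cook's or Aldrich's papers):

```java
// Object view: clients see behaviour only.
interface Counter {
    int value();
    boolean sameAs(Counter other);
}

final class IntCounter implements Counter {
    private final int n; // hidden representation

    IntCounter(int n) { this.n = n; }

    @Override public int value() { return n; }

    // ADT style: the parameter is typed by the class, so the method can read
    // other.n directly even though n is private.
    boolean sameCount(IntCounter other) { return n == other.n; }

    // Object style: other may be any Counter, so only its interface is usable.
    @Override public boolean sameAs(Counter other) { return n == other.value(); }
}

final class TallyCounter implements Counter {
    private final String marks; // a different representation: one '|' per count

    TallyCounter(String marks) { this.marks = marks; }

    @Override public int value() { return marks.length(); }

    @Override public boolean sameAs(Counter other) { return value() == other.value(); }
}
```

A call such as a.sameCount(b) is rejected by the compiler when b is a TallyCounter, which is the ADT-style restriction to a single representation. a.sameAs(b) accepts any Counter, at the cost of seeing only what the interface exposes.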

Addendum: Alan Kay's version of OOP

Alan Kay explicitly referred to objects as "families of algebras", and Cook suggests that an ADT is an algebra. Hence Kay likely meant that an object is a family of ADTs. That is, an object is the collection of all classes that satisfy a Java interface.

However, the picture of objects painted by Cook is far more restrictive than Alan Kay's vision. Kay wanted objects to behave as computers in a network, or as biological cells. The idea was to apply the principle of least commitment to programming, so that it's easy to change low level layers of an ADT once the high level layers have been built using them. With this picture in mind, Java interfaces are too restrictive because they don't allow an object to interpret the meaning of a message, or even ignore it completely.
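One way to approximate this in Java, purely as a hypothetical sketch, is to give every object a single entry point that receives the message name and its arguments, and to let the receiver decide what, if anything, the message means. The names Receiver, send, and Account below are assumptions made for illustration; this is not how Smalltalk or any particular library implements messaging.

```java
import java.util.Optional;

// Hypothetical: one entry point for all messages instead of a fixed method list.
interface Receiver {
    Optional<Object> send(String selector, Object... args);
}

final class Account implements Receiver {
    private long balance;

    @Override public Optional<Object> send(String selector, Object... args) {
        switch (selector) {
            case "deposit":
                // Assumes the caller passes the amount as a Long.
                balance += (Long) args[0];
                return Optional.<Object>of(balance);
            case "balance":
                return Optional.<Object>of(balance);
            default:
                // The receiver is free to ignore messages it does not understand.
                return Optional.empty();
        }
    }
}
```

A caller can write acct.send("deposit", 5L) or acct.send("fly") without the compiler knowing in advance whether the receiver understands either message. The price is the loss of the static checking that a Java interface provides.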

In summary, the key idea of objects for Kay is not that they are a family of algebras (as is emphasized by Cook). Rather, Kay's key idea was to apply a model that worked in the large (computers in a network) to the small (objects in a program).

edit: Another clarification on Kay's version of OOP: The purpose of objects is to move closer to a declarative ideal. We should tell the object what to do, not tell it how by micromanaging its state, as is customary with procedural programming and ADTs. More info can be found here, here, here, and here.

edit: I found a very, very good exposition of Alan Kay's definition of OOP here.


Question Source : http://cs.stackexchange.com/questions/51847
