Object-Orientation and Smalltalk (Part 3)

Object-oriented design

Classes

Typically, in an object-oriented program, every class has one of two roles.

Abstract class
- Meant to be inherited from
- Not meant to be instantiated
- Some subclassResponsibility methods
  Note, in Smalltalk, we use methods that perform (self subclassResponsibility) to indicate a method that should be overridden in a concrete sub-class. This leads to (slightly) better error messages when a sub-class fails to override the method.
  
  For example,
  
  -> (class BadTrue [subclass-of Boolean]) (1) -> (val badTrue (new BadTrue)) <BadTrue> -> (badTrue not) Run-time error: subclass failed to implement a method that was its responsibility -> (badTrue neg) Run-time error: BadTrue does not understand message neg
  
  1 Define BadTrue as a sub-class of Boolean with no instance variables and no methods (other than those inherited from Boolean).
- Examples: Boolean, Magnitude, Collection, …
Concrete class
- Meant to be instantiated
- May or may not meant to be inherited from
  
  Note, in Smalltalk, there is no way to prevent a class from being used as a super-class for some new sub-class.
- No subclassResponsibility methods
  
  The class does not define any methods that perform (self subclassResponsibility).
  
  The class does override any super-class methods that that perform (self subclassResponsibility).
- Examples: True, Fraction, List, …

Abstract and concrete classes are all about reuse of implementation! And the key to successful reuse is a well-designed class hierarchy.

Implement as much as possible in the abstract class(es), in terms of a small set of "critical" abstract methods. Implement specialized behavior in concrete subclass by providing an implementation of the abstract methods and inheriting the implementations of other methods.

Occasionally, there may be opportunities for a concrete subclass to override a non-abstract method of the abstract superclass. Typically, this occurs when there is a (significantly) more efficient implementation of the method utilizing specific aspects of the concrete subclass than the general-purpose implementation from the abstract superclass.

Exercises

How does type checking in Java help to enforce the distinction between an abstract class and a concrete class?
Are there other ways that type checking in Java helps to protect against misuse of class hierarchies?

uSmalltalk Basis

Smalltalk, the language, is quite small; most of the "magic" of Smalltalk occurs in the standard library.

Smalltalk-80 "blue book" devotes about 90 pages on the language definition (syntax and semantics), but about 300 pages on the standard library.

We’ve looked at the Boolean class hierarchy, which exemplifies the "All control-flow is message send" slogan.

Today, we’ll look at the Collection class hierarchy and, briefly, at the Number class hierarchy. Both are somewhat simplified in uSmalltalk relative to Smalltalk-80, but nonetheless are good examples of object-oriented design.

`Collection` class hierarchy

digraph {
rankdir = BT;

subgraph {
rank = same;
"Collection" [label="Collection\n(abstract)"];
}

subgraph {
rank = same;
"Set" [label="Set\n(concrete)"];
"KeyedCollection" [label="KeyedCollection\n(abstract)"];
}

subgraph {
rank = same;
"Dictionary" [label="Dictionary\n(concrete)"];
"SequenceableCollection" [label="SequenceableCollection\n(abstract)"];
}

subgraph {
rank = same;
"List" [label="List\n(concrete)"];
"Array" [label="Array\n(concrete)"];
}

"Set" -> "Collection";
"KeyedCollection" -> "Collection";
"Dictionary" -> "KeyedCollection";
"SequenceableCollection" -> "KeyedCollection";
"List" -> "SequenceableCollection";
"Array" -> "SequenceableCollection";

"Set" -> "List" [style=dotted;constraint=false];
"Dictionary" -> "List" [style=dotted;constraint=false];

}

In the above diagram, a solid arrow means "is a subclass of" (for example, Set is a subclass of Collection), while a dotted arrow means "is a client of" (for example, Set is a client of List; more specifically, the Set class declares an instance variable that is a List object).

Collection (abstract)

contains objects

Set (concrete)

contains objects in no particular order, with no duplicates

KeyedCollection (abstract)

contains objects associated with keys (alt., contains key-value pairs)

Dictionary (concrete)

keys are arbitrary objects, need only to be comparable for equality

SequenceableCollection (abstract)

keys are consecutive integers

List (concrete)

sequence that can grow and shrink, but linear-time access

Array (concrete)

sequence that has fixed size, but constant-time access

`Collection` protocol

mutators

add: anObject (abstract)

add the argument object to the receiving collection

addAll: aCollection

add each element of the argument collection to the receiving collection

remove: oldObject

remove an element equal to the argument object from the receiving collection; error if absent

remove:ifAbsent: anObject exnBlock (abstract)

remove an element equal to the argument object from the receiving collection; eval exnBlock if absent

removeAll: aCollection

remove each element of the argument collection from the receiving collection; error if any absent

observers

isEmpty

is the receiving collection empty?

size

answer the number of elements in thee receiving collection

includes: anObject

does receiving collection contain an element equal to the argument object?

occurrencesOf: anObject

answer how many elements of the receiving collection are equal to the argument object

detect: aBlock

answer the first element of the receceiving collection satisfying the argument block; error if none; aka, uScheme’s exists?

detect:ifNone: aBlock exnBlock

answer the first element of the receceiving collection satisfying the argument block; eval exnBlock if none

= aCollection

answer whether the contents of the receiving collection are equivalent to the contents of the argument collection

iterators

do: aBlock (abstract)

eval the argument block on each element of the receiving collection

inject:into: anObject aBinaryBlock

answer the result of combining elements using the argument block, starting with the argument object; aka, uScheme’s foldl

select: aBlock

answer a new collection (like the receiving collection) containing the elements x of the receiving collection for which (aBlock value: x) answers true; aka, uScheme’s filter

reject: aBlock

answer a new collection (like the receiving collection) containing the elements x of the receiving collection for which (aBlock value: x) answers false; aka, uScheme’s filter

collect: aBlock

answer a new collection (like the receiving collection) containing (aBlock value: x) for each element x of the receiving collection; aka, uScheme’s map

private

species (abstract)

answer the concrete sub-class of the receiving collection used to create new collections like the receiving collection

Collection class protocol;

creators

new

answer an empty collection (abstract)

with: anObject

answer a singleton collection containing the argument object

withAll: aCollection

answer a collection containing all of the elements of the argument collection

A few points. In contrast to the immutable data structures of uScheme and Standard ML, the Collection classes correspond to mutable data structures. Thus, there are a number of mutator methods that change the collection itself. Indeed, there are very few creator class methods; instead, one is meant to start with an empty collection and mutate the collection to contain the elements of interest. The observer methods do not mutate the collection itself, but simply inspects the collection to determine some property of its elements.

The iterators could be considered observer methods, but are rich enough to deserve their own category. These methods use a block argument to process each of the elements of the collection.

Examples

-> (val i 0)
0
-> (val s1 (Set new))
Set( )
-> ({((s1 size) = 10)} whileFalse: {(s1 add: i) (set i (i + 1))})
nil
-> s1
Set( 0 1 2 3 4 5 6 7 8 9 )
-> (s1 add: 8)
Set( 0 1 2 3 4 5 6 7 8 9 )
-> (s1 add: 9)
Set( 0 1 2 3 4 5 6 7 8 9 )
-> (s1 add: 10)
Set( 0 1 2 3 4 5 6 7 8 9 10 )
-> (((s1 add: 13) add: 12) add: 11)
Set( 0 1 2 3 4 5 6 7 8 9 10 13 12 11 )
-> (val s2 (Set new))
Set( )
-> (set i 0)
0
-> ({(((s2 size) = 5) not)} whileTrue: {(s2 add: i) (set i (i + 2))})
nil
-> s2
Set( 0 2 4 6 8 )
-> (s1 removeAll: s2)
Set( 1 3 5 7 9 10 13 12 11 )

Note that the mutator methods (e.g., add: and removeAll:) answer the receiver, supporting the chaining idiom.

-> (val s1 (Set new))
Set( )
-> (((s1 add: 1) add: 2) add: 3)
Set( 1 2 3 )
-> (val s2 (Set new))
Set( )
-> (((s2 add: 3) add: 2) add: 1)
Set( 3 2 1 )
-> (s1 == s2)
<False>
-> (s1 = s2)
<True>
-> (s2 add: 4)
Set( 3 2 1 4 )
-> (s1 == s2)
<False>
-> (s1 = s2)
<False>

Note that the elements of a Set are not sorted.

Recall that == corresponds to "object identity", while = answers whether the contents of the receiver are equal to the contents of the argument right now. Whether or not two collections are equal depends on their current contents and can be different at different times during execution.

-> (val i 0)
0
-> (val l (List new))
List( )
-> ({((l size) = 10)} whileFalse: {(l add: (i * i)) (set i (i + 3))})
nil
-> l
List( 0 9 36 81 144 225 324 441 576 729 )
-> (l detect: [block (x) ((x mod: 7) = 0)])
0
-> (l remove: 0)
0
-> (l detect: [block (x) ((x mod: 7) = 0)])
441
-> (l includes: 223)
<False>
-> (l includes: 225)
<True>
-> (l detect: [block (x) (x < 0)])
Run-time error: no-object-detected
-> (l detect:ifNone: [block (x) (x < 0)] nil)
Run-time error: UndefinedObject does not understand message value
-> (l detect:ifNone: [block (x) (x < 0)] {nil})
nil

Note that detect: reports an error if the collection does not have an element satisfying the block. In contrast, detect:ifNone: invokes an exception block when the collection does not have an element satisfying the block. Remember: the second argument of detect:ifNone: is a block; in the above, the exception block is used to answer a default value.

-> (val i 0)
0
-> (val l1 (List new))
List( )
-> ({((l1 size) <= 10)} whileTrue: {(l1 add: (i + 1)) (set i (i + 1))})
nil
-> l1
List( 1 2 3 4 5 6 7 8 9 10 11 )
-> (val l2 (l1 reject: [block (x) ((x mod: 2) = 1)]))
List( 2 4 6 8 10 )
-> (l2 inject:into: 0 [block (x y) (x + y)])
30
-> (l2 inject:into: 1 [block (x y) (x * y)])
3840

Note how the inject:into: method behaves like uScheme’s foldl function. In the first case, we add the elements of l2 and in the second case we multiply the elements of l2.

Implementation

We won’t examine the implementation of the whole Collection hierarchy; instead, we focus on how the four abstract methods of the Collection class are used to implement all of the other methods.

The four abstract methods of the Collection class are:

add:
remove:ifAbsent:
do:
species

Some of the concrete methods of the Collection class are simply idiomatic uses of the abstract methods. For example, the remove: method simply sends the remove:ifAbsent: message to itself with an exception block that reports an error:

(method remove: (anObject)
   (self remove:ifAbsent: anObject {(self error: 'remove-was-absent)}))

Other concrete methods of the Collection class are more interesting uses of the abstract methods. The critical method is do:, which encapsulates iteration over all of the elements of a collection, along with the ability of a block-expression to mention (and therefore mutate) any variable in scope (such as local variables, method parameters, and instance variables). For example, the size method can be implemented by initializing a local variable to 0, then iterating through all of the elements of the collection and incrementing the local variable for each one, and finally answering the local variable.

(method size () [locals ans]
  (set ans 0)
  (do: self [block (x) (set ans (ans + 1))])
  ans)

Note that although the block parameter x is not used, it would be incorrect to use {(set ans (ans + 1))} (equivalent to (block () (set ans (ans + 1)))), because do: expects a unary block (a block that is sent the value: message with one argument), but {(set ans (ans + 1))} is a nullary block (a block that expects to be sent the value message with no arguments).

We can generalize this pattern to the inject:into: method, which provides functionality much like uScheme’s foldl function.

(method inject:into: (thisValue binaryBlock)
  (self do: [block (x) (set thisValue (binaryBlock value:value: x thisValue))])
  thisValue)

Note that we do not even need a local variable, since the thisValue parameter can be mutated.

In the above examples, we sent the do: message to self. In the addAll: and removeAll: methods, we send the do: message to another object.

(method addAll: (aCollection)
  (aCollection do: [block (x) (self add: x)])
  self)
(method removeAll: (aCollection)
  (aCollection do: [block x (self remove: x)])
  self)

There is an interesting facet of blocks in the above. Although a block is an object, an occurrence of the self variable in the body expression of a block-expression does not refer to the block object; rather it refers to the receiver object of the method in which the block-expression occurs. Thus, the (self add: x) really does add to object to which received the addAll: message (and does not attempt to send the add: message to a block object).

For an other interesting example, consider the detect:ifNone: aBlock exnBlock, which searches for an element of the collection that satisfies aBlock and, if none is found, evaluates the exnBlock.

(method detect:ifNone: (aBlock exnBlock)
  (self do: [block (x) ((aBlock value: x) ifTrue: {(return x)})])
  (exnBlock value))

There is an interesting facet of blocks in the above. Although a block is an object that evaluates its body expression in response to the value message to yield and answer, an occurrence of return in the body expression of a block-expression does not yield that value as the answer of the value method of the block; rather, it yields that value as the answer of the method in which the block-expression appears.

This leads to a particularly simple implementation of the isEmpty method:

(method isEmpty ()
  (self do: [block (x) (return false)])
  true)

The species method of the Collection class is used to create a collection of the same class as the receiver. It is considered a private method of the Collection class, in the sense that it is meant to be used to implement other methods of the Collection class hierarchy, but is not meant to be used by clients of the Collection class. In each of the concrete subclasses of Collection, the species method is simply overridden to return the concrete class itself. For example, in the List class, the species method is implemented as:

(method species () List)

The species method is used for the iterator methods that build up a collection that is like the receiver.

(method select: (aBlock) [locals ans]
  (set ans ((self species) new))
  (self do: [block (x) ((aBlock value: x) ifTrue: {(ans add: ans)})])
  ans)
(method reject: (aBlock)
  (self select: [block (x) ((aBlock value: x) not)]))
(method collect: (aBlock) [locals ans]
  (set ans ((self species) new))
  (self do: (ans add: (aBlock value: x)))
  ans)

Recall that, as methods defined in the Collection class, these methods should always work for any concrete subclass of Collection. Of course, subclass can override (redefine) any of these methods with more efficient versions.

For example, the List subclass, which uses a circular list of cons cells, overrides the isEmpty method. The size method implemented in the Collection: class (which uses do:) is a linear-time operation; it always iterates through all elements of the collection. But, determining emptiness of an array can be done in constant-time.

The "Problem" with the `Array` class

Consider the following transcript:

-> (val i 0)
0
-> (val a (Array new: 10))
( nil nil nil nil nil nil nil nil nil nil )
-> ({(i < 10)} whileTrue: {(a at:put: i i) (set i (i + 1))})
nil
-> a
( 0 1 2 3 4 5 6 7 8 9 )
-> (a add: 10)
Run-time error: arrays-have-fixed-size

The (abstract) superclass Collection specifies the (abstract) method add:, but the (concrete) subclass Array fails to implement it!

On the one hand, this is reasonable, because an Array is a fixed-sized data structure and it doesn’t make sense to add (or remove) elements from an array.

On the other hand, an Array is a subclass of Collection. Under the object-oriented discipline an Array is a Collection and it should be possible to use any concrete subclass of Collection for a purpose which only requires the behavior of the Collection methods. (This (strong) behavior subtyping property is known as the Liskov substitution principle.)

This is an issue that almost always arises in any "deep" class hierarchy, where you have several layers of subclasses. It can be particularly hard to design deep class hierarchies (or, to design a class hierarchy than can be deeply subclasses).

As far as Prof. Fluet knows, Smalltalk didn’t really have a solution to this problem. Later object-oriented languages have offered some features that provide more flexible reuse mechanisms.

Java

In Java, we have interfaces that are distinct from classes. With interfaces, we could carve up the Collection protocol into a number of interfaces, depending on which of the abstract methods they depended upon. For example, a AddableCollection interface might have the add:, addAll:, … methods, while a DoableCollection interface might have the do:, detect:ifNone:, … methods, and a AddableDoableCollection interface might have the select:, reject:, collect:, … methods.

Now, the Array class would implement DoableCollection but not AddableCollection or AddableDoableCollection.

Prior to Java 8, implementing DoableCollection would not provide a default implementation of detect:ifNone: in terms of do:. It would be up to each implementing class to provide the implementation of detect:ifNone:.

With Java 8, it became possible to specify that method of an interface has a default implementation.
Ruby

Ruby, as a dynamically typed language, does not have interfaces (which are a static typing feature). But, Ruby does have mixins (technically, Ruby has modules, which can idiomatically be used as mixins). A mixin is a collection of method definitions that can be included into a class. A mixin is not itself a class, but when included into a class, it behaves as though it were a superclass, in the sense that it provides the code for one or more methods of the class being defined.

Exercises

The detect: method of the Collection protocol is similar, but not exactly the same, as uScheme’s exists? function. And there is no method that corresponds to uScheme’s all? function. Give implementations of exists?: aBlock and all?: aBlock methods suitable for inclusion in the Collection implementation.
The occurrencesOf: method counts the number of objects in a collection that are = to a given object. Generalize to a count: method that counts the number of objects in a collection that satisfy an argument block. Give two implementations of count:: one using do: and one using inject:into:.
Look through the Collection (Figure 10.15), KeyedCollection (Figure 10.16), and SequenceableCollection (Figure 10.18) protocols. For each method, consider whether or not the Array class has a reasonable implementation of that method. What do you think of the percentage of methods that Array cannot support?
The Dictionary class is a KeyedCollection that assumes only that its keys respond correctly to the = message. Imagine a BSearchTree class that is a KeyedCollection that assumes that its keys respond correctly to the = and < messages and implements a mutable finite map using a binary search tree. Are there any methods of KeyedCollection that the BSearchTree class could not support? Are there any additional methods that the BSearchTree class could support? Would you advocate for any changes to the Collection class hierarchy to accommodate a BSearchTree class? What must be true of the < and = methods so that the BSearchTree class behaves correctly?
Read and think about Exercises 17 — 31 of Chapter 10 of Programming Languages: Build, Prove, and Compare, which ask for various improvements and extensions to the Collection class hierarchy.

"Number" class hierarchy

We won’t look closely at the numeric classes of uSmalltalk, but suffice to say that it is another rich hierarchy.

digraph {
rankdir = BT;

subgraph { rank = same;
  "Magnitude" [label="Magnitude\n(abstract)"]; }

subgraph { rank = same;
  "Natural" [color=red]; "Number" [label="Number\n(abstract)"]; }

subgraph { rank = same;
  "Fraction"; "Float"; "Integer" [label="Integer\n(abstract)"]; }

subgraph { rank = same;
  "SmallInteger"; "LargeInteger" [label="LargeInteger\n(abstract)",color=red]; }

subgraph { rank = same;
  "LargePositiveInteger" [color=red]; "LargeNegativeInteger" [color=red]; }

Natural -> Magnitude;
Number -> Magnitude;
Fraction -> Number;
Float -> Number;
Integer -> Number;
SmallInteger -> Integer;
LargeInteger -> Integer;
LargePositiveInteger -> LargeInteger;
LargeNegativeInteger -> LargeInteger;

LargePositiveInteger -> Natural [style=dotted;constraint=false];
LargeNegativeInteger -> Natural [style=dotted;constraint=false];

Fraction -> Integer [style=dotted;constraint=false];
Float -> SmallInteger [style=dotted;constraint=false];

}

The red classes are extensions of the numeric classes described in exercises of the textbook.

The Magnitude protocol includes =, <, >, <=, >=, min:, and max: methods. In general, it is only appropriate to compare objects of the same subclass of Magnitude. That is, if both Char and Date are subclasses of Magnitude, then one can compare a Char with another Char or compare a Date with another Date, but it isn’t appropriate to compare a Char with Date.

The Number protocol includes negated, reciprocal, abs, +, -, *, /, isNegative, isNonnegative, isStrictlyPositive, isZero, raisedToInteger:, squared, sqrtWithin:, sqrt, coerce:, asInteger, asFraction, asFloat.

We point out the coerce: aNumber method, as there is a subtle use of it in Programming06: Smalltalk Programming. The coerce: method coerces the value of the argument to the class of the receiver. For example:

-> (val half (1 / 2))
1/2
-> (half coerce: 3)
3/1
-> (3 coerce: half)
0

The coerce: method is important because the numeric operations assume that the receiver and the argument are of the same class:

-> (val half (Fraction num:den: 1 2))
1/2
-> (half + half)
1/1
-> (half + 2)
Run-time error: SmallInteger does not understand message den
-> (half + (half coerce: 2))
5/2

Double Dispatch

We conclude with a brief look at the "double dispatch" pattern, which can be used to improve and extend the numeric class hierarchy.

Recall that method dispatch is the mechanism that decides which method (i.e., code) should be evaluated in response to a message.

Single dispatch means that the method to be evaluated depends only on the receiver of the message.

Double dispatch, on the other hand, means that the method to be evaluated depends on both the receiver and the argument(s) of the message.

In Smalltalk, the language only provides single dispatch. On the other hand, there are programming idioms that let us express double dispatch.

To motivate dynamic dispatch, we might like the various Number classes to interoperate seamlessly. For example, we saw above that the numeric operations assume that the receiver and the argument are of the same class. We also saw that the programmer could explicitly insert coercions. It would be nicer if the numeric operations themselves coerced their arguments as appropriate:

-> (half + half)
1/1
-> (half + 2)
5/2
-> (2 + half)
5/2

In particular, we would like the class of the result of a numeric operation to depend on the classes of the receiver and the argument, possibly with one or the other being coerced to the result class:

class of arg

class of result

Integer

Fraction

Float

class of recv

Integer

Fraction

Float

Fraction

Float

A poor way of doing this would be to use the isKindOf: messages of the Object protocol to inspect the class of the receiver. (This would be comparable to using Java’s instanceof operator, which we should recognize as poor object-oriented design.) This would be the receiver learning the class of the argument.

Instead of the receiver learning the class of the argument, the receiver can teach its class to the argument. Thus, the body of the + method will not actually perform the addition; rather it will send a message to the argument that encodes the class of the receiver. The actual code to perform the addition will only be invoked after the classes of both operands are known.

How will we encode the class of the (original) receiver when sending a message to the (original) argument? In the name of the message! For example, addIntegerTo:, addFractionTo:, and addFloatTo:; these message names encode both the operation and the class of the argument. Meanwhile, an add: method in each of the concrete Number subclasses will actually perform the addition, assuming that the receiver and the argument are of the same class. These are additional "private" methods in the Number class hierarchy, in the sense that they are meant to be used to implement other methods of the Number class hierarchy, but are not meant to be used by clients of the Number class hierarchy.

Here is the essential code:

In class Integer

(method + (aNumber) (aNumber addIntegerTo: self))
(method addIntegerTo: (anInteger) (anInteger add: self))
(method addFractionTo: (aFraction) (aFraction add: (aFraction coerce: self)))
(method addFloatTo: (aFloat) (aFloat add: (aFloat coerce: self)))
(method add: (anInteger)
  ;; actual code to add two integers
)

In class Fraction

(method + (aNumber) (aNumber addFractionTo: self))
(method addIntegerTo: (anInteger) ((self coerce: anInteger) add: self))
(method addFractionTo: (aFraction) (add: aFraction self))
(method addFloatTo: (aFloat) (aFloat add: (aFloat coerce: self)))
(method add: (aFraction)
  ;; actual code to add two fractions
)

In class Float

(method + (aNumber) (addFractionTo: aNumber self))
(method addIntegerTo: (anInteger) ((self coerce: anInteger) add: self))
(method addFractionTo: (aFraction) ((self coerce: aFraction) add: self))
(method affFloatTo: (aFloat) (aFloat add: self))
(method add: (aFloat)
  ;; actual code to add two floats
)

Consider the evaluation of (half + 2):

evaluate (half + 2)
send + to half
class of receiver is Fraction
does class Fraction implement +? yes!
method body is (aNumber addFractionTo: self)
evaluate (aNumber addFractionTo: self), where self is half and aNumber is 2
evaluate (2 addFractionTo: half)
send addFractionTo: to 2
class of receiver is Integer
does class Integer implement addFractionTo:? yes!
method body is (aFraction add: (aFraction coerce: self))
evaluate (aFraction add: (aFraction coerce: self)), where self is 2 and aFraction is half
evaluate (half add: (half coerce: 2))
- (half coerce: 2) answers 2/1
evaluate (half add: 2/1)
- code to add two fractions is executed
answer 5/2

Exercises

Read through Exercises 32 — 39 of Chapter 10 of Programming Languages: Build, Prove, and Compare, which explore improvements and extensions to uSmalltalks’s numeric classes.

Object-Orientation and Smalltalk (Part 3)

Object-oriented design

Classes

Exercises

uSmalltalk Basis

Collection class hierarchy

Collection protocol

Examples

Implementation

The "Problem" with the Array class

Exercises

"Number" class hierarchy

Double Dispatch

Exercises

`Collection` class hierarchy

`Collection` protocol

The "Problem" with the `Array` class