Introduction to Object Oriented Programming, 3rd Ed
Chapter 27
Implementation
Outline
- Roadmap
- Two General Approaches to Implementation
- The Receiver as Argument
- The Corresponding Formal Argument
- Non-virtual methods
- Name Encoding
- Inherited Methods
- The Problem with Multiple Inheritance
- The Slicing Problem
- Overridden Methods
- Solution, A Virtual Method Table
- Instances Share the same Virtual Method Table
- Method Calls become Indexed Offsets
- Building a Virtual Table for a Subclass
- Virtual Method Table for Subclasses
- Elimination of Virtual Calls
- Dispatch Tables
- An Object and Its Dispatch Table
- Method Cache
- The Messaging Function checking the Cache
- Bytecode Interpreters
- Bytecodes in the Little Smalltalk System
- Just In Time Compilers
Other Material
Roadmap
Topics covered in this chapter include:
- Compilers vs Interpreters
- The Receiver as Argument
- Inherited Methods
- The problem of Multiple Inheritance
- Overridden Methods
- Dispatch Tables
- Bytecode Interpreters
- Just in Time Compilation
Intro OOP, Chapter 27, Slide 01
Two General Approaches to Implementation
-
Compilers - translated into basic machine code, source code
not available at run-time, generally very efficient.
-
Interpreters - translated into intermediate representation, source
code available for reference at run-time, generally somewhat less efficient.
Endpoints are clear, but there are lots of gray areas in the middle.
Intro OOP, Chapter 27, Slide 02
Java JIT systems are one of those gray areas between compilers and interpreters
The Receiver as Arguments
A method is eventually invoked just like any other function. This
means that the receiver just be passed as an argument.
Traditionally, it is passed as the first argument.
This means a method call, such as
aCardPile->addCard (currentCard)
Is translated into
addCard (aCardPile, currentCard)
(This is ignoring the method lookup process, which we will discuss
shortly).
Intro OOP, Chapter 27, Slide 03
The Corresponding Formal Argument
On the other side, the receiver pseudo-variable is just a formal argument:
Instead of
void CardPile::addCard (Card * aCard) {
...
}
We have
void addCard (CardPile * this, Card * aCard) {
...
}
The first argument can then be used to access data members and other
methods.
Intro OOP, Chapter 27, Slide 04
Non-virtual Methods
In languages that permit both virtual and non-virtual methods (such as C++)
a non-virtual method is translated into a simple procedure call.
- The receiver is made into the first argument
- The name is encoded to make it unique
Intro OOP, Chapter 27, Slide 05
Name Encoding
-
Different classes are allowed to have functions with the same name.
-
Some languages (C++) even permit multiple functions with the same name
within a class.
-
Yet linkers usually want every function to have a unique name.
-
Solution - generate an internal name that encodes both class name,
function name, and argument types.
-
Example: Foo::Bar_int_float_int
Intro OOP, Chapter 27, Slide 06
An encoded name is sometimes called a mangled name.
You will sometimes see mangled names in error messages generated by a linker.
Inherited Methods
Now consider those methods defined in a parent class, but used by a child class.
How is it that this mechanism can work? Normally you cannot change the
types of arguments (recall that the receiver is just an argument).
Solution is that the data associated with an instance of a child class is
an extension of the data associated with the parent class.
This means that data fields in the parent class will be found at the
same offset in the child class.
Intro OOP, Chapter 27, Slide 07
The Problem with Multiple Inheritance
The idea that a child is an extension of the parent explains one of the most
vexing problems in the implementation of multiple inheritance.
A child can extend one parent, or the other, but not both.
That is the offset of data fields in the child cannot simultaneously match
both parents.
Intro OOP, Chapter 27, Slide 08
Slicing Problem
The idea that a child can extend the data area of the parent also makes
it difficult to support both the following goals
- The goal of keeping memory on the stack, which is laid out at compile time
- The goal of supporting the polymorphic variable, which can hold
an instance of the child class at run time.
Most OO languages uphold (2) and abandon (1), C++ is an exception in
that it upholds (1) and therefore abandons (2).
Intro OOP, Chapter 27, Slide 09
Overridden Methods
We next consider those methods that are defined in a parent class,
and overridden in a child class.
Problem, how can a polymorphic method invocation find the right method?
Note that the right method can change during the course of execution,
even for the same method call.
CardPile * aCardPile = new DiscardPile();
Card * aCard = ... ;
aCardPile->addCard (aCard); // how to find the right method
Intro OOP, Chapter 27, Slide 10
Solution, a Virtual Method Table
The solution is that every object contains an extra hidden data field.
This data field points to an array of pointers to functions.
The array is determined by the current dynamic type, and is shared
by all instances of the class.
The offset of each method can be determined at compile time.
Intro OOP, Chapter 27, Slide 11
Instances Share the same Virtual Method Table
Two instances of a class will share the same virtual method table.
Intro OOP, Chapter 27, Slide 12
Method Calls Become Indexed Offsets
Each object maintains a pointer to a table, called the virtual method
table.
Virtual methods are identified by a fixed address in this table.
A method call, such as
A.foo(B, C)
is translated into
(* A.virTable[idx])(A, B, C)
Intro OOP, Chapter 27, Slide 13
Building A Virtual Table for a Subclass
When a subclass is created, a new virtual method table is generated.
-
Methods that are inherited point to the same function as the parent class.
(and are found in the same offset in the virtual method table).
-
Methods that are overridden occupy the same offset location, but
point to the new function, instead of the version in the parent class.
Intro OOP, Chapter 27, Slide 14
Virtual method Table for Subclasses
Intro OOP, Chapter 27, Slide 15
Elimination of Virtual Calls
Even though the overhead of a virtual call is small, it can still add up.
If the (dynamic) class of the receiver is know, a virtual call can simply
become an ordinary procedure call
Good optimizing compiles spend a considerable amount of time tracing possible
execution flows to gather this information.
Sometimes methods can even be expanded in-line at the point of call.
Intro OOP, Chapter 27, Slide 16
Dispatch Tables
In languages without static typing it is not practical to use a virtual table,
since such a table would need to encode all methods, not simply those
in a given class hierarchy.
An alternative technique uses a pointer to a list of selector/method pairs.
When a method is invoked, a run-time search is performed to match the method
being called with the list of known selectors, until an appropriate method
is found.
In Objective-C the messages
[ neighbor checkrow: row column: column ]
is translated into
objc_msgSend(neighbor, "checkrow:column:", row, column)
Intro OOP, Chapter 27, Slide 17
An Object and Its Dispatch Table
An important difference is that the dispatch table is searched at run-time,
not at compile time.
Intro OOP, Chapter 27, Slide 18
Objective-C uses a linear list for the table, Smalltalk generally uses
a balanced search tree, but the idea is similar.
Method Cache
In order to avoid the cost of a dynamic search of dispatch tables, a single
global cache can be used to hold frequently invoked methods.
The cache is used as a large hash table.
Prior to searching a dispatch table, the a single hashed entry is examined -
if it matches the selector being sought, the method is used, if not the
dispatch table is searched and the new entry replaces the value in the
hash table.
Assuming a good hash function is used, efficiency can be very high.
Intro OOP, Chapter 27, Slide 19
The Messaging Function checking the Cache
Intro OOP, Chapter 27, Slide 20
Bytecode Interpreters
An implementation technique that is widely used (Smalltalk, Java)
- Program is compiled into an ``assembly language'' for an imaginary machine.
- Since this assembly code is often represented by a string of bytes,
it is called bytecode.
- Since the machine is imaginary, can run on any platform.
- But it must be simulated (by a virtual machine) and hence you pay an
execution time cost.
Intro OOP, Chapter 27, Slide 21
Bytecodes in the Little Smalltalk System
Intro OOP, Chapter 27, Slide 22
Just In Time Compilers
A currently popular mix between compilers and interpreters.
Key idea, first time a method is executed, translate the bytecode into
native machine code.
Gives fast execution time, pay penalty for translation (and analysis if you
want to do a good job).
Currently very popular with Java systems.
Intro OOP, Chapter 27, Slide 23