Athena

Debugging With More Than Watches And Breakpoints

(or How To Use The CPU Window)

Brian Long (www.blong.com)


 Introduction

This paper will look at some options available for debugging Delphi applications which are not well used by many Delphi developers. This is perhaps because these techniques involve, in many cases, knowledge of implementation details of various Delphi language constructs, and familiarity with assembly language, to one extent or another. The fact that many Delphi developers are not familiar with such low-level language and machine details could be related to the fact that many Delphi developers have a history of using 4GLs, high-level database languages, or even Turbo Pascal. Users of such tools will often not have experience of issues at the implementation level or machine level.

Delphi 5 will be assumed as the working environment throughout this paper; however, if anything mentioned is not available in earlier versions, this will be pointed out in the text. Additionally, the compiler optimisation option is assumed to be disabled for any disassembly that might be displayed. This makes the code more straightforward to analyse, without the need to concern ourselves with how the optimiser may chop and change the code around.

Introduction To Delphi Debugging Facilities

Delphi has always had a rich variety of debugging facilities. Delphi 1 had the traditional 3rd Generation Language (3GL) offerings, including single-stepping (executing individual source lines one at a time, optionally executing whole subroutines as one statement), watches (sometimes referred to as watchpoints) and breakpoints (which can have an associated condition or pass count). Some viewing windows (for watches, breakpoint properties and a function call stack) and an evaluate-and-modify area completed the set.

We won't be looking at these normal debugging facilities in any detail as most Delphi developers are quite at home using them. Instead, we will be focusing on some of the lesser known and more rarely used areas.

The set of debugging tools remained mostly unchanged when Delphi first went 32-bit with version 2, apart from the ability to view additional execution threads in your process in a Thread window.

The CPU Window

Version 3 added (well, almost added) the CPU window to the set (as well as the modules window), but as it was not quite finished it had to be enabled with an undocumented option. This involved using REGEDIT.EXE to add a new string item in Delphi 2 or 3's Debugging key called EnableCPU and giving it a value of "1" (it defaults to a value of "0"). This enables a new CPU Window menu item on the View menu.

Despite being fully available in Delphi 4 and 5 (as well as in the historically popular standalone Turbo Debugger), the CPU window is not very well understood and developers often shy away from it. This is unfortunate because, when used well, it can prove to be one of the most direct and powerful debugging aids for the developer.

Unfortunate it may be, but it is understandable, as the first thing you see when opening the window is a lot of assembly instructions. This tends to make the average Delphi developer run and hide. In order to help make the window more approachable, we will have a quick run through some of the basics of assembly instructions, which will hopefully stand us in good stead to use this window to good effect.

Starting With The CPU Window

Nobody who writes Windows applications writes them solely in assembly (at least, no one who wants to be productive). Assembly programming is one step away from machine code programming. Assembly programming involves using recognisable mnemonics instead of the resultant numeric opcodes, and each assembly statement corresponds to exactly one machine instruction. Any high level programming language allows you to write higher level abstractions of logic using a given dictated syntax, and the compiler then expands this out to the resultant machine code in your generated executable.

However, despite the compiler's optimising nature, some key logic that gets called a significant number of times may be better written in assembly for efficiency. Various parts of the Delphi Run-Time Library (RTL) are written in assembly to aid execution efficiency. To a lesser extent, occasional routines (or portions of them) in the Visual Component Library (VCL) are also hand-written in assembly.

Apart from the business of writing assembly code, which does not concern us here, knowing a smattering of assembly is most useful when debugging in Delphi as it allows you to understand (at least partially) what is presented in the CPU Window.

Figure 1: The CPU Window

Figure 1 shows the CPU Window as it normally appears. It shows itself under three circumstances:

  1. At any point that the debugger has control of your process (your program is suspended in the debugger), you can choose View | Debug Windows | CPU.
  2. It appears automatically if you pause the execution of your program (Run | Program Pause) and the machine instruction being executed has no corresponding source line.
  3. Assuming the option has not been disabled, the debugger will intercept any exceptions that happen in your application and suspend your program at that point. For most exceptions, the dialog offers a checkbox that will display the CPU window when OK is pressed (see Figure 2). In Delphi 4 and 5, this option is in the Tools | Debugger Options... dialog, on the Language Exceptions page and is the Stop on Delphi Exceptions checkbox.

Figure 2: When an exception is picked up, you have the option of seeing the CPU window

There are five parts of the CPU window, as can be seen in Figure 1. Starting at the top left, going clockwise, these are:

  1. The disassembly pane
  2. The CPU registers pane
  3. The flags pane
  4. The stack pane
  5. The memory dump pane

You can move between these panes with Tab, or with Shift+Tab to go in the other direction.

CPU Registers

Registers are special memory locations (with no address) that the CPU can access very efficiently. They are either 8 bits (Byte), 16 bits (Word) or 32 bits (DWord) in size, although the 8-bit registers are just individual bytes of some of the 16-bit registers, and some of the 16-bit registers are just the low words of some of the 32-bit registers. Assembly instructions often operate in conjunction with registers to move information around, or to modify it.

The most commonly used registers are 32 bits wide and are prefixed with the letter E for extended. This is to distinguish them from their 16-bit counterparts. For example, the 32-bit EAX register is an extended version of the 16-bit AX register. Table 1 lists all available registers with some information about them.

Table 1: CPU registers

Register  Register name                 Size     Comments
EAX       Extended accumulator          32 bits  General purpose register
EBX       Extended base                 32 bits  General purpose register
ECX       Extended count                32 bits  General purpose register
EDX       Extended data                 32 bits  General purpose register
ESI       Extended source index         32 bits  General purpose register
EDI       Extended destination index    32 bits  General purpose register
EBP       Extended base pointer         32 bits  General purpose register
ESP       Extended stack pointer        32 bits  General purpose register
EIP       Extended instruction pointer  32 bits  Status/control register
EFL       Extended flags                32 bits  Status/control register
CS        Code segment                  16 bits  To hold a segment selector
DS        Data segment                  16 bits  To hold a segment selector
SS        Stack segment                 16 bits  To hold a segment selector
ES        Extra segment                 16 bits  To hold a segment selector
FS        Another extra segment         16 bits  To hold a segment selector
GS        Another extra segment         16 bits  To hold a segment selector
AX        Accumulator                   16 bits  Low word of EAX
AH        Accumulator high              8 bits   High byte of AX
AL        Accumulator low               8 bits   Low byte of AX
BX        Base                          16 bits  Low word of EBX
BH        Base high                     8 bits   High byte of BX
BL        Base low                      8 bits   Low byte of BX
CX        Count                         16 bits  Low word of ECX
CH        Count high                    8 bits   High byte of CX
CL        Count low                     8 bits   Low byte of CX
DX        Data                          16 bits  Low word of EDX
DH        Data high                     8 bits   High byte of DX
DL        Data low                      8 bits   Low byte of DX

Not all these registers can be directly modified. For example, EFL contains a number of flag bits which are modified individually by various machine instructions that execute (see below). Also, EIP points to the instruction being executed. This is modified implicitly by jump and return instructions. You can see the address in EIP being indicated by a green arrow in the disassembly pane (top left) in Figure 1.

The 32-bit registers often contain a memory address (as memory addresses in 32-bit Windows are also 32 bits wide). A reference to the register (e.g. EAX) returns that address, whereas a reference to the register in square brackets (e.g. [EAX]) returns the data at that memory location. This makes it easy for registers to be used as pointers.

Some registers are described with a specific intention, for example ESI and EDI are described in terms that suggest they are used to point to the source and destination address for data movement operations. This intention may well be true, but there are generally no hard and fast rules for when certain registers should be used.

Because registers are so much more efficient than normal memory, Delphi's optimising compiler tries to use them wherever possible, rather than storing things in conventional memory.

CPU Flags

Various machine instructions modify various internal CPU flags, which are all maintained in the EFL register. Table 2 shows the seventeen flags and fields available in 32-bit mode, along with the bit positions in the EFL register that represent each one. All the other bits in the EFL register are reserved: bit 1 always reads as one and the rest remain set to zero.

Table 2: CPU Flags

Flag  EFL register bit number  Flag/bit name                   Flag type
CF    0                        Carry flag                      Status
PF    2                        Parity flag                     Status
AF    4                        Adjust flag                     Status
ZF    6                        Zero flag                       Status
SF    7                        Sign flag                       Status
TF    8                        Trap flag                       System
IF    9                        Interrupt flag                  System
DF    10                       Direction flag                  Control
OF    11                       Overflow flag                   Status
IO    12 and 13                I/O privilege level field       System
NF    14                       Nested task flag                System
RF    16                       Resume flag                     System
VM    17                       Virtual 8086 mode flag          System
AC    18                       Alignment check flag            System
VF    19                       Virtual interrupt flag          System
VP    20                       Virtual interrupt pending flag  System
ID    21                       Identification flag             System

Typically, only the status and control flags are of interest in most applications, but the system flags are included for completeness. The Flags pane shows all these flags.

An Intel machine code reference manual will explain the purpose of each flag (see Reference 1) and detail which flags are modified by which instructions (References 1, 2 and 3). Just to give the idea though, if an arithmetic operation causes a signed value to go outside the range of a 32-bit signed value, the overflow flag will be set. If an unsigned value goes beyond the maximum 32-bit unsigned value, it is the carry flag that is set instead.

Various assembly instructions will jump to locations or perform other operations based on the state of specific flags.
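
To make the bit positions in Table 2 a little more concrete, here is a minimal sketch of a console program that picks the status and control flags out of a raw EFL value. The program, the FlagIsSet helper and the sample value $246 are illustrative assumptions and are not taken from the debugger itself.

```pascal
program DecodeFlags;
{$APPTYPE CONSOLE}

const
  // Status and control flags from Table 2, with their EFL bit numbers
  FlagCount = 7;
  FlagNames: array[0..FlagCount - 1] of string =
    ('CF', 'PF', 'AF', 'ZF', 'SF', 'DF', 'OF');
  FlagBits: array[0..FlagCount - 1] of Integer =
    (0, 2, 4, 6, 7, 10, 11);

// Hypothetical helper: True if the given bit of an EFL value is set
function FlagIsSet(EFL: Cardinal; Bit: Integer): Boolean;
begin
  Result := ((EFL shr Bit) and 1) = 1;
end;

var
  I: Integer;
  SampleEFL: Cardinal;
begin
  SampleEFL := $00000246; // sample value chosen purely for illustration
  for I := 0 to FlagCount - 1 do
    if FlagIsSet(SampleEFL, FlagBits[I]) then
      Writeln(FlagNames[I], ' is set');
end.
```

For the sample value, bits 1, 2, 6 and 9 are set, so the program reports PF and ZF. Bit 9 (IF) is a system flag and bit 1 is reserved, so neither appears in the list being tested.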

Program Stack

The CPU stack is a first-in, last-out area of memory that can be used to store information temporarily. ESP is the stack pointer register that points to the last item pushed on the stack. When an additional item is pushed on the stack, ESP is decremented by an appropriate number of bytes and the data item is copied to where ESP points to. The stack therefore fills up starting from higher memory addresses and moves down towards lower memory addresses.

The stack is also used as a mechanism to pass parameters to subroutines, as well as the area used to store local variables. ESP points to the last entry pushed on the stack, whereas EBP (the base pointer) is used to point to the first entry pushed on the stack by the current routine. So EBP is higher than ESP.

You can see the address in ESP being indicated by a green arrow in the stack pane (bottom right) in Figure 1. Also, as items are pushed onto the stack, you can see their values appearing in the stack pane and ESP (as indicated by the arrow) moving to point to each new item added in.
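
The way ESP moves as items are pushed and popped can be modelled in a few lines of Delphi. This is only a toy sketch: the array, the made-up StackLimit address and the Push and Pop routines are assumptions for illustration and have nothing to do with the real CPU stack of the program.

```pascal
program StackModel;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  SlotCount = 8;           // size of the toy stack, in DWords
  StackLimit = $0012FF00;  // made-up lowest address of the toy stack

var
  Slots: array[0..SlotCount - 1] of Cardinal;
  ESP: Cardinal;           // models the stack pointer register

// Push: decrement ESP by 4, then store the value where ESP now points
procedure Push(Value: Cardinal);
begin
  Dec(ESP, 4);
  Slots[(ESP - StackLimit) div 4] := Value;
end;

// Pop: read the value ESP points to, then increment ESP by 4
function Pop: Cardinal;
begin
  Result := Slots[(ESP - StackLimit) div 4];
  Inc(ESP, 4);
end;

begin
  ESP := StackLimit + SlotCount * 4; // empty stack: ESP sits just above the top slot
  Push($11111111);
  Writeln('ESP after first push:  $', IntToHex(ESP, 8));
  Push($22222222);
  Writeln('ESP after second push: $', IntToHex(ESP, 8));
  Writeln('Popped: $', IntToHex(Pop, 8)); // the last item pushed comes off first
  Writeln('Popped: $', IntToHex(Pop, 8));
end.
```

Each push lowers the modelled ESP by four bytes, which mirrors the arrow in the stack pane moving towards lower addresses as new items arrive.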

Some Simple Assembly Instructions

The intention of this section is not to teach assembly programming. That is a job for other people (see the References at the end of this paper). However, in order to make best use of the CPU window we will need to know what some commonly encountered instructions mean. To decode others, use an assembly reference, such as one of the aforementioned references.

Bear in mind that many of these assembly instructions can be used with a whole variety of parameter combinations. Those listed in Table 3 are just examples, to give you the general idea. Assembly instructions compile to machine code instructions taking a varying number of bytes from one upwards.

Table 3: Common assembly instructions

Assembly instruction           Meaning
mov eax, 1                     Set EAX to 1
xor eax, eax                   Same as mov eax, 0 but more efficient
mov eax, esi                   Copy ESI value to EAX
mov edx, [eax]                 Move DWord pointed to by EAX into EDX
mov dl, $01                    Set low byte of EDX to 1, leaving other bytes unchanged
push eax                       Push EAX onto the stack, so EAX can be re-used
pop eax                        Pop value off stack and store it in EAX
ret                            Return from routine (return address is on stack)
call $410123                   Push address of next instruction on stack and jump to address $410123
mov ebx, [ebp-$5]              Move the DWord starting at the address 5 less than that stored in EBP into EBX
cmp dword ptr [ebx+$3C], $00   Compare the DWord at the address $3C bytes after the address stored in EBX with zero
jz SomeRoutine+$A0             If the last compare operation suggested the two values are equal, jump to $A0 bytes after the first byte of the SomeRoutine routine. Compare operations are implemented as subtractions. JZ is a contraction of Jump if Zero.
movsb                          Copy byte pointed to by ESI to the address pointed to by EDI, then increment ESI and EDI by 1 (unless the direction flag is set, in which case they are decremented)
movsw                          Copy word pointed to by ESI to the address pointed to by EDI, then increment ESI and EDI by 2 (unless the direction flag is set, in which case they are decremented)
add esp, -$08                  Subtract 8 from stack pointer
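
Several of the instructions in Table 3 can be tried out with Delphi's built-in assembler. The following is a small sketch rather than anything from the RTL: FirstDWordIsZero is a hypothetical routine, and it assumes the 32-bit compiler with the default register calling convention, so that the pointer parameter arrives in EAX and the Boolean result is returned in AL.

```pascal
program AsmSample;
{$APPTYPE CONSOLE}

// Assumption: 32-bit compiler, default register convention, so P arrives
// in EAX and the Boolean result is returned in AL.
function FirstDWordIsZero(P: Pointer): Boolean;
asm
        cmp     dword ptr [eax], 0  // compare the DWord that EAX points to with zero
        jz      @IsZero             // zero flag set: the two values were equal
        xor     eax, eax            // EAX := 0, so AL holds False
        jmp     @Done
@IsZero:
        mov     al, 1               // set low byte of EAX to 1 (True)
@Done:
end;

var
  A, B: Integer;
begin
  A := 0;
  B := 99;
  Writeln(FirstDWordIsZero(@A)); // prints TRUE
  Writeln(FirstDWordIsZero(@B)); // prints FALSE
end.
```

Placing a breakpoint on one of the calls and stepping into it with the CPU window open is a convenient way to watch the zero flag and EAX change as each instruction executes.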

Using The CPU Window

To successfully use the CPU window, you need to understand how to get the panes to reposition at any given address and how to follow referenced addresses and jumps to their destinations.

Repositioning the disassembly origin

To disassemble code at any location, right-click on the disassembly pane and choose Goto Address... (or press Ctrl+G with the disassembly pane active). This brings up an address entry dialog where you can type in a new address value (hexadecimal addresses need the $ prefix) or expression. To move a few bytes down, you could use an expression like EIP+20. Alternatively, if you know that a code address has been loaded into a register you can enter the register name in the dialog.

Note, though, that disassembly has a potential for failure. If you start disassembling halfway through the bytes of one instruction, the subsequent bytes will be interpreted as if they are the start of an instruction. This erroneous translation can propagate through many bytes and give very misleading results. The key problem here is that nothing specifies where a given instruction starts and ends. It all relies on being given a valid starting point, and disassembling from there.

Always consider this if you position the disassembler on a specific location, rather than letting it position itself as instructions are executed.

You can tell the disassembly pane to disassemble from an origin which is one byte later using Shift+Ctrl+Right arrow or one byte earlier with Shift+Ctrl+Left arrow.

Following a jump in the disassembly pane

If you can see a jump instruction in the disassembly pane, you can see where it will take you by clicking on that instruction, then right-clicking and choosing Follow (or pressing Ctrl+F). This is particularly useful when faced with relative jumps, such as those that jump x bytes into a routine, for example: jmp Foo + $7B.

Customising the memory dump pane view

The memory dump pane defaults to displaying information on a byte-by-byte basis. However it can sometimes be useful to see things listed as other types. For example, you often encounter lists of addresses, which are best viewed as DWords. To accomplish this, right click on the memory dump pane and choose Display As | DWords. Other display options include bytes, words, quadwords (8 byte unsigned integers), singles, doubles and extendeds.

A Virtual Method Table (VMT) is a good example of a list of addresses. You can see one being displayed by the memory dump pane (as a list of DWords) in Figure 3. We will see how to home in on a VMT, and what good it can do us later.

Figure 3: Displaying A VMT as a list of DWords

Repositioning the memory dump origin

Similar to the disassembly pane, you can tell the memory dump pane to start dumping from a given address (or expression that yields an address) with the context menu's Goto Address... option (Ctrl+G).

You can also tell the memory dump pane to dump from an origin which is one byte later using Shift+Ctrl+Up arrow or one byte earlier with Shift+Ctrl+Down arrow.

Following an address in the memory dump pane

If you see a memory address in the memory dump and you wish to see what is at that location, you have two useful options. If you want to see a dump of the memory starting at that address, you can select the address, right-click on it and choose Follow | Offset to Data (or press Ctrl+D). This is quicker than entering the address in the memory dump pane's Ctrl+G dialog.

However, if you feel it may be the address of some code you want to see disassembled, you can select the address, right-click and choose Follow | Near Code (or press Ctrl+E). This is quicker than entering the address in the disassembly pane's Ctrl+G dialog.

Using The System Unit Source

Before looking at how a number of simple Delphi constructs appear at the machine level, it is useful to know of a number of potentially useful constants and types in the System unit. If you haven't browsed this unit before, take some time to do so.

Items that we will make use of, or at least refer to, will include the virtual method table entry constants (starting at line 94) and the internal exception handling frame types (starting at line 4096).

The virtual method table entry constants define positions that precede any given class VMT, which are occupied by pointers to other compiler-generated information, such as the class name and the RTTI table. These constants are all negative, their values specifying how many bytes before the VMT the relevant information can be found. Since these constants are in the interface part of the System unit, they can be used without problem.
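
As a sketch of how one of these constants can be used, the following console program reads a class name by hand through vmtClassName. The ClassNameViaVMT function and the PPointerValue type are hypothetical names introduced here, and the Integer cast assumes the 32-bit compiler, where a class reference fits in an Integer.

```pascal
program VmtClassNameDemo;
{$APPTYPE CONSOLE}

type
  PPointerValue = ^Pointer; // local alias, declared to keep the sketch self-contained

// Fetch the pointer stored vmtClassName bytes from the VMT (the constant is
// negative, so this is before the VMT), then treat it as a short string.
// Assumption: 32-bit compiler, where a class reference fits in an Integer.
function ClassNameViaVMT(AClass: TClass): string;
begin
  Result := PShortString(PPointerValue(Integer(AClass) + vmtClassName)^)^;
end;

begin
  Writeln(ClassNameViaVMT(TObject));           // prints TObject
  Writeln(ClassNameViaVMT(TInterfacedObject)); // prints TInterfacedObject
end.
```

In normal code the ClassName class method is the right way to get this information. The point here is only to show the same lookup that can be followed by eye in the memory dump pane.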

Some Simple Delphi Constructs At Machine Level

Register calling convention

Normally, parameters are passed to routines using the stack. The register calling convention, or fastcall calling convention as it is also known, attempts to be more efficient by passing the first three suitable parameters in CPU registers. 32-bit Delphi uses the register calling convention by default. Consequently the first parameter which requires four bytes or less will be passed in EAX, the second in EDX and the third in ECX.

When a method is called, the first parameter is actually Self (which obviously is hidden from the programmer), so Self will reside in EAX in a method compiled with the default register calling convention. However, if EAX is used for other purposes, the Self pointer may be pushed onto the stack for a while, as described below.
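
The register assignments described above can be checked with a tiny routine written in the built-in assembler. This is a minimal sketch: AddThree is a hypothetical function, and it assumes the 32-bit compiler, where an Integer result is returned in EAX.

```pascal
program RegisterConvention;
{$APPTYPE CONSOLE}

// Assumption: 32-bit compiler. With the register convention, A arrives in EAX,
// B in EDX and C in ECX; an Integer result is returned in EAX.
function AddThree(A, B, C: Integer): Integer; register;
asm
        add     eax, edx  // EAX := A + B
        add     eax, ecx  // EAX := A + B + C
end;

begin
  Writeln(AddThree(1, 2, 3)); // prints 6
end.
```

Stepping into the call in the CPU window shows the three values being loaded into ECX, EDX and EAX immediately before the call instruction.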

Parameters and local variables

Parameters and local variables live on the stack. The default Delphi calling convention makes the first three suitable parameters get passed in the EAX, EDX and ECX registers as described above, but if any code in the generated routine makes use of any of these registers, they will be stored on the stack as the routine enters. This is also true if the routine calls other routines or accesses properties, meaning the compiler cannot know whether these registers will be preserved or not.

For example, if you place a breakpoint on the begin line of Listing 1, and view the CPU window, the disassembled code looks like Listing 2. You can see the parameters being stored just below the base pointer. The code corresponding to begin is called a subroutine's prologue code, whereas the code corresponding to the end is called the epilogue code. The prologue and epilogue may include additional code to set up necessary protection constructs, and to tidy away resources which have been consumed in the subroutine's execution.

Listing 1: A simple routine with three 32-bit parameters

procedure Foo(A, B, C: Integer);
begin
  Application.MainForm.Tag := A + B + C;
end;

Listing 2: The prologue code for Listing 1

push ebp           //record old base pointer
mov ebp, esp       //turn current stack pointer into this routine's base pointer
add esp, -$0c      //make space on stack for recording parameters
mov [ebp-$0c], ecx //store 3rd parameter below where 2nd parameter will go
mov [ebp-$08], edx //store 2nd parameter below where 1st parameter will go
mov [ebp-$04], eax //store 1st parameter below base pointer

Local variables are stored directly below the area where parameters are stored (if at all).

Exception handlers and resource protection blocks

Often Delphi routines will have much more involved prologue code because strings, dynamic arrays or interface references will be involved.

Strings and dynamic arrays are dynamically managed by code generated by the compiler. It needs to ensure that, regardless of exceptions, allocated string memory will be tidied up appropriately. In the case of interface references the reference count must be decremented before the routine ends, regardless of exception.

The compiler achieves these goals by inserting custom exception handling frames in the prologue and epilogue. In the case of strings, this is true whether they are passed as parameters, declared as local variables, or manufactured as a side effect of evaluating an expression such as this one:

ShowMessage( IntToStr( 99 ) );

You can spot an exception frame being set up when the FS segment register is used. Win32 exception frames are recorded on the stack. A pointer to the top of the exception list is stored at byte zero in the FS segment (FS:[0]). FS refers to per-thread data (the Win32 Thread Information Block), so each thread has its own exception frame list. Listing 3 shows the prologue of the same routine from Listing 1, but with a local string variable declared.

Listing 3: The prologue for Listing 1 after adding a local string variable

push ebp                //record old base pointer
mov ebp, esp            //turn current stack pointer into this
                        //routine's base pointer
add esp, -$10           //make space on stack for recording parameters
                        //and string variable pointer
push ebx                //preserve whatever EBX may be
xor ebx, ebx            //set EBX to 0 (a nil pointer in this case)
mov [ebp-$10], ebx      //store nil in local string variable
                        //(strings are initialised to nil)
mov [ebp-$0c], ecx      //store 3rd param below where 2nd param will go
mov [ebp-$08], edx      //store 2nd param below where 1st param will go
mov [ebp-$04], eax      //store 1st param below base pointer
xor eax, eax            //set EAX to 0
push ebp                //store base pointer on stack
push $0045a94c          //store address of exception description record
push dword ptr fs:[eax] //store previous top exception record ptr on stack
mov fs:[eax], esp       //record location of new, topmost, exception record

The FS:[0] value contains an address that points to an exception frame record, stored on the stack, containing three DWord fields. Listing 4 shows the TExcFrame record type as can be found in System.pas, Delphi's core RTL unit. The first field contains the address of the next exception frame record, the next contains the address of an exception description record (a TExcDesc record) used in the event of an exception.

Listing 4: Exception frames, as defined in System.pas

TExcDescEntry =
record
  vTable:  Pointer;
  handler: Pointer;
end;

PExcDesc = ^TExcDesc;
TExcDesc =
packed record
  jmp: JmpInstruction;
  case Integer of
  0: (instructions: array [0..0] of Byte);   //try/finally
  1: (cnt: Integer; excTab: array [0..0{cnt-1}] of TExcDescEntry);
       //try/except
end;

PExcFrame = ^TExcFrame;
TExcFrame =
record
  next: PExcFrame;
  desc: PExcDesc;
  hEBP: Pointer;
  case Integer of
  0:  ( );
  1:  ( ConstructedObject: Pointer );
  2:  ( SelfOfMethod: Pointer );
end;

The TExcDesc record contains a jump instruction to execute in case of error, along with additional information, such as a further instruction that jumps to the finally section in the case of a try/finally statement (variant part 0). The second variant part of TExcDesc, used for try/except statements, contains a list of exception handlers. It starts with an integer field indicating how many there are. This is followed, for each handler, by a pointer to something in the specified exception class, and a pointer to the code to execute if an exception of that type occurs.

The last TExcFrame field contains the value to restore the base pointer to if the exception is handled.
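
To see the list of frames for yourself, the sketch below walks the chain starting at FS:[0] and prints the address of each frame along with its second field. It is a rough illustration under stated assumptions: a 32-bit Windows target, a list terminated by a next value of $FFFFFFFF, and hypothetical names (TFrameHead, TopFrame) that are not part of System.pas. Only the first two DWords of each frame are read, since frames set up by code other than Delphi's are not laid out like a full TExcFrame.

```pascal
program WalkExcFrames;
{$APPTYPE CONSOLE}

uses
  SysUtils;

type
  // Only the first two DWords of each frame are read here
  PFrameHead = ^TFrameHead;
  TFrameHead = record
    Next: PFrameHead; // address of the next (older) frame
    Desc: Pointer;    // handler address (a TExcDesc for Delphi-generated frames)
  end;

// Assumption: 32-bit Windows target, where FS:[0] holds the topmost frame
function TopFrame: PFrameHead;
asm
        xor     eax, eax
        mov     eax, fs:[eax]
end;

var
  Frame: PFrameHead;
begin
  try // ensures at least one compiler-generated frame is on the list
    Frame := TopFrame;
    // Assumption: the list ends with a Next value of $FFFFFFFF (-1)
    while Integer(Frame) <> -1 do
    begin
      Writeln(Format('frame at %p, desc/handler at %p', [Frame, Frame^.Desc]));
      Frame := Frame^.Next;
    end;
  finally
    Writeln('done');
  end;
end.
```

The frame addresses printed should all lie within the stack, increasing as the walk moves from the newest frame to the oldest.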

The specific address ($0045a94c) of the exception description record will vary, but in the case we are working with, you can see the given address a little further down the disassembly pane. The bytes contained within the record are disassembled in Listing 5 as the two jmp instructions that follow the first ret.

Listing 5: Epilogue code of a routine that uses a string

//Auto-generated code to help tidy the string up
xor eax, eax       //Set EAX to 0
pop edx            //Pop old exception frame record address
pop ecx            //Pop exception description record address (no longer needed)
pop ecx            //Pop old EBP (no longer needed)
mov fs:[eax], edx  //Make old exception frame the topmost exception frame
push $0045a953     //Push new return address on stack (the address
                   //of the pop ebx line below)
lea eax, [ebp-$10] //Load address of string into EAX
call @LStrClr      //Pass string address to string clearing routine
ret                //Return to recently pushed return address
jmp @HandleFinally //Jump to finally section handler
jmp Foo + $47      //Jump to finally section (the lea line above)
//Code that corresponds to the end line in the routine
pop ebx            //Pull the preserved EBX off the stack
mov esp, ebp       //Reset the stack pointer to what it was before this routine
pop ebp            //Reset EBP to stored value
ret                //Return to caller

If an exception occurs, the exception description record contains an instruction to jump to the HandleFinally routine. The record also contains a jump instruction that jumps to the finally part of the code, which passes the string address to LStrClr, to free its memory. If no error occurs, the LStrClr call is still made, but the code is set up to jump past the exception description record by a fabricated return (the target address is pushed onto the stack and a ret instruction is made).

So you may well ask how this information can be used. Okay, let's take an example scenario where you have stepped into a subroutine in the CPU window where the prologue code sets up an exception frame for an exception handler, something like Listing 3. Because of the occurrence of the exception description record, which is placed after some later code, and the issue with disassemblers getting confused about where instructions start, the actual exception handling code may not be displayed correctly.

Look at Figure 4 for an example. The Delphi code that generates this machine code is shown in Listing 6. As you can see there are two types of exceptions that can be caught, EConvertError and EAccessViolation. However, the disassembler is having a hard time coping with the exception description record (which you can see starts at address $0045445d).

Figure 4: Poor disassembly due to a data record embedded in the code

Listing 6: A simple exception handler that compiles down to Figure 4

procedure TForm1.Button1Click(Sender: TObject);
begin
  try
    Tag := StrToInt(Edit1.Text)
  except
    on E: EConvertError do
      ShowMessage('Bad input');
    on E: EAccessViolation do
      ShowMessage('AV')
  end
end;

The record contains five bytes that map into a jump to the HandleOnException routine (as can be seen in Figure 4 if you scan down to the specified address). As Listing 4 shows, this jump instruction is followed by another instruction for a try/finally statement, or a count DWord followed by that many TExcDescEntry records for a try/except statement. Unfortunately, the disassembler does not know the following bytes are not instruction opcodes and so attempts to disassemble them, thereby having a knock-on effect for the disassembly of the real instructions that follow the record.

To try and help clear things up and see what should really be going on we should instruct the memory dump pane to dump memory starting from address $0045445D + 5 (using Ctrl+G), in DWord format. Figure 5 shows the result.

You can see the integer value of 2, indicating two TExcDescEntry records to follow, each of which is defined to have two 32-bit pointers (addresses) contained therein. The first one (which we will assume corresponds to the EConvertError handler) has a vTable field of $00417354 and a handler field of $00454476, the second one (presumably for EAccessViolation) has a vTable of $004173B0 and a handler of $00454485.

Figure 5: Looking at exception description record fields

Now we need to know what these pointers point to. Let's try the EConvertError description record. The vTable field will point at some data (something to do with the exception class's VMT) whilst the handler field points to code. Select the handler value ($454476 at address $45446A) and press Ctrl+E, then select the vTable value ($417354 at address $454466) and press Ctrl+D.

As you can see in Figure 6, this repositions the disassembly origin correctly on the EConvertError handling code which starts at address $454476. This address was actually present in Figure 4, but was interpreted by the disassembler as being in the middle of another instruction.

Figure 6 also shows a memory dump of the EConvertError VMT. Notice that the VMT has the class name stored just after it (we will see this again later).

Figure 6: Seeing the exception handling code hidden from us in Figure 4

So knowledge of how exception handlers are laid out allows you to get an accurate disassembly of the relevant instructions.

This sort of approach can also be useful if you do not have the source for a given subroutine containing some exception handling logic, but wish to know which exceptions are handled by it and what they do. Clearly, the TExcDescEntry records contained within the TExcDesc record generated for a try/except statement allow us to find these things out.

Object references and class references

An object reference is a pointer, this much is quite common knowledge to Delphi developers. But what does it point to? In abstract terms it points to an object, but this doesn't really help us understand what is going on.

It can be quite useful here to recall that an object is a collection of code (methods) and data. The data and code for any given object are defined by its class. There is no need for code to be duplicated for each individual object instance, so each instance uses the same code.

However, each object needs its own set of instance data so, when an object is created, memory is allocated to accommodate all the required data fields. An object reference points to the start of the instance data block.

For any given object's instance data, the first item is a class reference. Class references are variables that can refer to any of a number of related class types. Through a class reference you can call the constructor, without knowing which class will be constructed, as shown in Listing 7.

Listing 7: Use of a class reference

type
  //this type is defined in the VCL, but is shown here for clarity
  TComponentClass = class of TComponent; //this is a class reference type
...
var
  AClass: TComponentClass; //this is a class reference variable
  AnObject: TComponent;
...
AClass := TButton; //set the class reference to some component class
AnObject := AClass.Create(Self); //construct an instance of the chosen class
if AnObject is TControl then
  TControl(AnObject).Parent := Self

A class reference is also a pointer. It points to the VMT's first entry. For example, in a simple application with (amongst other things) a button (Button1) on a form, let's suppose a breakpoint is placed in one of the event handlers. When the breakpoint triggers and the debugger takes control, you open the CPU window and wish to investigate the internals of the button.

To do this you can select the memory dump pane, press Ctrl+G, enter the symbol Button1 and press Enter. Alternatively, find a reference to Button1 in the editor, highlight the word and drag it onto the memory dump pane (this dragging option was introduced in Delphi 5). Either of these steps will make the memory dump show you the address where the Button1 object reference field is stored, which means it will show you the portion of the form instance data containing the Button1 object reference.

In my test case, the memory dump pane's origin is $BB1938 (this is the address that is shown if you add a watch on the expression @Button1). The DWord value at this address (the value of the object reference) is $BB2E10, meaning that the instance data for Button1 starts at $BB2E10. Assuming this value is selected, pressing Ctrl+D repositions the memory dump to start at that address, thereby showing the button's instance data.

The first DWord in the instance data (at address $BB2E10) will be the class reference pointer, shown as $42AA30 in my case. Pressing Ctrl+D when this value is selected will de-reference the class reference pointer and display the VMT of the object's class, which in this case is TButton and starts at address $42AA30.

As well as using the class reference found in an object's instance data, you can also take advantage of a number of global class reference variables that exist in the process at run-time. Pressing Ctrl+G and entering TButton will take you to a memory location that holds a class reference for TButton. Pressing Ctrl+D will then take you into the VMT.

As a further shortcut, entering Pointer(TButton)^ in the Ctrl+G dialog will take you straight to the TButton VMT.
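The relationship between an object reference and a class reference can also be checked from code. The following is a minimal sketch rather than anything taken from the VCL: the PPtr type and the StartsWithClassRef function are hypothetical names introduced purely for illustration.

```pascal
program ClassRefDemo;
{$APPTYPE CONSOLE}

uses
  Classes;

type
  PPtr = ^Pointer; //hypothetical helper type: points at one pointer-sized item

//Returns True if the first item of Obj's instance data is its class reference
function StartsWithClassRef(Obj: TObject): Boolean;
begin
  //An object reference points at the start of the instance data, so
  //de-referencing it once yields the first item stored there
  Result := PPtr(Obj)^ = Pointer(Obj.ClassType);
end;

var
  List: TList;
begin
  List := TList.Create;
  try
    if StartsWithClassRef(List) then
      WriteLn(List.ClassName, ': instance data starts with the class reference');
  finally
    List.Free;
  end;
end.
```

This is the same de-referencing step that Ctrl+D performs in the memory dump pane, expressed as a typecast.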

Virtual Method Tables (VMTs)

A VMT is easy to recognise as it typically has a list of virtual method addresses in it. If the memory dump pane is displaying DWords, these will all be very similar values, often not very far from the $400000 base address for the whole process.

You can follow any of the VMT entries by selecting them and pressing Ctrl+E. For example, Figure 7 shows the TButton VMT after pressing Ctrl+E for the first entry. You can see that this entry corresponds to the AssignTo virtual method, which is implemented in TWinControl.

Figure 7: Looking at the VMT of TButton

You should note that not all classes have their own virtual methods over and above those defined in TObject (Exception is one such example). In these cases, there is still a VMT, but with no method addresses in the traditional way. Instead it is solely present for the VMT prefix fields as discussed below.

Virtual Method Table prefix fields

The VMT contains the addresses of methods declared with the virtual directive, or such methods overridden with the override directive. This does not include those defined in TObject (SafeCallException, AfterConstruction, BeforeDestruction, Dispatch, DefaultHandler, NewInstance, FreeInstance and the destructor Destroy) which have special locations residing before the VMT.

In fact the VMT is prefixed with many special fields including the pointer to the class RTTI, the pointer to the published field table, the pointer to the published method table, and so on. As a convenience, the System unit defines a number of constants to represent each of these fields in terms of their relative offsets from the VMT start, as mentioned earlier. A full list of them is given in Table 4.

Table 4: VMT prefix field constants

Constant name          Offset  Purpose
vmtSelfPtr             -76     Address of first VMT entry if any, or of class name
vmtIntfTable           -72     Address of implemented interface table
vmtAutoTable           -68     Address of automated class section (Delphi 2)
vmtInitTable           -64     Address of table of fields requiring initialisation
vmtTypeInfo            -60     Address of RTTI
vmtFieldTable          -56     Address of published field table
vmtMethodTable         -52     Address of published method table
vmtDynamicTable        -48     Address of DMT
vmtClassName           -44     Address of class name string
vmtInstanceSize        -40     Number of bytes of instance data required by object
vmtParent              -36     Address of ancestor class VMT
vmtSafeCallException   -32     Address of virtual method, SafeCallException
vmtAfterConstruction   -28     Address of virtual method, AfterConstruction
vmtBeforeDestruction   -24     Address of virtual method, BeforeDestruction
vmtDispatch            -20     Address of virtual method, Dispatch
vmtDefaultHandler      -16     Address of virtual method, DefaultHandler
vmtNewInstance         -12     Address of virtual method, NewInstance
vmtFreeInstance        -8      Address of virtual method, FreeInstance
vmtDestroy             -4      Address of virtual destructor, Destroy

So, to get taken to the implementation of the TButton destructor (using the addresses found before), press Ctrl+G in the memory dump pane, enter $42AA30 + vmtDestroy, press OK, then press Ctrl+E. The disassembly pane will then show the destructor (which for TButton is TWinControl.Destroy).

Note that, as discussed earlier, you can directly locate the VMT by appropriate use of the class type in the Ctrl+G dialog. So instead of $42AA30 + vmtDestroy, you could alternatively use PInteger(TButton)^ + vmtDestroy. This requires more typing, but requires less exploration to find correct addresses.
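The same address arithmetic can be tried out in source code, which is a handy way of convincing yourself that the prefix fields really are where the constants say they are. This is only a sketch under some assumptions: it presumes a 32-bit compiler (so that a class reference fits in an Integer) and a System unit that exposes the vmtInstanceSize constant. PSlot and VmtField are hypothetical names invented for the example.

```pascal
program VmtPrefixDemo;
{$APPTYPE CONSOLE}

uses
  Classes;

type
  PSlot = ^Integer; //hypothetical helper type: one 32-bit VMT prefix field

//Reads the 32-bit prefix field stored Offset bytes from the VMT start.
//Assumes a 32-bit target, where a class reference fits in an Integer.
function VmtField(AClass: TClass; Offset: Integer): Integer;
begin
  Result := PSlot(Integer(AClass) + Offset)^;
end;

begin
  //The two values should agree: the first is read directly from the
  //VMT prefix, the second comes from the RTL's own class method
  WriteLn(VmtField(TStringList, vmtInstanceSize));
  WriteLn(TStringList.InstanceSize);
end.
```

The expression Integer(AClass) + Offset plays the same role as $42AA30 + vmtDestroy does in the Ctrl+G dialog.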

If you have the Debug DCUs compiler option enabled, you will have easy access to the VCL source code. Enabling the disassembly pane, clicking on the first destructor instruction and then selecting View Source from the right-click menu (or pressing Ctrl+V) will load the relevant source file and locate the corresponding source line for you. The Debug DCUs option was added in Delphi 5 to the Compiler page of the project options dialog.

Virtual method calls

You can recognise virtual method calls as they take the form:

mov reg, VMT start
call dword ptr [reg + VMT offset]

For example, a simple call to Button1.Invalidate in an event handler expands to the assembly instructions in Listing 8. The VMT offset specified for Invalidate is $74 bytes. The first entry is at offset 0, the second at offset 4, the third at offset 8 and so on. So the entry number is given by (Offset / 4) + 1. Since $74 is 116 in decimal, this gives (116 / 4) + 1 = 30, which makes Invalidate the thirtieth entry in the VMT.

Listing 8: Calling a virtual method

mov eax, [ebp-$04] //Reload Self from the stack
mov eax, [eax+$000002d4] //Load Button1's object reference into EAX
mov edx, [eax] //Load VMT pointer into EDX
call dword ptr [edx+$74] //Call address $74 bytes into VMT
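The offset arithmetic for VMT entries can be written down as a one-line function. This is just an illustrative sketch; VmtEntryNumber is a made-up name, and the division by 4 assumes the 4-byte code addresses of a 32-bit target.

```pascal
program VmtSlotDemo;
{$APPTYPE CONSOLE}

//Converts a byte offset into the VMT to a 1-based entry number.
//Each entry is assumed to be a 4-byte code address (32-bit target).
function VmtEntryNumber(Offset: Integer): Integer;
begin
  Result := (Offset div 4) + 1;
end;

begin
  //$74 = 116, and 116 div 4 = 29, so this is entry number 30
  WriteLn(VmtEntryNumber($74));
end.
```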

Dynamic method calls

You can recognise a dynamic method call as it matches this standard pattern:

mov eax, object reference
mov bx, dynamic method index
call @DynaInst

A call to a button's Click method is translated in Delphi 5 to a dynamic method index $FFEC.

Debugging Techniques

Finding an object's class name

All classes have their class names stored as a short string just after their VMT. The actual address is stored in a field shortly before the VMT, as indicated by the vmtClassName constant. If you have access to an object reference, you can follow it through to the VMT, access the address field before the VMT, follow that address and you will see the class name.

Alternatively, you can just look at a memory dump of the VMT and scroll downwards. You will see the class name before long (see Figure 6).

If you have an object reference in EAX, entering PInteger(EAX)^ + vmtClassName in the Ctrl+G dialog will show you the address where the class name is stored. Pressing Ctrl+D takes you to the class name.
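The chain of pointers followed here (object reference, then VMT, then the vmtClassName field, then the short string) can be sketched in code as well. Again this assumes a 32-bit compiler and the System unit's vmtClassName constant. PSlot and ClassNameOf are hypothetical names, and in real code you would simply call the ClassName method.

```pascal
program ClassNameDemo;
{$APPTYPE CONSOLE}

uses
  Classes;

type
  PSlot = ^Integer; //hypothetical helper type: one 32-bit slot in memory

//Follows object reference -> VMT -> vmtClassName field -> short string.
//Assumes a 32-bit target, where addresses fit in an Integer.
function ClassNameOf(Obj: TObject): ShortString;
var
  Vmt: Integer;
begin
  Vmt := PSlot(Obj)^; //first DWord of instance data: the class reference
  Result := PShortString(PSlot(Vmt + vmtClassName)^)^; //follow the name address
end;

var
  List: TList;
begin
  List := TList.Create;
  try
    //Both names should match
    WriteLn(ClassNameOf(List), ' ', List.ClassName);
  finally
    List.Free;
  end;
end.
```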

Finding the size of a block of memory

When the Delphi 32-bit memory manager allocates a block of memory, it comes via a suballocator. You can find the suballocator source, formatted with very little consideration for easy-to-read indentation, in the GetMem.Inc file in the RTL\SYS subdirectory under Delphi's Source directory.

The suballocator allocates large blocks of memory from the Windows heap using Windows heap allocation routines. As RTL memory allocation routines are called, it divides these large blocks into appropriately sized chunks of memory and returns them, which is where the term suballocator comes from.

The suballocator records additional information in the four bytes before each allocated block. Primarily, the information recorded is the size of the allocated block, which potentially allows a maximum size of 4GB-1. However, the size value is a signed integer, so in fact you have a maximum of 2GB-1, and the high bit should therefore be ignored.

If you have a pointer to any block of memory allocated through the Delphi RTL (not directly through Windows API calls) you can readily find out how large it is. Suppose you have an address in EAX, such as an object reference. Recall that an object reference is a pointer to memory allocated to hold instance data. Enable the memory dump pane, ensure it is displaying DWords, press Ctrl+G and enter EAX-4.

This will dump the memory starting four bytes before the address held in EAX, and so show the heap block prefix DWord, followed by the object's instance data. The first DWord of instance data will be the address of the VMT (the class reference). The memory dump pane will look like Figure 8, where you can see that the prefix DWord has a value of $206.

Figure 8: Looking at a heap block's prefix DWord

One little-known aspect of the suballocator is that it always rounds memory block sizes up to a multiple of four bytes, which leaves the lowest two bits of the prefix DWord free for other uses. So, in order to calculate the size of the block, it is necessary to mask off the lowest 2 bits. A binary and operation with $FFFFFFFC will do this: $206 and $FFFFFFFC gives $204.

This represents the size of the button's instance data plus the size of the prefix DWord. The implication here is that the button has $200 bytes of instance data.

If you follow the VMT pointer and read the field at the vmtInstanceSize offset, you will find that the class claims to require $200 bytes of instance data, which agrees with the figure calculated from the heap prefix.
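The masking arithmetic is easy to get wrong by hand, so here it is as a small sketch. BlockSizeFromPrefix is a hypothetical helper that works purely on a prefix value you have read from the memory dump pane; it does not touch the heap itself.

```pascal
program BlockSizeDemo;
{$APPTYPE CONSOLE}

//Masks off the two low bits of a heap prefix DWord to leave the block size
function BlockSizeFromPrefix(Prefix: Cardinal): Cardinal;
begin
  Result := Prefix and $FFFFFFFC;
end;

begin
  //$206 and $FFFFFFFC = $204; subtracting the 4-byte prefix
  //leaves $200 (512 in decimal) bytes of instance data
  WriteLn(BlockSizeFromPrefix($206) - 4);
end.
```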

Testing if an object has been destroyed

If you are debugging an application and you have a suspicion that the object pointed to by an object reference has been destroyed, you may be able to verify this in a very similar way to checking the size of its instance data block.

It was mentioned that heap block sizes are rounded up to a multiple of four bytes, which leaves two spare bits in the prefix DWord. These bits are used to store internal information for the heap suballocator. We can use these to our advantage.

First let's consider the possibilities:

To check what is going on, examine the object reference and what it points to in the memory dump pane. If it looks like it is pointing at garbage, then the object has been destroyed and the memory has been trashed. However, if the displayed memory looks like it might be an object, try following the VMT pointer at the start of the instance data. Assuming it takes you to what looks like a VMT, and the VMT has the right class name following it, then we need to do more checks.

Ctrl+P (Previous on the right-click menu) takes the memory dump pane back to the previous location it was dumping. Back on the dump of the instance data, look at the four bytes immediately before it to see the heap prefix DWord. If you are looking at the beginning of the instance data, you can press Shift+Ctrl+← four times.

In the case of a valid button object the value is $206. Stripping off the top 30 bits (by performing a binary and with 3) gives 2, where bit 0 has a value of 0 and bit 1 has a value of 1. So what does this mean?

Well, the least significant bit (LSB), bit 0, indicates if the block is free or in use (it is free if the bit is set). Bit 1 seems to be used as a backup flag, being set when the memory block is in use. The state of these flags is checked regularly in the heap code, checking for erroneous permutations. The important one to focus on is the LSB.

After destroying the button by calling its Free method, the button's heap prefix has a value of $207, making the flags have a value of 3 (both bits set). Since the LSB is set, the heap block is marked as free.

Another, easier test is to see if the reported block size is odd. If the value is odd, the LSB will be set, and the block is therefore free. If the block size is even, the LSB is clear and the block is still in use.
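The flag tests can be summarised in a short sketch. PrefixFlags and IsMarkedFree are hypothetical helpers operating on a prefix value copied from the memory dump pane. They encode only the observations made above about the Delphi 5 suballocator, and other memory managers may well use a different layout.

```pascal
program HeapFlagsDemo;
{$APPTYPE CONSOLE}

//Returns the two low bits of a heap prefix DWord
function PrefixFlags(Prefix: Cardinal): Cardinal;
begin
  Result := Prefix and 3;
end;

//A set least significant bit marks the block as free
function IsMarkedFree(Prefix: Cardinal): Boolean;
begin
  Result := (Prefix and 1) <> 0;
end;

begin
  WriteLn(PrefixFlags($206), ' ', IsMarkedFree($206)); //flags 2: block in use
  WriteLn(PrefixFlags($207), ' ', IsMarkedFree($207)); //flags 3: block free
end.
```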

Keeping a variable alive against the optimiser's wishes

Sometimes when debugging, you will find a variable is irritatingly unavailable just when you want to check its value. This problem is exacerbated with optimisation turned on.

A common way around the problem is to pass the variable to a helper routine at various points in the code. Such a helper routine should take a single untyped var parameter, but should do absolutely nothing (see Listing 9). Any variable can therefore be passed to the parameter and the compiler will have little choice but to keep it available.

Listing 9: A routine to help keep variables alive

procedure Touch(var X);
begin
end;

Bear in mind that Inprise R&D recommend not relying on this approach working forever, as the compiler might one day be made smart enough to notice that the routine does nothing, and so not call it. This would re-introduce the same scope problem.
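As an illustration of where such a call might be placed, the sketch below uses the do-nothing helper inside a loop. SumOfSquares and its Square variable are invented for the example. Whether the variable would otherwise be optimised away depends on the compiler version and settings.

```pascal
program TouchDemo;
{$APPTYPE CONSOLE}

//Does nothing, but taking X by reference obliges the caller
//to have the variable available at the point of the call
procedure Touch(var X);
begin
end;

function SumOfSquares(N: Integer): Integer;
var
  I, Square: Integer;
begin
  Result := 0;
  for I := 1 to N do
  begin
    Square := I * I;
    Inc(Result, Square);
    Touch(Square); //Square can still be inspected on this line
  end;
end;

begin
  WriteLn(SumOfSquares(3)); //1 + 4 + 9 = 14
end.
```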

Closing in on Access Violations

Most times an Access Violation occurs in the debugger, by the time the debugger pops up and announces it to you, having suspended your program, you are not in the offending routine's scope. This means that when you try and evaluate your variables, the debugger claims not to understand what you mean.

This typically happens when the actual problem occurs in the implementation of a routine you do not have source for, called by your routine.

One way of helping out here is to add in helper handlers. These are exception handling constructs which do not actually perform any exception handling per se, but instead simply cause an exception handling frame to be added into the subroutine (see Listing 10).

Consequently when an Access Violation occurs in the routine at some point, your helper handler will pick up the problem and so your routine will be in the current scope. This means you can evaluate expressions based in this subroutine.

Listing 10: Helper Handlers

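try
  //code that you suspect may induce an AV
except
  raise; //re-raise unchanged: the handler is only here to add a frame
end;
...
try
  //code that you suspect may induce an AV
finally
  //deliberately empty: the frame itself is all that is needed
end;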

Re-executing code

This may not be an entirely practical suggestion, but it is very possible to re-execute one or more machine instructions. All you need to do is change the value of EIP in the CPU registers pane, giving it a value corresponding to the start of an already executed instruction.

You should take care when doing this to ensure that the repeated execution of a statement does not cause the stack to get in a muddle, or problems will ensue. Another possible problem would be caused by resetting EIP in the middle of a for loop, which can cause the loop counter to be incremented in unexpected ways. The implementation of the for loop at machine level, particularly when optimisations are enabled, is often done in a surprising fashion.

In general, you should try to reset EIP to point to the start of a source line, as registers tend to be reloaded for each source line due to the way the compiler generates machine code.

Other Tips

Try using the event log window (View | Debug Windows | Event Log) to help debug applications. Calls to OutputDebugString made in your application cause entries to be added to the event log, so it can be used as an execution trace. A nice implementation point of OutputDebugString is that if your program is not running within a debugger, it does absolutely nothing, returning immediately.
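A simple trace helper built on OutputDebugString might look like the following sketch. OutputDebugString itself is the Windows API routine declared in the Windows unit. TraceLine, Trace and the message format are hypothetical and chosen just for the example.

```pascal
program TraceDemo;
{$APPTYPE CONSOLE}

uses
  Windows, SysUtils;

//Builds the text of one trace entry (hypothetical format)
function TraceLine(const Routine, Msg: string): string;
begin
  Result := Format('%s: %s', [Routine, Msg]);
end;

//Sends one trace entry to the debugger, where it appears in the event log
procedure Trace(const Routine, Msg: string);
begin
  OutputDebugString(PChar(TraceLine(Routine, Msg)));
end;

begin
  Trace('Main', 'starting');
  //...the code being traced would go here...
  Trace('Main', 'finished');
end.
```

Wrapping the API call in one routine means the trace format can be changed, or the tracing removed, in a single place.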

You can also make use of the event log by setting advanced breakpoint properties. Breakpoints do not necessarily have to break the execution of your program. They can evaluate expressions and (optionally) add the result to the event log. They can also enable or disable whole groups of other breakpoints, thereby turning on or off other trace/break options that you set up. These advanced breakpoint properties were added in Delphi 5.

Summary

In a 75 minute conference talk, we are unable to go too deep, but you should be starting to see the power available when using the CPU window. Hopefully, after hearing or reading this paper you will feel confident enough to start experimenting with it on your own.

References

  1. http://developer.intel.com/design/PentiumIII/manuals - has links to downloadable manuals (in PDF format) including Volumes 1, 2 and 3 of the Intel Architecture Software Developer's Manual.
  2. http://www.online.ee/~andre/i80386 - Intel 80386 Programmer's Reference, explaining all about assembly programming in depth
  3. http://www.jegerlehner.ch/intel - contains a PDF file that prints out to a two page summary of machine instructions

About Brian Long

Brian Long used to work at Borland UK, performing a number of duties including Technical Support on all the programming tools. Since leaving in 1995, Brian has spent the intervening years as a trainer, trouble-shooter and mentor focusing on the use of the C#, Delphi and C++ languages, and of the Win32 and .NET platforms. In his spare time Brian actively researches and employs strategies for the convenient identification, isolation and removal of malware. If you need training in these areas or need solutions to problems you have with them, please get in touch or visit Brian's Web site.

Brian authored a Borland Pascal problem-solving book in 1994 and occasionally acts as a Technical Editor for Wiley (previously Sybex); he was the Technical Editor for Mastering Delphi 7 and Mastering Delphi 2005 and also contributed a chapter to Delphi for .NET Developer Guide. Brian is a regular columnist in The Delphi Magazine and has had numerous articles published in Developer's Review, Computing, Delphi Developer's Journal and EXE Magazine. He was nominated for the Spirit of Delphi award in 2000.