AN2797 Freescale Semiconductor / Motorola, AN2797 Datasheet - Page 5

Manufacturer Part Number: AN2797
Description: Migrating from IBM 750GX to MPC7447A
Manufacturer: Freescale Semiconductor / Motorola
2.1.4 Branch Processing Unit
The branch processing unit in the IBM 750GX can process one branch per cycle while resolving two speculative
branches per cycle. It uses a 512-entry branch history table (BHT) for dynamic branch prediction, with four
possible prediction states (not taken, strongly not taken, taken, strongly taken), and incorporates a 64-entry
branch target instruction cache (BTIC). For a given branch target address, the BTIC supplies the next
instruction(s) directly, rather than from the instruction cache, which reduces branch delay by avoiding a
1-clock-cycle penalty.
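The four prediction states described above can be illustrated with a 2-bit saturating counter per BHT entry. The following is a minimal sketch, not a description of the actual hardware: the saturating-counter update rule, the indexing by word address, and the class and method names are all assumptions made for illustration. Only the four state names and the table sizes come from the text.

```python
from enum import IntEnum


class State(IntEnum):
    """The four prediction states named in the text."""
    STRONGLY_NOT_TAKEN = 0
    NOT_TAKEN = 1  # also called "weakly not taken"
    TAKEN = 2
    STRONGLY_TAKEN = 3


class BranchHistoryTable:
    """Hypothetical model of a BHT built from 2-bit saturating counters."""

    def __init__(self, entries=512):
        # 512 entries for the IBM 750GX per the text; the size is a parameter.
        self.entries = entries
        self.clear()

    def clear(self):
        # Assumption: every entry starts out as (weakly) not taken.
        self.table = [State.NOT_TAKEN] * self.entries

    def _index(self, address):
        # Assumption: index with the word address modulo the table size.
        return (address >> 2) % self.entries

    def predict(self, address):
        """Return True if the branch at this address is predicted taken."""
        return self.table[self._index(address)] >= State.TAKEN

    def update(self, address, taken):
        """Move the counter one step toward the actual outcome, saturating."""
        i = self._index(address)
        if taken:
            new = min(self.table[i] + 1, State.STRONGLY_TAKEN)
        else:
            new = max(self.table[i] - 1, State.STRONGLY_NOT_TAKEN)
        self.table[i] = State(new)
```

Under this model, two branches whose word addresses differ by a multiple of the table size share one entry, so a larger table reduces the chance that unrelated branches disturb each other's prediction history.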
In contrast, the MPC7447A processes one branch per cycle, like the IBM 750GX, but can resolve three speculative
branches per cycle. Its BHT is larger, with 2048 entries, and offers the same four prediction states. In
addition, the BHT can be cleared to weakly not taken using HID0[BHTCLR]. The BTIC is twice the size of that in
the IBM 750GX, providing 128 entries organized as 32 sets in a 4-way set-associative arrangement.
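To make the 32-set, 4-way organization concrete, the sketch below models a small set-associative target cache. It is illustrative only: the index and tag split, the oldest-first replacement policy, and all names are assumptions, since the text states only the entry count, the set count, and the associativity.

```python
class BranchTargetCache:
    """Hypothetical set-associative cache keyed by branch target address."""

    def __init__(self, sets=32, ways=4):
        # 32 sets x 4 ways = 128 entries, matching the MPC7447A figures.
        self.sets = sets
        self.ways = ways
        # Each set holds up to `ways` (tag, instructions) pairs, oldest first.
        self.lines = [[] for _ in range(sets)]

    @property
    def entries(self):
        return self.sets * self.ways

    def _split(self, target):
        # Assumption: low bits of the word address select the set,
        # and the remaining bits form the tag.
        word = target >> 2
        return word % self.sets, word // self.sets

    def lookup(self, target):
        """Return the cached instructions for a target, or None on a miss."""
        index, tag = self._split(target)
        for entry_tag, instructions in self.lines[index]:
            if entry_tag == tag:
                return instructions
        return None

    def fill(self, target, instructions):
        """Insert or refresh an entry, evicting the oldest when the set is full."""
        index, tag = self._split(target)
        entries = self.lines[index]
        entries[:] = [e for e in entries if e[0] != tag]
        if len(entries) == self.ways:
            entries.pop(0)  # assumed replacement policy: oldest first
        entries.append((tag, instructions))
```

With four ways per set, up to four targets that map to the same set can be held at once; a fifth such target displaces one of them in this model.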
2.1.5 Completion Unit
In the IBM 750GX, the completion unit works with the dispatch unit to track dispatched instructions in the
completion queue and retire them in order. In keeping with the dispatch unit, two instructions can be retired per
clock cycle, provided that there are slots available in the completion queue. When an instruction is removed
from the queue, its rename buffers must have been freed and any results written to processor registers such as the
GPRs, FPRs, link register (LR), and count register (CTR).
In the MPC7447A, the deeper pipelines allow up to sixteen instructions to be in some stage of pipeline
processing at once. The completion queue has sixteen slots, and a maximum of three instructions can be retired
per clock.
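The effect of the retirement width can be illustrated with a small in-order retirement model. This is a simplified sketch under stated assumptions: it ignores completion queue capacity and dispatch limits, and the function name and the cycle numbering are hypothetical. Only the retirement widths (two for the IBM 750GX, three for the MPC7447A) come from the text.

```python
IBM_750GX_RETIRE_WIDTH = 2  # instructions retired per clock, from the text
MPC7447A_RETIRE_WIDTH = 3   # instructions retired per clock, from the text


def retire_cycles(finish_cycles, retire_width):
    """Return the cycle in which each instruction retires.

    finish_cycles lists, in program order, the cycle in which each
    instruction finishes executing. Retirement is in order: an instruction
    cannot retire before it has finished, nor before any older instruction,
    and at most retire_width instructions retire in one cycle.
    """
    retired = []
    cycle = 0
    used = 0  # retirement slots already used in the current cycle
    for finish in finish_cycles:
        if finish > cycle:
            # Wait for this instruction to finish; a new cycle starts empty.
            cycle = finish
            used = 0
        if used == retire_width:
            # Retirement bandwidth exhausted; move to the next cycle.
            cycle += 1
            used = 0
        retired.append(cycle)
        used += 1
    return retired
```

For six instructions that all finish in cycle 1, this model needs three cycles to retire them at a width of two and two cycles at a width of three. A long-latency instruction at the head of the queue delays every younger instruction behind it regardless of width.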
2.2 Pipeline Comparison
The difference in pipeline depth between the IBM 750GX and the MPC7447A is significant. In the IBM 750GX,
the minimum depth has been kept to a rather short four stages: instruction fetch, dispatch/decode, execute, and
complete. Write-back is included in the complete stage. The pipeline diagram for the IBM 750GX is shown in
Figure 3.
Migrating from IBM 750GX to MPC7447A, Rev. 1.0
Freescale Semiconductor
