AN2203 Freescale Semiconductor / Motorola, AN2203 Datasheet - Page 46

Optimizations to Exploit the Memory Hierarchy
4.2.2 Using the Link Register
The CTR instruction pair mtctr/bcctr should be used for all computed branches. This includes case
statement jumps and all indirect function calls. Note that to save the return address on indirect function calls,
the link form of the bcctr instruction (bcctrl) should be used. The LR-based indirect branch (bclr) should
be used only for subroutine call/return. Misusing the LR and CTR can corrupt the hardware link stack such
that several future branches are mispredicted.
See Section 3.1.4, “Using the Link Register (LR) Versus the Count Register (CTR) for Branch Indirect
Instructions,” for more information.
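As a rough illustration of a computed branch at the source level, the C sketch below dispatches through a table of function pointers. The handler names and the dispatch function are hypothetical, invented for this example. A PowerPC compiler would typically implement the indirect call by moving the target address into the CTR (mtctr) and branching with the link form (bcctrl), which leaves the LR for the call/return pair, but the exact instruction sequence depends on the compiler and should be confirmed in the generated assembly.

```c
#include <stddef.h>

/* Hypothetical handlers, for illustration only. */
static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }

typedef int (*op_fn)(int, int);

static const op_fn op_table[] = { op_add, op_sub, op_mul };

/*
 * Indirect call through a table: a computed branch.
 * The target is only known at run time, so the call cannot be a
 * direct branch. Returns 0 for an opcode outside the table.
 */
int dispatch(size_t opcode, int a, int b)
{
    if (opcode >= sizeof op_table / sizeof op_table[0])
        return 0;
    return op_table[opcode](a, b);
}
```

For example, dispatch(0, 2, 3) calls op_add and dispatch(2, 2, 3) calls op_mul. The return from each handler is the subroutine return case for which the LR-based branch is intended.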
4.2.3 Branch Bubbles
Where possible, branches should be biased as fall-through. This is because taken branches can interrupt the
fetch supply. On the MPC7450, a taken branch incurs a 1–2 cycle fetch bubble. A 1-cycle bubble occurs for
a b or bc with a BTIC hit. A 2-cycle bubble occurs for a BTIC miss or for branches that cannot use the BTIC
(bcctr, bclr). The 2-cycle fetch bubble is due to the 2-cycle fetch latency to the instruction cache.
Section 3.1.1.1, “Fetch Alignment Example,” and Section 3.1.1.2, “Branch-Taken Bubble Example,” show
how the fetch supply works and why it is useful to bias branches to the not-taken case.
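One way to ask a compiler for fall-through bias is a branch hint. The sketch below assumes GCC or Clang, which provide __builtin_expect; the UNLIKELY macro and the sum_nonnegative function are hypothetical names used only for this example. The hint is a request, not a guarantee: the compiler chooses the final code layout, and the loop's backward branch is still a taken branch.

```c
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define UNLIKELY(x) (x)   /* no hint available: plain condition */
#endif

/*
 * Sums n values. A negative entry is treated as a rare error:
 * the function sets *error to 1 and returns the sum accumulated so far.
 */
long sum_nonnegative(const int *v, size_t n, int *error)
{
    long total = 0;
    *error = 0;
    for (size_t i = 0; i < n; i++) {
        if (UNLIKELY(v[i] < 0)) {   /* rare path, hinted as not taken */
            *error = 1;
            break;
        }
        total += v[i];              /* common path, intended fall-through */
    }
    return total;
}
```

With the hint in place, the compiler is encouraged to put the error handling out of line so that the common path avoids a taken branch at the check.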
4.2.4 Branch Dependencies
The availability of eight CR fields in the PowerPC architecture means that multiple condition checks can
effectively occur simultaneously. Some scenarios can take advantage of this to handle branch-dependent
indicators such that the branch resolves before it would be predicted, eliminating the cost of misprediction.
Even if the branch is mispredicted, having data earlier may allow the mispredict recovery to occur earlier.
Issuing a mtctr or mtlr instruction well ahead of its dependent branch instruction can often help avoid stalls
or mispredictions as well.
4.3 Optimizations to Exploit the Memory Hierarchy
Memory considerations can also affect code performance. This section describes several areas where there
is opportunity for optimization.
4.3.1 Data Alignment
Any data cache access crossing a double-word boundary (with the exception of vectors, which are naturally
quad-word based accesses) causes misalignment, and incurs at least one additional cycle of latency. See
Section 3.7.5, “Misalignment Effects,” for more MPC7450 specific information.
Note that misalignment penalties may increase on future high-performance microprocessors.
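The double-word rule can be checked with simple address arithmetic. The following sketch models scalar accesses only (vector accesses are excluded, as noted above) and takes a double word to be 8 bytes. The function name crosses_doubleword is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

#define DOUBLEWORD_BYTES 8u

/*
 * Returns 1 if the access covering bytes [addr, addr + size) spans a
 * double-word boundary, and 0 otherwise. A zero-length access never
 * crosses a boundary.
 */
int crosses_doubleword(uintptr_t addr, size_t size)
{
    if (size == 0)
        return 0;
    uintptr_t first = addr / DOUBLEWORD_BYTES;              /* double word holding the first byte */
    uintptr_t last  = (addr + size - 1) / DOUBLEWORD_BYTES; /* double word holding the last byte */
    return first != last;
}
```

For instance, a 4-byte load at address 0x1004 stays within one double word, while the same load at 0x1006 covers bytes 0x1006 through 0x1009 and crosses the boundary at 0x1008.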
4.3.2 Instruction Code Alignment
Aligning a branch target can help the fetch supply. The preferred alignment for the MPC7450 places the
first four instructions of a branch target in the same cache block. See Section 3.1.1.1,
“Fetch Alignment Example,” for more information.
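The alignment condition can be expressed as a check on the branch target address. This sketch assumes a 32-byte cache block and 4-byte instructions; both values are stated here as assumptions rather than taken from this section, and the function name first_four_in_one_block is hypothetical.

```c
#include <stdint.h>

#define CACHE_BLOCK_BYTES 32u  /* assumption: 32-byte instruction cache block */
#define INSN_BYTES         4u  /* assumption: fixed 4-byte instructions */
#define FETCH_GROUP        4u  /* the first four instructions at the target */

/*
 * Returns 1 if the four instructions starting at 'target' all lie in
 * the same cache block, and 0 if they straddle a block boundary.
 */
int first_four_in_one_block(uintptr_t target)
{
    uintptr_t offset = target % CACHE_BLOCK_BYTES;  /* position within the block */
    return offset + FETCH_GROUP * INSN_BYTES <= CACHE_BLOCK_BYTES;
}
```

Under these assumptions, a target at offset 0x10 within its block still fits (bytes 0x10 through 0x1F), while a target at offset 0x14 spills its fourth instruction into the next block.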
MPC7450 RISC Microprocessor Family Software Optimization Guide
Freescale Semiconductor, Inc.