tm1300 NXP Semiconductors, tm1300 Datasheet - Page 74


Manufacturer Part Number: tm1300
Description: TM-1300 Media Processor
Manufacturer: NXP Semiconductors
TM1300 Data Book, Philips Semiconductors, Product Specification

    void reconstruct (unsigned char *back,
                      unsigned char *forward,
                      char *idct,
                      unsigned char *destination)
    {
        int i;
        int *i_back    = (int *) back;
        int *i_forward = (int *) forward;
        int *i_idct    = (int *) idct;
        int *i_dest    = (int *) destination;

        for (i = 0; i < 16; i += 1)
            i_dest[i] = DSPUQUADADDUI(QUADAVG(i_back[i], i_forward[i]), i_idct[i]);
    }

Figure 4-8. Final version of the frame-reconstruction code.

Figure 4-9 shows the original source code for the match-cost loop. Unlike the previous example, the code is not a self-contained function. Somewhere early in the code, the arrays A[][] and B[][] are declared; somewhere between those declarations and the loop of interest, the arrays are filled with data.

    unsigned char A[16][16];
    unsigned char B[16][16];
    .
    .
    .
    for (row = 0; row < 16; row += 1)
    {
        for (col = 0; col < 16; col += 1)
            cost += abs(A[row][col] - B[row][col]);
    }

Figure 4-9. Match-cost loop for MPEG motion estimation.

4.4.1 A Simple Transformation

First, we will look at the simplest way to use a TM1300 custom operation.

We start by noticing that the computation in the loop of Figure 4-9 involves the absolute value of the difference of two unsigned characters (bytes). By now, we are familiar with the fact that TM1300 includes a number of operations that process all four bytes in a 32-bit word simultaneously. Since the match-cost calculation is fundamental to the MPEG algorithm, it is not surprising to find a custom operation, ume8uu, that implements this operation exactly.

To understand how ume8uu can be used in this case, we need to transform the code as in the previous example. Though the steps are presented here in detail, a programmer with even a little experience can often perform these transformations by visual inspection.

To use a custom operation that processes 4 pixel values simultaneously, we first need to create 4 parallel pixel computations. Figure 4-10 shows the loop of Figure 4-9 unrolled by a factor of 4. Unfortunately, the code in the unrolled loop is not parallel because each line depends on the one above it. Figure 4-11 shows a more parallel version of the code from Figure 4-10. By simply giving each computation its own cost variable and then summing the costs all at once, each cost computation is completely independent.

    unsigned char A[16][16];
    unsigned char B[16][16];
    .
    .
    .
    for (row = 0; row < 16; row += 1)
    {
        for (col = 0; col < 16; col += 4)
        {
            cost += abs(A[row][col+0] - B[row][col+0]);
            cost += abs(A[row][col+1] - B[row][col+1]);
            cost += abs(A[row][col+2] - B[row][col+2]);
            cost += abs(A[row][col+3] - B[row][col+3]);
        }
    }

Figure 4-10. Unrolled, but not parallel, version of the loop from Figure 4-9.
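The transformation described in this section can be checked on any host compiler with a small portable sketch. It is only an illustration under stated assumptions, not the Data Book's own Figure 4-11 and not TM1300 code. The function names (match_cost_scalar, match_cost_independent, match_cost_words, sad4_emulated) are hypothetical, and sad4_emulated is a plain C stand-in for the behavior the text attributes to ume8uu (sum of absolute differences over the four unsigned bytes of two 32-bit words). The actual custom-operation syntax is not shown here.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical portable stand-in for a four-byte sum of absolute
   differences. Assumption: it models what the text says ume8uu does;
   it is not the TM1300 operation itself. */
unsigned sad4_emulated(uint32_t a, uint32_t b)
{
    unsigned sum = 0;
    for (int k = 0; k < 4; k += 1) {
        int x = (int)((a >> (8 * k)) & 0xFFu);   /* byte k of a */
        int y = (int)((b >> (8 * k)) & 0xFFu);   /* byte k of b */
        sum += (unsigned)abs(x - y);
    }
    return sum;
}

/* Reference version: the one-pixel-per-iteration loop of Figure 4-9. */
unsigned match_cost_scalar(unsigned char A[16][16], unsigned char B[16][16])
{
    unsigned cost = 0;
    for (int row = 0; row < 16; row += 1)
        for (int col = 0; col < 16; col += 1)
            cost += (unsigned)abs(A[row][col] - B[row][col]);
    return cost;
}

/* Unrolled by 4, with one cost variable per pixel so that the four
   computations no longer depend on each other; the partial costs are
   summed in a single statement. */
unsigned match_cost_independent(unsigned char A[16][16], unsigned char B[16][16])
{
    unsigned cost = 0;
    for (int row = 0; row < 16; row += 1) {
        for (int col = 0; col < 16; col += 4) {
            unsigned cost0 = (unsigned)abs(A[row][col + 0] - B[row][col + 0]);
            unsigned cost1 = (unsigned)abs(A[row][col + 1] - B[row][col + 1]);
            unsigned cost2 = (unsigned)abs(A[row][col + 2] - B[row][col + 2]);
            unsigned cost3 = (unsigned)abs(A[row][col + 3] - B[row][col + 3]);
            cost += cost0 + cost1 + cost2 + cost3;
        }
    }
    return cost;
}

/* Word-at-a-time version: four pixels are loaded as one 32-bit word
   and handed to the emulated four-byte operation. memcpy avoids
   alignment and aliasing problems on the host. */
unsigned match_cost_words(unsigned char A[16][16], unsigned char B[16][16])
{
    unsigned cost = 0;
    for (int row = 0; row < 16; row += 1) {
        for (int col = 0; col < 16; col += 4) {
            uint32_t a, b;
            memcpy(&a, &A[row][col], sizeof a);
            memcpy(&b, &B[row][col], sizeof b);
            cost += sad4_emulated(a, b);
        }
    }
    return cost;
}
```

All three functions should return the same cost for the same input blocks. The result does not depend on host byte order, because both words are loaded the same way and the sum runs over all four byte positions.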
