PNX1302EH NXP Semiconductors, PNX1302EH Datasheet - Page 81

Philips Semiconductors

Excluding the array accesses, the loop body in Figure 4-11 is now recognizable as the function performed by the ume8uu custom operation: the sum of 4 absolute values of 4 differences. To use the ume8uu operation, however, the code must access the arrays with 32-bit word pointers instead of with 8-bit byte pointers.

    unsigned char A[16][16];
    unsigned char B[16][16];

    for (row = 0; row < 16; row += 1)
    {
        for (col = 0; col < 16; col += 4)
        {
            cost0 = abs(A[row][col+0] - B[row][col+0]);
            cost1 = abs(A[row][col+1] - B[row][col+1]);
            cost2 = abs(A[row][col+2] - B[row][col+2]);
            cost3 = abs(A[row][col+3] - B[row][col+3]);
            cost += cost0 + cost1 + cost2 + cost3;
        }
    }

Figure 4-11. Parallel version of Figure 4-10.

Figure 4-12 shows the loop recoded to access A[][] and B[][] as one-dimensional instead of two-dimensional arrays. We take advantage of our knowledge of C-language array storage conventions to perform this code transformation. Recoding to use one-dimensional arrays prepares the code for transformation to 32-bit array accesses.

(From here on, until the final code is shown, the declarations of the A and B arrays will be omitted from the code fragments for the sake of brevity.)

    unsigned char A[16][16];
    unsigned char B[16][16];

    unsigned char *CA = (unsigned char *) A;
    unsigned char *CB = (unsigned char *) B;

    for (row = 0; row < 16; row += 1)
    {
        int rowoffset = row * 16;

        for (col = 0; col < 16; col += 4)
        {
            cost0 = abs(CA[rowoffset + col+0] - CB[rowoffset + col+0]);
            cost1 = abs(CA[rowoffset + col+1] - CB[rowoffset + col+1]);
            cost2 = abs(CA[rowoffset + col+2] - CB[rowoffset + col+2]);
            cost3 = abs(CA[rowoffset + col+3] - CB[rowoffset + col+3]);
            cost += cost0 + cost1 + cost2 + cost3;
        }
    }

Figure 4-12. The loop of Figure 4-11 recoded with one-dimensional array accesses.

    unsigned int *IA = (unsigned int *) A;
    unsigned int *IB = (unsigned int *) B;

    for (i = 0; i < 64; i += 1)
        cost += UME8UU(IA[i], IB[i]);

Figure 4-13. The loop of Figure 4-14 with the inner loop eliminated.
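As a rough check that the recodings described above leave the computed cost unchanged, the following host-side C sketch compares the two-dimensional byte loop, the one-dimensional byte loop, and a word-at-a-time loop. All function names here (ume8uu_model, cost_2d, cost_1d, cost_words, checked_cost) are hypothetical and do not come from the data book. In particular, ume8uu_model is only a software stand-in written from the description "the sum of 4 absolute values of 4 differences"; it is not the compiler's UME8UU custom operation. The sketch also assumes a generic host compiler, so it loads each 32-bit word with memcpy instead of dereferencing a cast pointer, which avoids relying on the alignment of the byte arrays.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical software model of ume8uu: treats each 32-bit word as four
   unsigned bytes and returns the sum of the four absolute differences. */
static uint32_t ume8uu_model(uint32_t a, uint32_t b)
{
    uint32_t sum = 0;
    for (int k = 0; k < 4; k++) {
        int x = (int)((a >> (8 * k)) & 0xFF);
        int y = (int)((b >> (8 * k)) & 0xFF);
        sum += (uint32_t)abs(x - y);
    }
    return sum;
}

/* Byte-wise match cost using two-dimensional indexing. */
static uint32_t cost_2d(unsigned char A[16][16], unsigned char B[16][16])
{
    uint32_t cost = 0;
    for (int row = 0; row < 16; row++)
        for (int col = 0; col < 16; col++)
            cost += (uint32_t)abs(A[row][col] - B[row][col]);
    return cost;
}

/* Same cost through one-dimensional byte pointers. C stores arrays in
   row-major order, so element [row][col] sits at offset row * 16 + col. */
static uint32_t cost_1d(const unsigned char *CA, const unsigned char *CB)
{
    uint32_t cost = 0;
    for (int row = 0; row < 16; row++) {
        int rowoffset = row * 16;
        for (int col = 0; col < 16; col++)
            cost += (uint32_t)abs(CA[rowoffset + col] - CB[rowoffset + col]);
    }
    return cost;
}

/* Same cost taken four bytes at a time: 256 bytes are 64 32-bit words. */
static uint32_t cost_words(const unsigned char *CA, const unsigned char *CB)
{
    uint32_t cost = 0;
    for (int i = 0; i < 64; i++) {
        uint32_t wa, wb;
        memcpy(&wa, CA + 4 * i, sizeof wa);  /* portable 32-bit load */
        memcpy(&wb, CB + 4 * i, sizeof wb);
        cost += ume8uu_model(wa, wb);
    }
    return cost;
}

/* Returns the cost after asserting that all three formulations agree. */
static uint32_t checked_cost(unsigned char A[16][16], unsigned char B[16][16])
{
    uint32_t c = cost_2d(A, B);
    assert(c == cost_1d(&A[0][0], &B[0][0]));
    assert(c == cost_words(&A[0][0], &B[0][0]));
    return c;
}
```

Because the model sums all four byte lanes, the result does not depend on the host byte order, provided both operands are loaded the same way.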

Figure 4-14 shows the loop of Figure 4-12 recoded to use ume8uu. Once again taking advantage of our knowledge of the C-language array storage conventions, the one-dimensional byte array is now accessed as a one-dimensional 32-bit-word array. The declaration of the pointers IA and IB as pointers to integers is the key, but also notice that the multiplier in the expression for the row offset has been scaled from 16 to 4 to account for the fact that there are 4 bytes in a 32-bit word.

Of course, since we are now using one-dimensional arrays to access the pixel data, it is natural to use a single for loop instead of two. Figure 4-13 shows this streamlined version of the code without the inner loop. Since C-language arrays are stored as a linear vector of values, we can simply increase the number of iterations of the outer loop from 16 to 64 to traverse the entire array.

The recoding and use of the ume8uu operation have resulted in a substantial improvement in the performance of the match-cost loop. In the original version, the code executed 1280 operations (including loads, adds, subtracts, and absolute values); in the restructured version, there are only 256 operations: 128 loads, 64 ume8uu operations, and 64 additions. This is a factor of five reduction in the number of operations executed. Also, the
