Assignment 4

Due: Friday, 10/26/2001

 

The following code finds the largest value in an array and stores it in the variable "result".

 

	ori r3, r0, #arrayLower
	lhi r3, #arrayUpper
	ori r4, r0, #sizeLower
	lhi r4, #sizeUpper
	lw r1, 0(r3)
loop:	lw r2, 0(r3)
	slt r5, r1, r2 	// set r5 if r1 < r2
	beqz r5, skip
	or r1, r2, r0 	// r1 = r2
skip: 	addi r3, r3, #4
	sub r5, r4, r3
	bnez r5, loop
	ori r3, r0, #resultLower
	lhi r3, #resultUpper
	sw r1, 0(r3)
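
For reference, the intended behavior of the DLX code above can be sketched in C. This is only an illustration: the function name find_largest and its parameters are hypothetical, and the comments map each statement to the DLX instruction it is assumed to correspond to.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the assignment: returns the largest
   element of array[0..size-1]. Assumes size >= 1, as the DLX code does
   when it loads the first element before entering the loop. */
int find_largest(const int *array, size_t size)
{
	int largest = array[0];              /* lw r1, 0(r3) */
	for (size_t i = 0; i < size; i++) {  /* the DLX loop also revisits element 0 */
		if (largest < array[i])      /* slt r5, r1, r2 */
			largest = array[i];  /* or r1, r2, r0 */
	}
	return largest;                      /* sw r1, 0(r3) stores this in result */
}
```

Calling find_largest on the array and its element count gives the value that the final sw instruction is meant to store in "result".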

 

1. Show the timing of the above instructions for the DLX pipeline without any improvements to reduce the stalls needed to resolve any hazard.  All hazards result in stalls until the result is stored in the proper register.  Use the steps as described in Figure 3.5 to determine what actions are done in each stage.  For a control/branch hazard, clear the pipeline whenever a branch is encountered.  For conditional branches, show two timings, one when the branch is taken and one when the branch is not taken.

 

2. Show how the above instructions can be re-ordered or modified to produce a sequence of instructions that performs the same task but has fewer stalls under the conditions of question 1.

 

3. Show the timing of the above instructions for the DLX pipeline with forwarding to reduce the stalls needed to resolve any hazard.  All hazards result in stalls until the result is stored in the proper register.  For a control/branch hazard, use predict-not-taken to resolve the hazard.  For conditional branches, show two timings, one when the branch is taken and one when the branch is not taken.

 

4. Show how the above instructions can be re-ordered or modified to produce a sequence of instructions that performs the same task but has fewer stalls under the conditions of question 3.
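
As an aid for checking timing diagrams by hand, the stall count for a single read-after-write dependence can be sketched in C. This is a sketch under stated assumptions, not the required method: the function names, the enum, and the same_cycle_read flag are all hypothetical. It assumes the five stages IF, ID, EX, MEM, WB, that registers are read in ID, that no other stall separates producer and consumer, and that with forwarding the consumer needs its operand at the start of EX. A branch that tests its register in ID would need separate treatment, and branch penalties are not modeled.

```c
/* Kind of instruction that produces the register value (hypothetical names). */
enum producer_kind { PRODUCER_ALU, PRODUCER_LOAD };

/* Stall cycles without forwarding. `distance` is how many instructions
   after the producer the consumer is issued (1 = immediately after).
   same_cycle_read != 0 assumes the register file is written in the first
   half of a cycle and read in the second half, so ID may overlap the
   producer's WB. Pass 0 if ID must come strictly after WB. */
int raw_stalls_no_forwarding(int distance, int same_cycle_read)
{
	/* Smallest distance at which the consumer no longer has to wait. */
	int safe_distance = same_cycle_read ? 3 : 4;
	int stalls = safe_distance - distance;
	return stalls > 0 ? stalls : 0;
}

/* Stall cycles with forwarding. An ALU result is forwarded from the end
   of EX, so a dependent instruction needs no stall. A loaded value is
   only available at the end of MEM, so a consumer issued immediately
   after the load waits one cycle. */
int raw_stalls_with_forwarding(enum producer_kind kind, int distance)
{
	if (kind == PRODUCER_LOAD && distance == 1)
		return 1;
	return 0;
}
```

For example, under these assumptions raw_stalls_no_forwarding(1, 1) describes a consumer placed directly after its producer, and raw_stalls_with_forwarding(PRODUCER_LOAD, 1) describes a load followed at once by an instruction that uses the loaded register.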

 

The following code multiplies the corresponding values of two floating-point arrays together and stores the sum of the products in "result".  The code is given first in C, then in DLX.

 

	result = 0;
	for (i = 0; i < size; i++)
		result = result + array1[i] * array2[i];

 

	movi2fp f2, r0
	ori r2, r0, #array1Lower
	lhi r2, #array1Upper
	ori r3, r0, #array2Lower
	lhi r3, #array2Upper
	ori r4, r0, #sizeLower
	lhi r4, #sizeUpper
	add r4, r4, r2
loop:	ld f0, 0(r2)
	ld f4, 0(r3)
	multd f0, f0, f4
	addd f2, f0, f2
	addi r2, r2, #8
	addi r3, r3, #8
	sub r5, r4, r2
	bnez r5, loop
	ori r3, r0, #resultLower
	lhi r3, #resultUpper
	sd f2, 0(r3)
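
The DLX version walks the arrays by address instead of by index. A C sketch that mirrors that structure may make the register roles easier to follow. The function name dot_product and the pointer names are hypothetical. Note that the DLX code adds r4 directly to the byte address in r2, so this sketch assumes the size constant loaded there is a byte length (the element count times 8).

```c
#include <stddef.h>

/* Hypothetical helper mirroring the DLX loop: walks both arrays by
   pointer and stops when the first pointer reaches the end address.
   Like the DLX code, it runs the body once before testing, so it
   assumes size >= 1. */
double dot_product(const double *array1, const double *array2, size_t size)
{
	double result = 0.0;                /* f2 = 0 */
	const double *p1 = array1;          /* r2 */
	const double *p2 = array2;          /* r3 */
	const double *end = array1 + size;  /* add r4, r4, r2 */

	do {
		result = *p1 * *p2 + result;  /* ld, ld, multd, addd */
		p1++;                         /* addi r2, r2, #8 */
		p2++;                         /* addi r3, r3, #8 */
	} while (end - p1 != 0);              /* sub r5, r4, r2 ; bnez r5, loop */

	return result;                        /* sd f2, 0(r3) stores this in result */
}
```

The do/while form is chosen because the DLX loop tests its exit condition at the bottom, after the first pair of elements has already been processed.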

 

5. Show the timing of the above instructions for the DLX pipeline without any improvements to reduce the stalls needed to resolve any hazard.  All hazards result in stalls until the result is stored in the proper register.  Use the steps as described in Figure 3.5 to determine what actions are done in each stage.  For a control/branch hazard, clear the pipeline whenever a branch is encountered.  For conditional branches, show two timings, one when the branch is taken and one when the branch is not taken.

 

6. Show how the above instructions can be re-ordered or modified to produce a sequence of instructions that performs the same task but has fewer stalls under the conditions of question 5.

 

7. Show the timing of the above instructions for the DLX pipeline with forwarding to reduce the stalls needed to resolve any hazard.  All hazards result in stalls until the result is stored in the proper register.  For a control/branch hazard, use predict-not-taken to resolve the hazard.  For conditional branches, show two timings, one when the branch is taken and one when the branch is not taken.

 

8. Show how the above instructions can be re-ordered or modified to produce a sequence of instructions that performs the same task but has fewer stalls under the conditions of question 7.