1. Cache memory

We have an assignment on cache memory, to be done in this simulator.

We are asked to compute two series in one loop:
A[i] = A[i-1] + 3 (A[0] = 1)
B[i] = B[i-1] + B[i-2] + 1 (B[0] = 0, B[1] = 1)
where i runs until 45. Then we compute
S[i] = A[i] + B[i]
D[i] = A[i] - B[i]
and finally
A[i] = S[i]
B[i] = D[i] // I did not use two extra arrays, since this can be done with only the two arrays.
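To check the simulator output against known-good values, a small reference model can help. The sketch below is in Python rather than MIPS, assumes 45 elements per array (indexes 0 to 44, matching the 45 words reserved in the `.data` section), and uses a made-up function name, `reference_arrays`.

```python
def reference_arrays(n=45):
    """Return (A, B) after the sum/difference pass, for n elements per array."""
    a = [0] * n
    b = [0] * n
    a[0] = 1
    b[0], b[1] = 0, 1

    # First pass: build both series.
    for i in range(1, n):
        a[i] = a[i - 1] + 3
    for i in range(2, n):
        b[i] = b[i - 1] + b[i - 2] + 1

    # Second pass: S and D are computed from the old A[i] and B[i]
    # before either array is overwritten, so no extra arrays are needed.
    for i in range(n):
        s = a[i] + b[i]
        d = a[i] - b[i]
        a[i], b[i] = s, d
    return a, b


a, b = reference_arrays()
print(a[:3], b[:3])  # first three final values of each array
```

The point of the two temporaries `s` and `d` is that both results depend on the old values of A[i] and B[i], so those have to be read before either one is stored back.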

• We have two separate memories (one for data and one for instructions)
• We do not use delayed branches or delayed loads
• If we increase the size of at least one cache (data and/or instructions), we pay a 3 MHz penalty in clock frequency.
We have
• cache size - frequency
• 128 b - 50 MHz
• 256 b - 47 MHz
• 512 b - 44 MHz
• 1024 b - 41 MHz
• T = (Instructions + DataCache.miss * 25 + InstructionsCache.miss * 25)/frequency
• We are free to choose options such as block size, write policy, 2-way associativity, etc., without any penalty.

Our goal is to minimize T. I left all options at their defaults except the block size, which I set to the maximum, and the data cache size, which I set to 512 b.
With these I got:
• Instructions = 974
• DataCache.miss = 23 (compulsory misses)
• InstructionsCache.miss = 10

thus T = (974 + 23 * 25 + 10 * 25) / 44 ≈ 40.9 μsec.
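The formula is easy to play with outside the simulator. The sketch below (Python; the function names are made up for illustration) recomputes T for the numbers above at 44 MHz. It then asks a hypothetical question: if the instruction count stayed at 974, how many additional misses could a 50 MHz configuration absorb before it became slower? The real miss counts for a smaller cache are not known here and would have to be measured in the simulator.

```python
MISS_PENALTY = 25  # cycles per miss, taken from the assignment's formula


def exec_time_us(instructions, data_misses, instr_misses, freq_mhz):
    """T in microseconds, given a clock frequency in MHz."""
    cycles = instructions + MISS_PENALTY * (data_misses + instr_misses)
    return cycles / freq_mhz


def extra_misses_allowed(target_us, instructions, data_misses, instr_misses, freq_mhz):
    """Most additional misses at freq_mhz that still keep T at or below target_us."""
    budget_cycles = target_us * freq_mhz
    base_cycles = instructions + MISS_PENALTY * (data_misses + instr_misses)
    return int((budget_cycles - base_cycles) // MISS_PENALTY)


t_now = exec_time_us(974, 23, 10, 44)
print(round(t_now, 2))                               # 40.89
print(extra_misses_allowed(t_now, 974, 23, 10, 50))  # 9
```

Under those assumptions, a configuration that keeps the 50 MHz clock would still win as long as it adds no more than 9 misses in total.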

My code is this:
Code:
```
.text

main:
la $t5, ArrayB              # Put address of ArrayB into $t5
lw $t6, 0($t5)              # $t6 = B[0]
lw $t7, 4($t5)              # $t7 = B[1]
addi $t5, $t5, 8            # Offset + base into $t5

la $t1, ArrayA              # Put address of ArrayA into $t1
addi $t0, $t1, 176          # 176 = (45 - 1) * 4

lw $t2, 0($t1)              # $t2 = A[0]
li $t4, 4                   # Offset for A[1].
add $t4, $t4, $t1           # Offset + base into $t4
addi $t3, $t2, 3            # $t3 = A[1] = A[0] + 3
sw $t3, 0($t4)              # Store A[1].

Loop:                       # Compute series
addi $t3, $t3, 3            # A[i] = A[i-1] + 3
addi $t4, $t4, 4            # Increase offset for ArrayA
addi $t2, $t3, 0            # Copy $t3 to $t2. $t2 now holds A[i], which is A[i-1] for the next iteration

sw $t2, 0($t4)              # Store A[i]

add $t8, $t7, $t6           # B[i] = B[i-1] + B[i-2]
addi $t8, $t8, 1            # B[i] = B[i] + 1
addi $t6, $t7, 0            # Copy $t7 to $t6. Now $t6 = B[i-1] and $t7 = B[i-1]
addi $t7, $t8, 0            # Copy $t8 to $t7. Now $t8 = B[i], $t6 = B[i-1] and $t7 = B[i]

sw $t8, 0($t5)              # Store B[i]
addi $t5, $t5, 4            # Increase offset for ArrayB

blt $t4, $t0, Loop          # Loop through the elements

sub $t5, $t5, 4             # OffsetB--
Loop2:

sub $t6, $t2, $t8           # Subtraction
sw $t6, 0($t5)              # Store B[i]
sub $t5, $t5, 4             # OffsetB--

sw $t7, 0($t4)              # Store A[i]
sub $t4, $t4, 4             # OffsetA--

bgt $t4, $t1, Loop2         # Loop all elements
sub $t6, $t2, $t8           # Subtraction
sw $t6, 0($t5)              # Store B[0]. A[0] is ok already.

################### For testing ################
#li $v0, 4
#la $a0, string1
#syscall
#la $t1, ArrayA
#la $t3, ArrayB
#Test:
#	lw $t2, 0($t1)
#	lw $t4, 0($t3)
#	li $v0, 1
#	move $a0, $t2
#	syscall
#	li $v0, 4
#	la $a0, string1
#	syscall
#	li $v0, 1
#	move $a0, $t4
#	syscall
#	li $v0, 4
#	la $a0, string1
#	syscall
#	blt $t1, $t5, Test
#####################################################

li $v0, 10
syscall                     # Exit

.data
string1:		.asciiz	"\n"
ArrayA:	.word	1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
ArrayB:	.word	0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
```
Any ideas on how to improve this? Or any comments on the value I already have?
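One way to sanity-check the 23 compulsory data misses is to count how many distinct cache blocks the two arrays span. The sketch below (Python, with a made-up helper `compulsory_misses`) rests on several assumptions that are not confirmed by the simulator: the `.data` section starts on a block boundary, `string1` occupies the first 2 bytes so that `ArrayA` is word-aligned at offset 4, `ArrayB` follows `ArrayA` directly, and the block size is 16 bytes.

```python
def compulsory_misses(byte_ranges, block_size):
    """Count distinct cache blocks touched by the given [start, end) byte ranges."""
    blocks = set()
    for start, end in byte_ranges:
        first = start // block_size
        last = (end - 1) // block_size
        blocks.update(range(first, last + 1))
    return len(blocks)


# Offsets are relative to the start of .data (assumed block-aligned).
ARRAY_BYTES = 45 * 4                       # 45 words of 4 bytes each
array_a = (4, 4 + ARRAY_BYTES)             # after string1, word-aligned (assumption)
array_b = (array_a[1], array_a[1] + ARRAY_BYTES)

print(compulsory_misses([array_a, array_b], 16))  # 23
```

With a hypothetical 16-byte block this layout touches 23 blocks, which would be consistent with the reported figure. In that case the data misses could not go lower without larger blocks.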

2. I thought of using one array instead of two, so that I would load only one base address and one counter, and then load elements like this:
Code:
```
lw $t1, 0($t3)
lw $t2, 180($t3)
```
but the assignment requires two arrays :/

3. I submitted the assignment, so maybe the thread should be deleted. I have exams..