Great, isn't it?
The lookup table contains the squares of the numbers from 0 to N. You may ask: "Why a table of squares?" Well, let's write out the squares of (a+b) and (a-b):
- (a+b)² = a² + 2ab + b²
- (a-b)² = a² - 2ab + b²
- (a+b)² - (a-b)² = 4ab
- ab = 1/4 ((a+b)² - (a-b)²)
All you have to do is build a lookup table of squares and read it at (a+b) and (a-b). Since squaring discards the sign, the table is indexed by |a+b| and |a-b|.
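To make the identity concrete, here is a minimal C sketch of the same idea for two unsigned 8-bit operands. The names SQUARE_MAX, square_table, init_square_table and table_mul are made up for this illustration and do not come from the assembly routine.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names, used for illustration only. */
#define SQUARE_MAX 510  /* largest a+b for two unsigned 8-bit operands */

static uint32_t square_table[SQUARE_MAX + 1];

/* Fill square_table[i] = i * i for i = 0..SQUARE_MAX. */
static void init_square_table(void)
{
    for (uint32_t i = 0; i <= SQUARE_MAX; i++)
        square_table[i] = i * i;
}

/* Multiply using ab = ((a+b)^2 - (a-b)^2) / 4 and two table reads. */
static uint16_t table_mul(uint8_t a, uint8_t b)
{
    unsigned sum  = (unsigned)a + b;                  /* 0..510 */
    unsigned diff = a > b ? (unsigned)(a - b)
                          : (unsigned)(b - a);        /* |a-b| */
    assert(sum <= SQUARE_MAX);
    return (uint16_t)((square_table[sum] - square_table[diff]) / 4);
}
```

Call init_square_table() once, then table_mul(7, 31) returns 217. The multiplication itself is reduced to one addition, two subtractions and two table reads.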
The size N of the lookup table is bounded by the maximum value of (|a|+|b|). For two 8-bit operands this is at most 255 + 255 = 510, so 511 entries (indices 0 to 510) are enough.
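If you want to check the bound for a given pair of operand ranges, a brute-force sketch like the one below will do. The function name max_table_index is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Largest table index |a| + |b| over all operand pairs in the given
 * inclusive ranges. max_table_index is a hypothetical helper name. */
static int max_table_index(int a_min, int a_max, int b_min, int b_max)
{
    int best = 0;
    assert(a_min <= a_max && b_min <= b_max);
    for (int a = a_min; a <= a_max; a++) {
        for (int b = b_min; b <= b_max; b++) {
            int idx = abs(a) + abs(b);
            if (idx > best)
                best = idx;
        }
    }
    return best;
}
```

max_table_index(0, 255, 0, 255) gives 510 for unsigned * unsigned, and max_table_index(0, 255, -128, 127) gives 383 for unsigned * signed.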
In the following code, one of the operands can be signed. The square lookup table has been split into two parts: squareTable_l and squareTable_h contain, respectively, the low byte (LSB) and the high byte (MSB) of i²/4.
Code:
mul:
; unsigned 8bit * signed 8bit
; in:  <__tmp = A, <__tmp+2 = B (both read as 16-bit words)
; out: <_temp = A * B (16-bit word)
; A + B
addw <__tmp,<__tmp+2, <_temp
; A - B
subw <__tmp,<__tmp+2, <_temp+2
; abs(A + B)
lda <_temp+1
bpl .m0
negw <_temp
.m0:
; abs(A - B)
lda <_temp+3
bpl .m1
negw <_temp+2
.m1:
; __tmp = low byte of squareTable[abs(A + B)]
clc
lda #low(squareTable_l)
adc <_temp
sta <__ptr
lda #high(squareTable_l)
adc <_temp+1
sta <__ptr+1
lda [__ptr]
sta <__tmp
; __tmp+1 = high byte of squareTable[abs(A + B)]
clc
lda #low(squareTable_h)
adc <_temp
sta <__ptr
lda #high(squareTable_h)
adc <_temp+1
sta <__ptr+1
lda [__ptr]
sta <__tmp+1
; __tmp+2 = low byte of squareTable[abs(A - B)]
clc
lda #low(squareTable_l)
adc <_temp+2
sta <__ptr
lda #high(squareTable_l)
adc <_temp+3
sta <__ptr+1
lda [__ptr]
sta <__tmp+2
; __tmp+3 = high byte of squareTable[abs(A - B)]
clc
lda #low(squareTable_h)
adc <_temp+2
sta <__ptr
lda #high(squareTable_h)
adc <_temp+3
sta <__ptr+1
lda [__ptr]
sta <__tmp+3
; _temp = squareTable[abs(A + B)] - squareTable[abs(A - B)] = A * B
subw <__tmp+2, <__tmp, <_temp
rts
Code:
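; clear the high bytes: mul adds and subtracts the operands as 16-bit words
stz <__tmp+1
stz <__tmp+3
lda #7
sta <__tmp
lda #31
sta <__tmp+2
jsr mul
; <_temp contains the result (remember it's a 16-bit word)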
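For anyone who wants to generate the two tables or check the routine off-target, here is a rough C model of it. It is only a sketch: the names square_table_l, square_table_h, init_quarter_squares, quarter_square and mul_u8_s8 are assumptions made for this example, and the table is sized at 511 entries to cover the unsigned * unsigned case as well. Storing floor(i²/4) loses nothing, because (a+b) and (a-b) always have the same parity, so the dropped quarter cancels out in the subtraction.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names, modelled on squareTable_l / squareTable_h. */
#define TABLE_SIZE 511  /* indices 0..510 */

static uint8_t square_table_l[TABLE_SIZE];  /* low byte of i*i/4  */
static uint8_t square_table_h[TABLE_SIZE];  /* high byte of i*i/4 */

/* Fill both tables; 510*510/4 = 65025 still fits in 16 bits. */
static void init_quarter_squares(void)
{
    for (uint32_t i = 0; i < TABLE_SIZE; i++) {
        uint32_t q = (i * i) / 4;
        square_table_l[i] = (uint8_t)(q & 0xFF);
        square_table_h[i] = (uint8_t)(q >> 8);
    }
}

/* Read floor(i*i/4) back from the split tables, indexing by abs(i). */
static uint16_t quarter_square(int i)
{
    int idx = abs(i);
    assert(idx < TABLE_SIZE);
    return (uint16_t)(square_table_l[idx] | (square_table_h[idx] << 8));
}

/* unsigned 8-bit * signed 8-bit, same steps as the mul routine. */
static int16_t mul_u8_s8(uint8_t a, int8_t b)
{
    uint16_t plus  = quarter_square(a + b);  /* (a+b)^2 / 4 */
    uint16_t minus = quarter_square(a - b);  /* (a-b)^2 / 4 */
    return (int16_t)(plus - minus);          /* always within -32640..32385 */
}
```

After init_quarter_squares(), mul_u8_s8(7, 31) returns 217 and mul_u8_s8(7, -3) returns -21. The two byte arrays can then be dumped in whatever data format your assembler expects.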