Table based multiplication

Post by MooZ » Mon Jun 23, 2008 7:06 pm

Here's a relatively fast way to multiply two bytes. It relies on a lookup table, so no loops are involved!
Great, isn't it? :)
The lookup table holds the squares of the numbers from 0 to N. You may ask: "Why a table of squares?" Well, let's write out the squares of (a+b) and (a-b):
  1. (a+b)² = a² + 2ab + b²
  2. (a-b)² = a² - 2ab + b²
If we do (1) - (2), we'll have:
  • (a+b)² - (a-b)² = 4ab
  • ab = 1/4 ((a+b)² - (a-b)²)
And there it is! The product of a and b is one quarter of the difference between the square of (a+b) and the square of (a-b).
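If you want to convince yourself before touching any assembly, here is a small Python sketch of the identity. The function name quarter_square_product is made up for this example, and the range checked (a unsigned, b signed) is only an assumption matching the routine further down.

```python
def quarter_square_product(a, b):
    """Return a*b using two squares, one subtraction and a division by 4."""
    # (a+b)^2 - (a-b)^2 is exactly 4ab, so the integer division is exact.
    return ((a + b) ** 2 - (a - b) ** 2) // 4


# Worked example: 7 * 31
#   (7 + 31)^2 = 38^2    = 1444
#   (7 - 31)^2 = (-24)^2 = 576
#   (1444 - 576) / 4     = 868 / 4 = 217
print(quarter_square_product(7, 31))

# Exhaustive check for an unsigned byte times a signed byte.
assert all(
    quarter_square_product(a, b) == a * b
    for a in range(256)
    for b in range(-128, 128)
)
```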
All you have to do is build a lookup table of squares and index it with (a+b) and (a-b).
The size N of the lookup table is bounded by the maximum value of (|a|+|b|). For two unsigned bytes that maximum is 255 + 255 = 510, so the table needs at most 511 entries (indices 0 to 510).

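In the following code, one of the operands can be signed. The lookup table has been split into two parts: squareTable_l and squareTable_h hold the low byte (LSB) and the high byte (MSB) of i²/4, respectively. The division by 4 is already baked into the table, so the routine only has to subtract two table entries.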
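The tables themselves are not listed in this post, so here is one possible way to generate them on the host side. This is only a sketch: the helper names build_square_tables and format_table are invented for the example, and the `.db` / `$xx` output format is an assumption about what your assembler accepts. Storing floor(i²/4) loses nothing, because (a+b) and (a-b) always have the same parity, so the dropped fraction is the same in both entries and cancels in the subtraction.

```python
def build_square_tables(max_index=510):
    """Return (low_bytes, high_bytes) of floor(i*i / 4) for i in 0..max_index."""
    quarter_squares = [(i * i) // 4 for i in range(max_index + 1)]
    low = [q & 0xFF for q in quarter_squares]
    high = [q >> 8 for q in quarter_squares]  # 510*510//4 = 65025 fits in 16 bits
    return low, high


def format_table(label, values, per_line=16):
    """Format a byte table as assembler source (assumed .db syntax)."""
    lines = [f"{label}:"]
    for start in range(0, len(values), per_line):
        chunk = values[start:start + per_line]
        lines.append("\t.db " + ",".join(f"${v:02X}" for v in chunk))
    return "\n".join(lines)


if __name__ == "__main__":
    low, high = build_square_tables()
    print(format_table("squareTable_l", low))
    print(format_table("squareTable_h", high))
```

If only the unsigned * signed case is needed, |a|+|b| never exceeds 255 + 128 = 383, so a smaller max_index would do.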

Code:

mul:
	; unsigned 8bit * signed 8bit
	; in:  <__tmp = A, <__tmp+2 = B
	; out: <_temp = A * B (16-bit word)
	; A + B
	addw   <__tmp, <__tmp+2, <_temp
	; A - B
	subw   <__tmp, <__tmp+2, <_temp+2
	; abs(A + B)
	lda    <_temp+1
	bpl    .m0
	negw   <_temp
.m0:
	; abs(A - B)
	lda    <_temp+3
	bpl    .m1
	negw   <_temp+2
.m1:

	; __tmp = low byte of the table entry for abs(A + B)
	clc
	lda    #low(squareTable_l)
	adc    <_temp
	sta    <__ptr
	lda    #high(squareTable_l)
	adc    <_temp+1
	sta    <__ptr+1
	lda    [__ptr]
	sta    <__tmp

	; __tmp+1 = high byte of the table entry for abs(A + B)
	clc
	lda    #low(squareTable_h)
	adc    <_temp
	sta    <__ptr
	lda    #high(squareTable_h)
	adc    <_temp+1
	sta    <__ptr+1
	lda    [__ptr]
	sta    <__tmp+1

	; __tmp+2 = low byte of the table entry for abs(A - B)
	clc
	lda    #low(squareTable_l)
	adc    <_temp+2
	sta    <__ptr
	lda    #high(squareTable_l)
	adc    <_temp+3
	sta    <__ptr+1
	lda    [__ptr]
	sta    <__tmp+2

	; __tmp+3 = high byte of the table entry for abs(A - B)
	clc
	lda    #low(squareTable_h)
	adc    <_temp+2
	sta    <__ptr
	lda    #high(squareTable_h)
	adc    <_temp+3
	sta    <__ptr+1
	lda    [__ptr]
	sta    <__tmp+3

	; _temp = (A+B)²/4 - (A-B)²/4 = A * B
	subw   <__tmp+2, <__tmp, <_temp
	
	rts
And this is how to use it:

Code:

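	; A = 7
	lda    #7
	sta    <__tmp
	; clear the high byte too (assuming addw/subw read 16-bit operands)
	stz    <__tmp+1

	; B = 31
	lda    #31
	sta    <__tmp+2
	stz    <__tmp+3

	jsr    mul

	; <_temp contains the result, 217 (remember it's a 16-bit word)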
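
To cross-check results off-target, here is a rough Python model of what mul computes. It is a sketch, not a cycle-accurate emulation: mul_model, as_signed16 and the SQUARE_L / SQUARE_H lists are names made up for this example, and the tables are assumed to hold floor(i²/4) split into low and high bytes as described above.

```python
# Assumed table contents: floor(i*i / 4), split into low and high bytes.
SQUARE_L = [((i * i) // 4) & 0xFF for i in range(511)]
SQUARE_H = [((i * i) // 4) >> 8 for i in range(511)]


def mul_model(a, b):
    """Model of mul: a is unsigned (0..255), b is signed (-128..127).

    Returns the raw 16-bit word the routine would leave in _temp.
    """
    total = abs(a + b)  # abs(A + B)
    diff = abs(a - b)   # abs(A - B)
    sq_total = SQUARE_L[total] | (SQUARE_H[total] << 8)
    sq_diff = SQUARE_L[diff] | (SQUARE_H[diff] << 8)
    return (sq_total - sq_diff) & 0xFFFF  # 16-bit subtraction wraps around


def as_signed16(word):
    """Interpret a 16-bit word as a two's complement signed value."""
    return word - 0x10000 if word & 0x8000 else word


print(mul_model(7, 31))                   # same operands as the example above
print(as_signed16(mul_model(255, -128)))  # a negative product
```

A negative product comes back as a two's complement word, which is why as_signed16 is applied before printing the second result.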
