A MILESTONE REACHED

A couple of days ago I finished implementing all but two of the
instructions in my 8051 core.  For each instruction that I did
implement, I wrote a small 8051 program that uses it and made
sure it worked by running the program on the core, which was in
turn running on the Verilog simulator.

After a couple of false starts, today I got the core to run the
following simple program for the first time on the actual FPGA.

	ORG	0
	MOV	A,#0C0h		; Put an obvious bit pattern in A

START:				; A big delay
	MOV	R2,#255		;     "
LOOP2:				;     "
	MOV	R1,#255		;     "
LOOP1:				;     "
	MOV	R0,#25		;     "
	DJNZ	R0,$		;     "
	DJNZ	R1,LOOP1	;     "
	DJNZ	R2,LOOP2	;     "

	RR	A		; Animate the LED display ...

	SJMP	START		; ... and repeat forever

	END

I have the accumulator wired to an array of 8 LEDs on the
evaluation board, so that when this thing runs I can watch the
pair of 1 bits in the accumulator change position with each
execution of the RR instruction.  It's very exciting.
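
For anyone without the board handy, the LED pattern can be previewed
with a small Python model of RR A.  This is only a sketch: the names
rr and led_frames are made up for illustration, and it models the
rotate alone, not the delay loop or the core itself.

```python
def rr(a):
    """Rotate an 8-bit value right by one bit, as the 8051's RR A does."""
    a &= 0xFF
    return (a >> 1) | ((a & 1) << 7)   # bit 0 wraps around into bit 7


def led_frames(start=0xC0, steps=8):
    """Return the accumulator values shown on the LEDs, one per RR A."""
    frames = []
    a = start
    for _ in range(steps):
        frames.append(a)
        a = rr(a)
    return frames


if __name__ == "__main__":
    # Draw each frame as a row of eight LEDs: '*' is lit, '.' is dark.
    for value in led_frames():
        print(format(value, "08b").replace("0", ".").replace("1", "*"))
```

Starting from 0C0h, the pair of lit LEDs walks one position to the
right per step and wraps around the end after the seventh rotate.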

SOME STATISTICS

Right now the implementation is pretty sparse.  I haven't yet
installed the interrupt mechanism or any UARTs or timers or other
peripherals.  I have implemented MUL (a no-brainer, since the FPGA
has some dedicated hardware multipliers) and DIV (a 9-cycle affair
that uses the traditional pencil-and-paper, shift-and-subtract
method).  The two missing instructions are RETI (don't need it
yet) and DA A (it'll be easier to test on the hardware than in the
simulator).
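
The shift-and-subtract method behind DIV can be sketched in Python
as follows.  This is a software model under assumptions, not the
Verilog in the core: the function name div_ab is made up, the loop
takes eight steps (one per dividend bit), and how those steps map
onto the core's nine cycles is not modeled.  On a real 8051 a
divide by zero sets OV and leaves A and B undefined; the sketch
simply raises an error instead.

```python
def div_ab(a, b):
    """Model 8051 DIV AB for b != 0: return (quotient, remainder)."""
    if b == 0:
        # Assumption for this sketch: refuse rather than model undefined results.
        raise ZeroDivisionError("DIV AB by zero is undefined on the 8051")
    remainder = 0
    quotient = 0
    for bit in range(7, -1, -1):
        # Bring down the next dividend bit, most significant first.
        remainder = (remainder << 1) | ((a >> bit) & 1)
        quotient <<= 1
        # If the divisor fits, subtract it and record a 1 in the quotient.
        if remainder >= b:
            remainder -= b
            quotient |= 1
    return quotient, remainder
```

After the instruction, the quotient would land in A and the
remainder in B.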

The synthesizer reports that the core as it sits should run at
about 30 MHz on the Xilinx Spartan-3E XC3S500E.  I haven't actually
tried it any faster than 5 MHz yet, but I have some weak evidence
(from one of the false starts) that it may go as fast as 50 MHz
with a bit of tweaking.  As for size, the synthesizer reports that
about 1/3 of the available logic is in use.
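
To relate the clock rate to what the LEDs do, here is a rough
Python count of the instructions the test program executes between
one RR A and the next (roughly 1.76 million).  It is a sketch
under assumptions: the function names are made up, and
clocks_per_instruction is a hypothetical average that depends
entirely on the core, so it is left as a parameter rather than
guessed.

```python
def delay_loop_instructions():
    """Count instructions executed from START through SJMP START, once."""
    count = 1                    # MOV R2,#255
    for _ in range(255):         # LOOP2 body runs 255 times
        count += 1               # MOV R1,#255
        for _ in range(255):     # LOOP1 body runs 255 times
            count += 1           # MOV R0,#25
            count += 25          # DJNZ R0,$ executes 25 times
            count += 1           # DJNZ R1,LOOP1
        count += 1               # DJNZ R2,LOOP2
    count += 2                   # RR A and SJMP START
    return count


def seconds_per_led_step(clock_hz, clocks_per_instruction):
    """Estimate the time between LED updates for an assumed average cost."""
    return delay_loop_instructions() * clocks_per_instruction / clock_hz
```

For example, seconds_per_led_step(5_000_000, 1) gives the time per
step at 5 MHz if every instruction took a single clock, which is an
optimistic lower bound rather than a measurement.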

NEXT STEPS

1.  The synthesizer is currently issuing a number of warnings.
    Resolve them all.

2.  Make a makefile or a batch file or a script or something to
    automatically synthesize the design and load it onto the eval
    board.

3.  Likewise, automate running the simulator.

4.  Write some kind of comprehensive regression test.

5.  Establish some sort of stable, baseline implementation for
    size and speed comparisons.

6.  Tweak the implementation in various ways.  For each such
    experiment, make sure it's still correct by running the
    regression test, then see how the tweak affects the size and
    speed of the core.

7.  Implement the interrupt mechanism.

8.  Implement at least a UART and some timers.

Depending on how the experiments in Step 6 turn out, I may wind up
starting over from the beginning to produce a better version based
on whatever I happen to learn from doing this one.
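
One possible shape for the regression test in Step 4 is to compare
the core against a small reference model on many random operands.
The Python below is only a sketch of that idea.  The names
reference_add and regression are made up, and dut stands for a
hypothetical hook that would run the instruction on the simulated
core and read back the result and flags; that hookup is not shown.

```python
import random


def reference_add(a, operand, carry_in=0):
    """Reference model for 8051 ADD/ADDC: return (result, CY, AC, OV)."""
    total = a + operand + carry_in
    result = total & 0xFF
    cy = int(total > 0xFF)                                      # carry out of bit 7
    ac = int((a & 0x0F) + (operand & 0x0F) + carry_in > 0x0F)   # carry out of bit 3
    ov = int(((a ^ result) & (operand ^ result) & 0x80) != 0)   # signed overflow
    return result, cy, ac, ov


def regression(dut, reference, trials=1000, seed=1):
    """Run random operands through both and collect any mismatches."""
    rng = random.Random(seed)    # fixed seed so failures are repeatable
    failures = []
    for _ in range(trials):
        a, operand = rng.randrange(256), rng.randrange(256)
        expected = reference(a, operand)
        actual = dut(a, operand)
        if actual != expected:
            failures.append((a, operand, expected, actual))
    return failures
```

An empty list from regression means the two agreed on every trial.
The same harness could be rerun unchanged after each tweak in
Step 6.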

Anyway, fun stuff.

-- Russ

PS: If anyone has any ideas about the test in Step 4, above, I'm
    interested.
