Saturday, February 20, 2016

Virtual Machines and CPU emulation.

Simulation and modeling of the real world, emulation of real hardware and creating virtual worlds are IMO the coolest aspects of computing and programming. Speaking of hardware emulation - I always wanted to roll my own MOS 6502 emulator. Not that there is a lack of good emulators out there. Quite contrary. However, being MOS 6502 (and retro-computing in general) aficionado, I wanted to challenge myself just for the heck of it, to learn if I can pull it on my own and how difficult it would be.
The project I am presenting here is the result of above challenge.
I elected C++ as my hammer to nail it. It is in the very early stage of development, but at this point is usable and presentable, lacking perhaps some supporting tools (assembler, conversion tools etc.).
It implements full list of classic MOS 6502 op-codes (including BCD mode), but it is not a hardware-level emulator (I hope I got the emulation vs. simulation terminology right), meaning I do not emulate the processor cycle to cycle, just simulate the functionality of the MOS 6502 in such a way that the resulting state of the virtual processor registers and flags is the same as when the real processor executes them. There is no emulation of address lines, data lines and signals changes in the real-time or even in the emulated time scale, just the virtual representation of the processor internal state after each op-code.
Program at this point is a single-threaded DOS console executable, equipped with step-by-step debugger which can also execute the memory image in the continuous way or up to the next BRK op-code and can also emulate rudimentary character I/O and a virtual 80x25 display.
I prepared an input file 'tinybasic.dat', which allows to run Tiny Basic interpreter in my VM6502 machine.

After downloading the source code and building it (I use Dev C++ 5.11) invoke the program in DOS prompt like this:

MKBASIC TINYBASIC.DAT

You should see the screen of the debugger console:



then, in the debugger console issue commands:

i e000
x cf0



and then switch to upper case (press Caps Lock) and answer 'C' to the prompt from Tiny Basic for the mode of boot: (Warm/Cold).



Continue working in upper case mode while playing with TB, it does not accept lower case codes, will cause errors.



Source code, open and free for non-commercial use:
https://github.com/makarcz/vm6502

To play with my simulator, you need to prepare the memory image file or enter the machine code by hand in debug console/monitor (command: W address hexbyte hexbyte ... 100).

The format of the memory image is ASCII with following protocol:
ADDR
hexaddr
; COMMENT
ORG
hexaddr
hexbyte hexbyte hexbyte ...

Where:
hexaddr - a 16-bit hexadecimal address in the range $0000..$FFFF, must start with '$' character.
hexbyte - an 8-bit hexadecimal value in the range $00..$FF, must start with '$' character.
All file contents are optional. If not provided, default values are assumed.

Note that you can also use decimal values, when they don't start with '$', they will be interpreted as decimal numbers.
Address in the next line after ADDR keyword is the run address of the program. It is optional as all other tokens, default is 0x0000 (I know, this is incorrect and I will change it in the next version to reset vector) and can be changed in debug console with command: A address.
Address in the next line after ORG keyword is the starting address of where following data will be but in the memory image. You can use ORG several times in the memory image file, so yo don't have to put vast amounts of 0 codes when you need to skip large areas of memory to put the next meaningful data in it.

The memory image file is the argument to the program:

mkbasic memoryimagefile

It is not mandatory. If none is provided, program will try to read file 'dummy.ram' and will display warning if that file does not exist as well. 
With no input provided, all memory is filled with zero except IRQ/BRK vector $FFFE which is initialized with $FFF0 and op-code RTI is put inder $FFF0 address. I don't yet initialize NMI and RST vectors, this will go to the next version.
Program also tries to read ROM image file 'dummy.rom'. This functionality is not used for now and I am not sure if I will continue it or rather invent a different scheme of memory mapping. I provide both: 'dummy.rom' and 'dummy.ram' files with my program, so make sure you have them in the current directory if yo want to avoid warning messages.
The ROM area is currently emulated in the RAM image by rendering the addresses $D000 to $DFFF a read-only area. You can load data into this area from memory image file during program startup, but you cannot modify these addresses during simulation. This currently cannot be customized, but I will work on the new memory mapping scheme that will allow some sort of memory configuration mimicking the real world computer systems and allowing user to customize the memory configuration.

The rudimentary character input/output can be emulated.
This is by default turned off, but can be enabled in debug console (command: I hexaddr). The default address is $E000 and you should use it when running Tiny Basic emulator as that image has this address hard-coded as character I/O.
The same address $E000 is used to emulate character input (when reading from memory) or output (when writing to memory). Emulator handles this internally by either prompting the single character from user (non-blocking mode) or putting the character received from simulated 6502 code into the internal buffer which is then sent to the display device simulator.

Sorry, as I said before for now I don't provide conversion tools to prepare memory image input file, so when you generate machine code from your favorite 6502 assembler, you need to use 3-rd party tools or your own scripts or programs to convert the resulting binary code to the format I described above.
I used Kowalski's emulator, which has integrated assembler and also can save the code as a data file if the dis-assembler window is active. After that you need to process resulting file in programmer's editor or other word processor to remove 'eXXXX:' labels and '.DB' and '.END' directives and add:
ORG
hexaddr
at the beginning. This will result in memory image file compatible with my VM.

I will provide proper tools in the next release.

I hope you like it and please provide valuable feedback for me to improve on it, although don't expect too much in the area of improving user friendliness. This project is intended rather for myself than for the world. My intention is to create an emulator of the 8-bit home-brew 6502 computer that I am building now (check my other blog) and also as a base for more elaborate Virtual Machine that will have op-codes beyond MOS 6502 which I will use to build interpreters of some retro computer languages, like BASIC for example.

Thank you for visiting my blog.

2/20/2016
M.K.