Retroshield 6800 is running FLEX 2.0

2026/03/05

I LOVE Retroshields!

Not because I made them, but because they make debugging HW/SW so easy and very educational.

Recently, I was bringing up the Retroshield 6800 prototype. I quickly got MIKBUG and SWTBUG running. The next step was to run FLEX 2.0 using the DSK images on the Teensy SD card. That’s where my life felt like it was “going down the drain”… I spent 2 weeks with no progress. Until today. In hindsight, I literally wasted 2 weeks. At least I learned a great deal in the process.

[Image: Retroshield 6800 prototype]

What did I learn?

Simple? Yeah, right.

Special Thanks!

I want to give a shout-out to Joseph H. Allen for his EXORsim - Motorola EXORciser Simulator. When I couldn’t make progress w/ my 1797 emulation, I brought in his 1771 floppy emulation. I got the same behavior. This meant the bug was somewhere else, and that helped me figure it out. Thank you Joseph.

Root-cause

When SWTBUG runs the disk boot routine, it issues a multi-sector read starting at Track 0/Sector 0 and copies the data into RAM starting at $A100. A multi-sector read means it reads the rest of the sectors on that track: 72 sectors * 256 bytes/sector = 18,432 bytes, or $4800 in hex. When the floppy disk controller says there are no more sectors, the disk boot routine jumps to $A100, which is the beginning of sector 0.

                     * MINIFLOPPY DISK BOOT
E28F 7F 8014 DISK    CLR $8014       ; Select Disk 0
E292 8D 2E           BSR DELAY
E294 C6 0B           LDA B #$0B      ; Issue RESTORE command (Track0)
E296 8D 25           BSR RETT2
E298 E6 04   LOOP1   LDA B 4,X       ; Wait till not BUSY
E29A C5 01           BIT B #1
E29C 26 FA           BNE LOOP1
E29E 6F 06           CLR 6,X         ; Choose Sector 0
E2A0 8D 1D           BSR RETURN
E2A2 C6 9C           LDA B #$9C      ; Read-Multi command
E2A4 8D 17           BSR RETT2
E2A6 CE A100         LDX #$A100      ; Load Address - A100
E2A9 C5 02   LOOP2   BIT B #2
E2AB 27 06           BEQ LOOP3       ; Wait for DRQ bit
E2AD B6 801B         LDA A $801B     ; DRQ=1: We can read next byte
E2B0 A7 00           STA A 0,X       ;        and save to RAM
E2B2 08              INX
E2B3 F6 8018 LOOP3   LDA B $8018     ; Check BUSY bit
E2B6 C5 01           BIT B #1
E2B8 26 EF           BNE LOOP2       ; BUSY=1, we are still reading
E2BA 7E A100         JMP $A100       ; BUSY=0, multi-sector read complete. Execute.
E2BD E7 04   RETT2   STA B 4,X
E2BF 8D 00   RETURN  BSR RETT1       ; simple delays
E2C1 39      RETT1   RTS
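
For readers who don't speak 6800 assembly, here is a rough C rendering of the LOOP2/LOOP3 polling loop. It is only a sketch: `fake_fdc`, `fdc_status`, `fdc_data` and `boot_read` are made-up names for illustration, and the fake controller simply reports BUSY and DRQ until it runs out of bytes, which is much simpler than a real FD1771.

```c
#include <stdint.h>
#include <stddef.h>

#define FDC_BUSY 0x01  /* status bit 0: command still in progress */
#define FDC_DRQ  0x02  /* status bit 1: a data byte is waiting */

/* Hypothetical stand-in for the controller (status at $8018, data at $801B). */
typedef struct {
    const uint8_t *track;  /* bytes the controller will deliver */
    size_t len;
    size_t pos;
} fake_fdc;

static uint8_t fdc_status(const fake_fdc *f) {
    return f->pos < f->len ? (uint8_t)(FDC_BUSY | FDC_DRQ) : 0;
}

static uint8_t fdc_data(fake_fdc *f) {
    return f->track[f->pos++];
}

/* Mirrors LOOP2/LOOP3: keep copying while BUSY; returns bytes stored.
   mem must be a full 64K image so any 16-bit address is valid. */
size_t boot_read(fake_fdc *f, uint8_t *mem, uint16_t load_addr) {
    uint16_t x = load_addr;               /* LDX #$A100 */
    for (;;) {
        uint8_t b = fdc_status(f);        /* LOOP3: LDA B $8018 */
        if (!(b & FDC_BUSY))              /* BUSY=0: read complete */
            break;                        /* (the ROM does JMP $A100 here) */
        if (b & FDC_DRQ)                  /* LOOP2: BIT B #2 */
            mem[x++] = fdc_data(f);       /* LDA A $801B / STA A 0,X / INX */
    }
    return (size_t)(uint16_t)(x - load_addr);
}
```

Note that nothing in the loop checks where X is pointing. It keeps storing bytes for as long as the controller stays BUSY.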

First, the reason I think the FLEX boot process is smart: the directory structure is stored on Track 0, right after the bootloader. A multi-sector read by SWTBUG means the directory structure also ends up in RAM. And here is what I think the bootloader does: it reads the first file in the directory structure into memory and runs it. This happens to be FLEX2.SYS.

Bullet item 2, here was the problem.

A100 - E8FF: All sectors from Track 0 get loaded here
E200 - E3FF: SWTBUG monitor code
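
A quick way to sanity-check a memory map like this is to compute the load range and compare it against the monitor range. This is a minimal sketch using the numbers from this post; `range`, `load_range` and `ranges_overlap` are names I made up for illustration and are not part of the Retroshield code.

```c
#include <stdint.h>
#include <stdbool.h>

/* Inclusive address range on the 16-bit 6800 bus. */
typedef struct { uint32_t first, last; } range;

/* Range covered by loading `sectors` sectors of `bytes` bytes at `base`. */
range load_range(uint32_t base, uint32_t sectors, uint32_t bytes) {
    range r = { base, base + sectors * bytes - 1 };
    return r;
}

/* True when the two inclusive ranges share at least one address. */
bool ranges_overlap(range a, range b) {
    return a.first <= b.last && b.first <= a.last;
}
```

With `load_range(0xA100, 72, 256)` the last byte lands at $E8FF, and `ranges_overlap` against `{0xE200, 0xE3FF}` returns true.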

There is an overlap! In real systems, SWTBUG is in EPROM. In Retroshield SW, it is possible to put the ROM into Teensy RAM (Shadow-RAM) so I can override it, e.g. change the boot vector, place my custom I/O drivers, or disable unnecessary delays. As I usually do, I copied code from another shield and forgot that the ROM was copied to RAM. As a result, the track 0 multi-sector read was overwriting the SWTBUG bootloader routine. And I scratched my head for a week trying to figure out why the multi-sector read stops in the middle of sector 66, magically, every single time. The ROM setting and the emulator log are shown below:

////////////////////////////////////////////////////////////////////
// Monitor Code
////////////////////////////////////////////////////////////////////
#define STORE_ROM_IN_FLASH  0   // 1: Read-only ROM. 0: Shadow-RAM

#ifdef ROM1_START
#if (STORE_ROM_IN_FLASH)
const unsigned char EPROM1[] PROGMEM = {
#else
unsigned char EPROM1[] = {
#endif
#include "eeprom1.h"
};
#endif

FD1771 restore!
Set sector = 0
FD1771 read multiple
Read sector drive=0, track=0, sector=1
Read state 2: 00: 8e
Read state 2: 01: a0
Read state 2: 02: 7f
Read state 2: 03: 20
...
Read state 2: fe: 00
Read state 2: ff: 00
Sector 1 done
Read sector drive=0, track=0, sector=2
Read state 2: 00: 00
Read state 2: 01: 03
Read state 2: 02: 00
Read state 2: 03: 00
...
Sector 65 done
Read sector drive=0, track=0, sector=66
Read state 2: 00: 00
Read state 2: 01: 43
Read state 2: 02: 00
Read state 2: 03: 00
...
Read state 2: ab: 00
Read state 2: ac: 00
Read state 2: ad: 00
FD1771: FIXME: Missing Read Port5, 0000
FD1771: FIXME: Missing Read Port5, 0001
FD1771: FIXME: Missing Read Port5, 0002
FD1771: FIXME: Missing Read Port5, 0003
Read state 2: ae: 00

In hindsight, once the track 0 data starts overwriting the loop at $E2A9, we are executing random code: $A100 + (66-1) sectors * 256 bytes + $AD = $E2AD (sector numbers start at 1, not 0).

E2A9 C5 02   LOOP2   BIT B #2
E2AB 27 06           BEQ LOOP3       ; Wait for DRQ bit
E2AD B6 801B         LDA A $801B     ; DRQ=1: We can read next byte
E2B0 A7 00           STA A 0,X
E2B2 08              INX
E2B3 F6 8018 LOOP3   LDA B $8018     ; Check BUSY bit
E2B6 C5 01           BIT B #1
E2B8 26 EF           BNE LOOP2       ; BUSY=1, we are still reading

Once the STA A 0,X at $E2B0 stores disk data over the BIT B #2 at $E2A9 (LOOP2), the BNE LOOP2 at $E2B8 jumps to whatever the disk had for us to execute.
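
The address arithmetic is easy to check in a few lines of C. This is just a sketch of the calculation in this post; the helper names `landing_address`, `sector_at` and `offset_at` are mine, and it assumes 256-byte sectors numbered from 1 and the $A100 load address from the boot routine.

```c
#include <stdint.h>

#define LOAD_BASE   0xA100u  /* LDX #$A100 in the boot routine */
#define SECTOR_SIZE 256u

/* Address where byte `offset` of 1-based `sector` lands during the read. */
uint16_t landing_address(unsigned sector, unsigned offset) {
    return (uint16_t)(LOAD_BASE + (sector - 1u) * SECTOR_SIZE + offset);
}

/* Inverse: which sector is being stored when the pointer reaches addr. */
unsigned sector_at(uint16_t addr) {
    return (addr - LOAD_BASE) / SECTOR_SIZE + 1u;
}

/* Inverse: byte offset within that sector. */
unsigned offset_at(uint16_t addr) {
    return (addr - LOAD_BASE) % SECTOR_SIZE;
}
```

`landing_address(66, 0xAD)` gives $E2AD, and going the other way, $E2A9 (LOOP2) maps to sector 66, offset $A9, which matches where the log goes off the rails.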

Once I set that #define to keep the SWTBUG image as ROM, it worked right away! You can see the Retroshield code complaining that the CPU is trying to write to the ROM section.

#define STORE_ROM_IN_FLASH  1   // 1: Read-only ROM. 0: Shadow-RAM
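
To show why the setting matters, here is a simplified sketch of a write handler that protects the ROM window. It is NOT the actual Retroshield code: `bus`, `bus_write`, `ROM_FIRST` and `ROM_LAST` are hypothetical names, and the window uses the SWTBUG range from the memory map in this post.

```c
#include <stdint.h>
#include <stdbool.h>

#define ROM_FIRST 0xE200u   /* ROM window taken from the memory map above */
#define ROM_LAST  0xE3FFu

typedef struct {
    uint8_t  mem[0x10000];  /* full 64K address space */
    bool     rom_read_only; /* plays the role of STORE_ROM_IN_FLASH = 1 */
    unsigned rejected;      /* number of blocked writes, for logging */
} bus;

/* Returns true if the byte was stored, false if it hit protected ROM. */
bool bus_write(bus *b, uint16_t addr, uint8_t data) {
    if (b->rom_read_only && addr >= ROM_FIRST && addr <= ROM_LAST) {
        b->rejected++;      /* a real handler could print a warning here */
        return false;
    }
    b->mem[addr] = data;
    return true;
}
```

With `rom_read_only` false (Shadow-RAM), a write to $E2A9 quietly replaces the boot loop. With it true, the same write is dropped and the monitor code survives the track 0 read.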

[Image: Retroshield 6800 running FLEX]

And BASIC:

[Image: Retroshield 6800 running MS BASIC]

Closing

My engineering career taught me two things:

  1. If engineers in a room can’t agree on the same thing, they must have different assumptions that you need to check or resolve.
  2. If an engineering problem takes more than 1-2 days to solve, then it will be a hairy but interesting problem (the more smart people in the room, the more once-in-a-lifetime the bug).

This bootloader overwriting the bootloader reminded me of the NAND bug we fixed on iPod nanos.

The QA team reported that one iPod out of, say, 100 would get stuck while playing audio. This would happen about once a day. The SW team got involved, but it was very difficult to replicate the issue. We asked the factory to have all the operators take an iPod and try to replicate the issue :) Engineering in Cupertino was also working 24hrs trying to replicate it. Debugging went on for about a week. We learned a few things but found no real root-cause. One day, HW and SW folks were in the room debugging an iPod. Luckily I was there too. When we saw what we saw on the oscilloscope, we let out the loudest BOOOYYAAAAHHH I ever heard :) I think we all cried.

It turns out the issue was the NAND power supply timing. Whenever NAND was not needed, we turned off NAND power. When the regulator turns off, you need to wait for the voltage to go down to 0V (or below critical thresholds, such as the NAND Power-On-Reset level (hint, hint)). If you turn the power back on too quickly, the rail only drops to 1.5V and then goes back up to 3.3V. In other words, your 3.3V NAND is briefly running at an out-of-spec 1.5V, and if you don’t reset it, of course it will get confused and not respond to incoming NAND commands. Ahh, good old days of sweat, blood and fun…