Rings of Saturn Crack Notes
By Hot Rod
July 2012

[WARNING: INCREDIBLY LONG POST AHEAD]

I thought that if I was going to clean up and share any of my notes for this, I'd better do it sooner rather than later, while I can still decode them. So here goes...

I'll only touch on the highlights and some of the more interesting details, rather than give a full end-to-end description (just to save time and space).

The disk itself was copyable with a normal disk copier, except for tracks $00-$02 and sectors $0E and $0F on tracks $13 and $14. A nibble dump of tracks $13 and $14 revealed mostly standard address and data markers, except in a couple of places, but more on that later. (I use The Inspector and its N feature to take track nibble dumps.) Tracks $00-$02 had several changes. Upon boot of the disk, an Applesoft prompt appears, indicating that some form of DOS was used.

So my general plan of attack was to copy the tracks/sectors that could be copied as-is, then capture the RWTS of the program and use Advanced Demuffin to convert the rest of the data, and finally modify the RWTS to work with a normalized format.

Track $00, sector $00 was not readable without ignoring errors in DOS ($B942:18), and given the way the rest of the nibble dump looked, boot tracing was the way to go. And that's where the fun began.

The first thing of interest in the boot trace was that it actually loaded two sectors into memory, instead of the usual one.  So code appeared at $800-$9FF (and $800 is set to 2, causing the ROM boot routine to read in two sectors).

Here's the initial boot0 code, with inline comments. I left out the $900 page to keep it shorter (it's on the corresponding notes disk image though).

0800-   02          ???
// Coming from ROM, carry is set, so execution will continue here at first
// but not on subsequent entries (see below)

0801-   90 4A       BCC   $084D

// The $803 routine sets up three values on stack as well as next address
// marker of D5 AA EF, using much indirection.
// $C600 and fake processor status are pushed onto the stack; 
// this will cause reboot if interrupted.
// This routine will also modify the byte at $801 from #$90 to #$B0,
// changing the logic to BCS $084D (ah, self-modifying code!).
// So this routine at $803 only runs once, and then the loop is through $84D.

0803-   C6 27       DEC   $27		
0805-   BD 31 09    LDA   $0931,X
0808-   49 B0       EOR   #$B0
080A-   48          PHA
080B-   C6 3D       DEC   $3D
080D-   98          TYA
080E-   C8          INY
080F-   48          PHA
0810-   CE 00 08    DEC   $0800
0813-   A9 20       LDA   #$20
0815-   C6 27       DEC   $27
0817-   48          PHA
0818-   51 26       EOR   ($26),Y
081A-   91 26       STA   ($26),Y
081C-   AA          TAX
081D-   A5 27       LDA   $27
081F-   85 32       STA   $32
0821-   CE 00 08    DEC   $0800
0824-   A8          TAY
0825-   B5 33       LDA   $33,X
0827-   84 29       STY   $29
0829-   84 21       STY   $21
082B-   8A          TXA
082C-   A2 17       LDX   #$17
082E-   86 31       STX   $31
0830-   D5 33       CMP   $33,X
0832-   A6 2B       LDX   $2B
0834-   5D 31 09    EOR   $0931,X
0837-   85 29       STA   $29
0839-   5D 32 09    EOR   $0932,X
083C-   C6 3D       DEC   $3D
083E-   85 28       STA   $28
0840-   5D 33 09    EOR   $0933,X
0843-   85 48       STA   $48
0845-   A0 2B       LDY   #$2B
0847-   84 20       STY   $20
0849-   C6 40       DEC   $40

// It will take this next branch, so code continues at $88B
084B-   30 3E       BMI   $088B

// On re-entry after $801 has been modified, $84D will run
084D-   24 40       BIT   $40

// So on to $884
084F-   30 33       BMI   $0884
0851-   24 24       BIT   $24
0853-   24 24       BIT   $24
0855-   40          RTI
0856-   40          RTI
0857-   40          RTI
0858-   40          RTI
0859-   43          ???
085A-   4F          ???
085B-   50 59       BVC   $08B6
085D-   52 49       EOR   ($49)
085F-   47          ???
0860-   48          PHA
0861-   54          ???
0862-   31 39       AND   ($39),Y
0864-   38          SEC
0865-   31 40       AND   ($40),Y
0867-   40          RTI
0868-   40          RTI
0869-   40          RTI
086A-   41 50       EOR   ($50,X)
086C-   50 4C       BVC   $08BA
086E-   45 20       EOR   $20
0870-   43          ???
0871-   4F          ???
0872-   4D 50 55    EOR   $5550
0875-   54          ???
0876-   45 52       EOR   $52
0878-   20 49 4E    JSR   $4E49
087B-   43          ???
087C-   40          RTI
087D-   40          RTI
087E-   40          RTI
087F-   40          RTI
0880-   24 24       BIT   $24
0882-   24 24       BIT   $24

// Code continues here while loading
0884-   C5 48       CMP   $48
0886-   D0 03       BNE   $088B

// Ah look, probably exits out to here on this jump, eh?
0888-   4C 00 09    JMP   $0900

// and here first time through
088B-   A4 48       LDY   $48
088D-   EA          NOP

// OK, disk read routines, but what are the markers being checked?
088E-   BD 8C C0    LDA   $C08C,X
0891-   10 FB       BPL   $088E

// So way back up in the $803 routine, this ends up pointing to $830
// which contains the opcode $D5!
0893-   51 20       EOR   ($20),Y
0895-   D0 F4       BNE   $088B
0897-   2C 40 40    BIT   $4040
089A-   BD 8C C0    LDA   $C08C,X
089D-   10 FB       BPL   $089A
// this ends up pointing to $81C, which contains the opcode $AA!
089F-   D1 31       CMP   ($31),Y
08A1-   D0 F0       BNE   $0893
08A3-   2C 40 40    BIT   $4040
08A6-   BD 8C C0    LDA   $C08C,X
08A9-   10 FB       BPL   $08A6
// and this points to an opcode in ROM that is $EF.  Brilliant!
08AB-   CD D0 FF    CMP   $FFD0
08AE-   D0 E3       BNE   $0893
08B0-   38          SEC
08B1-   2C 40 40    BIT   $4040

// This next bit is interesting too.  The page codes are 4+4 encoded
// and decoded from the nibbles to set the buffer address for the
// ROM routine to use.
08B4-   BD 8C C0    LDA   $C08C,X
08B7-   10 FB       BPL   $08B4
08B9-   2A          ROL
08BA-   8D C3 08    STA   $08C3
08BD-   BD 8C C0    LDA   $C08C,X
08C0-   10 FB       BPL   $08BD
08C2-   29 55       AND   #$55
08C4-   85 27       STA   $27
// After decoding and setting the buffer address, the ROM routine at $C6A6
// is used to read in the pages.  It will jump back to $801 from ROM.
08C6-   6C 28 00    JMP   ($0028)
08C9-   A0 A0       LDY   #$A0
08CB-   A5 A0       LDA   $A0
08CD-   A0 D0       LDY   #$D0
08CF-   A0 A0       LDY   #$A0
08D1-   FE A0 A0    INC   $A0A0,X
08D4-   FD A0 C3    SBC   $C3A0,X
08D7-   CE CE C1    DEC   $C1CE
08DA-   A0 D3       LDY   #$D3
08DC-   C5 A0       CMP   $A0
08DE-   D7          ???
08DF-   C2          ???
08E0-   A0 A0       LDY   #$A0
08E2-   C4 A0       CPY   $A0
08E4-   A0 A0       LDY   #$A0
08E6-   A0 C9       LDY   #$C9
08E8-   A0 43       LDY   #$43
08EA-   4F          ???
08EB-   50 59       BVC   $0946
08ED-   52 49       EOR   ($49)
08EF-   47          ???
08F0-   48          PHA
08F1-   54          ???
08F2-   20 31 39    JSR   $3931
08F5-   38          SEC
08F6-   31 20       AND   ($20),Y
08F8-   42          ???
08F9-   59 20 41    EOR   $4120,Y
08FC-   50 50       BVC   $094E
08FE-   4C 45 20    JMP   $2045
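
For anyone who wants to see the 4+4 page decode at $8B4-$8C4 outside of the disassembly, here is a rough Python sketch of how I read it. The SEC at $8B0 is still in effect when the ROL runs, so a 1 is shifted into bit 0, and the STA $08C3 overwrites the operand of the AND at $8C2, so the second nibble is ANDed with the rolled first nibble rather than with #$55. The function names are made up, and encode_4and4 assumes the usual 4-and-4 layout (odd bits first, then even bits), which the notes do not confirm for this disk.

```python
def encode_4and4(value):
    """Assumed standard 4-and-4 split: odd bits first, then even bits."""
    first = ((value >> 1) & 0x55) | 0xAA
    second = (value & 0x55) | 0xAA
    return first, second


def decode_4and4(first, second):
    """Mirror of ROL / STA $08C3 / AND #imm at $8B9-$8C2.

    ROL shifts the carry (set by SEC at $8B0) into bit 0, and the result
    becomes the immediate operand of the AND applied to the second nibble.
    """
    rolled = ((first << 1) | 1) & 0xFF
    return rolled & second


# Example: the first page this boot stage loads, per the notes, is $10.
first, second = encode_4and4(0x10)
page = decode_4and4(first, second)
print(hex(first), hex(second), hex(page))
```

The decoded value lands in $27, the high byte of the buffer pointer the ROM read routine uses, which is how each sector picks its own destination page.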

Well, that all seemed fine, with a modest level of protection, so I dutifully placed the next boot trace interrupt at $888 (replacing the JMP $900) and let it go. Never hit it. Huh?

Long story short, after further examination and fiddling about, I determined that this boot stage loads code into $1000, $1100, $1200, $1300, and then $100.  Wait - $100?  That's the stack!

Let's look at that loop re-entry point again:

// On re-entry after $801 has been modified, $84D will run
084D-   24 40       BIT   $40
084F-   30 33       BMI   $0884
0851-   24 24       BIT   $24
0853-   24 24       BIT   $24
0855-   40          RTI
0856-   40          RTI
0857-   40          RTI
0858-   40          RTI
0859-   43          ???
085A-   4F          ???

As it turns out, on the last spin through the loop, $40 isn't negative, so the BMI doesn't branch.  It falls right through to what appears to be gibberish.  Except that those two BIT opcodes are harmless, and the next opcode is a return from interrupt (RTI).
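
A side note on that "gibberish": a quick Python sketch (the variable names are just for illustration) shows that the bytes from $851 through $883 in the listing above are plain ASCII text, and that the characters were chosen so the ones the CPU falls into are also harmless or useful opcodes. '$' is $24 (BIT zero page) and '@' is $40 (RTI). The hex string is copied by hand from the disassembly, so treat it as only as reliable as that transcription.

```python
# Bytes $0851-$0883, transcribed from the boot0 listing.
raw = bytes.fromhex(
    "24242424 40404040 434F50595249474854 31393831 40404040"
    "4150504C4520434F4D505554455220494E43 40404040 24242424"
)

text = raw.decode("ascii")
print(text)

# The fall-through path executes BIT $24 twice ($851, $853), then hits $855.
base = 0x0851
opcode_at_855 = raw[0x0855 - base]
print(hex(opcode_at_855))  # 0x40 is the 6502 RTI opcode
```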

Recall that the $803 routine forced a $C600 and fake processor status onto the stack.  An RTI is different from an RTS in that the processor status is pulled off the stack first, followed by the return address, and the address is used as-is, without adding one to it.  So if this RTI is reached before the last page is loaded over the stack, the disk will reboot.

But since the very last page that was loaded was at $100, a whole new stack now exists.  Gee, wonder what's on it?

The new stack page contains this odd repeating pattern:  $1FF:10 11 12 10 11 12... (descending),  ending with $100:10 12 11 (ascending).

OK, so where to with an RTI?  Since a new stack was loaded from disk, the previously pushed $C600 and processor status are replaced by the 10 11 12. An RTI pulls off the 12 first, and then since it's an RTI, the plain address is pulled next ($1011) without adding one to it like an RTS does.

Here's the code at $1011:
100F-   92 60       STA   ($60)
1011-   60          RTS
1012-   02          ???

Look, an RTS.  But we just consumed the three bytes that had been on the stack with the RTI.  Now where?  Well, it wraps.  So we look to $100 for the return address:

$100:10 12 11

At $100 is 10 12, so a return address of $1211 is built (this is an RTS, so one is added to the $1210 this time).  Here's the code at $1211:

1211-   EA          NOP
1212-   D8          CLD
1213-   A9 10       LDA   #$10
1215-   48          PHA
1216-   A9 78       LDA   #$78
1218-   48          PHA
1219-   D0 E6       BNE   $1201
121B-   60          RTS

Are you kidding me?  This code forces an address of $1078 onto the stack, before branching to $1201.

11FC-   A8          TAY
11FD-   68          PLA
11FE-   8D 55 10    STA   $1055
1201-   60          RTS
1202-   CC A0 A0    CPY   $A0A0

At $1201 is another RTS, which then pulls the newly forced address from the stack, adds one, and we're at $1079. 
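
To double-check the RTI/RTS hopscotch, here is a small Python model of just the 6502 stack behavior involved. It is a sketch, not an emulator: the class and method names are invented, only the top three and bottom three bytes of the loaded stack page are filled in (taken from the pattern described above), and the starting stack pointer of $FC is an assumption inferred from the RTI consuming the 12, 11, 10 at $1FD-$1FF.

```python
class Stack6502:
    """Just enough of the 6502 stack to follow RTI and RTS."""

    def __init__(self, page, sp):
        self.page = page          # 256 bytes standing in for $0100-$01FF
        self.sp = sp

    def pull(self):
        self.sp = (self.sp + 1) & 0xFF    # wraps from $FF back to $00
        return self.page[self.sp]

    def push(self, value):
        self.page[self.sp] = value
        self.sp = (self.sp - 1) & 0xFF

    def rti(self):
        self.pull()                        # processor status, discarded here
        lo = self.pull()
        hi = self.pull()
        return (hi << 8) | lo              # used as-is

    def rts(self):
        lo = self.pull()
        hi = self.pull()
        return (((hi << 8) | lo) + 1) & 0xFFFF   # RTS adds one


page = [0] * 256
page[0xFD:0x100] = [0x12, 0x11, 0x10]   # $1FD-$1FF (top of the new stack)
page[0x00:0x03] = [0x10, 0x12, 0x11]    # $100-$102 (bottom of the new stack)
stack = Stack6502(page, sp=0xFC)         # assumed SP when the RTI at $855 runs

trace = [stack.rti()]                    # RTI at $855
trace.append(stack.rts())                # RTS at $1011, SP has wrapped
stack.push(0x10)                         # LDA #$10 / PHA at $1213
stack.push(0x78)                         # LDA #$78 / PHA at $1216
trace.append(stack.rts())                # RTS at $1201
print([hex(addr) for addr in trace])
```

Under those assumptions the trace comes out as $1011, $1211, $1079, matching the walk-through.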

At $1079, the boot finally continues.  Whew!  Now that's what I call some excellent obfuscation.   Great fun, and from 1981 even.

Anyway, by saving out the $1000-$13FF code, it is then possible to load it as a file via DOS and continue the boot at will. This does require that the disk be left on track 0 though, since it hasn't done any arm moves yet.  So after loading the $1000 code again following a DOS boot, a mini boot trace to leave the arm on track 0, and the X-reg loaded with #$60 (or disk controller slot#) does the trick:

9600<C600.C6FFM
96F8:4c 79 10

And now all the boot0 twists and turns are bypassed.

From here it loads code into $1B00-$3FFF and goes to $1B03 (by way of PHA and RTI again).  The code at $1D00-$3FFF is the DOS, so we're getting close.  The routine at $1B03 is interesting in that it relocates DOS to $9D00-$BFFF, but in doing so, also reprograms the addressing to work in the new location.  I didn't spend time studying exactly how it accomplishes it, but that's the end result.  It leaves a converted copy in place where it was, so the execution continues here:

1C15-   C6 41       DEC   $41
1C17-   C6 43       DEC   $43
1C19-   D0 EE       BNE   $1C09
1C1B-   4C 8F 24    JMP   $248F

which then is:

248F-   4C 4D AA    JMP   $AA4D

Since this is converted, but not moved.

$AA4D is the entry point to fire up DOS and it takes over from there.  There is a HELLO program on the disk, which contains a line to BRUN RAMLOADER, which is the other file present in the catalog.  RAMLOADER loads into $300, and basically is a short routine to set up an I/O control block and call the RWTS entry point to read from track $16 sector $00 into $200 and then off it goes.  But aha, there's the address of the RWTS entry - $B6D5.  Bingo.

030F-   A9 03       LDA   #$03
0311-   A0 16       LDY   #$16
0313-   4C D5 B6    JMP   $B6D5

Now, the only remaining mystery is why tracks $13 and $14, sectors $E and $F, won't read from regular DOS.  I could tell from a nibble dump that track $13 had a data marker on a couple of sectors of D5 AA D3, and track $14 had D5 AA DD.  But just changing a normal D5 AA AD to those still wouldn't read them.  Setting $B942:18 would, and at first glance it looks like the resulting data is good (these are the sectors where HELLO and RAMLOADER reside, and I can see the correct data for them).  But if these copies are used, the program won't work.

Here's why (ignore the addresses, this is taken from a pre-relocated listing and should be $Bxxx):

38AF-   BD 8C C0    LDA   $C08C,X
38B2-   10 FB       BPL   $38AF
38B4-   C9 EE       CMP   #$EE
38B6-   D0 E7       BNE   $389F
38B8-   49 AD       EOR   #$AD
38BA-   F0 0B       BEQ   $38C7
38BC-   D0 00       BNE   $38BE
38BE-   BC 8C C0    LDY   $C08C,X
38C1-   10 FB       BPL   $38BE
38C3-   B9 00 BB    LDA   $BB00,Y
38C6-   2C A9 00    BIT   $00A9
38C9-   A0 56       LDY   #$56
38CB-   88          DEY
38CC-   84 26       STY   $26
38CE-   BC 8C C0    LDY   $C08C,X
38D1-   10 FB       BPL   $38CE
38D3-   59 00 BB    EOR   $BB00,Y
38D6-   A4 26       LDY   $26
38D8-   99 00 BD    STA   $BD00,Y
38DB-   D0 EE       BNE   $38CB

This is what the code looks like before DOS has a chance to start up.  But when DOS starts, there's this bit of code in it:

AADD-   A9 C5       LDA   #$C5
AADF-   8D B4 B8    STA   $B8B4
AAE2-   A9 31       LDA   #$31
AAE4-   8D B5 B8    STA   $B8B5

This will result in:

B8AF-   BD 8C C0    LDA   $C08C,X
B8B2-   10 FB       BPL   $B8AF
B8B4-   C5 31       CMP   $31
B8B6-   D0 E7       BNE   $B89F

There's code elsewhere in the DOS that sets $31 to #$D3 or #$DD when those sectors are to be read.  But what's more interesting is that when $31 is set to the normal #$AD, the EOR #$AD will result in zero and the branch will be taken.  This skips over reading an extra nibble (which is then used as a lookup into the read translate table)!  When $31 is set to #$D3 or #$DD, this extra read is not skipped, and those sectors then read correctly.  When read with the $B942:18 setting (and those markers), the bytes are decoded OK, but are off by an extra byte, causing the code to fail. Neato.
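
Here is a rough Python sketch of that decision, following the patched code at $B8B4-$B8C7 (the relocated equivalent of the $38B4-$38C7 listing). Everything named here is hypothetical: seed_for_data_field, read_nibble, and the two-entry translate table are stand-ins for illustration, not the real table at $BB00.

```python
def seed_for_data_field(marker, expected, read_nibble, translate):
    """Return the starting accumulator for the data-field XOR chain.

    marker   - third prologue nibble just read from disk
    expected - current contents of $31 (#$AD, #$D3 or #$DD)
    """
    if marker != expected:        # CMP $31 / BNE: wrong marker, keep hunting
        return None
    if marker ^ 0xAD == 0:        # EOR #$AD / BEQ $B8C7
        return 0x00               # LDA #$00 (the A9 00 hiding inside BIT $00A9)
    extra = read_nibble()         # LDY $C08C,X: one extra nibble off the disk
    return translate[extra]       # LDA $BB00,Y


# Hypothetical stand-ins: a two-entry translate table and a tiny nibble stream.
translate = {0x96: 0x00, 0x97: 0x01}
stream = iter([0x97, 0x96])


def read_nibble():
    return next(stream)


normal = seed_for_data_field(0xAD, 0xAD, read_nibble, translate)
special = seed_for_data_field(0xD3, 0xD3, read_nibble, translate)
remaining = list(stream)
print(normal, special, remaining)
```

The normal marker leaves the nibble stream untouched, while the D3 marker eats one nibble before the data proper begins. A reader that ignores the marker and skips that extra read starts decoding one nibble early, which is consistent with the off-by-one result described above.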

So, to overcome this and capture those four sectors correctly, Advanced Demuffin is called in.  Since the DOS is already captured (just run the routine at $1B03 and stop the jump to $AA4D; it's already left sitting at $1D00-$3FFF, just boot a slave disk and save it out) and the RWTS entry is conveniently known, the only trick to getting Advanced Demuffin to work is to tweak the IOB a bit to set up $31 with a #$D3 or #$DD as needed (and use the DOS after the mod to $B8B4 has been made to compare to $31).

Then to get the DOS to no longer mess with the extra byte and the changing marker, I opted to simply tweak the code to store C9 AD at $B8B4 instead of C5 31, thus bypassing the concern completely (the rest of DOS will still conditionally set $31 to the #$D3/#$DD/#$AD values, but it no longer will matter).

That's pretty much it.  DOS is captured, and the rest of the data is converted.  The only remaining task is to put DOS back onto tracks $00-$02 and create a loader to get it into the correct location, and then jump to $AA4D to let the program continue as if nothing were different.  Rather than doing anything particularly fancy, I opted to squeeze a simple loader completely into track 0, sector 0 (and even then, only half of it).  I let the ROM routine load all of track $00, and then just used the now-loaded RWTS to load the rest of it before jumping to $AA4D.  And I had just enough bytes left to put a one-liner note on boot, so as to distinguish from other versions.

So lots of interesting bits in this one, and it was a good opportunity to re-learn some of this stuff after 25 years away from it.

]HR



