From koch%informatik.uni-kl.de@uklirb.informatik.uni-kl.de Wed Nov  3 22:43:41 1993
Received: from uni-kl.de (stepsun.uni-kl.de) by marsh.cs.curtin.edu.au with SMTP id AA02785
  (5.67a/IDA-1.5 for <gregorya@lillee.cs.curtin.edu.au>); Wed, 3 Nov 1993 22:43:41 +0800
Received: from uklirb.informatik.uni-kl.de by stepsun.uni-kl.de id aa02925;
          3 Nov 93 15:43 MET
Received: from rhein.informatik.uni-kl.de by uklirb.informatik.uni-kl.de
          id aa02978; 3 Nov 93 15:36 MET

I've done this already. My Apple ][+ emulator runs on Unix (graphics are done
with X Windows) and is written completely in C. If you have some knowledge
of C, get it and analyze it. It contains all the code needed to do 6:2
encoding and decoding on the fly. There is also code for the stepper motor
emulation and such.

>I have heard that a book called 'Beneath Apple DOS' would explain all this
>but none of the bookshops here in this backwater of Perth, Western Australia
>carry it.

Take my advice: GET THIS BOOK. You will probably fail to write your emulator
without it! The needed information is in the DOS 3.3 Reference Manual too,
but in very compressed form. Perhaps a posting in c.s.a.marketplace will
do it?!?

>I've tried to figure out the //c disk ROM code without success (some comments
>with the source would be nice).

Sorry, but I can't help you with that. I don't own an Apple //c.

>This time I'm also wanting to know how DOS 3.3 and ProDOS 8 encode their
>data into something the disk drive can understand.  My understanding is that
>they use something called '6 and 2' which encodes 256 bytes of 'real' data
>into 342 (or thereabouts) of disk data.  I would  like to know the details
>of that process.

OK, I'll try to explain it.
A track consists of 16 sectors. Each sector has the following layout:

	Autosync	Sector		Autosync	Sector
	bytes		header		bytes		data

	1111111100	$D5 $AA $96	1111111100	$D5 $AA $AD
	(10 bits),	$vo $vo $tr $tr	(10 bits),	342 bytes encoded data
	at least 5	$se $se $cs $cs	at least 5	$cs
			$DE $AA $EB			$DE $AA $EB

Each autosync byte is written as 10 bits: eight 1 bits followed by two 0 bits.
There are normally 8 or 10 of them in a gap; only the beginning of the track
(before sector 0) has many more (normally about 30).
The Apple disk controller reads them as $FF once it is in sync.

The sector header begins with the $D5 $AA $96 sequence.
Then come the volume#, track#, sector# and a checksum, in this order.
The header ends with the trailer $DE $AA $EB.
The volume#, track#, sector# and checksum are each 4:4 encoded. A byte with
the bits pqrstuvw is 4:4 encoded into two bytes looking like
1p1r1t1v and 1q1s1u1w. This encoding is simple and fast:
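	LDA	track
	LSR			; 0pqrstuv
	ORA	#$AA		; 1p1r1t1v (note the immediate operand)
	JSR	wbyte
	LDA	track
	ORA	#$AA		; 1q1s1u1w
	JSR	wbyte
And decoding is even simpler (rbyte is assumed to return the byte in A):
	JSR	rbyte		; 1p1r1t1v
	SEC
	ROL			; p1r1t1v1, the carry fills bit 0
	STA	track
	JSR	rbyte		; 1q1s1u1w
	AND	track		; pqrstuvw
	STA	track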
The checksum is simply volume#, track# and sector# XORed together and then
4:4 encoded.
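
To make the 4:4 scheme concrete, here is a minimal C sketch of the header
encoding. The function names (encode44, decode44, build_header) are made up
for illustration, and the field order volume, track, sector, checksum is an
assumption based on how DOS 3.3 writes its headers as far as I know. Autosync
bytes are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADER_LEN 14   /* 3 prologue + 4 * 2 encoded + 3 trailer bytes */

/* 4:4 encode pqrstuvw into 1p1r1t1v followed by 1q1s1u1w. */
static void encode44(uint8_t value, uint8_t out[2])
{
    out[0] = (uint8_t)((value >> 1) | 0xAA);
    out[1] = (uint8_t)(value | 0xAA);
}

/* Inverse: shift the first byte left, fill bit 0 with 1, AND with the second. */
static uint8_t decode44(const uint8_t in[2])
{
    return (uint8_t)(((in[0] << 1) | 0x01) & in[1]);
}

/* Build one sector header (assumed order: volume, track, sector, checksum). */
static void build_header(uint8_t volume, uint8_t track, uint8_t sector,
                         uint8_t out[HEADER_LEN])
{
    const uint8_t checksum = (uint8_t)(volume ^ track ^ sector);
    size_t n = 0;

    assert(sector < 16);               /* 16 sectors per track */
    out[n++] = 0xD5; out[n++] = 0xAA; out[n++] = 0x96;
    encode44(volume, &out[n]);   n += 2;
    encode44(track, &out[n]);    n += 2;
    encode44(sector, &out[n]);   n += 2;
    encode44(checksum, &out[n]); n += 2;
    out[n++] = 0xDE; out[n++] = 0xAA; out[n++] = 0xEB;
}
```

A reader would look for $D5 $AA $96, call decode44 on each of the four byte
pairs, and reject the header if the XOR of the first three results differs
from the fourth.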

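Now comes the data part:
It is preceded by $D5 $AA $AD.
Each of the 256 data bytes is stripped to its upper 6 bits. The remaining
low 2 bits are swapped and combined in groups of 3 into 86 additional
bytes (256/3=85.33, rounded up). For data bytes with the bits pqrstuvw:
	$00:	00pqrstu
	$01:	00pqrstu
	 :	   :
	 :	   :
	$FE:	00pqrstu
	$FF:	00pqrstu

	$00:	00wvwvwv	v+w from byte $AC, $56 + $00
	$01:	00wvwvwv	v+w from byte $AD, $57 + $01
	 :
	 :
	$53:	00wvwvwv	v+w from byte $FF, $A9 + $53
	$54:	0000wvwv	v+w from byte $AA + $54 (top pair unused)
	$55:	0000wvwv	v+w from byte $AB + $55 (top pair unused)
On the disk the 86 additional bytes come first, followed by the 256
six-bit bytes.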
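
A short C sketch of this split may help. It follows the layout DOS 3.3's RWTS
uses as far as I know, which is an assumption here: the upper six bits of each
byte are kept, the low two bits are swapped, and additional byte j collects
the bit pairs of data bytes j, j+86 and j+172. The names split62, join62 and
swap2 are invented for this example, and the output is already in on-disk
order (86 additional values first).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DATA_LEN   256
#define AUX_LEN    86
#define NIBBLE_LEN (AUX_LEN + DATA_LEN)   /* 342 six-bit values */

/* Swap the two low bits: ......vw becomes ......wv. */
static uint8_t swap2(uint8_t b)
{
    return (uint8_t)(((b & 1) << 1) | ((b >> 1) & 1));
}

/* Split 256 data bytes into 342 six-bit values. */
static void split62(const uint8_t data[DATA_LEN], uint8_t nib[NIBBLE_LEN])
{
    for (size_t j = 0; j < AUX_LEN; j++) {
        uint8_t aux = swap2(data[j]);
        aux |= (uint8_t)(swap2(data[j + 86]) << 2);
        if (j + 172 < DATA_LEN)            /* the last two have no third byte */
            aux |= (uint8_t)(swap2(data[j + 172]) << 4);
        nib[j] = aux;
    }
    for (size_t i = 0; i < DATA_LEN; i++)
        nib[AUX_LEN + i] = data[i] >> 2;   /* upper six bits */
}

/* Rebuild the 256 data bytes from the 342 six-bit values. */
static void join62(const uint8_t nib[NIBBLE_LEN], uint8_t data[DATA_LEN])
{
    for (size_t i = 0; i < DATA_LEN; i++) {
        uint8_t pair = (uint8_t)((nib[i % AUX_LEN] >> (2 * (i / AUX_LEN))) & 3);
        assert(nib[AUX_LEN + i] < 64);     /* must be a six-bit value */
        data[i] = (uint8_t)((nib[AUX_LEN + i] << 2) | swap2(pair));
    }
}
```

The real RWTS leaves leftover bits in the unused top pair of the last two
additional bytes; this sketch writes zeros there, which a reader ignores
anyway.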

The 342 6-bit bytes (nybbles) are now translated into disk bytes.
The Apple uses GCR encoding, which means that no separate clock bits are
written to the disk during normal writing. Thus, the disk bytes must contain
enough 1 bits to keep the timing correct. Here is what "enough" means:
	The first bit must be set.
	There must not be more than 2 consecutive zero bits.
	Only one pair of zero bits per byte is allowed.
	At least two adjacent bits must be set, not counting the first bit.
If you write down the bit patterns of every number between 128 and 255,
you will find that 72 bytes fulfill the first three requirements. The
fourth one removes 8 of them, among them $D5 and $AA, which are reserved
for the header and trailer marks and never appear in the encoded data.
This leaves exactly 64 valid bytes. These are arranged in a lookup table
(as well as a reverse lookup table for reading).
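
The lookup tables can be generated instead of typed in. The C sketch below
enumerates all byte values and keeps those that have the first bit set, at
most one pair of adjacent zero bits, and at least one pair of adjacent 1 bits
below the first bit. That last condition, and the assumption that the table
is simply in ascending order, come from my memory of the DOS 3.3 table, not
from the list above. The names is_disk_byte and build_tables are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Return 1 if b is a valid disk byte under the rules described above. */
static int is_disk_byte(unsigned b)
{
    int zero_pairs = 0, adjacent_ones = 0;

    if (!(b & 0x80))
        return 0;                          /* first bit must be set */
    for (int bit = 0; bit < 7; bit++) {    /* look at bits (bit+1, bit) */
        unsigned two = (b >> bit) & 3;
        if (two == 0)
            zero_pairs++;                  /* "000" counts as two pairs */
        if (two == 3 && bit < 6)
            adjacent_ones = 1;             /* pair must not include bit 7 */
    }
    return zero_pairs <= 1 && adjacent_ones;
}

/* Fill the write table (6-bit value -> disk byte) and the read table
 * (disk byte -> 6-bit value, or -1 if invalid). Returns the entry count. */
static int build_tables(uint8_t to_disk[64], int8_t from_disk[256])
{
    int n = 0;

    for (unsigned b = 0; b < 256; b++) {
        from_disk[b] = -1;
        if (is_disk_byte(b)) {
            assert(n < 64);                /* the rules should yield exactly 64 */
            to_disk[n] = (uint8_t)b;
            from_disk[b] = (int8_t)n;
            n++;
        }
    }
    return n;
}
```

Counting overlapping zero pairs covers two rules at once: three zeros in a
row contain two overlapping pairs and are rejected just like two separate
pairs.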
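Every 6-bit byte is XORed with the 6-bit byte before it (the first one with
zero), and the result is translated into its disk byte before writing.
The checksum is the last 6-bit byte, translated the same way, so the XOR over
all nybbles read, checksum included, comes out as zero.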
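
Here is a minimal C sketch of writing and reading the 343 data bytes. It
assumes the running-XOR scheme that DOS 3.3 uses as far as I know: each 6-bit
value is XORed with its predecessor before translation, and the last value is
appended as the checksum. The function names are invented, and the two
translation tables are passed in as parameters instead of being defined here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NIBBLE_LEN 342

/* Turn 342 six-bit values (in on-disk order) into 343 disk bytes. */
static void encode_field(const uint8_t nib[NIBBLE_LEN],
                         const uint8_t to_disk[64],
                         uint8_t out[NIBBLE_LEN + 1])
{
    uint8_t prev = 0;

    for (size_t i = 0; i < NIBBLE_LEN; i++) {
        assert(nib[i] < 64);               /* must be a six-bit value */
        out[i] = to_disk[nib[i] ^ prev];   /* XOR with the previous value */
        prev = nib[i];
    }
    out[NIBBLE_LEN] = to_disk[prev];       /* checksum byte */
}

/* Inverse of encode_field. Returns 0 on success, -1 on an invalid disk
 * byte or a checksum mismatch. */
static int decode_field(const uint8_t in[NIBBLE_LEN + 1],
                        const int8_t from_disk[256],
                        uint8_t nib[NIBBLE_LEN])
{
    uint8_t acc = 0;

    for (size_t i = 0; i <= NIBBLE_LEN; i++) {
        int v = from_disk[in[i]];
        if (v < 0)
            return -1;                     /* not a valid disk byte */
        acc ^= (uint8_t)v;                 /* undo the running XOR */
        if (i < NIBBLE_LEN)
            nib[i] = acc;
    }
    return acc == 0 ? 0 : -1;              /* zero means the checksum matched */
}
```

On a real disk the result sits between the $D5 $AA $AD mark and the
$DE $AA $EB trailer.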

This is what I know about these things. Everything is from memory, so
don't flame me if I made a mistake.

Ah... one thing:
The physical sectors are arranged consecutively. The numbers in the header
run from 0 to 15. DOS 3.3 as well as ProDOS, UCSD and CP/M translate these
physical sector numbers into logical sector numbers. Unfortunately,
each does this in a different manner. While ProDOS and UCSD share the same
(2 ascending) sector interleave, DOS 3.3 uses a 2 descending one and CP/M
a 3 ascending one. Don't get confused: the PHYSICAL disk layout is
the same, only the logical interpretation varies, because different
block sizes are used (DOS:256, ProDOS+UCSD:512, CP/M:128).
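
For an emulator the translation comes down to two small tables. The C sketch
below gives the logical-to-physical tables for DOS 3.3 and ProDOS as I
remember them, so treat the values as an assumption to check against a real
disk. CP/M is left out because I am less sure of its table. The helper names
are made up.

```c
#include <assert.h>
#include <stdint.h>

/* Logical 256-byte sector -> physical sector (values recalled, not verified). */
static const uint8_t dos33_to_physical[16] =
    { 0, 13, 11, 9, 7, 5, 3, 1, 14, 12, 10, 8, 6, 4, 2, 15 };
static const uint8_t prodos_to_physical[16] =
    { 0, 2, 4, 6, 8, 10, 12, 14, 1, 3, 5, 7, 9, 11, 13, 15 };

/* Physical sector holding one half (0 or 1) of a 512-byte ProDOS block,
 * where block_in_track runs from 0 to 7. */
static uint8_t prodos_block_sector(unsigned block_in_track, unsigned half)
{
    assert(block_in_track < 8 && half < 2);
    return prodos_to_physical[block_in_track * 2 + half];
}

/* Position in ProDOS order of the sector that DOS 3.3 calls `logical`. */
static uint8_t dos33_to_prodos(unsigned logical)
{
    assert(logical < 16);
    for (uint8_t i = 0; i < 16; i++)
        if (prodos_to_physical[i] == dos33_to_physical[logical])
            return i;
    return 0xFF;                           /* not reached for valid tables */
}
```

With these tables a disk image stored in DOS 3.3 sector order can be read
as ProDOS blocks by going through the physical sector number.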

Peter (koch@informatik.uni-kl.de)

P.S.: You can find my emulator on ftp.uni-kl.de:/pub/apple2

P.

