6502 “Illegal” Opcodes Demystified

A closer look at the “illegal” opcodes and undocumented instructions of the MOS 6502 MPU.

The instruction table of the MOS 6502 MPU, designed by MOS Technology and introduced in 1975 (the CMOS version, 65C02, was developed by Western Design Center), has some obvious gaps, with just 56 instructions documented in various address modes, occupying 151 of the 256 possible opcodes. This leaves 105 undocumented slots, and the 6502 community has been eager to fill these gaps ever since.

Still, some mystery remains and some questions are unanswered: Were at least some of them intentional (especially since some of them are handy for block transfer, something the Z80 has dedicated instructions for), or are they all accidental? How do they behave, and why? Here, we’ll try to come up with some answers to these questions.

First, let’s have a look at the instruction table, as it is commonly presented, with the blank gaps filled in. (Here, for the “illegal” opcodes, we use the mnemonics used by the DASM and ACME assemblers, with the exception of “USBC” for instruction code $EB, where these use plain “SBC”.)

MOS 6502 instruction table
Instruction set of the MOS 6502 MPU, “illegals” on grey background. — Open in a new tab.

And here are all the 21 (more or less) “illegal” opcodes (alternative names given in parentheses) as they are commonly described:

ALR (ASR)

AND oper + LSR

A AND oper, 0 -> [76543210] -> C

addressing assembler opc bytes cycles
immediate ALR #oper 4B 2 2
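
As a minimal sketch of the commonly described behaviour above (not a statement about the hardware), ALR can be modelled in Python as an AND followed by a logical shift right. The function name `alr` and the returned tuple are illustrative choices, not part of any assembler or emulator API.

```python
def alr(a, oper):
    """Model ALR #oper: A AND oper, then LSR. Returns (new A, carry)."""
    t = a & oper & 0xFF
    carry = t & 1          # bit 0 falls out into the carry flag
    return t >> 1, carry   # a zero is shifted into bit 7


print(alr(0xFF, 0x03))  # (1, 1)
```
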
ANC

AND oper + set C as ASL

A AND oper, bit(7) -> C

addressing assembler opc bytes cycles
immediate ANC #oper 0B 2 2
ANC (ANC2)

AND oper + set C as ROL

effectively the same as instr. 0B

A AND oper, bit(7) -> C

addressing assembler opc bytes cycles
immediate ANC #oper 2B 2 2
ANE (XAA)

* AND X + AND oper

Highly unstable, do not use.

A base value in A is determined based on the contents of A and a constant, which may typically be $00, $FF, $EE, etc. The value of this constant depends on temperature, the chip series, and maybe other factors as well.
In order to eliminate these uncertainties from the equation, use either 0 as the operand or a value of $FF in the accumulator.

(A OR CONST) AND X AND oper -> A

addressing assembler opc bytes cycles
immediate ANE #oper 8B 2 2 ††
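
The advice on eliminating the constant can be checked against the formula given above. The following Python sketch simply evaluates `(A OR CONST) AND X AND oper` for the example constants named in the text; the function name `ane` and the chosen test values are assumptions made for illustration, and the model says nothing about what a particular chip will actually do.

```python
def ane(a, x, oper, const):
    """(A OR CONST) AND X AND oper -> A, per the formula above."""
    return (a | const) & x & oper & 0xFF


CANDIDATES = (0x00, 0xFF, 0xEE)  # example constants named in the text

# Operand 0: the result is 0 whatever the constant is.
zero_operand = {ane(0x5A, 0xFF, 0x00, c) for c in CANDIDATES}
# A = $FF: (A OR CONST) is always $FF, so the result is X AND oper.
full_acc = {ane(0xFF, 0x3C, 0x0F, c) for c in CANDIDATES}
# Any other combination lets the constant show through.
unsafe = {ane(0x00, 0xFF, 0xFF, c) for c in CANDIDATES}

print(len(zero_operand), len(full_acc), len(unsafe))
```

In the first two cases the set collapses to a single value, while the third case yields one result per constant.
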
ARR

AND oper + ROR

This operation involves the adder:
V-flag is set according to (A AND oper) + oper
The carry is not set, but bit 7 (sign) is exchanged with the carry

A AND oper, C -> [76543210] -> C

addressing assembler opc bytes cycles
immediate ARR #oper 6B 2 2
DCP (DCM)

DEC oper + CMP oper

M – 1 -> M, A – M

addressing assembler opc bytes cycles
zeropage DCP oper C7 2 5
zeropage,X DCP oper,X D7 2 6
absolute DCP oper CF 3 6
absolut,X DCP oper,X DF 3 7
absolut,Y DCP oper,Y DB 3 7
(indirect,X) DCP (oper,X) C3 2 8
(indirect),Y DCP (oper),Y D3 2 8
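
A rough Python model of the description above: the memory operand is decremented first, then compared against the accumulator with the usual CMP flag results. The name `dcp` and the flag dictionary are hypothetical helpers for this sketch only.

```python
def dcp(a, m):
    """Model DCP: M - 1 -> M, then compare A with the new M.

    Returns (new M, flags) where flags holds C, Z and N as set by CMP.
    """
    m = (m - 1) & 0xFF
    diff = (a - m) & 0xFF
    flags = {"C": a >= m, "Z": a == m, "N": bool(diff & 0x80)}
    return m, flags


print(dcp(0x05, 0x06))
```

Here the operand $06 becomes $05, which equals the accumulator, so Z and C come out set.
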
ISC (ISB, INS)

INC oper + SBC oper

M + 1 -> M, A – M – (1 – C) -> A

addressing assembler opc bytes cycles
zeropage ISC oper E7 2 5
zeropage,X ISC oper,X F7 2 6
absolute ISC oper EF 3 6
absolut,X ISC oper,X FF 3 7
absolut,Y ISC oper,Y FB 3 7
(indirect,X) ISC (oper,X) E3 2 8
(indirect),Y ISC (oper),Y F3 2 8
LAS (LAR)

LDA/TSX oper

M AND SP -> A, X, SP

addressing assembler opc bytes cycles
absolut,Y LAS oper,Y BB 3 4*
LAX

LDA oper + LDX oper

M -> A -> X

addressing assembler opc bytes cycles
zeropage LAX oper A7 2 3
zeropage,Y LAX oper,Y B7 2 4
absolute LAX oper AF 3 4
absolut,Y LAX oper,Y BF 3 4*
(indirect,X) LAX (oper,X) A3 2 6
(indirect),Y LAX (oper),Y B3 2 5*
LXA (LAX immediate)

Store * AND oper in A and X

Highly unstable, involves a ‘magic’ constant, see ANE

(A OR CONST) AND oper -> A -> X

addressing assembler opc bytes cycles
immediate LXA #oper AB 2 2 ††
RLA

ROL oper + AND oper

M = C <- [76543210] <- C, A AND M -> A

addressing assembler opc bytes cycles
zeropage RLA oper 27 2 5
zeropage,X RLA oper,X 37 2 6
absolute RLA oper 2F 3 6
absolut,X RLA oper,X 3F 3 7
absolut,Y RLA oper,Y 3B 3 7
(indirect,X) RLA (oper,X) 23 2 8
(indirect),Y RLA (oper),Y 33 2 8
RRA

ROR oper + ADC oper

M = C -> [76543210] -> C, A + M + C -> A, C

addressing assembler opc bytes cycles
zeropage RRA oper 67 2 5
zeropage,X RRA oper,X 77 2 6
absolute RRA oper 6F 3 6
absolut,X RRA oper,X 7F 3 7
absolut,Y RRA oper,Y 7B 3 7
(indirect,X) RRA (oper,X) 63 2 8
(indirect),Y RRA (oper),Y 73 2 8
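
The carry chaining in RRA is easy to miss: the bit rotated out of M becomes the carry-in of the addition. A minimal Python sketch of the formula above, assuming binary (non-decimal) mode and ignoring the V, N and Z flags; `rra` is an illustrative name.

```python
def rra(a, m, carry):
    """Model RRA in binary mode: ROR M, then ADC with the rotated-out bit.

    Returns (new M, new A, carry out).
    """
    rotated = ((carry & 1) << 7) | ((m & 0xFF) >> 1)
    carry = m & 1                      # bit 0 of M becomes the carry
    total = (a & 0xFF) + rotated + carry
    return rotated, total & 0xFF, int(total > 0xFF)


print(rra(0x10, 0x03, 0))  # (1, 18, 0)
```
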
SAX (AXS, AAX)

A and X are put on the bus at the same time (resulting effectively in an AND operation) and stored in M

A AND X -> M

addressing assembler opc bytes cycles
zeropage SAX oper 87 2 3
zeropage,Y SAX oper,Y 97 2 4
absolute SAX oper 8F 3 4
(indirect,X) SAX (oper,X) 83 2 6
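
The "both registers on the bus at once" description can be modelled as a wired AND: every line starts high and any driver with a zero bit pulls it low. This Python sketch is only an illustration of that external effect; `drive_bus` and `sax` are hypothetical names, and nothing here is a claim about the actual circuit.

```python
def drive_bus(*drivers):
    """Combine several 8-bit values driven onto the same lines (wired AND)."""
    bus = 0xFF                 # all lines start high
    for value in drivers:
        bus &= value & 0xFF    # a zero bit from any driver pulls the line low
    return bus


def sax(a, x):
    """Value stored by SAX: A AND X."""
    return drive_bus(a, x)


print(hex(sax(0xF0, 0x3C)))  # 0x30
```
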
SBX (AXS, SAX)

CMP and DEX at once, sets flags like CMP

(A AND X) – oper -> X

addressing assembler opc bytes cycles
immediate SBX #oper CB 2 2
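
A small Python sketch of the formula above, with flags derived the same way CMP derives them (no carry-in is involved). The name `sbx` and the flag dictionary are assumptions for illustration.

```python
def sbx(a, x, oper):
    """Model SBX #oper: (A AND X) - oper -> X, flags as for CMP.

    Returns (new X, flags).
    """
    lhs = a & x & 0xFF
    result = (lhs - oper) & 0xFF
    flags = {"C": lhs >= oper, "Z": lhs == oper, "N": bool(result & 0x80)}
    return result, flags


# With A = $FF the AND is transparent, so X is simply reduced by the operand.
print(sbx(0xFF, 0x10, 0x04))
```
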
SHA (AHX, AXA)

Stores A AND X AND (high-byte of addr. + 1) at addr.

unstable: sometimes ‘AND (H+1)’ is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

A AND X AND (H+1) -> M

addressing assembler opc bytes cycles
absolut,Y SHA oper,Y 9F 3 5
(indirect),Y SHA (oper),Y 93 2 6
SHX (A11, SXA, XAS)

Stores X AND (high-byte of addr. + 1) at addr.

unstable: sometimes ‘AND (H+1)’ is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

X AND (H+1) -> M

addressing assembler opc bytes cycles
absolut,Y SHX oper,Y 9E 3 5
SHY (A11, SYA, SAY)

Stores Y AND (high-byte of addr. + 1) at addr.

unstable: sometimes ‘AND (H+1)’ is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

Y AND (H+1) -> M

addressing assembler opc bytes cycles
absolut,X SHY oper,X 9C 3 5
SLO (ASO)

ASL oper + ORA oper

M = C <- [76543210] <- 0, A OR M -> A

addressing assembler opc bytes cycles
zeropage SLO oper 07 2 5
zeropage,X SLO oper,X 17 2 6
absolute SLO oper 0F 3 6
absolut,X SLO oper,X 1F 3 7
absolut,Y SLO oper,Y 1B 3 7
(indirect,X) SLO (oper,X) 03 2 8
(indirect),Y SLO (oper),Y 13 2 8
SRE (LSE)

LSR oper + EOR oper

M = 0 -> [76543210] -> C, A EOR M -> A

addressing assembler opc bytes cycles
zeropage SRE oper 47 2 5
zeropage,X SRE oper,X 57 2 6
absolute SRE oper 4F 3 6
absolut,X SRE oper,X 5F 3 7
absolut,Y SRE oper,Y 5B 3 7
(indirect,X) SRE (oper,X) 43 2 8
(indirect),Y SRE (oper),Y 53 2 8
TAS (XAS, SHS)

Puts A AND X in SP and stores A AND X AND (high-byte of addr. + 1) at addr.

unstable: sometimes ‘AND (H+1)’ is dropped, page boundary crossings may not work (with the high-byte of the value used as the high-byte of the address)

A AND X -> SP, A AND X AND (H+1) -> M

addressing assembler opc bytes cycles
absolut,Y TAS oper,Y 9B 3 5
USBC (SBC)

SBC oper + NOP

effectively same as normal SBC immediate, instr. E9.

A – M – (1 – C) -> A

addressing assembler opc bytes cycles
immediate USBC #oper EB 2 2
NOPs (including DOP, TOP)

Instructions resulting in ‘no operation’ in various address modes. Operands are ignored.

opc addressing bytes cycles
1A implied 1 2
3A implied 1 2
5A implied 1 2
7A implied 1 2
DA implied 1 2
FA implied 1 2
80 immediate 2 2
82 immediate 2 2
89 immediate 2 2
C2 immediate 2 2
E2 immediate 2 2
04 zeropage 2 3
44 zeropage 2 3
64 zeropage 2 3
14 zeropage,X 2 4
34 zeropage,X 2 4
54 zeropage,X 2 4
74 zeropage,X 2 4
D4 zeropage,X 2 4
F4 zeropage,X 2 4
0C absolute 3 4
1C absolut,X 3 4*
3C absolut,X 3 4*
5C absolut,X 3 4*
7C absolut,X 3 4*
DC absolut,X 3 4*
FC absolut,X 3 4*
JAM (KIL, HLT)

These instructions freeze the CPU.

The processor will be trapped indefinitely in the T1 phase with $FF on the data bus. A reset is required.

Instruction codes: 02, 12, 22, 32, 42, 52, 62, 72, 92, B2, D2, F2
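
The listed codes are not scattered at random. As an observation derived purely from the list (using the bit-vector layout “aaabbbcc” of the opcode byte, and not a claim about the decoder), they are exactly the opcodes with the low two bits equal to 2 whose middle three bits are 4, plus those whose middle three bits are 0 and whose top three bits are at most 3. A short Python check, with `JAMS` copied from the line above:

```python
JAMS = [0x02, 0x12, 0x22, 0x32, 0x42, 0x52, 0x62, 0x72, 0x92, 0xB2, 0xD2, 0xF2]


def is_jam_slot(opcode):
    """True if the opcode matches the pattern observed in the JAM list."""
    a = (opcode >> 5) & 0b111   # top three bits
    b = (opcode >> 2) & 0b111   # middle three bits
    c = opcode & 0b11           # low two bits
    return c == 2 and (b == 4 or (b == 0 and a <= 3))


derived = [op for op in range(256) if is_jam_slot(op)]
print(derived == JAMS)  # True
```
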

Legend to markers used in the instruction details:

*
add 1 to cycles if page boundary is crossed
†
unstable
††
highly unstable

Disclaimer:
Information is provided as-is, without any guarantee of completeness or correctness.
None of these “illegal” instructions are guaranteed to work. Some are highly unstable; some may even start two asynchronous threads competing in a race condition, with the winner determined by such minuscule factors as temperature or minor differences in the production series. At other times, the outcome depends on the exact values involved and the chip series.
Use with care and at your own risk.

Well, this is all fine and good, but… we really do not learn much about what they are and why they are.
Let’s risk another look at the instruction layout, as it ought to be viewed.

Another Look at the Instruction Layout

The 6502 instruction table is laid out according to a pattern a-b-c, where
a and b are an octal number each, followed by a group of two binary digits c,
as in the bit-vector “aaabbbcc”.

a a a b b b c c
bit 7 6 5 4 3 2 1 0
(0…7) (0…7) (0…3)

Example:
All ROR instructions share a = 3 and c = 2 (3b2) with the address mode in b.
At the same time, all instructions addressing the zero-page share b = 1 (a1c).

abc = 312  =>  ( 3 << 5 | 1 << 2 | 2 )  =  %011.001.10  =  $66  “ROR zpg”.
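
The a-b-c decomposition is plain bit slicing. A minimal Python sketch, with `split_opcode` and `join_opcode` as illustrative helper names:

```python
def split_opcode(opcode):
    """Split an 8-bit opcode into the (a, b, c) fields of the vector aaabbbcc."""
    a = (opcode >> 5) & 0b111
    b = (opcode >> 2) & 0b111
    c = opcode & 0b11
    return a, b, c


def join_opcode(a, b, c):
    """Rebuild the opcode byte from its a, b and c fields."""
    return (a << 5) | (b << 2) | c


print(split_opcode(0x66))          # (3, 1, 2)
print(hex(join_opcode(3, 1, 2)))   # 0x66
```
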

If we arrange the instruction table by components c, a and b, we find them all neatly lined up per address mode in the vertical columns (with the notable exception of instructions related to the X register, which show up with their respective Y counterpart for address modes involving an index by X). Notably, all the “illegals” adhere strictly to this scheme.

Moreover, all the instructions internal to the CPU and its flow of control are listed in the top quarter at c=0, while the bottom quarter at c=3, where we find the majority of “illegal” opcodes, is completely unpopulated by official opcodes. Further, for the sections where c=1 or c=2, we see opcodes of a kind sharing the same row (with the notable outliers of the two stack transfer instructions “TXS” and “TSX”).

MOS 6502 instruction layout
Instruction layout of the MOS 6502 MPU, “illegals” on grey background. — Open in a new tab.

While this is certainly informative, it still doesn’t reveal a systemic aspect of the unimplemented instructions, nor does this view tell us what they really are.

So let’s give this another try, this time arranging the instruction layout by components a, c and b:

MOS 6502 instruction table, structured view
Structured view of the 6502 instruction layout, “illegals” on grey background. — Open in a new tab.

Well, this is better, much better.

NOPs

First, we learn what the additional NOPs really are. By comparing opcodes by row and address modes by column, we can clearly see what these ought to be.

E.g.,

$80 (a=4, c=0, b=0) is clearly “STY immediate”, attempting to store the contents of the Y register in the literal operand.

Generally speaking, these additional NOPs are instructions with non-functional or nonsensical address modes, which do execute, but without any external effects.

JAMs

However, instructions of this group which involve indirect addressing fail entirely with the CPU infinitely trapped in T1 phase, resulting in a “JAM” (or KIL), rendering the CPU unresponsive and requiring a reset.

Instructions at ‘c = 3’

This is the really interesting part, the meat of the “illegal opcodes”.

Generally, we may observe that any of the instructions at c=3 are really inheriting their behavior from those at c=1 and c=2 in the same slot, found in the rows immediately above, same column, using the address mode of the instruction at c=1. Mind that in binary 3 is the composite of 1 and 2 with bits 0 and 1 set.

In other words, any instruction xxxxxx11 will execute the instructions at xxxxxx01 and xxxxxx10 at once, using the address mode of the instruction at xxxxxx01. (However, the general rule regarding X and Y register specific indexed address modes still applies.)

E.g.,

“SAX abs” ($8F, a=4,c=3,b=3) is the composite of
“STA abs” ($8D, a=4,c=1,b=3) and
“STX abs” ($8E, a=4,c=2,b=3).

E.g.,

“LAX X,ind” ($A3, a=5,c=3,b=0) is the composite of
“LDA X,ind” ($A1, a=5,c=1,b=0) and
“LDX imm” ($A2, a=5,c=2,b=0).
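
Finding the two constituents of a c=3 opcode amounts to clearing one of the two low bits. A short Python sketch of that rule, where `constituents` is a hypothetical helper name:

```python
def constituents(opcode):
    """For an opcode xxxxxx11, return (xxxxxx01, xxxxxx10)."""
    if opcode & 0b11 != 0b11:
        raise ValueError("not a c=3 opcode")
    return opcode & 0xFD, opcode & 0xFE   # clear bit 1, clear bit 0


for op in (0x8F, 0xA3):
    c1, c2 = constituents(op)
    print(f"${op:02X} = ${c1:02X} + ${c2:02X}")
```

This prints `$8F = $8D + $8E` and `$A3 = $A1 + $A2`, matching the two examples above.
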

The “Magic” Constant

Let’s have a closer look at the two highly unstable instructions “ANE” (XAA) and “LXA” (LAX immediate) involving a “magic constant” — typically $00, $FF, $EE, etc. —, which are both combinations of an accumulator operation and an inter-register transfer between the accumulator and the X register:

$8B (a=4,c=3,b=2): ANE imm = STA imm (NOP) + TXA
                   (A OR CONST) AND X AND oper -> A

$AB (a=5,c=3,b=2): LXA imm = LDA imm + TAX
                   (A OR CONST) AND oper -> A -> X

In the case of “ANE”, the contents of the accumulator are put on the internal
data lines at the same time as the contents of the X-register, while there’s
also the operand read for the immediate operation, with the result
transferred to the accumulator.

In the case of “LXA”, the immediate operand and the contents of the accumulator
are competing for the input lines, while the result will be transferred to
both the accumulator and the X register.

The outcome of these competing, noisy conditions depends on the production
series of the chip, and maybe even on environmental conditions. This results
in an OR-ing of the accumulator with the “magic constant” combined with an
AND-ing of the competing inputs. The final transfer to the target register(s)
then seems to work as may be expected.

(We may note that all the instructions involved in these two opcodes complete in 2 cycles, the shortest sequence available on the 6502, meaning, everything is virtually happening “at once”.)

This AND-ing of competing output values suggests that the 6502 is working internally
in active low logic, where all data lines are first set to high and then
cleared for any zero bits. This also suggests that the “magic constant” stands
merely for a partial transfer of the contents of the accumulator.

(Mind that this is not a qualified statement about the internals of the 6502 hardware, but merely an observation on its external effects.)

Much of this also applies to “TAS” (XAS, SHS), $9B, but here the extra cycles
for indexed addressing seem to contribute to the conflict being resolved
without this “magic constant”. However, “TAS” is still unstable.

The ‘H+1’ Group

There are four instructions, which add the peculiar term ‘high-byte of provided address + 1’ to the equation. These are:

SHA (AHX, AXA)       A AND X AND (H+1) -> M
                     $9F  SHA abs,Y  (5)

SHX (A11, SXA, XAS)  X AND (H+1) -> M
                     $9E  SHX abs,Y  (5)

SHY (A11, SYA, SAY)  Y AND (H+1) -> M
                     $9C  SHY abs,X  (5)

TAS (XAS, SHS)       A AND X -> SP, A AND X AND (H+1) -> M
                     $9B  TAS abs,Y  (5)

We may already see where this comes from: as the calculation of the effective address involves the ALU, a partial result for the high-byte adds to the conflicting output values. However, depending on minor timing discrepancies, this term may also be dropped (meaning, become overridden).
We may also discern why the effective high-address may be replaced by the output value altogether in case a page boundary is crossed, since this provides just the extra amount of timing required to allow the output value to stabilize and to override the address high-byte. Again, these instructions are unstable.
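
The behaviour described for this group can be put into a small Python model, here for “SHX abs,Y”. This is a sketch of the external effects as described in this article only: `shx` is an illustrative name, and the `drop_term` parameter is a made-up switch standing in for the timing-dependent case where ‘AND (H+1)’ is lost.

```python
def shx(x, base, y, drop_term=False):
    """Model SHX abs,Y. Returns (address written, value written)."""
    base &= 0xFFFF
    h = base >> 8                                  # high byte of the provided address
    value = x & 0xFF
    if not drop_term:
        value &= (h + 1) & 0xFF                    # the 'H+1' term
    target = (base + (y & 0xFF)) & 0xFFFF
    if (target >> 8) != h:                         # page boundary crossed
        target = (value << 8) | (target & 0xFF)    # value replaces the high byte
    return target, value


print([hex(n) for n in shx(0xFF, 0x1234, 0x10)])  # ['0x1244', '0x13']
print([hex(n) for n in shx(0xF0, 0x12F0, 0x20)])  # ['0x1010', '0x10']
```

In the second call the index crosses into the next page, so the stored value $10 also ends up as the high byte of the address.
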

The Outliers

We may note that “SHY” and “SHX” are not part of the c=3 group, but rather the unimplemented instructions “STY abs,X” (c=0) and “STX abs,Y” (c=2) respectively. Both are apparently falling back to the implementation of “STA abs,X” with the extra quirk of the ‘H+1’ term.

“SHA abs,Y”, finally, is the composite instruction adhering to the c=3 rule that we have already established, executing “STA abs,X” and “SHX abs,Y” at once. (Notably, this flips the address mode to “abs,Y”, where “abs,X” may be expected, which suggests that this adjustment for indexed instructions concerning any X register transfers is implemented as an additional stage.)

“SHA ind,Y” ($93), however, is the composite of “STA ind,Y” ($91) and “SHX ind,Y” ($92), which JAMs on its own.

Mysterious NOPs

As mentioned earlier, we are able to figure out what most of the NOPs and JAM instructions are, just from their disposition on the layout. But there is a group of 12 NOP instructions (all at c=0 and a≤3 and odd values of b), which seem to be truly empty slots. Namely, these are the instructions at:

$04 (a=0, c=0, b=1)
$0C (a=0, c=0, b=3)
$14 (a=0, c=0, b=5)
$1C (a=0, c=0, b=7)
$34 (a=1, c=0, b=5)
$3C (a=1, c=0, b=7)
$44 (a=2, c=0, b=1)
$54 (a=2, c=0, b=5)
$5C (a=2, c=0, b=7)
$64 (a=3, c=0, b=1)
$74 (a=3, c=0, b=5)
$7C (a=3, c=0, b=7)
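
The list can be reproduced by enumerating all slots with c=0, a from 0 to 3 and odd b, then removing the four that hold documented instructions. In this Python sketch the `DOCUMENTED` table is filled in from the standard 6502 instruction set; the variable names are illustrative.

```python
# Documented instructions occupying slots with c=0, a<=3 and odd b.
DOCUMENTED = {
    0x24: "BIT zpg",
    0x2C: "BIT abs",
    0x4C: "JMP abs",
    0x6C: "JMP ind",
}

slots = [(a << 5) | (b << 2) for a in range(4) for b in (1, 3, 5, 7)]
empty = [op for op in slots if op not in DOCUMENTED]

print(len(slots), len(empty))                # 16 12
print(" ".join(f"${op:02X}" for op in empty))
```
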

From their very position on the instruction layout, we may infer that these should be instructions internal to the CPU. Typically, instructions at (a=0/c=0) have a counterpart at (a=1/c=0) in the respective b position, as is also true for (a=2/c=0) and (a=3/c=0). E.g., PHP & PLP, BPL & BMI, CLC & SEC, and so on.
Here, however, the counterparts are missing, as well. (Only $04 and $0C have a counterpart in “BIT”, but we may have a hard time figuring out, what the counterpart of “BIT” may actually be.) For all we know, these instructions are simply unimplemented, and it’s a small wonder that the timing sequence for these instructions does resolve without a JAM. But these instructions are still interesting, as they direct our attention towards how the internal instructions which are implemented are systematically arranged on the decoding matrix.

The same pattern, BTW, may be observed for most instructions, so that we may think of even and consecutive odd values of a and same values for c and b as “opposing” or “complementary” slots, where we find in one slot the store instruction for a given register and in the other one the load instruction, both in the address mode defined by b, or a shift in one direction and the opposing shift in the other direction.

Conclusions

What we have observed here is really a text-book example of undefined behavior for undefined input patterns. For any instruction with the two least significant bits set at once (c=3) the two instructions in the respective slot with c=1 and c=2 are started in parallel, asynchronous threads with competing output values AND-ed. Minor implementation details and environmental factors may contribute to the outcome of some of these instructions and how the timing eventually stabilizes.

Notably, there are no NOPs or jamming instructions at c=3, meaning it doesn’t matter if either of the two threads JAMs, as long as the timing for one of them resolves successfully (thus advancing the internal phase).

At c=0, c=1 and c=2 we find either undocumented instructions with ineffective address modes, or undocumented instructions that fail entirely over unresolved timing issues, resulting in a “JAM”. There are just two exceptions to this rule, namely “SHY” and “SHX”, which, while unstable, may be somewhat usable.

So is any of this intentional? Hardly. It’s just undefined behavior. Orderly chaos as provided by the decoding matrix. However, we may learn something from this about the internals of the 6502 and its various close cousins, which is at least something.

Mind that there is much more competent commentary on the 6502, which is based on analysis of the actual hardware, especially at visual6502.org. But, maybe, you found this “hermeneutic” approach, trying to reveal the systematic aspects of what may be observed externally, interesting, as well.

PS: All the tables in this post are SVG images. You may download and use them (mind the “open in a new tab” links), but please give reference to https://www.masswerk.at/6502/6502_instruction_set.html, where you can find the original tables.
