Xiaomi Zigbee (2): Beyond Architecture

July 9, 2019

Preface

In our previous post, we stripped down a ZigBee-based Xiaomi smart plug, explored its internal interfaces and hardware, and with a bit of luck managed to get a hold of its firmware. In this post, we explore the firmware format and its ISA (instruction set architecture), in preparation for further reverse engineering and vulnerability assessment of the device.

NXP provides a wide range of downloads (including SDKs, documentation, and datasheets) regarding its Jennic line of ZigBee chips. This documentation can be accessed by anyone with a free NXP account, and we promptly downloaded the documents relevant to our MCU (JN5169).

We chose radare2 as our research and development framework. It is a great, open-source reverse engineering tool, with excellent support for plugins and emerging support for instruction emulation. We used the Python support for rapid development of firmware format parsing, (dis)assembly, and analysis (WIP) plugins. We named this collection of plugins pyba2 (since the ISA used by the Jennic chips is called Beyond Architecture 2), and it’s available on GitHub.

Using Python plugins in radare2 requires installing the lang-python package (r2pm -i lang-python). Once it is installed, Python scripts can be loaded into radare2 at startup via the -I argument. These scripts can use the r2lang module, which provides an interface to radare2’s internals.

OTA/Firmware File Format

For the purpose of delving into the firmware structure, the Bootloader documentation mentioned in the previous post (JN-AN-1003) is sufficient.

The format of the firmware is described across several chapters of the JN-AN-1003 Application Note. First, we referred to the description of the JN516x Flash Header on page 18:

Bytes	Word	Contents
0x00 – 0x0F	0–3	16-byte Boot Image Record
0x10 – 0x1D	4–7	Encryption Initialisation Vector (ignored if unencrypted)
0x1E – 0x1F	7	16-bit Software Configuration Options
0x20 – 0x23	8	32-bit Length of Binary Image in bytes
0x24 – 0x27	9	32-bit .data section Flash start address
0x28 – 0x29	10	16-bit .data section load address in RAM (word aligned)
0x2A – 0x2B	10	16-bit .data section length in 32-bit words
0x2C – 0x2D	11	16-bit .bss section start address in RAM (word aligned)
0x2E – 0x2F	11	16-bit .bss section length in 32-bit words
0x30 – 0x33	12	32-bit wake-up entry point (word aligned) – warm start
0x34 – 0x37	13	32-bit reset entry point (word aligned) – cold start
0x38 – (MemA – 1)	14 onwards	.text segment
MemA		.data segment
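
The table maps naturally onto a fixed-size binary structure. The following is a minimal sketch of how the 0x38-byte header could be unpacked with Python's struct module. It is not the FlashHeader class from pyba2; the field names are our own, and big-endian byte order for the multi-byte fields is an assumption.

```python
import struct
from collections import namedtuple

# Layout follows the JN516x Flash Header table; ">" means big-endian with
# no padding (byte order is an assumption, not stated in the table).
FLASH_HEADER = struct.Struct(">16s14sHIIHHHHII")

# Field names are illustrative only.
FlashHeaderFields = namedtuple("FlashHeaderFields", [
    "boot_image_record",   # 0x00 - 0x0F
    "encryption_iv",       # 0x10 - 0x1D
    "config_options",      # 0x1E - 0x1F
    "image_length",        # 0x20 - 0x23, in bytes
    "data_flash_start",    # 0x24 - 0x27
    "data_load_address",   # 0x28 - 0x29
    "data_length_words",   # 0x2A - 0x2B, in 32-bit words
    "bss_start_address",   # 0x2C - 0x2D
    "bss_length_words",    # 0x2E - 0x2F, in 32-bit words
    "warm_start",          # 0x30 - 0x33
    "cold_start",          # 0x34 - 0x37
])


def parse_flash_header(buf, offset=0):
    """Unpack the flash header that starts at `offset` inside `buf`."""
    return FlashHeaderFields._make(FLASH_HEADER.unpack_from(buf, offset))
```

The format string adds up to exactly 0x38 bytes, which is where the table places the start of the .text segment.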

What is the 16-byte Boot Image Record (BIR)? As explained on page 17 of the Application Note, it’s a special sequence containing a 96-bit magic number, an image configuration byte, a status byte, and a 16-bit application ID:

Name	Size	Comment
Magic Number	12 bytes	Pattern to denote start sector of boot image
Configuration	1 byte	Used to carry configuration information to set up the SPI to suit the Flash device, and size of the image contained in Flash memory (see JN-AN-1003, page 17 for additional details)
Status	1 byte	0xFF – empty (erased); 0x00 – invalid; 0x01 – valid; 0x02 to 0xFE – reserved
Application ID	2 bytes

Finally, the description of the Magic Number appears on page 7, and for the JN516x it is 0x123456781122334455667788.
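
As an illustration, the 16-byte record can be split into its four fields in a few lines. This is a sketch under our own assumptions: the function and key names are hypothetical, and the Application ID is read as big-endian.

```python
import struct

# JN516x magic number as quoted from the Application Note.
JN516X_MAGIC = bytes.fromhex("123456781122334455667788")

# Status byte values from the BIR table; everything else is reserved.
STATUS_NAMES = {0xFF: "empty (erased)", 0x00: "invalid", 0x01: "valid"}


def parse_boot_image_record(bir):
    """Split a 16-byte BIR into magic, configuration, status and application ID."""
    if len(bir) != 16:
        raise ValueError("a Boot Image Record is exactly 16 bytes long")
    # 12-byte magic, two single bytes, one 16-bit value (big-endian assumed).
    magic, config, status, app_id = struct.unpack(">12sBBH", bir)
    return {
        "magic": magic,
        "is_jn516x": magic == JN516X_MAGIC,
        "configuration": config,
        "status": STATUS_NAMES.get(status, "reserved"),
        "application_id": app_id,
    }
```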

However, upon opening the OTA image we obtained, we can see that the magic number doesn’t appear at the start of the image, but is offset by 4 bytes:

$ xxd LM15_SP_mi_V1.3.22_20170228_OTA_v22_withCRC.20170526103042.bin | head
00000000: 24da 0200 1234 5678 1122 3344 5566 7788  $....4Vx."3DUfw.
00000010: 0801 0000 0000 0000 0000 0000 0000 0000  ................
00000020: 0000 0000 0002 da20 000a d0e8 0013 024e  ....... .......N
00000030: 0261 1703 000a 9e55 000a 36fc 0000 0000  .a.....U..6.....
00000040: 0000 0000 ffff ffff ffff ffff 0000 0000  ................
00000050: 0000 0000 1ef1 ee0b 0001 3800 0000 5f11  ..........8..._.
00000060: 0101 1600 0000 0200 4452 3131 3735 7231  ........DR1175r1
00000070: 7631 554e 454e 4352 5950 5445 4430 3030  v1UNENCRYPTED000
00000080: 3030 4a4e 3531 3639 5eda 0200 0000 0000  00JN5169^.......
00000090: 0000 0000 0000 0000 0000 0000 0000 0000  ................

Our best guess was that, being an OTA image, the file should carry some information about its own size regardless of the firmware format. Interpreted as a little-endian 32-bit integer, the first 4 bytes are 0x2da24, or 186916 in decimal. Our guess proved to be correct: the total size of the file is 186920 bytes, and the 4-byte difference is the size of the length field itself.
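
The check itself is simple arithmetic. A short sketch follows; the helper names are ours, and it only tests the relationship described above.

```python
import struct

PREFIX_SIZE = 4  # size of the length field that precedes the magic number


def ota_length_prefix(image):
    """Return the first 4 bytes of the image as a little-endian 32-bit integer."""
    return struct.unpack_from("<I", image, 0)[0]


def prefix_matches_file_size(image):
    """True if the prefix equals the file size minus the prefix itself."""
    return ota_length_prefix(image) + PREFIX_SIZE == len(image)
```

For the dump above, the prefix bytes 24 da 02 00 decode to 0x2da24, and 0x2da24 + 4 equals the file size of 186920.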

pyba2/bin Plugin Development

The information above was sufficient for developing a JN516x format (bin) plugin for radare2. As documented in the r2book, building Python-based plugins for radare2 is pretty straightforward: each plugin (asm/analysis/format) is simply a function that returns a dictionary containing the description of the plugin and references to its callback functions. In the case of the format plugin, the following functionality was implemented:

{
    "name": "jennic.fw",
    "desc": "JN516x/JN517x firmware loader plugin",
    "license": "BSD",
    "check_bytes": loader.check_bytes,
    "load_buffer": loader.load_buffer,
    "info": loader.info,
    "sections": loader.sections,
    "baddr": loader.baddr,
    "entries": loader.entries,
}
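
To make the "function that returns a dictionary" pattern concrete, here is a stripped-down sketch of how such a plugin could be wrapped and registered. The stub name and its placeholder callbacks are hypothetical and are not the pyba2 loader. Whether radare2 accepts a bin plugin with only these callbacks is not something we verified, and the import is guarded so the file can also be run outside radare2.

```python
def jennic_format_stub(a):
    """Plugin entry point: called by radare2, which reads the returned dictionary."""

    def check_bytes(buf):
        # Placeholder: the real plugin compares the buffer against the magic.
        return [len(buf) > 0]

    def baddr(binf):
        # Return values are wrapped in lists, as the plugin framework expects.
        return [0x80000]

    return {
        "name": "jennic.fw.stub",
        "desc": "Illustrative stub, not the pyba2 loader",
        "license": "BSD",
        "check_bytes": check_bytes,
        "baddr": baddr,
    }


try:
    import r2lang  # only importable inside radare2 with lang-python installed
except ImportError:
    r2lang = None

if r2lang is not None:
    # Register the function as a format ("bin") plugin.
    r2lang.plugin("bin", jennic_format_stub)
```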

The actual functionality of the plugin is implemented in the JennicLoader class (loader is an instance of this class):

class JennicLoader:
    RAM_BEGIN = 0x4000000

    def __init__(self, unused=0):
        if (unused != 0):
            self._unused = unused

    def load_buffer(self, binf, buf, loadaddr):
        return [True]

    def baddr(self, binf):
        return [0x80000]

    def _check_bytes_versioned(self, buf):
        return len(buf) > (VERSION_SIZE + HEADER_SIZE) \
            and buf[0x04:0x10] in MAGIC

    def _check_bytes_unversioned(self, buf):
        return len(buf) > (HEADER_SIZE) \
            and buf[:0xc] in MAGIC

    def check_bytes(self, buf):
        return [self._check_bytes_versioned(buf) or self._check_bytes_unversioned(buf)]

    def sections(self, binf):
        fheader = FlashHeader(binf)
        offset = VERSION_SIZE if fheader.has_version else 0

        header = {
                "name": ".header",
                "size": HEADER_SIZE,
                "vsize": HEADER_SIZE,
                "paddr": offset,
                "vaddr": self.baddr(binf)[0],
                "perm": 4, # R_PERM_R
                "has_strings": False,
                "add": True,
                "is_data": True,
            }
        data = {
                "name": ".data",
                "size": HEADER_SIZE,
                "vsize": HEADER_SIZE,
                "paddr": offset + fheader.data_flash_start - self.baddr(binf)[0],
                "vaddr": self.RAM_BEGIN + 4 * fheader.data_load_address,
                "perm": 4|2, # R_PERM_R | R_PERM_W
                "has_strings": True,
                "add": True,
                "is_data": True,
            }
        text = {
                "name": ".text",
                "size": data['paddr'] - (offset + HEADER_SIZE),
                "vsize": data['paddr'] - (offset + HEADER_SIZE),
                "paddr": offset + HEADER_SIZE,
                "vaddr": self.baddr(binf)[0] + HEADER_SIZE,
                "perm": 4|1, # R_PERM_R | R_PERM_X
                "arch": "ba2",
                "bits": 32,
                "has_strings": False,
                "add": True,
                "is_data": False,
            }
        return [header, data, text]

    def entries(self, binf):
        return [
            {"vaddr": FlashHeader(binf).cold_start},
            {"vaddr": FlashHeader(binf).warm_start},
        ]

    def info(self, binf):
        jntype = MAGIC[FlashHeader(binf).magic]
        return [{
                "type" : jntype,
                "os" : "none",
                "subsystem" : "none",
                "machine" : jntype,
                "arch" : "ba2",
                "has_va" : 1,
                "bits" : 32,
                "big_endian" : 1,
                "dbg_info" : 0,
                }]

Internally, it uses the FlashHeader class, which parses the image according to the format documentation described above. Its exposed functionality is described below (note that the Python plugin framework for radare2 expects all return values to be wrapped in lists):

check_bytes: this function is used to verify that the plugin can actually process the given binary blob as a Jennic firmware. In practice, it verifies that the header of the binary matches a Jennic magic number (0x123456781122334455667788 or 0x123456782233445566778899), and returns True or False accordingly.
load_buffer: this function must be implemented, but its functionality is not clear (it’s possible that it should initialize some structures, if needed, when loading the binary). In our case, we simply return True to indicate that the loading was successful, and additional functionality can be accessed.
info: this function should return a dictionary containing various information about the loaded binary. Most of the returned information is static (such as the architecture, the bit-width, and the endianness). We try to infer the type of the chip (JN516x or JN517x) based on the magic number in the header.
sections: this function should return a list of sections located in the binary. We return information about the 3 sections of the Jennic firmware (header, text, and data), based on the data parsed from the firmware header.
baddr: this function should return the base address of the firmware in memory. In the case of the JN516x, this is the static value 0x80000.
entries: this function should return the entry points of the binary. We return the addresses of the two entry points defined for every Jennic firmware - AppColdStart and AppWarmStart.
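
The address arithmetic inside sections() is easy to check by hand against the OTA dump shown earlier. The sketch below repeats that arithmetic outside radare2. It assumes VERSION_SIZE is the 4-byte length prefix and HEADER_SIZE is 0x38, and it takes 0xAD0E8 and 0x13 from the dump as the .data flash start and load address (reading the header fields as big-endian).

```python
HEADER_SIZE = 0x38      # flash header length, per the JN-AN-1003 table
VERSION_SIZE = 4        # assumed: the 4-byte prefix found in the OTA image
FLASH_BASE = 0x80000    # value returned by baddr()
RAM_BEGIN = 0x4000000   # JennicLoader.RAM_BEGIN


def section_layout(data_flash_start, data_load_address, has_prefix=True):
    """Recompute the file offsets and virtual addresses used by sections()."""
    offset = VERSION_SIZE if has_prefix else 0
    data_paddr = offset + data_flash_start - FLASH_BASE
    text_paddr = offset + HEADER_SIZE
    return {
        ".header": {"paddr": offset, "vaddr": FLASH_BASE, "size": HEADER_SIZE},
        ".text": {
            "paddr": text_paddr,
            "vaddr": FLASH_BASE + HEADER_SIZE,
            "size": data_paddr - text_paddr,  # .text runs up to .data
        },
        ".data": {
            "paddr": data_paddr,
            "vaddr": RAM_BEGIN + 4 * data_load_address,
        },
    }


layout = section_layout(data_flash_start=0xAD0E8, data_load_address=0x13)
for name, section in layout.items():
    print(name, {key: hex(value) for key, value in section.items()})
```

With these inputs, .data starts at file offset 0x2d0ec and is mapped to 0x400004c, while .text starts at file offset 0x3c and is 0x2d0b0 bytes long.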

Beyond Architecture 2 ISA

Researching the ISA of the target chip proved to be more challenging. While it was possible to find the name and a general description of the JN516x ISA on Jennic’s website (it seems to be a closed-source implementation of OpenRISC), the site contained no details whatsoever. No references or documentation were available for download, either from Jennic or from NXP.

However, as mentioned earlier, NXP provides a range of downloads for developers. Amongst the provided software packages is JN-SW-4141 (BeyondStudio for NXP). This download contains a customized version of Eclipse that comes preloaded with a gcc toolchain for compiling software for Jennic chips based on Beyond Architecture 2 (BA2). Additional downloads (such as JN-AN-1180 and JN-AN-1189) provided us with sample ZigBee projects that can be compiled by the aforementioned BeyondStudio.

The gcc toolchain (distributed by NXP with BeyondStudio in Windows binary form only) could provide us with a stepping stone to disassembling the firmware code. After all, it includes objdump, which supports basic disassembly of BA2 binaries/object-code. However, using objdump as the main disassembler is tedious, as it lacks much of the advanced functionality that comes standard in modern reverse engineering environments (radare2, IDA, etc.), such as xrefs and code flow charts. Therefore, we chose to develop our own radare2 plugin for disassembling BA2 binaries.

Given that objdump can be used to disassemble raw binary code, it could potentially be used for disassembling all possible combinations of X bytes, and the results of the disassembly could then be examined to infer the encoding of each instruction. However, that approach had serious problems, such as:

There was no way to know the maximum length of an instruction, so it would be impossible to know when to stop generating new code.
The number of possibilities could simply be too large. If a single instruction could be no longer than 2 bytes, it’d be easily manageable. A maximum instruction size of 4 bytes would prove challenging. In practice, however, instructions can be as long as 6 bytes, which means going over 2^48 byte combinations to cover all possible instruction encodings. That would be infeasible.
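
A quick back-of-the-envelope calculation shows how fast the search space grows. The throughput of one million disassembled candidates per second is purely a hypothetical figure chosen for illustration, not a measurement of objdump.

```python
def candidate_count(length_bytes):
    """Number of distinct byte strings of exactly `length_bytes` bytes."""
    return 2 ** (8 * length_bytes)


ASSUMED_RATE = 1_000_000            # hypothetical candidates per second
SECONDS_PER_YEAR = 365 * 24 * 3600

for length in (2, 4, 6):
    count = candidate_count(length)
    years = count / ASSUMED_RATE / SECONDS_PER_YEAR
    print(f"{length} bytes: {count} candidates, about {years:.2g} years")
```

At that assumed rate, 2-byte candidates finish in a fraction of a second and 4-byte candidates in a little over an hour, while 6-byte candidates would take roughly nine years.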

We therefore decided to look at the sources of the toolchain instead. The toolchain is GPL software, and theoretically, NXP is required to provide the sources upon request. However, we didn’t need to take that course of action: conveniently for us, others had already published the toolchain sources on the internet. With the sources in hand, the way to a disassembly plugin was all but guaranteed.

pyba2/asm Plugin Development

The main source of information about the ISA can be found in the binutils project of the toolchain, under the bfd subdirectory. There, the file ba-isa-ba2.c contains all of the available opcodes for the 2nd version of Beyond Architecture:

static const struct ba_opcode ba_ba2_opcodes[] = {

{ "bt.movi",	"rD,G",		"0x0 00 DD DDD0 GGGG", EF(b_movi), 0, it_arith },
{ "bt.addi",	"rD,G",		"0x0 00 DD DDD1 GGGG", EF(bt_add), 0, it_arith },

{ "bt.mov",	"rD,rA",	"0x0 01 DD DDDA AAAA", EF(b_mov), 0, it_arith },
{ "bt.add",	"rD,rA",	"0x0 10 DD DDDA AAAA", EF(bt_add), 0, it_arith },

{ "bt.j",	"T",		"0x0 11 TT TTTT TTTT", EF(b_j), 0, it_jump },

/* ----------------------------- 0x1 ----------------------------- */

{ "bn.sb",	"N(rA),rB",	"0x2 00 BB BBBA AAAA NNNN NNNN", EF(l_sb), 0, it_store },
{ "bn.lbz",	"rD,N(rA)",	"0x2 01 DD DDDA AAAA NNNN NNNN", EF(l_lbz), 0, it_load },
{ "bn.sh",	"M(rA),rB",	"0x2 10 BB BBBA AAAA 0MMM MMMM", EF(l_sh), 0, it_store },
{ "bn.lhz",	"rD,M(rA)",	"0x2 10 DD DDDA AAAA 1MMM MMMM", EF(l_lhz), 0, it_load },
{ "bn.sw",	"K(rA),rB",	"0x2 11 BB BBBA AAAA 00KK KKKK", EF(l_sw), 0, it_store },
{ "bn.lwz",	"rD,K(rA)",	"0x2 11 DD DDDA AAAA 01KK KKKK", EF(l_lwz), 0, it_load },
{ "bn.lws",	"rD,K(rA)",	"0x2 11 DD DDDA AAAA 10KK KKKK", EF(b_lws), 0, it_load },
{ "bn.sd",	"J(rA),rB",	"0x2 11 BB BBBA AAAA 110J JJJJ", EF(b_sd), 0, it_store },
{ "bn.ld",	"rD,J(rA)",	"0x2 11 DD DDDA AAAA 111J JJJJ", EF(b_ld), 0, it_load },

{ "bn.addi",	"rD,rA,O",	"0x3 00 DD DDDA AAAA OOOO OOOO", EF(b_add), 0, it_arith },
{ "bn.andi",	"rD,rA,N",	"0x3 01 DD DDDA AAAA NNNN NNNN", EF(l_and), 0, it_arith },
{ "bn.ori",	"rD,rA,N",	"0x3 10 DD DDDA AAAA NNNN NNNN", EF(l_or), 0, it_arith },

{ "bn.sfeqi",	"rA,O",		"0x3 11 00 000A AAAA OOOO OOOO", EF(l_sfeq), BA_W_FLAG, it_compare },
{ "bn.sfnei",	"rA,O",		"0x3 11 00 001A AAAA OOOO OOOO", EF(l_sfne), BA_W_FLAG, it_compare },
{ "bn.sfgesi",	"rA,O",		"0x3 11 00 010A AAAA OOOO OOOO", EF(l_sfges), BA_W_FLAG, it_compare },
{ "bn.sfgeui",	"rA,O",		"0x3 11 00 011A AAAA OOOO OOOO", EF(l_sfgeu), BA_W_FLAG, it_compare },
{ "bn.sfgtsi",	"rA,O",		"0x3 11 00 100A AAAA OOOO OOOO", EF(l_sfgts), BA_W_FLAG, it_compare },
{ "bn.sfgtui",	"rA,O",		"0x3 11 00 101A AAAA OOOO OOOO", EF(l_sfgtu), BA_W_FLAG, it_compare },
{ "bn.sflesi",	"rA,O",		"0x3 11 00 110A AAAA OOOO OOOO", EF(l_sfles), BA_W_FLAG, it_compare },
{ "bn.sfleui",	"rA,O",		"0x3 11 00 111A AAAA OOOO OOOO", EF(l_sfleu), BA_W_FLAG, it_compare },
{ "bn.sfltsi",	"rA,O",		"0x3 11 01 000A AAAA OOOO OOOO", EF(l_sflts), BA_W_FLAG, it_compare },
{ "bn.sfltui",	"rA,O",		"0x3 11 01 001A AAAA OOOO OOOO", EF(l_sfltu), BA_W_FLAG, it_compare },

{ "bn.sfeq",	"rA,rB",	"0x3 11 01 010A AAAA BBBB B---", EF(l_sfeq), BA_W_FLAG, it_compare },
{ "bn.sfne",	"rA,rB",	"0x3 11 01 011A AAAA BBBB B---", EF(l_sfne), BA_W_FLAG, it_compare },
{ "bn.sfges",	"rA,rB",	"0x3 11 01 100A AAAA BBBB B---", EF(l_sfges), BA_W_FLAG, it_compare },
{ "bn.sfgeu",	"rA,rB",	"0x3 11 01 101A AAAA BBBB B---", EF(l_sfgeu), BA_W_FLAG, it_compare },
{ "bn.sfgts",	"rA,rB",	"0x3 11 01 110A AAAA BBBB B---", EF(l_sfgts), BA_W_FLAG, it_compare },
{ "bn.sfgtu",	"rA,rB",	"0x3 11 01 111A AAAA BBBB B---", EF(l_sfgtu), BA_W_FLAG, it_compare },

{ "bn.extbz",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D000", EF(b_extbz), 0, it_move },
{ "bn.extbs",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D001", EF(b_extbs), 0, it_move },
{ "bn.exthz",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D010", EF(b_exthz), 0, it_move },
{ "bn.exths",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D011", EF(b_exths), 0, it_move },
{ "bn.ff1",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D100", EF(l_ff1), 0, it_arith },
{ "bn.clz",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D101", EF(b_clz), 0, it_arith },
{ "bn.bitrev",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D110", EF(b_bitrev), 0, it_arith },
{ "bn.swab",	"rD,rA",	"0x3 11 10 -00A AAAA DDDD D111", EF(b_swab), 0, it_arith },

{ "bn.mfspr",	"rD,rA",	"0x3 11 10 -01A AAAA DDDD D000", EF(bn_mfspr), 0, it_move },
{ "bn.mtspr",	"rA,rB",	"0x3 11 10 -01A AAAA BBBB B001", EF(bn_mtspr), 0, it_move },

{ "bn.abs",	"rD,rA",	"0x3 11 10 -10A AAAA DDDD D000", EF(b_abs), 0, it_arith },
{ "bn.sqr",	"rD,rA",	"0x3 11 10 -10A AAAA DDDD D001", EF(b_sqr), 0, it_arith },
{ "bn.sqra",	"rD,rA",	"0x3 11 10 -10A AAAA DDDD D010", EF(b_sqra), 0, it_arith },

{ "bn.casei",	"rA,N",		"0x3 11 11 -00A AAAA NNNN NNNN", EF(b_casei), 0, it_jump },

/* ----------------------------- 0x3 11 11 --------------------------------- */

{ "bn.beqi",	"rB,E,P",	"0x4 00 00 EEEB BBBB PPPP PPPP", EF(b_beq), 0, it_branch },
{ "bn.bnei",	"rB,E,P",	"0x4 00 01 EEEB BBBB PPPP PPPP", EF(b_bne), 0, it_branch },
{ "bn.bgesi",	"rB,E,P",	"0x4 00 10 EEEB BBBB PPPP PPPP", EF(b_bges), 0, it_branch },
{ "bn.bgtsi",	"rB,E,P",	"0x4 00 11 EEEB BBBB PPPP PPPP", EF(b_bgts), 0, it_branch },
{ "bn.blesi",	"rB,E,P",	"0x4 01 00 EEEB BBBB PPPP PPPP", EF(b_bles), 0, it_branch },
{ "bn.bltsi",	"rB,E,P",	"0x4 01 01 EEEB BBBB PPPP PPPP", EF(b_blts), 0, it_branch },
{ "bn.j",	"Z",		"0x4 01 10 ZZZZ ZZZZ ZZZZ ZZZZ", EF(b_j), 0, it_jump },

/* ----------------------------- 0x4 01 11 000 ----------------------------- */

{ "bn.bf",	"S",		"0x4 01 11 0010 SSSS SSSS SSSS", EF(b_bf), BA_R_FLAG, it_branch },
{ "bn.bnf",	"S",		"0x4 01 11 0011 SSSS SSSS SSSS", EF(b_bnf), BA_R_FLAG, it_branch },
{ "bn.bo",	"S",		"0x4 01 11 0100 SSSS SSSS SSSS", EF(b_bo), 0, it_branch },
{ "bn.bno",	"S",		"0x4 01 11 0101 SSSS SSSS SSSS", EF(b_bno), 0, it_branch },
{ "bn.bc",	"S",		"0x4 01 11 0110 SSSS SSSS SSSS", EF(b_bc), 0, it_branch },
{ "bn.bnc",	"S",		"0x4 01 11 0111 SSSS SSSS SSSS", EF(b_bnc), 0, it_branch },

/* ----------------------------- 0x4 01 11 100 ----------------------------- */

{ "bn.entri",	"F,N",		"0x4 01 11 1010 FFFF NNNN NNNN", EF(l_entri), 0, it_store },
{ "bn.reti",	"F,N",		"0x4 01 11 1011 FFFF NNNN NNNN", EF(b_reti), 0, it_load },
{ "bn.rtnei",	"F,N",		"0x4 01 11 1100 FFFF NNNN NNNN", EF(l_rtnei), 0, it_load },
{ "bn.return",	"",		"0x4 01 11 1101 --00 ---- ----", EF(b_return), 0, it_jump },
{ "bn.jalr",	"rA",		"0x4 01 11 1101 --01 AAAA A---", EF(b_jalr), 0, it_jump },
{ "bn.jr",	"rA",		"0x4 01 11 1101 --10 AAAA A---", EF(b_jr), 0, it_jump },

{ "bn.jal",	"s",		"0x4 10 ss ssss ssss ssss ssss", EF(b_jal), 0, it_jump },

/* ----------------------------- 0x4 11 ----------------------------- */

/* ----------------------------- 0x5 0 ----------------------------- */

{ "bn.mlwz",	"rD,K(rA),C",	"0x5 00 DD DDDA AAAA CCKK KKKK", EF(b_mlwz), 0, it_load },
{ "bn.msw",	"K(rA),rB,C",	"0x5 01 BB BBBA AAAA CCKK KKKK", EF(b_msw), 0, it_store },

/* ----------------------------- 0x5 1 ----------------------------- */

{ "bn.mld",	"rD,H(rA),C",	"0x5 10 DD DDDA AAAA CC0H HHHH", EF(b_mld), 0, it_load },
{ "bn.msd",	"H(rA),rB,C",	"0x5 10 BB BBBA AAAA CC1H HHHH", EF(b_msd), 0, it_store },

{ "bn.lwza",	"rD,rA,L",	"0x5 11 DD DDDA AAAA 1100 LLLL", EF(b_lwza), 0, it_load },
{ "bn.swa",	"rA,rB,L",	"0x5 11 BB BBBA AAAA 1101 LLLL", EF(b_swa), 0, it_store },

/* ----------------------------- 0x6 ----------------------------- */

{ "bn.and",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B000", EF(l_and), 0, it_arith },
{ "bn.or",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B001", EF(l_or), 0, it_arith },
{ "bn.xor",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B010", EF(l_xor), 0, it_arith },
{ "bn.nand",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B011", EF(b_nand), 0, it_arith },
{ "bn.add",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B100", EF(b_add), 0, it_arith },
{ "bn.sub",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B101", EF(b_sub), 0, it_arith },
{ "bn.sll",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B110", EF(l_sll), 0, it_shift },
{ "bn.srl",	"rD,rA,rB",	"0x6 00 DD DDDA AAAA BBBB B111", EF(l_srl), 0, it_shift },
{ "bn.sra",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B000", EF(l_sra), 0, it_shift },
{ "bn.ror",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B001", EF(l_ror), 0, it_shift },
{ "bn.cmov",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B010", EF(l_cmov), 0, it_arith },
{ "bn.mul",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B011", EF(l_mul), 0, it_arith },
{ "bn.div",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B100", EF(l_div), 0, it_arith },
{ "bn.divu",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B101", EF(l_divu), 0, it_arith },
{ "bn.mac",	"rA,rB",	"0x6 01 00 000A AAAA BBBB B110", EF(l_mac), 0, it_mac },
{ "bn.macs",	"rA,rB",	"0x6 01 00 001A AAAA BBBB B110", EF(b_macs), 0, it_mac },
{ "bn.macsu",	"rA,rB",	"0x6 01 00 010A AAAA BBBB B110", EF(b_macsu), 0, it_mac },
{ "bn.macuu",	"rA,rB",	"0x6 01 00 011A AAAA BBBB B110", EF(b_macuu), 0, it_mac },
{ "bn.smactt",	"rA,rB",	"0x6 01 00 100A AAAA BBBB B110", EF(b_smactt), 0, it_mac },
{ "bn.smacbb",	"rA,rB",	"0x6 01 00 101A AAAA BBBB B110", EF(b_smacbb), 0, it_mac },
{ "bn.smactb",	"rA,rB",	"0x6 01 00 110A AAAA BBBB B110", EF(b_smactb), 0, it_mac },
{ "bn.umactt",	"rA,rB",	"0x6 01 00 111A AAAA BBBB B110", EF(b_umactt), 0, it_mac },
{ "bn.umacbb",	"rA,rB",	"0x6 01 01 000A AAAA BBBB B110", EF(b_umacbb), 0, it_mac },
{ "bn.umactb",	"rA,rB",	"0x6 01 01 001A AAAA BBBB B110", EF(b_umactb), 0, it_mac },
{ "bn.msu",	"rA,rB",	"0x6 01 01 010A AAAA BBBB B110", EF(b_msu), 0, it_mac },
{ "bn.msus",	"rA,rB",	"0x6 01 01 011A AAAA BBBB B110", EF(b_msus), 0, it_mac },
{ "bn.addc",	"rD,rA,rB",	"0x6 01 DD DDDA AAAA BBBB B111", EF(b_addc), 0, it_arith },
{ "bn.subb",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B000", EF(b_subb), 0, it_arith },
{ "bn.flb",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B001", EF(l_flb), 0, it_arith },
{ "bn.mulhu",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B010", EF(b_mulhu), 0, it_arith },
{ "bn.mulh",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B011", EF(b_mulh), 0, it_arith },
{ "bn.mod",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B100", EF(b_mod), 0, it_arith },
{ "bn.modu",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B101", EF(b_modu), 0, it_arith },
{ "bn.aadd",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B110", EF(b_aadd), 0, it_arith },
{ "bn.cmpxchg",	"rD,rA,rB",	"0x6 10 DD DDDA AAAA BBBB B111", EF(b_cmpxchg), 0, it_arith },

{ "bn.slli",	"rD,rA,H",	"0x6 11 DD DDDA AAAA HHHH H-00", EF(l_sll), 0, it_shift },
{ "bn.srli",	"rD,rA,H",	"0x6 11 DD DDDA AAAA HHHH H-01", EF(l_srl), 0, it_shift },
{ "bn.srai",	"rD,rA,H",	"0x6 11 DD DDDA AAAA HHHH H-10", EF(l_sra), 0, it_shift },
{ "bn.rori",	"rD,rA,H",	"0x6 11 DD DDDA AAAA HHHH H-11", EF(l_ror), 0, it_shift },

/* ----------------------------- 0x7 ----------------------------- */

{ "fn.add.s",	"rD,rA,rB",	"0x7 00 DD DDDA AAAA BBBB B000", EF(lf_add_s), 0, it_float },
{ "fn.sub.s",	"rD,rA,rB",	"0x7 00 DD DDDA AAAA BBBB B001", EF(lf_sub_s), 0, it_float },
{ "fn.mul.s",	"rD,rA,rB",	"0x7 00 DD DDDA AAAA BBBB B010", EF(lf_mul_s), 0, it_float },
{ "fn.div.s",	"rD,rA,rB",	"0x7 00 DD DDDA AAAA BBBB B011", EF(lf_div_s), 0, it_float },

{ "bn.adds",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B000", EF(b_adds), 0, it_arith },
{ "bn.subs",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B001", EF(b_subs), 0, it_arith },
{ "bn.xaadd",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B010", EF(b_xaadd), 0, it_arith },
{ "bn.xcmpxchg","rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B011", EF(b_xcmpxchg), 0, it_arith },
{ "bn.max",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B100", EF(b_max), 0, it_arith },
{ "bn.min",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B101", EF(b_min), 0, it_arith },
{ "bn.lim",	"rD,rA,rB",	"0x7 01 DD DDDA AAAA BBBB B110", EF(b_lim), 0, it_arith },

{ "bn.slls",	"rD,rA,rB",	"0x7 10 DD DDDA AAAA BBBB B-00", EF(b_slls), 0, it_shift},
{ "bn.sllis",	"rD,rA,H",	"0x7 10 DD DDDA AAAA HHHH H-01", EF(b_sllis), 0, it_shift},

{ "fn.ftoi.s",	"rD,rA",	"0x7 11 10 --0A AAAA DDDD D000", EF(lf_ftoi_s), 0, it_float },
{ "fn.itof.s",	"rD,rA",	"0x7 11 10 --0A AAAA DDDD D001", EF(lf_itof_s), 0, it_float },


{ "bw.sb",	"h(rA),rB",	"0x8 00 BB BBBA AAAA hhhh hhhh hhhh hhhh hhhh hhhh hhhh hhhh", EF(l_sb), 0, it_store },
{ "bw.lbz",	"rD,h(rA)",	"0x8 01 DD DDDA AAAA hhhh hhhh hhhh hhhh hhhh hhhh hhhh hhhh", EF(l_lbz), 0, it_load },
{ "bw.sh",	"i(rA),rB",	"0x8 10 BB BBBA AAAA 0iii iiii iiii iiii iiii iiii iiii iiii", EF(l_sh), 0, it_store },
{ "bw.lhz",	"rD,i(rA)",	"0x8 10 DD DDDA AAAA 1iii iiii iiii iiii iiii iiii iiii iiii", EF(l_lhz), 0, it_load },
{ "bw.sw",	"w(rA),rB",	"0x8 11 BB BBBA AAAA 00ww wwww wwww wwww wwww wwww wwww wwww", EF(l_sw), 0, it_store },
{ "bw.lwz",	"rD,w(rA)",	"0x8 11 DD DDDA AAAA 01ww wwww wwww wwww wwww wwww wwww wwww", EF(l_lwz), 0, it_load },
{ "bw.lws",	"rD,w(rA)",	"0x8 11 DD DDDA AAAA 10ww wwww wwww wwww wwww wwww wwww wwww", EF(b_lws), 0, it_load },
{ "bw.sd",	"v(rA),rB",	"0x8 11 BB BBBA AAAA 110v vvvv vvvv vvvv vvvv vvvv vvvv vvvv", EF(b_sd), 0, it_store },
{ "bw.ld",	"rD,v(rA)",	"0x8 11 DD DDDA AAAA 111v vvvv vvvv vvvv vvvv vvvv vvvv vvvv", EF(b_ld), 0, it_load },

{ "bw.addi",	"rD,rA,g",	"0x9 00 DD DDDA AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(b_add), 0, it_arith },
{ "bw.andi",	"rD,rA,h",	"0x9 01 DD DDDA AAAA hhhh hhhh hhhh hhhh hhhh hhhh hhhh hhhh", EF(l_and), 0, it_arith },
{ "bw.ori",	"rD,rA,h",	"0x9 10 DD DDDA AAAA hhhh hhhh hhhh hhhh hhhh hhhh hhhh hhhh", EF(l_or), 0, it_arith },

/* ----------------------------- 0x9 11 00 ----------------------------- */

{ "bw.sfeqi",	"rA,g",		"0x9 11 01 10-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfeq), BA_W_FLAG, it_compare },
{ "bw.sfnei",	"rA,g",		"0x9 11 01 11-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfne), BA_W_FLAG, it_compare },
{ "bw.sfgesi",	"rA,g",		"0x9 11 10 00-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfges), BA_W_FLAG, it_compare },
{ "bw.sfgeui",	"rA,g",		"0x9 11 10 01-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfgeu), BA_W_FLAG, it_compare },
{ "bw.sfgtsi",	"rA,g",		"0x9 11 10 10-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfgts), BA_W_FLAG, it_compare },
{ "bw.sfgtui",	"rA,g",		"0x9 11 10 11-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfgtu), BA_W_FLAG, it_compare },
{ "bw.sflesi",	"rA,g",		"0x9 11 11 00-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfles), BA_W_FLAG, it_compare },
{ "bw.sfleui",	"rA,g",		"0x9 11 11 01-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfleu), BA_W_FLAG, it_compare },
{ "bw.sfltsi",	"rA,g",		"0x9 11 11 10-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sflts), BA_W_FLAG, it_compare },
{ "bw.sfltui",	"rA,g",		"0x9 11 11 11-A AAAA gggg gggg gggg gggg gggg gggg gggg gggg", EF(l_sfltu), BA_W_FLAG, it_compare },

{ "bw.beqi",	"rB,I,u",	"0xa 00 00 00II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_beq), 0, it_branch },
{ "bw.bnei",	"rB,I,u",	"0xa 00 00 01II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bne), 0, it_branch },
{ "bw.bgesi",	"rB,I,u",	"0xa 00 00 10II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bges), 0, it_branch },
{ "bw.bgtsi",	"rB,I,u",	"0xa 00 00 11II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgts), 0, it_branch },
{ "bw.blesi",	"rB,I,u",	"0xa 00 01 00II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bles), 0, it_branch },
{ "bw.bltsi",	"rB,I,u",	"0xa 00 01 01II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_blts), 0, it_branch },
{ "bw.bgeui",	"rB,I,u",	"0xa 00 01 10II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgeu), 0, it_branch },
{ "bw.bgtui",	"rB,I,u",	"0xa 00 01 11II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgtu), 0, it_branch },
{ "bw.bleui",	"rB,I,u",	"0xa 00 10 00II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bleu), 0, it_branch },
{ "bw.bltui",	"rB,I,u",	"0xa 00 10 01II IIIB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bltu), 0, it_branch },
{ "bw.beq",	"rA,rB,u",	"0xa 00 10 10AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_beq), 0, it_branch },
{ "bw.bne",	"rA,rB,u",	"0xa 00 10 11AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bne), 0, it_branch },
{ "bw.bges",	"rA,rB,u",	"0xa 00 11 00AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bges), 0, it_branch },
{ "bw.bgts",	"rA,rB,u",	"0xa 00 11 01AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgts), 0, it_branch },
{ "bw.bgeu",	"rA,rB,u",	"0xa 00 11 10AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgeu), 0, it_branch },
{ "bw.bgtu",	"rA,rB,u",	"0xa 00 11 11AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(b_bgtu), 0, it_branch },

{ "bw.jal",	"z",		"0xa 01 00 00-- ---- zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_jal), 0, it_jump },
{ "bw.j",	"z",		"0xa 01 00 01-- ---- zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_j), 0, it_jump },
{ "bw.bf",	"z",		"0xa 01 00 10-- ---- zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_bf), BA_R_FLAG, it_branch },
{ "bw.bnf",	"z",		"0xa 01 00 11-- ---- zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_bnf), BA_R_FLAG, it_branch },
{ "bw.ja",	"g",		"0xa 01 01 00-- ---- gggg gggg gggg gggg gggg gggg gggg gggg", EF(b_ja), 0, it_jump },
{ "bw.jma",	"rD,z",		"0xa 01 01 01DD DDD0 zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_jma), 0, it_jump },
{ "bw.jmal",	"rD,z",		"0xa 01 01 01DD DDD1 zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_jmal),0, it_jump },
{ "bw.lma",	"rD,z",		"0xa 01 01 10DD DDD0 zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_lma), 0, it_load },
{ "bw.sma",	"rB,z",		"0xa 01 01 10BB BBB1 zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_sma), 0, it_store },

{ "bw.casewi",	"rB,z",		"0xa 01 01 11BB BBB0 zzzz zzzz zzzz zzzz zzzz zzzz zzzz zzzz", EF(b_casewi), 0, it_jump },

{ "fw.beq.s",	"rA,rB,u",	"0xa 01 10 00AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(f_beq_s), 0, it_branch },
{ "fw.bne.s",	"rA,rB,u",	"0xa 01 10 01AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(f_bne_s), 0, it_branch },
{ "fw.bge.s",	"rA,rB,u",	"0xa 01 10 10AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(f_bge_s), 0, it_branch },
{ "fw.bgt.s",	"rA,rB,u",	"0xa 01 10 11AA AAAB BBBB uuuu uuuu uuuu uuuu uuuu uuuu uuuu", EF(f_bgt_s), 0, it_branch },

{ "bw.mfspr",	"rD,rA,o",	"0xa 10 DD DDDA AAAA oooo oooo oooo oooo oooo oooo ---- -000", EF(l_mfspr), 0, it_move },
{ "bw.mtspr",	"rA,rB,o",	"0xa 10 BB BBBA AAAA oooo oooo oooo oooo oooo oooo ---- -001", EF(l_mtspr), 0, it_move },
{ "bw.addci",	"rD,rA,p",	"0xa 10 DD DDDA AAAA pppp pppp pppp pppp pppp pppp ---- -010", EF(b_addc), 0, it_arith },
{ "bw.divi",	"rD,rA,p",	"0xa 10 DD DDDA AAAA pppp pppp pppp pppp pppp pppp ---- -011", EF(l_div), 0, it_arith },
{ "bw.divui",	"rD,rA,o",	"0xa 10 DD DDDA AAAA oooo oooo oooo oooo oooo oooo ---- -100", EF(l_divu), 0, it_arith },
{ "bw.muli",	"rD,rA,p",	"0xa 10 DD DDDA AAAA pppp pppp pppp pppp pppp pppp ---- -101", EF(l_mul), 0, it_arith },
{ "bw.xori",	"rD,rA,p",	"0xa 10 DD DDDA AAAA pppp pppp pppp pppp pppp pppp ---- -110", EF(l_xor), 0, it_arith },

/* ----------------------------- 0xa 11 -------------------------- */
{ "bw.mulas",	"rD,rA,rB,H",	"0xa 11 DD DDDA AAAA BBBB BHHH HH-- ---- ---- ---- --00 0000", EF(b_mulas), 0, it_arith},
{ "bw.muluas",	"rD,rA,rB,H",	"0xa 11 DD DDDA AAAA BBBB BHHH HH-- ---- ---- ---- --00 0001", EF(b_muluas), 0, it_arith},
{ "bw.mulras",	"rD,rA,rB,H",	"0xa 11 DD DDDA AAAA BBBB BHHH HH-- ---- ---- ---- --00 0010", EF(b_mulras), 0, it_arith},
{ "bw.muluras",	"rD,rA,rB,H",	"0xa 11 DD DDDA AAAA BBBB BHHH HH-- ---- ---- ---- --00 0011", EF(b_muluras), 0, it_arith},
{ "bw.mulsu",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --00 0100", EF(b_mulsu), 0, it_arith},
{ "bw.mulhsu",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --00 0101", EF(b_mulhsu), 0, it_arith},
{ "bw.mulhlsu",	"rD,rQ,rA,rB",	"0xa 11 DD DDDA AAAA BBBB BQQQ QQ-- ---- ---- ---- --00 0110", EF(b_mulhlsu), 0, it_arith},
                                                                                        
{ "bw.smultt",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 0000", EF(b_smultt), 0, it_arith},
{ "bw.smultb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 0001", EF(b_smultb), 0, it_arith},
{ "bw.smulbb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 0010", EF(b_smulbb), 0, it_arith},
{ "bw.smulwb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 0011", EF(b_smulwb), 0, it_arith},
{ "bw.smulwt",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 0100", EF(b_smulwt), 0, it_arith},

{ "bw.umultt",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 1000", EF(b_umultt), 0, it_arith},
{ "bw.umultb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 1001", EF(b_umultb), 0, it_arith},
{ "bw.umulbb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 1010", EF(b_umulbb), 0, it_arith},
{ "bw.umulwb",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 1011", EF(b_umulwb), 0, it_arith},
{ "bw.umulwt",	"rD,rA,rB",	"0xa 11 DD DDDA AAAA BBBB B--- ---- ---- ---- ---- --10 1100", EF(b_umulwt), 0, it_arith},

{ "bw.smadtt",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 0000", EF(b_smadtt), 0, it_arith},
{ "bw.smadtb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 0001", EF(b_smadtb), 0, it_arith},
{ "bw.smadbb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 0010", EF(b_smadbb), 0, it_arith},
{ "bw.smadwb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 0011", EF(b_smadwb), 0, it_arith},
{ "bw.smadwt",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 0100", EF(b_smadwt), 0, it_arith},

{ "bw.umadtt",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 1000", EF(b_umadtt), 0, it_arith},
{ "bw.umadtb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 1001", EF(b_umadtb), 0, it_arith},
{ "bw.umadbb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 1010", EF(b_umadbb), 0, it_arith},
{ "bw.umadwb",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 1011", EF(b_umadwb), 0, it_arith},
{ "bw.umadwt",	"rD,rA,rB,rR",	"0xa 11 DD DDDA AAAA BBBB BRRR RR-- ---- ---- ---- --11 1100", EF(b_umadwt), 0, it_arith},

/* ----------------------------- 0xb ----------------------------- */
{ "bw.copdss",	"rD,rA,rB,y",	"0xb 00 DD DDDA AAAA BBBB Byyy yyyy yyyy yyyy yyyy yyyy yyyy", EF(b_copdss), 0, it_arith},
{ "bw.copd",	"rD,g,H",	"0xb 01 DD DDDH HHHH gggg gggg gggg gggg gggg gggg gggg gggg", EF(b_copd), 0, it_arith},
{ "bw.cop",	"g,x",		"0xb 10 xx xxxx xxxx gggg gggg gggg gggg gggg gggg gggg gggg", EF(b_cop), 0, it_arith},

{ "bg.sb",	"Y(rA),rB",	"0xc 00 BB BBBA AAAA YYYY YYYY YYYY YYYY", EF(l_sb), 0, it_store },
{ "bg.lbz",	"rD,Y(rA)",	"0xc 01 DD DDDA AAAA YYYY YYYY YYYY YYYY", EF(l_lbz), 0, it_load },
{ "bg.sh",	"X(rA),rB",	"0xc 10 BB BBBA AAAA 0XXX XXXX XXXX XXXX", EF(l_sh), 0, it_store },
{ "bg.lhz",	"rD,X(rA)",	"0xc 10 DD DDDA AAAA 1XXX XXXX XXXX XXXX", EF(l_lhz), 0, it_load },
{ "bg.sw",	"W(rA),rB",	"0xc 11 BB BBBA AAAA 00WW WWWW WWWW WWWW", EF(l_sw), 0, it_store },
{ "bg.lwz",	"rD,W(rA)",	"0xc 11 DD DDDA AAAA 01WW WWWW WWWW WWWW", EF(l_lwz), 0, it_load },
{ "bg.lws",	"rD,W(rA)",	"0xc 11 DD DDDA AAAA 10WW WWWW WWWW WWWW", EF(b_lws), 0, it_load },
{ "bg.sd",	"V(rA),rB",	"0xc 11 BB BBBA AAAA 110V VVVV VVVV VVVV", EF(b_sd), 0, it_store },
{ "bg.ld",	"rD,V(rA)",	"0xc 11 DD DDDA AAAA 111V VVVV VVVV VVVV", EF(b_ld), 0, it_load },

{ "bg.beqi",	"rB,I,U",	"0xd 00 00 00II IIIB BBBB UUUU UUUU UUUU", EF(b_beq), 0, it_branch },
{ "bg.bnei",	"rB,I,U",	"0xd 00 00 01II IIIB BBBB UUUU UUUU UUUU", EF(b_bne), 0, it_branch },
{ "bg.bgesi",	"rB,I,U",	"0xd 00 00 10II IIIB BBBB UUUU UUUU UUUU", EF(b_bges), 0, it_branch },
{ "bg.bgtsi",	"rB,I,U",	"0xd 00 00 11II IIIB BBBB UUUU UUUU UUUU", EF(b_bgts), 0, it_branch },
{ "bg.blesi",	"rB,I,U",	"0xd 00 01 00II IIIB BBBB UUUU UUUU UUUU", EF(b_bles), 0, it_branch },
{ "bg.bltsi",	"rB,I,U",	"0xd 00 01 01II IIIB BBBB UUUU UUUU UUUU", EF(b_blts), 0, it_branch },
{ "bg.bgeui",	"rB,I,U",	"0xd 00 01 10II IIIB BBBB UUUU UUUU UUUU", EF(b_bgeu), 0, it_branch },
{ "bg.bgtui",	"rB,I,U",	"0xd 00 01 11II IIIB BBBB UUUU UUUU UUUU", EF(b_bgtu), 0, it_branch },
{ "bg.bleui",	"rB,I,U",	"0xd 00 10 00II IIIB BBBB UUUU UUUU UUUU", EF(b_bleu), 0, it_branch },
{ "bg.bltui",	"rB,I,U",	"0xd 00 10 01II IIIB BBBB UUUU UUUU UUUU", EF(b_bltu), 0, it_branch },
{ "bg.beq",	"rA,rB,U",	"0xd 00 10 10AA AAAB BBBB UUUU UUUU UUUU", EF(b_beq), 0, it_branch },
{ "bg.bne",	"rA,rB,U",	"0xd 00 10 11AA AAAB BBBB UUUU UUUU UUUU", EF(b_bne), 0, it_branch },
{ "bg.bges",	"rA,rB,U",	"0xd 00 11 00AA AAAB BBBB UUUU UUUU UUUU", EF(b_bges), 0, it_branch },
{ "bg.bgts",	"rA,rB,U",	"0xd 00 11 01AA AAAB BBBB UUUU UUUU UUUU", EF(b_bgts), 0, it_branch },
{ "bg.bgeu",	"rA,rB,U",	"0xd 00 11 10AA AAAB BBBB UUUU UUUU UUUU", EF(b_bgeu), 0, it_branch },
{ "bg.bgtu",	"rA,rB,U",	"0xd 00 11 11AA AAAB BBBB UUUU UUUU UUUU", EF(b_bgtu), 0, it_branch },

{ "bg.jal",	"t",		"0xd 01 00 tttt tttt tttt tttt tttt tttt", EF(b_jal), 0, it_jump },
{ "bg.j",	"t",		"0xd 01 01 tttt tttt tttt tttt tttt tttt", EF(b_j), 0, it_jump },
{ "bg.bf",	"t",		"0xd 01 10 tttt tttt tttt tttt tttt tttt", EF(b_bf), BA_R_FLAG, it_branch },
{ "bg.bnf",	"t",		"0xd 01 11 tttt tttt tttt tttt tttt tttt", EF(b_bnf), BA_R_FLAG, it_branch },

{ "bg.addi",	"rD,rA,Y",	"0xd 10 DD DDDA AAAA YYYY YYYY YYYY YYYY", EF(b_add), 0, it_arith },

{ "fg.beq.s",	"rA,rB,U",	"0xd 11 00 00AA AAAB BBBB UUUU UUUU UUUU", EF(f_beq_s), 0, it_branch },
{ "fg.bne.s",	"rA,rB,U",	"0xd 11 00 01AA AAAB BBBB UUUU UUUU UUUU", EF(f_bne_s), 0, it_branch },
{ "fg.bge.s",	"rA,rB,U",	"0xd 11 00 10AA AAAB BBBB UUUU UUUU UUUU", EF(f_bge_s), 0, it_branch },
{ "fg.bgt.s",	"rA,rB,U",	"0xd 11 00 11AA AAAB BBBB UUUU UUUU UUUU", EF(f_bgt_s), 0, it_branch },

/* ----------------------------- 0xe ----------------------------- */

/* ----------------------------- 0xf ----------------------------- */

{ "", "", "", EFI, 0, 0 }

};

Each instruction definition contains the instruction’s mnemonic, its list of arguments, and its encoding. The function used to emulate the instruction is listed as well; however, the implementation of those functions is not present in the sources. Those functions will be important when developing the analysis and emulation plugin.
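To make the encoding column more concrete, here is a minimal Python sketch of how such a pattern string could be turned into a fixed-bit mask, a match value, and one bit mask per operand field. It assumes that the leading 0x token contributes four bits, that 0 and 1 are fixed bits, that - marks ignored bits, and that every other character names an operand field. parse_encoding and extract are hypothetical helper names, not functions from binutils or pyba2.

```python
def parse_encoding(pattern):
    """Split an encoding pattern into (bit length, mask, value, field masks)."""
    head, *rest = pattern.split()
    # Assumption: the leading hex token (e.g. "0xc") is exactly one nibble.
    bits = format(int(head, 16), "04b") + "".join(rest)
    mask = value = 0
    fields = {}
    for ch in bits:
        mask <<= 1
        value <<= 1
        for name in fields:
            fields[name] <<= 1
        if ch in "01":            # fixed bit: must match when decoding
            mask |= 1
            value |= int(ch)
        elif ch != "-":           # any other letter is an operand field bit
            fields[ch] = fields.get(ch, 0) | 1
    return len(bits), mask, value, fields


def extract(word, field_mask):
    """Pull a contiguous operand field out of an instruction word."""
    shift = (field_mask & -field_mask).bit_length() - 1
    return (word & field_mask) >> shift


# Pattern of bg.lbz, copied from the table above.
length, mask, value, fields = parse_encoding(
    "0xc 01 DD DDDA AAAA YYYY YYYY YYYY YYYY")
word = value | (3 << 21) | (1 << 16) | 0x10   # hand-built word: rD=3, rA=1, Y=0x10
print(length, hex(mask), hex(value))
print(extract(word, fields["D"]), extract(word, fields["A"]), extract(word, fields["Y"]))
```

A word matches the pattern when (word & mask) == value, after which each operand can be read with extract. The sketch handles contiguous fields only, which holds for all the patterns shown above.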

Similar to the bin plugin, the basic information our asm plugin has to return to radare2 is a dictionary:

{
    "name": "ba2",
    "arch": "ba2",
    "bits": 32,
    "endian": 2, # R_SYS_ENDIAN_BIG
    "license": "BSD",
    "desc": "Beyond Architecture 2 (dis)assembly plugin",
    "assemble": assembler.assemble,
    "disassemble": assembler.disassemble,
}

While most of the information in the above dictionary is static, it also exposes the two main functions of the plugin:

assemble: this function should assemble provided assembly code into actual machine code.
disassemble: this function should parse a piece of machine code into readable instructions with arguments.

Since both functions above are highly dependent on correct transformations between the binary and textual representations of the ISA’s instructions (the encodings of which are found in the binutils project), we chose to develop a generic module that represents those instructions and transforms them to and from binary form. Using a generic module for the heavy lifting of instruction representation will allow us to transfer the functionality to other projects (that support Python plugins), if desired.

With the module ready, creating the pyba2/asm plugin was as easy as wrapping the module and exposing the assemble and disassemble functions:

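class BA2Assembler:
    """Thin wrapper that exposes the generic instructions module to radare2."""

    def __init__(self, unused=0):
        if unused != 0:
            self._unused = unused

    def assemble(self, asm, addr):
        # Parse the textual instruction and return its encoded form.
        try:
            return instructions.Instruction.lower(asm, addr).encode()
        except Exception:
            # Any parsing or encoding failure yields an empty result.
            return []

    def disassemble(self, memview, addr):
        # lift() returns the decoded instruction and the number of bytes it consumed.
        try:
            insn, parsed_bytes = instructions.Instruction.lift(memview, addr)
            return [parsed_bytes, str(insn)]
        except Exception:
            # Undecodable bytes are reported as a zero-length invalid instruction.
            return [0, "invalid"]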

pyba2/analysis Plugin Development

As mentioned earlier, modern reverse engineering tools, such as radare2, offer a lot more than a simple disassembly listing. Code flow analysis is one such feature, and its popularity and importance are understandable: it allows a researcher to easily follow the various paths that execution can take through the code. Such a feature requires a detailed understanding of the analyzed architecture, its opcodes, and its registers:

What registers are available in the architecture?
What opcodes influence the PC (program counter)?
- Conditional jumps, that create branches in the code path
- Direct jumps, that simply continue the execution at a different point (and not at the next instruction)
- Calls, that execute a different piece of code before returning to the call site and executing the next instruction
What opcodes access the memory?
- Loads from memory into registers
- Stores into memory from registers
- Stack access (push/pop instructions)
How to recognize basic blocks, functions (with their prologues and epilogues), etc.?

To supply all of that information to radare2, an analysis plugin is required to implement several functions:

archinfo: this function is used to provide radare2 with some basic information about the architecture’s opcodes, based on the argument passed. archinfo(0) returns the alignment of the architecture, archinfo(1) returns the maximum size of an opcode, and archinfo(2) returns the minimum size of an opcode.
set_reg_profile: this function provides radare2 with information about the registers layout of the architecture. For basic analysis, listing the architecture’s registers is enough. However, in order to employ ESIL (radare2’s emulation framework), additional information is required, such as the size of the registers, and whether any of them overlap.
op: this function provides radare2 with all of the available information regarding the analyzed instruction, such as its classification, its target address (for opcodes that manipulate the program counter), its referenced address (for opcodes that access memory), its ESIL representation (for emulation), and so on. This is the function responsible for enriching the disassembly view with hints about the code - the targets of jumps and calls, the contents of the memory referenced by instructions, etc.
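As an illustration of the kind of information op has to produce, the following sketch turns the instruction class column of the binutils table (it_branch, it_jump, it_load, and so on) into a small description of the instruction. The function name describe_op, the dictionary keys, and the string type labels are stand-ins chosen for readability; the real plugin has to use radare2's own numeric type constants, which are not reproduced here. Treating every mnemonic that ends in jal as a call is an assumption based on the listings in this post.

```python
def describe_op(mnemonic, insn_class, addr, size, target=None):
    """Describe one decoded instruction for code flow analysis (sketch)."""
    op = {"size": size, "type": "other"}
    if insn_class == "it_branch":
        # Conditional branch: execution may continue at the target
        # or fall through to the next instruction.
        op.update(type="cjmp", jump=target, fail=addr + size)
    elif insn_class == "it_jump":
        if mnemonic.endswith("jal"):
            # Jump-and-link: a call that later returns to addr + size.
            op.update(type="call", jump=target, fail=addr + size)
        else:
            op.update(type="jmp", jump=target)
    elif insn_class in ("it_load", "it_store"):
        op["type"] = insn_class[3:]          # "load" or "store"
    elif insn_class == "it_arith":
        op["type"] = "arith"
    elif insn_class == "it_move":
        op["type"] = "mov"
    return op


# A conditional branch taken from a listing later in this post.
print(describe_op("bn.bgesi", "it_branch", 0x87b4b, 3, 0x87b5c))
```

The jump and fail addresses are what let radare2 split code into basic blocks and draw the edges between them.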

The main challenge in developing the analysis plugin, therefore, was gaining a sufficient understanding of Beyond Architecture 2. Due to the lack of documentation, all of the required information about the architecture and its opcodes had to be inferred via alternative methods. We used the following approaches during the analysis of the architecture:

Sample compilation and analysis: by using BeyondStudio to compile sample firmware projects provided by NXP, we could compare the compiled source code to the resulting disassembly (which was also more readable thanks to symbols), and infer its functionality.
Guessing: as unscientific as it sounds, this research included a lot of guesswork. The purpose of some opcodes, given their mnemonic and arguments, was easy enough to guess. Other instructions’ names could be found in better-documented architectures (such as MIPS), and in some cases we could prove that their purpose was similar.
Comparison with OpenRisc 1000: as BA2 is a closed-source implementation of OpenRisc 1000, it’s safe to assume that similarly named instructions have similar functionality. Thus, the specification provided us with at least a basic understanding of the purpose of dozens of instructions, as well as the expected functionality of the registers.
Execution and observation: by using a JN5169-based development board with a connected debugger, we could execute arbitrary instructions and observe the state changes of the CPU. This approach will be described in more detail in the following post.

Instruction size

Observing the minimum and maximum size of the instructions is easy, based on the source code in binutils. It’s clear that the smallest instructions are 2 bytes (16-bit) long, while the longest ones are 6 bytes (48-bit) long.

Registers

Understanding the purpose of the registers was possible thanks to the OpenRisc 1000 specification (17.2.1: pp. 354-356). Disassembled samples matched the description in the specification, and it seems that BA2 follows it without any changes, thus:

R0 holds a fixed 0 value
R1 is the stack pointer (SP)
R2 is the frame pointer (FP)
R9 is the link register (LR)
R11 is the return value (RV)

The calling-convention information in the specification can also be used in the analysis of BA2 binaries, since we assume BA2 follows the same conventions.
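The register roles above can be expressed as a radare2 register profile, which is a plain-text table of aliases and register definitions. The sketch below generates one under a few assumptions: 32 general-purpose registers (suggested by the 5-bit register fields in the encodings), 32-bit registers, and a program counter named pc placed after the GPRs. build_reg_profile is a hypothetical helper, and the profile actually shipped with pyba2 may differ.

```python
def build_reg_profile(num_gprs=32, bits=32):
    """Build a radare2-style register profile string (sketch)."""
    step = bits // 8
    # Aliases tell radare2 which registers play special roles.
    lines = [
        "=PC\tpc",   # program counter
        "=SP\tr1",   # stack pointer
        "=BP\tr2",   # frame pointer
    ]
    # One line per register: type, name, size in bits, byte offset, packed size.
    for i in range(num_gprs):
        lines.append(f"gpr\tr{i}\t.{bits}\t{i * step}\t0")
    lines.append(f"gpr\tpc\t.{bits}\t{num_gprs * step}\t0")
    return "\n".join(lines) + "\n"


profile = build_reg_profile()
print(profile.splitlines()[:5])
```

The link register (R9) and return value register (R11) need no special alias for basic analysis, but they matter once calls and returns are emulated.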

Opcodes

By far, the most complex (and still incomplete) task regarding the analysis of the architecture was the reverse engineering of each of the opcodes. For each opcode, we had to understand its functionality, the side effects of its execution, and so on. This required a combination of all of the aforementioned methods (analysis of precompiled samples, inspection of the OpenRisc 1000 specification, execution and observation on a development board, and a fair amount of guesswork).

Following are a few sample instructions, each with an explanation of how it was analyzed. Where precompiled samples were used, we arbitrarily chose to work with JN-AN-1180 (downloadable from NXP) and compiled its HomeSensorEndD project:

`bn.entri`

This instruction is unique to BA2 (it’s not available in the OpenRisc specification). By analyzing its appearance in code, such as in the bRGB_LED_Off function:

┌ (fcn) sym.bRGB_LED_Off
│   sym.bRGB_LED_Off ();
│           0x00087b2c      47ac80         bn.entri    0x3,0x1
│           0x00087b2f      0080           bt.movi     r4,0x0
│           0x00087b31      0064           bt.movi     r3,0x2
│           0x00087b33      497400         bn.jal      sym.bPCA9634_SetChannelLevel ;[1] ; sym.bWhite_LED_Enable+0x8c
│           0x00087b36      0080           bt.movi     r4,0x0
│           0x00087b38      0543           bt.mov      r10,r3
│           0x00087b3a      0068           bt.movi     r3,0x1
│           0x00087b3c      4a3400         bn.jal      sym.bPCA9634_SetChannelLevel ;[1] ; sym.bWhite_LED_Enable+0x8c
│           0x00087b3f      0563           bt.mov      r11,r3
│           0x00087b41      61605d         bn.sub      r11,r0,r11
│           0x00087b44      0060           bt.movi     r3,0x0
│           0x00087b46      0080           bt.movi     r4,0x0
│           0x00087b48      4a9400         bn.jal      sym.bPCA9634_SetChannelLevel ;[1] ; sym.bWhite_LED_Enable+0x8c
│           0x00087b4b      420b88         bn.bgesi    r11,0x0,0x87b5c
│           0x00087b4e      614055         bn.sub      r10,r0,r10
│           0x00087b51      420ad0         bn.bgesi    r10,0x0,0x87b5c
│           0x00087b54      60601d         bn.sub      r3,r0,r3
│           0x00087b57      6c63f9         bn.srli     r3,r3,0x1f
│           0x00087b5a      0c80           bt.j        0x87b5e
│           0x00087b5c      0060           bt.movi     r3,0x0
└           0x00087b5e      47bc80         bn.reti     0x3,0x1

Due to its name (which is obviously related to Entry) and location (always at the beginning of the function), we concluded early on that the purpose of this opcode is to prepare the stack frame of the function. However, the purpose of its 2 immediate parameters was still unclear. By executing it in isolation (and with different parameters) on a debugged development board, we were able to infer that the opcode bn.entri F,N pushes F GPRs (general-purpose registers), beginning with R9, onto the stack, and then further expands the stack by N words.
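The inferred behaviour can be written down as a small reference model. This is only a sketch of the description above: it assumes a downward-growing stack, 4-byte words, and that registers are pushed in ascending order starting at R9 (the exact memory layout was not stated above). The function name entri and the dictionary-based memory are illustrative choices.

```python
WORD = 4  # assumed word size in bytes


def entri(regs, mem, f, n):
    """Model of bn.entri F,N: push F GPRs starting at R9, then reserve N words."""
    sp = regs[1]                    # R1 is the stack pointer
    for i in range(f):
        sp -= WORD                  # assumption: the stack grows downwards
        mem[sp] = regs[9 + i]       # assumption: R9 first, then R10, R11, ...
    sp -= n * WORD                  # extra space for locals
    regs[1] = sp


regs = [0] * 32
regs[1] = 0x1000
regs[9], regs[10], regs[11] = 0xAA, 0xBB, 0xCC
mem = {}
entri(regs, mem, 3, 1)              # same operands as in bRGB_LED_Off
print(hex(regs[1]), {hex(a): hex(v) for a, v in mem.items()})
```

With the operands used in bRGB_LED_Off (0x3, 0x1), the model saves R9, R10, and R11, which matches the fact that the function body goes on to overwrite R10 and R11.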

`bt.trap`, `bt.rfe`, `bt.ei`, `bt.di`, `bt.sys`

These are, in fact, some of the several “syntactic sugar” instructions present in the ISA. A “syntactic sugar” instruction can be decoded as a different instruction with specific arguments. However, with those arguments, it takes on a different purpose, and is better represented by a different mnemonic. bt.trap, for instance, is simply a bt.movi with the destination register being R0 (the rest of the above are bt.mov with R0 as the destination register). If we recall the ISA and the specification of the registers, R0 has a fixed value of 0, and cannot be written to. When a CPU encounters such an instruction, it will treat it as a special case, instead. As with other “syntactic sugar” instructions, they were discovered when testing our disassembly module and comparing its output to that of objdump.
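One way to handle such aliases is a small rewriting pass that runs after the generic decoder. The sketch below covers only the bt.trap case spelled out above; the bt.mov-based aliases are left out because the operand values that select each of them are not listed here. apply_sugar and the tuple-based instruction representation are hypothetical, and keeping the immediate as the operand of bt.trap is an assumption.

```python
def apply_sugar(mnemonic, operands):
    """Rewrite a decoded instruction into its alias, if it has one (sketch)."""
    # bt.movi with R0 as destination cannot write anything (R0 is fixed at 0),
    # so the encoding is reused for bt.trap.
    if mnemonic == "bt.movi" and operands and operands[0] == "r0":
        return "bt.trap", operands[1:]
    return mnemonic, operands


print(apply_sugar("bt.movi", ["r0", "0x1"]))
print(apply_sugar("bt.movi", ["r4", "0x0"]))
```

The assembler needs the inverse mapping, so that an alias in the source text is encoded as the underlying instruction with R0 as its destination.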

`bt.ff1`

This opcode is present in the OpenRisc 1000 specification (as l.ff1), and its purpose is to return the position of the lowest-order non-zero bit of its argument. We used our development board to validate this assumption.

`bt.bitrev`, `bt.clz`

The purpose of these instructions was inferred via debugging the development board. The first one (as the name suggests) reverses the bits in its argument, such that 0xdeadbeef (11011110101011011011111011101111) becomes 0xf77db57b (11110111011111011011010101111011). The other, similar to the MIPS clz instruction, is used to count the leading zeroes in a number.
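For reference, the three bit-manipulation opcodes can be modelled in a few lines of Python. The models assume 32-bit operands. The result conventions (1-based position and 0 for a zero input in ff1, 32 for a zero input in clz) are taken from the OpenRisc 1000 and MIPS definitions respectively, and are assumptions as far as BA2 itself is concerned.

```python
MASK32 = 0xFFFFFFFF


def ff1(x):
    """Position (1-based) of the lowest set bit, or 0 if no bit is set."""
    x &= MASK32
    return (x & -x).bit_length()


def bitrev(x):
    """Reverse the order of the 32 bits of x."""
    return int(format(x & MASK32, "032b")[::-1], 2)


def clz(x):
    """Count the leading zero bits of a 32-bit value."""
    return 32 - (x & MASK32).bit_length()


print(hex(bitrev(0xDEADBEEF)))   # the example value used above
print(ff1(0x8), clz(0x8))
```

Models like these are convenient as an oracle when comparing against the register state observed on the development board.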

Verification

Before using the pyba2 plugin for actual research, it was important to make sure its logic works correctly, and produces results consistent with the “official” disassembly (produced by objdump). The ability to compile sample firmware projects in BeyondStudio proved extremely helpful for this purpose. Building each project produces both an ELF binary (readable by objdump), and a binary firmware (in the format parsable by pyba2).

We developed a testing script that takes both the ELF and the firmware as input. Next, it loads the binary firmware into radare2, while using objdump to get a full disassembly listing of the ELF version. Finally, it reads the complete disassembly of the binary in radare2, line by line, and performs the following verifications:

Disassembly: the script compares each line of the disassembly from radare2 to the output from objdump. Through this verification, we were able to detect “syntactic sugar” instructions. Additionally, it helped us find and fix such bugs as incorrect parsing of branch target addresses, and incorrect treatment of signed/unsigned integers.
Assembly: the script attempts to assemble each line of the disassembly from radare2, and compares the resulting binary encoding of the instruction to the actual binary content of the firmware at that location. This verification ensures that our assembly logic is solid. Assembly of instructions is used extensively in the “Execution and Observation” approach to analyzing the ISA (where we attempt to execute isolated instructions on a development board and observe the state changes), and its correctness is vital to producing reliable results.
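The two checks can be sketched as a single loop over the radare2 listing. In this sketch the listings are passed in as plain Python data rather than obtained from radare2 and objdump, the assembler is any callable with the same shape as the assemble function shown earlier, and verify, normalize, and toy_assemble are hypothetical names. The sample bytes are the two bt.movi instructions from the bRGB_LED_Off listing.

```python
def normalize(text):
    """Ignore case and whitespace so cosmetic differences do not count."""
    return "".join(text.lower().split())


def verify(firmware, base, r2_listing, objdump_listing, assemble):
    """Return a list of (address, kind) for every failed check."""
    failures = []
    for addr, size, text in r2_listing:
        # Disassembly check: our text must match objdump's at the same address.
        expected = objdump_listing.get(addr)
        if expected is None or normalize(text) != normalize(expected):
            failures.append((addr, "disassembly"))
        # Assembly check: re-assembling our text must reproduce the raw bytes.
        raw = firmware[addr - base:addr - base + size]
        if bytes(assemble(text, addr)) != raw:
            failures.append((addr, "assembly"))
    return failures


# Toy data: two instructions taken from the listing of bRGB_LED_Off.
firmware = bytes.fromhex("00800064")
base = 0x87b2f
r2_listing = [(0x87b2f, 2, "bt.movi r4,0x0"), (0x87b31, 2, "bt.movi r3,0x2")]
objdump_listing = {0x87b2f: "bt.movi   r4,0x0", 0x87b31: "bt.movi   r3,0x2"}
TOY = {"bt.movi r4,0x0": [0x00, 0x80], "bt.movi r3,0x2": [0x00, 0x64]}


def toy_assemble(asm, addr):
    # Stand-in for the real assembler: a lookup table for the two samples.
    return TOY.get(asm, [])


print(verify(firmware, base, r2_listing, objdump_listing, toy_assemble))
```

Reporting the address together with the kind of failure makes it easy to jump to the offending instruction in both tools.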

Furthermore, while the above verifications are mostly relevant to the asm part of the pyba2 plugin, the script inherently tests the correctness of the bin part of the plugin as well, since the code section of the firmware is expected to be parsed and placed at the correct address (and any bug in this process would result in a mismatch against objdump’s output).

Results

The combination of parsing, disassembly, and analysis capabilities of pyba2 and radare2 results in a reverse engineering environment that is far superior to objdump:

Beyond Architecture 2: Code Flow Graph in radare2

What’s next?

Our next post will focus on using a cheap, JN5168-based development board and a basic JTAG debugger to create a test harness for executing and debugging arbitrary BA2 binaries. Meanwhile, feel free to ask questions, add comments, and contribute in the comment section below and on the project’s GitHub page.