Clean Angr disassemble output

Question:

I’m developing a Python script for angr that has to print output in the form:

Instruction_disassembled        opcode_bytes_of_instruction

This is my python script:

    import re
    import sys

    import angr

    f = open(sys.argv[2], 'w')
    base_addr = 0x100000
    p = angr.Project(sys.argv[1], auto_load_libs=False, load_options={'main_opts': {'base_addr': base_addr}})
    cfg = p.analyses.CFGFast()
    cfg.normalize()
    for func_node in cfg.functions.values():
        for block in func_node.blocks:
            print(re.sub(r'.', '', str(block.disassembly), count=10) + '\t' + block.bytes.hex())

With my script I get output with two things I don’t want: addresses at the beginning of the lines, and the opcode bytes printed all together at the end of the block instead of at the end of each line. For example:

endbr64 
0x101004:   sub rsp, 8
0x101008:   mov rax, qword ptr [rip + 0x2fd9]
0x10100f:   test    rax, rax
0x101012:   je  0x101016    f30f1efa4883ec08488b05d92f00004885c07402

Unfortunately the block is printed as a whole, so I can neither remove the addresses nor print the opcode bytes correctly.

Can you tell me another way to iterate through the functions so that I get the individual instructions, or how I can parse this output? Thank you in advance.

Asked By: Luca

||

Answers:

I solved it by iterating over the Capstone instructions of each block instead of printing the block as a whole:

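    # cfg and f are the CFG and the output file from the script in the question
    for func_node in cfg.functions.values():
        for block in func_node.blocks:
            c = block.capstone

            # c.insns holds one entry per instruction in the block
            for i in c.insns:
                # space-separated opcode bytes, two tabs, then the instruction text
                f.write(' '.join(re.findall(r'.{1,2}', i.insn.bytes.hex())).upper() + '\t\t' + i.mnemonic.upper() +
                        " " + i.op_str.upper() + '\n')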
Answered By: Luca
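
The formatting step from the loop above can also be written as a small standalone helper, which makes it easy to check without loading a binary. This is only a minimal sketch: `format_insn` and its `bytes_first` parameter are hypothetical names, not part of angr or Capstone, and it assumes you pass in the `i.insn.bytes`, `i.mnemonic` and `i.op_str` values from the loop. It relies on `bytes.hex(" ")`, which needs Python 3.8 or later.

```python
def format_insn(raw: bytes, mnemonic: str, op_str: str, bytes_first: bool = True) -> str:
    """Format one instruction as a single line of text.

    format_insn is a hypothetical helper, not an angr or Capstone API.
    """
    # bytes.hex(" ") puts a space between bytes (Python 3.8+), e.g. "48 83 ec 08"
    hex_bytes = raw.hex(" ").upper()
    # Instructions without operands (e.g. endbr64) have an empty op_str,
    # so strip the trailing space left by the join
    text = f"{mnemonic} {op_str}".strip().upper()
    if bytes_first:
        return f"{hex_bytes}\t\t{text}"
    return f"{text}\t{hex_bytes}"


# Sample values taken from the block shown in the question
print(format_insn(bytes.fromhex("4883ec08"), "sub", "rsp, 8"))
print(format_insn(bytes.fromhex("f30f1efa"), "endbr64", "", bytes_first=False))
```

Inside the loop this would be called as `f.write(format_insn(i.insn.bytes, i.mnemonic, i.op_str) + '\n')`. Passing `bytes_first=False` gives the instruction-then-bytes order asked for in the question.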