Friday, April 5, 2019
Assemblers And Disassembler Softwares Computer Science Essay
A disassembler is a computer program that translates machine language into assembly language, the inverse operation to that of an assembler. A disassembler differs from a decompiler, which targets a high-level language rather than an assembly language. The output of a disassembler is often formatted for human readability rather than suitability for input to an assembler, making it principally a reverse-engineering tool.

Assembly language source code generally permits the use of constants and programmer comments. These are usually removed from the assembled machine code by the assembler. A disassembler operating on the machine code would therefore produce disassembly lacking these constants and comments, and the disassembled output becomes more difficult for a human to interpret than the original annotated source code. Some disassemblers make use of the symbolic debugging information present in object files such as ELF. The Interactive Disassembler allows the human user to make up mnemonic symbols for values or regions of code in an interactive session: human insight applied to the disassembly process often parallels human creativity in the code writing process.

Disassembly is not an exact science. On CISC platforms with variable-width instructions, or in the presence of self-modifying code, it is possible for a single program to have two or more reasonable disassemblies. Determining which instructions would actually be encountered during a run of the program reduces to the proven-unsolvable halting problem.

Examples of disassemblers

Any interactive debugger will include some way of viewing the disassembly of the program being debugged. Often, the same disassembly tool will be packaged as a standalone disassembler distributed along with the debugger. For example, objdump, part of GNU Binutils, is related to the interactive debugger gdb.
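To make the earlier point about variable-width instructions concrete, here is a minimal sketch of a linear disassembler that decodes the same five bytes from two different starting offsets. The decoder only knows three real 32-bit x86 opcodes (nop, ret, and mov eax, imm32); the function names and the choice to print everything else as a data byte are assumptions made for this illustration, not how any particular tool works.

```python
import struct

def decode_one(code, off):
    """Return (length, text) for the instruction starting at off."""
    op = code[off]
    if op == 0x90:
        return 1, "nop"
    if op == 0xC3:
        return 1, "ret"
    if op == 0xB8 and off + 5 <= len(code):
        # mov eax, imm32: one opcode byte followed by a 4-byte immediate
        (imm,) = struct.unpack_from("<I", code, off + 1)
        return 5, f"mov eax, 0x{imm:08x}"
    # Anything this toy decoder does not know is shown as a data byte.
    return 1, f"db 0x{op:02x}"

def linear_sweep(code, start=0):
    """Decode instructions back to back, beginning at start."""
    out, off = [], start
    while off < len(code):
        length, text = decode_one(code, off)
        out.append((off, text))
        off += length
    return out

CODE = bytes.fromhex("b8909090c3")

for start in (0, 1):
    print(f"starting at offset {start}:")
    for off, text in linear_sweep(CODE, start):
        print(f"  {off:04x}  {text}")
```

Starting at offset 0 the bytes read as a single mov; starting one byte later they read as three nops and a ret. Both listings are valid, and only the actual execution flow decides which one is meaningful.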
Some examples of disassemblers are:

IDA
ILDASM is a tool contained in the .NET Framework SDK. It can be used to disassemble PE files containing Common Intermediate Language code.
OllyDbg is a 32-bit assembler level analysing debugger.
PVDasm is a free, interactive, multi-CPU disassembler.
SIMON is a test/debugger/animator with integrated disassembler for Assembler, COBOL and PL/1.
Texe is a free, 32-bit disassembler and Windows PE file analyzer.
unPIC is a disassembler for PIC microcontrollers.

Interactive Disassembler

The Interactive Disassembler, more commonly known as simply IDA, is a disassembler used for reverse engineering. It supports a variety of executable formats for different processors and operating systems. It can also be used as a debugger for Windows PE, Mac OS X Mach-O, and Linux ELF executables. A decompiler plugin for programs compiled with a C/C++ compiler is available at extra cost. The latest full version of IDA Pro is commercial.

IDA performs much automatic code analysis, using cross-references between code sections, knowledge of parameters of API calls, and other information. However, the nature of disassembly precludes total accuracy, and a great deal of human intervention is necessarily required. IDA has interactive functionality to aid in improving the disassembly. A typical IDA user will begin with an automatically generated disassembly listing and then convert sections from code to data and vice versa.

Scripting

IDC scripts make it possible to extend the operation of the disassembler. Some helpful scripts are provided, which can serve as the basis for user-written scripts. Most frequently, scripts are used for extra modification of the generated code. For example, external symbol tables can be loaded, thereby using the function names of the original source code.
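The idea of loading an external symbol table can be sketched outside of IDA in plain Python. The following is not IDC or IDAPython code and uses no IDA API. It assumes a hypothetical nm-style symbol file ("address type name" per line) and a listing whose unnamed functions carry auto-generated labels of the form sub_XXXXXXXX; both formats and all function names are assumptions made for the example.

```python
import re

def parse_symbols(text):
    """Parse 'address type name' lines into a {address: name} mapping."""
    symbols = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 3:
            addr, _kind, name = parts
            symbols[int(addr, 16)] = name
    return symbols

def apply_symbols(listing, symbols):
    """Replace auto-generated sub_XXXXXXXX labels with known names."""
    def rename(match):
        addr = int(match.group(1), 16)
        # Labels without a matching symbol are left untouched.
        return symbols.get(addr, match.group(0))
    return re.sub(r"\bsub_([0-9A-Fa-f]{8})\b", rename, listing)

SYMBOLS = """\
08048100 T main
08048140 T parse_args
"""

LISTING = """\
sub_08048100:
    call sub_08048140
    call sub_08048200
    ret
"""

print(apply_symbols(LISTING, parse_symbols(SYMBOLS)))
```

In this made-up listing the first two labels become main and parse_args, while sub_08048200 stays as it is because the symbol table has no entry for it.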
There are websites devoted to IDA scripts that offer assistance for frequently arising problems.

Users have created plugins that allow other common scripting languages to be used instead of, or in addition to, IDC. IdaRUB supports Ruby and IDAPython adds support for Python.

Supported systems/processors/compilers

Operating systems
x86 Windows GUI
x86 Windows console
x86 Linux console
x86 Mac OS X
ARM Windows CE

Executable file formats
PE (Windows)
ELF (Linux, most *BSD)
Mach-O (Mac OS X)
Netware .exe
OS/2 .exe
Geos .exe
Dos/Watcom LE executable (without embedded dos extender)
raw binary, such as a ROM image

Processors
Intel 8086 family
ARM, including Thumb code
Motorola 68xxx/h8
Zilog Z80
MOS Technology 6502
Intel i860
DEC Alpha
Analog Devices ADSP218x
Angstrem KR1878
Atmel AVR series
DEC series PDP11
Fujitsu F2MC16L/F2MC16LX
Fujitsu FR 32-bit Family
Hitachi SH3/SH3B/SH4/SH4B
Hitachi H8: h8300/h8300a/h8s300/h8500
Intel 196 series: 80196/80196NP
Intel 51 series: 8051/80251b/80251s/80930b/80930s
Intel i960 series
Intel Itanium (ia64) series
Java virtual machine
MIPS: mipsb/mipsl/mipsr/mipsrl/r5900b/r5900l
Microchip PIC: PIC12Cxx/PIC16Cxx/PIC18Cxx
MSIL
Mitsubishi 7700 Family: m7700/m7750
Mitsubishi m32/m32rx
Mitsubishi m740
Mitsubishi m7900
Motorola DSP 5600x Family: dsp561xx/dsp5663xx/dsp566xx/dsp56k
Motorola ColdFire
Motorola HCS12
NEC 78K0/78K0S
PA-RISC
PowerPC
SGS-Thomson ST20/ST20c4/ST7
SPARC Family
Samsung SAM8
Siemens C166 series
TMS320Cxxx series

Compiler/libraries (for automatic library function recognition)
Borland C++ 5.x for DOS/Windows
Borland C++ 3.1
Borland C Builder v4 for DOS/Windows
GNU C++ for Cygwin
Microsoft C
Microsoft QuickC
Microsoft Visual C++
Watcom C++ (16/32 bit) for DOS/OS2
ARM C v1.2
GNU C++ for Unix/common

SIMON (Batch Interactive test/debug)

SIMON (Batch Interactive test/debug) was a proprietary test/debugging toolkit for interactively testing batch programs designed to run on IBM's System 360/370/390 architecture.

It operated in two modes, one of which was full instruction set simulator mode and provided instruction step, conditional program breakpoint (pause) and storage alteration features for Assembler, COBOL and PL/1 programs. High level language (HLL) users were also able to see and modify variables directly at a breakpoint by their symbolic names and set conditional breakpoints by data content.

Many of the features were also available in partial monitor mode, which relied on deliberately interrupting the program at pre-defined points or when a program check occurred. In this mode, processing was not significantly different from normal processing speed without monitoring.

It additionally provided features to prevent application program errors such as Program Check, Wild branch, and Program loop. It was possible to correct many errors and interactively alter the control flow of the executing application program. This permitted more errors to be detected for each compilation which, at the time, were often scheduled batch jobs with printed output, often requiring several hours turnaround before the next test run.

Operating Systems

SIMON could be executed on IBM MVS, MVS/XA, ESA or DOS/VSE operating systems and required IBM 3270 terminals for interaction with the application program.

LIDA

lida is basically a disassembler and code analysis tool. It uses the bastard's libdisasm for single opcode decoding. It allows interactive control over the generated deadlisting via commands and builtin tools.

Features
It traces the execution flow of the binary.
It works with symbolic names: interactive naming of functions, labels, commenting of code.
It scans for known anti-debugging and anti-disassembling techniques.
It scans for user-defined code sequences.
It has an integrated patcher.
It also has an integrated cryptoanalyzer.

Many disassemblers out there use the output of objdump; lida tries a more serious approach.
The several limitations of objdump are overcome by using libdisasm and by tracing the execution flow of the program. Further, by having control over the disassembly, more features can be included. Everybody who has already worked on some deadlisting will immediately feel a desire to work interactively with the code and be able to modify it. Therefore lida will have an integrated patcher, resolve symbolic names, provide the ability to comment the code, and offer efficient browsing methods. The more exotic features of lida should be on the analysis side. The code can be scanned for custom sequences, known anti-debugging techniques and known encryption algorithms. You will also be able to work directly with the program's data and, for example, pass it to several customizable en-/decryption routines. This of course only makes limited sense as it is not a debugger. Still, I have often really missed this functionality.

Limitations of objdump based disassemblers

Usual programs one would like to disassemble are either coded directly in assembly, or use some tricks to avoid being disassembled. I will give here a short overview of the main objdump limitations.

objdump relies on section headers

objdump expects an ELF executable that contains correct section headers. For the OS loader to run an ELF binary, though, section headers are not obligatory at all.
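The ELF file header records the program header table and the section header table in separate fields, which is why one can be missing while the other is intact. Below is a minimal sketch that reads those fields from a 32-bit little-endian ELF header using only Python's struct module. The field layout follows the ELF32 header definition; the helper names and the synthetic header values are assumptions made up for the demonstration, not taken from any real binary.

```python
import struct

# ELF32 header, little-endian: e_ident[16], e_type, e_machine, e_version,
# e_entry, e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (52 bytes in total).
ELF32_HEADER = struct.Struct("<16sHHIIIIIHHHHHH")

def describe_elf32(data):
    """Report what an ELF32 header says about its two header tables."""
    (ident, _type, _machine, _version, entry, _phoff, shoff, _flags,
     _ehsize, _phentsize, phnum, _shentsize, shnum, _shstrndx) = \
        ELF32_HEADER.unpack_from(data, 0)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    return {
        "entry": entry,
        "program_headers": phnum,
        "section_headers": shnum,
        "has_section_table": shoff != 0 and shnum != 0,
    }

def make_header(phnum, shoff, shnum):
    """Build a synthetic ELF32 header (hypothetical values, demo only)."""
    ident = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)  # 32-bit, LSB, version 1
    return ELF32_HEADER.pack(ident, 2, 3, 1, 0x08048054, 52, shoff, 0,
                             52, 32, phnum, 40, shnum, 0)

normal = make_header(phnum=1, shoff=0x200, shnum=5)
stripped = make_header(phnum=1, shoff=0, shnum=0)

print(describe_elf32(normal))
print(describe_elf32(stripped))
```

The second header still announces one program header and an entry point, which is all a loader needs, while a tool that insists on the section table has nothing to work with. Real anti-disassembly tricks may also corrupt the section fields rather than zero them; this sketch only checks for the simplest case.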
What matters for getting a process loaded into memory are the program headers. So the first common anti-disassembling trick is to either drop or manipulate the ELF section headers. By doing so, objdump refuses to perform the disassembly:

    $ file tiny-crackme
    tiny-crackme: ELF 32-bit LSB executable, Intel 80386, version 1, statically linked, corrupted section header size
    $ objdump -D tiny-crackme
    objdump: tiny-crackme: File format not recognized

The binary I took as an example to verify this is yanisto's tiny-crackme.

objdump does not trace the execution flow I

By not tracing the execution flow, objdump can easily be fooled into disassembling just a few lines and stopping there. This means it does not recognize any functions and does not see code which is stored in data sections.

objdump does not trace the execution flow II

Additionally, another common trick is to insert garbage opcodes and jump over them, to misalign the disassembly from the execution flow.

Example: when an instruction jumps into the middle of the next instruction, objdump does not disassemble from this exact location. It will continue with the next instruction and consequently disassemble garbage from there on. As a result you will mainly see totally useless instructions in the whole disassembly.

Implementation Details

lida uses libdisasm of the bastard for single opcode decoding. It does not use the whole environment including the typhoon database. The main program is coded in Perl/Tk, which uses a C backend for the most time-consuming parts (disassembly, analysis, scanning for strings). Generally lida is designed to be as fast as possible (the disassembly) while trying not to waste all your RAM. lida is also designed to be efficient in usability. Therefore all important functions are accessible via single keystrokes, or short commands.
This means no clicking around is necessary; you can enter your tasks directly into the command line.

The disassembling engine

The disassembling is currently done in 4 (or 6) passes; the default is all 6.

1st pass: the main control flow disassembly
Here the disassembly is started from the executable's entrypoint, and recursively disassembles the binary by following each branch and stepping into each subroutine. This also leads to disassembling code blocks in data sections (if existent), so the disassembly is not limited to a .text section. Also, if indirect jumps/calls are used, the final destination is of course looked up in the binary's data.

2nd pass: for glibc binaries
A heuristic scan looks for the main() function and starts pass 1 there (so also recursive disassembling).

3rd pass: all other code sections
This pass repeats pass 1 for all found executable sections, and starts at the section start. If the binary does not contain section headers, the disassembly starts at the first loaded executable address.

4th pass: functions
This pass scans for typical function prologues and starts pass 1 at each found address. This is for discovering code regions which are not explicitly called, and whose entrypoints are evaluated at runtime.

5th pass: disassembling caves
All passes build up a map of the binary. If until now there are code regions which were not yet disassembled, they can be now.

6th pass: remainders
If pass 5 was executed, and there are still caves, they are displayed as DB xx.

Definitely for passes 4 and 5 there are enhancements to come, as well as for the recursive disassembly function itself. Also worth noting: whenever a jump into the middle of a previous instruction is found, those addresses are currently marked.
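The difference between a linear sweep and the control-flow-following first pass can be shown with a small sketch. This is not lida's code: it is a toy decoder that knows four real 32-bit x86 opcodes (inc eax, ret, jmp rel8, mov eax, imm32), handles only unconditional jumps, and uses made-up function names. A fuller version would also queue call targets and both successors of conditional branches.

```python
import struct

def decode_one(code, off):
    """Return (length, text, kind, target) for the instruction at off.

    kind is 'jmp', 'ret', or 'seq' (falls through to the next instruction).
    """
    op = code[off]
    if op == 0x40:
        return 1, "inc eax", "seq", None
    if op == 0xC3:
        return 1, "ret", "ret", None
    if op == 0xEB and off + 2 <= len(code):
        (rel,) = struct.unpack_from("<b", code, off + 1)
        target = off + 2 + rel  # relative to the end of the jmp
        return 2, f"jmp 0x{target:04x}", "jmp", target
    if op == 0xB8 and off + 5 <= len(code):
        (imm,) = struct.unpack_from("<I", code, off + 1)
        return 5, f"mov eax, 0x{imm:08x}", "seq", None
    return 1, f"db 0x{op:02x}", "seq", None

def linear_sweep(code):
    """Decode back to back from offset 0, ignoring control flow."""
    out, off = {}, 0
    while off < len(code):
        length, text, _kind, _target = decode_one(code, off)
        out[off] = (length, text)
        off += length
    return out

def recursive_traversal(code, entry=0):
    """Follow control flow from entry; only reached bytes are decoded."""
    out, work = {}, [entry]
    while work:
        off = work.pop()
        while 0 <= off < len(code) and off not in out:
            length, text, kind, target = decode_one(code, off)
            out[off] = (length, text)
            if kind == "ret":
                break
            if kind == "jmp":
                work.append(target)
                break
            off += length
    return out

def caves(code, decoded):
    """Offsets of bytes that no decoded instruction covers."""
    covered = set()
    for off, (length, _text) in decoded.items():
        covered.update(range(off, off + length))
    return [off for off in range(len(code)) if off not in covered]

# jmp over one garbage byte (0xB8), then three inc eax and a ret
CODE = bytes.fromhex("eb01b8404040c3")

for name, result in (("linear sweep", linear_sweep(CODE)),
                     ("recursive traversal", recursive_traversal(CODE))):
    print(name)
    for off in sorted(result):
        print(f"  {off:04x}  {result[off][1]}")
print("caves:", caves(CODE, recursive_traversal(CODE)))
```

The linear sweep swallows the real instructions into the immediate of a bogus mov, while the traversal follows the jump and decodes the three inc eax and the ret. The skipped garbage byte at offset 2 is left over as a cave, which is the kind of remainder the later passes deal with.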
Still to come is a representation of instructions within instructions (compare 3.1), since of course by intelligent placing of opcodes both instructions can be valid and used during the execution flow.

Signature Scanning

Basically, the scan for known crypto algorithms is done by "signature scanning". I put it in quotes because it is not a simple pattern matching.

To understand that, one needs a little understanding of typical hash/encryption algorithms. Let's take an MD5 hash as an example. How can we find the code that does an MD5 hash?

On a very high level, generating a hash is usually done in 3 steps: the init function, the update function and the finalize function. The init function usually sets up an array of some numeric values, which are then modified in a loop using the input data (plain data) during the algorithm, until the hash is calculated. The finalize function creates the representation in a common format (simply put, it pads the data and appends the size).

However, it is not necessary to know how the algorithm actually works in order to find it. The initialization functions use fixed numeric initialization values, which are the same in every implementation, as they are part of the algorithm. These are the values we are searching for. For MD5 those are:

0x67452301
0xefcdab89
0x98badcfe
0x10325476

So to find an MD5 implementation, it is necessary to scan for those dword values. Of course they can appear in any order (strangely enough, they are nearly always used in the order listed above). Now, as those dwords can also exist in just any binary by accident (although seldom), some smarter scanning is done: the values need to appear within a code block of limited size. The values can be in any order, and some fuzziness has also been added to scan for slightly altered init values.

Heuristic Scanning

Heuristic scanning is not yet implemented.
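The signature scan described above for MD5 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not lida's implementation: the function names, the 64-byte window and the "closest occurrence" rule are all choices made for the example, and the fuzzy matching of slightly altered init values is left out.

```python
import struct

# The four MD5 initialization values listed in the text.
MD5_INIT = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476)

def find_dword(data, value):
    """Yield every offset where value occurs as a little-endian dword."""
    needle = struct.pack("<I", value)
    pos = data.find(needle)
    while pos != -1:
        yield pos
        pos = data.find(needle, pos + 1)

def scan_for_constants(data, constants=MD5_INIT, window=64):
    """Return offsets where all constants occur, in any order, within window bytes."""
    hits = {c: list(find_dword(data, c)) for c in constants}
    if not all(hits.values()):
        return []
    found = []
    for anchor in hits[constants[0]]:
        # For every other constant take the occurrence closest to the anchor.
        nearest = [anchor] + [min(hits[c], key=lambda p: abs(p - anchor))
                              for c in constants[1:]]
        if max(nearest) - min(nearest) + 4 <= window:
            found.append(min(nearest))
    return sorted(set(found))

# Mock init routine: four 'mov dword [eax+disp8], imm32' instructions
# (opcode C7 40 disp8 imm32), surrounded by zero padding.
init_code = b"".join(b"\xc7\x40" + bytes([4 * i]) + struct.pack("<I", c)
                     for i, c in enumerate(MD5_INIT))
blob = bytes(100) + init_code + bytes(100)

# The same constants spread far apart should not count as a match.
scattered = b"".join(struct.pack("<I", c) + bytes(200) for c in MD5_INIT)

print(scan_for_constants(blob))
print(scan_for_constants(scattered))
```

The first scan reports the offset of the first constant inside the mock init routine; the second reports nothing, because the four values never fall inside one window.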
The heuristic scan is intended to find custom crypto code. Basically, it looks for sequences of suspicious opcodes which look like an encryption routine.

OllyDbg is an x86 debugger that emphasizes binary code analysis, which is useful when source code is not available. It traces registers, recognizes procedures, API calls, switches, tables, constants and strings, and locates routines from object files and libraries. According to the program's help file, version 1.10 is the final 1.x release. Version 2.0 is in development and is being written from the ground up. The software is free of cost, but the shareware license requires users to register with the author. The current version of OllyDbg cannot disassemble binaries compiled for 64-bit processors.