Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Videos
Audiobooks
Learning Hub
Free Learning
Arrow right icon
Mastering Assembly Programming
Mastering Assembly Programming

Mastering Assembly Programming: From instruction set to kernel module with Intel processor

Arrow left icon
Profile Icon Alexey Lyashko
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Half star icon Empty star icon 3.1 (8 Ratings)
Paperback Sep 2017 290 pages 1st Edition
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
Arrow left icon
Profile Icon Alexey Lyashko
Arrow right icon
$19.99 per month
Full star icon Full star icon Full star icon Half star icon Empty star icon 3.1 (8 Ratings)
Paperback Sep 2017 290 pages 1st Edition
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m
eBook
$27.98 $39.99
Paperback
$48.99
Subscription
Free Trial
Renews at $19.99p/m

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing
Table of content icon View table of contents Preview book icon Preview Book

Mastering Assembly Programming

Intel Architecture

-What languages do you usually use?
-C and Assembly. In fact, I love programming in Assembly.
-Hmmm... I would not have publicly admitted that...

When speaking about the Assembly language, people usually imagine a sort of unknown and dangerous beast, which obeys only the weirdest representatives of the programming community, or a gun that may only be used for shooting your own leg. Just as any prejudice, this one is rooted in ignorance and the primal fear of the unknown. The purpose of this book is not only to help you overcome this prejudice, but also to show how the Assembly language may become a powerful tool, a sharp lancet, that will help you perform certain tasks, even sophisticated ones, with elegance and relative simplicity, avoiding the unnecessary complications which are, sometimes, implied by high-level languages.

First of all, what is the Assembly language? To put it simply and precisely, we may safely define the Assembly language as symbolic or human readable machine code as each Assembly instruction translates into a single machine instruction (with a few exceptions). To be even more precise, there is no such thing as a single Assembly language as, instead, there are numerous Assembly languages--one per platform, where a platform is a programmable device. Almost any programmable device with a certain instruction set may have its own Assembly language, but this is not always so. Exceptions are devices such as, for example, NAND flash chips, which have their own command set, but have no means for fetching instructions from memory and executing them without implicitly being told to do so.

In order to be able to effectively use the Assembly language, one has to have a precise understanding of the underlying platform, as programming in the Assembly language means "talking" directly to the device. The deeper such understanding is, the more efficient is Assembly programming; however, we are not going to look at this in great detail, as this is beyond the scope of the book. One book would not be enough to cover each and every aspect of the specific architecture. Since we are going to concentrate on the Intel architecture during the course of this book, let's try to obtain at least a general understanding of Intel's x86/AMD64 architectures, and try to enrich it and make it deeper.

This chapter, in particular, covers processor registers and the functionality thereof and briefly describes memory organization (for example, segmentation and paging).

  • General purpose registers: Despite the fact that some of them have special meanings under certain circumstances, these registers, as the name of the group states, may be used for any purpose.
  • Floating point registers: These registers are used for floating point operations.
  • Segment registers: These registers are hardly accessed by applications (the most common case is setting up structured exception handlers on Windows); however, it is important to cover them here so we may better understand the way the CPU percives RAM. The part of the chapter that discusses segment registers also addresses a few memory organization aspects, such as segmentation and paging.
  • Control registers: This is a tiny group of registers of registers of high importance, as they control the behavior of the processor as well as enable or disable certain features.
  • Debug registers: Although registers of this group are mainly used by debuggers, they add some interesting abilities to our code, for example the ability to set hardware breakpoints when tracing execution of a program.
  • EFlags register: This is also known as the status register on some platforms. This one provides us with the information regarding the result of the latest arithmetic logic unit (ALU) operation performed, as well as some settings of the CPU itself.

Processor registers

Each programmable device, and Intel processors are not an exception, has a set of general purpose registers--memory cells located physically on the die, thus providing low latency access. They are used for temporary storage of data that a processor operates on or data that is frequently accessed (if the amount of general purpose registers allows this). The amount and bit size of registers on an Intel CPU vary in accordance with the current mode of operation. An Intel CPU has at least two modes:

  • Real mode: This is the good old DOS mode. When the processor is powered up, it starts in the real mode, which has certain limitations, such as the size of the address bus, which is only 20 bits, and the segmented memory space.
  • Protected mode: This was first introduced in 80286. This mode provides access to larger amount of memory, as it uses different memory segmentation mechanisms. Paging, introduced in 80386, allows even easier memory addressing virtualization.

Since about 2003, we also have the so-called long mode--64-bit registers/addressing (although, not all 64 bits are used for addressing yet), flat memory model, and RIP-based addressing (addressing relative to the instruction pointer register). In this book, we will work with 32-bit protected (there is such a thing as the 16-bit protected mode, but that is out of scope) and Long, which is a 64-bit mode of operation. The long mode may be considered a 64-bit extension of the protected mode, which evolved from 16-bit to 32-bit. It is important to know that registers accessible in the earlier mode are also accessible in the newer mode, meaning that the registers that were accessible in the real mode are also accessible in the protected mode, and that registers accessible in the protected mode would also be accessible in the long mode (if long mode is supported by the processor). There are a few exceptions regarding the bit width of certain registers and we will look at this soon in this chapter. However, since 16-bit modes (real and 16-bit protected modes) are no longer used by application developers (with minor possible exceptions), in this book, we will work on protected and long modes only.

General purpose registers

Depending on the mode of the operation (protected or long), there are 8 to 16 available general purpose registers in modern Intel processors. Each register is divided into subregisters, allowing access to data with a bit width lower than the width of the register.

The following table shows general purpose registers (further referred to as GPR):

Table 1: x86/x86_64 registers
All R* registers are only available in the long mode. Registers SIL, DIL, BPL, and SPL are only available in the long mode. Registers AH, BH, CH, and DH cannot be used in instructions that are not valid outside the long mode.

For convenience, we will refer to the registers by their 32-bit names (such as EAX, EBX, and so on) when we do not need to explicitly refer to a register of a certain bit width. The preceding table shows all general purpose registers available on the Intel platform. Some of them are only available in the long mode (all 64-bit registers, R* registers, and a few of the 8-bit registers) and certain combinations are not allowed. However, despite the fact that we can use those registers for any purpose, some of them do have a special meaning in certain circumstances.

Accumulators

The EAX register is also known as an accumulator and is used with multiplication and division operations, both as implied and target operands. It is important to mention that the result of a binary multiplication is twice the size of the operands and the result of a binary division consists of two parts (quotient and remainder), each of which has the same bit width as the operands. Since the x86 architecture began with 16-bit registers and for the sake of backward compatibility, the EDX register is used for storing partial results when the values of the operands are larger than could fit into 8 bits. For example, if we want to multiply two bytes, 0x50 and 0x04, we would expect the result to be 0x140, which cannot be stored in a single byte. However, since the operands were 8 bits in size, the result is stored into the AX register, which is 16 bits. But if we want to multiply 0x150 by 0x104, the result would need 17 bits to be stored (0x150 * 0x104 = 0x15540) and, as we have mentioned already, the first x86 registers were only 16 bits. This is the reason for using an additional register; in the case of the Intel architecture, this register is EDX (to be more precise, only the DX part would be used in this specific case). As a verbal explanation may sometimes be too generalized, it would be better to simply demonstrate the rule.

Operand size Source 1 Source 2 Destination
8 bits (byte) AL 8-bit register or 8-bit memory AX
16 bits (word) AX 16-bit register or 16-bit memory DX:AX
32 bits (double word) EAX 32-bit register or 32-bit memory EDX:EAX
64 bits (quad word) RAX 64-bit register or 64-bit memory RDX:RAX

Division implies a slightly different rule. To be more precise, this is the inverted multiplication rule, meaning that the result of the operation is half the bit width of the dividend, which in turn means that the largest dividend in the long mode may be 128-bit wide. The smallest dividend value remains the same as in the smallest value of the source operand in the case of multiplication--8 bits.

Operand size Dividend Divisor Quotient Remainder
8/16 bits AX 8-bit register or 8-bit memory AL AH
16/32 bits DX:AX 16-bit memory or 16-bit register AX DX
32/64 bits EDX:EAX 32-bit register or 32-bit memory EAX EDX
64/128 bits RDX:RAX 64-bit register or 64-bit memory RAX RDX

Counter

ECX register - also known as counter register. This register is used in loops as a loop iteration counter. It is first loaded with a number of iterations, and then decremented each time the loop instruction is executed until the value stored in ECX becomes zero, which instructs the processor to break out of the loop. We can compare this to the do{...}while() clause in C:

int ecx = 10; 
do
{
// do your stuff
ecx--;
}while(ecx > 0);

Another common usage of this register, actually the usage of its least significant part, CL, is in bitwise shift operations, where it contains the number of bits in which the source operand should be shifted. Consider the following code, for example:

mov eax, 0x12345
mov cl, 5
shl eax, cl

This would result in the register EAX being shifted 5 bits to the left (having the value of 0x2468a0 as a result).

Stack pointer

An ESP register is the stack pointer. This register, together with the SS register (the SS register is explained a bit later in this chapter), describes the stack area of a thread, where SS contains the descriptor of the stack segment and ESP is the index that points to the current position within the stack.

Source and destination indices

ESI and EDI registers serve as source and destination index registers in string operations, where ESI contains the source address and EDI, obviously, the destination address. We will talk about these registers a bit more in Chapter 3, Intel Instruction Set Architecture (ISA).

Base pointer

EBP. This register is called the base pointer as its most common use is to point to the base of a stack frame during function calls. However, unlike the previously discussed registers, you may use any other register for this purpose if needed.

Another register worth mentioning here is EBX, which, in the good old days of 16-bit modes (when it was still just a BX register), was one of the few registers that we could use as a base for addressing. Unlike EBP, EBX was (in the case of the XLAT instruction, which by default uses DS:EBX, still is) intended to point to a data segment.

Instruction pointer

There is one more special register that cannot be used for data storage--EIP (IP in the real mode or RIP in the long mode). This is the instruction pointer and contains the address of the instruction after the instruction currently being executed. All instructions are implicitly fetched from the code segment by the CPU; thus the full address of the instruction following the one being executed should be described as CS:IP. Also, there is no regular way to modify its content directly. It is not impossible, but we can't just use a mov instruction in order to load a value into EIP.

All the other registers have no special meaning from the processor's perspective and may be used for any purpose.

Floating point registers

The CPU itself has no means for floating point arithmetic operations. In 1980, Intel announced the Intel 8087 - the floating point coprocessor for the 8086 line. 8087 remained as a separate installable device until 1989, when Intel came up with the 80486 (i486) processor, which had an integrated 8087 circuit. However, when talking about floating point registers and floating point instructions, we still refer to 8087 as a floating-point unit (FPU) or, sometimes, still as a floating-point coprocessor (however, the latter is becoming more and more rare).

8087 has eight registers, 80 bits each, arranged in a stack fashion, meaning that operands are pushed onto this stack from the memory and results are popped from the topmost register to the memory. These registers are named ST0 to ST7 (ST--stack) and the most used one, that is, the ST0 register, may be referred to as simply ST.

The floating-point coprocessor supports several data types:

  • 80-bit extended-precision real
  • 64-bit double-precision real
  • 32-bit single-precision real
  • 18-digit decimal integer
  • 64-bit binary integer
  • 32-bit binary integer
  • 16-bit binary integer

The floating-point coprocessor will be discussed in more detail in Chapter 3, Intel Instruction Set Architecture (ISA).

XMM registers

The 128-bit XMM registers are part of the SSE extension (where SSE is short for Streaming SIMD Extension, and SIMD, in turn, stands for single instruction multiple data). There are eight XMM registers available in non -64-bit modes and 16 XMM registers in long mode, which allow simultaneous operations on:

  • 16 bytes
  • eight words
  • four double words
  • two quad words
  • four floats
  • two doubles

We will pay much more attention to these registers and the technology behind them in Chapter 5, Parallel Data Processing.

Segment registers and memory organization

Memory organization is one of the most important aspects of CPU design. The first thing to note is that when we say "memory organization", we do not mean its physical layout on memory chips/boards. For us, it is much more important how the CPU sees memory and how it communicates with it (on a higher level, of course, as we are not going to dive into the hardware aspects of the architecture).

However, as the book is dedicated to application programming, rather than operating system development, we will further consider the most relevant aspects of memory organization and access in this section.

Real mode

Segment registers are a rather interesting topic, as they are the ones that tell the processor which memory areas may be accessed and how exactly they may be accessed. In real mode, segment registers used to contain a 16-bit segment address. The difference between a normal address and segment address is that the latter is shifted 4 bits to the right when stored in the segment register. For example, if a certain segment register was loaded with the 0x1234 value, it, in fact, was pointing to the address 0x12340; therefore, pointers in real mode were rather offsets into segments pointed to by segment registers. As an example, let's take the DI register (as we are talking about a 16-bit real mode now), which is used with the DS (data segment) register automatically, and load it with, let's say, 0x4321 when the DS register is loaded with the 0x1234 value. Then the 20-bit address would be 0x12340 + 0x4321 = 0x16661. Thus, it was possible to address at most 1 MB of memory in real mode.

There are in total six segment registers:

  • CS: This register contains the base address of the currently used code segment.
  • DS: This register contains the base address of the currently used data segment.
  • SS: This register contains the base address of the currently used stack segment.
  • ES: This is the extra data segment for the programmer's use.
  • FS and GS: These were introduced with the Intel 80386 processor. These two segment registers have no specific hardware-defined function and are for the programmer's use. It is important to know that they do have specific tasks in Windows and Linux, but those tasks are operating system dependent only and have no connection to hardware specifications.

The CS register is used together with the IP register (the instructions pointer, also known as the program counter on other platforms), where the IP (or EIP in protected mode and RIP in long mode) points to the offset of the instruction in the code segment following the instruction currently being executed.

DS and ES are implied when using SI and DI registers, respectively, unless another segment register is implicitly specified in the instruction. For example, the lodsb instruction, although, it is written with no operands, loads a byte from the address specified by DS:SI into the AL register and the stosb instruction (which has no visible operands either) stores a byte from the AL register at the address specified by ES:DI. Using SI/DI registers with other segments would require explicitly mentioning those segments with the relevant segment register. Consider the following code, for example:

mov ax, [si] 
mov [es:di], ax

The preceding code loads a double word from the location pointed by DS:SI and stores it to another location pointed by ES:DI.

The interesting thing about segment registers and segments at all is that they may peacefully overlap. Consider a situation where you want to copy a portion of code to either another place in the code segment or into a temporary buffer (for example, for decryptor). In such a case, both CS and DS registers may either point to the same location or the DS register may point somewhere into the code segment.

Protected mode - segmentation

While it was all fine and simple in real mode, things become a bit more complicated when it comes to protected mode. Unfortunately, memory segmentation is still intact, but the segment register no longer contain addresses. Instead, they are loaded with the so-called selectors, which are, in turn, the indices into the descriptor table multiplied by 8 (shifted 3 bits to the left). The two least significant bits designate the requested privilege level (0 for kernel space to 3 for user land). The third bit (at index 2) is the TI bit (table indicator), which indicates whether the descriptor being referred is in a global descriptor table (0) or in a local descriptor table (1). The memory descriptor is a tiny 8-byte structure, which describes the range of physical memory, its access rights, and some additional attributes:

Table 2: Memory descriptor structure

Descriptors are stored in at least two tables:

  • GDT: Global descriptor table (used by the operating system)
  • LDT: Local descriptor table (per task descriptor table)

As we may conclude, the organization of memory in protected mode is not that different from that in real mode after all.

There are other types of descriptors--interrupt descriptors (stored in the interrupt description table (IDT)) and system descriptors; however, since these are in use in kernel space only, we will not discuss them, as that falls out of the scope of this book.

Protected mode - paging

Paging is a more convenient memory management scheme introduced in 80386 and has been a bit enhanced since then. The idea behind paging is memory virtualization--this is the mechanism that makes it possible for different processes to have the same memory layout. In fact, the addresses we use in pointers (if we are writing in C, C++, or any other high-level language that compiles into native code) are virtual and do not correspond to physical addresses. The translation of a virtual address into a physical address is implemented in hardware and is performed by the CPU (however, some operating system interventions are possible).

By default, a 32-bit CPU uses a two-level translation scheme for the derivation of a physical address from the supplied virtual one.

The following table explains how a virtual address is used in order to find a physical address:

Address bits Meaning
0 - 11 Offset into a 4 KB page
12 - 21 Index of the page entry in the table of 1024 pages
22 - 31 Index of the page table entry in a 1024-entries page directory
Table 3: Virtual address to physical address translation

Most, if not all, modern processors based on the Intel architecture also support Page Size Extension (PSE), which makes it possible to use the so-called large pages of 4 MB. In this case, the translation of a virtual address into a physical address is a bit different, as there is no page table any more. The following table shows the meaning of bits in a 32-bit virtual address:

Address bits Meaning
0 - 21 Offset into a 4 MB page
22 - 31 Index of the corresponding entry in a 1024-entries page directory
Table 4: Virtual address to physical address translation with PSE enabled

Furthermore, the Physical Address Extension (PAE) was introduced, which significantly changes the scheme and allows access to a much bigger range of memory. In protected mode, PAE adds a page directory pointer table of four entries and the virtual to physical address conversion would be as per the following table:

Address bits Meaning
0 - 11 Offset into a 4 KB page
12 - 20 Index of a page entry in the table of 512 pages
21 - 29 Index of a page table entry in a 512-entries page directory
30 - 31 Index of a page directory entry in a four-entries page directory pointer table
Table 5: Virtual to physical address translation with PAE enabled (no PSE)

Enabling PSE in addition to PAE forces each entry in the page directory to point directly to a 2 MB page instead of an entry in a page table.

Long mode - paging

The only address virtualization allowed in long mode is paging with PAE enabled; however, it adds one more table--the page map level 4 table as the root entry; therefore, the conversion of a virtual address to a physical address uses the bits of a virtual address in the way described in the following table:

Address bits Meaning
0 - 11 Offset into a 4 KB page
12 - 20 Index of a page entry in the table of 512 pages
21 - 29 Index of a page table entry in the page directory
30 - 38 Index of a page directory entry in the page directory pointer table
39 - 47 Index of a page directory pointer table in the page-map level 4 table
Table 6: Virtual to physical address translation in long mode

It is, however, important to mention that despite the fact that it is a 64-bit architecture, the MMU only uses the first 48 bits of the virtual address (also called the linear address).

The whole process of address resolution is performed by the memory management unit (MMU) in the CPU itself, and the programmer is only responsible for actually building these tables and enabling PAE/PSE. However, this topic is much wider than may be covered in a single chapter and falls a bit out of the scope of this book.

Control registers

Processors based on the Intel architecture have a set of control registers that are used for configuration of the processor at run time (such as switching between execution modes). These registers are 32-bit wide on x86 and 64-bit wide on AMD64 (long mode).

There are six control registers and one Extended Feature Enable Register (EFER):

  • CR0: This register contains various control flags that modify the basic operation of the processor.
  • CR1: This register is reserved for future use.
  • CR2: This register contains the Page Fault Linear Address when a page fault occurs.
  • CR3: This register is used when virtual addressing is enabled (paging) and contains the physical address of the page directory, page directory pointer table, or page map level 4 table, depending on the current mode of operation.
  • CR4: This register is used in the protected mode for controlling different options of the processor.
  • CR8: This register is new and is only available in long mode. It is used for prioritization of external interrupts.
  • EFER: This register is one of the several model-specific registers. It is used for enabling/disabling SYSCALL/SYSRET instructions, entering/exiting long mode, and a few other features. Other model-specific registers are of no interest for us.

However, these registers are not accessible in ring3 (user land).

Debug registers

In addition to control registers, processors also have a set of so-called debug registers, which are mostly used by debuggers for setting the so-called hardware breakpoints. These registers are in fact a very powerful tool when it comes to control over other threads or even processes.

Debug address registers DR0 - DR3

Debug registers 0 to 3 (DR0, DR1, DR2, and DR3) are used to store virtual (linear) addresses of the so-called hardware breakpoints.

Debug control register (DR7)

DR7 defines how the breakpoints set in Debug Address Registers should be interpreted by the processor and whether they should be interpreted at all.

The bits layout of this register is shown in the following table:

Table 3: DR7 bit layout

L* bits, when set to 1, enable breakpoint at the address which is specified in the corresponding Debug Address Register locally--within a task. These bits are reset by the processor on each task switch. G* bits, on the contrary, enable breakpoints globally--for all tasks, meaning that these bits are not reset by the processor.

The R/W* bits specify breakpoint conditions, as follows:

  • 00: Break on instruction execution
  • 01: Break when the specified address is accessed for writing only
  • 10: Undefined
  • 11: Break on either read or write access or when an instruction at the specified address is executed

The LEN* bits specify the size of a breakpoint in bytes, thus, allowing coverage of more than one instruction or more than one byte of data:

  • 00: Breakpoint is 1-byte long
  • 01: Breakpoint is 2-bytes long
  • 10: Breakpoint is 8-bytes long (long mode only)
  • 11: Breakpoint is 4-bytes long

Debug status register (DR6)

When an enabled breakpoint is triggered, the corresponding bit of the four low-order bits in DR6 is set to 1 before entering the debug handler, thus, providing the handler with information about the triggered breakpoint (bit 0 corresponds to the breakpoint in DR0, bit 1 to the breakpoint in DR1, and so on).

The EFlags register

It would have been impossible to write programs in any language for a given platform if the processor had no means to report its status and/or the status of the last operation. More than that, the processor itself needs this information from time to time. Try to imagine a processor unable to conditionally control the execution flow of a program--sounds like a nightmare, doesn't it?

The most common way for a program to obtain information on the last operation or on a certain configuration of an Intel-based processor is through the EFlags register (E stands for extended). This register is referred to as Flags in real mode, EFlags in protected mode, or RFlags in long mode.

Let's take a look at the meaning of the individual bits (also referred to as flags) of this register and its usage.

Bit #0 - carry flag

The carry flag (CF) is mostly used for the detection of carry/borrow in arithmetic operations and is set if the bit width result of the last such operation (such as addition and subtraction) exceeds the width of the ALU. For example, the addition of two 8-bit values, 255 and 1, would result in 256, which requires at least nine bits to be stored. In such a case, bit eight (the ninth bit) is placed into the CF, thus, letting us and the processor know that the last operation ended with carry.

Bit #2 - parity flag

The parity flag (PF) is set to 1 in case the number of 1s in the least significant byte is even; otherwise, the flag is set to zero.

Bit #4 - adjust flag

The adjust flag (AF) signals when a carry or borrow occurred in the four least significant bits (lower nibble) and is primarily used with binary coded decimal (BCD) arithmetics.

Bit #6 - zero flag

The zero flag (ZF) is set when the result of an arithmetic or bitwise operation is 0. This includes operations that do not store the result (for example, comparison and bit test).

Bit #7 - sign flag

The sign flag (SF) is set when the last mathematical operation resulted in a negative number; in other words, when the most significant bit of the result was set.

Bit #8 - trap flag

When set, the trap flag (TF) causes a single step interrupt after every executed instruction.

Bit #9 - interrupt enable flag

The interrup enable flag (IF) defines whether processor will or will not react to incoming interrupts. This flag is only accessible in real mode or at the Ring 0 protection level in other modes.

Bit #10 - direction flag

The direction flag (DF) controls the direction of string operations. An operation is performed from the lower address to the higher address if the flag is reset (is 0) or from the higher address to the lower address if the flag is set (is 1).

Bit #11 - overflow flag

The overflow flag (OF) is sometimes perceived as two's complement form of the carry flag, which is not really the case. OF is set when the result of the operation is either too small or too big a number to fit into the destination operand. For example, consider the addition of two 8-bit positive values, 0x74 and 0x7f. The resulting value of such an addition is 0xf3, which is still 8-bit, which is fine for unsigned numbers, but since we added two values that we considered to be signed, there has to be the sign bit and there are no more bits to store the 9-bit signed result. The same would happen if we try to add two negative 8-bit values, 0x82 and 0x81. The meaning of the addition of two negative numbers is in the fact subtraction of a positive number from a negative number, which should result in an even smaller number. Thus, 0x82 + 0x81 would result in 0x103, where the ninth bit, 1, is the sign bit, but it cannot be stored in an 8-bit operand. The same applies to larger operands (16, 32, and 64-bit).

Remaining bits

The remaining 20 bits of the EFlags register are not that important for us while in user-land except, probably, the ID bit (bit #21). The ID flag indicates whether we can or cannot use the CPUID instruction.

Bits 32 - 63 of the RFlags register in long mode would be all 0s.

Summary

In this chapter, we have briefly run through the basics of the internal structure of x86-based processors essential for the further understanding of topics covered in later chapters. Being a huge fan of Occam's Razor principle, yours truly had no intention to replicate Intel's programmer manual; however, certain topics covered in this chapter exceed the range of topics necessary for a successful start with Assembly language programming.

However, I believe that you would agree--we've had enough of dry information here and it is the right time to start doing something. Let's begin by setting up the development environment for Assembly language programming in Chapter 2, Setting Up a Development Environment.

Left arrow icon Right arrow icon
Download code icon Download Code

Key benefits

  • Understand the Assembly programming concepts and the benefits of examining the AL codes generated from high level languages
  • Learn to incorporate the assembly language routines in your high level language applications
  • Understand how a CPU works when programming in high level languages

Description

The Assembly language is the lowest level human readable programming language on any platform. Knowing the way things are on the Assembly level will help developers design their code in a much more elegant and efficient way. It may be produced by compiling source code from a high-level programming language (such as C/C++) but can also be written from scratch. Assembly code can be converted to machine code using an assembler. The first section of the book starts with setting up the development environment on Windows and Linux, mentioning most common toolchains. The reader is led through the basic structure of CPU and memory, and is presented the most important Assembly instructions through examples for both Windows and Linux, 32 and 64 bits. Then the reader would understand how high level languages are translated into Assembly and then compiled into object code. Finally we will cover patching existing code, either legacy code without sources or a running code in same or remote process.

Who is this book for?

This book is for developers who would like to learn about Assembly language. Prior programming knowledge of C and C++ is assumed.

What you will learn

  • • Obtain deeper understanding of the underlying platform
  • • Understand binary arithmetic and logic operations
  • • Create elegant and efficient code in Assembly language
  • • Understand how to link Assembly code to outer world
  • • Obtain in-depth understanding of relevant internal mechanisms of Intel CPU
  • • Write stable, efficient and elegant patches for running processes

Product Details

Country selected
Publication date, Length, Edition, Language, ISBN-13
Publication date : Sep 27, 2017
Length: 290 pages
Edition : 1st
Language : English
ISBN-13 : 9781787287488
Category :
Languages :

What do you get with a Packt Subscription?

Free for first 7 days. $19.99 p/m after that. Cancel any time!
Product feature icon Unlimited ad-free access to the largest independent learning library in tech. Access this title and thousands more!
Product feature icon 50+ new titles added per month, including many first-to-market concepts and exclusive early access to books as they are being written.
Product feature icon Innovative learning tools, including AI book assistants, code context explainers, and text-to-speech.
Product feature icon Thousands of reference materials covering every tech concept you need to stay up to date.
Subscribe now
View plans & pricing

Product Details

Publication date : Sep 27, 2017
Length: 290 pages
Edition : 1st
Language : English
ISBN-13 : 9781787287488
Category :
Languages :

Packt Subscriptions

See our plans and pricing
Modal Close icon
$19.99 billed monthly
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Simple pricing, no contract
$199.99 billed annually
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts
$279.99 billed in 18 months
Feature tick icon Unlimited access to Packt's library of 7,000+ practical books and videos
Feature tick icon Constantly refreshed with 50+ new titles a month
Feature tick icon Exclusive Early access to books as they're written
Feature tick icon Solve problems while you work with advanced search and reference features
Feature tick icon Offline reading on the mobile app
Feature tick icon Choose a DRM-free eBook or Video every month to keep
Feature tick icon PLUS own as many other DRM-free eBooks or Videos as you like for just $5 each
Feature tick icon Exclusive print discounts

Frequently bought together


Stars icon
Total $ 152.97
Mastering Assembly Programming
$48.99
Mastering Linux Kernel Development
$54.99
Linux Device Drivers Development
$48.99
Total $ 152.97 Stars icon
Banner background image

Table of Contents

11 Chapters
Intel Architecture Chevron down icon Chevron up icon
Setting Up a Development Environment Chevron down icon Chevron up icon
Intel Instruction Set Architecture (ISA) Chevron down icon Chevron up icon
Memory Addressing Modes Chevron down icon Chevron up icon
Parallel Data Processing Chevron down icon Chevron up icon
Macro Instructions Chevron down icon Chevron up icon
Data Structures Chevron down icon Chevron up icon
Mixing Modules Written in Assembly and Those Written in High-Level Languages Chevron down icon Chevron up icon
Operating System Interface Chevron down icon Chevron up icon
Patching Legacy Code Chevron down icon Chevron up icon
Oh, Almost Forgot Chevron down icon Chevron up icon

Customer reviews

Top Reviews
Rating distribution
Full star icon Full star icon Full star icon Half star icon Empty star icon 3.1
(8 Ratings)
5 star 25%
4 star 12.5%
3 star 37.5%
2 star 0%
1 star 25%
Filter icon Filter
Top Reviews

Filter reviews by




RPDevJesco Dec 13, 2017
Full star icon Full star icon Full star icon Full star icon Full star icon 5
This book covers all the main points of diving into the exciting world of assembly programming. I purchased the digital and physical copy of the book and wholeheartedly believe that it should be a book on every programmer's shelf.
Amazon Verified review Amazon
Hamlet Solano Diaz Jan 25, 2020
Full star icon Full star icon Full star icon Full star icon Full star icon 5
Complete
Amazon Verified review Amazon
Victor Jul 14, 2018
Full star icon Full star icon Full star icon Full star icon Empty star icon 4
This book gives an excellent introduction into low-level programming and assembly programming itself. It gives you just an idea of how everything works and so you'll not become a guru after reading this book, you'll just have something common in your mind, some idea you will be able to use later. There are another books with better quality of assembly programming teaching but this one is good for the beginners too.I liked it especially for this kind of introduction like "explain like I am five" - you'll not know everything in really low-level details but you'll be able to dive into it later. I also realized how this book is good after reading the "Low Level Programming" by Igor Zhirkov (Apress): you already know enough to read the latter one but the LLP is actually much more good in explaining details than the "Mastering Assembly Programming".As a summary, I'd suggest to start learning assembly programming from this book and then go to the "Low level programming" after.
Amazon Verified review Amazon
Rui Craveiro Jan 24, 2024
Full star icon Full star icon Full star icon Empty star icon Empty star icon 3
This book is really complete, but it is better as a reference material than a learning tool. The author failed to take the reader's hand and spend much more time walking him through the fundamentals instead of providing all the broad details the book provides. I suspended reading it for a more pedagogical alternative, but I will get back to it once I already master the basics.
Feefo Verified review Feefo
DNAunion Dec 26, 2017
Full star icon Full star icon Full star icon Empty star icon Empty star icon 3
It don't particularly like the organization. I would prefer it to start in chapter 1 (after getting an environment setup) with the basic "Hello World" program, and build from slowly and clearly from there. Instead, it starts by listing Intel CPU registers in chapter 1. By chapter 3 the discussion is not so easy to follow: the text does not explain well but instead gives short statements, apparently assuming you have some preexisting knowledge of assembly. Again, a better approach would be to build slowly upon the foundation of "Hello World" - adding one or two new instructions along the way, and explaining what each new instruction does - instead of jumping into listings of CPU registers and then listings of the instruction set.I realize the book is titled MASTERING ... not BEGINNING ..., but the beginning of the book states, "This book is primarily intended for developers wishing to enrich their understanding of low-level proceedings, but, in fact, there is no special requirement for much experience, although a certain level of experience is anticipated."
Amazon Verified review Amazon
Get free access to Packt library with over 7500+ books and video courses for 7 days!
Start Free Trial

FAQs

What is included in a Packt subscription? Chevron down icon Chevron up icon

A subscription provides you with full access to view all Packt and licnesed content online, this includes exclusive access to Early Access titles. Depending on the tier chosen you can also earn credits and discounts to use for owning content

How can I cancel my subscription? Chevron down icon Chevron up icon

To cancel your subscription with us simply go to the account page - found in the top right of the page or at https://subscription.packtpub.com/my-account/subscription - From here you will see the ‘cancel subscription’ button in the grey box with your subscription information in.

What are credits? Chevron down icon Chevron up icon

Credits can be earned from reading 40 section of any title within the payment cycle - a month starting from the day of subscription payment. You also earn a Credit every month if you subscribe to our annual or 18 month plans. Credits can be used to buy books DRM free, the same way that you would pay for a book. Your credits can be found in the subscription homepage - subscription.packtpub.com - clicking on ‘the my’ library dropdown and selecting ‘credits’.

What happens if an Early Access Course is cancelled? Chevron down icon Chevron up icon

Projects are rarely cancelled, but sometimes it's unavoidable. If an Early Access course is cancelled or excessively delayed, you can exchange your purchase for another course. For further details, please contact us here.

Where can I send feedback about an Early Access title? Chevron down icon Chevron up icon

If you have any feedback about the product you're reading, or Early Access in general, then please fill out a contact form here and we'll make sure the feedback gets to the right team. 

Can I download the code files for Early Access titles? Chevron down icon Chevron up icon

We try to ensure that all books in Early Access have code available to use, download, and fork on GitHub. This helps us be more agile in the development of the book, and helps keep the often changing code base of new versions and new technologies as up to date as possible. Unfortunately, however, there will be rare cases when it is not possible for us to have downloadable code samples available until publication.

When we publish the book, the code files will also be available to download from the Packt website.

How accurate is the publication date? Chevron down icon Chevron up icon

The publication date is as accurate as we can be at any point in the project. Unfortunately, delays can happen. Often those delays are out of our control, such as changes to the technology code base or delays in the tech release. We do our best to give you an accurate estimate of the publication date at any given time, and as more chapters are delivered, the more accurate the delivery date will become.

How will I know when new chapters are ready? Chevron down icon Chevron up icon

We'll let you know every time there has been an update to a course that you've bought in Early Access. You'll get an email to let you know there has been a new chapter, or a change to a previous chapter. The new chapters are automatically added to your account, so you can also check back there any time you're ready and download or read them online.

I am a Packt subscriber, do I get Early Access? Chevron down icon Chevron up icon

Yes, all Early Access content is fully available through your subscription. You will need to have a paid for or active trial subscription in order to access all titles.

How is Early Access delivered? Chevron down icon Chevron up icon

Early Access is currently only available as a PDF or through our online reader. As we make changes or add new chapters, the files in your Packt account will be updated so you can download them again or view them online immediately.

How do I buy Early Access content? Chevron down icon Chevron up icon

Early Access is a way of us getting our content to you quicker, but the method of buying the Early Access course is still the same. Just find the course you want to buy, go through the check-out steps, and you’ll get a confirmation email from us with information and a link to the relevant Early Access courses.

What is Early Access? Chevron down icon Chevron up icon

Keeping up to date with the latest technology is difficult; new versions, new frameworks, new techniques. This feature gives you a head-start to our content, as it's being created. With Early Access you'll receive each chapter as it's written, and get regular updates throughout the product's development, as well as the final course as soon as it's ready.We created Early Access as a means of giving you the information you need, as soon as it's available. As we go through the process of developing a course, 99% of it can be ready but we can't publish until that last 1% falls in to place. Early Access helps to unlock the potential of our content early, to help you start your learning when you need it most. You not only get access to every chapter as it's delivered, edited, and updated, but you'll also get the finalized, DRM-free product to download in any format you want when it's published. As a member of Packt, you'll also be eligible for our exclusive offers, including a free course every day, and discounts on new and popular titles.