Fri 20 May 2022

Memory in Assembly - How different areas of memory are treated at the metal

I thought it might make for an interesting learning exercise to see how memory in different areas is treated at the assembly level. The program itself is simple, but I hope it will provide some insight.

C Program

#include <stdlib.h>

void func_int(int x);
void func_const(int x);

const int constant_int = 0xc07;

int main() {
    static int static_int = 0x57a71c;
    int stack_int = 0x57ac3;
    int* malloc_int = malloc(1);
    malloc_int[0] = 0xa1;
    func_int(stack_int);
    func_int(*malloc_int);
    func_int(static_int);
    func_const(constant_int);
}

ARMv8 Assembly

stp         x29,x30,[sp, #local_20]!                
mov         x29,sp                                  
mov         w0,#0x7ac3                              
movk        w0,#0x5, LSL #16                        
str         w0,[sp, #stack_int+0x20]                
mov         x0,#0x1                                 
bl          __stubs:__stubs::_malloc                ;void * _malloc(size_t param_1)
str         x0,[sp, #malloc_int+0x20]               
ldr         x0,[sp, #malloc_int+0x20]               
mov         w1,#0xa1                                
str         w1,[x0]                                 
ldr         w0,[sp, #stack_int+0x20]                
bl          func_int                                ;void func_int(int x)
ldr         x0,[sp, #malloc_int+0x20]               
ldr         w0,[x0]                                 
bl          func_int                                ;void func_int(int x)
adrp        x0,0x100008000                          
add         x0,x0,#0x0                              
ldr         w0,[x0]=>__data:main::static_int        ;= 57A71Ch
bl          func_int                                ;void func_int(int x)
mov         w0,#0xc07                               
bl          func_const                              ;void func_const(int x)
mov         w0,#0x0                                 
ldp         x29=>local_20,x30,[sp], #0x20           
ret

Static Integer

You may notice that our static variable (0x57a71c) is not declared anywhere in the assembly above. This is because everything you see there is in the text segment, the read-only part of the program that holds the code. The text segment is labelled as such with an assembly directive.

Static variables exist for the lifetime of the program, but their value can change, so they can't be placed in the read-only text segment. Instead, variables that are static, initialised and writable are placed in the data segment.

// __data
// __DATA
// ram:100008000-ram:100008003
//

_static_int.0
main::static_int
    int     57A71Ch
//
//__DATA
//__DATA
// ram:100008004-ram:10000bfff

With Ghidra we can see that our binary carries the initial value of the static variable in the __data section of the __DATA segment, which is loaded into memory at launch. The variable also has a label (_static_int.0) that roughly matches the name in our C code.
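To see what "exists for the lifetime of the program but can change" means in practice, here is a minimal sketch. The function names next_static and next_stack are hypothetical and not part of the program above; the initial values are simply reused from it.

```c
/* The static counter lives in the data segment, so it keeps its value
   between calls. Its initial value is stored in the binary itself. */
int next_static(void) {
    static int static_counter = 0x57a71c;
    return static_counter++;
}

/* The stack counter is re-created (and re-initialised) on every call. */
int next_stack(void) {
    int stack_counter = 0x57ac3;
    return stack_counter++;
}
```

Calling next_static twice returns 0x57a71c and then 0x57a71d, while next_stack returns 0x57ac3 every time.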

Stack Integer

A stack variable is created by moving our value into a register, then storing the contents of that register into an area of memory offset from the stack pointer. Our value is too wide for a single 16-bit immediate, so it takes two instructions to build it: a mov for the low 16 bits and a movk for the bits above. The stack pointer is a special register that the CPU uses to tell us where our stack is. Because everything we store on the stack is a known size, the compiler works out the offsets and knows where things should be stored in relation to each other.

C

int stack_int = 0x57ac3;

Assembly

mov         w0,#0x7ac3               ; move the low 16 bits of our integer into register 0
movk        w0,#0x5, LSL #16         ; insert 0x5 into bits 16-31, giving 0x57ac3
str         w0,[sp, #stack_int+0x20] ; store integer into RAM address pointed to by
                                     ; stack pointer (sp) + the offset the compiler
                                     ; assigned to stack_int. Our number is now in
                                     ; stack memory
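The mov/movk pair can be modelled in C to check that the two 16-bit pieces really do add up to 0x57ac3. This is only a sketch of what the two instructions do to a 32-bit register; the helper names mov_imm16, movk_imm16 and build_stack_int are made up for illustration.

```c
#include <stdint.h>

/* mov wN,#imm16: write a 16-bit immediate, all other bits become zero. */
static uint32_t mov_imm16(uint16_t imm) {
    return imm;
}

/* movk wN,#imm16, LSL #shift: replace one 16-bit field and keep the rest. */
static uint32_t movk_imm16(uint32_t reg, uint16_t imm, unsigned shift) {
    uint32_t mask = (uint32_t)0xffff << shift;
    return (reg & ~mask) | ((uint32_t)imm << shift);
}

/* Mirrors: mov w0,#0x7ac3 followed by movk w0,#0x5, LSL #16 */
uint32_t build_stack_int(void) {
    uint32_t w0 = mov_imm16(0x7ac3);
    w0 = movk_imm16(w0, 0x5, 16);
    return w0;
}
```

build_stack_int returns (0x5 << 16) | 0x7ac3, which is 0x57ac3, the value assigned to stack_int in the C source.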

Heap Integer

A heap variable is created by asking for a given amount of memory to be allocated. The allocator behind malloc, which obtains memory from the kernel as needed, reserves a block in our address space and hands us back a pointer to it. The address we get back is stored on the stack, and we can store values through this address to put them on the heap.

C

int* malloc_int = malloc(1);

malloc_int[0] = 0xa1;

Assembly

mov         x0,#0x1                     ; Set first parameter of malloc to 1 (number of
                                        ; bytes to allocate)
bl          __stubs:__stubs::_malloc    ; call malloc
str         x0,[sp, #malloc_int+0x20]   ; Store return value of malloc (register 0)
                                        ; onto the stack (sp + offset)

ldr         x0,[sp, #malloc_int+0x20]   ; Load heap address (which is stored on stack)
                                        ; into register
mov         w1,#0xa1                    ; move our integer into register 1
str         w1,[x0]                     ; Store our integer into the heap address
                                        ; (integer is now on the heap)

Notice how the bottom two lines are effectively the same as what we did on the stack. Everything else is the extra overhead of using the heap, which is one of the reasons why it's easier to use the stack to store local variables. Another obvious one is that we don't have to free things on the stack, whereas in a real program we would also have to remember to free this memory.
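For comparison, here is a sketch of what the heap part might look like in a real program. It differs from the example above in three ways that are my own additions: it asks for sizeof(int) bytes rather than 1 (the str w1 instruction writes a full 4-byte word, which is more than the single byte requested), it checks the pointer malloc returns, and it frees the memory. The function name heap_round_trip is hypothetical.

```c
#include <stdlib.h>

/* Returns the value read back from the heap, or -1 if allocation failed. */
int heap_round_trip(void) {
    int *heap_int = malloc(sizeof *heap_int); /* room for a whole int */
    if (heap_int == NULL) {
        return -1;
    }
    *heap_int = 0xa1;       /* corresponds to str w1,[x0] */
    int value = *heap_int;  /* corresponds to ldr w0,[x0] */
    free(heap_int);         /* a stack variable needs no equivalent step */
    return value;
}
```

The store and load in the middle are the only lines with a stack counterpart; the allocation, the check and the free are the bookkeeping the heap adds.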

Calling a function with a stack variable

C

func_int(stack_int);

Assembly

ldr         w0,[sp, #stack_int+0x20] ; Set parameter 0 to the value stored in
                                     ;  stack_int's slot on the stack (sp + offset)
bl          func_int                 ; call function

Calling a function with a heap variable

C

func_int(*malloc_int);

Assembly

ldr         x0,[sp, #malloc_int+0x20] ; load the value stored at the malloc_int variable (our
                                      ; heap pointer) into register 0
ldr         w0,[x0]                   ; load the value at our heap address (the actual number) 
                                      ; into register 0 as the parameter to our function.                         
bl          func_int                  ; call function

Calling a function with a static variable

C

func_int(static_int);

Assembly

adrp        x0,0x100008000   ; Since static variables are in the data section,
                             ; they sit at a fixed distance from our code, and
                             ; so we don't need to use the stack pointer. adrp
                             ; computes the address of the 4 KB page holding
                             ; the variable relative to the program counter
add         x0,x0,#0x0       ; add the variable's offset within that page (0 here)
ldr         w0,[x0]          ; load the value at the data address from memory
                             ; into register 0 (our function parameter)
bl          func_int         ; call function
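The address arithmetic behind the adrp/add pair can be sketched in C. The listing doesn't show the address of the adrp instruction itself, so the program counter in the note after the code (0x100003f50) is a made-up value chosen only to make the numbers concrete. The function names adrp and static_int_address are also just for illustration.

```c
#include <stdint.h>

/* adrp: take the PC, clear its low 12 bits, then add a signed number of
   4 KB pages. The result is always page-aligned. */
uint64_t adrp(uint64_t pc, int64_t page_delta) {
    return (pc & ~(uint64_t)0xfff) + (uint64_t)(page_delta * 4096);
}

/* adrp x0,<page> followed by add x0,x0,#lo12 gives the full address. */
uint64_t static_int_address(uint64_t pc, int64_t page_delta, uint64_t lo12) {
    return adrp(pc, page_delta) + lo12;
}
```

With the assumed PC of 0x100003f50, a page delta of 5 and a low offset of 0, static_int_address returns 0x100008000, the address Ghidra shows for static_int. Because the result depends only on the distance between the code and the data, the same instructions work wherever the program is loaded.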

Calling a function with a constant variable

C

func_const(constant_int);

Assembly

mov         w0,#0xc07   ; A constant never changes, so we don't need to load it
                        ; from an address: the value is encoded in the
                        ; instruction itself, here in the text section with
                        ; the code, and is moved by value into register 0
bl          func_const  ; call function

Recap

With stack variables, we load everything relative to the stack pointer.
With heap variables, we first load the address we have on the stack, then load the value from that address.
With a static variable, we load the value from a known location relative to the program counter (in the data segment).
With a constant, we can use the value as is, because it is stored in the text of the assembly itself.