Thread: Local Variable Storage

  1. #1
    Registered User
    Join Date
    Oct 2022
    Posts
    102

    Local Variable Storage

    Hello,
    I am working on understanding C memory layout using the GCC compiler on Ubuntu (installed in a virtual machine on Windows 11).

    Here is the C code I am working with:
    Code:
    #include <stdio.h>
    
    
    int global_var = 10;           // Global variable (initialized)
    int global_uninit_var;         // Global variable (uninitialized)
    
    
    void test_function() {
        int local_var = 20;        // Local variable
        static int static_var = 30; // Static local variable (initialized)
        printf("Local Variable: %d\n", local_var);
    }
    
    
    int main() {
        int main_local_var = 40;   // Local variable in main
        printf("Global Variable: %d\n", global_var);
        printf("Uninitialized Global Variable: %d\n", global_uninit_var);
        test_function();
        printf("Main Local Variable: %d\n", main_local_var);
        return 0;
    }

    Here is what I have done so far to examine the memory layout:


    1. Compiled the program with gcc

    Code:
    gcc -c memory_layout.c
    2. Ran the objdump command to examine the symbol table
    Code:
    objdump -t memory_layout.o
    Output from objdump:

    Code:
     memory_layout.o:     file format elf64-x86-64
    
    
    SYMBOL TABLE:
    0000000000000000 l    df *ABS*    0000000000000000 memory_layout.c
    0000000000000000 l    d  .text    0000000000000000 .text
    0000000000000000 l    d  .rodata  0000000000000000 .rodata
    0000000000000004 l     O .data    0000000000000004 static_var.0
    0000000000000000 g     O .data    0000000000000004 global_var
    0000000000000000 g     O .bss     0000000000000004 global_uninit_var
    0000000000000000 g     F .text    000000000000002f test_function
    0000000000000000         *UND*    0000000000000000 printf
    000000000000002f g     F .text    0000000000000075 main

    From my understanding:

    • Global variables like global_var (initialized) are stored in the .data segment, and uninitialized global variables like global_uninit_var are stored in the .bss segment.
    • Static local variables (like static_var in test_function) are also stored in the .data segment since they persist across function calls.


    However, I noticed that local variables (like local_var in test_function and main_local_var in main) do not appear in the symbol table. I believe this is because local variables are allocated dynamically in the stack segment at run time.

    How can I confirm or verify that a local variable is stored in stack memory?

  2. #2
    Salem
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,686
    Essentially correct, except that "like static_var in test_function" may end up in .bss if your initialiser was zero.

    Uninitialised globals are default initialised to zeros.
    Anything that ends up in .bss will be zeroed (because .bss is allocated and zeroed by the program loader).

    > How to confirm/ verify that local variable is stored in stack memory ?
    Disassemble the code (objdump -d) to see stack relative operations for local variables.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Oct 2022
    Posts
    102
    Quote Originally Posted by Salem View Post
    > How to confirm/ verify that local variable is stored in stack memory ?
    Disassemble the code (objdump -d) to see stack relative operations for local variables.
    Thank you for your earlier insights. I followed your suggestion and disassembled my code using objdump -d.

    I’m still unclear about where exactly the local variables are stored in memory based on the output.

    Code:
    objdump -d memory_layout.o
    
    memory_layout.o:     file format elf64-x86-64
    
    
    
    
    Disassembly of section .text:
    
    
    0000000000000000 <test_function>:
       0:	f3 0f 1e fa          	endbr64
       4:	55                   	push   %rbp
       5:	48 89 e5             	mov    %rsp,%rbp
       8:	48 83 ec 10          	sub    $0x10,%rsp
       c:	c7 45 fc 14 00 00 00 	movl   $0x14,-0x4(%rbp)
      13:	8b 45 fc             	mov    -0x4(%rbp),%eax
      16:	89 c6                	mov    %eax,%esi
      18:	48 8d 05 00 00 00 00 	lea    0x0(%rip),%rax        # 1f <test_function+0x1f>
      1f:	48 89 c7             	mov    %rax,%rdi
      22:	b8 00 00 00 00       	mov    $0x0,%eax
      27:	e8 00 00 00 00       	call   2c <test_function+0x2c>
      2c:	90                   	nop
      2d:	c9                   	leave
      2e:	c3                   	ret
    
    
    000000000000002f <main>:
      2f:	f3 0f 1e fa          	endbr64
      33:	55                   	push   %rbp
      34:	48 89 e5             	mov    %rsp,%rbp
      37:	48 83 ec 10          	sub    $0x10,%rsp
      3b:	c7 45 fc 28 00 00 00 	movl   $0x28,-0x4(%rbp)
      42:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 48 <main+0x19>
      48:	89 c6                	mov    %eax,%esi
      4a:	48 8d 05 00 00 00 00 	lea    0x0(%rip),%rax        # 51 <main+0x22>
      51:	48 89 c7             	mov    %rax,%rdi
      54:	b8 00 00 00 00       	mov    $0x0,%eax
      59:	e8 00 00 00 00       	call   5e <main+0x2f>
      5e:	8b 05 00 00 00 00    	mov    0x0(%rip),%eax        # 64 <main+0x35>
      64:	89 c6                	mov    %eax,%esi
      66:	48 8d 05 00 00 00 00 	lea    0x0(%rip),%rax        # 6d <main+0x3e>
      6d:	48 89 c7             	mov    %rax,%rdi
      70:	b8 00 00 00 00       	mov    $0x0,%eax
      75:	e8 00 00 00 00       	call   7a <main+0x4b>
      7a:	b8 00 00 00 00       	mov    $0x0,%eax
      7f:	e8 00 00 00 00       	call   84 <main+0x55>
      84:	8b 45 fc             	mov    -0x4(%rbp),%eax
      87:	89 c6                	mov    %eax,%esi
      89:	48 8d 05 00 00 00 00 	lea    0x0(%rip),%rax        # 90 <main+0x61>
      90:	48 89 c7             	mov    %rax,%rdi
      93:	b8 00 00 00 00       	mov    $0x0,%eax
      98:	e8 00 00 00 00       	call   9d <main+0x6e>
      9d:	b8 00 00 00 00       	mov    $0x0,%eax
      a2:	c9                   	leave
      a3:	c3                   	ret

  4. #4
    Registered User
    Join Date
    Apr 2021
    Posts
    148
    Quote Originally Posted by Kittu20 View Post
    I’m still unclear about where exactly the local variables are stored in memory based on the output.

    To answer your question, they're stored right here:

    Code:
       8:    48 83 ec 10              sub    $0x10,%rsp
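    Strictly speaking, that instruction only reserves the 16 bytes. The movl $0x14,-0x4(%rbp) right after it is what stores 20 into local_var, 4 bytes below the frame pointer %rbp. Here's a Wikipedia article that covers the material: Call stack.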

    Local variables don't have "a" location in memory, the way globals do. They might be used in a recursive function, meaning there might need to be an arbitrary number of the same local variable, one for each level of recursive call. Consider this strange definition:

    Code:
    int factorial(int n) {
        if (n == 0) return 1;
    
        int result = factorial(n - 1);
        result *= n;
        return result;
    }
    Now, in reality you probably wouldn't write the code that way, and the compiler would almost certainly optimize away the local variable. (Or convert it to a register, at least.)

    But if you had some kind of "do it exactly as I coded it" option on the compiler, you might get something like this for factorial(3):

    Code:
        factorial(3):
            result(local) = 2!    // This is on the call stack at some location, 0x100
            factorial(2):
                result(local) = 1!    // This is on the call stack at some location, 0xE0
                factorial(1):
                    result(local) = 0!    // This is on the call stack at some location, 0xC0
                    factorial(0):
                        immediate return, result is never initialized
    The recursive function will have (in my example) 4 stack frames, with three of them containing "initialized" values for the result local variable. (In fact, they won't be initialized until all the child calls return. But you see where I'm going, I hope.)

    So each local variable needs to have a separate existence, for each "level" of nested recursive call, all the way down until the exit condition is satisfied.

    Having "a" location in memory for a local variable won't do. There needs to be "a" location in memory for each separate instance of the local variable -- that is, for each nested recursive call, a separate copy or instance of the local variable(s).

    This is part of why they are stored on the stack. So there can be multiple instances that all exist at the same time.

    For a large machine (32-bit desktop, or bigger) the compiler will generally figure out how much space (in bytes) it needs for all local storage. Sometimes the same bytes will get re-used as a different variable. (This generally suggests your functions are too long! Having local variables die out and get recycled into different local variables suggests you have "changed topics" and maybe you need a separate function.)

    So the compiler just says "I need LS bytes of local storage" and subtracts that many bytes from the stack pointer (on Intel machines, the stack "grows" in the negative direction). So that "sub(tract)" operation was an efficient way to allocate $0x10 (= sixteen) bytes of local variables.

  5. #5
    Registered User rstanley
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,133
    Quote Originally Posted by Kittu20 View Post
    Thank you for your earlier insights. I followed your suggestion and disassembled my code using objdump -d.

    I’m still unclear about where exactly the local variables are stored in memory based on the output.
    You don't need to examine the details to prove where variables are located. The simple rules of C guarantee that:

    Non-static "Automatic" variables such as "uval" and "ival" defined in a function as in the following code, are located in the stack frame for the function call, and each call to that function creates different copies of those variables. The same goes for formal parameters of the function.

    The addresses of these variables would be different for each function call stack frame.

    Static variables, such as "call" in the code below, HAVE to be located in the global data segment, and only ONE copy of the static variable exists, shared by every call to the function!

    The address of "call" would be the exact same address in every call to the function!

    Code:
    void foo(int x, float f)
    {
      // uval & ival are stored in the stack frame for each function call
      int uval;              // Uninitialized variable (Do not do this!!!)
      int ival = 10;         // Properly initialized variable 
      static int call = 1;   // Static variable retains value (Global data storage)
    
      // ...
    }
    We just accept these rules, as with all the rest of the rules defined in the C Standards (C89/90, C99, C11, C17, and the upcoming C23), in a Standards Compliant compiler such as gcc. We don't need to prove them, we just use them in the programs we write.

    Also, please look at this Cheatsheet for the details of an ELF format executable.

  6. #6
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    959
    Quote Originally Posted by rstanley View Post
    You don't need to examine the details to prove where variables are located. The simple rules of C guarantee that:

    Non-static "Automatic" variables such as "uval" and "ival" defined in a function as in the following code, are located in the stack frame for the function call, and each call to that function creates different copies of those variables. The same goes for formal parameters of the function.
    I'd be careful calling them the "rules" of C. The C standards (C89, C99, C11, C17, C2x) are really the only official "rules" of C, and they make no mention of a stack at all. It's very common for implementations to use a stack (usually a hardware stack, or a software stack if there's no suitable hardware stack) for automatic variables and function arguments, but it's not guaranteed. C running on a Lisp Machine, for example, might use a list of objects in place of a stack.

    Side note: some implementations of C for systems that do have a hardware stack choose to use a software stack instead. The MOS 6502 has a hardware stack, but it's only 256 bytes in size, so CC65 uses a larger but slower software stack. SDCC for the 6502, on the other hand, uses the hardware stack but (I believe) only for automatic variables in recursive functions and for return addresses (and for saving short-lived temporary values inside functions); for non-recursive functions, it allocates automatic variables in static storage during compilation (it actually reuses the same space for multiple functions' automatic variables if it knows their lifetimes do not overlap).

  7. #7
    Registered User
    Join Date
    Oct 2022
    Posts
    102
    Thank you for your continued help. I wanted to clarify that my main goal is to inspect where different types of variables (global, local, static, extern, volatile, constant, pointer) and functions are stored in memory.


    From my research, I've found that we can examine this information through object files, map files, and ELF files, which reveal how memory is allocated across different sections like .data, .bss, and .text. I've been using tools like objdump and nm to view where global and static variables are stored.


    My primary interest is in inspecting where variables like local, extern, volatile, constant, pointers, and functions are stored in memory during runtime. I understand that we can also print variable addresses directly within the program to infer which memory region they belong to based on their address ranges.


    Tools like objdump and nm provide an overview of how memory is allocated at compile time (e.g., which variables go to .bss, .data, or .text segments). However, these tools don't show runtime behavior.

    While the code is running, tools like valgrind might be used to inspect the actual memory usage, including the stack and heap.


    Can someone confirm whether this approach covers everything I need for in-depth memory inspection? Are there any other tools or methods you would recommend for inspecting memory regions?

  8. #8
    Registered User rstanley
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,133
    I would suggest using the GNU debugger, gdb, rather than valgrind. There are many tutorials available online along with the GNU documentation. This would allow you to step through the execution of a program and see the details of the executable.

    There is also Code::Blocks, an IDE (Integrated Development Environment) that provides a GUI front end for gdb.

    A good up-to-date book on the C Programming Language would explain all the details of the language.

  9. #9
    Registered User rstanley
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,133
    Quote Originally Posted by christop View Post
    I'd be careful calling them the "rules" of C. The C standards (C89, C99, C11, C17, C2x) are really the only official "rules" of C, and they make no mention of a stack at all. It's very common for implementations to use a stack (usually a hardware stack, or a software stack if there's no suitable hardware stack) for automatic variables and function arguments, but it's not guaranteed. C running on a Lisp Machine, for example, might use a list of objects in place of a stack.

    Side note: some implementations of C for systems that do have a hardware stack choose to use a software stack instead. The MOS 6502 has a hardware stack, but it's only 256 bytes in size, so CC65 uses a larger but slower software stack. SDCC for the 6502, on the other hand, uses the hardware stack but (I believe) only for automatic variables in recursive functions and for return addresses (and for saving short-lived temporary values inside functions); for non-recursive functions, it allocates automatic variables in static storage during compilation (it actually reuses the same space for multiple functions' automatic variables if it knows their lifetimes do not overlap).
    C Language vs. Executable file format
    Apples vs. Oranges

    At no time did I discuss the implementation of the "stack", "stack frames", "heap", "code segment", "bss segment", etc... I was simply discussing the "Concepts" and "Rules" of the C Programming Language.

    Yes, the "Stack" is never mentioned in the C Standards, as it is NOT part of the language! The Operating System, and the Executable File Format defined for that O/S, define the details of how the executable is laid out, such as how the stack is implemented: hardware, software, "List of objects", etc... In the case of Linux, it is the ELF executable format, described in the "Cheatsheet" that I provided in my previous posting. Other O/S's may define the executable file format and memory layout much differently.

    As programmers, we don't care whether the stack is hardware, software, an object list, or any other "Implementation"! We expect that an automatic variable will work the same on ALL Standards Compliant compilers! The same goes for all other "features" and rules of the most portable language ever created.

  10. #10
    Registered User
    Join Date
    Sep 2024
    Posts
    10
    I was of the opinion that inspecting this using a debugger would be a good, if not better, idea.

  11. #11
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    959
    Quote Originally Posted by rstanley View Post
    At no time did I discuss the implementation of the "stack", "stack frames", "heap", "code segment", "bss segment", etc... I was simply discussing the "Concepts" and "Rules" of the C Programming Language.
    Right, but the "rules" of C are defined by the Standards, not by any particular implementation or "concept" of the language. There are just too many different implementations and sets of concepts of C to call any one of them the rule. (GCC isn't the rule, Clang isn't the rule, and MSVC very much isn't the rule, thank goodness!)

    If you had instead said that many (or most) implementations do have a concept of a stack and allocate automatic variables in a function's stack frame (since it's a relatively obvious way to implement automatic variables on a lot of hardware), I'd have no objection. But simply calling something a "rule" doesn't make it one. It's a slippery slope from here to the writings of Herbert Schildt.
