Thread: Why C only writes string literals in the .data section?

  1. #1
    Registered User
    Join Date
    Oct 2021
    Posts
    140

    Why C only writes string literals in the .data section?

    Hello, you lovely people. I wanted to ask a question about a behavior in the C programming language.

    So, in C, string literals are stored in the .data section of the file, and a pointer to them is used. Something like
    Code:
    char* name = "John";
    So, my question is: why is this the case (and is it, or am I missing something)? Why aren't values of other types that are known at compile time also pre-written in the data section, instead of being pushed onto the stack at runtime? Is it because string literals are read-only? Also, is there a performance hit with respect to caching when using the .data section (which will be loaded at runtime) versus the stack?

  2. #2
    Registered User
    Join Date
    Sep 2024
    Posts
    10
    Quote Originally Posted by rempas View Post

    So, in C, string literals are stored in the .data section of the file, and a pointer to them is used. Something like
    Code:
    char* name = "John";
    So, my question is: why is this the case (and is it, or am I missing something)? Why aren't values of other types that are known at compile time also pre-written in the data section, instead of being pushed onto the stack at runtime? Is it because string literals are read-only? Also, is there a performance hit with respect to caching when using the .data section (which will be loaded at runtime) versus the stack?
    I'm not sure about this, but...

    If the string literal is saved in the .data section, don't you have to push that address to the stack too? My guess would be that initializing variables is costly, especially if you have to keep doing it over and over again, for instance if the function is called many times.

    In fact, if the string is constant, as it is there, it is generally a good idea to declare the pointer 'static' and qualify it as 'const':

    Code:
    static const char *name = "John";
    Last edited by ReDress; 3 Weeks Ago at 07:51 AM.

  3. #3
    Registered User rstanley's Avatar
    Join Date
    Jun 2014
    Location
    New York, NY
    Posts
    1,132
    Quote Originally Posted by ReDress View Post
    I'm not sure about this, but...

    If the string literal is saved in the .data section, don't you have to push that address to the stack too? My guess would be that initializing variables is costly, especially if you have to keep doing it over and over again, for instance if the function is called many times.

    In fact, if the string is constant, as it is there, it is generally a good idea to declare the pointer 'static' and qualify it as 'const':

    Code:
    static const char *name = "John";
    No need for the static keyword.

    char *name = "John";
    name is a pointer to char, initialized with the address of the 'J' in a string that must not be altered.

    name = "Mary";
    name can be assigned a different address.

    char * const name = "John";
    name is a const pointer to char. name cannot be assigned a different address.

    char name[20] = "John";
    Now there are typically two copies of the string: one constant string, located in the data segment, and name, a NUL-terminated array (local to a function or global) that holds a copy of the string and can be altered.

    A good, thorough, up-to-date book on the C programming language would explain this and all other features of the language.

  4. #4
    Registered User
    Join Date
    Sep 2024
    Posts
    10
    Quote Originally Posted by rstanley View Post

    No need for the static keyword.
    When you write code, it is generally a good idea to follow the established conventions and guidelines for the programming language. These are established by community consensus, not by a book.

    For instance, this is Python code for filtering a list. You can do it in a million different ways, but this is the Pythonic way of doing it.

    Code:
    [x for x in a if x % 2 == 0]
    Thank you!

  5. #5
    Registered User
    Join Date
    Oct 2021
    Posts
    140
    Quote Originally Posted by ReDress View Post
    If the string literal is saved in the .data section, don't you have to push that address to the stack too? My guess would be that initializing variables is costly, especially if you have to keep doing it over and over again, for instance if the function is called many times.
    Yes. But you would only have to push the address instead of all the characters. If you have a string of 200 characters, that is pushing 200 bytes versus pushing 8 (addresses are 8 bytes long on a 64-bit system).

    So, that was my point when asking why other data types are not stored like that and it only works with strings. I want to learn whether there is a practical reason for that, or whether it's just historical: it happened to be like that and people kept it (which, when it comes to C, I know is the reason for A LOT of things).

  6. #6
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,684
    When your compiler sees
    char *n1 = "John";

    What it's doing behind your back is this.
    Code:
    static const char compiler_generated_name[] = { 'J', 'o', 'h', 'n', '\0' };
    char *n1 = compiler_generated_name;
    In other words, it's an initialised char array with a name you don't get to see.

    The (dis)advantage of writing "string" is that the compiler usually stays quiet about the missing const when you assign it to a plain char *.

    And because nobody's got time for this level of micro-management, BK/KT decided that when the compiler saw a "string", it would do all the work for you. Otherwise you'd be writing this:
    Code:
    const char m1[] = { 'H', 'e', 'l', 'l', 'o', ',', ' ', 
                        'w', 'o', 'r', 'l', 'd', '\n', '\0' };
    printf(m1);
    String constants normally end up in .rodata, not .data (that's why they're constants).

  7. #7
    Registered User
    Join Date
    May 2012
    Location
    Arizona, USA
    Posts
    959
    As rstanley mentioned, strings with automatic storage are declared like this inside a function:

    Code:
    char name[] = "John";
    (You can leave out the length and it'll be computed for you.)

    When that variable is initialized, the string may be copied from the .rodata section to the automatic variable. With a small string like "Joe", and depending on the platform and optimization level, GCC may instead initialize the string by pushing a 4-byte integer to the stack (in little-endian, it's 0x00656F4A for "Joe", that is, the bytes 'J', 'o', 'e', '\0' in memory order).

    Likewise, automatic variables of structure types may also be initialized from read-only memory or by pushing values directly to the stack. Take the following example:

    Code:
    struct { int i, j, k; } foo = { 1, 2 }; // k is 0
    The object foo may be copied from .rodata each time it's initialized, or the values 1, 2, and 0 (or in reverse order when the stack grows down) may be pushed directly to the stack.

    The point is, automatic variables of some types may be either copied from somewhere (typically .rodata) or be initialized with immediate values. It all depends on what the compiler decides to do, which generally depends on how big the type is, from what I've seen.
