Thread: Creating a long, specific, c string

  1. #1
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332

    Creating a long, specific, c string

    I need to create a c string that is 1100 bytes long, with a whole lot of blanks in the middle, for a boundary test.

    Is there a way, in C, to define this string at compile time vs generating what I need at runtime?

    I’m looking for something like a duplication factor for the 1098 blanks in between my start character and end character.

    Perhaps a macro?

    And no, I do not care to type, for example, 50 blanks and then concatenate them via copy/paste.

    Thanks.
    Mainframe assembler programmer by trade. C coder when I can.

  2. #2
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    I think you mean something like this:
    Code:
    #include <stdio.h>
    #include <string.h>
     
    #define SIZE 1100
     
    #define B10   "          "
    #define B50   B10 B10 B10 B10 B10
    #define B100  B50 B50
    #define B500  B100 B100 B100 B100 B100
    #define B1000 B500 B500
     
    int main() {
        // "x" + 1098 blanks + "y"
        char s[SIZE+1] = "x" B1000 B50 B10 B10 B10 B10 "        y";
     
        printf("%zu\n", strlen(s));
     
        return 0;
    }
    But I don't see why you can't do something like this (even though it's runtime) :
    Code:
        char s[SIZE + 1] = {0};
        memset(s, ' ', SIZE);
        s[0] = 'x';
        s[SIZE - 1] = 'y';
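    The macro approach above depends on the pieces adding up to exactly 1098 blanks, which is easy to get wrong by one. A minimal sketch of a compile-time guard, assuming a C11 compiler; the names B1 through B8, BOUNDARY_LEN, BOUNDARY_STR and boundary are made up for illustration and do not appear in the thread:

```c
#include <assert.h>   /* static_assert (C11) */
#include <string.h>

/* Build the blank runs by doubling, so no literal needs its spaces counted by eye. */
#define B1    " "
#define B2    B1 B1
#define B4    B2 B2
#define B8    B4 B4
#define B10   B8 B2
#define B50   B10 B10 B10 B10 B10
#define B100  B50 B50
#define B500  B100 B100 B100 B100 B100
#define B1000 B500 B500

/* Hypothetical names: 'x' + 1098 blanks + 'y' = 1100 characters. */
#define BOUNDARY_LEN 1100
#define BOUNDARY_STR "x" B1000 B50 B10 B10 B10 B10 B8 "y"

/* sizeof on a string literal counts the terminating '\0', hence the + 1. */
static_assert(sizeof(BOUNDARY_STR) == BOUNDARY_LEN + 1,
              "boundary string must be exactly 1100 characters");

static const char boundary[] = BOUNDARY_STR;
```

    If a piece is added or dropped by mistake, the build fails instead of the boundary test silently running with the wrong length.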
    A little inaccuracy saves tons of explanation. - H.H. Munro

  3. #3
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Quote Originally Posted by john.c View Post
    I think you mean something like this:
    Not super elegant, but it works well enough. Thanks!

  4. #4
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Just a tiny correction in John's example (which isn't wrong! just redundant)... This:
    Code:
    char s[SIZE + 1] = { 0 };
    Will fill the entire array with zeros, but this:
    Code:
    memset( s, ' ', SIZE );
    Will fill the array again, this time, with spaces.

    The compiler is not guaranteed to optimize the redundant zero-fill away. The faster code should be:
    Code:
    char s[SIZE + 1];    // no initializer
    memset( s, ' ', SIZE );
    s[SIZE] = '\0';
    
    s[0] = 'x';
    s[SIZE - 1] = 'y';
    PS: And his solution using macros is very nice!
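    The single-pass fill above can be wrapped in a small helper so that the length and the end characters become parameters. This is only a sketch; fill_boundary is a hypothetical name, and it assumes the caller supplies a buffer of at least len + 1 bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: writes first, then len - 2 blanks, then last,
   then a terminating '\0'. buf must hold at least len + 1 bytes. */
static void fill_boundary(char *buf, size_t len, char first, char last)
{
    assert(buf != NULL && len >= 2);

    memset(buf, ' ', len);   /* one pass over the payload, no prior zero-fill */
    buf[0] = first;
    buf[len - 1] = last;
    buf[len] = '\0';         /* only the terminator is set to zero */
}
```

    For the 1100-byte case this would be called as fill_boundary(s, 1100, 'x', 'y') with char s[1100 + 1].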

    []s
    Fred
    Last edited by flp1969; 08-31-2022 at 05:41 AM.

  5. #5
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    In the #define solution above, will the, for instance, B500 fields, since not directly used, be part of the executable, affecting the final size of the executable?

    In assembler, I code this:

    Code:
    Field     DC   C'X'
              DC   1098C' '
              DC   C'Y'
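    The closest C analogue to an assembler duplication factor that I know of is a range designator. Note that this is a GNU extension accepted by GCC and Clang, not standard C, so treat the following as a sketch under that assumption; the name field simply mirrors the assembler label:

```c
#include <assert.h>
#include <string.h>

/* GNU C extension: "[first ... last] = value" initializes a whole index range.
   The spaces around "..." are needed so the dots are not read as part of a number. */
static const char field[1100 + 1] = {
    [0]          = 'x',
    [1 ... 1098] = ' ',   /* the "1098C' '" part */
    [1099]       = 'y'
    /* field[1100] is not listed, so it is zero-initialized: the terminator */
};

static_assert(sizeof field == 1101, "1100 characters plus the terminating '\\0'");
```

    With a strictly conforming compiler (or -pedantic) this is diagnosed, in which case the macro or memset versions remain the portable choices.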

  6. #6
    Registered User
    Join Date
    Feb 2019
    Posts
    1,078
    Quote Originally Posted by Dino View Post
    In the #define solution above, will the, for instance, B500 fields, since not directly used, be part of the executable, affecting the final size of the executable?
    But they are directly used! And, yes, using a literal can increase the executable size, but not always. As an example, these two functions compile to the same code on my machine (using GCC 10.2):
    Code:
    #include <stdio.h>
    #include <string.h>
    
    #define B10 "          "
    #define B50 B10 B10 B10 B10 B10
    #define B100 B50 B50
    
    #define STRSIZE 100
    
    void f( void )
    {
      char s[STRSIZE+1] = B100;
    
      puts( s );
    }
    
    void g( void )
    {
      char s[STRSIZE+1];
    
      memset( s, ' ', sizeof s - 1 );
      s[sizeof s - 1] = '\0';
    
      puts( s );
    }
    Both compile to the same asm code at -O2:
    Code:
    $ cc -O2 -fomit-frame-pointer -fcf-protection=none -fno-stack-protector -c -o test.o test.c
    Code:
      section .text
    
    f:    ; and g:
      ; Reserve space for s.
      sub     rsp, 120
    
      ; Fill the array s.
      movdqa  xmm0, [spaces]
      mov     rdi, rsp
      mov     DWORD [rsp+96], 0x20202020
      mov     BYTE [rsp+100], 0
      movups  [rsp], xmm0
      movups  [rsp+16], xmm0
      movups  [rsp+32], xmm0
      movups  [rsp+48], xmm0
      movups  [rsp+64], xmm0
      movups  [rsp+80], xmm0
    
      call    puts wrt ..plt
    
      ; Dispose of s.
      add     rsp, 120
      ret
    
      section .rodata
      align 16
    spaces:
      times 2 dq 0x2020202020202020
    In other cases, like this:
    Code:
    #include <stdio.h>
    #include <string.h>
    
    #define STRSIZE 79
    
    #ifdef STATIC
    void f( void )
    {
      char s[STRSIZE+1] = "abcdefghijklmnopqrstuvwxyz"
                          "abcdefghijklmnopqrstuvwxyz"
                          "abcdefghijklmnopqrstuvwxyz";
    
      puts( s );
    }
    #else
    void f( void )
    {
      char s[STRSIZE+1];
      char *p;
    
      p = s;
      for ( int j = 0; j < 3; j++ )
        for ( int i = 'a'; i <= 'z'; i++ )
          *p++ = i;
      *p = '\0';    // terminate right after the 78 letters written
    
      puts( s );
    }
    #endif
    We have:
    Code:
    $ cc -DSTATIC -O2 -fomit-frame-pointer -fcf-protection=none -fno-stack-protector -c -o test1.o test.c
    $ cc -O2 -fomit-frame-pointer -fcf-protection=none -fno-stack-protector -c -o test2.o test.c
    $ objdump -x test1.o | sed -n '/^Sec/,/^SYM/p'
    Sections:
    Idx Name          Siz.      VMA               LMA               File off  Algn
      0 .text         00000051  0000000000000000  0000000000000000  00000040  2**4
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data         00000000  0000000000000000  0000000000000000  00000091  2**0
                      CONTENTS, ALLOC, LOAD, DATA
      2 .bss          00000000  0000000000000000  0000000000000000  00000091  2**0
                      ALLOC
      3 .rodata.cst16 00000050  0000000000000000  0000000000000000  000000a0  2**4
                      CONTENTS, ALLOC, LOAD, READONLY, DATA
      4 .comment      00000028  0000000000000000  0000000000000000  000000f0  2**0
                      CONTENTS, READONLY
      5 .note.GNU-stack 00000000  0000000000000000  0000000000000000  00000118  2**0
                      CONTENTS, READONLY
      6 .eh_frame     00000030  0000000000000000  0000000000000000  00000118  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
    SYMBOL TABLE:
    
    $ objdump -x test2.o | sed -n '/^Sec/,/^SYM/p'
    Sections:
    Idx Name          Siz.      VMA               LMA               File off  Algn
      0 .text         00000049  0000000000000000  0000000000000000  00000040  2**4
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, CODE
      1 .data         00000000  0000000000000000  0000000000000000  00000089  2**0
                      CONTENTS, ALLOC, LOAD, DATA
      2 .bss          00000000  0000000000000000  0000000000000000  00000089  2**0
                      ALLOC
      3 .comment      00000028  0000000000000000  0000000000000000  00000089  2**0
                      CONTENTS, READONLY
      4 .note.GNU-stack 00000000  0000000000000000  0000000000000000  000000b1  2**0
                      CONTENTS, READONLY
      5 .eh_frame     00000030  0000000000000000  0000000000000000  000000b8  2**3
                      CONTENTS, ALLOC, LOAD, RELOC, READONLY, DATA
    SYMBOL TABLE:
    Notice the extra .rodata.cst16 section in test1.o: its size is 0x50, i.e. 80 bytes (objdump prints sizes in hexadecimal).

    []s
    Fred
    Last edited by flp1969; 08-31-2022 at 07:44 AM.

  7. #7
    Registered User
    Join Date
    Dec 2017
    Posts
    1,633
    The only thing defines add is the textual replacement that they do. The defines themselves don't add anything to a C program. They completely disappear after the preprocessing step of compilation.
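    A small illustration of that point, as a sketch only: the names D10, D50, D500 and fifty are invented for this example. D500 is defined but never expanded, so nothing of it reaches the compiler proper; only the text that D50 expands to does:

```c
#include <assert.h>

#define D10  "0123456789"
#define D50  D10 D10 D10 D10 D10
#define D500 D50 D50 D50 D50 D50 D50 D50 D50 D50 D50   /* never expanded below */

/* After preprocessing this is five adjacent literals, which the compiler
   joins into one 50-character string plus the terminating '\0'. */
static const char fifty[] = D50;

static_assert(sizeof fifty == 51, "50 characters plus the terminating '\\0'");
```

    Running the preprocessor alone (cc -E) on such a file shows the expanded initializer and no trace of the #define lines.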

    @flp, I agree that it's silly to fill a large array with zeroes just to set the last element to zero!

  8. #8
    Jack of many languages Dino's Avatar
    Join Date
    Nov 2007
    Location
    Chappell Hill, Texas
    Posts
    2,332
    Quote Originally Posted by john.c View Post
    The only thing defines add is the textual replacement that they do. The defines themselves don't add anything to a C program. They completely disappear after the preprocessing step of compilation…
    That’s what I was expecting, but wasn’t certain. Thx
