Academic Question About Compilers

This is a discussion on Academic Question About Compilers within the C Programming forums, part of the General Programming Boards category.

  1. #1
    Registered User
    Join Date
    Aug 2007
    Posts
    81

    Academic Question About Compilers

    How would something like this:

    int x;
    char hello[6] = "hello";
    x = atoi(hello);

    be compiled? I'm asking what the compiler will do internally to compile something like this. From generation of symbols, the code to pass parameters to library functions, how the stack is involved etc.

    Thanks a lot!

  2. #2
    Fountain of knowledge.
    Join Date
    May 2006
    Posts
    794
    Maybe there is an option on your compiler to produce the assembler code?

  3. #3
    Fountain of knowledge.
    Join Date
    May 2006
    Posts
    794
    Anyway, I was interested in having a go at producing the assembler code myself,
    so I tried it on this short program I wrote earlier.

    Code:
    #include <stdio.h>
    #include <string.h>
    #include <stdlib.h>
    
    main(argc,argv)
    	int  argc;
    	char *argv[]; 
    {
    	system("testbat");
    }
    Compiled with:


    Code:
    set DJGPP=C:\DJGPP\DJGPP.ENV
    set PATH=C:\DJGPP\BIN;%PATH%
    
    gcc -S  testb.c
    This produces the following, which may answer some of your question (or not).

    Code:
    	.file	"testb.c"
    	.section .text
    LC0:
    	.ascii "testbat\0"
    .globl _main
    _main:
    	pushl	%ebp
    	movl	%esp, %ebp
    	subl	$8, %esp
    	andl	$-16, %esp
    	movl	$0, %eax
    	addl	$15, %eax
    	addl	$15, %eax
    	shrl	$4, %eax
    	sall	$4, %eax
    	subl	%eax, %esp
    	subl	$12, %esp
    	pushl	$LC0
    	call	_system
    	addl	$16, %esp
    	leave
    	ret
    	.ident	"GCC: (GNU) 4.0.1"
    I'm not sure what all of that is about, lol.

    Code:
    	pushl	$LC0
    	call	_system
    This seems to be the passing of the parameter to the 'system' call. I think the other stuff
    has to do with the program arguments.

    Anyone know what 'leave' does?

  4. #4
    Registered User
    Join Date
    Oct 2001
    Posts
    2,129
    Code:
    main(argc,argv)
    	int  argc;
    	char *argv[];
    Don't use the pre-ANSI (K&R) style of function definition. Use a prototype-style definition instead:
    Code:
    int main(int argc,char *argv[] )
    {
         // code
    }
    Code:
    int x; // space for x is reserved on the stack
    char hello[6] = "hello"; // 6 bytes are reserved on the stack, and "hello" (with its terminating '\0') is copied into them
    x = atoi(hello); // the address of hello is passed to atoi (in a register or on the stack, depending on the calling convention), then atoi is called;
    // the value left in the return-value register (eax on 32-bit x86) is then stored into x
    This is just the usual way of doing things, not what a compiler must do. It may change with different optimizations or compilers.
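    As a rough sketch of the steps described above (not what any particular compiler emits), the same thing can be spelled out in C. The function names convert_hello and convert_digits are made up for illustration, and memcpy stands in for the copy that the array initializer implies.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: the original snippet wrapped in a function. */
int convert_hello(void)
{
    int x;                       /* storage for x in this function's frame */
    char hello[6];               /* 6 bytes of storage for the array */
    memcpy(hello, "hello", 6);   /* the initializer copies 'h','e','l','l','o','\0' */
    x = atoi(hello);             /* pass the array's address, store the returned int */
    return x;
}

/* Hypothetical variant with digits, so the conversion yields a nonzero value. */
int convert_digits(void)
{
    char digits[6] = "12345";
    return atoi(digits);
}
```

    Since "hello" contains no digits, convert_hello() returns 0, while convert_digits() returns 12345.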
    Last edited by robwhit; 02-20-2008 at 09:04 PM.

  5. #5
    Captain Crash brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,248
    Quote: Originally posted by esbo
    Code:
    	.file	"testb.c"
    	.section .text
    .file is an informational directive that tells the assembler the name of the original source file. .section .text tells the assembler to place the following data and instructions into the object section called ".text".

    Code:
    LC0:
    	.ascii "testbat\0"
    LC0 is a label generated by the compiler, used to reference the string literal "testbat" from your source code. The .ascii directive does not append a terminator by itself, so the compiler writes the trailing \0 explicitly. In other words, it stores the string literal and gives it the name "LC0".
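    A small way to see that terminator from the C side, assuming nothing beyond standard C (the function names are made up here):

```c
#include <stddef.h>
#include <string.h>

/* Size of the literal's storage: 7 characters plus the terminating '\0'. */
size_t literal_storage_size(void)
{
    return sizeof "testbat";
}

/* Length as seen by string functions: the terminator is not counted. */
size_t literal_length(void)
{
    return strlen("testbat");
}
```

    The one-byte difference between the two results is the \0 that shows up in the .ascii line.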

    Code:
    .globl _main
    _main:
    .globl tells the assembler that the symbol "_main" should be exported for linking.
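    To illustrate which symbols get exported, here is a minimal sketch with invented function names. With GCC, compiling something like this with -S should show a .globl line for the functions with external linkage and none for the static one, though the exact output depends on the compiler version and target.

```c
/* External linkage: visible to the linker, so a .globl directive is expected. */
int exported_answer(void)
{
    return 42;
}

/* Internal linkage: private to this file, so no .globl directive is expected. */
static int hidden_helper(void)
{
    return 41;
}

/* External linkage again; it may call the file-private helper freely. */
int exported_via_helper(void)
{
    return hidden_helper() + 1;
}
```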

    Code:
    	pushl	%ebp
    	movl	%esp, %ebp

    Standard function prologue. Save the caller's base pointer, then set the base pointer to the current top of the stack to start a new stack frame.

    Code:
    	subl	$8, %esp
    	andl	$-16, %esp
    	movl	$0, %eax
    	addl	$15, %eax
    	addl	$15, %eax
    	shrl	$4, %eax
    	sall	$4, %eax
    	subl	%eax, %esp
    	subl	$12, %esp
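    A lot of manipulation of the stack pointer. This creates space for local variables and makes sure the stack is aligned properly: andl $-16 rounds %esp down to a multiple of 16, and the %eax sequence computes (0 + 15 + 15) >> 4 << 4 = 16, which is then subtracted from %esp. The final subl $12 leaves room so that, once the 4-byte argument is pushed, the stack is still 16-byte aligned at the call. If you had compiled with optimization, this code would be much simpler.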
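    The arithmetic in that block can be checked with a small C model. This is only a sketch: the function names are invented, and esp_at_call simply replays the instructions from the listing on a plain 32-bit integer instead of a real stack pointer.

```c
#include <stdint.h>

/* andl $-16, %esp : clear the low four bits, rounding down to a multiple of 16. */
uint32_t align_down_16(uint32_t esp)
{
    return esp & ~(uint32_t)15;
}

/* addl $15 ; addl $15 ; shrl $4 ; sall $4 applied to the starting eax value. */
uint32_t rounded_adjustment(uint32_t eax)
{
    eax += 15;
    eax += 15;
    eax >>= 4;
    eax <<= 4;
    return eax;
}

/* Value of esp just before `call _system`, given esp on entry to main. */
uint32_t esp_at_call(uint32_t esp_on_entry)
{
    uint32_t esp = esp_on_entry;
    esp -= 4;                        /* pushl %ebp */
    esp -= 8;                        /* subl $8, %esp */
    esp = align_down_16(esp);        /* andl $-16, %esp */
    esp -= rounded_adjustment(0);    /* movl $0, %eax ... subl %eax, %esp */
    esp -= 12;                       /* subl $12, %esp */
    esp -= 4;                        /* pushl $LC0 */
    return esp;
}
```

    For example, rounded_adjustment(0) gives 16, and esp_at_call(0x2ffc) gives 0x2fd0, which is a multiple of 16.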

    Code:
    	pushl	$LC0
    Push a pointer to the string literal "testbat\0" onto the stack in preparation for the function call:

    Code:
    	call	_system
    Call the system() function.

    Code:
    	addl	$16, %esp
    Fix up the stack pointer, discarding the pushed parameter (4 bytes) together with the 12 bytes reserved just before the push. The rest of the frame is released by the leave instruction that follows.

    Code:
    	leave
    This undoes the effect of the function prologue: it is equivalent to movl %ebp, %esp followed by popl %ebp.
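    Here is a toy model of what the prologue and leave do to the two registers. It is a sketch only: struct toy_cpu and the toy_* functions are invented for illustration, and the "stack" is just a small array indexed by esp / 4.

```c
#include <stdint.h>

/* Toy model of the two registers and a small stack; not real machine state. */
struct toy_cpu {
    uint32_t esp;
    uint32_t ebp;
    uint32_t stack[64];   /* word at byte address a lives in stack[a / 4] */
};

static void toy_push(struct toy_cpu *cpu, uint32_t value)
{
    cpu->esp -= 4;
    cpu->stack[cpu->esp / 4] = value;
}

static uint32_t toy_pop(struct toy_cpu *cpu)
{
    uint32_t value = cpu->stack[cpu->esp / 4];
    cpu->esp += 4;
    return value;
}

/* pushl %ebp ; movl %esp, %ebp */
void toy_prologue(struct toy_cpu *cpu)
{
    toy_push(cpu, cpu->ebp);
    cpu->ebp = cpu->esp;
}

/* leave, modeled as movl %ebp, %esp ; popl %ebp */
void toy_leave(struct toy_cpu *cpu)
{
    cpu->esp = cpu->ebp;
    cpu->ebp = toy_pop(cpu);
}
```

    However much the code between the prologue and leave moves esp around, leave brings both registers back to the values they had on entry, which is why the function can then simply ret.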

    Code:
    	ret
    And return from main().

    Code:
    	.ident	"GCC: (GNU) 4.0.1"
    Identifies the compiler and its version. This is informational only.
