Thread: Full breakdown of high-level language to executed code

  1. #1
    Registered User
    Join Date
    Apr 2010
    Location
    Vancouver
    Posts
    132

Full breakdown of high-level language to executed code

Every time I think I know the full process of turning source code into the thing that gets run, it turns out I'm missing a step (this time it was microcode). So the complete hierarchy, soup to nuts, goes:

    Source code -> preprocessor -> compiler -> assembler -> linker -> OS loader -> microcoder?

A few questions:
1) Does the linker produce machine code?
2) What is the device that turns machine code into microcode called? Or is it just done by the CPU?
3) Am I missing any steps?
4) I used to think programming was giving human-readable source code to the compiler, which turned it into the 0s and 1s the computer understands. Is this 100% wrong? For example, is everything in a computer technically 0s and 1s, or is there actually a step where a distinction can be made?
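A minimal sketch of how the visible stages can be inspected with a typical toolchain (assuming gcc is installed; the file name is just an example):
Code:
/* hello.c -- a minimal program to follow through the toolchain.
 *
 *   gcc -E hello.c -o hello.i   stop after the preprocessor (still C text)
 *   gcc -S hello.i -o hello.s   stop after the compiler (assembly text)
 *   gcc -c hello.s -o hello.o   stop after the assembler (machine code in
 *                               a relocatable object file)
 *   gcc hello.o -o hello        the linker produces the executable image
 *   ./hello                     the OS loader maps it into memory; any
 *                               decoding into micro-ops happens inside the CPU
 */
#include <stdio.h>

int main(void)
{
    printf("hello\n");
    return 0;
}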

  2. #2
    Registered User
    Join Date
    Apr 2010
    Location
    Vancouver
    Posts
    132
Isn't it misleading that people say there is a "one-to-one correspondence between assembly language instructions and machine language instructions", given that there may not be a one-to-one correspondence between machine code and microcode? So you can't really say that, given the assembly for a program, you know exactly what the computer is going to do.

  3. #3
Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
1. Traditionally, the linker links machine code from object files to produce a binary image; the machine code was already generated by the assembler (see the sketch after this list). These days the line between compiler/assembler/linker is blurrier, with link-time code generation and whole-program optimizing compilers.
2. The CPU transforms machine instructions into some completely proprietary internal format called microcode. Or not. Different approaches are taken. The point is to translate the front-end instruction format into the signals necessary to drive the rest of the CPU components to implement that instruction. The device within the CPU doing that is called the instruction decoder (heh).
3. You aren't missing a step; you inserted an extra step of "microcoder." As said, the CPU does this internally.
4. I think it is wrong, in a way. The compiler doesn't really turn something that isn't 0's and 1's into 0's and 1's. Everything going on inside that box is just 0's and 1's all the time. The existence of anything else is just an illusion presented to you on the screen, all driven by hardware that works with nothing but on and off signals. The non-binary representation is all in your head.
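To make point 1 concrete, here is a minimal sketch of separate compilation and linking, assuming gcc (both files are shown in one listing; the file and function names are invented for illustration):
Code:
/* util.c -- compiled on its own:  gcc -c util.c  ->  util.o             */
int add(int a, int b)
{
    return a + b;
}

/* main.c -- compiled on its own:  gcc -c main.c  ->  main.o
 * main.o contains machine code already, but the call to add() is left
 * as an unresolved symbol; the linker patches it while combining the
 * objects into one image:         gcc main.o util.o -o app              */
int add(int a, int b);   /* declaration only; resolved at link time */

int main(void)
{
    return add(2, 3);
}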

  4. #4
    Registered User
    Join Date
    Apr 2010
    Location
    Vancouver
    Posts
    132
    Quote Originally Posted by brewbuck View Post
    2. The CPU transforms machine instructions into some completely proprietary internal format called microcode.
    But assembly is already architecture specific.

  5. #5
Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by c_weed View Post
    But assembly is already architecture specific.
    The two things serve different purposes. Machine language is expected to not change drastically over multiple revisions of a processor family, and it's publicly documented (otherwise nobody could use their CPU...) Microcode is the glue between the publicly known machine language and the actual way the processor works internally. It is thus very specific to the particular model of CPU in a way that a normal programmer is simply not interested in. We want the machine language/assembly language to be stable so we don't have to keep learning it over again. The manufacturer has to actually make the CPU *work*.

  6. #6
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
Microcode translates machine instructions into a series of operations within the circuits of the CPU. That allows the behaviour of application and operating system software to be independent of the internal electronic functions within the CPU.

    If software and operating systems had to work by directly triggering CPU electronics, even minor changes of the circuitry within a CPU would break software. Every operating system and every application program would have to be rebuilt from source for (almost) every change of CPU electronics.

Another thing that is possible with microcode is changing a CPU's instruction set without having to fiddle with the CPU electronics. So it is possible for a manufacturer to physically manufacture only one CPU type, but sell different CPUs that have completely different instruction sets. That gives economy of scale for device manufacturing.
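A toy illustration of that idea in C (this is C pretending to be silicon; the two-entry "ROM" and all the names are invented): the public opcodes index a microcode table, so the visible instruction set could be swapped by changing the table, without touching the micro-op machinery underneath.
Code:
#include <stdio.h>

enum micro_op { U_ADD, U_SUB, U_HALT };      /* internal micro-operations */

/* "Microcode ROM": each public opcode maps to a sequence of micro-ops.
 * Replace this table and the visible instruction set changes, while the
 * switch below (the "electronics") stays exactly the same.              */
static const enum micro_op microcode[][2] = {
    { U_ADD, U_HALT },                       /* public opcode 0: ADD */
    { U_SUB, U_HALT },                       /* public opcode 1: SUB */
};

static int execute(int opcode, int a, int b)
{
    for (const enum micro_op *u = microcode[opcode]; ; u++)
        switch (*u) {
        case U_ADD:  a = a + b; break;
        case U_SUB:  a = a - b; break;
        case U_HALT: return a;
        }
}

int main(void)
{
    printf("%d %d\n", execute(0, 2, 3), execute(1, 7, 4));  /* prints 5 3 */
    return 0;
}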

  7. #7
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
These kinds of technical questions, I think, lend themselves more to TechTalk than to a specific language board. Just an FYI. The language boards are mostly about discussing the languages themselves, and compilers, linkers, assemblers, CPUs, etc. are all outside the scope of the languages.

  8. #8
    Registered User
    Join Date
    Apr 2010
    Location
    Vancouver
    Posts
    132
    Quote Originally Posted by grumpy View Post
    If software and operating systems had to work by directly triggering CPU electronics, even minor changes of the circuitry within a CPU would break software. Every operating system and every application program would have to be rebuilt from source for (almost) every change of CPU electronics.
Couldn't this be achieved by high-level languages (such as C and C++), in the sense that they provide another layer of abstraction, so that a programmer only needs to know the high-level language and let the compiler worry about outputting assembly targeted at the specific platform?

  9. #9
    Registered User
    Join Date
    Apr 2010
    Location
    Vancouver
    Posts
    132
    Quote Originally Posted by brewbuck View Post
    The two things serve different purposes. Machine language is expected to not change drastically over multiple revisions of a processor family, and it's publicly documented (otherwise nobody could use their CPU...) Microcode is the glue between the publicly known machine language and the actual way the processor works internally. It is thus very specific to the particular model of CPU in a way that a normal programmer is simply not interested in. We want the machine language/assembly language to be stable so we don't have to keep learning it over again. The manufacturer has to actually make the CPU *work*.
So, in a nutshell, the reason is: while assembly is fairly specific compared to higher-level languages (e.g. specific to an architecture, like x86), microcode is EVEN MORE SPECIFIC, to the point that it can differ drastically even between generations of a processor (e.g. an Intel i5 with Sandy Bridge would have a different architecture than an Intel i5 with Ivy Bridge)?

  10. #10
    C++まいる!Cをこわせ!
    Join Date
    Oct 2007
    Location
    Inside my computer
    Posts
    24,654
    Quote Originally Posted by c_weed View Post
Couldn't this be achieved by high-level languages (such as C and C++), in the sense that they provide another layer of abstraction, so that a programmer only needs to know the high-level language and let the compiler worry about outputting assembly targeted at the specific platform?
Then the compiler vendors would have to rewrite their compilers every time the instruction set changed. How long did it take today's popular compilers to get to the point of producing good assembly?

    Quote Originally Posted by c_weed View Post
So, in a nutshell, the reason is: while assembly is fairly specific compared to higher-level languages (e.g. specific to an architecture, like x86), microcode is EVEN MORE SPECIFIC, to the point that it can differ drastically even between generations of a processor (e.g. an Intel i5 with Sandy Bridge would have a different architecture than an Intel i5 with Ivy Bridge)?
Microcode is most likely broken into very small, specific instructions, that's true.
However, I seriously doubt that it changes drastically between every generation. This is the same argument as with the high-level instruction set changing.
The internal architecture is made to work with the micro-ops the processor generates, so if those micro-ops change drastically, then the entire architecture needs to be reworked. Sounds like a lot of work, doesn't it?
Remember: hardware can also be seen as a software architecture--it is also written in a high-level language (VHDL and Verilog are the most popular), so there is a lot of legacy code, I'll bet.

  11. #11
Officially An Architect brewbuck
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by c_weed View Post
So, in a nutshell, the reason is: while assembly is fairly specific compared to higher-level languages (e.g. specific to an architecture, like x86), microcode is EVEN MORE SPECIFIC, to the point that it can differ drastically even between generations of a processor (e.g. an Intel i5 with Sandy Bridge would have a different architecture than an Intel i5 with Ivy Bridge)?
    Drastic is relative. Between one model and the next things might change a lot, or not change at all. The point is there needs to be an abstraction layer there in case it needs to change.

    We're talking about an implementation detail here. It is what it is. The only reason people like you and me even know of its existence is because sometimes it can be upgraded in case a bug is found.

  12. #12
    Registered User
    Join Date
    Apr 2013
    Posts
    1,658
    Wiki article for microcode:

    Microcode - Wikipedia, the free encyclopedia

Some older minicomputers used opcode bits to index into a "one bit per operation" table: for example, 7 bits of opcode indexing a 128-entry table of 80 bits each, with 79 operations plus 49 nops. Wiki describes a more generic form of this as horizontal microcode (a wide control store table).
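A rough sketch of that wide-control-store idea in C (every control-line name and table entry here is invented, and 64 bits stand in for the 80-bit words):
Code:
#include <inttypes.h>
#include <stdio.h>

/* One bit per control line in each wide microcode word.                 */
#define CTL_ALU_ADD    (UINT64_C(1) << 0)   /* drive the ALU adder       */
#define CTL_REG_WRITE  (UINT64_C(1) << 1)   /* latch the result register */
#define CTL_MEM_READ   (UINT64_C(1) << 2)   /* assert the memory read line */

/* A 7-bit opcode indexes the 128-entry control store directly.
 * Entries left at zero assert no control lines at all: nops.            */
static const uint64_t control_store[128] = {
    [0x03] = CTL_ALU_ADD | CTL_REG_WRITE,   /* an "add"-style entry      */
    [0x1a] = CTL_MEM_READ | CTL_REG_WRITE,  /* a "load"-style entry      */
};

int main(void)
{
    printf("opcode 0x03 -> control word 0x%" PRIx64 "\n", control_store[0x03]);
    return 0;
}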

  13. #13
    Registered User
    Join Date
    Jun 2005
    Posts
    6,815
    Quote Originally Posted by c_weed View Post
Couldn't this be achieved by high-level languages (such as C and C++), in the sense that they provide another layer of abstraction, so that a programmer only needs to know the high-level language and let the compiler worry about outputting assembly targeted at the specific platform?
    Nope. Because the executables would be specific to one CPU version, and maybe even just one CPU.

Any variation in the manufacturing process of any modern integrated circuit (a CPU, GPU, FPGA, DRAM, etc.) causes subtle variations in electrical characteristics within the device, and between devices. Such variation can be reduced by more precise fabrication techniques, but not eliminated (at least, not without changing our understanding of certain physical laws). Such variations existed for older CPUs, but they are more significant for modern CPUs that contain billions of very small transistors.

One of the things achieved by microcode is shielding software from such variations. I certainly wouldn't want to have two ostensibly identical computers but be forced to build the software differently for each one.

