Thread: Very basic computer science

  1. #1
    Registered User
    Join Date
    Jan 2005
    Posts
    183

    Very basic computer science

    Hi all. I have a bunch of questions, if you will kindly indulge me.
    I have always wanted to know in depth how computer systems work. I have done a little research and have discovered (please correct me if I'm wrong here) that Assembly is the most raw, low-level programming language there is. From material I have read on the internet, I get the impression that this is "one step up" from binary code. Is this correct?

    I shall try to elaborate. Say I have an executable file called 'example.exe'. I disassemble the file and find the following instruction (at the program's entry point):
    Code:
    shl     eax, 4
    This is Assembly code, correct? If I then paste this into an ASCII-to-binary converter, I get the following string of 1s and 0s:
    Code:
    0111001101101000011011000010000000100000001000000010000000100000011001010110000101111000001011000010000000110100
    Am I correct in thinking that every time 'example.exe' is run, this series of 1s and 0s is processed by the system's CPU?

    I have lots more questions, but I wish to understand the VERY basics first.
    Any help/links would be greatly appreciated.
    Thanks

  2. #2
    l'Anziano DavidP's Avatar
    Join Date
    Aug 2001
    Location
    Plano, Texas, United States
    Posts
    2,743
    An "ASCII to binary" converter is not the correct kind of converter to use. You will get an output of 1s and 0s, yes, but it is the wrong set of 1s and 0s: the converter encodes the characters you typed (the letters 's', 'h', 'l' and so on), not the instruction they name.

    However, assuming you convert the instruction correctly, then yes, that string of 1s and 0s is what gets thrown at the processor as the "instruction".

    If you want the correct conversion from Assembly to machine code, then you need to look it up in Intel's manuals or on some website that has the correct codes. A hex editor is also useful: type in the assembly language instruction, assemble it, and then look at the resulting file in a hex editor. There are many ways to get the correct binary code, but an "ASCII to binary" converter just isn't one of them.
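
    To make the difference concrete, here is a minimal Python sketch. The helper name to_bits is made up for illustration, and the bytes C1 E0 04 are an assumption not taken from the disassembly in this thread: they are the usual 32-bit x86 encoding of shl eax, 4. The sketch prints the bits of the typed text next to the bits of the machine code.

```python
def to_bits(data: bytes) -> str:
    """Return the bytes as space-separated groups of 8 bits."""
    return " ".join(f"{byte:08b}" for byte in data)

# What an "ASCII to binary" converter sees: 14 text characters
# ("shl", five spaces, "eax, 4"), one byte per character.
source_text = ("shl" + " " * 5 + "eax, 4").encode("ascii")

# What the CPU sees: the 3-byte encoding of the instruction
# (assumption: 32-bit x86, shift with an 8-bit immediate count).
machine_code = bytes([0xC1, 0xE0, 0x04])

print(len(source_text), "bytes of text:")
print(to_bits(source_text))
print(len(machine_code), "bytes of machine code:")
print(to_bits(machine_code))
```

    The first output is the same 112 bits posted above (just grouped into bytes); the second is only 24 bits long and has nothing in common with it.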

    "Circular logic is good because it is."

  3. #3
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Not quite. But you are very close, Necrofear:

    Assembly is the most raw, low level programming language there is. From material I have read on the internet, I get the impression that this is "one step up" from binary code. Is this correct?
    Yes. That's basically it.

    This is Assembly code, correct?
    Yes. It's a shift left instruction (shl), ordering the contents of the eax register to shift left by 4 bits.
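
    As a rough illustration of what that shift does to the value, here is a small Python sketch. The function name shl32 is hypothetical, and it assumes a 32-bit register and a count between 0 and 31; it ignores the CPU flags that the real instruction also updates.

```python
def shl32(value: int, count: int) -> int:
    """Shift left like a 32-bit register: bits pushed past bit 31 are lost."""
    return (value << count) & 0xFFFFFFFF

print(shl32(3, 4))                 # 48, the same as 3 * 16
print(hex(shl32(0xF0000001, 4)))   # 0x10, the top four bits fell off
```

    So shifting left by 4 multiplies by 16, as long as nothing overflows out of the register.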

    If I were to then paste this into an ASCII to binary converter [...]
    Am I correct in thinking that every time 'example.exe' is run, this series of 1s and 0s is processed by the system's CPU?
    Nope. You do not convert the ASCII representation into binary. What you are looking at is assembly source code. The assembler reads that source code and produces the output file more or less like this:

    The instruction SHL is a mnemonic for an opcode. For a shift by an immediate count such as 4, the opcode is C1 (D1 is the form that shifts by exactly one bit). C1 is the hexadecimal representation of the opcode, which translates to 11000001 in binary. (I don't know much about x86 assembly, so treat this as an outline.)

    In 32-bit code, the opcode is followed by a second byte that selects the SHL operation and the EAX register (E0 in hexadecimal, 11100000 in binary), and then by the shift count 4 as a one-byte value (00000100). The whole instruction is therefore the three bytes C1 E0 04.

    The end string of 1s and 0s is the actual binary code that gets sent to the processor.
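
    The assembler's job for this one instruction can be sketched in a few lines of Python. This is only an illustration under stated assumptions: 32-bit x86, the immediate-count form of SHL, and a hypothetical function name encode_shl. A real assembler handles many more forms and prefixes.

```python
# Register numbers as used in the second (ModR/M) byte, 32-bit registers.
REGISTERS = {"eax": 0, "ecx": 1, "edx": 2, "ebx": 3,
             "esp": 4, "ebp": 5, "esi": 6, "edi": 7}

def encode_shl(register: str, count: int) -> bytes:
    """Encode `shl <register>, <count>` (32-bit x86, immediate-count form)."""
    if not 0 <= count <= 255:
        raise ValueError("the count must fit in one byte")
    opcode = 0xC1  # shift group, count given as an 8-bit immediate
    # ModR/M byte: top two bits 11 mean "the operand is a register",
    # the middle three bits 100 select SHL within the shift group,
    # and the low three bits are the register number.
    modrm = (0b11 << 6) | (0b100 << 3) | REGISTERS[register]
    return bytes([opcode, modrm, count])

print(encode_shl("eax", 4).hex(" "))   # c1 e0 04
```

    Writing each of those three bytes out in binary gives the 24 bits that end up in the executable file.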
    Last edited by Mario F.; 03-13-2010 at 04:44 PM.
    Originally Posted by brewbuck:
    Reimplementing a large system in another language to get a 25% performance boost is nonsense. It would be cheaper to just get a computer which is 25% faster.

  4. #4
    Registered User
    Join Date
    Jan 2005
    Posts
    183
    Thank you both, those were very helpful answers.

    Just to clarify, when creating an asm program, an assembler converts the asm code to binary, which is stored in the executable file (as opposed to the asm-to-binary conversion occurring at runtime). Is that right?

    I have considered familiarising myself with asm, but I have read in quite a few places that it is very processor dependent. To my understanding, this is due to different processors having different instruction sets. Then I started wondering about high-level languages, which are more portable (according to Sir Internet). I don't understand how, and I would greatly appreciate it if someone could point me in the right direction.

    As far as I'm aware, high-level languages are converted to machine code, just like assembly. Would this machine code not also present the processor with a series of instructions? And would the processor's ability to process these instructions not depend on the instruction set it has? Thus, asm and a high-level language should be equally unportable.

    I'm sure I've got the wrong end of the stick somehow, so any help you could offer would be great.
    Thanks again for your time

  5. #5
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    Just to clarify, when creating an asm program, an assembler converts the asm code to binary, which is stored in the executable file (as opposed to the asm-to-binary conversion occurring at runtime). Is that right?
    That is right.

    ASM is not portable, because instruction sets differ between processors. High-level languages allow you to write your programs in a more abstract way (in terms of variables, mathematical operations, etc.).

    A compiler converts the high-level code to asm/machine code for your target processor.

    A compiler needs to be written (or at least extended) for each processor type. Some compilers support more than one target, but let's ignore that for now. Each one takes in the same high-level source code and generates different asm for its target(s).
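
    A toy sketch of that idea in Python: one "source expression" (x * 16), two targets, two completely different byte sequences. Everything here is illustrative. The names BACKENDS and compile_times_16 are made up, and the encodings are hand-assembled under the assumption of 32-bit x86 and classic 32-bit ARM stored little-endian. A real compiler generates code rather than looking it up in a table.

```python
# Toy "back ends": hand-assembled encodings of "shift a register left by 4".
# Assumptions: 32-bit x86, and classic 32-bit ARM (A32) stored little-endian.
BACKENDS = {
    "x86": ("shl eax, 4", bytes([0xC1, 0xE0, 0x04])),
    "arm32": ("lsl r0, r0, #4", bytes([0x00, 0x02, 0xA0, 0xE1])),
}

def compile_times_16(target):
    """'Compile' the source expression x * 16 for one target.

    Returns a pair: (assembly text, machine code bytes).
    """
    return BACKENDS[target]

for target in BACKENDS:
    assembly, machine_code = compile_times_16(target)
    print(f"{target:6} {assembly:16} {machine_code.hex(' ')}")
```

    The source expression never changes; only the table entry picked for the target does.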

  6. #6
    Registered User
    Join Date
    Jan 2005
    Posts
    183
    Ahh I see. Thanks Cyberfish, I think I understand.

    So if I compiled a high level language program on one machine and transferred the executable file to another machine, then it may not run if the processor types are different? What would happen exactly... Would the program run until it reached the command that wasn't supported by the processor and then crash? Would it just miss out that section of code? Or would the system somehow recognise that an operation within the code is not supported by the processor and not execute the program at all? Or does it vary between operating systems/processors?

    I'm sorry for so many questions. I'm still unsure whether to try to learn asm or just stick to high-level languages; I'm just trying to weigh up the pros and cons of each.
    Many thanks.

  7. #7
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    So if I compiled a high level language program on one machine and transferred the executable file to another machine, then it may not run if the processor types are different?
    Correct.

    Opcodes (the binary codes for instructions) differ so much between processors that running code written for another processor is like interpreting Japanese using English grammar. It won't make any sense. In theory, the processor would do essentially random things and likely crash within the first few instructions.

    In the real world, an executable contains more than just the machine code. It also contains OS-specific headers that tell the OS things about the program: its size, which DLLs it needs, a checksum, the processor type it was built for, etc. The OS will most likely refuse to run the program at all.
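
    Here is a minimal sketch of the kind of check a loader can make, written in Python. It only looks at the first few bytes of a file, and the function name identify_executable and the small machine table are assumptions for illustration. The magic numbers and the ELF field offsets are the standard ones.

```python
import struct

# A few well-known values of the ELF "machine" field.
ELF_MACHINES = {0x03: "x86", 0x3E: "x86-64", 0x28: "ARM", 0xB7: "AArch64"}

def identify_executable(header: bytes) -> str:
    """Guess the file format (and, for ELF, the CPU) from the first bytes."""
    if header[:4] == b"\x7fELF" and len(header) >= 20:
        # Byte 5 says whether multi-byte fields are little- or big-endian.
        byte_order = "<" if header[5] == 1 else ">"
        # The 16-bit machine field sits at offset 18.
        (machine,) = struct.unpack_from(byte_order + "H", header, 18)
        return "ELF for " + ELF_MACHINES.get(machine, f"machine {machine:#x}")
    if header[:2] == b"MZ":
        return "Windows/DOS executable (MZ or PE)"
    return "unknown format"

# A made-up 20-byte header that looks like a 64-bit little-endian ELF file.
fake_elf = b"\x7fELF" + bytes([2, 1, 1]) + bytes(9) + struct.pack("<HH", 2, 0x3E)
print(identify_executable(fake_elf))   # ELF for x86-64
```

    To try it on a real file, read the first 64 bytes in binary mode and pass them to the function.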

    As for learning asm vs high-level languages: if you have never done any programming before, I strongly suggest going with a high-level language. Most people only learn asm after they have had significant experience with high-level languages, if at all (most programming jobs don't require asm). ASM is a lot more low level and tedious, and will keep you from seeing the "big picture" (general programming concepts).

    Nowadays people only write asm if no compiler is available, for low level hardware access, or they need every last drop of performance (like a function that's called a few million times / second in a game).

  8. #8
    Registered User
    Join Date
    Jan 2005
    Posts
    183
    Thanks again, cyberfish. I will certainly stick to high-level languages for now (I already know a little C++). However, I am fascinated by reverse engineering, and I believe that asm knowledge is important in that area, so I will have to learn it at some point. But I shall take your advice and become proficient in C++ before I move on to asm.

    I have one more quick question about portability. When I read the last reply, I found myself wondering: "If I compiled a (C++) program on my system, then attempted to run it on 10,000 other, random systems, how many processors would support the code?" I guess I'm asking how many different instruction sets there are for processors, and how likely it is that code compiled on my computer would not run correctly on another computer (one that uses a different type of processor). If all processors are different, how do professional software developers overcome portability issues?

    Thanks for all the help, I've learnt a lot.

  9. #9
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    They don't have to be the same processor, just the same architecture.

    All PCs (desktops, laptops, Intel, AMD, etc.) are x86, so your code will run on all of them, provided you don't use more advanced features like SSE that are only available on later CPUs.

    A different architecture would be something like older (non-Intel) Macs, your cellphone, PDA, mainframes, the microcontroller in your dishwasher, etc.

    In reality, though, having the same architecture is not enough. You also need the same OS. Unfortunately, that means compiling your code separately for every OS you want it to run on.

  10. #10
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    It depends on what the program does. The source code of your typical "hello world" will compile and run on any architecture and operating system, as long as the C standard I/O library exists there, which on a normal computer with a screen and keyboard it will.

    One of the primary purposes of C/C++ is to provide a highly portable, platform-independent means of programming at a reasonably low level, and it does. Platform-specific things tend to involve the filesystem or some high-level interface.
    Last edited by MK27; 03-13-2010 at 08:24 PM.

  11. #11
    (?<!re)tired Mario F.'s Avatar
    Join Date
    May 2006
    Location
    Ireland
    Posts
    8,446
    Originally Posted by Necrofear:
    If all processors are different, how do professional software developers overcome portability issues?
    To be clear, in addition to what has been said (it's the processor architecture that matters, not the individual processor, and you try to base your code on the standard libraries), for the most part software developers rely on the compiler to do that for them.

    If you create a C or C++ program, you can compile that program on Windows. If you wish to run it on Linux, you use a Linux compiler, which compiles your code to a form understandable by that operating system. Likewise, if you compile for x86 processors, you use a compiler that targets x86. But if you want to run the program on a VAX, you use a compiler that generates code for VAX processors.

    If your program compiles on one system or operating system but not on another, your program is said to be not portable. This can happen because your C or C++ code uses features that are specific to a certain environment, or libraries that don't exist in the other environment. Programmers faced with this situation choose one of two options:

    - They change their code so that it uses portable features only, that is, features that compile on all the systems they intend the program to run on. There's plenty that can be done here: using features that instruct the compiler to use different portions of the code depending on the system, using libraries that have been ported to the other systems, careful use of standard language features only, etc.

    - They branch the program. One branch is for one system and another is for the other system. That's two programs being developed separately.
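
    The first option can be sketched as follows. In C or C++ the choice between system-specific portions of code is usually made at compile time with preprocessor conditionals; this Python sketch makes the same kind of choice at run time instead, so take it as an analogy only. The function name config_dir and the application name "example" are hypothetical.

```python
import os
import sys

def config_dir(app_name: str, platform: str = sys.platform) -> str:
    """Pick a per-user settings directory using each system's convention."""
    home = os.path.expanduser("~")
    if platform.startswith("win"):
        # Windows convention: a folder under %APPDATA%.
        base = os.environ.get("APPDATA", home)
        return os.path.join(base, app_name)
    if platform == "darwin":
        # macOS convention.
        return os.path.join(home, "Library", "Application Support", app_name)
    # Linux and other Unix-like systems: a dot-directory under the home folder.
    return os.path.join(home, "." + app_name)

print(config_dir("example"))
```

    The rest of the program calls config_dir and never needs to know which system it is running on; the system-specific part stays in one place.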

    Portability is one of the big issues of software development. But it isn't always a necessary one. It's also conceivable, common, generally accepted, and often desirable for a programmer to simply create the program to run on one system, and one system only.

    For languages like Java, which are compiled to bytecode that a virtual machine then interprets (or compiles on the fly), the task is made easier. While there may still be a need to adjust the code for it to run on different systems, the adjustments are far fewer than with languages like C or C++ that compile straight to machine code. In this case it's the task of the virtual machine to make the code portable. It has the capacity (and it is one of its jobs) to considerably reduce the burden on the programmer, something a compiler can't (and shouldn't) do. Languages like Java were created with one thought in mind: "write once, run anywhere". It's still a pipe dream on many occasions, but it is certainly much easier to write Java code for more than one system or operating system.
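
    Why an interpreter makes the program itself portable can be shown with a tiny stack machine in Python. This is a toy and not how Java's virtual machine is actually built: the instruction names and the run function are invented for the example. The point is only that the "program" below contains no instructions for any real CPU.

```python
def run(program):
    """Interpret a list of (operation, operand) pairs on a small stack machine."""
    stack = []
    for operation, operand in program:
        if operation == "push":
            stack.append(operand)
        elif operation == "shl":
            # Shift the top of the stack left, keeping 32 bits like a register.
            stack.append((stack.pop() << operand) & 0xFFFFFFFF)
        elif operation == "add":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        else:
            raise ValueError(f"unknown operation: {operation}")
    return stack.pop()

# "Bytecode" for (3 << 4) + 2. It never mentions a real processor.
program = [("push", 3), ("shl", 4), ("push", 2), ("add", None)]
print(run(program))   # 50
```

    Only the interpreter has to be built separately for each processor and operating system; the program it runs stays the same everywhere.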

    Anyway, the idea here is that compilers take much of the work off the programmer when it comes to how their code relates to the machine architecture and operating system. That is why these are called high-level languages.
    Last edited by Mario F.; 03-13-2010 at 10:13 PM.

  12. #12
    Registered User
    Join Date
    Jan 2005
    Posts
    183
    Thank you very much for all the replies.
    I understand a lot more now.
