Thread: How do you read code?

  1. #1
    Registered User
    Join Date
    Dec 2007
    Posts
    51

    How do you read code?

    I've programmed in C for about 7 years now (and have been a Linux user for 10+ years) and would like to help out with the community. I like the whole idea of "open source", and I think it's time that I put my skills to use in the programming world.

    My question is:

    On these big open source projects (30,000-line projects), what's the best way to jump into the code? These projects can be very difficult and complex to understand, so what's the best way to jump right in? Is there a neutral program out there that can just step through code line by line like a debugger does?

    How do you guys do it? I've been looking for books on this topic for a couple of years now but have had no luck. What do you guys think?

    Do you guys use any tools to help you read code better? Let me know your strategies.

  2. #2
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    I find that this is less about C programming and more about reading (other people's) code in general, so I have moved this to the General Discussion board.

    Personally, I have only grasped parts of large projects (e.g., parts of a library that interest me), or entire small projects (e.g., the TUT unit testing framework nightmare that I eventually gave up on). Since they were libraries rather than end user programs it was a matter of looking at the API and then peeking at the source code to see how they were implemented.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  3. #3
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    There is a kind of software called a "code browser": a tool that reads the source code and cross-references it, then allows you to jump from one piece of code to another.

    However, the other thing is that you can't just take a 30K-line software project, read all of it, and understand it [well, at least I know I can't - and I've been working with software for over 20 years now - including working on large open-source projects].

    The best path, I think, is to have a project with a goal [1], and then implement that, along with suitable test-code to prove that the code works - hopefully there already is some existing test-code that proves that you haven't broken anything else. To achieve this goal, you need to study the code and understand the details of the part that needs modification. As a starter project, you don't want something that changes the entire structure of the entire source code - that requires, most likely, too much knowledge to be gathered before you start changing things.

    The points I'm making:
    1. You most often don't need to understand the entire project in detail.
    2. Start small - make small changes to begin with. Growing up with the project, increasing your knowledge and skill as you go along, is just about the only way to understand "all" of the project [and in some cases, you simply can't understand ALL of it anyways].
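
    To make the "suitable test-code" idea concrete, here is a minimal sketch in C. The helper rtrim() is a hypothetical stand-in for whatever small piece of the project you decided to change, not code from any real project; the point is only that a few assertions pin down the behaviour you rely on before and after your change.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper standing in for the small piece of the project
   you chose to change: strips trailing spaces and tabs in place. */
static void rtrim(char *s)
{
    size_t n = strlen(s);

    while (n > 0 && (s[n - 1] == ' ' || s[n - 1] == '\t'))
        s[--n] = '\0';
}

/* Test code: pins down the expected behaviour, so a later change
   that breaks it fails immediately instead of going unnoticed. */
static void test_rtrim(void)
{
    char a[] = "hello  \t";
    char b[] = "   ";
    char c[] = "";
    char d[] = "a b";

    rtrim(a); assert(strcmp(a, "hello") == 0); /* trailing blanks removed  */
    rtrim(b); assert(strcmp(b, "") == 0);      /* all-blank string empties */
    rtrim(c); assert(strcmp(c, "") == 0);      /* empty string is safe     */
    rtrim(d); assert(strcmp(d, "a b") == 0);   /* inner spaces untouched   */
}
```

    Calling test_rtrim() from main(), or from whatever test runner the project already has, is enough: if an assertion fails, the program aborts and names the failing line.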


    [1] This goal should be one that you believe, given your current overall programming skills, you can achieve without any major difficulty, and it should be small enough that you can set a reasonable time to achieve it [say 3 days or 5 weeks - depending on the time you are willing to spend, and the difficulty of the task itself]. There is, obviously, no point thinking that you can modify the scheduler in the Linux kernel if you have never worked on schedulers or in kernel code before - that's setting yourself up for failure.

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  4. #4
    Registered User
    Join Date
    Dec 2007
    Posts
    51
    You make a good point. From the looks of your experience as a kernel hacker, I have a scenario for you:

    Say you had the job of "porting" a kernel (let's say the Linux kernel) to another architecture (let's say the ARM microprocessor).

    Now, by default, the kernel was originally written for the x86 arch.

    In order to do something like this, you obviously need to understand two things:
    The arch of x86 and the arch of ARM.

    So here's my question:

    Where do you start with that? I mean, kernels are typically huge and complex. You need an understanding of the entire codebase to recognize what needs changing and manipulating to "make it work" for another subsystem.

    I used the Linux kernel because it's said to be very portable.

    Based on your experience, what are your thoughts on this? Tell me how you would approach this situation.

  5. #5
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,660
    Some suggestions (not all free).

    One of the better code editors I've used is Source Insight (http://www.sourceinsight.com/) which will create a cross-reference of a project, and allow you to browse that in an interactive fashion.

    Another tool is Source Navigator (http://sourcenav.sourceforge.net/). The GUI is clunky to say the least, but the database it generates is very accessible if you want to do some kind of analysis which is outside the scope of what is provided.

    The Linux kernel has its own cross-referencing tool (http://lxr.linux.no/), which can be used for other projects as well.
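
    To give a rough idea of what "cross-referencing" means, here is a toy sketch in C. It is not how any of the tools above work internally; xref_lines() and is_ident_char() are made-up names, and the matching is purely textual (so it also counts hits inside comments and string literals). It only shows the core idea: map an identifier to the lines where it appears as a whole token.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* True for characters that can be part of a C identifier. */
static int is_ident_char(int c)
{
    return isalnum(c) || c == '_';
}

/* Stores the 1-based line numbers on which `ident` occurs as a whole
   token in `src`. Returns how many lines were stored (at most `max`).
   A line with several occurrences is stored only once. */
static size_t xref_lines(const char *src, const char *ident,
                         int *lines, size_t max)
{
    size_t len, count = 0;
    int line = 1, last_recorded = 0;
    const char *p;

    assert(src != NULL && ident != NULL && lines != NULL);
    len = strlen(ident);
    if (len == 0)
        return 0;

    for (p = src; *p != '\0'; p++) {
        int starts, ends;

        if (*p == '\n') {
            line++;
            continue;
        }
        if (strncmp(p, ident, len) != 0)
            continue;

        /* Reject partial matches such as "foo" inside "foobar". */
        starts = (p == src) || !is_ident_char((unsigned char)p[-1]);
        ends = !is_ident_char((unsigned char)p[len]);

        if (starts && ends && line != last_recorded && count < max) {
            lines[count++] = line;
            last_recorded = line;
        }
    }
    return count;
}
```

    A real code browser goes further by parsing the language, so it can tell a definition from a call and ignore comments, but the resulting "identifier -> locations" database is the same basic shape.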

    > Where do you start with that?
    Given that it's already been ported to several, perhaps there's a "howto" around, or perhaps even some kind of design documentation.

    Also, anywhere the asm keyword is used will need to be ported. In conjunction with that, look out for any #ifdef conditionals that refer to architecture-specific symbols, e.g.
    #ifdef X86
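
    As a rough illustration of what such code tends to look like, here is a small sketch. ARCH_NAME, arch_name() and cpu_relax() are names chosen for this example only, and the inline assembler uses the GCC/Clang syntax; the macros being tested (__x86_64__, __i386__, __aarch64__) are ones those compilers predefine for the target. A port adds a new branch next to the existing ones.

```c
/* Illustrative names only: ARCH_NAME, arch_name() and cpu_relax()
   are assumptions of this sketch, not taken from a specific project. */
#if defined(__x86_64__) || defined(__i386__)
#  define ARCH_NAME "x86"
static inline void cpu_relax(void)
{
    __asm__ __volatile__("pause");  /* x86 spin-wait hint */
}
#elif defined(__aarch64__)
#  define ARCH_NAME "arm64"
static inline void cpu_relax(void)
{
    __asm__ __volatile__("yield");  /* 64-bit ARM spin-wait hint */
}
#else
#  define ARCH_NAME "generic"
static inline void cpu_relax(void)
{
    /* Portable fallback: no hint instruction known, do nothing. */
}
#endif

/* Reports which branch of the conditional was compiled in. */
static const char *arch_name(void)
{
    return ARCH_NAME;
}
```

    Searching the tree for "asm" and for these kinds of conditionals gives a first list of places to look at; the generic fallback branch shows why a well-structured project can often build on a new target before every hint has been ported.
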
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  6. #6
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Where you start is by looking at the general structure [1] and finding the "arch" (short for architecture) dependent parts. Since Linux has been ported to MANY different architectures, it's unlikely that any non-portable code is found outside the "arch/xxx" tree. Just find an architecture that you are familiar with [e.g. x86] [or alternatively, find an architecture similar to ARM] and start porting the parts that are different between x86 and ARM.

    You will need to be quite familiar with the OS (Linux in this case) architecture, have good knowledge of both the "from" and "to" processors that you are porting between - particularly the "to" processor - and be able to read up on the "from" processor sufficiently to understand what the often strange concoctions that appear in the inline assembler do.

    This is definitely not a trivial task - my first OS port, which wasn't a full-fledged Linux, took several months. The second one, which had less assembler in the first place and for which I had more understanding of the "to" architecture thanks to the first port, took about two weeks to get going and 95% functional.

    [1] According to this page, http://www.treblig.org/Linux_kernel_source_finder.html, there is already an ARM project for Linux, so I expect that you will not actually NEED to perform this task as such - but the above explains how to do the task in generic terms.

    --
    Mats

  7. #7
    Registered User Jaqui's Avatar
    Join Date
    Feb 2005
    Posts
    416
    Since, as has been mentioned, it's not really possible to understand the whole code base of a large open source project, why not pick a project you would like to work on, check their bug reports, get the sources and start fixing the bugs?

    You will learn the code base by actively working on it to track down the bugs.
    You will become a recognized contributor to the code for that project.
    You get to help a project you think is worthwhile.
    The project gets the bugs fixed.
    The project gets another person developing for it.
    The project gets better in every way.
    Quote Originally Posted by Jeff Henager
    If the average user can put a CD in and boot the system and follow the prompts, he can install and use Linux. If he can't do that simple task, he doesn't need to be around technology.
