Thread: Understanding a 40,000 lines C source code

  1. #1
    Registered User
    Join Date
    Sep 2008
    Posts
    5

    Understanding a 40,000 lines C source code

    Greetings Everyone,

    I am trying to understand a 40,000+ line code base (12 source files) so that I can make modifications at various points in the code. I was an end user of the program and found many areas for improvement.

    Can you (experienced programmers) suggest some useful tools for understanding source code of this size written by someone else?

    I searched this forum extensively for similar questions before posting this one. I tried DDD (a debugger), but without much success.

    Thanks in advance for your valuable recommendations.

    Sen

  2. #2
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    1) What is it in the code that you want to change / improve?

    Looking for ways to improve code without understanding its deficiencies is like trying to cure a patient before you know what the problem is.

    After you know the problems, consider whether it would be better to write a new version, or to fix up the current version.

    2) Look to the specifics that you identified in step #1, above. Don't try to "improve the code"; try to improve one part of one function that you have identified as the problem area.

    Does it need a change in the algorithm of that function, or just a little extension of logic? Sometimes the best code you can write is the code you can delete and/or simplify.

    Forget that it has 40,000 lines of code. Focus *only* on the areas that you want to fix, if you have decided that "to fix it" is the right way to go.

    Be sure to keep an original, and document and comment whatever you change. When you think you have improved it - put that aside, and test it. Nothing's worse than having made changes to 8 different functions, with a program this big, and *then* noticing that the program no longer even works correctly - or no longer runs in anything like the run-time that it used to.
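    As a rough illustration of the "keep an original, then test" advice, here is a minimal sketch in C. It assumes you can build the untouched routine and the edited routine side by side; both functions below are hypothetical stand-ins, not code from the program being discussed, and the tolerance is something you would pick for your own data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the untouched routine from the backup copy. */
static double sum_squares_original(const double *x, size_t n)
{
    double s = 0.0;
    for (size_t i = 0; i < n; i++)
        s += x[i] * x[i];
    return s;
}

/* Hypothetical stand-in for the edited routine: same result expected,
   loop unrolled by two. This is the kind of change worth verifying. */
static double sum_squares_modified(const double *x, size_t n)
{
    double s0 = 0.0, s1 = 0.0;
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        s0 += x[i] * x[i];
        s1 += x[i + 1] * x[i + 1];
    }
    if (i < n)                      /* odd element left over */
        s0 += x[i] * x[i];
    return s0 + s1;
}

/* Returns 1 if both versions agree on this input within an absolute
   tolerance, 0 otherwise. */
int versions_agree(const double *x, size_t n, double tol)
{
    assert(tol >= 0.0);
    double a = sum_squares_original(x, n);
    double b = sum_squares_modified(x, n);
    double diff = (a > b) ? (a - b) : (b - a);
    return diff <= tol;
}
```

    Calling versions_agree() on a handful of saved inputs after each change, one function at a time, tells you which edit broke something instead of leaving you to guess among eight.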

    What does this program do, and why do you need / want to improve it? (just curious).

  3. #3
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Quote Originally Posted by Adak
    Be sure to keep an original, and document and comment whatever you change. When you think you have improved it - put that aside, and test it. Nothing's worse than having made changes to 8 different functions, with a program this big, and *then* noticing that the program no longer even works correctly - or no longer runs in anything like the run-time that it used to.
    Use a version control system.
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  4. #4
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    40,000 lines in 12 files? So there must be at least one file longer than 3333 lines. Ugh...

    How you approach this depends on how well designed the code is. If the architecture is sound, then reasonably small functional changes will only affect small parts of the code -- in this case, the major effort is simply identifying the piece of code where you need to make your change.

    If the architecture is bad, then even a small change might involve touching the code in many places, and would require you to understand far more about it than if it had been designed well in the beginning.

    We can't give much more advice on how to proceed without knowing more about what the code does, and what you want to make it do differently.
    Code:
    //try
    //{
    	if (a) do { f( b); } while(1);
    	else   do { f(!b); } while(1);
    //}

  5. #5
    Registered User
    Join Date
    Sep 2008
    Posts
    5
    Thanks for the replies. I am sorry for not being very specific. As I mentioned earlier, I was using this program (it's a complex optimization code for non-linear regression with thousands of variables) as a black box. Now I want to understand and master the code line by line.
    My question is: how can one efficiently understand source code written by others? Are there any tools for this that everyday programmers can use?

    My ultimate goal is to master the code, so that I can modify the algorithm and add some additional functions, as well as make the code work in a parallel computing environment.

    Hope I am clear now. Thanks !!
    Last edited by senglaxo; 03-16-2010 at 01:37 PM.

  6. #6
    Officially An Architect brewbuck's Avatar
    Join Date
    Mar 2007
    Location
    Portland, OR
    Posts
    7,396
    Quote Originally Posted by senglaxo
    Thanks for the replies. I am sorry for not being very specific. As I mentioned earlier, I was using this program (it's a complex optimization code for non-linear regression with thousands of variables) as a black box. Now I want to understand and master the code line by line.
    My question is: how can one efficiently understand source code written by others? Are there any tools for this that everyday programmers can use?

    My ultimate goal is to master the code, so that I can modify the algorithm and add some additional functions, as well as make the code work in a parallel computing environment.

    Hope I am clear now. Thanks !!
    A lot of code is just busy-work. You shouldn't feel like you need to understand each and every line of code in order to know how it all works.

    The two basic methods for understanding code are static (looking at the source) and dynamic (watching it run in a debugger). To get a handle on a starting point, begin by compiling the library in debug mode. Link it into a small program that uses some of its functions. Then in the debugger, step into the library and start watching what it does.
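    A minimal sketch of what such a small driver could look like in C. The routine residual_sum() is a hypothetical stand-in; in the real setup it would come from the library's own header and object files, built with debug symbols (for example -g -O0 with GCC), and the driver would just call it with small inputs you can check by hand.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for one routine of the library under study.
   With the real library, this definition is replaced by an #include of
   the library header plus linking against the debug build. */
double residual_sum(const double *observed, const double *predicted, int n)
{
    assert(n >= 0);
    double total = 0.0;
    for (int i = 0; i < n; i++) {
        double r = observed[i] - predicted[i];
        total += r * r;             /* accumulate squared residuals */
    }
    return total;
}

/* Smallest useful caller: fixed, hand-checkable inputs and one call.
   Set a breakpoint on this function and step into residual_sum(). */
double drive_residual_sum(void)
{
    const double observed[]  = {1.0, 2.0, 3.0};
    const double predicted[] = {1.0, 1.0, 5.0};
    double result = residual_sum(observed, predicted, 3);
    printf("residual_sum = %g\n", result);
    return result;
}
```

    Call drive_residual_sum() from a one-line main(). Because the inputs are tiny and fixed, what you see in the debugger can be compared against a calculation done on paper, which is much easier than starting from the program's real entry point.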

  7. #7
    Registered User UltraKing227's Avatar
    Join Date
    Jan 2010
    Location
    USA, New york
    Posts
    123
    Understanding code line by line is not really hard; I do it every day!
    As brewbuck said, it's better with a debugger. But if you want to do
    it the static way, here it is:

    1) Back up the code and don't change it!

    2) Start with any file included in the source pack; just fire it up.

    3) Cut the first line, then compile and run the program.

    4) Now paste the first line back and change the parameters one by one.

    5) Compile and run again.

    6) Repeat steps 3 to 5 for each line.
    At least, this is what I do every time I need to edit code made by others.
    Though the longest code I ever tested was 19,000 lines, I'm sure the
    principle is the same! If veterans or elites find this wrong, feel free to
    correct me.

  8. #8
    Registered User
    Join Date
    Sep 2008
    Posts
    5
    Thanks Brewbuck and Ultraking !

    I use both static and dynamic (DDD) methods as you mentioned, but in 4 months I have only progressed through about 1300 lines of the main file, going back and forth to other files to look at what those functions do. The problem is that each function takes a huge number of arguments of varying complexity, so I find it difficult to run them as individual pieces.

    I am very much a novice at programming and unfortunately I don't have direct contact with any experienced programmers. But I was pretty sure that there must be better and more efficient ways of doing it, which prompted me to ask the question here.

    Thanks again for your valuable time and suggestions.

  9. #9
    Registered User
    Join Date
    Feb 2010
    Posts
    11
    It might be blatantly obvious, but after backing up the original I'd run the whole lot through some kind of code formatter/beautifier to get the source into a consistent and readable format (if it isn't already), using something like astyle (Artistic Style).

  10. #10
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,661
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  11. #11
    Registered User
    Join Date
    Sep 2008
    Posts
    5
    Salem, yes, both tools sound great. I will explore them. Thanks everyone for your suggestions. I appreciate it very much.

    Sen

  12. #12
    Registered User
    Join Date
    Dec 2007
    Posts
    214
    Quote Originally Posted by brewbuck
    40,000 lines in 12 files? So there must be at least one file longer than 3333 lines. Ugh...
    The first application I worked on in my current job probably had a function with 3333 lines.
    That may be a slight exaggeration, but one function had a seemingly endless series of multiply nested if-else-if statements.

    At my old job I worked on a DOS-based application where the original programmer passed no parameters to his functions. ALL variables were global. And these global variables were named flag1, flag2, flag3...flag1028...
    His function names were also cryptic. You had no idea what a function did by reading its name.

    I hope my code is better.

  13. #13
    Registered User
    Join Date
    Sep 2008
    Posts
    5
    Well, in my case the code does have some useful comments, but maybe because of the complexity of the algorithms, I find it difficult to grasp quickly.

  14. #14
    Registered User
    Join Date
    Dec 2007
    Posts
    2,675
    The code I'm working in has TONS of functions > 1000 lines. Not sure what the longest is. When I read the title I thought perhaps the OP was one of our new hires
