Thread: A question about Linux's "diff" command

  1. #1
    Registered User
    Join Date
    Apr 2007
    Posts
    284

    A question about Linux's "diff" command

    How does diff compare two files?
    Is it implemented with a hash table?
    Suppose you diff two files A and B, then shuffle the lines of B and diff them again. Will the diff results differ from one run to the next?

  2. #2
    Banned master5001's Avatar
    Join Date
    Aug 2001
    Location
    Visalia, CA, USA
    Posts
    3,685
    First off,

    Code:
    man diff
    Secondly, and more helpfully, if I understand your question correctly, you have a couple of options. One is to read the man page above and use the appropriate command-line arguments. The other is to use grep in conjunction with diff.

  3. #3
    Registered User
    Join Date
    Dec 2006
    Location
    Canada
    Posts
    3,229
    I think he is asking about the actual implementation of diff.

    Unfortunately I don't know the answer, but it's open source.

  4. #4
    Kernel hacker
    Join Date
    Jul 2007
    Location
    Farncombe, Surrey, England
    Posts
    15,677
    Quote Originally Posted by meili100
    How does diff compare two files?
    Is it implemented with a hash table?
    Suppose you diff two files A and B, then shuffle the lines of B and diff them again. Will the diff results differ from one run to the next?
    Diff compares the actual text content (for text files; for binary files it does a byte-by-byte comparison). It does not use any form of "magic" in the actual comparison: it is a simple string compare [with slight modifications to compensate for flags like "--ignore-all-space" and "--ignore-case"].

    Shuffling lines about (assuming the lines are actually different, of course) will always change the diff output.
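
    A quick way to see this without touching real files is Python's standard difflib module. This is only an illustration: difflib is not GNU diff and its matching strategy may differ, and the line lists and the helper name diff_lines are made up for the example. It shows that the same inputs always give the same diff, while a shuffled B gives a non-empty one.

    Code:
```python
import difflib


def diff_lines(a, b):
    """Return the unified diff of two lists of lines as a list of strings."""
    return list(difflib.unified_diff(a, b, fromfile="A", tofile="B", lineterm=""))


# Hypothetical file contents, one list entry per line.
a = ["alpha", "beta", "gamma"]
b_same = ["alpha", "beta", "gamma"]
b_shuffled = ["gamma", "alpha", "beta"]

identical = diff_lines(a, b_same)      # no differences, so no output lines
shuffled = diff_lines(a, b_shuffled)   # same lines in another order: differences reported
repeated = diff_lines(a, b_shuffled)   # same inputs again: same result, nothing random

if __name__ == "__main__":
    print("identical files ->", identical)
    print("shuffled B      ->")
    for line in shuffled:
        print("   ", line)
    print("repeatable      ->", shuffled == repeated)
```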

    The "magical" part of diff is the bit that figures out where the files match up again. I don't actually know how that works: I think it searches forwards until it finds a line that matches, but I'm fairly sure it is a little more complex than that.
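
    One textbook way to do that "finding where the files match up again" step is to compute a longest common subsequence (LCS) of the lines and report everything outside it as added or removed. The sketch below is that textbook approach, assuming plain exact line comparison. It is not taken from the GNU diff source, which may well use a more efficient method, and the function name lcs_diff and the sample lines are invented for the example.

    Code:
```python
def lcs_diff(a, b):
    """Return (tag, line) pairs: ' ' for common, '-' only in a, '+' only in b."""
    n, m = len(a), len(b)
    # table[i][j] holds the LCS length of a[i:] and b[j:].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])

    # Walk both files from the top, following the table to stay on an LCS.
    out = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            out.append((" ", a[i]))
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            out.append(("-", a[i]))
            i += 1
        else:
            out.append(("+", b[j]))
            j += 1
    # Whatever is left over in either file has no partner in the other.
    out.extend(("-", line) for line in a[i:])
    out.extend(("+", line) for line in b[j:])
    return out


if __name__ == "__main__":
    old = ["one", "two", "three", "four"]
    new = ["one", "three", "four", "five"]
    for tag, line in lcs_diff(old, new):
        print(tag, line)
```

    The table costs time and memory proportional to the product of the two file lengths, which is fine for a sketch but is one reason a real tool would want something smarter.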

    --
    Mats
    Compilers can produce warnings - make the compiler programmers happy: Use them!
    Please don't PM me for help - and no, I don't do help over instant messengers.

  5. #5
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Look at the source of GNU diff. It starts with a very long comment discussing the algorithm used.
    http://www.gnu.org/software/diffutils/diffutils.html
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law
