Thread: Help explaining test questions

  1. #1
    For Narnia! Sentral's Avatar
    Join Date
    May 2005
    Location
    Narnia
    Posts
    719

    Help explaining test questions

    I would like it if someone could explain to me these test questions, and help me understand them better, because some are confusing me.

    1) Which of the following statements about sorting 5 elements is the strongest statement that is directly implied by the information theoretic lower bound?

    (a) 6 comparisons are sufficient
    (b) 6 comparisons are necessary and sufficient
    (c) 7 comparisons are necessary
    (d) 7 comparisons are sufficient
    (e) 7 comparisons are necessary and sufficient

    2) 1000 random integers are generated randomly with a uniform distribution over the range 1
    to 1000 inclusive. Which of the following would indicate a poor generator?

    (a) the average of the numbers is about 499
    (b) each number appears exactly once
    (c) no four consecutive numbers are all even
    (d) two of the above
    (e) all of (a), (b), and (c)

    3) Which of the following must be true about file compression in general?

    (a) All files can be compressed
    (b) For any file, there must be some codes that are longer than others
    (c) The code must be a prefix code
    (d) All of the above
    (e) none

    4) Which of the following characterizes a Huffman coding tree?

    (a) all items are stored at the leaves
    (b) no nodes have one child
    (c) the tree is balanced
    (d) a and b only
    (e) all a,b, and c

    5) Which of the following is a valid prefix code for a file containing only the characters a,b, c, and d?

    a) a=1,b=0,c=01,d=10
    b)a=10,b=01,c=00,d=11
    c)a=1,b=01,c=101,d=0001
    d)a=1,b=10,c=100,d=1000
    e) All are valid

    6) Which one is false for Huffman coding?

    a) Huffman coding results in full tree
    b) Nodes at greater depth in the tree have larger code
    c) Huffman's algorithm constructs optimal prefix code
    d) For a file containing only k characters, Huffman coding can result in a code of length k
    e) all above are true

    7) When the edge (u,v) is added to a directed graph which of the following does not occur?

    a) u is added to the dictionary if it did not exist
    b) v is added to the dictionary if it did not exist
    c) v is added to u's adjacency list
    d) u is added to v's adjacency list
    e) all the above occur


    My answers were B, A, D, D, D, D, E respectively, but these were wrong, and I would like for someone to explain why. Thank you.
    Last edited by Sentral; 11-08-2009 at 02:30 PM.
    Videogame Memories!
    A site dedicated to keeping videogame memories alive!

    http://www.videogamememories.com/
    Share your experiences with us now!

    "We will game forever!"

  2. #2
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Why was your answer to #2 A? I'll look just at C as a start: the odds of a number being even is 1/2, so the odds of two consecutive numbers being even is 1/4, 3 is 1/8 and 4 is 1/16. When you are drawing random numbers 10000 times, then it is almost certain that your 1/16 scenario will show up a few times. So if the random numbers don't have any group of four consecutive evens, then it is probably a bad generator.

    For #7, there's an important word in the question that tells you why not all the answers occur. Can you find it?

  3. #3
    For Narnia! Sentral's Avatar
    Join Date
    May 2005
    Location
    Narnia
    Posts
    719
    Quote Originally Posted by Daved View Post
    Why was your answer to #2 A? I'll look just at C as a start: the odds of a number being even is 1/2, so the odds of two consecutive numbers being even is 1/4, 3 is 1/8 and 4 is 1/16. When you are drawing random numbers 10000 times, then it is almost certain that your 1/16 scenario will show up a few times. So if the random numbers don't have any group of four consecutive evens, then it is probably a bad generator.

    For #7, there's an important word in the question that tells you why not all the answers occur. Can you find it?
    I had a typo in the numbers, it's only 1000 numbers, but I don't believe that would make a difference. My reason for A was because if you average the numbers from 1 to 1000 you get 500.5 not 499. So having the average of 499 would indicate a bad generator.

    Yea, for #7 is says which of the following "does not occur", but that still doesn't help me.

    I don't understand why the others are wrong though, I thought they were correct.
    Videogame Memories!
    A site dedicated to keeping videogame memories alive!

    http://www.videogamememories.com/
    Share your experiences with us now!

    "We will game forever!"

  4. #4
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    I think you have a misunderstanding of how a random number generator is supposed to work. It is not supposed to output every number from 1 to 1000. It is only supposed to have an equal probability of outputting any number from 1 to 1000. In practice, it is virtually impossible to get every number to come out only once. (Imagine how hard it would be roll 6 dice and have them show up with a 1, 2, 3, 4, 5 and 6 in some order. Now imagine the same thing with 1000 dice that have 1000 sides.) Instead you get some numbers showing up multiple times and others not showing up at all.

    So because the numbers that are output aren't always exactly the same, the average isn't always going to be exactly the same. You're right that the average of all the numbers from 1 to 1000 is 500.5, but when generating random numbers you'll just get close to that number. Since 499 is pretty close to 500.5, it's perfectly fine for the generator to output numbers with that average.

    For #7, the word I was thinking of is "directed". I'm not as familiar with the other questions so I won't try to help with those.

  5. #5
    For Narnia! Sentral's Avatar
    Join Date
    May 2005
    Location
    Narnia
    Posts
    719
    So you're saying the answer should be "two of the above" because a bad generator would be one in which each number appears exactly once, and that no four consecutive numbers are even.
    Videogame Memories!
    A site dedicated to keeping videogame memories alive!

    http://www.videogamememories.com/
    Share your experiences with us now!

    "We will game forever!"

  6. #6
    l'Anziano DavidP's Avatar
    Join Date
    Aug 2001
    Location
    Plano, Texas, United States
    Posts
    2,743
    #2: as the number of random samples increases to infinity, the average will approach 500.5, however, we don't have infinity samples do we 499 is perfectly fine.

    I would be more worried about B than C, because C says "NO four consecutive numbers are all even". Hence, C says, "if you took a sample of 4 consecutive numbers anywhere in the randomly generate numbers, you couldn't find a series anywhere where they are all even". That sounds pretty random to me! If it said, "there are 4 consecutive even numbers", then I would be worried about C.

    I would answer B for number #2. If every single number only showed up exactly once in a range of 1 to 1000 with 1000 different samples...i would be VERY worried. That shows some predictable behavior. Essentially I could figure out that the next random number is going to be a number that has not already been generated, but the next random number should be completely independent of those that have already been generated. It should be equally probably to draw the number 5 next time, even if I have already drawn the number 5 the previous 10 times. They are independent events. The answer is B.

    #7: yes, the key word is "directed". Think about who is really adjacent to who if you can't really reach one node from the other node.
    My Website

    "Circular logic is good because it is."

  7. #7
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> The answer is B.
    I don't think so. See my discussion earlier of 2C. To make it easier, divide the 1000 numbers into 250 groups of four adjacent values. There is a 1/16 chance that any four consecutive values are even, and we have these 250 instances of four consecutive values. So what are the chances that a 1/16 probability will occur zero times over 250 trials? Not very good. (15/16)^250 = 0.00000009836. So, only one out of every ten million valid random number generations would have zero instances of four consecutive evens amongst the 250 groups.

    And that's just for the 250 separate sets of four. There are actually 997 groups of four available, but they overlap and I'm not sure how that affects the numbers.

  8. #8
    Registered User
    Join Date
    Jul 2009
    Posts
    50
    I can tell you that 3 is definitely not D.

    Consider a file that has one byte, and only one byte. How would you compress that?

    For 5, we can deduce the answer by considering if a=1, then no other code word can start with a(1), lest the receiver get confused about if 1 is an entire code word or simply a prefix to a code word.

    What is this for? Is it for a class? If so, it seems like it's pretty high-level stuff and you should be able to ask your professor about it, and get much better input than from here. Most of this stuff can be found fairly easily online, as well.(I haven't dealt with huffman stuff in a while, so I actually googled it to refresh a bit).

  9. #9
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Daved View Post
    >> The answer is B.
    I don't think so. See my discussion earlier of 2C. To make it easier, divide the 1000 numbers into 250 groups of four adjacent values. There is a 1/16 chance that any four consecutive values are even, and we have these 250 instances of four consecutive values. So what are the chances that a 1/16 probability will occur zero times over 250 trials? Not very good. (15/16)^250 = 0.00000009836.
    The way you are calculating these odds is IMO a little flaky. You are not talking about 250 trials -- you are talking about a single set of 1000 numbers from 1 to 1000. If you did 250 sets of 4 numbers from 1 to 1000, I would agree that you should have a very high probability of getting such a series. But a single set of 1000 numbers is very different -- the chance of NOT having such a series will be much, much more likely.

    Still, MOST likely you will get one. But I would say not getting one on a SINGLE trial (that is the test) is not a sign of a poor generator. The only sign of a poor generator on a single trial like that would be if every number occurred once -- since that would not be random at all.

    Also, since B MUST be in our answer here, saying "any two of the above" is WRONG because that does not necessarily include B. So for #2, the only correct choice of a, b, c, d, or e is B.
    Last edited by MK27; 11-09-2009 at 11:59 AM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

  10. #10
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    I think the problem with your answer to #6 is the definition of "code". If code means "a bit sequence representing a character", your answer is correct. If code means "result of the compression", D is true, if all k characters are distinct, k is a power of two, and the original code already used only ld(k) bits. In this case, you also have a full tree (if that means what I think), leading to answer E.)
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  11. #11
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    Sorry MK27, I don't understand your argument.

    1000 numbers are generated. If you take every four numbers as a sequence then you have 250 possible sequences where you could have four even numbers. There are actually 997 potential sequences, but the 250 is a subset of the total that is easier to work with. So if the odds are minuscule that none of the 250 groups are all even, then it means that the odds are even more minuscule that none of the 997 groups are all even.

  12. #12
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    For question five, only one code fulfills the prefix property.
    In a, 'a' is a prefix of 'd' and 'b' a prefix of 'c'. The sequence 'ab' is not distinguishable from 'd', nor 'ba' from 'c'.
    In c, 'a' is a prefix of 'c'. 'ab' and 'c' have the same coding.
    In d, 'a' is a prefix of everything else, 'b' of 'c' and 'd', and 'c' of 'd'. Even though the code is not ambiguous, it's not a prefix code. (It's a suffix code, but there is no such thing in practice.)
    Only b fulfills the prefix property.
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  13. #13
    Cat without Hat CornedBee's Avatar
    Join Date
    Apr 2003
    Posts
    8,895
    Likelihood of a single number being even: 1/2
    Likelihood of four numbers all being even: 1/16
    Likelihood of four numbers containing at least one odd number: 15/16
    Number of consecutive four-number sequences in 1000 numbers: 997
    Likelihood of all of these sequences containing at least one odd number: no idea, actually. Does the fact that every odd number interrupts four sequences invalidate the naive calculation?
    Naive calculation: (15/16)^997 = 1.13595877459903082666e-28 = 0.000000000000000000000000000113595877459903082666
    All the buzzt!
    CornedBee

    "There is not now, nor has there ever been, nor will there ever be, any programming language in which it is the least bit difficult to write bad code."
    - Flon's Law

  14. #14
    Registered User
    Join Date
    Jan 2005
    Posts
    7,366
    >> Does the fact that every odd number interrupts four sequences invalidate the naive calculation?

    Yeah, that's an interesting question. I have no idea either so I figured it was safer to ignore. Somebody should write a program to find out.

  15. #15
    spurious conceit MK27's Avatar
    Join Date
    Jul 2008
    Location
    segmentation fault
    Posts
    8,300
    Quote Originally Posted by Daved View Post
    Sorry MK27, I don't understand your argument.
    1000 numbers are generated. If you take every four numbers as a sequence then you have 250 possible sequences where you could have four even numbers. There are actually 997 potential sequences,
    Yes, but almost all of those will involve a 3 number sequence that is in the preceeding and subsequent series. That is NOT a normal property of 250 (or 997) discrete sets.

    Furthermore, the ranges are very compressed because each "set" is actually from the next "first" (low) number to the four next number, eg:

    231 232 235 237

    If you picked a set of four from the range 1-5, your chances of getting 4 consecutive even numbers is NIL.

    So my point is, you are considering this the wrong way when you describe it in terms of "997 sets of 4 numbers from 1 to 1000". That implies you could have sets like

    2 6 400 888

    as one of the 997 or 250 sets. In a set of 1000 numbers (from 1-1000), you are very very very unlikely to have that. You are implying you can just pick a set, and exclude any of the 996 other numbers that are part of the larger set that separated these numbers originally. The numbers MUST be considered sequentially, you cannot just pick four out of 1000 250 times, etc, and use that for your analysis.

    Furthermore, as I said, I guarantee B must be part of the answer since this is a standard misconception of how a random generator works (I have seen it here before -- someone believes "random distribution" means the every number should occur in random order, but not repeat, which is wrong). And there is only one choice to definitively include B. If you say "two or more" that implies B could occur, but not A or C, and that would be okay. No, it wouldn't. B would definitively indicate a non-random generator.
    Last edited by MK27; 11-09-2009 at 01:02 PM.
    C programming resources:
    GNU C Function and Macro Index -- glibc reference manual
    The C Book -- nice online learner guide
    Current ISO draft standard
    CCAN -- new CPAN like open source library repository
    3 (different) GNU debugger tutorials: #1 -- #2 -- #3
    cpwiki -- our wiki on sourceforge

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Computer Programmer Aptitude Battery Test (CPAB)
    By scotty101 in forum General Discussions
    Replies: 40
    Last Post: 02-22-2010, 11:49 PM
  2. Newbie needs help..
    By xpress urself in forum C++ Programming
    Replies: 3
    Last Post: 07-26-2007, 07:22 PM
  3. A very long list of questions... maybe to long...
    By Ravens'sWrath in forum C Programming
    Replies: 16
    Last Post: 05-16-2007, 05:36 AM
  4. My C++ test is COMING!!...
    By [Z-D] in forum C++ Programming
    Replies: 52
    Last Post: 12-01-2006, 08:02 PM
  5. Bitwise Test Questions
    By Mister C in forum C Programming
    Replies: 9
    Last Post: 11-27-2002, 06:06 PM