Thread: Counting unique lines/strings?

  1. #1
    Registered User
    Join Date
    May 2016
    Posts
    6

    Hi,

    Is there a way to count unique lines? So for example if I do this:

    ./count
    one
    two
    one
    one
    two
    three
    /press ctrl + D/

    the program will tell me that there are 3 unique lines and 6 lines in total.

    I'm using fgets to read line by line, and I have no problem getting the total number of lines, but I don't know how to approach the "unique lines" part. This is my first month of learning C, so please keep replies simple. Hopefully my English makes sense, and thank you so much!

  2. #2
    C++ Witch laserlight
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,413
    Yes, you can keep track of the lines previously read.

  3. #3
    Registered User
    Join Date
    May 2016
    Posts
    6
    Quote Originally Posted by laserlight
    Yes, you can keep track of the lines previously read.
    How can I do that? This is my code right now:
    Code:
    #include <stdio.h>
    #include <stdlib.h>
    
    #define MAX_LINE 1024
    #define MAX_CHARACTER 128
    
    int main(void) {
        int count, lineCount, distinctCount;
        char input[MAX_LINE][MAX_CHARACTER];
    // char ch[MAX_LINE][MAX_CHARACTER];
        
        count = 0;
        lineCount = 0;
        distinctCount = 0;
    // ch[0][0] = input[0][0];
        while ((fgets(input[count], MAX_CHARACTER, stdin)) != NULL) {
            if (input[count] == input[count + 1]) {
                distinctCount = distinctCount + 1;
            }
            lineCount = lineCount + 1;
            count = count + 1;
        }
    I tried making another array, ch, to store the previous line, but I don't know how to use it inside the fgets while loop.

    Is there another way to approach this? I'm so lost...

  4. #4
    Registered User
    Join Date
    Jun 2015
    Posts
    1,640
    You need to check the current line against all previous lines. You'll need a loop to go through all lines up to (but not including) the current line.

    Strings are compared with strcmp, not ==.
    Code:
    if (strcmp(input[count], input[i]) == 0)
        // strings are equal
    In C it is more idiomatic to write x++ (or ++x) than x = x + 1.

    lineCount is exactly the same as count. I.e., it's not needed.

    distinctCount seems to be meant to count duplicates, so its name is misleading.
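
    To make the "check against all previous lines" loop concrete, here is a minimal sketch. The helper name seen_before is made up for illustration, and it assumes the lines are stored in a 2D char array like input in the code above.

```c
#include <string.h>

#define MAX_CHARACTER 128

/* Returns 1 if lines[current] is identical to any earlier line,
   0 otherwise. seen_before is a hypothetical helper name. */
int seen_before(char lines[][MAX_CHARACTER], int current) {
    /* Walk every line before the current one, not including it. */
    for (int i = 0; i < current; i++) {
        if (strcmp(lines[current], lines[i]) == 0) {
            return 1;   /* an identical earlier line exists */
        }
    }
    return 0;           /* no earlier line matched */
}
```

    A line counts as unique the first time it shows up, i.e. when seen_before returns 0 for it.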

  5. #5
    Registered User
    Join Date
    Jun 2011
    Posts
    4,513
    It looks like you're already keeping track of the input, by storing it in the "input" array.

    I recommend you separate the code for reading input from the code for checking for unique lines. In other words, first read all of the input (using your loop). When input is complete, then move on to processing the strings.

    Note that you cannot compare strings with == in C; it compares the addresses of the arrays, not their contents. Use "strcmp" instead.
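
    As a rough sketch of the reading half of that split, something like the following could work. The function name read_lines and its parameters are assumptions for illustration; it also assumes no line is longer than MAX_CHARACTER - 1 characters, since fgets would otherwise split a long line across two rows.

```c
#include <stdio.h>

#define MAX_LINE 1024
#define MAX_CHARACTER 128

/* Reading phase only: stores up to max_lines lines from in and
   returns how many were read. read_lines is a hypothetical name. */
int read_lines(FILE *in, char lines[][MAX_CHARACTER], int max_lines) {
    int count = 0;
    /* Check the bound first so we never write past the last row. */
    while (count < max_lines &&
           fgets(lines[count], MAX_CHARACTER, in) != NULL) {
        count++;
    }
    return count;   /* this is also the total number of lines */
}
```

    The return value doubles as the total line count, so no separate counter is needed. The processing phase can then loop over rows 0 to count - 1 and compare them with strcmp.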

  6. #6
    Registered User
    Join Date
    May 2016
    Posts
    6
    Thank you so much! I was messing around with my code; it should've been != and not ==. I was going to use lineCount to count the total number of lines. Is there a smarter way of doing it?

  7. #7
    Registered User
    Join Date
    Jun 2011
    Posts
    4,513
    Quote Originally Posted by kuroholic
    I was messing around with my code, should've been != and not ==.
    If you're referring to the string comparison, re-read algorism's and my previous responses.

    Quote Originally Posted by kuroholic
    I was going to use lineCount to count the total number of lines, is there a smarter way of doing it?
    As algorism said, you already have a variable that keeps track of the line count, so the variable "lineCount" is redundant.

  8. #8
    Registered User
    Join Date
    May 2016
    Posts
    6
    OK, I wrote something, but it still doesn't work:

    Code:
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    
    #define MAX_LINE 1024
    #define MAX_CHARACTER 128
    
    int main(void) {
        int i, j, lineCount, distinct;
        char input[MAX_LINE][MAX_CHARACTER];
        char duplicate[MAX_LINE][MAX_CHARACTER];
        
        i = 0;
        lineCount = 0;
        while ((fgets(input[i], MAX_CHARACTER, stdin)) != NULL) {
            strncpy(duplicate[i], input[i], MAX_LINE);
            lineCount = lineCount + 1;
            i = i + 1;
        }
        
        i = 1;
        distinct = 0;
        while (i < lineCount) {
            j = 0;
            while (j < i) {
                if (strcmp(input[i], duplicate[j]) != 0) {
    //             distinct = distinct + 1;
                    j = j + 1;
                } else {
                    break;
                }
            }
            i = i + 1;
        }
        
    //    printf("%d distinct lines seen after %d lines read.\n", distinct, lineCount);
        
        return 0;
    }
    So now I'm comparing each line to all the previous lines, but I don't know how to skip over strings in input that have already appeared before.
    Last edited by kuroholic; 05-06-2016 at 08:16 AM.
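
    One possible way to skip lines that already appeared is to store a line only the first time it is seen, so the stored array never contains repeats. The sketch below is just one approach under that assumption, not the only one. The name count_distinct and the seen array are made up for illustration, and it treats "one" and "one\n" as different lines because fgets keeps the trailing newline.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINE 1024
#define MAX_CHARACTER 128

/* Reads lines from in and returns the number of distinct lines.
   The total number of lines read is written to *total.
   count_distinct is a hypothetical name. Once MAX_LINE distinct
   lines are stored, further new lines only add to the total. */
int count_distinct(FILE *in, int *total) {
    static char seen[MAX_LINE][MAX_CHARACTER]; /* static keeps 128 KB off the stack */
    char line[MAX_CHARACTER];
    int distinct = 0;

    *total = 0;
    while (fgets(line, MAX_CHARACTER, in) != NULL) {
        int j = 0;
        (*total)++;
        /* Stop at the first stored line equal to the current one. */
        while (j < distinct && strcmp(line, seen[j]) != 0) {
            j++;
        }
        /* j == distinct means no stored line matched: the line is new. */
        if (j == distinct && distinct < MAX_LINE) {
            strcpy(seen[distinct], line);
            distinct++;
        }
    }
    return distinct;
}
```

    Called from main as count_distinct(stdin, &total), the example from the first post (one, two, one, one, two, three) should give 3 distinct lines and a total of 6.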
