Thread: Sorting Strings: Binary Search for string

  1. #1
    Registered User
    Join Date
    Sep 2011
    Posts
    25

    Question Sorting Strings: Binary Search for string

    I need help fixing the search method for my program. I have managed to get a variety of errors from "Killed" to "Segmentation Fault: 11" when it comes time to find the number for a word, then use bit operations to toggle that number in the "Set" (array of 10 unsigned ints). I'll post the driver and search code, let me know if you need more code to diagnose the issue.


    driver.c
    Code:
    #include "set.h"#include <stdio.h>
    #include <stdlib.h>
    
    
    // CSE 1040 Alexander Hollis Major Program Assignment 2, [email protected], CSP01
    
    
    
    
    int main() {
    
    
    printf("Initializing, scanning in your data...\n");
    
    
    Set set1, set2, rset;
    
    
    clearSet(set1);
    clearSet(set2);
    clearSet(rset);
    
    
    int i, words=0;
    
    
    for(i=0; (scanf("%s",&names[i]) ) != EOF; i++, words++);
    
    
    qSort(names, 320);
    //char *test [10][30] = {"the","hello","adam","World","car","tree","squirrel","dog","cat"};
    //for(i=0; i< 320; i++)
    //printf("%s\n",names[i]);
    
    
    printf("Adding \"seem\" to set1.\n");
    addName2Set(set1, "seem");
    
    
    printf("Adding \"next\" to set1.\n");
    addName2Set(set1, "next");
    
    
    printf("Adding \"head\" to set2.\n");
    addName2Set(set2, "head");
    printf("Adding \"both\" to set2.\n");
    addName2Set(set2 , "both");
    printf("Adding \"seem\" to set2.\n");
    addName2Set(set2, "seem");
    
    
    printf("Set1: \n");
    printSet(set1);
    
    
    printf("\n");
    
    
    printf("Set2: \n");
    printSet(set2);
    
    
    }

    search.c
    Code:
    //CSE 1040 Alexander Hollis, Programming Major Assignment 2, [email protected], CSP03
    
    #include "set.h"
    
    
    
    
    
    
    int binarySearch(char *input, char *element, int left, int right) {
    
    
    int middle = (left + (right - left) ) / 2;
    
    
    if (left > right) {
        return -1;
    }
    
    
    else {
    
    
    if( strcmp(element,(input + middle)) == 0 ) {
        return middle;
    }
       else {
        if( strcmp(element,(input + middle)) < 0 ) 
        {
         return (binarySearch(input,element, left, middle -5));
        }
        else
        {    
         return (binarySearch(input,element, middle + 5, right));
            }
    }
    
    
    }
    }
    set.c (function definitions)
    Code:
    /* ******************************************************************** *//*                                                                      */
    /* Set.c --- this is the .c file associated with the (major 2) Set      */
    /*           implementation used in Spring 2012's CSCE 1040 class.      */
    /*                                                                      */
    /*           This assignment includes 10 functions to be implemented    */
    /*           in Set.c, namely                                           */
    /*                                                                      */
    /*                 clearSet                                             */
    /*                 add2Set                                              */
    /*                 deleteFromSet                                        */
    /*                 isMember                                             */
    /*                 printSet                                             */
    /*                 setUnion                                             */
    /*                 setIntersection                                      */
    /*                                                                      */
    /*  And new for major 2                                                 */
    /*                 nameIsMember                                         */
    /*                 addName2Set                                          */
    /*                 deleteNameFromSet                                    */
    /*                                                                      */
    /*                                                                      */
    /*                                                                      */
    /* ******************************************************************** */
    
    
    //CSE 1040 Alexander Hollis, Programming Major Assignment 2, [email protected], CSP03
    
    
    #include "set.h"
    char names[320][30];   
    
    
    // include your code below.
    
    
    void qSort(char names[320][], int y);
    
    
    void swap(char *p, char *q);
    
    
    void clearSet(Set set){
    
    
    int i;
    
    
    for(i=0; i<10; i++) set[i]=0;
    
    
    }
    
    
    
    
    void add2Set(Set set, int value) {
    int loc= SetLoc(value);
    if (value >= 32) value = (value % 32) -1;
    set[loc] = set[loc] | (1 << value);
    
    
    }
    
    
    void deleteFromSet(Set set, int value) {
        int loc = SetLoc(value);
        if(value >=32) value= (value % 32) -1;
        
        set[loc] = set[loc] | (1 << value );
        set[loc] = set[loc] ^ (1 << value);
    
    
    }
        
    
    
    int isMember(Set set, int element) {
        int loc= SetLoc(element);
        
        if(element >=32) element = (element % 32) -1;
        if(set[loc] &= (1 << ((element % 32) ) ) ) return 1;
        else return 0;
    }
    
    
    void printSet(Set set) {
    
    
    int i;
    
    
    for(i=0; i< 30; i++) {
    
    
       if( isMember(set, i) ) {
    
    
           printf("%s\n",names[i]);
         }
    
    
    
    
    }
    
    
    }
    
    
    int SetLoc(int value) {
    
    
    int i, location=0;
    
    
    while(location < 10) {
    
    
    if(value > 32) {
    
    
    value = value -32;
    location++;
    
    
    }
    else break;
    
    
    }
    
    
    return location;
    
    
    }
    
    
    int nameIsMember(Set set, char *name) {
    
    
    int index = binarySearch(names, name, 0, 30);
    
    
    if( isMember(set, index) ) return 1;
    else return 0;
    
    
    }
    
    
    
    
    void addName2Set(Set set, char *name) {
    
    
    int index = binarySearch(&names, name, 0, 320);    
    
    
    add2Set(set, index);
    
    
    }
    
    
    void deleteNameFromSet(Set set, char *name) {
    
    
    int index = binarySearch(names, name, 0, 320);
    
    
    deleteFromSet(set, index);
    
    
    }
    
    
        
    void setUnion(Set set1, Set set2, Set rset) {
    int i;
    for(i=0; i< 10; i++) {
    rset[i] = set1[i] & set2[i];
    }
    
    
    
    
    }
    
    
    void setIntersection(Set set1, Set set2, Set rset) {
    int i;
    for(i=0; i< 10; i++) {
    
    
    rset[i] = set1[i] | set2[i];
    
    
    }
    
    
    
    
    }
    Thank you for your feedback, and yes I plan on tiding things up before submission, especially the indentation. Thanks!

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,659
    > and yes I plan on tiding things up before submission, especially the indentation. Thanks!
    So your tutors get the gourmet meal, and we get the dog-food.

    I've no inclination to read that crud.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Sep 2006
    Posts
    8,868
    I don't understand the +5 and -5 logic. When your target string is near the ends of the array, aren't you going to "jump" right off the ends of it with that kind of adjustment?

    May I inquire if you pulled the +5 and -5 numbers out from a dark hole, where the sun never shines?

    Add a print statement inside your search, and a getchar(), so you can see just how the search is progressing.

    I'd refrain from trying to improve on a function like binary search, which is SO elegant as it stands. It's like you're painting a pig's snout on the Mona Lisa.
    Last edited by Adak; 03-14-2012 at 11:03 PM.

  4. #4
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    Quote Originally Posted by Salem View Post
    > and yes I plan on tiding things up before submission, especially the indentation. Thanks!
    So your tutors get the gourmet meal, and we get the dog-food.

    I've no inclination to read that crud.
    Seconded.
    Using us as a practice evaluation of your work would be far more beneficial for everyone.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

  5. #5
    Registered User claudiu's Avatar
    Join Date
    Feb 2010
    Location
    London, United Kingdom
    Posts
    2,094
    I'm sorry I read up to this line:

    Code:
    for(i=0; (scanf("%s",&names[i]) ) != EOF; i++, words++);
    and then had to get a milkshake to get my sugar level back to normal. scanf stands for scan format, as in read some formatted input. Names, and blah blah strings in general are anything but formatted. Besides, what happens if my cat jumps on the keyboard and I accidently type in 240 characters for a name?!
    1. Get rid of gets(). Never ever ever use it again. Replace it with fgets() and use that instead.
    2. Get rid of void main and replace it with int main(void) and return 0 at the end of the function.
    3. Get rid of conio.h and other antiquated DOS crap headers.
    4. Don't cast the return value of malloc, even if you always always always make sure that stdlib.h is included.

  6. #6
    Registered User
    Join Date
    Sep 2011
    Posts
    25

    Re:

    Fixed the scanf no more '&', as for the cat and keyboard reference, this program doesn't get input from the user, I give it a file called data.dat through unix, and that fills in the array.

  7. #7
    Registered User
    Join Date
    Sep 2011
    Posts
    25

    Question Updated Search and Driver

    I went and made changes to the search and driver, the search still doesn't finish, it gets killed by linux. I removed the weird "+/- 5 parts" after thinking through what it was doing.

    search.c

    Code:
    //CSE 1040 Alexander Hollis, Programming Major Assignment 2, [email protected], CSP03
    
    #include "set.h"
    
    
    
    
    
    
    int binarySearch(char *input, char *element, int left, int right) {
    
    
    int middle; 
    
    
    middle= (left + (right - left) ) / 2;
    
    
    if (left > right) {
        return -1;
    }
    
    
    else {
    
    
    if( strcmp(element,(input + middle)) == 0 ) {
        printf("I found it yay!!\n");
        return middle;
    }
       else {
        if( strcmp(element,(input + middle)) < 0 ) 
        {
             //printf("Middle too big, shrinking bounds.\n");
            right = middle;
            return binarySearch(input,element, left, right);
        }
        else
        {    
             //printf("Middle too small, middle to left searching...");
            left = middle;
            return binarySearch(input,element, left, right);
            }
    }
    
    
    }
    }
    driver.c

    Code:
    #include "set.h"#include <stdio.h>
    #include <stdlib.h>
    
    
    // CSE 1040 Alexander Hollis Major Program Assignment 2, [email protected], CSP01
    
    
    
    
    int main() {
    
    
    printf("Initializing, scanning in your data...\n");
    
    
    Set set1, set2, rset;
    
    
    clearSet(set1);
    clearSet(set2);
    clearSet(rset);
    
    
    int i, words=0;
    
    
    for(i=0; (scanf("%s",names[i]) ) != EOF; i++, words++);
    
    
    qSort(names, words);
    
    
    //char *test [10][30] = {"the","hello","adam","World","car","tree","squirrel","dog","cat"};
    //for(i=0; i< 320; i++)
    //printf("%s\n",names[i]);
    
    
    char seem[1][5] = {"seem"};
    
    
    printf("Adding \"seem\" to set1.\n");
    addName2Set(set1, seem[0]);
    
    
    printf("Adding \"next\" to set1.\n");
    addName2Set(set1, "next");
    
    
    printf("Adding \"head\" to set2.\n");
    addName2Set(set2, "head");
    printf("Adding \"both\" to set2.\n");
    addName2Set(set2 , "both");
    printf("Adding \"seem\" to set2.\n");
    addName2Set(set2, "seem");
    
    
    printf("Set1: \n");
    printSet(set1);
    
    
    printf("\n");
    
    
    printf("Set2: \n");
    printSet(set2);
    
    
    }

  8. #8
    Registered User claudiu's Avatar
    Join Date
    Feb 2010
    Location
    London, United Kingdom
    Posts
    2,094
    Well yes, but presumably you can type whatever you want in that file, as in, the grader can. And when he does type in something like "alksjdlahnqadoasdlaskjdqwoirhqoasldajsldkasjdlakj woqiujoqw" (WHICH (S)HE WILL) I guarantee you your program will crash. Why? Because scanf doesn't provide any guarantee that the string capacity is not overflown by input.
    1. Get rid of gets(). Never ever ever use it again. Replace it with fgets() and use that instead.
    2. Get rid of void main and replace it with int main(void) and return 0 at the end of the function.
    3. Get rid of conio.h and other antiquated DOS crap headers.
    4. Don't cast the return value of malloc, even if you always always always make sure that stdlib.h is included.

  9. #9
    Registered User
    Join Date
    Sep 2011
    Posts
    25

    Question

    Yes, that is correct, but as this is a intro programming course, they promised to only give 319 "names" no longer than 30 characters. But in the future, we will be utilizing "malloc" to make the array dynamically rather than a set size, but the professor didn't go over that function in enough detail to require it for this project. But I appreciate your advice, and would never trust the user to stay under 320 strings , but I guess it wouldn't hurt to throw an assert in their to prevent it from crashing?

  10. #10
    Algorithm Dissector iMalc's Avatar
    Join Date
    Dec 2005
    Location
    New Zealand
    Posts
    6,318
    These two lines both contain bugs:
    Code:
    middle= (left + (right - left) ) / 2;
    
    if (left > right) {
    Remove some brackets, and recognise that the recursive calls will never cause left to be greater than right.
    My homepage
    Advice: Take only as directed - If symptoms persist, please see your debugger

    Linus Torvalds: "But it clearly is the only right way. The fact that everybody else does it some other way only means that they are wrong"

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Unable to store string's into my Binary Search Tree?
    By twiztidsoulz in forum C Programming
    Replies: 14
    Last Post: 11-03-2009, 04:01 PM
  2. search numbers in an inputed string [array of strings]
    By zattara13 in forum C Programming
    Replies: 1
    Last Post: 03-28-2008, 06:11 AM
  3. Binary search on strings
    By ckathy in forum C++ Programming
    Replies: 1
    Last Post: 10-02-2007, 05:25 PM
  4. binary search function on string array
    By gandolf989 in forum C Programming
    Replies: 6
    Last Post: 10-02-2007, 01:47 PM
  5. Sorting strings to speed up search?
    By Mox in forum C++ Programming
    Replies: 5
    Last Post: 09-10-2001, 12:17 PM

Tags for this Thread