Deleting Duplicate Proxies


  1. #1
    Pupil
    Join Date
    Oct 2005
    Location
    Toledo
    Posts
    27

    Question Deleting Duplicate Proxies

    I’m pulling my hair out lol

    I have some spare time this weekend and I want to write a program that will search for duplicate proxies (in a text file) and delete any (if found).

    My first idea was to place the first proxy in a temp variable:

    Code:
    inFile >> buffer;
    strcpy(proxy, buffer);  // char arrays cannot be assigned with =
    Then, do a string compare:
    Code:
    if (strcmp(proxy, buffer) == 0)
    {
       cout << "duplicate found!" << endl;
       // …more code to delete proxy
    }
    But as the program parses through the text file, how do I move on to the next proxy and compare it against the rest of the list, which gets one entry shorter each time?

    The program sounds so simple to code. Any ideas?

    Code:
    #include <fstream.h>
    #include <stdlib.h>
    #include <iomanip.h>
    #include <string.h>
     using namespace std;
    int main()
    {
      const int MAXLENGTH = 50;	// maximum file name length
      char filename[MAXLENGTH]; // put the filename up front
      char proxy[50];
      char buffer[50];
      ifstream inFile;
    
      cout <<"enter the file name" <<endl;
      cin >> filename;
      strcat ( filename, ".txt");
    
      inFile.open(filename, ios::nocreate | ios::out | ios::in);
    
      if (inFile.fail())
      {
        cout << "ERROR opening file: " << filename << endl;
        system("pause");
        exit(1);
      }
    
    
      while(inFile)
      {
    
      //inFile.getline(buffer, MAXLENGTH);
      // ^ a different approach I tried, but this accepts white space…no good
      inFile >> buffer;
      cout << "buffer " << buffer << endl;  // for display only while error testing
      cout <<"proxy " <<proxy <<endl; // for display only
      
    
      if (strcmp(proxy, buffer) == 0 )
         cout << "*****duplicate found******" << endl;
         //extra code will be added
         //this will only loop once. I know I will have to create a nested loop
      }
    
    
       inFile.close();
    
    
    
      system("PAUSE");
      return 0;
    }

  2. #2
    Registered User
    Join Date
    Jan 2005
    Posts
    847
    You could read every proxy into a linked list. Keep two pointers: one to the proxy currently being checked for duplicates, and one that moves through the rest of the list. When the second pointer reaches the end, advance the first pointer to the next proxy and repeat.

  3. #3
    ^ Read Backwards^
    Join Date
    Sep 2005
    Location
    Earth
    Posts
    282
    Instead of inFile in your while loop you could do something like:

    Code:
    while( ! inFile.eof() )
    That is, loop while it is not the end of the file. I am sure someone here will tell you why it is not always the best idea to use eof to control a loop, but I have never had any problems with it!

  4. #4
    Pupil
    Join Date
    Oct 2005
    Location
    Toledo
    Posts
    27
    That’s a good idea. I could use a linked list, then rewrite the file once all duplicates are removed.


    I’ll change the while(inFile) to while( ! inFile.eof() )

    Thank you both.

  5. #5
    Super Moderator Harbinger
    Join Date
    Nov 2004
    Posts
    74
    > but I have never had any problems with it!
    Ever tried simply copying a file using that technique?

    Ever read the FAQ?

  6. #6
    Registered User
    Join Date
    Jan 2005
    Posts
    7,319
    >> Instead of inFile in your while loop you could do something like: while( ! inFile.eof() )
    >> I am sure someone here will tell you why it is not always the best idea

    In this case, making that change means that if a read fails with an error (rather than by reaching the end of the file), it won't be picked up and the loop will never terminate. In fact, neither solution is the best solution.

    The read from the file stream should be used as the loop control, or the stream should be checked after the read and before the word is used: while (inFile >> buffer).

    There are no cases that I know of where that is a worse solution, and in this case it is better. If you use eof() to control the loop (or you use the file stream like the original code does), there is a real chance that the last word will be deleted incorrectly from your file. The loop will run one too many times and so the last word will still be in the buffer variable when the read fails due to eof. The code will think the read succeeded, and delete that word since it will match the proxy.

  7. #7
    Registered User hk_mp5kpdw
    Join Date
    Jan 2002
    Location
    Northern Virginia/Washington DC Metropolitan Area
    Posts
    3,801
    Code:
    #include <fstream.h>
    #include <stdlib.h>
    #include <iomanip.h>
    #include <string.h>
    First off, you should be using the newer versions of those headers:
    Code:
    #include <fstream>
    #include <cstdlib>
    #include <iomanip>
    #include <cstring>


    Code:
    const int MAXLENGTH = 50;	// maximum file name length
    char filename[MAXLENGTH]; // put the filename up front
    char proxy[50];
    char buffer[50];
    C++ has a wonderful container called a string that makes managing text much easier.


    Do you care what order your proxy strings end up in? If not, I'd read the data from the file into a set<string> container. A set does not allow duplicate data to be stored in it. If you attempt to insert an item that already exists in the set, nothing happens: the insert call fails (a soft failure, not a program-ending hard failure). The set container also automatically sorts its data.



    Code:
    ifstream inFile;
    ...
    inFile.open(filename, ios::nocreate | ios::out | ios::in);
    As far as I know, ifstream objects have no output-related functionality: no write or seekp member functions and no stream insertion operator (<<), so I don't see the point of passing ios::out there. It should be an fstream object instead of an ifstream object if you intend to do both input and output using that stream.



    Code:
    system("PAUSE");
    Try to avoid that if possible; a call (or two) to cin.get() should be all you need (you also wouldn't need the <stdlib.h>/<cstdlib> header in that case).

  8. #8
    Pupil
    Join Date
    Oct 2005
    Location
    Toledo
    Posts
    27
    I put this little project to the side until I finished my last semester. Here is the finished duplicate proxy checker. I had never heard of a set<string> container before. Could you give a little more info on this?

    I am trying to write an "http" proxy checker now. I know very little about network programming. I am interested in simply knowing if the proxy is an http proxy for web surfing. I couldn't care less about SOCKS 4 or 5 proxies. I want to test each proxy; if it is good, I will push it onto a stack, and at the end of testing all the good proxies will be written to the file. The only compiler I use is DevC++.


    Code:
    #include <iostream>  // cin, cout
    #include <fstream>
    #include <cstdlib>
    #include <cstring>   // strcat
    #include <iomanip>
    #include <string>
     using namespace std;
     const int MAXLENGTH = 50;	// maximum file name length
     char filename[MAXLENGTH]; //filename
    
     struct prxy
     {
       string proxy;
       prxy* next;
     };  //end struct
    
     class ProxyList
     {
      private:
       prxy* first;
       string buffer;
       string test;
       static int count;
      public:
    
    ProxyList()
     {
       first = NULL;
       getList();
     }  //End ProxyList
    
    void getList()
     {  int check;
       ifstream inFile;
       cout <<"Enter the file name \"proxy\" <- W/ No file extension"
            << " \nOr Path to the file containing proxies \"C:\\proxy \" "
            << "<- W/ No file extension\n";
    
       cin >> setw(MAXLENGTH - 4) >> filename;  // leave room for ".txt"
       strcat ( filename, ".txt");
       inFile.open(filename, ios::in);          // ios::nocreate is not standard C++
    
      if (inFile.fail())
       {
         cout << "ERROR opening file: " << filename << endl;
         system("pause");
         exit(1);
       }
    
      while (inFile >> buffer)   // the read itself controls the loop
       {
         if (buffer != test)     // <-- skips a proxy identical to the one
           getProxy(buffer);     // read just before it
         test = buffer;
       }
    
      inFile.close();
      editList();
      writeToFile();
    
      } //end getlist
    
    void getProxy(string newprxy)
      {
        prxy* newProxy = new prxy;
        newProxy->proxy = newprxy;
        newProxy->next = first;
        first = newProxy;
      } //end getProxy
    
    void display()
      {
        cout << "display" <<endl;
         prxy* current = first;
         while (current != NULL)
          {
           cout << current->proxy << endl;
           current = current->next;
          }//end while
    
       } //end display
    
    void writeToFile()
    {
    
     prxy* current = first;
     char name[] = "Edited List.txt";
      ofstream outFile(name,ios::out);
     while (current != NULL)
      {
        outFile << current->proxy << endl;
        current = current->next;
      }//end while
    
     outFile.flush();
     outFile.close();
    
    }//end writeToFile
    
    
    void editList()
    {
      // For each node, unlink and delete every later node holding the same proxy.
      for (prxy* testProxy = first; testProxy != NULL; testProxy = testProxy->next)
      {
        prxy* previous = testProxy;        // node just before current
        prxy* current  = testProxy->next;  // node being compared

        while (current != NULL)
        {
          if (testProxy->proxy == current->proxy)
          {
            previous->next = current->next;  // unlink the duplicate
            delete current;                  // free it (the old version leaked it)
            current = previous->next;
          }
          else
          {
            previous = current;
            current  = current->next;
          }
        }//end nested while
      }//end main loop

      display();

    }//end editList
    
    
    
    prxy* Previous(prxy* index)
    {prxy* temp=first;
     if(index==first) //special case, index IS the head
      { return first;
      }
    
     while(temp->next != index)
     { temp=temp->next;
     }
     return temp;
    }  //end Previous
    
    
    }; //end class
    
    int ProxyList::count=0;
    
    int main()
    {
    ProxyList prox;
    
    system("PAUSE");
      return 0;
    }
    Last edited by chad101; 01-15-2006 at 01:18 AM.
