C Board  

Go Back   C Board > General Programming Boards > C++ Programming

Reply
 
LinkBack Thread Tools Display Modes
Old 05-23-2009, 02:47 AM   #1
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
compare structures

Hi,

I have defined a structure as

Code:
typedef struct ps_report_rows
{
    std::string cp_item_id; 
    std::string cp_object_string;
    std::string mfg_object_string;
    std::string mor_item_id; 
} report_row;
I have 2 different lists containing the report rows structure.

Code:
list<report_row> objectList1; 
list<report_row> objectList2;
Some report rows in the two lists would be common and some unique. I want to find all the unique report rows that exist in objectList1 and store them in another list.

This is the code I have in mind

Code:
list < report_row > ::iterator listIter1;
list < report_row > ::iterator listIter2;
bool isDuplicate = false;
list<report_row> finalList;

finalList.clear();

for (listIter1 = objectList1.begin(); listIter1 != objectList1.end() ; listIter1++)
{
	report_row primaryRow = *listIter1;

	for (listIter2 = objectList2.begin(); listIter2 != objectList2.end() ; listIter2++)
	{
		isDuplicate = false;
		report_row secondaryRow = *listIter2;

		check_if_duplicate_row(primaryRow, secondaryRow, isDuplicate);

		if(isDuplicate == true)
			break;
	}
	if(isDuplicate == false)
	{
			finalList.push_back(primaryRow);
	}
}

void check_if_duplicate_row(report_row primaryRow, report_row secondaryRow, bool& isDuplicate)
{
	if(primaryRow->cp_item_id!=secondaryRow->cp_item_id)
	{
		isDuplicate = false;
		break;
	}
	if(primaryRow->cp_object_string!=secondaryRow->cp_object_string)
	{
		isDuplicate = false;
		break;
	}
	if(primaryRow->mfg_object_string!=secondaryRow->mfg_object_string)
	{
		isDuplicate = false;
		break;
	}
	if(primaryRow->mor_item_id!=secondaryRow->mor_item_id)
	{
		isDuplicate = false;
		break;
	}
	isDuplicate = true;
}
Does this code look good enough? Is there a more efficient way to achieve this?
lazyme is offline   Reply With Quote
Old 05-23-2009, 03:09 AM   #2
The larch
 
Join Date: May 2006
Posts: 3,222
You can overload operator== for report_row and use std::find instead of the inner loop.

Code:
void check_if_duplicate_row(report_row primaryRow, report_row secondaryRow, bool& isDuplicate)
This is very awkward. Why not return bool instead? Also, have you heard of the && operator (and)?

If you want to make it more efficient, you might sort the list first. Then you'll only need to check pairs of neighbours to find out if they are duplicates or not.
__________________
I might be wrong.

Quote:
Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
Quoted more than 1000 times (I hope).
anon is offline   Reply With Quote
Old 05-23-2009, 10:57 PM   #3
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
Quote:
use std::find instead of the inner loop.
Will this work fine?

Code:
list < report_row > ::iterator listIter1;

list<report_row> finalList;

finalList.clear();

for (listIter1 = objectList1.begin(); listIter1 != objectList1.end() ; listIter1++)
{
	report_row primaryRow = *listIter1;

	if(std::find(objectList2.begin(), objectList2.end(), primaryRow) != objectList2.end())
	{
		finalList.push_back(primaryRow);
	}

}
Don't I need to compare individual elements in the structure?

Quote:
If you want to make it more efficient, you might sort the list first. Then you'll only need to check pairs of neighbours to find out if they are duplicates or not.
All the report_row elements in objectList1 are unique whereas objectList2 may contain duplicate report rows. So, don't think sorting the list would help.

Quote:
You can overload operator== for report_row
Yes, will read up on operator overloading and try this.
lazyme is offline   Reply With Quote
Old 05-23-2009, 11:16 PM   #4
C++ Witch
 
laserlight's Avatar
 
Join Date: Oct 2003
Location: Singapore
Posts: 11,318
Quote:
Originally Posted by lazyme
All the report_row elements in objectList1 are unique whereas objectList2 may contain duplicate report rows. So, don't think sorting the list would help.
Describe an algorithm that makes a single pass over the first list and at most a single pass over the second list to determine the elements that are present in the first list but not in the second list.

Sounds tough? If both the lists are sorted, then you can avoid implementing the algorithm yourself by using std::set_difference. Another algorithm to accomplish the same task, of comparable efficiency (if you were using a container with random access instead of std::list), involves sorting the second list only, then making a single pass over the first list while binary searching the second list to determine if the current element of the first list is present in the second.
__________________
C + C++ Compiler: MinGW port of GCC
Build + Version Control System: SCons + Bazaar

Look up a C/C++ Reference and learn How To Ask Questions The Smart Way
laserlight is online now   Reply With Quote
Old 05-25-2009, 02:39 AM   #5
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
Ok. I got your point.

Will this work fine?

Code:
list < report_row > ::iterator listIterEnd, listIter;

list<report_row> finalList;

finalList.clear();

objectList1.sort();
objectList2.sort();

listIterEnd = set_difference(objectList1.begin(), objectList1.end(), objectList2.begin(), objectList2.end(), finalList.begin());
I assume finalList will contain the unique report rows from objectList1. Another person will write a function and will pass me the 2 lists so I cant test with actual data now.
lazyme is offline   Reply With Quote
Old 05-25-2009, 03:00 AM   #6
Registered User
 
Join Date: Jan 2008
Posts: 634
"Describe an algorithm that makes a single pass over the first list and at most a single pass over the second list to determine the elements that are present in the first list but not in the second list."

Easy. Create a third and fourth list and... ^_^;

Soma
phantomotap is offline   Reply With Quote
Old 05-25-2009, 04:20 AM   #7
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
Ok. What about the code I posted. Should that work fine?
lazyme is offline   Reply With Quote
Old 05-26-2009, 05:09 AM   #8
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
objectList1.sort() gives me an error

Code:
bool std::operator <(const std::vector<_Ty,_Alloc> &,const std::vector<_Ty,_Alloc> &)' : could not deduce template argument for 'const std::vector<_Ty,_Alloc> &' from 'ps_report_rows
Will I have to write my own sorting logic?

Also, how can I use set_difference for my structure. I tried the code I posted above but that doesnt work.
lazyme is offline   Reply With Quote
Old 05-26-2009, 07:04 AM   #9
The larch
 
Join Date: May 2006
Posts: 3,222
I have no idea where the vector came from. Do you have operator< defined for ps_report_rows? What does it look like (might be tricky if you want to compare more than one field in a struct)?

Code:
bool operator< (const ps_report_rows& lhv, const ps_report_rows& rhv)
{
     //comparison logic here
}
About using set_difference, I'm quite sure that you need to use std::back_inserter. You just can't assign values to the finalList (since it is empty) but you'll need to push_back results to it (which back_inserter does).

Code:
#include <iterator>
...
list<report_row> finalList;

//finalList.clear(); //it's already empty

objectList1.sort();
objectList2.sort();

listIterEnd = set_difference(objectList1.begin(), objectList1.end(), objectList2.begin(), objectList2.end(), std::back_inserter(finalList));
__________________
I might be wrong.

Quote:
Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
Quoted more than 1000 times (I hope).
anon is offline   Reply With Quote
Old 05-26-2009, 08:03 AM   #10
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
No, I don't have the operator < defined.

I dont necessarily need to compare more than one field in the struct. Can I use set_difference if my lists are sorted by any field in the struct?
lazyme is offline   Reply With Quote
Old 05-26-2009, 08:56 AM   #11
The larch
 
Join Date: May 2006
Posts: 3,222
You'll either need to overload operator<, or provide a comparison function (that is the same for both sort and set_difference).

The overloaded operator could be implemented like this:

Code:
bool operator< (const ps_report_rows& lhv, const ps_report_rows& rhv)
{
     return lhv.cp_item_id < rhv.cp_item_id;
}
However, this means that only the cp_item_id will be considered when set_difference determines which items are the same or not. If this field is equal in two items then it won't matter whether the other fields are the same or not - according to this ordering these two items will be seen as equivalent. (In your original code you are checking whether all 4 fields are equal or not.)
__________________
I might be wrong.

Quote:
Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
Quoted more than 1000 times (I hope).
anon is offline   Reply With Quote
Old 05-26-2009, 10:42 AM   #12
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
I want set_difference to determine the items are same only if all the four fields match.

Can I achieve this with

Code:
bool operator< (const ps_report_rows& lhv, const ps_report_rows& rhv)
{
     if(lhv.cp_item_id!=rhv.cp_item_id)
         return lhv.cp_item_id < rhv.cp_item_id;

     if(lhv.cp_object_string!=rhv.cp_object_string)
         return lhv.cp_object_string< rhv.cp_object_string;

     if(lhv.mfg_object_string!=rhv.mfg_object_string)
         return lhv.mfg_object_string < rhv.mfg_object_string;
   
     return lhv.mor_item_id < rhv.mor_item_id;
}
lazyme is offline   Reply With Quote
Old 05-26-2009, 11:54 AM   #13
The larch
 
Join Date: May 2006
Posts: 3,222
It looks correct.
__________________
I might be wrong.

Quote:
Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
Quoted more than 1000 times (I hope).
anon is offline   Reply With Quote
Old 05-27-2009, 05:11 AM   #14
Registered User
 
Join Date: Oct 2006
Location: LA
Posts: 44
Thanks anon. I have written my code around this and it works fine. The structure contains 13 elements now.

I am posting my complete code below. Please let me know if you can foresee any errors(mainly in the boolean operator < function) I cant see at the moment and also any suggestions to improve this code.

Code:
std::list<pn_report_row_t> finalDeltaList;
finalDeltaList.clear();

objectList1.sort();

objectList2.sort();

set_difference(objectList1.begin(), objectList1.end(), objectList2.begin(), objectList2.end(), std::back_inserter(finalDeltaList));
Code:
std::string siemens_trim(std::string inputStr)
{
    if (!inputStr.empty())
    {
        /*get the first white space in the string*/
        std::string::size_type startPos = inputStr.find_first_not_of("\n\t ");
        if (startPos != std::string::npos)
        {
            /*get the last white space in the string*/
            std::string::size_type endPos = inputStr.find_last_not_of("\n\t ");
            /*trim the white spaces*/
            inputStr = inputStr.substr(startPos, endPos - startPos + 1);
        }
        else
        {
            inputStr.clear();
        }
		return inputStr;
    }
}
Code:
bool operator< ( pn_report_row_t& primaryRow,  pn_report_row_t& secondaryRow)
{
	string cpItemId = siemens_trim(primaryRow.cp_item_id);
	string secCpItemId = siemens_trim(secondaryRow.cp_item_id);

	string cpObjStr = siemens_trim(primaryRow.cp_object_string);
	string secCpObjStr = siemens_trim(secondaryRow.cp_object_string);

	string mfgObjStr = siemens_trim(primaryRow.mfg_object_string);
	string secMfgObjStr = siemens_trim(secondaryRow.mfg_object_string);

	string morItemId = siemens_trim(primaryRow.mor_item_id);
	string secMorItemId = siemens_trim(secondaryRow.mor_item_id);

	string morObjStr = siemens_trim(primaryRow.mor_object_string);
	string secMorObjStr = siemens_trim(secondaryRow.mor_object_string);

	string sorItemId = siemens_trim(primaryRow.sor_item_id);
	string secSorItemId = siemens_trim(secondaryRow.sor_item_id);

	string sorObjStr = siemens_trim(primaryRow.sor_object_string);
	string secSorObjStr = siemens_trim(secondaryRow.sor_object_string);

	string manufacturerType = siemens_trim(primaryRow.manufacturer_type);
	string secManufacturerType = siemens_trim(secondaryRow.manufacturer_type);

	string orderDesc = siemens_trim(primaryRow.order_desc);
	string secOrderDesc = siemens_trim(secondaryRow.order_desc);

	string addnOrderDesc = siemens_trim(primaryRow.additional_order_desc);
	string secAddnOrderDesc = siemens_trim(secondaryRow.additional_order_desc);

	string objStr = siemens_trim(primaryRow.object_string);
	string secObjStr = siemens_trim(secondaryRow.object_string);

	string lastOrdDt = siemens_trim(primaryRow.last_order_date);
	string secLastOrdDt = siemens_trim(secondaryRow.last_order_date);

	string howImpact = siemens_trim(primaryRow.how_impacted);
	string secHowImpact = siemens_trim(secondaryRow.how_impacted);

	if(cpItemId.compare(secCpItemId)!=0)
	{
		return cpItemId < secCpItemId;
	}
	
	if(cpObjStr.compare(secCpObjStr)!=0)
	{
		return cpObjStr < secCpObjStr;
	}
	
	if(mfgObjStr.compare(secMfgObjStr)!=0)
	{
		return mfgObjStr < secMfgObjStr;
	}
	
	if(morItemId.compare(secMorItemId)!=0)
	{
		return morItemId < secMorItemId;
	}
	
	if(morObjStr.compare(secMorObjStr)!=0)
	{
		return morObjStr < secMorObjStr;
	}
	
	if(sorItemId.compare(secSorItemId)!=0)
	{
		return sorItemId < secSorItemId;
	}
	
	if(morItemId.compare(secMorItemId)!=0)
	{
		return morItemId < secMorItemId;
	}
	
	if(sorObjStr.compare(secSorObjStr)!=0)
	{
		return sorObjStr < secSorObjStr;
	}
	
	if(manufacturerType.compare(secManufacturerType)!=0)
	{
		return manufacturerType < secManufacturerType;
	}
	
	if(orderDesc.compare(secOrderDesc)!=0)
	{
		return orderDesc < secOrderDesc;
	}
	
	if(objStr.compare(secObjStr)!=0)
	{
		return objStr < secObjStr;
	}
	
	if(lastOrdDt.compare(secLastOrdDt)!=0)
	{
		return lastOrdDt < secLastOrdDt;
	}
	
	return howImpact < secHowImpact;
}
lazyme is offline   Reply With Quote
Old 05-27-2009, 08:19 AM   #15
The larch
 
Join Date: May 2006
Posts: 3,222
The comparison function looks a bit heavy. If it is not important for some reasons that the stored strings keep their trailing and leading whitespace, you could make sure that the strings are trimmed when you store them in a pn_report_row_t.

Another thing to reduce code might be to store the strings in an array instead of individual variables. You can provide an enum, so each index has a descriptive name (yours aren't very descriptive).

A small fault is that comparison operators should not modify what is being compared and therefore they should take the arguments by const reference.

There's also a small problem with the trimming function. The return statement should be moved to the end of the function, otherwise it won't return anything if the string is empty. (The test whether the string is empty is not very important: find_first_not_of would simply return npos and it is OK to clear an empty string.)

Code:
std::string siemens_trim(std::string inputStr)
{
    if (!inputStr.empty())
    {
        /*get the first white space in the string*/
        std::string::size_type startPos = inputStr.find_first_not_of("\n\t ");
        if (startPos != std::string::npos)
        {
            /*get the last white space in the string*/
            std::string::size_type endPos = inputStr.find_last_not_of("\n\t ");
            /*trim the white spaces*/
            inputStr = inputStr.substr(startPos, endPos - startPos + 1);
        }
        else
        {
            inputStr.clear();
        }
    }
    return inputStr;
}

class pn_report_row_t
{
public:
    enum
    {
        cpItemId, cpObjStr, mfgObjStr, morItemId, morObjStr, sorItemId, sorObjStr,
        manufacturerType, orderDesc, addnOrderDesc, objStr, lastOrdDt, howImpact,
        maxField
    };
    const std::string& get(unsigned index) const
    {
        assert(index < maxField);
        return fields[index];
    }
    void set(const std::string& value, unsigned index)
    {
        assert(index < maxField);
        fields[index] = siemens_trim(value);
    }
private:
    std::string fields[maxField];
};

bool operator< (const pn_report_row_t& primaryRow, const pn_report_row_t& secondaryRow)
{
    for (unsigned i = 0; i != pn_report_row_t::maxField; ++i) {
        if (primaryRow.get(i) != secondaryRow.get(i)) {
            return primaryRow.get(i) < secondaryRow.get(i);
        }
    }
    return false;
}
You could also optimize the trimming function a bit by avoiding some unnecessary copying. (A single exit point in a function is a good thing, but it isn't that important in C++ where clean-up can be managed by objects.)

Code:
std::string siemens_trim(const std::string& inputStr)
{
    if (!inputStr.empty())
    {
        /*get the first white space in the string*/
        std::string::size_type startPos = inputStr.find_first_not_of("\n\t ");
        if (startPos != std::string::npos)
        {
            /*get the last white space in the string*/
            std::string::size_type endPos = inputStr.find_last_not_of("\n\t ");
            /*trim the white spaces*/
            return inputStr.substr(startPos, endPos - startPos + 1);
        }
    }
    return std::string();
}
__________________
I might be wrong.

Quote:
Thank you, anon. You sure know how to recognize different types of trees from quite a long way away.
Quoted more than 1000 times (I hope).

Last edited by anon; 05-27-2009 at 08:24 AM.
anon is offline   Reply With Quote
Reply

Thread Tools
Display Modes

Forum Jump

Similar Threads
Thread Thread Starter Forum Replies Last Post
Dynamic array of structures containing yet another dynamic array of structures innqubus C Programming 2 07-11-2008 07:39 AM
Structures within Structures Misko82 C Programming 2 08-27-2007 12:25 AM
Classes and Structures. jrahhali C++ Programming 6 03-28-2004 05:03 PM
How do I compare structures?? aspand C Programming 4 06-09-2002 10:47 AM
Methods for Sorting Structures by Element... Sebastiani C Programming 9 09-14-2001 12:59 PM


All times are GMT -6. The time now is 01:47 PM.


Powered by vBulletin® Version 3.8.1
Copyright ©2000 - 2010, Jelsoft Enterprises Ltd.
Search Engine Optimization by vBSEO 3.3.2

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22