Thread: Question about Serialized Objects and file streams

  1. #1
    Registered User
    Join Date
    May 2009
    Posts
    29

    Question about Serialized Objects and file streams

    Hello!

    I'm pretty new to C#, just learning it on my own from what I know between Java and C++/C. The term "serialization" is new to me, although I know what it means.

    I'm making a small program that takes some log information about daily happenings and writes it to a (binary) file, which can later be opened and viewed in a listbox at the bottom of the form.


    Here is my question:

    I assume that the "deserialize" method de-serializes the WHOLE stream. I want to break it into chunks that are the size of my class (like you would in C or C++), so I can write each individual entry down.

    However, in C++ you would use reinterpret_cast<const char*>(&instance), sizeof(object) to tell the program the size of the "chunk" you want. C# doesn't appear to have anything like that, and I haven't found anything online that helps yet. (I'm searching even as I'm writing this.) How would I accomplish this "chopping up" of my data stream in C#?

    Thank you in advance, and have a great day!

    -argV

  2. #2
    Registered User
    Join Date
    Mar 2009
    Location
    england
    Posts
    209
    Hi argv.

    Reading chunks of data from a stream is possible using .NET's System.IO.FileStream class. I've made a little demonstration for you; hopefully the comments are self-explanatory. It reads the file in chunks of 1024 bytes.

    Code:
    using System;
    using System.Collections.Generic;
    using System.Text;
    using System.Linq;
    using System.IO;
    
    namespace ConsoleApplication2
    {
        class Program
        {
            static void Main(string[] args)
            {
                // create a data stream using System.IO.FileStream;
                // "using" releases the file even if an exception is thrown
                using (FileStream fs = new FileStream("path_to_your_file", FileMode.Open, FileAccess.Read))
                {
                    // byte array to store data received from each read
                    byte[] buffer = new byte[1024];

                    // use this integer to monitor how much data we receive with each read
                    int bytes_received;

                    // Read returns 0 only at the end of the stream; a read shorter
                    // than 1024 bytes does not by itself mean the file is finished
                    while ((bytes_received = fs.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        // only the first bytes_received bytes are valid, so
                        // use the .Take function to trim off the unused bytes
                        byte[] chunk = buffer.Take(bytes_received).ToArray();

                        // now you can process the chunk of data as required
    
                    }
                }
            }
    
        }
    }
    Hope this helps.
    Last edited by theoobe; 09-06-2010 at 06:26 AM.
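
    For records of a known, fixed size, the same idea can be wrapped in a small helper that keeps reading until a whole record has arrived. This is only a sketch: ChunkReader and ReadFully are made-up names for illustration, not part of .NET, and it assumes every record in the file has the same byte length.

```csharp
using System;
using System.IO;

// Hypothetical helper, not a .NET class.
public static class ChunkReader
{
    // Fills the buffer completely, or returns false if the stream ends first.
    // Stream.Read may return fewer bytes than requested, so it is called in a loop.
    public static bool ReadFully(Stream stream, byte[] buffer)
    {
        int offset = 0;
        while (offset < buffer.Length)
        {
            int read = stream.Read(buffer, offset, buffer.Length - offset);
            if (read == 0)
                return false; // end of stream before the buffer was full
            offset += read;
        }
        return true;
    }
}
```

    A caller would allocate one byte[] of the record size and call ReadFully in a while loop, stopping at the first false.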

  3. #3
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    The absolute easiest way to serialize data in C# is to use the built-in object serialization:
    Code:
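    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Runtime.Serialization;
    using System.Runtime.Serialization.Formatters.Binary;
    
    //This attribute is important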
    [Serializable]
    class LogEntry
    {
        string _message;
        int _code;
        /* You get the idea; pretty much anything can go here. */
    
    }
    
    class Program
    {
        static void Main(string[] args)
        {
            List<LogEntry> list = new List<LogEntry>();
    
            //add some log entries
    
            /*** Save log entries ***/
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("log_entries.dat", FileMode.Create, FileAccess.Write, FileShare.None))
                {
                    formatter.Serialize(file, list);
                }
            }
    
            /*** Read log entries ***/
            List<LogEntry> newList = null;
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("log_entries.dat", FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    newList = (List<LogEntry>)formatter.Deserialize(file);
                }
            }
            //check contents of newList
    
            Console.WriteLine("Finished");
            Console.ReadLine();
        }
    }
    Now, if you didn't want to read in the whole list of log entries, then don't serialize the list. Instead, save the count of how many items are in the list, then enumerate through the list, serializing each item in order. Then you should be able to deserialize a single log entry at a time.
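
    The count-then-items layout can be sketched as follows. Note the assumptions: this version writes the fields by hand with BinaryWriter and BinaryReader instead of the BinaryFormatter used above, the LogEntry fields are made public for brevity, and CountPrefixedLog is a hypothetical name.

```csharp
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical entry type mirroring the two fields shown above.
public class LogEntry
{
    public string Message = "";
    public int Code;
}

public static class CountPrefixedLog
{
    // Writes the number of entries first, then each entry field by field.
    public static void Save(Stream stream, List<LogEntry> entries)
    {
        // leaveOpen: true, so disposing the writer flushes it without closing the caller's stream
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(entries.Count);
            foreach (LogEntry entry in entries)
            {
                writer.Write(entry.Message); // stored with a length prefix; must not be null
                writer.Write(entry.Code);
            }
        }
    }

    // Reads the count, then exactly that many entries, one at a time.
    public static List<LogEntry> Load(Stream stream)
    {
        var entries = new List<LogEntry>();
        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            int count = reader.ReadInt32();
            for (int i = 0; i < count; i++)
            {
                var entry = new LogEntry();
                entry.Message = reader.ReadString();
                entry.Code = reader.ReadInt32();
                entries.Add(entry);
            }
        }
        return entries;
    }
}
```

    Because each string carries its own length prefix, the entries do not need a fixed size.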
    If I did your homework for you, then you might pass your class without learning how to write a program like this. Then you might graduate and get your degree without learning how to write a program like this. You might become a professional programmer without knowing how to write a program like this. Someday you might work on a project with me without knowing how to write a program like this. Then I would have to do you serious bodily harm. - Jack Klein

  4. #4
    Registered User
    Join Date
    May 2009
    Posts
    29
    Quote Originally Posted by theoobe View Post
    Hi argv.

    Reading chunks of data from a stream is possible using .NET's System.IO.FileStream class. I've made a little demonstration for you; hopefully the comments are self-explanatory. It reads the file in chunks of 1024 bytes.

    I've seen this, but I don't know if I can use it, because some of the data I will be writing will be strings from the string class. I suppose if I make them all character arrays of a fixed length, then this will work?
    Right now my class contains a DateTime, an int, a decimal, and two strings. I'm not sure whether reading all of this as raw bytes will work.

    I thank you for your time!!

    -argV

  5. #5
    Registered User
    Join Date
    May 2009
    Posts
    29
    Quote Originally Posted by pianorain View Post
    The absolute easiest way to serialize data in C# is to use the built-in object serialization...
    Now, if you didn't want to read in the whole list of log entries, then don't serialize the list. Instead, save the count of how many items are in the list, then enumerate through the list, serializing each item in order. Then you should be able to deserialize a single log entry at a time.
    OK, I have seen this used, but didn't know if it applied. So basically I can serialize or deserialize a whole list of entries at once. I'm starting to think that maybe I just need a set "size" for my object/class, because when a person hits the save button, the program simply appends the information to the existing log file. When it's read back, I don't know how many entries there will be, so I must use a foreach, while, or for loop, perhaps with a saved number of entries, or read until no data is left?

    I will work with this to see how it works. Either of these methods may work; perhaps I just need to create a set size for my class?

    Thank you for your time!!

    -argV

  6. #6
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    If you're just appending to the file, then I wouldn't serialize an entire list at a time. Instead, make your log entry class Serializable and serialize it out one at a time, appending to the end of the file. To read them back in, just read them in one at a time until you're at the end of the file.
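
    As a rough sketch of the append-then-read-to-the-end pattern, here is a version that writes the fields by hand with BinaryWriter rather than with object serialization (BinaryFormatter is marked obsolete in current .NET releases, so a hand-written format is one alternative). DailyEntry, its field names, and AppendOnlyLog are all hypothetical; the field types follow argv's description of a DateTime, an int, a decimal, and two strings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

// Hypothetical entry type; field names are invented for illustration.
public class DailyEntry
{
    public DateTime Timestamp;
    public int Number;
    public decimal Amount;
    public string Title;
    public string Notes;
}

public static class AppendOnlyLog
{
    // Appends one entry to the end of the file, creating the file if needed.
    public static void Append(string path, DailyEntry entry)
    {
        using (var file = new FileStream(path, FileMode.Append, FileAccess.Write, FileShare.None))
        using (var writer = new BinaryWriter(file, Encoding.UTF8))
        {
            writer.Write(entry.Timestamp.ToBinary()); // ticks plus DateTimeKind as one long
            writer.Write(entry.Number);
            writer.Write(entry.Amount);
            writer.Write(entry.Title ?? "");          // strings are length-prefixed, so any length works
            writer.Write(entry.Notes ?? "");
        }
    }

    // Yields entries one at a time until the end of the file is reached.
    public static IEnumerable<DailyEntry> ReadAll(string path)
    {
        using (var file = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read))
        using (var reader = new BinaryReader(file, Encoding.UTF8))
        {
            while (file.Position < file.Length)
            {
                var entry = new DailyEntry();
                // fields must be read in exactly the order they were written
                entry.Timestamp = DateTime.FromBinary(reader.ReadInt64());
                entry.Number = reader.ReadInt32();
                entry.Amount = reader.ReadDecimal();
                entry.Title = reader.ReadString();
                entry.Notes = reader.ReadString();
                yield return entry;
            }
        }
    }
}
```

    The save button would call Append once per entry, and the listbox could be filled with a foreach over ReadAll. The trade-off is that adding a field means updating both methods.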

  7. #7
    Registered User C_ntua's Avatar
    Join Date
    Jun 2008
    Posts
    1,853
    To get back your list, just save the number of elements in the list in the file. Well, use some kind of "clever" format. Like:

    ......
    Serialized data
    ......
    #10
    .....
    Serialized data
    .....

    So you just add <--! 10 /--> after each list of data that has been serialized (with one element at a time, don't serialize the actual List).
    Now, you know that once you read a chunk of data the next line is the added line with the number of elements OR another chunk of data.

    Of course this means that you cannot use '#' in your LogEntry, but you might find another character. Every chunk of data will start/end with a special character, so you might find that special character and make sure you don't use it either. For example if you make sure that the chunk doesn't add zero as a header byte you can use (byte)0 to indicate that this is the beginning of a line that has the number of the following elements and use ReadLine().

    Eh, not the best method. A much better method is to use two files. One that has the data and the other the number of the data of each List<LogEntry> that has been written.
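
    A different way to mark boundaries, without reserving '#' or any other character, is to write the byte length of each chunk in front of it. This is not the marker scheme described above but a swapped-in alternative, sketched under the assumption that the stream is seekable (a file, for example); LengthPrefixedFrames, WriteFrame, and ReadFrame are hypothetical names.

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not a .NET class.
public static class LengthPrefixedFrames
{
    // Writes a 4-byte length followed by the payload bytes.
    public static void WriteFrame(Stream stream, byte[] payload)
    {
        // leaveOpen: true, so the caller's stream stays usable afterwards
        using (var writer = new BinaryWriter(stream, Encoding.UTF8, true))
        {
            writer.Write(payload.Length);
            writer.Write(payload);
        }
    }

    // Returns the next payload, or null when the stream is exhausted.
    // Uses Position and Length, so the stream must be seekable.
    public static byte[] ReadFrame(Stream stream)
    {
        if (stream.Position >= stream.Length)
            return null;

        using (var reader = new BinaryReader(stream, Encoding.UTF8, true))
        {
            int length = reader.ReadInt32();
            byte[] payload = reader.ReadBytes(length); // returns fewer bytes if the stream ends early
            if (payload.Length != length)
                throw new EndOfStreamException("Truncated frame payload.");
            return payload;
        }
    }
}
```

    Since the reader never inspects the payload for a delimiter, the data may contain any byte values, including '#' and zero.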

  8. #8
    Registered User
    Join Date
    May 2009
    Posts
    29
    Quote Originally Posted by C_ntua View Post
    To get back your list, just save the number of elements in the list in the file. Well, use some kind of "clever" format. Like:

    ......
    Serialized data
    ......
    #10
    .....
    Serialized data
    .....

    So you just add <--! 10 /--> after each list of data that has been serialized (with one element at a time, don't serialize the actual List).
    Now, you know that once you read a chunk of data the next line is the added line with the number of elements OR another chunk of data.

    Of course this means that you cannot use '#' in your LogEntry, but you might find another character. Every chunk of data will start/end with a special character, so you might find that special character and make sure you don't use it either. For example if you make sure that the chunk doesn't add zero as a header byte you can use (byte)0 to indicate that this is the beginning of a line that has the number of the following elements and use ReadLine().

    Eh, not the best method. A much better method is to use two files. One that has the data and the other the number of the data of each List<LogEntry> that has been written.
    I guess I need to write a small experiment to test how this might work. The sizeof() operator in C# does not allow user-defined data types. I *could* manually determine the size of my object at runtime and divide the size of the stream by that amount to get the number of entries, but apparently C# *may* optimize the code and put values in different places to save space, so there is no guarantee that the chunk I read in is laid out in the same order as it was written.

    As usual, nothing can just be simple. FML.

    thanks,

    -argV

  9. #9
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by C_ntua View Post
    So you just add <--! 10 /--> after each list of data...
    This MIGHT work if you use the SoapFormatter. It doesn't have a chance of working if you use the BinaryFormatter.
    Quote Originally Posted by argv View Post
    the sizeof() operator in C# does not allow user-defined data types.
    You don't need to use the sizeof() operator. You don't care about the size of your data class at run time. Look, here's my previous example slightly modified that does pretty much exactly what you want (assuming that "what you want" is "what you've told us"):
    Code:
    class Program
    {
        static void Main(string[] args)
        {
            List<LogEntry> list = new List<LogEntry>();
    
            //add some log entries
    
            /*** Append log entries ***/
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("log_entries.dat", FileMode.Append, FileAccess.Write, FileShare.None))
                {
                    foreach (LogEntry entry in list)
                        formatter.Serialize(file, entry);
                }
            }
    
            /*** Read log entries ***/
            {
                LogEntry entry = null;
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("log_entries.dat", FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    while (file.Position != file.Length)
                    {
                        entry = (LogEntry)formatter.Deserialize(file);
                        //use contents of entry
                    }
                }
            }
        }
    }
    I really don't understand why you're trying to make it harder than it is.

  10. #10
    Registered User
    Join Date
    Mar 2009
    Location
    england
    Posts
    209
    If you're simply dealing with strings and numbers, why not use XML? There are some fantastic classes in C# for converting XML documents into generic lists in one line of code. See System.Xml.Linq.XDocument for an example of this.
    Last edited by theoobe; 09-07-2010 at 06:42 AM.

  11. #11
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Quote Originally Posted by theoobe View Post
    There are some fantastic classes in C# for converting XML documents into generic lists in one line of code.
    I'd like to see that single line of code that reads an XML file and converts it into a collection of data objects. It's not immediately obvious from the documentation link you posted.

    I repeat: I really don't understand why you're trying to make it harder than it is. If you're going to use XML, you'd probably want to create a schema so you can validate your XML file. Hard? No. More work than you really have to do just to get serialization? Yes. The object serialization logic already handles "strings and numbers" just fine.

  12. #12
    Registered User
    Join Date
    Mar 2009
    Location
    england
    Posts
    209
    Quote Originally Posted by pianorain View Post
    I'd like to see that single line of code that reads an XML file and converts it into a collection of data objects. It's not immediately obvious from the documentation link you posted.
    Hi Pianorain. I've thrown together an example which fits the context of argv's criteria.

    Code:
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Xml.Linq;
    
    namespace ConsoleApplication3
    {
        class TheObject
        {
            // going by the criteria mentioned in one of argv's replies
            public DateTime Timestamp;
            public int Number1;
            public decimal Number2;
            public String String1;
            public String String2;
        }
    
        class Program
        {
            static void Main(string[] args)
            {
                List<TheObject> my_list = new List<TheObject>();
    
                // let's add a few entries to the list
                for (int i = 0; i < 20; i++)
                    my_list.Add(new TheObject
                    {
                        Timestamp = DateTime.Now,
                        Number1 = i,
                        Number2 = i,
                        String1 = ("string1 " + i),
                        String2 = ("string2 " + i)
                    });
    
                // save the list into a xml document
                SaveToXml(ref my_list);
    
                // ok so now we have a xml document...
                // let's clear the list and repopulate it from the xml document...
                my_list = new List<TheObject>();
                LoadFromXml(ref my_list);
            }
    
            static void SaveToXml(ref List<TheObject> list)
            {
                XDocument xml = new XDocument(new XElement("MyXMLFile"));
    
                foreach (TheObject x in list)
                    xml.Element("MyXMLFile").Add(new XElement("Item",
                        new XElement("Timestamp", x.Timestamp.Ticks),
                        new XElement("Number1", x.Number1),
                        new XElement("Number2", x.Number2),
                        new XElement("String1", x.String1),
                        new XElement("String2", x.String2)));
    
                xml.Save("my_file.xml");
            }
    
            static void LoadFromXml(ref List<TheObject> list)
            {
                XDocument xml = XDocument.Load("my_file.xml");
    
                // and here it is... convert xml document into list in one line...
                list = (from x in xml.Descendants("Item")
                        select new TheObject
                         {
                             // XElement's explicit conversions parse with the invariant culture,
                             // matching how SaveToXml wrote the values, whatever the machine's locale
                             Timestamp = new DateTime((long)x.Element("Timestamp")),
                             Number1 = (int)x.Element("Number1"),
                             Number2 = (decimal)x.Element("Number2"),
                             String1 = (string)x.Element("String1"),
                             String2 = (string)x.Element("String2")
                         }).ToList<TheObject>();
    
                // pretty cool, huh :-)
            }
        }
    }
    Hope this helps.
    Last edited by theoobe; 09-07-2010 at 09:38 AM.

  13. #13
    Anti-Poster
    Join Date
    Feb 2002
    Posts
    1,401
    Yup, LINQ is pretty cool. Did you consider ease of maintenance when you suggested that route? What if I needed to add a field to TheObject? In your example, you'd have to update the class and both serialization methods. Using .NET object serialization, you just update the class. Further, if the class has private members that are not publicly accessible (both getters and setters), this method will not work. .NET object serialization does not have that restriction.
    Code:
    using System;
    using System.Collections.Generic;
    using System.Runtime.Serialization.Formatters.Binary;
    using System.Runtime.Serialization;
    using System.IO;
    
    namespace ConsoleApplication3
    {
        [Serializable] //this one line takes care of practically everything
        class TheObject
        {
            // going by the criteria mentioned in one of argv's replies
            public DateTime Timestamp;
            public int Number1;
            public decimal Number2;
            public String String1;
            public String String2;
        }
    
        class Program
        {
            static void Main(string[] args)
            {
                List<TheObject> my_list = new List<TheObject>();
    
                // let's add a few entries to the list
                for (int i = 0; i < 20; i++)
                    my_list.Add(new TheObject
                    {
                        Timestamp = DateTime.Now,
                        Number1 = i,
                        Number2 = i,
                        String1 = ("string1 " + i),
                        String2 = ("string2 " + i)
                    });
    
                // save the list into a binary file
                SaveToBinary(ref my_list);
    
                // ok so now we have a binary file
                // let's clear the list and repopulate it from the binary file
                my_list = new List<TheObject>();
                LoadFromBinary(ref my_list);
            }
    
            static void SaveToBinary(ref List<TheObject> list)
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("my_file.bin", FileMode.Append, FileAccess.Write, FileShare.None))
                {
                    foreach (TheObject entry in list)
                        formatter.Serialize(file, entry); // write each entry on its own, not the whole list
                }
            }
    
            static void LoadFromBinary(ref List<TheObject> list)
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("my_file.bin", FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    while (file.Position != file.Length)
                    {
                        list.Add((TheObject)formatter.Deserialize(file));
                    }
                }
            }
    
            //This doesn't address any particular need, but it does show a way to handle
            //each entry without loading all the entries at once.
            static IEnumerable<TheObject> LoadingIterator()
            {
                IFormatter formatter = new BinaryFormatter();
                using (FileStream file = new FileStream("my_file.bin", FileMode.Open, FileAccess.Read, FileShare.Read))
                {
                    while (file.Position != file.Length)
                    {
                        yield return (TheObject)formatter.Deserialize(file);
                    }
                }
            }
        }
    }
