Thread: Big Integer Implementation

  1. #1
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308

    Big Integer Implementation

    Hey! I've been working on implementing a Big Integer class for a few days now (mostly because I'm finding it useful for Project Euler problems and because I've wanted to do so for a while). My basic aim right now is to make my implementation at least as fast as Python or Java in-built Big Integer classes. I do have to let you know that I'm not very good at writing object-oriented code and I've tried keeping my code as simple as I could/need.

    Now obviously I can program a Modular Arithmetic class for problem solving instead (and in fact, I'm working on one!) but there's something different I get to achieve by implementing Big Integer.

    Till now, I've implemented only construction, addition, subtraction and multiplication. I'd like to have my code reviewed and criticised on. If someone could suggest better ways of doing things, and I can understand how the implementation would work, I'd use it to optimise the operations. I'm omitting the multiplication part as I know it's not very optimal because of a badly implemented Karatsuba algorithm. Also note that I've had no exposure to programming something for Big Integers except for simply having read/seen about it on some sites or using it in Python.

    Presently, the only two benchmarks I've performed are generating 100000(1e5) valid 100-digit Big Integers and simply finding their sum. This takes 0.0995400000 seconds. For the other benchmark, scroll to the bottom. Now, I don't know if that's good or bad but hopefully you guys will let me know. Also, the algorithm I use in addition and subtraction is the simple method taught at elementary school level. I also don't know of a way to check if my implementation is correct (the only way I think it is correct is by testing on hand some simple operations with small numbers). If there's some way to test my implementation, like a program I can run against mine, please do let me know. Also, if you have any comments on code style and documentation (yeah, I know I'm terrible, but hey, I'm trying!), do share your opinions too.

    Thanks in advance!

    Code:
    /*
        Written by: Aryan V S
        Date: Tuesday 2020-04-07
    */
    
    
    #ifndef BIG_INTEGER_H
    #define BIG_INTEGER_H
    
    
    #include <vector>
    #include <string>
    #include <initializer_list>
    #include <type_traits>
    #include <typeinfo>
    #include <cassert>
    #include <cmath>
    
    
    class BigInt {
    private:
        /**
         * std::vector that stores the digits of a Big Integer
         * Least significant digits stored first
         *
         * A digit 'Digits [i]' is not stored as it's ASCII value (as expected in
         * a vector of char) i.e. '1' is stored simply as 1 instead of it's ASCII
         * value 49. This is done to make arithmetic operations faster. When the
         * actual character value is needed (like in the to_string() method), simply
         * '0' is added in each step of conversion
         */
        std::vector <char> Digits;
        /**
         * std::bool that stores the sign of a Big Integer
         * false -> negative
         * true  -> non-negative
         */
        bool Sign;
    public:
        /**
         * Constructs a Big Integer with default values
         *   Sign: true
         * Number: 0
         */
        BigInt ();
        /**
         * Constructs a Big Integer from primitive integer types
         */
        template <typename T> BigInt (T);
        /**
         * Constructs a Big Integer from a string
         * Valid:
         *   -> "-...." (negative)
         *   -> "....." (non-negative)
         */
        BigInt (const std::string&);
        BigInt (const char*);
        /**
         * Constructs a Big Integer from std::initialzer_list
         * Valid:
         *   -> {-a,b,c,...} (negative)
         *   -> {a,b,c,....} (non-negative)
         * All integers must be in the range [0,9]
         */
        template <typename T> BigInt (const std::initializer_list <T>&);
        /**
         * Constructs a Big Integer from std::vector
         * Valid:
         *   -> {-a,b,c,...} (negative)
         *   -> {a,b,c,....} (non-negative)
         * All integers must be in the range [0,9]
         */
        template <typename T> BigInt (const std::vector <T>&);
        /**
         * Copy Constructor
         */
        BigInt (const BigInt&);
        /**
         * Copy Assignment
         */
        BigInt& operator = (const BigInt&);
        /**
         * Move Constructor
         */
        BigInt (BigInt&&);
        /**
         * Move Assignment
         */
        BigInt& operator = (BigInt&&);
        /**
         * Output stream overload
         */
        friend std::ostream& operator << (std::ostream&, const BigInt&);
        /**
         * Input stream overload
         */
        friend std::istream& operator >> (std::istream&, BigInt&);
        /* Relational Methods */
        bool operator == (const BigInt&) const;
        bool operator != (const BigInt&) const;
        bool operator <  (const BigInt&) const;
        bool operator >  (const BigInt&) const;
        bool operator <= (const BigInt&) const;
        bool operator >= (const BigInt&) const;
        /* Arithmetic Methods */
        BigInt& operator +=  (const BigInt&);
        BigInt& operator -=  (const BigInt&);
        BigInt& operator *=  (const BigInt&);
        BigInt& operator /=  (const BigInt&);
        BigInt& operator %=  (const BigInt&);
        BigInt& operator &=  (const BigInt&);
        BigInt& operator |=  (const BigInt&);
        BigInt& operator ^=  (const BigInt&);
        BigInt& operator <<= (const BigInt&);
        BigInt& operator >>= (const BigInt&);
        BigInt& operator ++  ();
        BigInt& operator --  ();
        BigInt  operator ++  (int);
        BigInt  operator --  (int);
        BigInt  operator +    () const;
        BigInt  operator -    () const;
        /* Utility Methods */
        /**
         * Accessing digits
         */
        const int& operator [] (int) const;
              int& operator [] (int);
        /**
         * to_vector: Returns a vector containing the digits. If parameter is false, default order returned.
         *            If parameter is true, reversed order returned (i.e. most significant digits start first).
         *            Default order is the order originally stored i.e. from LSD to MSD
         */
        std::vector <int> to_vector (bool) const;
        /**
         * to_string: Returns a string containing the digits. If parameter is false, default order returned.
         *            If parameter is true, reversed order returned (i.e. most significant digits start first).
         *            Default order is the order originally stored i.e. from LSD to MSD
         */
        std::string to_string (bool) const;
        /**
         * size: Returns the number of digits in the Big Integer
         */
        size_t size () const;
        /**
         * abs: Returns the absolute value of the Big Integer
         */
        BigInt abs () const;
    };
    
    
    BigInt::BigInt ()
        : Digits ({0})
        , Sign   (true)
    { }
    
    
    template <typename T>
    BigInt::BigInt (T Integer)
        : Digits ()
        , Sign   (Integer >= 0) {
        /// Only types like int, long long, etc are allowed
        static_assert(std::is_integral <T>::value, "Construction requires Integer types");
        Sign = Integer >= T(0);
        do {
            int Value = Integer % 10;
            /// Types like char, wchar_t, etc fail the assertion
            assert(Value >= 0 && Value <= 9);
            Digits.emplace_back(Value);
            Integer /= 10;
        }
        while (Integer != 0);
    }
    
    
    BigInt::BigInt (const std::string& Integer)
        : Digits ()
        , Sign   (*Integer.begin() != '-') {
        size_t N = Integer.size();
          if (Sign) Digits.resize(N);
        else        Digits.resize(N - 1);
        for (int i = N - 1; i >= !Sign; --i) {
            assert('0' <= Integer [i] && Integer [i] <= '9');
            Digits [N - i - 1] = Integer [i] - '0';
        }
    }
    
    
    BigInt::BigInt (const char* Integer)
        : BigInt ((std::string)Integer)
    { }
    
    
    template <typename T>
    BigInt::BigInt (const std::initializer_list <T>& Integer)
        : Digits ()
        , Sign   (*Integer.begin() >= 0) {
        /// Only types like int, long long, etc are allowed
        static_assert(std::is_integral <T>::value, "Construction requires Integer types");
        size_t N = Integer.size();
        Digits.resize(N);
        for (size_t i = 0; i < N; ++i) {
            int Value = std::abs(*(Integer.end() - i - 1));
            /// Types like char, wchar_t, etc fail the assertion
            assert(Value >= 0 && Value <= 9);
            Digits [i] = Value;
        }
    }
    
    
    template <typename T>
    BigInt::BigInt (const std::vector <T>& Integer)
        : Digits ()
        , Sign   (*Integer.begin() >= 0) {
        /// Only types like int, long long, etc are allowed
        static_assert(std::is_integral <T>::value, "Construction requires Integer types");
        size_t N = Integer.size();
        Digits.resize(N);
        for (size_t i = 0; i < N; ++i) {
            int Value = std::abs(*(Integer.end() - i - 1));
            /// Types like char, wchar_t, etc fail the assertion
            assert(Value >= 0 && Value <= 9);
            Digits [i] = Value;
        }
    }
    
    
    BigInt::BigInt (const BigInt& BigInteger)
        : Digits (BigInteger.Digits)
        , Sign   (BigInteger.Sign)
    { }
    
    
    BigInt& BigInt::operator = (const BigInt& BigInteger) {
        Digits.assign(BigInteger.Digits.begin(), BigInteger.Digits.end());
        Sign = BigInteger.Sign;
        return *this;
    }
    
    
    BigInt::BigInt (BigInt&& BigInteger)
        : Digits (std::move(BigInteger.Digits))
        , Sign   (std::move(BigInteger.Sign))
    { }
    
    
    BigInt& BigInt::operator = (BigInt&& BigInteger) {
        Digits.assign(BigInteger.Digits.begin(), BigInteger.Digits.end());
        Sign = BigInteger.Sign;
        return *this;
    }
    
    
    std::ostream& operator << (std::ostream& stream, const BigInt& BigInteger) {
        if (!BigInteger.Sign) stream << '-';
        for (auto Iterator = BigInteger.Digits.rbegin(); Iterator != BigInteger.Digits.rend(); ++Iterator)
            stream << int(*Iterator);
        return stream;
    }
    
    
    std::istream& operator >> (std::istream& stream, BigInt& BigInteger) {
        std::string Integer;
        stream >> Integer;
        BigInteger = BigInt(Integer);
        return stream;
    }
    
    
    inline bool BigInt::operator == (const BigInt& R) const {
        /// If numbers have different sign, they cannot be equal
        if (Sign != R.Sign)
            return false;
        /// If the number of digits in the two numbers is different, they cannot be equal
        if (Digits.size() != R.Digits.size())
            return false;
        /// The number are of same sign and have the same number of digits
        /// To test for equality, each digit must be same
        for (size_t i = 0; i < Digits.size(); ++i)
            if (Digits [i] != R.Digits [i])
                return false;
        return true;
    }
    
    
    inline bool BigInt::operator != (const BigInt& R) const {
        return !(*this == R);
    }
    
    
    inline bool BigInt::operator < (const BigInt& R) const {
        /// First is -ve, Second is non -ve
        if (!Sign && R.Sign)
            return true;
        /// First is non -ve, Second is -ve
        if (Sign && !R.Sign)
            return false;
        /// Both are non -ve
        if (Sign && R.Sign) {
            /// First is less than Second in terms of number of digits
            if (Digits.size() < R.Digits.size())
                return true;
            /// First is greater than Second in terms of number of digits
            else if (Digits.size() > R.Digits.size())
                return false;
            /// First and second have the same number of digits
            else {
                size_t N = Digits.size();
                /// Start check from MSD
                for (int i = N - 1; i >= 0; --i) {
                    if (Digits [i] > R.Digits [i])
                        return false;
                    if (Digits [i] < R.Digits [i])
                        return true;
                }
            }
        }
        /// Both are -ve
        else {
            if (Digits.size() < R.Digits.size())
                return false;
            else if (Digits.size() > R.Digits.size())
                return true;
            else {
                size_t N = Digits.size();
                /// Start check from MSD
                for (int i = N - 1; i >= 0; --i) {
                    if (Digits [i] > R.Digits [i])
                        return true;
                    if (Digits [i] < R.Digits [i])
                        return false;
                }
            }
        }
        /// All digits are same
        return false;
    }
    
    
    inline bool BigInt::operator > (const BigInt& R) const {
        return !(*this <= R);
    }
    
    
    inline bool BigInt::operator <= (const BigInt& R) const {
        return *this < R || *this == R;
    }
    
    
    inline bool BigInt::operator >= (const BigInt& R) const {
        return *this > R || *this == R;
    }
    
    
    BigInt& BigInt::operator += (const BigInt& R) {
        size_t MinSize = std::min(Digits.size(), R.Digits.size());
        size_t MaxSize = std::max(Digits.size(), R.Digits.size());
        bool isLesser = *this < R;
        /// Opposite signs
        if (Sign ^ R.Sign) {
            BigInt L;
            bool Swapped = false;
            /// Case: Left(L:(-ve)), Right(R:(+ve))
            if (isLesser) {
                /// Case: If abs(L) is lesser or equal to R, sum is non -ve
                if (this->abs() <= R) {
                    L = *this;
                    *this = R;
                    Swapped = true;
                }
                /// Case: If abs(L) is greater than R, sum is -ve
                else { /* Nothing to do here. Everything is good! */ }
            }
            /// Case: Left(+ve), Right(-ve)
            else {
                /// Case: If L is lesser than abs(R), sum is -ve
                if (*this < R.abs()) {
                    L = *this;
                    *this = R;
                    Swapped = true;
                }
                /// Case: If L is greater or equal abs(R), sum is non -ve
                else { /* Nothing to do here. Everything is good! */ }
            }
            assert(Digits.size() == MaxSize);
    
    
            std::function <void(const BigInt&)> Subtract = [&] (const BigInt& X) {
                size_t i = 0;
                bool Borrow = false;
                for (; i < MinSize; ++i) {
                    if (Digits [i] < X.Digits [i])
                        Borrow = true;
                    else
                        Borrow = false;
                    Digits [i] = (Borrow ? 10 + Digits [i] - X.Digits [i] : Digits [i] - X.Digits [i]);
                    if (Borrow) {
                        size_t j = i + 1;
                        while (j < MaxSize) {
                            if (Digits [j] != 0) {
                                --Digits [j];
                                break;
                            }
                            else Digits [j] = 9;
                            ++j;
                        }
                    }
                }
            };
    
    
            /// If a swap has occured, we need to subtract L from *this
            if (Swapped) Subtract(L);
            /// Otherwise, we need to subtract R from *this
            else Subtract(R);
    
    
            size_t CountLeadingZeroes = 0;
            for (int i = MaxSize - 1; i > 0; --i)
                if (Digits [i] == 0) ++CountLeadingZeroes;
                else break;
            Digits.resize(MaxSize - CountLeadingZeroes);
        }
        /// Same signs
        else {
            size_t CarryOver = 0, i = 0;
            for (; i < MinSize; ++i) {
                Digits [i] += R.Digits [i] + CarryOver;
                CarryOver = Digits [i] / 10;
                Digits [i] %= 10;
            }
            if (isLesser)
                for (; i < MaxSize; ++i) {
                    Digits.emplace_back(R.Digits [i] + CarryOver);
                    CarryOver = Digits [i] / 10;
                    Digits [i] %= 10;
                }
            else
                for (; i < MaxSize; ++i) {
                    Digits [i] += CarryOver;
                    CarryOver = Digits [i] / 10;
                    Digits [i] %= 10;
                }
            while (CarryOver != 0) {
                Digits.emplace_back(CarryOver % 10);
                CarryOver /= 10;
            }
        }
        return *this;
    }
    
    
    BigInt& BigInt::operator -= (const BigInt& R) {
        return *this += -R;
    }
    
    
    BigInt& BigInt::operator ++ () {
        return *this += 1;
    }
    
    
    BigInt& BigInt::operator -- () {
        return *this -= 1;
    }
    
    
    BigInt BigInt::operator ++ (int) {
        BigInt Copy = *this;
        ++*this;
        return Copy;
    }
    
    
    BigInt BigInt::operator -- (int) {
        BigInt Copy = *this;
        --*this;
        return Copy;
    }
    
    
    BigInt BigInt::operator + () const {
        return *this;
    }
    
    
    BigInt BigInt::operator - () const {
        BigInt B = *this;
          if (B.Sign && B != 0) B.Sign = false;
        else                    B.Sign = true;
        return B;
    }
    
    
    std::string BigInt::to_string (bool Original = true) const {
        size_t N = Digits.size();
        std::string S (N, '0');
        for (size_t i = 0; i < N; ++i)
            S [i] = (Original ? Digits [N - i - 1] : Digits [i]) + '0';
        return S;
    }
    
    
    std::vector <int> BigInt::to_vector (bool Original = true) const {
        size_t N = Digits.size();
        std::vector <int> V (N);
        for (size_t i = 0; i < N; ++i)
            V [i] = (Original ? Digits [N - i - 1] : Digits [i]);
        return V;
    }
    
    
    inline size_t BigInt::size () const {
        return Digits.size();
    }
    
    
    inline BigInt BigInt::abs () const {
        if (Sign)
            return *this;
        else
            return -*this;
    }
    
    
    BigInt operator + (const BigInt& L, const BigInt& R) {
        return BigInt(L) += R;
    }
    
    
    BigInt operator - (const BigInt& L, const BigInt& R) {
        return BigInt(L) -= R;
    }
    
    #endif /* BIG_INTEGER_H */
    Benchmark 2: This computes (2^1e5)*A. A is a 100-digit number with all digits 9. This code takes about 15.4884130000 seconds (which is pretty bad, I think) compared to using Python to do the same in under 5 seconds. So, there's definitely something better about the Python Big Integer implementation, which I'd like to be as fast as. Please share your opinions on how I should be implementing to optimise for space and time. I know if I use bitsets to implement BigInt, it'd become really simple for me code and implement everything, which would also be really fast and space optimised. The only problem I'm facing is how would I calculate the actual number for presenting output. I have some ideas that I'll try but any input from others will be very valuable.
    Code:
    int main() {
        ios::sync_with_stdio(false);
        cin.tie(nullptr); cout.tie(nullptr);
        cout.precision(10);
        cout << fixed;// << boolalpha;
    
        BigInt A = "9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999";
    
        for (int i = 0; i < 1e5; ++i)
            A += A;
    
        cout << A << endl;
    
        return 0;
    }
    Last edited by Zeus_; 04-10-2020 at 07:22 AM.
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  2. #2
    and the hat of int overfl Salem's Avatar
    Join Date
    Aug 2001
    Location
    The edge of the known universe
    Posts
    39,656
    Replace Digits with an integer, so you can work on 9 decimal digits at once, not one at a time.
    If you dance barefoot on the broken glass of undefined behaviour, you've got to expect the occasional cut.
    If at first you don't succeed, try writing your phone number on the exam paper.

  3. #3
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    The thought has crossed my mind before. I could do it that way, but I have a question. Do you, by any chance, know/estimate by what factor would the benchmarking results improve? Like if I use Karatsuba over elementary multiplication, I have a 15-ish times better complexity for 1e5 100-digit number multiplications i.e. the algorithm finishes processing 15 times faster on average. Or, would you like me to try the implementation and figure out the hard way? I'm okay with either choice. Also, if I choose to implement using bitsets, trying to do things in binary like how it works for actual ints because this will obviously be incredibly fast, how will I build the original number for, let's say, displaying? I'm guessing it'd be something related to Number Theory to find the digit at some position and figure out for all positions, no?
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  4. #4
    Registered User Sir Galahad's Avatar
    Join Date
    Nov 2016
    Location
    The Round Table
    Posts
    277
    That is a tough one to implement. I tried once and got about the same results.

    Also remember, that Python library you're comparing against is a highly-refined piece of software. Lot's of effort has gone into optimizations and the like. It's going to take a while to catch up!

    Mine used "long-word" binary maths based on simple schoolbook-style operations, plus all the most essential functions (logs, powers, roots, etc). Still wasn't even half as fast as something like GMP.

  5. #5
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    I'm all about learning and testing new things. So, I don't mind the time it might end up consuming, but in the end, I'm trying to come up with a really light-weight and fast implementation of Big Integers, one that I can use as a possible template for BigInt arithmetic in problem solving. You may suggest to use Boost, or GMP let's say, but the reason I wouldn't work with those is that they're not really available as add-ons on most problem solving sites.

    Yes, it's obvious that the Python library implementation for math is very sophisticated. I'm hoping for this to be a good learning experience in library programming (because I'm terrible with object-oriented code) and math.

    PS: I hope I'm patient enough with this to reach the end I envision, and that this isn't something that I end up working on to pass time in the holidays I've got due to COVID-19

    Also, I commented a "Lmao" in some thread as a reply to your post. It's now that I realise that the piece of congested code is a part of your signature, contrary to your advice related to something about formatting code.
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  6. #6
    Registered User
    Join Date
    Dec 2017
    Posts
    1,626
    bitsets have nothing to do with it.
    Instead of char (the baby bigint base) use uint32_t.
    Multiply them together into a uint64_t.
    This should be at least 10 times faster and take about a 10th of the space.
    A little inaccuracy saves tons of explanation. - H.H. Munro

  7. #7
    Registered User Sir Galahad's Avatar
    Join Date
    Nov 2016
    Location
    The Round Table
    Posts
    277
    Quote Originally Posted by Zeus_ View Post
    I'm all about learning and testing new things. So, I don't mind the time it might end up consuming, but in the end, I'm trying to come up with a really light-weight and fast implementation of Big Integers, one that I can use as a possible template for BigInt arithmetic in problem solving. You may suggest to use Boost, or GMP let's say, but the reason I wouldn't work with those is that they're not really available as add-ons on most problem solving sites.

    Yes, it's obvious that the Python library implementation for math is very sophisticated. I'm hoping for this to be a good learning experience in library programming (because I'm terrible with object-oriented code) and math.

    PS: I hope I'm patient enough with this to reach the end I envision, and that this isn't something that I end up working on to pass time in the holidays I've got due to COVID-19

    Also, I commented a "Lmao" in some thread as a reply to your post. It's now that I realise that the piece of congested code is a part of your signature, contrary to your advice related to something about formatting code.
    Funny! You know it should really display tau. But that would just make things more confusing, wouldn't it?

    Well it sounds like a worthwhile project. Drop me a PM if you ever feel the urge to wade through some old code.

    Good luck!

  8. #8
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    A small doubt (probably stupid). In my implementation above, I store digits from LSD (least significant digit) to MSD (most significant digits). Considering the implementation with 9 digits at each position of Digits (i.e. Digits being a vector of uint32_t), would it be better to have storage from MSDs to LSDs, or is it better having it this way? Having MSD to LSD storage, construction becomes pretty easy and arithmetic would be easier to implement too (as far as I think).
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  9. #9
    Registered User Sir Galahad's Avatar
    Join Date
    Nov 2016
    Location
    The Round Table
    Posts
    277
    Quote Originally Posted by Zeus_ View Post
    A small doubt (probably stupid). In my implementation above, I store digits from LSD (least significant digit) to MSD (most significant digits). Considering the implementation with 9 digits at each position of Digits (i.e. Digits being a vector of uint32_t), would it be better to have storage from MSDs to LSDs, or is it better having it this way? Having MSD to LSD storage, construction becomes pretty easy and arithmetic would be easier to implement too (as far as I think).
    Hard to say. You're method looks kind of similar to how this Javascript library does it. Mine's much more primitive, by comparison. I mean it doesn't even do Karatsuba!

  10. #10
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    Hey! I've been working on implementing Salem and john's suggestion for storing 9 digits at every index. I actually started out by implementing exactly that. I did some benchmarking as well. Somewhere along the line of implementing for 9 digits at every index came to me an idea. I thought "why not write the code in such a way that it makes benchmarking for different number of digits at an index easier". So, that's exactly what I've ended up doing. Plus, I added multi-base support (2-10) that can let you deal with BigInt's in different bases as per your choice. All this looks cool to me (hehe!) because I'm doing this kinda stuff for the first time, but if you think I'm just wasting my time, please do comment and let me know. I've been a little lazy this week and could've completed earlier but there's nothing like rewatching Interstellar for the umpteenth time and a bunch of other Netflix shows.


    I've only done benchmarking for addition of positive numbers, and against Python 3.7 64-bit. Also, I compile my code on g++ 9.2.0. Here's the relevant "new" code:
    (Note: I haven't implemented operator += completely yet. Only finished the part that was required for benchmarking for positive number addition)


    Code:
    /*
        Written by: Aryan V S
        Date: Tuesday 2020-04-07
    */
    
    
    #ifndef BIG_INTEGER_H
    #define BIG_INTEGER_H
    
    
    #include <vector>
    #include <string>
    #include <initializer_list>
    #include <type_traits>
    #include <typeinfo>
    #include <cassert>
    #include <cmath>
    
    
    static const size_t   DIGITS_PER_SLOT = 18;
    static const uint64_t MAX_VALUE       = 1e18;
    
    
    /// Default BigInt is base 10
    template <size_t BASE = 10>
    class BigInt {
        /* [IMPORTANT]:
           -> Construction requires the constructing value to be represented in
              base 10, if you want expected behaviour.
              Example: BigInt <2> B = 256;
              (Even though the base of BigInt is 2, you are expected to pass the constructing
               value in base 10 i.e. if you want 256 to be represented as a Big Integer in
               base 2, you must construct with 256 itself, and not its base 2 equivalent 100000000)
    
    
            -> Support only for bases [2,10] (till now!)
        */
        static_assert (2 <= BASE && BASE <= 10, "BASE support: [2,10]");
    public:
        /**
         * std::vector that stores the digits of a Big Integer
         * Storage from Least Sigificant Digits to Most Significant
         * Digits. At each position, (DIGITS_PER_SLOT) digits of the number are
         * stored from MSD to LSD.
         *
         * Example 1: BigInt B = 12345678901234567890
         *            Digits: {345678901234567890, 12}
         *
         * Example 2: BigInt B = 1000000000000000000094
         *            Digits: {94, 1000}
         *
         */
        std::vector <uint64_t> Digits;
        /**
         * bool that stores the sign of a Big Integer
         * false -> negative
         * true  -> non-negative
         */
        bool Sign;
    public:
        /**
         * Constructs a Big Integer with default values
         *   Sign: true
         * Number: 0
         */
        BigInt ();
        /**
         * Constructs a Big Integer from primitive integer types (int, long long, etc)
         */
        template <typename T> BigInt (T);
        /**
         * Constructs a Big Integer from a string
         * Valid:
         *   -> "-...." (negative)
         *   -> "....." (non-negative)
         * All characters {.....} expected to be in the range ['0', '9']
         * Construction assumes that a valid number has been provided
         * (example: one without leading zeroes or "-0", etc.)
         */
        BigInt (const std::string&);
        BigInt (const char*);
        /* Copy Constructor */
        BigInt (const BigInt <BASE>&);
        /* Copy Assignment */
        BigInt& operator = (const BigInt <BASE>&);
        /* Move Constructor */
        BigInt (BigInt <BASE>&&);
        /* Move Assignment */
        BigInt& operator = (BigInt <BASE>&&);
        /* Output stream overload */
        template <size_t B>
        friend std::ostream& operator << (std::ostream&, const BigInt <B>&);
        /* Relational Methods */
        bool operator == (const BigInt <BASE>&) const;
        bool operator != (const BigInt <BASE>&) const;
        bool operator <  (const BigInt <BASE>&) const;
        bool operator >  (const BigInt <BASE>&) const;
        bool operator <= (const BigInt <BASE>&) const;
        bool operator >= (const BigInt <BASE>&) const;
        /* Arithmetic Methods */
        BigInt <BASE>& operator +=  (const BigInt <BASE>&);
        BigInt <BASE>  operator +    () const;
        BigInt <BASE>  operator -    () const;
        /**
         * size: Returns the number of digits in the Big Integer
         */
        size_t size () const;
        /**
         * abs: Returns the absolute value of the Big Integer
         */
        BigInt <BASE> abs () const;
    };
    
    
    template <size_t BASE>
    BigInt <BASE>::BigInt ()
        : Digits ({0})
        , Sign   (true)
    { }
    
    
    template <size_t BASE>
    template <typename T>
    BigInt <BASE>::BigInt (T Integer)
        : Digits ()
        , Sign   (Integer >= 0) {
        /// Only types like int, long long, etc are allowed
        static_assert(std::is_integral <T>::value, "Construction requires Integer types");
        Sign = Integer >= T(0);
        if (!Sign)
            Integer = -Integer;
        int Index = -1;
        uint64_t Multiplier = 1, Value;
        do {
            if (Multiplier == MAX_VALUE)
                Multiplier = 1;
            if (Multiplier == 1) {
                Digits.emplace_back(0);
                ++Index;
            }
            Value = Integer % BASE;
            Digits [Index] += Value * Multiplier;
            Multiplier *= 10;
            Integer /= BASE;
        }
        while (Integer != 0);
    }
    
    
    template <size_t BASE>
    BigInt <BASE>::BigInt (const std::string& Integer)
        : Digits ()
        , Sign   (*Integer.begin() != '-') {
        int N = Integer.size(), Index = -1;
        /// StrInt: The string from integer is to be obtained
        ///  Begin: First index of integer
        ///    End: (Last + 1) index of integer
        std::function <uint64_t(const std::string&, int, int)> to_int =
        [&] (const std::string& StrInt, int Begin, int End) -> uint64_t {
            uint64_t Int = 0, multiplier = 1;
            while (--End >= Begin) {
                /// Check if invalid character
                assert('0' <= StrInt [End] && StrInt [End] <= '9');
                Int += (StrInt [End] - '0') * multiplier;
                multiplier *= 10;
            }
            return Int;
        };
        uint64_t Multiplier = 1;
        for (int begin = !Sign; begin < N; begin += DIGITS_PER_SLOT) {
            uint64_t Int = to_int(Integer, begin, begin + std::min((int)DIGITS_PER_SLOT, N - begin)), Value;
            do {
                if (Multiplier == MAX_VALUE)
                    Multiplier = 1;
                if (Multiplier == 1) {
                    Digits.emplace_back(0);
                    ++Index;
                }
                Value = Int % BASE;
                Digits [Index] += Value * Multiplier;
                Multiplier *= 10;
                Int /= BASE;
            }
            while (Int != 0);
        }
    }
    
    
    template <size_t BASE>
    BigInt <BASE>::BigInt (const char* Integer)
        : BigInt ((std::string)Integer)
    { }
    
    
    template <size_t BASE>
    BigInt <BASE>::BigInt (const BigInt <BASE>& BigInteger)
        : Digits (BigInteger.Digits)
        , Sign   (BigInteger.Sign)
    { }
    
    
    template <size_t BASE>
    BigInt <BASE>& BigInt <BASE>::operator = (const BigInt <BASE>& BigInteger) {
        Digits.assign(BigInteger.Digits.begin(), BigInteger.Digits.end());
        Sign = BigInteger.Sign;
        return *this;
    }
    
    
    template <size_t BASE>
    BigInt <BASE>::BigInt (BigInt <BASE>&& BigInteger)
        : Digits (std::move(BigInteger.Digits))
        , Sign   (std::move(BigInteger.Sign))
    { }
    
    
    template <size_t BASE>
    BigInt <BASE>& BigInt <BASE>::operator = (BigInt <BASE>&& BigInteger) {
        Digits.assign(BigInteger.Digits.begin(), BigInteger.Digits.end());
        Sign = BigInteger.Sign;
        return *this;
    }
    
    
    template <size_t B>
    std::ostream& operator << (std::ostream& stream, const BigInt <B>& BigInteger) {
        if (!BigInteger.Sign) stream << '-';
        bool first = false;
        for (auto Iterator = BigInteger.Digits.rbegin(); Iterator != BigInteger.Digits.rend(); ++Iterator) {
            if (first) {
                int N = DIGITS_PER_SLOT - BigInt <B>(*Iterator).size();
                for (int i = 0; i < N; ++i)
                    stream << 0;
            }
            stream << *Iterator;
            first = true;
        }
        return stream;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator == (const BigInt <BASE>& R) const {
        /// If numbers have different sign, they cannot be equal
        if (Sign != R.Sign)
            return false;
        /// If the number of digits in the two numbers is different, they cannot be equal
        if (Digits.size() != R.Digits.size())
            return false;
        /// The number are of same sign and have the same number of digits
        /// To test for equality, each digit must be same
        for (size_t i = 0; i < Digits.size(); ++i)
            if (Digits [i] != R.Digits [i])
                return false;
        return true;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator != (const BigInt <BASE>& R) const {
        return !(*this == R);
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator < (const BigInt <BASE>& R) const {
        /// Left is -ve, Right is non -ve
        if (!Sign && R.Sign)
            return true;
        /// Left is non -ve, Right is -ve
        if (Sign && !R.Sign)
            return false;
        /// Both are non -ve
        if (Sign && R.Sign) {
            /// Left is less than Right in terms of number of digits
            if (Digits.size() < R.Digits.size())
                return true;
            /// Left is greater than Right in terms of number of digits
            else if (Digits.size() > R.Digits.size())
                return false;
            /// Left and Right have the same number of digits
            else {
                size_t N = Digits.size();
                /// Start check from MSD
                for (int i = N - 1; i >= 0; --i) {
                    if (Digits [i] > R.Digits [i])
                        return false;
                    if (Digits [i] < R.Digits [i])
                        return true;
                }
            }
        }
        /// Both are -ve
        else {
            if (Digits.size() < R.Digits.size())
                return false;
            else if (Digits.size() > R.Digits.size())
                return true;
            else {
                size_t N = Digits.size();
                /// Start check from MSD
                for (int i = N - 1; i >= 0; --i) {
                    if (Digits [i] > R.Digits [i])
                        return true;
                    if (Digits [i] < R.Digits [i])
                        return false;
                }
            }
        }
        /// All digits are same
        return false;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator > (const BigInt <BASE>& R) const {
        return !(*this <= R);
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator <= (const BigInt <BASE>& R) const {
        return *this < R || *this == R;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator >= (const BigInt <BASE>& R) const {
        return *this > R || *this == R;
    }
    
    
    template <size_t BASE>
    BigInt <BASE>& BigInt <BASE>::operator += (const BigInt <BASE>& R) {
        size_t MinSize = std::min(Digits.size(), R.Digits.size());
        size_t MaxSize = std::max(Digits.size(), R.Digits.size());
        bool isLesser = *this < R;
        /// Opposite signs
        if (Sign ^ R.Sign) {
            BigInt L;
            bool Swapped = false;
            /// Case: Left(L:(-ve)), Right(R:(+ve))
            if (isLesser) {
                /// Case: If abs(L) is lesser or equal to R, sum is non -ve
                if (this->abs() <= R) {
                    L = *this;
                    *this = R;
                    Swapped = true;
                }
                /// Case: If abs(L) is greater than R, sum is -ve
                else { /* Nothing to do here. Everything is good! */ }
            }
            /// Case: Left(+ve), Right(-ve)
            else {
                /// Case: If L is lesser than abs(R), sum is -ve
                if (*this < R.abs()) {
                    L = *this;
                    *this = R;
                    Swapped = true;
                }
                /// Case: If L is greater or equal abs(R), sum is non -ve
                else { /* Nothing to do here. Everything is good! */ }
            }
            assert(Digits.size() == MaxSize);
    
    
            std::function <void(const BigInt&)> Subtract = [&] (const BigInt& X) {
                size_t i = 0;
                bool Borrow = false;
                for (; i < MinSize; ++i) {
                    if (Digits [i] < X.Digits [i])
                        Borrow = true;
                    else
                        Borrow = false;
                    Digits [i] = (Borrow ? 10 + Digits [i] - X.Digits [i] : Digits [i] - X.Digits [i]);
                    if (Borrow) {
                        size_t j = i + 1;
                        while (j < MaxSize) {
                            if (Digits [j] != 0) {
                                --Digits [j];
                                break;
                            }
                            else Digits [j] = 9;
                            ++j;
                        }
                    }
                }
            };
    
    
            /// If a swap has occured, we need to subtract L from *this
            if (Swapped) Subtract(L);
            /// Otherwise, we need to subtract R from *this
            else Subtract(R);
    
    
            size_t CountLeadingZeroes = 0;
            for (int i = MaxSize - 1; i > 0; --i)
                if (Digits [i] == 0) ++CountLeadingZeroes;
                else break;
            Digits.resize(MaxSize - CountLeadingZeroes);
        }
        /// Same signs
        else {
            uint64_t CarryOver = 0;
            size_t i = 0;
            for (; i < MinSize; ++i) {
                Digits [i] += R.Digits [i] + CarryOver;
                CarryOver = Digits [i] / MAX_VALUE;
                Digits [i] %= MAX_VALUE;
            }
            if (isLesser)
                for (; i < MaxSize; ++i) {
                    Digits.emplace_back(R.Digits [i] + CarryOver);
                    CarryOver = Digits [i] / MAX_VALUE;
                    Digits [i] %= MAX_VALUE;
                }
            else
                for (; i < MaxSize; ++i) {
                    Digits [i] += CarryOver;
                    CarryOver = Digits [i] / MAX_VALUE;
                    Digits [i] %= MAX_VALUE;
                }
            if (CarryOver != 0)
                Digits.emplace_back(CarryOver);
        }
        return *this;
    }
    
    
    template <size_t BASE>
    BigInt <BASE> BigInt <BASE>::operator + () const {
        return *this;
    }
    
    
    template <size_t BASE>
    BigInt <BASE> BigInt <BASE>::operator - () const {
        BigInt B = *this;
          if (B.Sign && B != 0) B.Sign = false;
        else                    B.Sign = true;
        return B;
    }
    
    
    template <size_t BASE>
    inline size_t BigInt <BASE>::size () const {
        size_t N = Digits.size() - 1;
        if (N == 0 && this->Digits [0] == 0)
            return 1;
        int Count = 0;
        uint64_t Int = Digits [N];
        while (Int != 0) {
            Int /= 10;
            ++Count;
        }
        return N * DIGITS_PER_SLOT + Count;
    }
    
    
    template <size_t BASE>
    inline BigInt <BASE> BigInt <BASE>::abs () const {
        if (Sign)
            return *this;
        else
            return -*this;
    }
    
    
    #endif /* BIG_INTEGER_H */

    Firstly, if you suggest any optimisations about doing things in a different way so as to reduce the total number of operations, or in particular a change in the way I've implemented certain things, I'm all ears to your opinions.


    Now, onto benchmarking results. You can be sure that the addition works perfectly (as far as I've checked in different bases).


    Code:
    // Let's use this bit of code:
    
    
    BigInt A = "9999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999999";
    BigInt B = A;
    for (int i = 0; i < N; ++i)
        A += B;
    Debug(A, A.Digits, A.size())
    
    
    // The number A that we initially start with is a total of 100-digits of '9'. I didn't complete my
    // pseudo-random generator implementation so we're going to use just this for now.
    
    
    --------------------- Benchmark 1 ---------------------
    
    
    -> N = 1e7
    -> DIGITS_PER_SLOT = 1
    -> MAX_VALUE = 10
    My implementation: 6.5098740000s
    Python: 5.4782185554504395s
    
    
    --------------------- Benchmark 2 ---------------------
    
    
    -> N = 1e7
    -> DIGITS_PER_SLOT = 9
    -> MAX_VALUE = 1e9
    My implementation: 1.1058650000s
    Python: 5.4782185554504395s
    
    
    --------------------- Benchmark 3 ---------------------
    
    
    -> N = 1e7
    -> DIGITS_PER_SLOT = 18
    -> MAX_VALUE = 1e18
    My implementation: 0.5776130000s
    Python: 5.4782185554504395s
    
    
    --------------------- Benchmark 4 ---------------------
    
    // The killer benchmark. 10 times more additions in about the same time.
    
    -> N = 1e7 (Python)
    -> N = 1e8 (My implementation)
    -> DIGITS_PER_SLOT = 18
    -> MAX_VALUE = 1e18
    My implementation: 5.7817140000s
    Python: 5.4782185554504395s
    
    
    --------------------- Benchmark 5 ---------------------
    
    
    -> N = 1e8
    -> DIGITS_PER_SLOT = 18
    -> MAX_VALUE = 1e18
    My implementation: 5.7817140000s
    Python: 55.4829481004289103

    Similar results against Java 8.0 too. Java is slower on all benchmarks (except b-1). What I realised and learnt from this is that the more digits I "club" together, the faster the arithmetic becomes. This was bound to happen as the number of operations reduce greatly. However, I really did not expect such out-performance (b-5). C/C++ performance is underrated. Now, I'm trying to optimise further to try and reduce operations wherever possible. It'd be great to have help in this area. Thanks for the suggestion Salem and john, you guys have taught me a lot. I know I can't club more than 18 digits together because when I implement multiplication, I'd face trouble as 128-bit int supported by G++ would overflow. For now, I'll continue to implement all other operations, and in the meantime, everyone can critique on the code and benchmarks. Goodnight, it's almost 06:00 here and I'm going to sleep.

    EDIT: No idea why this message goes as a reply to Galahad instead of being indented as a new message.
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  11. #11
    Registered User Sir Galahad's Avatar
    Join Date
    Nov 2016
    Location
    The Round Table
    Posts
    277
    Quote Originally Posted by Zeus_ View Post
    Hey! I've been working on implementing Salem and john's suggestion for storing 9 digits at every index. I actually started out by implementing exactly that. I did some benchmarking as well. Somewhere along the line of implementing for 9 digits at every index came to me an idea. I thought "why not write the code in such a way that it makes benchmarking for different number of digits at an index easier". So, that's exactly what I've ended up doing. Plus, I added multi-base support (2-10) that can let you deal with BigInt's in different bases as per your choice. All this looks cool to me (hehe!) because I'm doing this kinda stuff for the first time, but if you think I'm just wasting my time, please do comment and let me know. I've been a little lazy this week and could've completed earlier but there's nothing like rewatching Interstellar for the umpteenth time and a bunch of other Netflix shows.


    I've only done benchmarking for addition of positive numbers, and against Python 3.7 64-bit. Also, I compile my code on g++ 9.2.0. Here's the relevant "new" code:
    (Note: I haven't implemented operator += completely yet. Only finished the part that was required for benchmarking for positive number addition)

    Looks good. A little cleaner way to implement those operators would be to first code a "compare" function that returns -1, 0, or 1 (ie: less, equal, greater) and then simply define all of the other comparison operators in terms of that:


    Code:
    template <size_t BASE>
    inline bool BigInt <BASE>::operator > (const BigInt <BASE>& R) const {
        return compare(R) > 0;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator <= (const BigInt <BASE>& R) const {
        return compare(R) <= 0;
    }
    
    
    template <size_t BASE>
    inline bool BigInt <BASE>::operator >= (const BigInt <BASE>& R) const {
        return compare(R) >= 0;
    }

  12. #12
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    Yeah, that looks a lot cleaner. I'll implement exactly that tonight. Also, just realised that the above code in the benchmark comment is an older version of what I was doing (which has a LOT of implementation errors). My file names were pretty messy and I seemed to have copy-pasted the wrong one (whenever I'm finished coding for a particular day, I store the progress until then separately. This way I record my progress on any project I'm working on). If you hadn't commented, I would've never looked through and realised that. So, thanks! In the current version that I'm writing, I've removed all template BASE related code and instead just implemented everything in base 10 (because it is way simpler to implement, and works way faster than any other base, plus reduces time and space consumed on construction). To compensate for that, I'm adding a to_base utility function that returns a BigInt converted to some base [2,16 (or maybe 36, as letters go up to 'z')] (obviously, with DIGITS_PER_SLOT digits at each index of its respective vector Digits).
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  13. #13
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    Is my reply (to #11) still stuck in the moderation queue? It's been 12+ (ig) hours since I replied and I don't see it yet...
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

  14. #14
    C++ Witch laserlight's Avatar
    Join Date
    Oct 2003
    Location
    Singapore
    Posts
    28,412
    I tried approving it in case I'm just not identifying in-topic posts that need moderation, but it says the action is not valid, i.e., it is not in the moderation queue. It's the post starting with:
    Yeah, that looks a lot cleaner. I'll implement exactly that tonight. Also, just realised that the above code in the benchmark comment is an older version of what I was doing (which has a LOT of implementation errors).
    Right?
    Quote Originally Posted by Bjarne Stroustrup (2000-10-14)
    I get maybe two dozen requests for help with some sort of programming or design problem every day. Most have more sense than to send me hundreds of lines of code. If they do, I ask them to find the smallest example that exhibits the problem and send me that. Mostly, they then find the error themselves. "Finding the smallest program that demonstrates the error" is a powerful debugging tool.
    Look up a C++ Reference and learn How To Ask Questions The Smart Way

  15. #15
    Registered User
    Join Date
    Aug 2019
    Location
    inside a singularity
    Posts
    308
    Yeah, that's the one.
    "Programming today is a race between software engineers striving to build bigger and better idiot-proof programs, and the Universe trying to produce bigger and better idiots. So far, the Universe is winning." - Rick Cook, The Wizardry Compiled

Popular pages Recent additions subscribe to a feed

Similar Threads

  1. Replies: 7
    Last Post: 10-01-2008, 07:45 PM
  2. Replies: 5
    Last Post: 06-12-2007, 02:18 PM
  3. determining if an integer is a legal integer
    By LiquidBlaze33 in forum C++ Programming
    Replies: 2
    Last Post: 10-09-2005, 07:06 PM
  4. finding an integer in an integer
    By Unregistered in forum C++ Programming
    Replies: 5
    Last Post: 05-09-2002, 07:25 PM
  5. determining integer/non-integer
    By gulakojhyau in forum C++ Programming
    Replies: 9
    Last Post: 03-08-2002, 01:25 PM

Tags for this Thread