The Flower Classification Problem is, as far as I can tell, the classic "Hello World" of Machine Learning and Artificial Intelligence.
Basically, the problem statement says: given 5 different flower types, each described by its colour plus 4 other properties, build a prediction model that can predict the colour of a mystery flower from the 4 given properties (the 5th one, the colour, is not given and is what the ML model is supposed to predict).
I found this problem quite difficult at first, as I'm just starting out and didn't understand how to apply a neural network to it. So I simplified the problem for myself to just two flower types, each with three properties (Petal Length, Petal Width, Colour), and attempted to create a prediction model. It works pretty well at predicting the colour of the mystery flower, but there's something about the cost function that's bothering me. I'll provide the code and then explain my question.
Here's my ML model:
Code:
/*
    Perceptron : Flower Classification Problem
    Neural Network Structure:
    ---------------------------------------------------
          O        Output Layer : Color (Blue or Red)
         / \
        O   O      Input Layer  : (PetalLength , PetalWidth)
    ---------------------------------------------------
*/
#include <iostream>
#include <cmath>
#include <random>
enum Color
{
    Undetermined = -1 ,
    Blue         =  0 ,
    Red          =  1
};
struct Flower
{
public:
    float PetalLength;
    float PetalWidth;
    Color FlowerColor;

public:
    Flower () = delete;
    Flower (float PL , float PW , Color C)
        : PetalLength (PL) ,
          PetalWidth  (PW) ,
          FlowerColor (C)
    { }

    friend std::ostream& operator << (std::ostream& stream , const Flower& flower);
};
std::ostream& operator << (std::ostream& stream , const Flower& flower)
{
    stream << "[Petal Length]: " << flower.PetalLength
           << "\n[Petal Width]: "  << flower.PetalWidth
           << "\n[Color]: ";
    switch (flower.FlowerColor)
    {
        case Color::Undetermined : stream << "Undetermined"; break;
        case Color::Blue         : stream << "Blue";         break;
        case Color::Red          : stream << "Red";          break;
    }
    return stream;
}
/* Function Prototypes */
double Sigmoid (double X);
double dSigmoidX_dx (double X);
double CostFunction (double Prediction , int Expected);
double random_double (void);
int random_int (void);
/* Main Thread */
int main (void)
{
    /* Training Set */
    Flower FlowersDataSet [] =
    {
        // Blue: Characteristic feature - Smaller in size
        { 2.0f , 1.0f , Color::Blue } ,
        { 3.0f , 1.0f , Color::Blue } ,
        { 2.0f , 0.5f , Color::Blue } ,
        { 1.0f , 1.0f , Color::Blue } ,
        // Red: Characteristic feature - Larger in size
        { 3.0f , 1.5f , Color::Red } ,
        { 4.0f , 1.5f , Color::Red } ,
        { 3.5f , 0.5f , Color::Red } ,
        { 5.5f , 1.0f , Color::Red }
    };

    /* Initial weights and bias are random */
    double Weight1 = random_double();
    double Weight2 = random_double();
    double Bias    = random_double();
    /* ----------------------------------------- */

    /* Other required variables */
    int    RandomIndex;
    double Activation ;
    double Prediction ;
    double Cost       ;
    double dCost_dPred;
    double dPred_dActi;
    double dActi_dW1  ;
    double dActi_dW2  ;
    double dActi_dBias;
    double dCost_dActi;
    double dCost_dW1  ;
    double dCost_dW2  ;
    double dCost_dBias;
    double LearningRate = 0.1;
    /* ------------------------ */

    /* Start Training */
    std::cout << "Training in Progress..." << std::endl;

    /* Training Loop */
    for (int i = 0; i < 10000; i++)
    {
        // Forward pass on one randomly picked flower
        RandomIndex = random_int();
        Activation  = (FlowersDataSet[RandomIndex].PetalLength * Weight1) + (FlowersDataSet[RandomIndex].PetalWidth * Weight2) + Bias;
        Prediction  = Sigmoid(Activation);
        Cost        = CostFunction(Prediction , FlowersDataSet[RandomIndex].FlowerColor);
        std::cout << Cost << std::endl;

        // Backpropagation: chain rule from the cost back to each parameter
        dCost_dPred = 2 * (Prediction - FlowersDataSet[RandomIndex].FlowerColor);
        dPred_dActi = dSigmoidX_dx(Activation);
        dActi_dW1   = FlowersDataSet[RandomIndex].PetalLength;
        dActi_dW2   = FlowersDataSet[RandomIndex].PetalWidth;
        dActi_dBias = 1;
        dCost_dActi = dCost_dPred * dPred_dActi;
        dCost_dW1   = dCost_dActi * dActi_dW1;
        dCost_dW2   = dCost_dActi * dActi_dW2;
        dCost_dBias = dCost_dActi * dActi_dBias;

        // Gradient descent step
        Weight1 -= (LearningRate * dCost_dW1  );
        Weight2 -= (LearningRate * dCost_dW2  );
        Bias    -= (LearningRate * dCost_dBias);
    }
    std::cout << "Training Completed!" << std::endl << std::endl << Weight1 << " " << Weight2 << " " << Bias;

    /* Classify the mystery flower with the trained parameters */
    Flower MysteryFlower (1 , 1.5 , Color::Undetermined);
    Activation = MysteryFlower.PetalLength * Weight1 + MysteryFlower.PetalWidth * Weight2 + Bias;
    Prediction = Sigmoid(Activation);
    if (Prediction > 0.5)
        MysteryFlower.FlowerColor = Color::Red;
    else
        MysteryFlower.FlowerColor = Color::Blue;
    std::cout << std::endl << std::endl << MysteryFlower;
    return 0;
}
double Sigmoid (double X)
{
    return (1 / (1 + std::exp(-X)));
}

double dSigmoidX_dx (double X)
{
    return (Sigmoid(X) * (1 - Sigmoid(X)));
}

double CostFunction (double Prediction , int Expected)
{
    return (std::pow(Prediction - (double)Expected , 2));
}

double random_double (void)
{
    static std::default_random_engine e;
    static std::uniform_real_distribution<double> dis(0, 1);
    return dis(e);
}

int random_int (void)
{
    // An integer distribution over the valid indices [0, 7] is the right tool here;
    // the real distribution I had before only worked via implicit double-to-int truncation.
    static std::default_random_engine e;
    static std::uniform_int_distribution<int> dis(0, 7);
    return dis(e);
}
First, credit to c++ - Random float number generation - Stack Overflow for the random number generator.
Second, my question. Have a look at the Cost values evaluated from the cost function:
Code:
0.00702527
0.028994
0.00011494
3.07483e-007
3.07481e-007
0.0183549
0.0177597
0.00700561
0.00691958
0.000107458
0.00683462
0.000106515
0.00039988
3.9654e-007
0.458472
0.362696
0.00298477
0.00208283
0.0877828
6.75981e-005
0.00141368
2.02433e-006
0.0036412
0.00141755
0.0687313
7.49147e-005
0.0042517
0.00421865
0.0100692
0.0584421
0.00084096
0.079695
0.000595151
0.413079
0.00884952
0.0682537
0.350009
0.00229093
0.00225457
0.00681175
0.264896
9.67824e-006
0.00197086
0.198073
0.0027157
5.44921e-005
5.44886e-005
0.14506
0.00147193
0.00854737
0.00347595
2.27151e-006
2.27145e-006
0.00152141
0.307664
0.114488
0.27559
0.188302
0.0932209
0.112232
0.0552676
9.00705e-007
0.0753365
0.037417
0.000457546
0.442054
0.00105253
0.0904635
0.0007183
0.072252
0.0143435
0.420105
0.00126335
6.9766e-005
0.00125219
6.99779e-005
0.00378638
0.065188
0.000935867
0.0106762
1.23841e-006
0.00429752
1.26348e-006
1.26346e-006
1.26344e-006
0.000981218
0.354948
0.00285393
0.00283864
0.00221268
0.0021787
0.00689987
0.26681
0.193945
0.0065364
0.00645031
5.42753e-005
0.00263111
5.40839e-005
4.51614e-006
0.00633342
4.65456e-006
0.102146
0.0785372
0.0987857
0.0486906
0.0682993
4.87485e-007
0.0567359
0.0168735
0.050095
0.018354
3.06251e-007
0.000330341
0.477846
0.000760553
8.18737e-005
0.00479408
0.0474058
0.0053247
0.00527377
8.63794e-005
0.000636328
0.0414009
0.00051712
9.24693e-005
0.42673
0.00383604
0.00120267
1.61248e-006
0.00382518
0.330095
0.150222
6.33479e-005
0.00154276
0.0739423
0.0595928
7.73619e-005
0.371281
0.083115
6.52392e-005
0.0658824
0.00423946
0.0890867
8.19332e-005
0.0006847
7.53096e-007
7.53088e-007
0.0427782
0.00055308
0.0368406
0.0148277
4.72599e-007
0.0582564
0.00679147
9.96971e-005
0.000380562
9.97812e-005
0.00671765
0.0289541
0.475973
0.0462178
0.00530084
0.0686167
0.0327887
9.73331e-005
0.0515055
0.00031151
2.66341e-007
2.6634e-007
0.0441569
0.00814776
0.000259398
0.0195541
0.000110522
2.30931e-007
2.30931e-007
2.3093e-007
0.018883
0.000107695
2.51411e-007
2.5141e-007
0.0429014
0.0201657
0.0194589
0.496195
8.21744e-005
0.0409966
5.22899e-007
0.00570051
0.0136075
8.53853e-005
0.0372898
0.000467256
0.0145405
0.0337777
0.451255
7.00099e-005
0.052781
0.000746095
7.98998e-007
7.60235e-005
0.0762551
0.00564718
0.00559017
0.00553426
0.414946
0.321216
0.103905
0.00184598
0.0018221
0.0784298
0.00367542
0.00365047
0.0637587
1.11336e-006
0.00420191
0.000995856
0.090963
0.000678575
0.0411686
0.0357011
4.1329e-007
8.70425e-005
8.70336e-005
0.438844
1.23631e-006
0.00106607
0.00404097
1.24722e-006
0.342892
0.146433
0.00818034
0.00804849
0.307051
0.00229871
0.110172
2.92765e-006
0.00288222
0.13686
6.14113e-005
0.00368812
1.56969e-006
0.00366303
0.00363824
0.00845458
0.0647591
6.58112e-005
6.58061e-005
0.350301
0.264483
0.133239
0.0985158
0.128614
1.27566e-006
0.0983336
0.0111236
0.0798308
0.00550363
0.0657857
0.448084
0.0097384
1.15332e-006
0.0923929
0.391377
2.09064e-006
0.00157073
0.00765803
2.13606e-006
0.00754176
0.00317685
0.121487
6.52685e-005
0.00942837
0.340445
0.0061162
0.00258561
0.00257299
0.149491
1.93314e-006
0.112158
0.0100143
0.0542413
0.0771631
5.33383e-007
0.00545403
8.25551e-005
0.0130682
0.0383342
0.000477423
0.00580472
This is a small part of the 10000 cost values that get printed. What's disturbing me is that I want my cost value to decrease gradually, but quite a few times the cost jumps up to a value that is high compared with the average cost. Are these random rises in cost a general trend seen in training models? Or is my training model flawed? If so, could any seasoned ML programmer point out what's wrong?

Also, I had to bring my initial weights and bias into the range 0-1 to get a good working model. Before that change, random_double was set to give values between 0 and 100, and no matter how much I tried (increasing the number of iterations, playing with the learning rate), the cost would never be precise: it only ever displayed 0 or 1. Why is that? The sigmoid function gives a value very close to 1 but never exactly 1, and yet my cost displayed 1 in quite a few places. How do I increase the precision of the printed cost without any rounding off occurring?
Thanks for your time and help!
Here's what an example output looks like for the mystery flower:
Code:
[Petal Length]: 1
[Petal Width]: 1.5
[Color]: Blue