Using OpenCV ANN MLP to Train a Model on Iris Flower Dataset

Even though OpenCV is primarily a computer vision library, it also contains a large set of powerful mathematical functions, optimization algorithms, and even GUI utilities that can be useful in other applications. Beyond being open source with a very permissive license, OpenCV’s long-standing emphasis on speed and performance makes it appealing for commercial-grade applications as well. That was my main motivation for writing this post, and I want to walk you through it with a classic machine learning example: training a multilayer perceptron to classify entries from the Iris Flower Dataset.



First things first, let’s get the dataset from the following link:

https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data

First, we’re going to have to one-hot encode the class labels to make them compatible with MLP classification. Let’s use the following mappings:

  • Iris-setosa = 1.0,0.0,0.0
  • Iris-versicolor = 0.0,1.0,0.0
  • Iris-virginica = 0.0,0.0,1.0

Just open the downloaded iris.data file and replace the labels with their one-hot encoded equivalents, as shown below.
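
For example, a setosa row such as

5.1,3.5,1.4,0.2,Iris-setosa

becomes:

5.1,3.5,1.4,0.2,1.0,0.0,0.0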

Now let’s go through the actual code. The snippets that follow are fragments of one program and assume the usual OpenCV setup; a minimal preamble would look roughly like this:
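
#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>
#include <iostream>

using namespace cv;

With that in place, we start by reading in the data: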

auto trainData = cv::ml::TrainData::loadFromCSV("iris_one_hot_encoded.data", 0, 4, 7);

This loads the training data from the CSV file into memory. The number of header lines in the file is zero, and the responses, i.e. our one-hot labels, start at column index 4 and end before index 7 (the end index is exclusive), so they occupy columns 4 through 6.
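
One caveat worth adding (and possibly relevant to the crash reported in the comments below): if the file can’t be found or parsed, loadFromCSV returns a null pointer, so a quick check saves some debugging later:

if (trainData.empty())
{
    std::cerr << "Failed to load iris_one_hot_encoded.data" << std::endl;
    return -1; // bail out of main
}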

We can split the data into training and test sets using the following code:

trainData->setTrainTestSplitRatio(0.75, true);

75% of the data will be dedicated to training the model, and we’ll keep the remaining 25% for testing it afterwards. The second argument tells OpenCV to shuffle the samples before splitting.
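
If you want to verify the split, TrainData can report the resulting sample counts; a quick sanity check could look like this:

std::cout << "Training samples: " << trainData->getNTrainSamples() << std::endl;
std::cout << "Test samples: " << trainData->getNTestSamples() << std::endl;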

Setting up the network is just as easy, as seen in the following code:

int nFeatures = trainData->getNVars();
int nClasses = trainData->getResponses().cols;

Mat_<int> layers(4,1);
layers(0) = nFeatures;     // input
layers(1) = nClasses * 32;  // hidden
layers(2) = nClasses * 16;  // hidden
layers(3) = nClasses;      // output, 1 pin per class.

auto ann = ml::ANN_MLP::create();
ann->setLayerSizes(layers);
ann->setActivationFunction(ml::ANN_MLP::SIGMOID_SYM, 0, 0); // symmetric sigmoid; 0, 0 selects the default slope parameters
ann->setTermCriteria(TermCriteria(TermCriteria::MAX_ITER + TermCriteria::EPS, 500, 0.0001)); // stop after 500 iterations or once the error change drops below 0.0001
ann->setTrainMethod(ml::ANN_MLP::BACKPROP, 0.0001); // backpropagation with a weight update scale of 0.0001

Let’s quickly recap what we just did. We started by getting the number of features and classes from the data. Obviously, we already know the answers are 4 and 3, but we used the proper method of querying trainData for them. Then we set up the input and output layers accordingly, plus two hidden layers. For the activation function, we used the symmetric sigmoid and capped training at 500 iterations. The training method is going to be the backpropagation algorithm. Note that these parameters are just illustrative examples, and you can most certainly play around with them to get better results.
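
For instance, OpenCV’s MLP also implements resilient backpropagation (RPROP), which tends to be less sensitive to the learning rate; swapping it in is a one-line change, where the second argument is the initial update value:

ann->setTrainMethod(ml::ANN_MLP::RPROP, 0.1); // RPROP; 0.1 is the initial update value DW0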



Now we can kick off the training by calling the train method:

ann->train(trainData);
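
train returns a bool indicating whether training succeeded, so a slightly more defensive version of the same call would be:

if (!ann->train(trainData))
    std::cerr << "Training failed" << std::endl;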

Now let’s test the classifier to measure the accuracy:

auto testSamples = trainData->getTestSamples();
auto testResponses = trainData->getTestResponses();    

Mat result;
ann->predict(testSamples, result);

float error = 0.0;
for(int i = 0; i < result.rows; ++i)
{
    // The predicted class is the index of the strongest output; the ground
    // truth is the index of the 1 in the one-hot encoded response.
    cv::Point maxLoc;
    cv::minMaxLoc(result.row(i), nullptr, nullptr, nullptr, &maxLoc);
    int prediction = maxLoc.x;
    cv::minMaxLoc(testResponses.row(i), nullptr, nullptr, nullptr, &maxLoc);
    int testResponse = maxLoc.x;

    if(prediction != testResponse)
        error += 1.0;
}

float errorRate = error / testSamples.rows;
float accuracy = 1.0 - errorRate;

std::cout << "Error = " << errorRate << std::endl;
std::cout << "Accuracy = " << accuracy << std::endl;

A quick recap again. We fetched the test samples and their corresponding responses first. Using predict, we ran the network on all test samples at once, and then compared the index of the strongest output for each sample against the actual class from the dataset.
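
The same predict call also works on a single new measurement. Here is a minimal sketch, with made-up feature values (sepal length, sepal width, petal length, petal width):

Mat sample = (Mat_<float>(1, 4) << 5.9f, 3.0f, 5.1f, 1.8f);
Mat output;
ann->predict(sample, output);
// output is a 1x3 row of class scores; the index of the largest one is the predicted class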

That covers the tutorial. At the link below you can find the complete source code, ready to paste into a C++ source file and run:

https://bitbucket.org/snippets/amahta/dLpRbo/opencv-ann-mlp-example



6 Replies to “Using OpenCV ANN MLP to Train a Model on Iris Flower Dataset”

  1. Hi Amin! I am trying to run the code that I downloaded from the link you provided. I have also replaced all the labels with their one-hot encodings. However, when I run the code with OpenCV 3.4 and Microsoft Visual Studio 2017, I get the following error:

    Unhandled exception at 0x00007FFA36E73B29 in opencvtry.exe: Microsoft C++ exception: cv::Exception at memory location 0x00000055529CDC40.

    I think this might be a problem with reading the data in correctly, but I might be wrong. I would be extremely grateful if you could let me know how to fix this error.

    I also wanted to ask: would the same code work on any dataset, such as Fashion-MNIST or CIFAR-10, as long as I changed the labels to one-hot encoding and changed the parameters of auto trainData = cv::ml::TrainData::loadFromCSV("iris_one_hot_encoded.data", 0, 4, 7); appropriately to match my data? If not, can you please let me know what else I would need to do?

    Thank you

    1. Can you tell me which line is producing that error?
      Add some logging and debug messages and try to locate the exact point where the crash happens.

      Regarding your second question, you can use this same code with any data that has the same structure.
      It’s quite straightforward.

  2. Hello, my name is Fairuzy.
    I’m interested in your code. I’m using OpenCV 3.2 with Visual Studio 2015, but it cannot run (because of some errors in the library, and I cannot fix them). Which versions of the tools do you use to run the program?

      1. Hey! Thanks for your reply. I was able to resolve that issue. Now I want to know how the forward and backward propagation functions work in OpenCV. Can you please tell me which functions perform forward and backward propagation? And I want to see how those functions are implemented. Is it just traindata() that does everything?
