Types of Machine Learning

How do Classifiers and Regressors differ? What are Supervised and Unsupervised ML algorithms? Let's find out more here.

Welcome Back!

Excited to know more about the algorithms that helped our computer predict the type of fruit on the conveyor belt? Are you excited to learn how a self-driving car, well, drives itself? Or how Netflix always knows the stuff you want to watch? Ready? Let’s get back on our journey then - the journey to unravel the secrets of these mysterious invisible things running the show!

Back to School

Let’s say you are new in town and are looking for a good high school for your son. You want to be sure that your kid stays disciplined and away from the bad habits associated with teens in high school. A close friend suggests one - not very popular, but he is confident that you are going to love it. The following day you pay a visit and are surprised when the principal mentions that the school has invested in state-of-the-art technology to keep students disciplined. Something uncommon for a high school, right?

This piques your interest and you enquire about the details. The principal begins by listing the common problems faced by the management:

  • Maintaining students' attention in a classroom
  • Cheating during assignments & exams

You find the first one clichéd and want to know what the school is trying to do differently. You enquire and get an answer you never expected - “Simple. We use OCR.”


OCR? Optical Character Recognition. Though it seems complex, the idea behind the school’s solution is very simple. 

While experimenting with different pedagogical methods, the administration noticed the following:

  • Using pre-made presentations in a classroom reduced the burden of taking notes, but the students seemed to lose interest in the lecture pretty soon. It didn’t deliver the same impact as blackboard teaching.
  • Teaching using a blackboard actively engaged the students. But it had its fair share of problems. The students found taking down notes cumbersome and always felt uncertain about one thing - did they get everything taught in the class? And what about the students who missed the lecture?
  • The teaching staff were not in favor of combining the two, i.e. teaching using the blackboard and circulating notes after the class.

It was through this experience that the school came up with an OCR-based solution. The lecturer continues to use the blackboard. But before erasing the board, he presses a button by its side. An overhead camera, positioned at the perfect angle, captures an image of the blackboard. The system converts these images into PDF documents containing the text from the board in printed form and emails them to the students at the end of the lecture. Effort reduced, attention retained!

And yes, you read it right. The documents contain actual text and not simply the images of the blackboard. 

But wait. How did the camera convert images into printed text? Optical Character Recognition. Does the name make sense now? Characters - recognized! But how does the camera figure out whether a character on the board is an “A” or a “B”? Well, you teach it, and it recognizes them later. Does that bring back memories? Recognizing apples and oranges?

Yes, just as we trained our ML algorithm to determine the type of fruit on the conveyor belt, we teach this algorithm what each character in the English alphabet looks like. Apart from the alphabet, you also provide data about numbers and special characters such as !, *, /, {} and (). To appreciate how much more complex this is than the fruit example, consider the following:

  • The number of unique labels (also referred to as “classes”) is larger here. It’s not just differentiating between an apple and an orange.
  • People's handwriting varies. Also, there is cursive and block writing.

But the algorithm can still do a good job. How? You train it using data collected from different people. For example, you collect all the occurrences of the letter “D” from a person's notes and tag each of them with the label “D”. Well, you don’t actually need to do all this unless your teacher’s handwriting is really weird - such labelled data is already present on the web. You can find a few examples here. Thanks to all the efforts put in by the research community! A snapshot of such training data is presented below:

Each column is tagged with the respective label and fed into the ML algorithm. The algorithm, in turn, learns the unique structure of each character and learns to distinguish between them, even in tricky cases - for example, sometimes a ‘B’ might look like an ‘8’.

Such a trained algorithm, on the chip of the camera, identifies individual characters from the blackboard images - just like our computer identified oranges and apples. So this is the actual process behind the conversion of an image into text, which is then put into a document and mailed to the students. 

ML algorithms of this type, which learn internal mappings from labeled data during the training stage and then use those mappings to predict the output for new data, are called “Supervised Learning based Algorithms”. Let us break it down a bit further. By providing the machine with labeled data initially, you are playing the role of a supervisor, a trainer. For example, in the case of our orchard, you are indirectly telling the machine that if “the camera data looks like this, then the fruit must be an orange”. And in our present case, if the camera data looks like “this”, it must be a “B”.
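To make the idea of "learning from labeled examples" concrete, here is a minimal sketch in Python. It is not the school's actual OCR system: the 3x3 pixel grids, the letters chosen and the nearest-neighbour rule are all assumptions picked to keep the example tiny. Each training image carries a label supplied by the "supervisor", and a new image receives the label of the training image it most resembles.

```python
# Hypothetical training data: each "image" is a 3x3 grid flattened into
# 9 pixels (1 = chalk, 0 = empty board), paired with its label.
TRAINING_DATA = [
    ((1, 0, 0, 1, 0, 0, 1, 1, 1), "L"),
    ((1, 0, 0, 1, 0, 0, 1, 1, 0), "L"),  # a sloppier L
    ((1, 1, 1, 0, 1, 0, 0, 1, 0), "T"),
    ((1, 1, 1, 0, 1, 0, 0, 0, 0), "T"),  # a T with a short stem
]


def hamming_distance(a, b):
    """Count the pixels in which two images differ."""
    return sum(pa != pb for pa, pb in zip(a, b))


def predict(pixels, training_data=TRAINING_DATA):
    """Return the label of the closest labeled example (1-nearest-neighbour)."""
    _, label = min(training_data, key=lambda ex: hamming_distance(ex[0], pixels))
    return label


# An unseen, slightly malformed L: its bottom-left pixel is missing.
new_image = (1, 0, 0, 1, 0, 0, 0, 1, 1)
print(predict(new_image))  # L
```

A real character recognizer works on far larger images and many more classes, but the supervised recipe is the same: labeled examples in, a prediction rule out.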

While we are at it

Already feeling knowledgeable? Let us categorize Supervised ML Algorithms further. This will let you brag more to your peers. While you are consulting your better half about admission to this school, the principal mentions something even more interesting. “We can predict your son’s performance across different subjects in the 12th final board exams. Right here, right now. And let’s say we predict that he is going to be below average in Math and English; the teachers would then pay special personal attention to him in these subjects.”

You love the whole idea but are curious about one thing. How can such a prediction be possible? Two years before the finals? Observing your puzzled expression, the principal asks your son for his scores in different subjects from his secondary school days - his 9th and 10th class marks. He also inquires about the following:

  • Education and financial status of you and your spouse
  • Access to a computer and internet connectivity at home

He enters the above in his laptop and turns it towards you.

“Your son is a bit weak in Science. While the average board score is around 78, our algorithms predict that he can score around 62. Mr Saroj, our very talented Science lecturer, can take care of this.”

How did the algorithm come up with the score? We had a word with the principal and found out the secret. Well, the algorithm looks at a student's past scores in the subject from his lower grades and compares them with historical data containing the scores obtained by different students (across many batches) in the same subject. It then combines this with information about your family.

The underlying algorithm here is also based on Supervised Learning. It was trained on historical student performance and family economic data, and it predicts the output for a new student. Taking a look at the training data employed for the score predictor might make this even more obvious.

The ML algorithm works out the relation between the input and the output in the training stage and uses the internal rules so developed to predict the output for a new data point. The new data point here is your household details along with your son's 9th and 10th class scores in Science. For example, a few relations the algorithm might have found in the data could look like this:

  • Students who scored 90+ in lower grades also score 90+ marks in the subject in the 12th boards
  • Students who have well educated parents score higher than those having parents with basic education
  • Students who have access to an internet enabled computer at home score higher than those who don’t. 
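Here is a minimal sketch of how such a relation could be learned from data. Everything in it is an assumption made for illustration: the five historical records are made up, only one input (the lower-grade score) is used, and the model is a plain straight-line fit by least squares. The school's real predictor would also take the family details as inputs.

```python
# Hypothetical historical data:
# (lower-grade Science score, 12th board Science score) for past students.
HISTORY = [(55, 50), (65, 62), (75, 70), (85, 83), (95, 90)]


def fit_line(pairs):
    """Ordinary least squares for a single input: returns (slope, intercept)."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# "Training": learn the relation between past and final scores.
SLOPE, INTERCEPT = fit_line(HISTORY)


def predict_board_score(past_score):
    """Predict a 12th board score from a lower-grade score."""
    return SLOPE * past_score + INTERCEPT


# A new student who scored 66 in lower-grade Science.
print(round(predict_board_score(66), 1))  # 61.9 for this made-up data
```

Notice that the prediction is a number on a scale, not one of a handful of labels.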

So what's different here from the example of the fruit on the conveyor belt? Here’s a hint: look at the nature of the output labels.

We guess you must have found out by now. The outputs in this case are on a continuous scale, i.e. the marks are between 1 and 100 - they can be 92 or 71.5 or 62. In the other cases, the classes are discrete, i.e. either “apples” or “oranges”, or a character on the board being “A”, “B”, “C”, etc. - the number of classes was limited. Therefore, the score predictor outputs a numerical quantity while the algorithms in the previous cases produced a discrete label. We have names for them too. Supervised Learning based algorithms which predict a discrete class label are called Classification Algorithms or simply Classifiers. Those whose task is to predict a continuous quantity are called Regression Algorithms or simply Regressors. We will revisit them later.

In the meanwhile, impressed by everything, you want your son in the school. But your spouse is not so sure. Being your better half, she is already thinking of problems with the above regression. How it might discriminate against children of poorer parents. How children’s performance might revert to becoming worse because of “regression to the mean” (look it up!). So you decide to have a discussion and then inform the school.

“Change some stuff here and there! Interchange the lines. The teacher can’t find out”

Aditya, your son, recalls his good friend Manoj saying the above a few days back.

After a great deal of debate with your better half, you decide to put Aditya into the school. The very next week, the thing you expected the least happens. Someone from the school administration calls you for a meeting with one of Aditya’s teachers, as he was found cheating in the English assignment. He had turned it in online, minutes before the deadline, on the school's website (remember the school's fondness for technology).

There are three other students along with Aditya. Believing in your son, you claim that the school is making false accusations. You ask to see the copies, and the teacher obliges. You go through all four documents, one of which belongs to your son. You fail to find any similarity between them and argue for Aditya’s full vindication. The teacher brings out another set of copies, this time with a few parts highlighted in all four documents.

One such highlighted line from Aditya's submission read "Because he studied advanced subjects on his own, he often cut classes; this earned him the animosity of some professors"

While in another student's assignment, it was "He often studied advanced subjects by himself at home and skipped classes, which irritated some lecturers."

This was just one of many examples. Upon close examination, you find the highlighted content to be extremely similar in all four assignments.

You look at the teacher sheepishly, fully believing that Aditya is in the wrong, and ask “How? But how could you find out?”

The very fact that the teacher caught the plagiarism, among 200-odd copies and with the similar content separated by at least two paragraphs, surprises you. The teacher points her finger. You turn around expecting to find a highly experienced staff member. But alas, it is a computer. It confuses you even more. The teacher explains.

“Our algorithms find similar patterns of words and phrases across the content submitted by different students and inform us if such matches are high. Pattern Detection, Sir! The program found out the assignments submitted by these four students are very similar. We looked at it and confirmed what the machine found.”

Well, this brings us to the next category of Machine Learning algorithms – “Unsupervised Learning based Algorithms”. Unlike supervised algorithms, in this case, “labeled” data is not required to train the algorithm i.e. a supervisor is not needed. For example, the text from the assignments was the only data given to the plagiarism detector. No labels. The unsupervised algorithm in play over here employs a clustering technique. Yes, yes. It does exactly what you are thinking - groups together objects which are similar to each other (and less similar to other data). So, the algorithm used by the school automatically puts all the four assignments under one cluster - denoting a very high likelihood of plagiarism.
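A minimal sketch of this kind of grouping follows. We do not know which technique the school's detector actually uses, so the choices here are assumptions: similarity is measured as the overlap between the sets of words in two texts (Jaccard similarity), the 0.2 threshold is arbitrary, and the texts for student_c and student_d are made up. No labels are given anywhere; the groups emerge from the texts alone.

```python
import re
from itertools import combinations


def word_set(text):
    """Lower-case a text and return its set of distinct words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Shared words divided by all words used in either text (0.0 to 1.0)."""
    wa, wb = word_set(a), word_set(b)
    union = wa | wb
    return len(wa & wb) / len(union) if union else 0.0


def cluster_by_similarity(docs, threshold=0.2):
    """Group documents so that any sufficiently similar pair shares a cluster."""
    cluster_of = {name: {name} for name in docs}
    for a, b in combinations(docs, 2):
        if jaccard(docs[a], docs[b]) >= threshold:
            merged = cluster_of[a] | cluster_of[b]
            for name in merged:
                cluster_of[name] = merged
    unique = {frozenset(c) for c in cluster_of.values()}
    return sorted(sorted(c) for c in unique)


# Unlabeled input: just the submitted text (student ids are hypothetical).
submissions = {
    "aditya": "Because he studied advanced subjects on his own, he often cut "
              "classes; this earned him the animosity of some professors",
    "student_b": "He often studied advanced subjects by himself at home and "
                 "skipped classes, which irritated some lecturers.",
    "student_c": "The monsoon arrived late this year and the farmers were "
                 "worried about their crops",
    "student_d": "My favourite festival is Diwali because we light lamps "
                 "with the whole family",
}

print(cluster_by_similarity(submissions))
# [['aditya', 'student_b'], ['student_c'], ['student_d']]
```

Any cluster with more than one member is a candidate for a closer look by a human, which is exactly what the teacher did.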

Aditya gives in and confesses. You chide him and drive him back home. The only thing which makes you a little less upset is the fact that you learnt about a new category of ML algorithms.

The same algorithm can be applied at our orchard. Subject only the camera data to clustering, without the labels. Guess how many clusters the algorithm would detect? Two, right? Yes, in the ideal scenario. More clusters might be possible. For example, a cluster of red apples, one of oranges and one of greenish-yellow apples. Try to apply the above description of clustering here and you will figure out why.
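If you want to try this out, here is a minimal sketch using k-means, one popular clustering technique. The assumptions are ours: each fruit is reduced to a single made-up "hue" number taken from the camera image, and the starting centres are spread evenly over the sorted data so that the result is repeatable. The comments naming the fruits are for you, the reader; the algorithm never sees them.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters (k >= 2) around k moving centres."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centres evenly over the data.
    centres = [ordered[i * (len(ordered) - 1) // (k - 1)] for i in range(k)]
    clusters = []
    for _ in range(iterations):
        # Step 1: assign every value to its nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            clusters[nearest].append(v)
        # Step 2: move each centre to the mean of the values assigned to it.
        centres = [sum(c) / len(c) if c else centres[i]
                   for i, c in enumerate(clusters)]
    return clusters


# Hypothetical hue readings, with no labels attached.
hues = [
    8, 10, 12,   # (red apples)
    30, 32, 34,  # (oranges)
    78, 80, 82,  # (greenish-yellow apples)
]

print(kmeans_1d(hues, 3))  # [[8, 10, 12], [30, 32, 34], [78, 80, 82]]
print(kmeans_1d(hues, 2))  # [[8, 10, 12, 30, 32, 34], [78, 80, 82]]
```

With three clusters, the groups match the three kinds of fruit. With two clusters, this made-up data puts the red apples with the oranges, because their hues are closer to each other than to the greenish-yellow apples. Clustering groups things by how similar the data looks, not by what we happen to call them.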

End Note

Summarizing, we broadly categorized the ML algorithms based on the usage of labeled data and also the type of labeled data (if employed). A clear understanding of these key concepts is important before delving deeper into this field. We hope, through this article, the following are clarified:

  • Difference between Classifiers and Regressors
  • Difference between Supervised and Unsupervised algorithms

Now that you know a bit more of the underlying mechanics, we will discuss the real-life examples of ML in the next article.

Just like last time, we have compiled a few additional resources for people who want a slightly more technical treatment of these concepts:

 

Until next time!
