Let’s face it – EVERYONE wants to know about Machine Learning. It is a field in high demand right now, given its immense potential to create jobs and change lives. There are so many articles and videos out there that it’s almost suffocating! The question is, where to begin?
As a beginner, I faced this exact problem. So I decided to consolidate all the information I could gather on Machine Learning basics into one easy-to-understand article, for other beginners who may be struggling like I did. I’m sure that after reading this, you will feel at least a little more confident in your knowledge of Machine Learning.
Let us start with the “What”, “Why”, “Where”, “When”, and “How” of Machine Learning.
WHAT?
Machine Learning is the process by which machines learn from data how to perform a specific task, without being explicitly programmed with the rules for it.
WHY?
To make certain tasks easier, quicker, and much more efficient.
WHERE AND WHEN?
I thought I’d put these two together because they imply pretty much the same thing: Applications of Machine Learning. So here are a few important ones that are good to know –
- Spam Filter: Spam emails can be automatically detected and moved to your Spam folder. That way, they don’t get in the way of your more important emails.
- Recommendation Systems: Many online stores use ML to recommend items based on the user’s recent activity and requirements.
- Virtual Assistants: They assist you in your daily requirements like setting alarms, making lists, and so on. They then store data from your previous tasks, and tailor their performance based on your preferences.
- Search Engines: Search Engines use Machine Learning Algorithms to find and display the results that are most relevant to your search, and can even rank them based on your past activity.
- GPS Navigation: It is easy to travel now, thanks to GPS apps. GPS itself gives your current location and the distance between two places; Machine Learning comes in when the app predicts the traffic along your route and uses it to raise or lower your estimated time of arrival.
HOW?
This is where we go into a little more detail about how ML works. Don’t worry, it’s not that difficult.
The first thing to know is that Machine Learning is mainly of two types:
- Supervised Learning (Predictive Models): It uses labelled data, where every example comes with the correct answer (the label) already attached. The model learns from these examples to predict the label for new data.
- Unsupervised Learning (Descriptive Models): It uses unlabelled data, where no correct answers are given and even the number of classes may be unknown. The model describes the structure it finds in the data.
Now, there is a third type, known as Reinforcement Learning, which is a little different from its two siblings, as it doesn’t really depend on the presence or absence of labels. This type of Machine Learning is similar to training a Puppy to do tricks: when the Puppy does the trick right, we give it a treat. In the same way, if the Machine takes a good action, we give it a positive signal (a reward), while if it takes a bad one, we give it a negative signal. Thus, the machine eventually learns to perform its task correctly, with fewer wrong outcomes.
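To make the treat-and-signal idea concrete, here is a minimal sketch in plain Python. It is a simple "try an action, get a reward, update your estimate" loop (an epsilon-greedy bandit), not a full Reinforcement Learning algorithm, and the trick names, reward values and settings are all made up for illustration.

```python
import random

def train_agent(correct_action, actions, episodes=200, epsilon=0.1,
                learning_rate=0.1, seed=0):
    """Learn which action earns a treat, using only reward signals."""
    rng = random.Random(seed)              # fixed seed so the run is repeatable
    value = {a: 0.0 for a in actions}      # estimated reward for each action
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)             # explore: try a random action
        else:
            action = max(actions, key=value.get)     # exploit: use the best estimate
        # Positive signal for the right trick, negative signal otherwise
        reward = 1.0 if action == correct_action else -1.0
        # Nudge the estimate for this action towards the reward just received
        value[action] += learning_rate * (reward - value[action])
    return value

values = train_agent("roll", ["sit", "bark", "roll"])
best_action = max(values, key=values.get)
print(best_action, values)
```

The agent is never told that "roll" is the right trick. It only sees the rewards, and its estimate for "roll" ends up the highest.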
Going back to the first two types, they are further divided into their own methods. Supervised Learning involves Regression and Classification, while Unsupervised Learning involves Clustering and Association.
Methods of Machine Learning –
Regression: It is used to predict the value of a Continuous variable. For example – Predicting the price of a stock, how much a Consumer will spend, and so on.
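As a small illustration of Regression, the sketch below fits a straight line through a handful of points using the standard least-squares formulas, then uses the line to predict a new value. The income and spending figures are made up, and a real project would normally rely on a library rather than hand-written formulas.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Made-up data: monthly income and monthly spending, both in thousands
incomes = [20, 30, 40, 50, 60]
spending = [12, 17, 22, 27, 32]

slope, intercept = fit_line(incomes, spending)
predicted_spending = slope * 45 + intercept   # prediction for an income of 45
print(slope, intercept, predicted_spending)
```

The output is a continuous number (a spending amount), which is what makes this a Regression problem rather than a Classification one.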
Classification: It is used to predict a Discrete variable, that is, which one of a fixed set of classes an item belongs to. It can be used for Binary Classification (Example – Determining if a picture is that of a Cat or Dog) or Multi-Class Classification (Example – Determining if an Email is priority, regular, or spam).
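Here is a minimal sketch of Multi-Class Classification using the simplest form of the Nearest Neighbor idea: a new email gets the label of the most similar labelled email. The two features (number of links, and number of earlier messages exchanged with the sender) and all the values are assumptions made up for this example.

```python
import math

def nearest_neighbor_predict(training_points, training_labels, new_point):
    """Return the label of the training point closest to new_point."""
    distances = [math.dist(point, new_point) for point in training_points]
    closest_index = distances.index(min(distances))
    return training_labels[closest_index]

# Made-up labelled emails: (links in the email, earlier messages with the sender)
emails = [(0, 40), (1, 35), (2, 5), (1, 8), (9, 0), (12, 0)]
labels = ["priority", "priority", "regular", "regular", "spam", "spam"]

first_guess = nearest_neighbor_predict(emails, labels, (10, 1))
second_guess = nearest_neighbor_predict(emails, labels, (1, 38))
print(first_guess, second_guess)
```

Unlike Regression, the answer here is one of a fixed set of classes, not a number on a continuous scale.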
Clustering: It is used to group data, but without any prior knowledge of the number of classes or the labels of the class. It finds similarities between data points all on its own, and groups similar ones together as a possible class. A lot of real-world data is unlabelled, which is where Clustering becomes useful. For example – Finding out how many different types of animals are in a set of images.
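The grouping idea can be sketched with k-means, one of the algorithms listed later in this article. Note that k-means needs the number of groups (k) to be chosen up front, so in practice people try several values of k. The points below are made up, and starting from the first k points is a simplification chosen to keep the sketch short and repeatable.

```python
import math

def k_means(points, k, iterations=10):
    """Group points into k clusters by repeatedly updating cluster centres."""
    centroids = list(points[:k])   # naive start: the first k points (a simplification)
    for _ in range(iterations):
        # Step 1: assign every point to its nearest centroid
        clusters = [[] for _ in range(k)]
        for point in points:
            nearest = min(range(k), key=lambda i: math.dist(point, centroids[i]))
            clusters[nearest].append(point)
        # Step 2: move each centroid to the average of its assigned points
        for i, members in enumerate(clusters):
            if members:            # keep the old centroid if a cluster is empty
                centroids[i] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, clusters

# Made-up, unlabelled 2D points that happen to form two loose groups
points = [(1, 1), (9, 9), (1, 2), (2, 1), (8, 9), (9, 8)]
centroids, clusters = k_means(points, k=2)
print(centroids)
print(clusters)
```

No labels are given anywhere in this code. The two groups emerge purely from how close the points are to each other.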
Association: It is used to find associations or relationships among items within a data set. For example – If a Customer buys a Phone and Back Cover, they will most likely also want to buy a screen guard.
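One common way to measure such a relationship is with "support" (how often a set of items appears together) and "confidence" (how often the rule holds when its first part is present). The sketch below computes both for the phone example. The shopping baskets are made up, and the function names are just illustrative helpers, not a library API.

```python
def support(transactions, items):
    """Fraction of transactions that contain every item in `items`."""
    items = set(items)
    return sum(items <= basket for basket in transactions) / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Of the transactions containing `antecedent`, the fraction that also contain `consequent`."""
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Made-up shopping baskets
baskets = [
    {"phone", "back cover", "screen guard"},
    {"phone", "back cover", "screen guard"},
    {"phone", "back cover"},
    {"phone", "charger"},
    {"headphones"},
]

rule_support = support(baskets, {"phone", "back cover", "screen guard"})
rule_confidence = confidence(baskets, {"phone", "back cover"}, {"screen guard"})
print(rule_support, rule_confidence)
```

A high confidence for the rule "phone and back cover, therefore screen guard" is what would justify recommending the screen guard to such a customer.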
Let us now have a look at the different Algorithms in Machine Learning, which are important when it comes to practical implementation. I will go into the details of them in another Article. For now, we will just skim through them:
Types of Supervised Machine Learning Algorithms –
Linear Regression, Support Vector Machines (SVM), Neural Networks, Decision Trees, Naive Bayes, Nearest Neighbor
Types of Unsupervised Machine Learning Algorithms –
k-means clustering, Association rule learning
Types of Reinforcement Learning Algorithms –
Q-Learning, Deep Q-Networks (DQN)
How to solve a Machine Learning Problem – Steps Involved (in brief) –
When a Machine Learning problem is placed in front of you, it might look quite intimidating, and you might feel a little lost. So, very briefly, we will go through how you need to tackle the problem.
- Study the Data.
By studying the Data, you will be able to tell whether it is labelled or unlabelled, and therefore whether the problem is Supervised or Unsupervised. After that, you can decide which method to follow and which algorithm to use.
- Clean the Data.
Make sure the Data does not have any unnecessary rows or columns, based on your question (for example, to predict the sales of a product by studying its demand for the past 6 months, the names of the buyers will not have any effect on the demand for the product, so you can remove them from your data). Also, you will need to deal with any NaN (Not a Number) values, which mark missing or invalid entries, either by removing those rows or by filling in a sensible replacement value.
- Define the Training and Test Data.
You will need to separate out some of your data to use later when testing your result. The rest of the data is used for Training the Model.
- Create the Model.
You then create the Model that will perform the required computation on your data. Here you choose the algorithm and set its parameters, using the functions provided by whichever Python library you decide to use.
- Run the Model on the Training Data.
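You now provide the number of Epochs (the number of times the full Training Data must pass through the model) and allow the machine to carry out its computation. Epochs apply to models that are trained step by step, such as Neural Networks; not every algorithm needs them.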
- Run the Model on the Test Data.
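This is done to check the accuracy of your model on data it has never seen. The higher the accuracy on the Test Data, the better the model. (NOTE: Higher accuracy is better, but be suspicious of a perfect 100%. If the accuracy is very high on the Training Data but noticeably lower on the Test Data, the Machine has probably just memorised the training examples instead of learning a general pattern. This is called overfitting.)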
- Use the Most Accurate Model on the New Data.
Once you have finished training the Machine, you can use the Model to gather information from new data (For example – Once the Machine knows the difference between the images of a Mouse, a Squirrel, and a Mongoose, it can be used to identify these animals from video footage).
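The steps above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: the rows, the two feature scores (imagined as already centred around zero), the "high_demand" label, and the choice of a perceptron as the model (a very simple ancestor of Neural Networks that is trained over Epochs). A real project would normally use a Python library for most of these steps.

```python
import math
import random

NAN = float("nan")
# Made-up rows: a buyer name we don't need, two feature scores, and a label
raw_rows = [
    {"buyer": "Asha",  "ad_score": 1.5,  "review_score": 2.0,  "high_demand": 1},
    {"buyer": "Ben",   "ad_score": -1.0, "review_score": -2.0, "high_demand": 0},
    {"buyer": "Chen",  "ad_score": 2.0,  "review_score": 1.0,  "high_demand": 1},
    {"buyer": "Dita",  "ad_score": NAN,  "review_score": 1.0,  "high_demand": 1},
    {"buyer": "Emeka", "ad_score": -2.0, "review_score": -1.5, "high_demand": 0},
    {"buyer": "Farah", "ad_score": 1.0,  "review_score": 1.5,  "high_demand": 1},
    {"buyer": "Goran", "ad_score": -1.5, "review_score": -1.0, "high_demand": 0},
    {"buyer": "Hana",  "ad_score": -1.0, "review_score": NAN,  "high_demand": 0},
    {"buyer": "Ivan",  "ad_score": 2.0,  "review_score": 2.0,  "high_demand": 1},
    {"buyer": "Jaya",  "ad_score": -2.0, "review_score": -2.0, "high_demand": 0},
]

def clean(rows):
    """Clean the Data: drop the 'buyer' column and any row containing NaN."""
    cleaned = []
    for row in rows:
        features = (row["ad_score"], row["review_score"])
        if any(math.isnan(value) for value in features):
            continue
        cleaned.append((features, row["high_demand"]))
    return cleaned

def split(data, test_fraction=0.25, seed=42):
    """Define the Training and Test Data: shuffle, then hold some rows back."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    test_size = int(len(shuffled) * test_fraction)
    return shuffled[test_size:], shuffled[:test_size]

class Perceptron:
    """Create the Model: a weighted sum of the features plus a bias."""
    def __init__(self, n_features, learning_rate=0.1):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.learning_rate = learning_rate

    def predict(self, features):
        score = sum(w * x for w, x in zip(self.weights, features)) + self.bias
        return 1 if score > 0 else 0

    def fit(self, data, epochs):
        """Run the Model on the Training Data for the given number of Epochs."""
        for _ in range(epochs):
            for features, label in data:
                error = label - self.predict(features)   # 0 when the guess is right
                self.weights = [w + self.learning_rate * error * x
                                for w, x in zip(self.weights, features)]
                self.bias += self.learning_rate * error

def accuracy(model, data):
    """Fraction of rows for which the prediction matches the label."""
    return sum(model.predict(f) == label for f, label in data) / len(data)

train_data, test_data = split(clean(raw_rows))
model = Perceptron(n_features=2)
model.fit(train_data, epochs=5)
train_accuracy = accuracy(model, train_data)
test_accuracy = accuracy(model, test_data)      # Run the Model on the Test Data
new_prediction = model.predict((1.2, 1.8))      # Use the Model on new data
print(train_accuracy, test_accuracy, new_prediction)
```

This toy data is tiny and very cleanly separated, so the model scores perfectly on both sets. Real data is far messier, which is exactly why comparing the training and test accuracy matters.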
This just about wraps up all you need to know about the basics of Machine Learning. Once you are confident of your basics, you can move on to full-fledged Programming, first on practice problems, and later on real-world problems. You will soon find yourself becoming an Expert in Machine Learning!