Machine Learning


The three main stages of building a hypothesis model in Machine Learning are as follows:

a. Model building: In this stage, we use the training data and a suitable algorithm to build the model.

b. Model testing: In this stage, the model is evaluated on sample test data, which tells us its accuracy. If the model meets our accuracy criteria on the test data, we can use it for actual production purposes.

c. Applying the model: Once the model works correctly, we apply it to real data and start using it. We keep monitoring its accuracy on real data so that the model stays up to date. If the accuracy drops, we go back to the model-building stage and retune the model.
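
As a minimal sketch of these three stages (assuming Python with scikit-learn, a synthetic dataset, and a decision tree, none of which is prescribed above):

```python
# Minimal sketch of the three stages; assumes scikit-learn and synthetic data.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Stage a: model building - fit a model on the training data.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(max_depth=5, random_state=0).fit(X_train, y_train)

# Stage b: model testing - measure accuracy on held-out test data.
test_accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {test_accuracy:.2f}")

# Stage c: applying the model - predict on new data
# (here just a placeholder batch with the same feature layout).
X_new = X_test[:5]
print(model.predict(X_new))
```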
Some of the basic learning techniques in Machine Learning are as follows:

a. Supervised Learning: In this technique, we infer a function from labeled training data. We have a set of examples with inputs and their corresponding outputs, and based on these we try to come up with a model/function to predict future data. An example of supervised learning is a decision tree.

b. Unsupervised Learning: In unsupervised learning, we infer a function from unlabeled data. Two common approaches to unsupervised learning are the Self-Organizing Map (SOM) and Adaptive Resonance Theory (ART). (A short sketch contrasting supervised and unsupervised learning follows this list.)

c. Semi-supervised Learning: This sits between the supervised and unsupervised approaches: we can use labeled as well as unlabeled data. The general opinion is that unlabeled data, used in conjunction with a small amount of labeled data, can improve the accuracy of the model.

d. Reinforcement Learning: Reinforcement learning is inspired by behavioral psychology. Software agents are programmed to take actions that maximize a cumulative reward. The same concept is also used in Game Theory, Operations Research, and Control Theory.

e. Transduction Learning: This is a supervised-style approach in which reasoning goes directly from specific (training) cases to specific (test) cases, rather than from specific cases to a general rule as in induction. It is preferred over induction when we only need answers for the cases at hand rather than a general rule.

f. Learning to Learn: It is also known as Meta Learning. In this approach, the learning algorithm is flexible enough to modify itself as it encounters new scenarios.
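
As mentioned under (a) and (b), the key practical difference is whether the training data carries labels. A minimal sketch of that contrast (assuming Python with scikit-learn; k-means stands in for the unsupervised side because SOM and ART are not part of scikit-learn):

```python
# Supervised vs. unsupervised learning on the same data; assumes scikit-learn.
from sklearn.datasets import make_blobs
from sklearn.tree import DecisionTreeClassifier
from sklearn.cluster import KMeans

X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: the labels y are used while fitting the model.
clf = DecisionTreeClassifier(random_state=0).fit(X, y)
print(clf.predict(X[:5]))

# Unsupervised: only X is used; the algorithm discovers the structure itself.
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print(km.labels_[:5])
```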
The most common approach to Supervised learning is Inductive learning. We start with a large dataset and split it into a training dataset and test dataset.

We start with input data x and the corresponding outputs f(x), and try to infer the function f. The proposed model is first fit on the training dataset. Once the model is mature, we evaluate it on the test dataset. When we reach the desired level of accuracy, we can use the model on real data.
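
For instance, here is a minimal sketch of inferring f from observed (x, f(x)) pairs (assuming Python with scikit-learn and a made-up target function):

```python
# Inductive learning sketch: infer a function f from (x, f(x)) samples.
# Assumes scikit-learn; the target function below is purely illustrative.
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=(200, 1))
fx = 3.0 * x[:, 0] + 2.0 + rng.normal(0, 0.5, size=200)  # noisy samples of f(x) = 3x + 2

# Fit on the first 150 samples, hold out the last 50 as a test set.
model = LinearRegression().fit(x[:150], fx[:150])
print("Learned slope and intercept:", model.coef_[0], model.intercept_)
print("Test R^2:", model.score(x[150:], fx[150:]))
```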

Some of the popular algorithms of supervised learning are:

Support Vector Machines
Linear Regression
Logistic Regression
Naive Bayes
Decision Tree
K Nearest Neighbor
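
A minimal sketch fitting several of these algorithms on the same split so their test accuracy can be compared (assuming Python with scikit-learn; the dataset and default hyperparameters are illustrative only):

```python
# Fit several of the listed supervised algorithms and compare test accuracy.
# Assumes scikit-learn; the data is synthetic.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=600, n_features=10, random_state=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=2)

models = {
    "Support Vector Machine": SVC(),
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=2),
    "K Nearest Neighbor": KNeighborsClassifier(),
}
for name, model in models.items():
    print(name, model.fit(X_train, y_train).score(X_test, y_test))
```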
We use training and test datasets in the Supervised machine learning approach.

The purpose of the training dataset is to let the model discover the predictive relationship in the data. From the training dataset, the model learns the behavior of the data and adjusts its parameters accordingly; the model is built entirely from what it discovers in the training data.

Once the model is mature, we use the test dataset to get the accuracy of the hypothesis. Since the model has not seen the test dataset during the training phase, it gives us impartial results. Therefore, it is very important to keep the test dataset separate from the training dataset.

Sometimes the model runs into the problem of overfitting: it learns the noise in the training data rather than the underlying pattern, so it performs well on the training data but poorly on unseen data. If the training and test datasets get mixed up, the test accuracy becomes misleadingly optimistic and can hide this overfitting.
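
A minimal sketch of how overfitting shows up as a gap between training and test accuracy (assuming Python with scikit-learn; the tree is deliberately left unconstrained so it can memorize the noisy training data):

```python
# Overfitting sketch: an unconstrained decision tree memorizes the training
# data, so training accuracy is near perfect while test accuracy lags behind.
# Assumes scikit-learn; the data is synthetic and deliberately noisy.
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=400, n_features=20, flip_y=0.2, random_state=3)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=3)

tree = DecisionTreeClassifier(random_state=3).fit(X_train, y_train)  # no depth limit
print("Train accuracy:", tree.score(X_train, y_train))  # typically close to 1.0
print("Test accuracy:", tree.score(X_test, y_test))     # noticeably lower
```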
Real-world problems come in a large variety of types, each requiring its own analysis, and each can be solved in multiple ways. Some of the approaches to solving a problem with Machine Learning are as follows:

a. Inductive learning

b. Unsupervised learning

c. Recommendation system

d. Predictive modeling

e. Artificial Neural Networks

f. Decision tree learning

g. Deep learning

h. Support vector machine

i. Clustering

j. Bayesian networks