Machine Learning

In inductive machine learning, we start with input samples x and their observed outputs f(x). Our aim is to estimate the function f.

In simple words, we try to learn by example. We generalize from the samples so that we can estimate the output for new inputs.

E.g. let's say we want to classify two types of fruit: apple and watermelon. We can measure the height and weight of these fruits in our training data set and label each sample as apple or watermelon.

Then we give this labeled data to our model. The model can learn that heavier, taller things are watermelons and lighter, smaller things are apples. This is an example of inductive learning.

In deductive learning, we start from a rule we already know and apply it to specific cases. We look at these fruits and conclude that watermelons are heavier than apples. Now we apply this rule to our data: this object is lighter, therefore it is an apple.
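The contrast between the two approaches can be sketched in a few lines of Python. This is only an illustration: the measurements are made up, the function names `learn_weight_threshold` and `classify` are hypothetical, and the "model" is just a weight threshold halfway between the class averages (height is recorded but not used).

```python
# Hypothetical labeled training samples: (height_cm, weight_kg, label).
# The numbers are invented for illustration only.
TRAINING_DATA = [
    (7.0, 0.15, "apple"),
    (8.0, 0.20, "apple"),
    (7.5, 0.18, "apple"),
    (25.0, 4.50, "watermelon"),
    (30.0, 6.00, "watermelon"),
    (28.0, 5.20, "watermelon"),
]


def learn_weight_threshold(samples):
    """Inductive step: generalize from labeled examples to a rule.

    The rule is a weight threshold halfway between the mean weight
    of the apples and the mean weight of the watermelons.
    """
    apple_weights = [w for _, w, label in samples if label == "apple"]
    melon_weights = [w for _, w, label in samples if label == "watermelon"]
    apple_mean = sum(apple_weights) / len(apple_weights)
    melon_mean = sum(melon_weights) / len(melon_weights)
    return (apple_mean + melon_mean) / 2.0


def classify(weight_kg, threshold):
    """Deductive step: apply an already known rule to one specific fruit."""
    return "watermelon" if weight_kg > threshold else "apple"


threshold = learn_weight_threshold(TRAINING_DATA)
print(classify(0.17, threshold))
print(classify(5.0, threshold))
```

Learning the threshold from the samples is the inductive part. Applying the resulting rule to a single new fruit is the deductive part.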

For disease diagnosis, we generally use inductive machine learning. Based on certain symptoms, a model can predict whether the disease is present or not.

Some of the popular uses of inductive machine learning are as follows:

Disease diagnosis: Let's say x is the set of symptoms of a patient.
Then f(x) is the disease the patient is suffering from. We can derive the function f by inductive machine learning.

Credit risk analysis: Let's say x is the set of important financial indicators of a person, e.g. Social Security, credit score, etc.
Then f(x) tells us whether credit is approved for this person. In this case, f is the learned machine learning model.

Self-driving car: In this case, x is the set of images from the different cameras of the car.
Then f(x) is the angle to which the steering wheel should be turned to follow the path.

Face recognition: Let's say X is a dataset of images of different people.
We can learn a model f(x) that returns the name of the person shown in an image x from X.

Some of the popular algorithms of machine learning are as follows:

  • a) Linear Regression

  • b) Logistic Regression

  • c) K Nearest Neighbor (KNN)

  • d) Decision Tree Learning

  • e) Artificial Neural Network

  • f) Gradient Boosting Algorithm

  • g) Naive Bayes Algorithm

  • h) Support Vector Machine (SVM)

We use Linear Regression to predict quantities such as the price of a car, the price of a house, or a profit. It is a technique for estimating a real-valued (continuous) output from one or more input variables.

E.g. the price of a house can range from $5,000 to $1,500,000 based on the area in sq. ft., the year it was built, the size of the lot, etc. It can be any number within this range.

One of the simplest ways of performing Linear Regression is to fit a straight line. Once we have the best fit line based on our training data, we can use it to predict the value for new data.

For a straight line we have to estimate the slope a and the intercept b. The equation is:

y = aX + b

Let's say we have a graph with X = sq. ft. of the house and y = price of the house. Our model might estimate a = 100 and b = 190 for the best fit line.

Now we can get the value of any house by using the following formula:

y = 100X + 190

We just need to replace X with the sq. ft. of the house to get the price of the house.
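As a minimal sketch of how such a line could be estimated, the following Python code applies the standard least-squares formulas for one input variable. The function names `fit_line` and `predict` are hypothetical, and the training data is invented so that it lies exactly on y = 100X + 190. Real house prices would be noisy and would not recover these values exactly.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one variable: returns (slope a, intercept b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance of X and y divided by the variance of X.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    a = cov_xy / var_x
    # The best fit line always passes through the point (mean_x, mean_y).
    b = mean_y - a * mean_x
    return a, b


def predict(x, a, b):
    """Evaluate the fitted line y = aX + b at a new value of X."""
    return a * x + b


# Hypothetical training data generated from y = 100X + 190.
areas = [500.0, 1000.0, 1500.0, 2000.0]
prices = [100.0 * x + 190.0 for x in areas]

a, b = fit_line(areas, prices)
print(a, b)
print(predict(1200.0, a, b))
```

Because the sample points sit exactly on the line, the fit returns the slope and intercept used to generate them, and `predict` then replaces X with the area of a new house.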

In Linear Regression, X is the independent variable and y is the dependent variable. The value of y depends on the value of X.

Logistic Regression is a classification algorithm. Although its name suggests a regression technique, it is used for classification problems.

We can use it to determine the class of a data point. It can be a binary class such as 0 or 1, yes or no, etc.

Logistic Regression is based on probability. The model passes a weighted sum of the inputs through the logistic (sigmoid) function, which produces a probability between 0 and 1. We then apply a threshold, typically 0.5, to this probability. The threshold acts like a step function that assigns each value to one of the two classes.
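In the usual formulation of Logistic Regression, the probability comes from the logistic (sigmoid) function and the class comes from comparing that probability with a cutoff. A minimal sketch, assuming a cutoff of 0.5 and the hypothetical function names `sigmoid` and `predict_class`:

```python
import math


def sigmoid(z):
    """Logistic function: maps a real number z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_class(z, threshold=0.5):
    """Return class 1 if the probability reaches the threshold, else class 0."""
    return 1 if sigmoid(z) >= threshold else 0


# z stands for the weighted sum of the inputs, e.g. z = a * X + b.
for z in (-2.0, 0.0, 2.0):
    print(z, sigmoid(z), predict_class(z))
```

A large positive z gives a probability close to 1, a large negative z gives a probability close to 0, and z = 0 gives exactly 0.5.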

E.g. let's say we want to predict whether it will be a rainy day or not based on the temperature on that day. We have to give one of two answers.

The answer is true for a rainy day and false for a sunny day. We can use the temperature in our model to predict the probability of rain.

Then we compare this probability with a threshold, such as 0.5, to divide the days into two classes, 0 or 1. We can use 0 for a sunny day and 1 for a rainy day.
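The rainy day example can be sketched end to end in Python. Everything specific here is an assumption made for illustration: the temperatures and labels are invented (cooler days are marked rainy), the model is fitted with plain batch gradient descent, and the names `train`, `rain_probability` and `is_rainy` are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function: maps a real number z to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train(xs, ys, learning_rate=0.1, steps=2000):
    """Fit p(rain) = sigmoid(w * x + b) by batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = sigmoid(w * x + b) - y  # predicted probability minus label
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical data: temperature in degrees Celsius, 1 = rainy, 0 = sunny.
temps = [10, 12, 14, 16, 24, 26, 28, 30]
rainy = [1, 1, 1, 1, 0, 0, 0, 0]

# Center and scale the temperatures so gradient descent takes sensible steps.
mean_t = sum(temps) / len(temps)
scale_t = 10.0
xs = [(t - mean_t) / scale_t for t in temps]

w, b = train(xs, rainy)


def rain_probability(temp_c):
    """Predicted probability of rain for a temperature in degrees Celsius."""
    return sigmoid(w * (temp_c - mean_t) / scale_t + b)


def is_rainy(temp_c, threshold=0.5):
    """Class label: 1 for a rainy day, 0 for a sunny day."""
    return 1 if rain_probability(temp_c) >= threshold else 0


print(rain_probability(11), is_rainy(11))
print(rain_probability(29), is_rainy(29))
```

With this invented data the learned weight is negative, so the predicted probability of rain falls as the temperature rises, and the 0.5 threshold turns that probability into the final 0 or 1 answer.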