Machine Learning


Two main paradigms of ensemble learning are as follows:

Sequential ensemble: In this paradigm, we use multiple models in a sequence. The prediction output of one model is used as an input for the next model.

Parallel ensemble: In this paradigm, multiple models are used in parallel. The prediction results from these models are combined to give the final output of the ensemble model.
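As a rough illustration, here is a minimal sketch (assuming scikit-learn; the dataset and model choices are arbitrary) of the two paradigms: a parallel ensemble that combines independent models by majority vote, and a sequential one in which each new tree is fitted to the errors of the ensemble built so far.

```python
# Minimal sketch of the two ensemble paradigms (assumes scikit-learn is installed).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Parallel ensemble: independent models, predictions combined by majority vote.
parallel = VotingClassifier(
    estimators=[("lr", LogisticRegression(max_iter=1000)),
                ("dt", DecisionTreeClassifier(max_depth=5))],
    voting="hard",
)
parallel.fit(X_train, y_train)

# Sequential ensemble: each new tree is fitted to the errors made by
# the models built so far.
sequential = GradientBoostingClassifier(n_estimators=100, random_state=0)
sequential.fit(X_train, y_train)

print("parallel accuracy:  ", parallel.score(X_test, y_test))
print("sequential accuracy:", sequential.score(X_test, y_test))
```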
In ensemble learning, we combine two or more models to build a more accurate model. Boosting and bagging are two approaches used in ensemble learning.

Bagging: Bagging is also known as Bootstrap aggregating. In bagging, we reduce the variance of the model by training each base model on a bootstrap sample, i.e. a random sample of the training data drawn with replacement. The predictions of these models are then averaged (or combined by majority vote), which makes the ensemble less sensitive to the variance of any single model.

Boosting: Boosting is a sequential ensemble technique. We first train a weak model on the data, then train subsequent models that concentrate on the examples the earlier models got wrong. The weighted combination of these weak models gives a stronger final model. A minimal sketch comparing bagging and boosting is given after the comparison below.

Data partition: In bagging, each model is trained on a random bootstrap sample of the data. In boosting, misclassified examples are given higher weight in later rounds.

Goal: In bagging, the goal is to reduce the variance of the model. Boosting aims to increase the prediction accuracy of the model by reducing bias.

Method: Bagging draws random bootstrap samples (and sometimes random feature subsets). Boosting re-weights the data at each step; gradient boosting fits each new model to the gradient of the loss.

Combination: Bagging combines predictions with a simple average or majority vote. Boosting combines them with a weighted majority vote (or weighted sum), giving more accurate models a larger say.
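To make the contrast concrete, here is a minimal sketch (assuming scikit-learn; the synthetic dataset and hyperparameters are illustrative) that fits a bagging ensemble and an AdaBoost ensemble of decision trees on the same data.

```python
# Sketch comparing bagging and boosting with default decision-tree base learners.
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Bagging: each tree sees a bootstrap sample; predictions are averaged.
bagging = BaggingClassifier(n_estimators=50, random_state=0)

# Boosting: trees are added sequentially, re-weighting misclassified samples.
boosting = AdaBoostClassifier(n_estimators=50, random_state=0)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean accuracy = {scores.mean():.3f}")
```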
PCA stands for Principal Component Analysis.

KPCA stands for Kernel-based Principal Component Analysis.

ICA stands for Independent Component Analysis.

These are feature extraction techniques used in machine learning.

The primary purpose of PCA, KPCA, and ICA is dimensionality reduction.
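The sketch below (a minimal illustration using scikit-learn; the two-component setting and the iris data are arbitrary choices) applies each of the three techniques to the same data set.

```python
# Reducing the 4-dimensional iris data to 2 components with PCA, KPCA and ICA.
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA, KernelPCA, FastICA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)
X = StandardScaler().fit_transform(X)   # scaling matters for all three methods

X_pca = PCA(n_components=2).fit_transform(X)                       # linear, maximizes variance
X_kpca = KernelPCA(n_components=2, kernel="rbf").fit_transform(X)  # non-linear, via a kernel
X_ica = FastICA(n_components=2, random_state=0).fit_transform(X)   # statistically independent components

print(X_pca.shape, X_kpca.shape, X_ica.shape)   # each is (150, 2)
```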
We use an incremental learning algorithm to keep improving a model's performance as new data becomes available.

In this algorithm, input data is continuously used to further train the model. It is a dynamic technique of machine learning.

Some examples of incremental learning algorithms are incremental decision trees, artificial neural networks, and incremental SVM.

The goal of incremental learning is to adapt the model to new data while maintaining the knowledge learned earlier.

Incremental learning is very popular for data streams in Big Data applications.
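As an illustration, the sketch below (assuming scikit-learn; the batch size is arbitrary) updates a linear classifier batch by batch with partial_fit, which is the typical way incremental learning is done on data streams.

```python
# Incremental learning: the model is updated batch by batch, not retrained from scratch.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=10_000, n_features=20, random_state=0)
classes = np.unique(y)           # all classes must be declared on the first call

model = SGDClassifier(random_state=0)

batch_size = 1_000
for start in range(0, len(X), batch_size):
    X_batch = X[start:start + batch_size]
    y_batch = y[start:start + batch_size]
    model.partial_fit(X_batch, y_batch, classes=classes)   # incremental update
    print(f"seen {start + len(X_batch):>6} samples, "
          f"accuracy on this batch: {model.score(X_batch, y_batch):.3f}")
```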
We use dimensionality reduction techniques in machine learning to perform feature selection or feature extraction.

With dimensionality reduction we aim to find the important features that can be used for prediction.

Dimensionality reduction reduces the number of random variables or features under consideration in a machine learning algorithm; a short sketch contrasting feature selection and feature extraction is given after the list of advantages below.

Some of the advantages of dimensionality reduction (DR) are as follows:

a) DR reduces the storage and the time needed to run a machine learning algorithm, making the algorithm more efficient.

b) DR removes the multi-collinearity between features, which can improve the performance of the model.

c) DR makes it easy to visualize the data in 2D or 3D views.
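Here is a minimal sketch (assuming scikit-learn; the choice of k and the number of components is arbitrary) showing the two flavours of dimensionality reduction mentioned above: feature selection keeps a subset of the original columns, while feature extraction builds new ones.

```python
# Feature selection keeps original features; feature extraction derives new ones.
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=500, n_features=30, n_informative=5, random_state=0)

# Feature selection: keep the 5 original features most related to the target.
X_selected = SelectKBest(score_func=f_classif, k=5).fit_transform(X, y)

# Feature extraction: project onto 5 new directions of maximum variance.
X_extracted = PCA(n_components=5).fit_transform(X)

print("original: ", X.shape)            # (500, 30)
print("selected: ", X_selected.shape)   # (500, 5)
print("extracted:", X_extracted.shape)  # (500, 5)
```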