Top 50 Data Science Interview Questions


Apply the step function, which calculates the AIC (Akaike Information Criterion) for different combinations of features in a stepwise search and returns the set of features with the lowest AIC for the dataset.
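To make the criterion concrete, here is a minimal Python sketch of how AIC ranks candidate models. It is not the step function itself: the helper name `aic`, the feature names, and the log-likelihood values are all hypothetical and chosen only for illustration.

```python
def aic(log_likelihood, n_params):
    """Akaike Information Criterion: 2k - 2*ln(L). Lower is better."""
    return 2 * n_params - 2 * log_likelihood

# Hypothetical candidates: feature subset -> (maximized log-likelihood, parameter count).
# The numbers are made up for illustration, not taken from a real fit.
candidates = {
    ("age",): (-120.0, 2),
    ("age", "income"): (-110.0, 3),
    ("age", "income", "zip_code"): (-109.8, 4),
}

scores = {features: aic(ll, k) for features, (ll, k) in candidates.items()}
best_features = min(scores, key=scores.get)  # subset with the lowest AIC
print(best_features, scores[best_features])
```

The third model fits slightly better than the second, but the penalty of 2 per extra parameter outweighs the gain, so the two-feature model wins. That trade-off between fit and model size is what a stepwise AIC search relies on.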
Feature engineering is the process of using domain knowledge of the data to create features that make machine learning algorithms work. It includes:
- Adding new columns to, or removing columns from, the existing dataset
- Outlier detection
- Normalization, etc.
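Two of the steps listed above, normalization and outlier detection, can be sketched in a few lines of Python. This is one possible approach under stated assumptions: min-max scaling for normalization and the 1.5 × IQR rule for outliers. The function names and the `marks` column are hypothetical.

```python
import statistics

def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

marks = [52, 55, 58, 60, 61, 63, 65, 98]  # hypothetical column with one extreme value
normalized = min_max_normalize(marks)
outliers = iqr_outliers(marks)
print(normalized)
print(outliers)
```

Other choices, such as z-score standardization or a different outlier rule, may suit a given dataset better. The point is only that each step is a small, explicit transformation of a column.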
In logistic regression, we can use step(), which gives the AIC score for a set of features.
In a decision tree, we can use information gain (which internally uses entropy).
In a random forest, we can use varImpPlot (from R's randomForest package).
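The information gain used for decision trees can be sketched directly from its definition: the entropy of the labels minus the weighted entropy that remains after splitting on a feature. The Python below is a minimal illustration, assuming categorical features. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total)
                for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Label entropy minus the weighted entropy after splitting on the feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder

labels = ["yes", "yes", "no", "no"]
perfect = ["a", "a", "b", "b"]   # separates the classes completely
useless = ["a", "b", "a", "b"]   # tells us nothing about the class

print(information_gain(perfect, labels))
print(information_gain(useless, labels))
```

A feature with higher information gain separates the classes better, which is why it is a natural ranking for feature selection in tree-based models.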
Multicollinearity exists when two or more predictors are highly correlated with each other.
Example: If the dataset contains both the grades and the marks for 2nd PUC, the two columns capture the same trend. This redundancy can hamper the speed of the model, so we need to check whether multicollinearity exists by using the VIF (Variance Inflation Factor).
Note: As a rule of thumb, a Variance Inflation Factor greater than 4 indicates a multicollinearity problem (some practitioners use 5 or 10 as the cutoff).
The VIF measures how much the variance of the estimated regression coefficients is inflated compared to when the predictor variables are not linearly related.
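For predictor j, the VIF is 1 / (1 - R_j^2), where R_j^2 comes from regressing predictor j on all the other predictors. The Python sketch below covers only the simplest case of two predictors, where R_j^2 equals the squared Pearson correlation between them. The function names and the `marks`, `grades`, and `hours` columns are hypothetical, made up to echo the example above.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def vif_two_predictors(x1, x2):
    """VIF = 1 / (1 - R^2); with two predictors, R^2 is the squared correlation."""
    r_squared = pearson_r(x1, x2) ** 2
    if r_squared >= 1.0:  # perfectly collinear predictors
        return math.inf
    return 1.0 / (1.0 - r_squared)

marks = [35, 48, 52, 61, 70, 77, 84, 93]   # hypothetical marks
grades = [4, 5, 5, 6, 7, 8, 8, 9]          # hypothetical grades that track the marks
hours = [5, 1, 4, 2, 6, 3, 7, 2]           # hypothetical unrelated predictor

print(vif_two_predictors(marks, grades))  # well above the threshold of 4
print(vif_two_predictors(marks, hours))   # close to 1, no problem
```

With more than two predictors, each R_j^2 would have to come from a full multiple regression, which statistical packages compute directly.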