Top 50 Data Science Interview Questions

Missing Values
Noise in the Data Set
Outliers
Mixture of Different Languages (like English and Chinese)
Range Constraints
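
Some of the issues listed above can be checked with a few lines of code. The following is a minimal sketch using only the Python standard library. The helper names (`count_missing`, `iqr_outliers`, `range_violations`), the toy `ages` column, the use of `None` for a missing value, and the 1.5 × IQR outlier rule are all assumptions made for illustration, not a prescribed method.

```python
import statistics

def count_missing(values):
    """Count entries recorded as None (assumed marker for a missing value)."""
    return sum(1 for v in values if v is None)

def iqr_outliers(values, k=1.5):
    """Return values outside [Q1 - k*IQR, Q3 + k*IQR], ignoring missing entries."""
    present = [v for v in values if v is not None]
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in present if v < low or v > high]

def range_violations(values, minimum, maximum):
    """Return non-missing values that break a range constraint."""
    return [v for v in values if v is not None and not (minimum <= v <= maximum)]

# Hypothetical "age" column with missing values, an outlier, and an invalid entry.
ages = [23, 25, None, 27, 24, 26, 25, 240, -3, None]

print("missing:", count_missing(ages))
print("outliers:", iqr_outliers(ages))
print("out of range:", range_violations(ages, 0, 120))
```

In practice a library such as pandas would usually do this work. The standard-library version is used here only to keep the sketch self-contained.
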
Fraud detection
Disease screening
An imbalanced data set is one in which one class is far larger than the other (e.g., 99% non-fraud and 1% fraud). It can be handled by oversampling the minority class, undersampling the majority class, or using a penalized (cost-sensitive) machine learning algorithm.
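
The three approaches to an imbalanced data set can be sketched as follows, using only the Python standard library. The function names, the toy labels, and the fixed seed are assumptions made for illustration. The weight formula `n_samples / (n_classes * class_count)` is the common "balanced" heuristic, and a penalized algorithm would use such weights to make errors on the rare class cost more.

```python
import random
from collections import Counter

def random_oversample(rows, labels, seed=0):
    """Duplicate minority-class rows at random until every class matches the largest."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels

def random_undersample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to match the smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(pool, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels

def balanced_class_weights(labels):
    """Weight each class inversely to its frequency: n / (k * count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * c) for cls, c in counts.items()}

# Toy data: 99 non-fraud rows and 1 fraud row.
labels = ["non-fraud"] * 99 + ["fraud"]
rows = [[i] for i in range(100)]

over_rows, over_labels = random_oversample(rows, labels)
under_rows, under_labels = random_undersample(rows, labels)
weights = balanced_class_weights(labels)

print(Counter(over_labels))
print(Counter(under_labels))
print(weights)
```

Oversampling keeps all the data but repeats minority rows, which can encourage overfitting. Undersampling discards majority rows, which can throw away useful information. Class weights leave the data unchanged and adjust the loss instead.
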
Classical machine learning algorithms work well on small data sets, but some can take a very long time to train on large data. Deep learning algorithms, in contrast, typically need large data sets to perform well, and GPUs (parallel processing) keep their training time on such data manageable.
Linear Regression and Logistic Regression
Decision Trees and Random Forest
SVM
Naïve Bayes
XGBoost