narashil.github.io

This repository consists of four main projects, listed below with direct links. The rest of the files in the repository are supporting libraries and images.

August 2020: Rescuing survivors under tarps - Haiti Earthquake (Scikit-learn ML models)
https://narashil.github.io/Data_Mining_Project_Narayan.html
Developed and implemented five ML models (Logistic Regression, KNN, Random Forest, Linear SVC, SVM) using scikit-learn to classify blue tarps and thereby help locate earthquake survivors sheltering under them. Automated the selection of the threshold that maximizes the F1-score along the precision-recall curve, and the evaluation of the cost tradeoff between false positives and false negatives. Conducted hyperparameter tuning and cross-validation using pipelines and grid search to find the optimal parameters. Finally, compared classification metrics such as accuracy, log-loss, precision, recall, and run-time to choose the best algorithm.
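
The threshold selection described above can be illustrated with a minimal, library-free sketch. It is not the project's code: the function name `best_f1_threshold` and the toy labels and scores are hypothetical, and in a scikit-learn workflow `precision_recall_curve` would supply the same precision and recall values per threshold.

```python
def best_f1_threshold(y_true, scores):
    """Return (threshold, f1) for the score threshold that maximizes F1.

    Hypothetical helper for illustration: a sample is predicted
    positive when its score is >= the threshold.
    """
    best_t, best_f1 = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        fn = sum(1 for y, s in zip(y_true, scores) if s < t and y == 1)
        if tp == 0:
            continue  # precision and F1 are undefined or zero here
        precision = tp / (tp + fp)
        recall = tp / (tp + fn)
        f1 = 2 * precision * recall / (precision + recall)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1


# Toy data (made up): 1 = blue tarp, 0 = anything else.
y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
threshold, f1 = best_f1_threshold(y_true, scores)
print(threshold, round(f1, 3))
```

Scanning only the distinct scores is enough, because the predictions (and therefore F1) can change only when the threshold crosses one of them.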

July 2021: Predicting education level using US Census data (using PySpark)
https://narashil.github.io/Predicting%20Educ%20Level%20Using%20Census%20Data_Code.html
Implemented a Random Forest ML model in PySpark to predict a resident's education level from household characteristics. Used Principal Component Analysis (PCA) to reduce dimensionality and identify the predictors that contribute most. Optimized the runtime by using graph-based cross-validation methods in PySpark.

February 2021: Diabetes Mellitus prediction (XGBoost Classification in Python)
https://narashil.github.io/wids_2021_datathon_narayan.html
Developed an XGBoost classification model to predict which patients have Diabetes Mellitus based on various health features. Imputed missing data using several methods to improve the training data.
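
The description does not say which imputation methods were used, so the following is only a sketch of two common ones, mean and median imputation, written without external libraries. The helper name `impute_column` and the sample values are hypothetical.

```python
from statistics import mean, median


def impute_column(values, strategy="median"):
    """Replace None entries with the mean or median of the observed values.

    Hypothetical helper for illustration; missing values are
    represented as None.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    if strategy == "median":
        fill = median(observed)
    elif strategy == "mean":
        fill = mean(observed)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if v is None else v for v in values]


# Made-up numeric column with one missing entry.
column = [1.0, None, 3.0, 10.0]
print(impute_column(column, "median"))
print(impute_column(column, "mean"))
```

The median is less sensitive to outliers than the mean, which is one reason to compare several strategies before settling on one.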

May 2020: COVID-19 Real-time Data Analysis
https://narashil.github.io/03_CS5010_Project_Code_Andris_Hombal_Narayan_Sharma_Shah.html
Read the USA Facts data and analyzed it in several ways to show the impact of COVID-19 across the USA. Used libraries such as pandas, matplotlib, and plotly, along with choropleth maps, to visualize the impact. Demonstrated the implementation of the project using agile practices and functional programming.