Hello
Here, you’ll find a collection of my previous projects, each showing how I turn a complex problem into a working solution. Take a look around, and let's explore how the same expertise could apply to your ideas!
PCOD Risk Prediction Model
Objective: To develop a comprehensive machine learning-based tool that predicts the likelihood of Polycystic Ovarian Disease (PCOD) based on patient-provided demographic, lifestyle, symptom, and medical data, while also offering personalized guidance for the next steps.
Key Features and Functionalities:
- PCOD Risk Prediction:
  - Built a machine learning model using Gradient Boosting Machines (GBM) to predict the likelihood of PCOD from users' demographic details, daily symptoms, and routine health check-up data.
  - The model returns an individual PCOD risk assessment for each user.
- Personalized Dietary and Lifestyle Recommendations:
  - For individuals identified at an early or manageable stage, the tool automatically generates a customized dietary and exercise plan, helping them manage and potentially reverse the condition through lifestyle modifications.
- Doctor and Hospital Recommendations:
  - If the assessed risk indicates the need for medical intervention, the tool recommends the best hospitals and specialists nearby based on the user's geographical location, helping streamline access to appropriate healthcare.
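The routing between a lifestyle plan and a referral can be illustrated with a short sketch. Everything specific here is an assumption made for illustration, not a detail of the project: the `REFERRAL_THRESHOLD` value, the `next_step` function, the hospital record layout, and ranking hospitals purely by great-circle distance (the project may well weigh other factors when choosing the "best" hospitals).

```python
import math

# Hypothetical cutoff: scores at or above this are referred to a specialist.
REFERRAL_THRESHOLD = 0.7


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def next_step(risk, user_location, hospitals, top_n=3):
    """Return a lifestyle plan for lower risk, else the nearest hospitals."""
    if risk < REFERRAL_THRESHOLD:
        return {"action": "lifestyle_plan"}
    lat, lon = user_location
    ranked = sorted(
        hospitals,
        key=lambda h: haversine_km(lat, lon, h["lat"], h["lon"]),
    )
    return {"action": "referral",
            "hospitals": [h["name"] for h in ranked[:top_n]]}
```

In this sketch the risk score is assumed to be a probability in the range 0 to 1 produced by the prediction model.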
Data and Methodology:
- Utilized a large-scale dataset with 700,000+ records, covering women of diverse backgrounds, ages, regions, and professions.
- Incorporated various health parameters, including lifestyle habits, physiological symptoms, and periodic medical check-up data.
- Applied advanced machine learning techniques with rigorous validation to ensure accuracy and reliability.
Skin Disease Prediction System
Objective: To develop an image-based diagnostic tool that predicts various skin diseases from user-uploaded images using deep learning techniques.
Key Features and Functionalities:
- Image-Based Disease Prediction:
  - Designed and implemented a Convolutional Neural Network (CNN) model capable of accurately identifying different skin diseases from images.
  - Enabled real-time diagnostic assistance through user-friendly image uploads.
- Enhanced Model Performance:
  - Applied data augmentation techniques to expand the diversity of training data, significantly improving the model's robustness and prediction accuracy.
- Scalable and Accessible Tool:
  - Developed to assist both individuals and healthcare professionals in obtaining quick preliminary assessments, reducing time to diagnosis.
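To make the data augmentation step concrete, here is a minimal sketch using plain nested lists as stand-in images. The specific transforms (horizontal flip and 90-degree rotations) and the function names are assumptions chosen for illustration; the project does not state which augmentations it applied, and a real pipeline would typically use an image library.

```python
import random


def hflip(img):
    """Mirror an image (list of pixel rows) left to right."""
    return [row[::-1] for row in img]


def rot90(img):
    """Rotate an image 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment(img, rng):
    """Apply a random flip and a random number of quarter turns."""
    out = [list(row) for row in img]  # copy so the input is left untouched
    if rng.random() < 0.5:
        out = hflip(out)
    for _ in range(rng.randrange(4)):
        out = rot90(out)
    return out


rng = random.Random(42)  # fixed seed so runs are repeatable
variants = [augment([[1, 2], [3, 4]], rng) for _ in range(5)]
```

Each variant keeps the same pixels in a different arrangement, which is what lets the training set grow without new labeling work.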
Course Recommendation System
Objective: To build a personalized course recommendation system for an online learning platform using machine learning and natural language processing.
Key Features and Functionalities:
- AI-Driven Course Recommendations:
  - Developed a recommendation engine powered by the BERT (Bidirectional Encoder Representations from Transformers) model to understand user preferences and suggest relevant courses.
- Priority-Based Ranking:
  - The system ranks recommendations by 1st, 2nd, and 3rd priority based on factors such as:
    - User’s previous searches.
    - Registered courses.
    - Progress and completion status of prior courses.
- Personalized Learning Path:
  - Tailored recommendations to align with each user’s learning journey, interests, and career goals, thereby enhancing engagement and motivation.
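One way the priority-based ranking could work is a weighted score over the listed signals. The sketch below is an assumption-laden illustration: the `Candidate` fields, the `WEIGHTS` values, and the choice to exclude already-registered courses are all hypothetical, and `similarity` stands in for whatever embedding-based relevance score the BERT model would provide.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    title: str
    similarity: float             # assumed embedding similarity to the user, 0..1
    search_hits: int              # related topics found in previous searches
    already_registered: bool
    prerequisite_progress: float  # completion of related prior courses, 0..1


# Hypothetical weights; a real system would tune these.
WEIGHTS = {"similarity": 0.5, "search": 0.3, "progress": 0.2}


def score(c, max_hits):
    """Combine the three signals into a single relevance score."""
    search = c.search_hits / max_hits if max_hits else 0.0
    return (WEIGHTS["similarity"] * c.similarity
            + WEIGHTS["search"] * search
            + WEIGHTS["progress"] * c.prerequisite_progress)


def top_three(candidates):
    """Rank unregistered courses and label the best three by priority."""
    pool = [c for c in candidates if not c.already_registered]
    max_hits = max((c.search_hits for c in pool), default=0)
    ranked = sorted(pool, key=lambda c: score(c, max_hits), reverse=True)
    return [(f"priority {i}", c.title)
            for i, c in enumerate(ranked[:3], start=1)]
```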
Outcome and Impact:
- Enhanced the learning experience by delivering highly personalized course recommendations, resulting in improved discoverability and increased course completion rates.
- Improved user satisfaction by providing relevant, goal-aligned course suggestions, contributing to better learning outcomes.
Clinical Notes Summarizer using Transformers
Objective: To automate the summarization of unstructured clinical notes into concise, structured summaries using transformer-based NLP models, thereby reducing physician workload and improving clinical decision support.
Key Features and Functionalities:
- Automated Summarization: Leveraged a fine-tuned BERT-based model to generate short, context-aware summaries of clinical text from electronic health records (EHR).
- Medical Entity Extraction: Integrated named entity recognition (NER) to highlight key components like symptoms, medications, diagnoses, and procedures.
- Real-Time API Deployment: Deployed the model via FastAPI to enable real-time summarization within existing hospital systems.
Data and Methodology:
- Used anonymized clinical text data from MIMIC-III for fine-tuning.
- Applied transfer learning on BERT and BioBERT models.
- Evaluated summaries using ROUGE scores and clinician feedback.
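For readers unfamiliar with ROUGE, the sketch below computes ROUGE-1, the unigram-overlap variant, from scratch. It assumes simple lowercase whitespace tokenization, which is a simplification; the project does not say which ROUGE variants or tooling it used, and the example sentences are made up.

```python
from collections import Counter


def rouge1(reference, candidate):
    """Return ROUGE-1 precision, recall, and F1 for two strings."""
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Counter intersection clips each word to its minimum count in both texts.
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return {"precision": 0.0, "recall": 0.0, "f1": 0.0}
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    f1 = 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1(
    "patient reports chest pain and nausea",  # clinician-written reference
    "patient reports chest pain",             # model summary
)
```

Here the summary covers four of the six reference words, so recall is lower than precision; that trade-off is what the F1 value balances.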
Credit Risk Scoring System with Explainable AI
Objective: To build a transparent and accurate machine learning system to assess loan default risks and assist financial institutions in informed lending decisions.
Key Features and Functionalities:
- Risk Prediction: Developed a credit scoring model using XGBoost and logistic regression based on customer demographic, financial, and credit history data.
- Explainable Predictions: Incorporated SHAP values for interpretability, allowing underwriters to understand key drivers of each prediction.
- Interactive Visualization: Built a Tableau dashboard to visualize credit risk scores, feature importance, and user-specific explanations.
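To show what a per-prediction explanation looks like, here is a minimal sketch for the logistic regression case only. For a linear model, and assuming features are treated as independent, the SHAP value of a feature is its weight times the feature's deviation from the background mean, measured in log-odds. The feature names, weights, and values are invented for illustration; tree models such as XGBoost need a different algorithm, usually provided by the `shap` library.

```python
def linear_shap(weights, bias, background_means, x):
    """Return (base_value, contributions) for one prediction, in log-odds."""
    # Base value: the model output at the average applicant.
    base = bias + sum(w * background_means[name] for name, w in weights.items())
    # Each feature's push away from the base value.
    contributions = {
        name: w * (x[name] - background_means[name])
        for name, w in weights.items()
    }
    return base, contributions


# Hypothetical model and applicant.
weights = {"income": -0.5, "utilization": 2.0}
means = {"income": 1.0, "utilization": 0.3}
applicant = {"income": 2.0, "utilization": 0.8}

base, contrib = linear_shap(weights, 0.1, means, applicant)
top_driver = max(contrib, key=lambda name: abs(contrib[name]))
```

The contributions add up to the difference between this applicant's score and the base value, which is the property that makes the explanation easy for an underwriter to read.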
Data and Methodology:
- Used a structured financial dataset with thousands of loan records and repayment history.
- Handled class imbalance with SMOTE and validated with stratified k-fold cross-validation.
- Deployed the model and dashboard via AWS EC2 and RDS.
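The idea behind SMOTE is to create synthetic minority samples by interpolating between a real minority sample and one of its nearest minority neighbours. The sketch below implements that core step from scratch as an illustration; the function signature, the default `k`, and the toy data are assumptions, and a production pipeline would more likely rely on an existing library implementation.

```python
import random


def smote(minority, n_new, k=3, rng=None):
    """Generate n_new synthetic points from a list of minority-class tuples."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest other minority points (squared Euclidean).
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: sum((a - b) ** 2 for a, b in zip(base, minority[j])),
        )[:k]
        neighbour = minority[rng.choice(others)]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


defaults = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]  # toy minority class
new_points = smote(defaults, n_new=10)
```

When combined with stratified k-fold cross-validation, oversampling is normally applied to the training folds only, so that validation scores reflect the real class balance.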
Real-Time Student Engagement Detector
Objective: To create an intelligent computer vision system that monitors student attention in virtual classrooms and provides real-time feedback to instructors.
Key Features and Functionalities:
- Engagement Detection: Used CNN and OpenCV to analyze facial expressions, blink rate, and head pose from webcam footage.
- Real-Time Monitoring: Streamed engagement scores via a dashboard to alert instructors when student attention dropped.
- Privacy-Compliant Deployment: Deployed locally on edge devices to ensure data privacy and low-latency performance.
Data and Methodology:
- Collected labeled video datasets representing engaged vs. distracted behaviors.
- Trained CNN on preprocessed frames; applied Haar cascades and Dlib landmarks.
- Deployed using Flask + Streamlit.
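Blink rate is commonly derived from facial landmarks through the eye aspect ratio (EAR), which drops sharply when the eye closes. The sketch below shows that calculation under stated assumptions: six landmarks per eye ordered p1 to p6 around the contour, a closed-eye `threshold` of 0.2, and a minimum of two consecutive closed frames. These values are illustrative defaults, not figures from the project, and the project may have measured blinks differently.

```python
import math


def eye_aspect_ratio(eye):
    """Compute EAR from six (x, y) landmarks ordered p1..p6 around the eye."""
    p1, p2, p3, p4, p5, p6 = eye
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


def count_blinks(ear_series, threshold=0.2, min_frames=2):
    """Count runs of at least min_frames closed-eye frames that then reopen."""
    blinks = 0
    run = 0
    for ear in ear_series:
        if ear < threshold:
            run += 1
        else:
            if run >= min_frames:
                blinks += 1
            run = 0
    return blinks


open_eye = [(0, 0), (1, 1), (2, 1), (3, 0), (2, -1), (1, -1)]
ear = eye_aspect_ratio(open_eye)
```

Dividing the blink count by the duration of the clip gives a blink rate that can feed into an overall engagement score alongside head pose and expression.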
Multimodal Disease Diagnosis using Tabular + Image Data
Objective: To enhance disease diagnosis accuracy by integrating visual and clinical data in a unified machine learning framework.
Key Features and Functionalities:
- Multimodal Fusion Model: Combined CNN-based image analysis with GBM on structured clinical data using late-fusion architecture.
- Condition-Specific Diagnosis: Focused on complex conditions like diabetic retinopathy and cardiovascular disease where image + tabular data are complementary.
- User Interface: Built an interface to upload data, visualize predictions, and generate clinical summaries.
Data and Methodology:
- Used 100K+ patient records including medical scans and clinical reports.
- Trained CNN (ResNet) for image data and GBM for tabular features; merged logits before classification.
- Conducted cross-modal evaluation and ablation studies.
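Late fusion by merging logits can be sketched in a few lines. This version assumes a simple weighted average of the two models' per-class logits followed by a softmax; the `image_weight` parameter and the example numbers are hypothetical, and the project's actual merge step (for instance, a learned layer over concatenated logits) may differ.

```python
import math


def softmax(logits):
    """Convert logits to probabilities, shifting by the max for stability."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_logits(image_logits, tabular_logits, image_weight=0.5):
    """Blend per-class logits from both models, then classify."""
    if len(image_logits) != len(tabular_logits):
        raise ValueError("both models must score the same classes")
    fused = [
        image_weight * a + (1.0 - image_weight) * b
        for a, b in zip(image_logits, tabular_logits)
    ]
    return softmax(fused)


# Hypothetical two-class case: the image model favours class 0,
# the tabular model favours class 1.
probs = fuse_logits([2.0, 0.0], [0.0, 1.0])
predicted = probs.index(max(probs))
```

Setting `image_weight` to 0 or 1 recovers a single-modality prediction, which is one simple way to run the kind of ablation mentioned above.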
Large-Scale LLM-Powered Document Retrieval System
Objective: To create a fast and accurate system that retrieves relevant content from millions of enterprise documents using LLMs and semantic search.
Key Features and Functionalities:
- RAG Architecture: Implemented a retrieval-augmented generation pipeline using FAISS and MiniLM for dense vector similarity search.
- Efficient Indexing: Chunked and embedded 1M+ documents with caching and FAISS indexing for low-latency retrieval.
- Frontend Integration: Deployed backend via FastAPI and connected it to a React-based UI for user interaction.
Data and Methodology:
- Preprocessed diverse enterprise text formats (PDFs, Word, HTML).
- Used SentenceTransformers (MiniLM) for embedding generation.
- Benchmarked against keyword-based retrieval using precision and recall.
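The retrieval and benchmarking steps can be illustrated without any external libraries. The sketch below does a brute-force cosine-similarity search over a tiny in-memory index and scores the result with precision and recall. The two-dimensional vectors, document ids, and relevance labels are made up; in the real system the vectors would come from MiniLM and the search would run through FAISS rather than a Python loop.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec, index, k):
    """Return the ids of the k documents most similar to the query."""
    scored = sorted(
        index.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [doc_id for doc_id, _ in scored[:k]]


def precision_recall(retrieved, relevant):
    """Score a result list against the set of truly relevant ids."""
    hits = len(set(retrieved) & set(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


index = {"a": [1.0, 0.0], "b": [0.9, 0.1], "c": [0.0, 1.0], "d": [-1.0, 0.0]}
results = top_k([1.0, 0.0], index, k=2)
precision, recall = precision_recall(results, relevant={"a", "c"})
```

Running the same `precision_recall` function over results from a keyword baseline gives a like-for-like comparison between the two retrieval methods.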
Smart Defect Detection in Manufacturing using Vision AI
Objective: To automate quality control in manufacturing by detecting surface-level defects on parts using real-time computer vision.
Key Features and Functionalities:
- Defect Classification: Trained a CNN to detect defects such as cracks, dents, and discoloration from camera feeds.
- Real-Time Inference: Deployed using FastAPI and Docker on edge devices for integration with IoT-based production lines.
- Actionable Output: Provided visual alerts and defect localization overlays to aid human operators.
Data and Methodology:
- Used 50K+ labeled images from industrial cameras.
- Performed data augmentation and handled class imbalance with weighted loss.
- Achieved 94% accuracy.
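A weighted loss counters class imbalance by making mistakes on rare classes cost more. The sketch below uses inverse-frequency class weights with a weighted cross-entropy; the weighting formula, the label names, and the example predictions are assumptions for illustration, since the project does not state how its weights were chosen.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Average negative log-likelihood, scaled per sample by class weight."""
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        w = weights[y]
        total += -w * math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        weight_sum += w
    return total / weight_sum


# Hypothetical batch: defects are rare compared with good parts.
labels = ["ok"] * 8 + ["crack"] * 2
weights = inverse_frequency_weights(labels)

# A model that leans towards "ok" for every image.
lazy_probs = [{"ok": 0.9, "crack": 0.1}] * 10
loss = weighted_cross_entropy(lazy_probs, labels, weights)
```

With these weights a missed crack contributes four times as much to the loss as a misjudged good part, so a model cannot score well by predicting "ok" everywhere.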