Projects
1. Loan Default Prediction – Data Analysis & Modeling
Analyzed historical borrower data to predict loan default risk. Performed exploratory data analysis to identify key patterns and potential risk drivers, and segmented customers using k-means clustering. Developed and evaluated predictive models using decision trees and neural networks, then selected the best-performing model.
2. Demo AI Image Authenticity Detector
Developed an interactive, AI-powered demo web application that lets users upload images and evaluate the likelihood that the content was AI-generated or digitally manipulated. Designed a user-friendly interface that displays image processing and analysis output. Focused on usability and the practical application of AI concepts to address real-world concerns around deepfakes and falsified media.
3. Business Performance – Data Analysis & Modeling
Analyzed multidimensional business performance data to evaluate sales, profitability, market share, and employee productivity across products and regions. Performed revenue and margin analysis, target vs. actual comparisons, and regional performance benchmarking. Built dynamic Excel models to support pricing optimization, profitability analysis, and scenario-based decision-making, turning quantitative results into business insights and recommendations.
4. Database Design and SQL Implementation
Designed and implemented a normalized relational database for a fictional makeup store to support customer, product, and order management. Developed a complete database schema in third normal form (3NF) and created an Entity-Relationship Diagram (ERD) to clearly define entities, attributes, keys, and relationships. Implemented and tested SQL queries covering full CRUD operations and advanced analytics, including customer behavior analysis, sales performance reporting, ranking, and cumulative metrics, to support decision-making.
5. Optimization of Retail Performance – Business Analytics & Visualization
Analyzed a large retail transaction dataset to uncover sales trends, customer behavior, and profitability drivers across products, regions, and time periods. Built interactive Tableau dashboards to visualize revenue performance, geographic distribution, customer segmentation, and product hierarchies. Conducted diagnostic and predictive analytics, including sales forecasting and what-if scenario analysis, to evaluate the potential impact of improved customer ratings. Translated analytical findings into actionable business insights to support inventory planning, marketing strategy, and revenue growth decisions.
6. Vehicle Pricing Analysis Across Regions & Engine Types (R)
Performed a statistical analysis of used vehicle pricing data to evaluate how prices vary across geographic regions and engine types. Conducted data preprocessing, random sampling, and factor conversion before applying inferential statistical techniques. Used variance testing, one-way and two-way ANOVA, and post hoc Tukey comparisons to identify statistically significant price differences while controlling for multiple categorical factors. Interpreted results in a market and business context.
7. Distributed Data Pipeline & Scalable Data Architecture (ETL, Sharding & Cassandra)
Designed and implemented a scalable data pipeline integrating multiple operational databases into a centralized data warehouse. Built and deployed an automated ETL process using Docker to extract, transform, and load data from distributed PostgreSQL systems, including sharded sales databases. Enhanced the pipeline to merge data across shards and ensure consistency across sources. Extended the architecture with Cassandra to support high-performance, query-specific data access patterns using denormalized models. Evaluated distributed SQL solutions such as Citus to improve scalability, real-time analytics, and overall system performance.