Hello, my name is
I build at the intersection of AI and software engineering, creating solutions that scale and think. I'm a Master's student in Applied Data Science at San Jose State University, and I bring 2 years of experience as a Software Engineer working on ETL pipelines, distributed systems, and cloud-native applications. My projects focus on Generative AI, Agentic AI, LLMs, deep learning, scalable ML systems, and Big Data workflows, where I've combined AI and data engineering to solve real-world challenges.
I'm actively seeking full-time opportunities in Software Engineering (AI/ML), AI Engineering, and Data Engineering, where I can keep building systems that bring AI and software together.
When I'm not building AI agents, I'm acting like one by automating job applications at scale!
Featured Project · In Progress
An AI-powered mock interview platform for Data Science that generates dynamic, domain-specific questions, evaluates responses, and delivers instant feedback across SQL, Python, Statistics, Machine Learning, and more. Built end-to-end with cloud-native data pipelines, NLP, and Generative AI, the platform applies scalable system design, applied machine learning, and AI engineering to help learners prepare for real-world technical interviews.
A GenAI pipeline that translates natural language into SQL queries using fine-tuned LLaMA-3 and Mistral on the Spider dataset. Integrated LangChain agents (generation, clarification with Gemini, optimization with OpenAI GPT-3.5), and deployed with a Flask backend and HTML/CSS/JS frontend. Model hosted on Hugging Face with Docker & CI/CD support.
Developed an end-to-end AI system combining multi-modal RAG, generative model fine-tuning, and an agentic AI travel assistant. The RAG system retrieved text and images from PDFs and used Gemini for grounded Q&A; Stable Diffusion was fine-tuned with LoRA for custom image generation; and a multi-agent assistant (Flight, Weather, Hotel, Itinerary) orchestrated APIs to build dynamic travel plans.
Designed an end-to-end machine learning pipeline for analyzing and predicting substance use patterns. Conducted extensive data preprocessing and exploratory data analysis (EDA), then applied PCA for dimensionality reduction, KMeans clustering for unsupervised insights, and multiple classifiers (Random Forest, Logistic Regression, Decision Trees, KNN) for supervised prediction. Addressed class imbalance with ADASYN to improve minority-class recall.
Analyzed 48 GB of Binance cryptocurrency trading data (2020–2024) using AWS EC2, Databricks, Apache Spark, Pandas, and PyArrow. Built scalable pipelines for data cleaning, preprocessing, and transformation, followed by analysis of volatility, trading patterns, and liquidity. Developed interactive Tableau dashboards for visualization, demonstrating distributed systems, cloud deployment, and big data engineering for financial analytics.
Built a full-stack distributed healthcare application for symptom-based disease prediction. Integrated a React frontend with a Flask backend, multiple ML models, and secure cloud deployment on GCP. The system supports authentication, interactive consultations, and automated health report generation.
Implemented a hybrid Vision Transformer (ViT) + U-Net model for automated skin cancer analysis. The system performs both segmentation (pixel-level lesion masks) and classification (benign vs. malignant), trained on ISIC dermoscopic datasets. Combined BCE + Dice loss for segmentation and weighted cross-entropy for classification to handle class imbalance, achieving strong performance on ISIC 2016 & 2017 benchmarks.
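As a rough illustration of the combined BCE + Dice segmentation loss mentioned above, here is a minimal pure-Python sketch over flattened pixel probabilities. The function names, the equal weighting between the two terms, and the smoothing constant are assumptions made for illustration, not the project's actual training code.

```python
import math


def bce_loss(probs, targets, eps=1e-7):
    """Mean binary cross-entropy over flattened pixel probabilities."""
    total = 0.0
    for p, t in zip(probs, targets):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(t * math.log(p) + (1 - t) * math.log(1 - p))
    return total / len(probs)


def dice_loss(probs, targets, smooth=1.0):
    """1 - soft Dice coefficient; `smooth` (assumed value) avoids division by zero."""
    intersection = sum(p * t for p, t in zip(probs, targets))
    denom = sum(probs) + sum(targets)
    return 1.0 - (2.0 * intersection + smooth) / (denom + smooth)


def bce_dice_loss(probs, targets, bce_weight=0.5):
    """Weighted sum of BCE and Dice; the 0.5 weight is an assumption."""
    return (bce_weight * bce_loss(probs, targets)
            + (1.0 - bce_weight) * dice_loss(probs, targets))


if __name__ == "__main__":
    predicted = [0.9, 0.2, 0.8, 0.1]  # hypothetical lesion probabilities
    mask = [1, 0, 1, 0]               # hypothetical ground-truth mask
    print(bce_dice_loss(predicted, mask))
```

BCE scores each pixel independently, while Dice measures overlap of the whole mask, so pairing them is a common way to keep small lesions from being swamped by background pixels.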
If you would like to work together or discuss an opportunity, feel free to reach out at yashaswini.madineni@sjsu.edu or connect with me on LinkedIn.