RAG-Powered Customer Insight Generation for E-Commerce Using LLMs, Vector Search, and an End-to-End MLOps Pipeline

Monday, May 18, 2026 at 2:30pm to 3:30pm

Faculty Supervisor: Dr. Amir Akhavan Masoumi, Computer & Information Science/Data Science

Committee Members:

Dr. AshokKumar Patel, Computer & Information Science/Data Science

Dr. Debarun Das, Computer & Information Science/Data Science

Location/Link: Online via Zoom
https://us04web.zoom.us/j/71177187003?pwd=bfh7typ8TW4oqb7tPqGZ7GMqY6Zpa7.1

Meeting ID: 71177187003
Passcode: tt8zda

Abstract:
The rapid growth of online retail has created enormous volumes of unstructured product data that most businesses struggle to turn into actionable intelligence. This study presents an intelligent analytics platform that combines Retrieval-Augmented Generation (RAG) with Claude Opus 4.6 to generate structured business insights from a corpus of 200,000 Amazon Electronics product records. A multi-layered pipeline transforms raw product metadata into semantically rich text chunks, encodes them using BGE-M3 sentence embeddings, and stores the resulting 200,000 vectors in a ChromaDB persistent vector store. At query time, the platform retrieves the most contextually relevant product records, reranks them by semantic similarity, and feeds them to Claude Opus 4.6, which synthesizes the retrieved evidence into coherent, data-grounded analytical narratives complete with business recommendations. The platform is built with production deployment in mind: MLflow tracks every experiment for full reproducibility, Docker containerizes the entire application stack, and GitHub Actions automates the continuous integration and delivery pipeline. An interactive Streamlit dashboard brings all capabilities together in a user-friendly interface that requires no technical expertise. Evaluation across eight quantitative metrics supports the quality of the system's outputs: the platform achieves a ROUGE-1 score of 0.4121, a ROUGE-L score of 0.4121, and a BERTScore F1 of 0.9131, indicating strong lexical precision and exceptional semantic alignment with human-authored reference insights. A faithfulness score of 0.5567 indicates that generated content is grounded in retrieved evidence. All sixteen automated unit tests pass, confirming that every system component behaves as specified by its tests.
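
The query-time flow described in the abstract (embed the query, retrieve the closest records, rank them by similarity, and pass them to the language model as context) can be illustrated with a minimal sketch. This is not the candidate's implementation. The `embed` function is a toy bag-of-words stand-in for BGE-M3, an in-memory list stands in for the ChromaDB store, retrieval and ranking are collapsed into one step, and the helper names and sample records are hypothetical. The resulting prompt is the text that would be handed to the language model.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy bag-of-words vector; an assumed stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, store, k=2):
    """Score every stored record against the query and keep the top k."""
    q = embed(query)
    scored = [(cosine(q, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def build_prompt(query, hits):
    """Assemble retrieved records into the context section of a prompt."""
    context = "\n".join(f"- {text}" for _, text in hits)
    return f"Context:\n{context}\n\nQuestion: {query}"


# Hypothetical product records, invented for illustration only.
records = [
    "Wireless noise-cancelling headphones with 30-hour battery life",
    "USB-C charging cable, 2 m, braided nylon",
    "Bluetooth earbuds with noise cancelling and wireless charging case",
]
store = [(text, embed(text)) for text in records]

query = "wireless noise cancelling headphones"
hits = retrieve(query, store, k=2)
prompt = build_prompt(query, hits)
print(prompt)
```

In a real pipeline the toy pieces would be replaced by an embedding model and a vector store query; the overall shape (embed, retrieve, rank, assemble context) stays the same.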

For further information, please contact Dr. Amir Akhavan Masoumi at aakhavanmasoumi@umassd.edu.
