Data Science MS Thesis Defense by Adithi Madduluri
Advisor: Dr. Donghui Yan
Committee Members:
Dr. Yuchou Chang, Computer and Information Science Department, University of Massachusetts Dartmouth
Dr. Long Jiao, Computer and Information Science Department, University of Massachusetts Dartmouth
Abstract:
Cryptocurrency markets are non-stationary, making price forecasts increasingly unreliable over time. This study examines whether the choice of target variable affects forecast stability more than the choice of model architecture. Five models are evaluated across two target formulations: raw Bitcoin price and 1-hour percentage change. The models tested are Naive Forecast, ARIMA, LSTM, Bidirectional LSTM (BiLSTM), and GRU, each trained and assessed over a 27-day test window using live data collected at 5-minute intervals across a rolling six-week period. Drift detection using the Wasserstein distance confirmed that raw price exhibits significantly greater distributional shift than percentage change over the same timeframe. Models trained on raw price produced directional accuracy below 50% across all learned architectures, with visible degradation over the test window. The Naive Forecast outperformed all learned models on both RMSE ($110.34) and MAE ($66.08). Models trained on percentage change maintained substantially higher accuracy: LSTM achieved 74.7% directional accuracy, while BiLSTM and GRU both reached 72.8%, with no comparable decay observed. The results indicate that model decay in Bitcoin forecasting is driven primarily by data drift in the target variable rather than by limitations of the model architecture. When the target is stationary, all models tested retain their accuracy across the full evaluation window.
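The quantities named in the abstract (the percentage-change target, Wasserstein drift, directional accuracy, RMSE, and MAE) can be illustrated with a minimal sketch. This is not the thesis code: every function name below is hypothetical, the 12-step horizon assumes that a 1-hour change at 5-minute sampling spans 12 observations, the drift score is standardized by the first window so that the two targets are comparable (an assumption), and the price series is synthetic.

```python
from math import sqrt
from statistics import mean, pstdev

STEPS_PER_HOUR = 12  # assumption: 60 minutes / 5-minute sampling interval


def pct_change(prices, horizon=STEPS_PER_HOUR):
    """Percentage change of each price relative to the price `horizon` steps earlier."""
    return [100.0 * (prices[i] - prices[i - horizon]) / prices[i - horizon]
            for i in range(horizon, len(prices))]


def wasserstein_1d(a, b):
    """1-Wasserstein distance between two equal-size 1-D samples.

    For equal-size empirical samples this equals the mean absolute
    difference between the sorted values.
    """
    if len(a) != len(b) or not a:
        raise ValueError("samples must be non-empty and of equal size")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def scaled_drift(series):
    """Drift between the first and last halves, in units of first-half std."""
    half = len(series) // 2
    early, late = series[:half], series[-half:]
    mu, sigma = mean(early), pstdev(early)
    if sigma == 0:
        raise ValueError("first window has zero variance")
    scale = lambda xs: [(x - mu) / sigma for x in xs]
    return wasserstein_1d(scale(early), scale(late))


def directional_accuracy(actual_changes, predicted_changes):
    """Fraction of steps where predicted and actual changes share the same sign."""
    pairs = list(zip(actual_changes, predicted_changes))
    return sum((a > 0) == (p > 0) for a, p in pairs) / len(pairs)


def rmse(actual, predicted):
    """Root mean squared error."""
    return sqrt(mean((a - p) ** 2 for a, p in zip(actual, predicted)))


def mae(actual, predicted):
    """Mean absolute error."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


# Synthetic trending series (illustrative only, not Bitcoin data).
prices = [100.0 + 0.5 * i + (1.0 if i % 2 == 0 else -1.0) for i in range(200)]
changes = pct_change(prices, horizon=1)

print("scaled drift, raw price:        ", scaled_drift(prices))
print("scaled drift, percentage change:", scaled_drift(changes))
```

On this toy series the raw price level drifts far more between the two windows than its percentage change does, which is the qualitative pattern the abstract reports. The numbers themselves carry no meaning for the thesis data.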
Virtual