A Noise-Based Defense for Stealthy Backdoor Attacks in Large Vision-Language Models
Faculty Supervisor: Dr. Long Jiao, Computer & Information Science
Committee Members:
Dr. Joshua Carberry, Computer & Information Science
Dr. Lance Fiondella, Electrical & Computer Engineering
Abstract: Large vision-language models rely on pretrained vision encoders to translate images into feature representations consumed by downstream language models. This creates a security risk when the encoder is compromised by a stealthy backdoor attack such as BadVision, in which a subtle trigger causes an image to be mapped toward an attacker-chosen target representation while clean inputs remain largely unaffected. Because the model behaves normally under standard evaluation, such attacks are difficult to detect. This thesis investigates controlled noise injection as a lightweight input-side defense against BadVision-style backdoors. The proposed approach adds small perturbations to input images before they enter the vision encoder, with the goal of disrupting the trigger while preserving the semantic content of clean images. Several perturbation types are evaluated, including Gaussian noise, random noise, salt-and-pepper noise, low-frequency noise, geometric transformations, occlusion, scaling, rotation, and channel-based distributions. Experimental results show that geometric and channel-based transformations have limited effect on the backdoor, whereas pixel-level statistical perturbations significantly reduce similarity to the attacker's target representation, increase feature-space distance from it, and lower the attack success rate. These findings suggest that stealthy encoder-level triggers depend on fragile statistical patterns and can be weakened through controlled noise injection without retraining the full multimodal model.
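As a rough illustration of the mechanics described above, the Python sketch below applies two of the pixel-level perturbations before encoding and compares cosine similarity to a target embedding with and without the defense. The encode function, weight matrix W, input image, and target embedding are hypothetical stand-ins (a random linear projection and random tensors), not the BadVision attack or any specific encoder's API; the sketch only shows how the perturbations and the similarity metric fit together, and does not reproduce the thesis experiments.

    import numpy as np

    rng = np.random.default_rng(0)

    def add_gaussian_noise(image, sigma=0.05):
        """Pixel-level Gaussian perturbation; image is float in [0, 1]."""
        return np.clip(image + rng.normal(0.0, sigma, size=image.shape), 0.0, 1.0)

    def add_salt_and_pepper(image, amount=0.02):
        """Set a random fraction of pixels to 0 (pepper) or 1 (salt)."""
        noisy = image.copy()
        mask = rng.random(image.shape)
        noisy[mask < amount / 2] = 0.0
        noisy[mask > 1 - amount / 2] = 1.0
        return noisy

    def cosine_similarity(a, b):
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

    # Hypothetical stand-ins so the sketch runs end to end: `encode` is a
    # fixed random linear projection playing the role of the vision encoder,
    # and `target` plays the role of the attacker-chosen target
    # representation. Neither models actual backdoor behavior.
    W = rng.normal(size=(64, 32 * 32 * 3))

    def encode(image):
        return W @ image.ravel()

    image = rng.random((32, 32, 3))           # stand-in for a (triggered) input
    target = encode(rng.random((32, 32, 3)))  # stand-in for the target embedding

    for name, defend in [("gaussian", add_gaussian_noise),
                         ("salt-and-pepper", add_salt_and_pepper)]:
        before = cosine_similarity(encode(image), target)
        after = cosine_similarity(encode(defend(image)), target)
        print(f"{name}: target similarity {before:.3f} -> {after:.3f}")

In the setting studied in the thesis, encode would be the compromised vision encoder, and attack success would additionally be judged from the downstream language model's outputs rather than from feature similarity alone.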
For further information, please contact Dr. Long Jiao (Dion 311) at ljiao@umassd.edu.