arXiv:2412.16257v1 Announce Type: new
Abstract: Current text-to-image (T2I) diffusion models can produce high-quality images, and malicious users who are authorized to use the model only for benign purposes might modify their models to generate images that result in harmful social impacts. Therefore, it is essential to verify the integrity of T2I diffusion models, especially when they are deployed as black-box services. To this end, considering the randomness within the outputs of generative models and the high costs in interacting with them, we capture modifications to the model through the differences in the distributions of the features of generated images. We propose a novel prompt selection algorithm based on learning automaton for efficient and accurate integrity verification of T2I diffusion models. Extensive experiments demonstrate the effectiveness, stability, accuracy and generalization of our algorithm against existing integrity violations compared with baselines. To the best of our knowledge, this paper is the first work addressing the integrity verification of T2I diffusion models, which paves the way to copyright discussions and protections for artificial intelligence applications in practice.
Source link
lol