Publicly-Detectable Watermarking for Language Models, by Jaiden Fairoze and 4 other authors
Abstract: We present a publicly-detectable watermarking scheme for LMs: the detection algorithm contains no secret information, and it is executable by anyone. We embed a publicly-verifiable cryptographic signature into LM output using rejection sampling and prove that this produces unforgeable and distortion-free (i.e., undetectable without access to the public key) text output. We make use of error-correction to overcome periods of low entropy, a barrier for all prior watermarking schemes. We implement our scheme and find that our formal claims are met in practice.
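To give a rough feel for the rejection-sampling idea mentioned in the abstract, here is a minimal toy sketch. It is not the authors' construction: it embeds an arbitrary bit string rather than a real cryptographic signature, it has no error correction, and the "language model" is a uniform sampler over a tiny vocabulary. The names `VOCAB`, `BLOCK`, `block_bit`, `sample_block`, `embed`, and `extract` are all hypothetical. The sketch only illustrates how resampling a block of tokens until a public hash matches a target bit lets anyone recover the bits later without a secret.

```python
import hashlib
import random

# Toy stand-ins (assumptions, not from the paper).
VOCAB = ["the", "a", "model", "text", "signs", "output", "key", "public"]
BLOCK = 4  # number of tokens used to carry one payload bit


def block_bit(tokens):
    """Map a block of tokens to one bit using a public hash."""
    digest = hashlib.sha256(" ".join(tokens).encode()).digest()
    return digest[0] & 1


def sample_block(rng):
    """Stand-in for drawing BLOCK tokens from a language model."""
    return [rng.choice(VOCAB) for _ in range(BLOCK)]


def embed(bits, rng, max_tries=1000):
    """Rejection sampling: resample each block until its hash bit matches."""
    out = []
    for bit in bits:
        for _ in range(max_tries):
            block = sample_block(rng)
            if block_bit(block) == bit:
                out.extend(block)
                break
        else:
            # With a real model, low-entropy stretches can make this fail;
            # the abstract says the paper uses error correction for that case.
            raise RuntimeError("no block hashed to the target bit")
    return out


def extract(tokens):
    """Public extraction: recompute the hash bit of every block."""
    return [block_bit(tokens[i:i + BLOCK]) for i in range(0, len(tokens), BLOCK)]


payload = [1, 0, 1, 1, 0, 0, 1, 0]  # placeholder for signature bits
text = embed(payload, random.Random(0))
recovered = extract(text)
```

In this toy version, `extract` needs only the public hash function. In a scheme like the one the abstract describes, the recovered bits would then be checked as a signature against a public key, which is the part that would make the watermark unforgeable.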
Submission history
From: Jaiden Fairoze
[v1]
Fri, 27 Oct 2023 21:08:51 UTC (1,038 KB)
[v2]
Mon, 27 May 2024 09:24:16 UTC (420 KB)
[v3]
Tue, 28 May 2024 06:10:45 UTC (420 KB)
[v4]
Sat, 4 Jan 2025 13:52:49 UTC (354 KB)