More OpenAI researchers slam company on safety, call for ‘right to warn’ to avert ‘human extinction’


A group of 11 current and former OpenAI researchers, joined by a current Google DeepMind employee who previously worked at Anthropic and another former DeepMind researcher, has signed a new open letter calling on OpenAI and similar companies to commit to four principles protecting whistleblowers and critics who raise AI safety concerns.

“We also understand the serious risks posed by these technologies,” states the letter, titled “Right to Warn.” “These risks range from the further entrenchment of existing inequalities, to manipulation and misinformation, to the loss of control of autonomous AI systems potentially resulting in human extinction.”

What is a ‘Right to Warn’ for AI systems?

Among the concerns expressed in the letter are the lack of proper oversight, the influence of profit motives, and the suppression of dissenting voices within organizations working on cutting-edge AI technologies.

To rectify these problems, the signatories want AI companies to voluntarily commit to the following four principles:


  1. Refraining from enforcing agreements that prohibit disparaging comments or retaliation for risk-related criticism
  2. Establishing a verifiable anonymous process for raising risk-related concerns to the company’s board, regulators, and independent organizations
  3. Encouraging a culture of open criticism and allowing employees to share risk-related concerns publicly, provided trade secrets are protected
  4. Not retaliating against employees sharing risk-related confidential information following failure of other reporting methods

The letter, which was first publicized in an article published today in The New York Times, is signed by former OpenAI employees Jacob Hilton, Daniel Kokotajlo, William Saunders, Carroll Wainwright, and Daniel Ziegler, former Google DeepMind researcher Ramana Kumar, and current DeepMind and former Anthropic employee Neel Nanda, as well as four current and two former OpenAI employees who signed anonymously. It is endorsed by notable AI experts Yoshua Bengio, Geoffrey Hinton, and Stuart Russell.

The full text of the letter is reproduced at the bottom of this article.

Kokotajlo sounds off

In a series of posts on X (formerly Twitter) following the NYT article, Kokotajlo elaborated on his reasons for resigning from OpenAI, saying he had lost confidence in the company’s ability to act responsibly in its pursuit of artificial general intelligence.

He revealed that he chose to give up his vested equity in order to speak critically about the company, highlighting the need for transparency and ethical conduct in the development of advanced AI systems.

According to Kokotajlo, he joined OpenAI hoping that the company would increase investment in safety research as its systems became more capable.

However, he states that OpenAI failed to make this pivot, prompting several researchers, including himself, to leave the company.

Kokotajlo alleges that upon his departure, he was presented with paperwork containing a non-disparagement agreement (NDA) intended to prevent him from speaking negatively about OpenAI, which he deemed unethical.

These claims follow revelations of similar practices at OpenAI earlier this month, when leaked documents obtained by Vox showed the company’s use of strong-arm tactics toward former employees.

Yet OpenAI has said it won’t enforce these NDAs, versions of which are also used by other tech companies in AI and beyond. And Vox itself recently elected to partner with OpenAI following its own reporting on the company.

Ongoing period of turbulence for OpenAI

This wave of criticism follows an ongoing period of turbulence for OpenAI that began in November 2023, when the non-profit board that then oversaw the company abruptly fired co-founder and CEO Sam Altman, citing communications with the board that were allegedly “not consistently candid.”

Altman was rapidly reinstated as CEO at the behest of investors including Microsoft, and the former board resigned and was replaced. But one former member, Helen Toner, reiterated her concerns in an interview on the TED AI Show last week, saying the board was not informed prior to the public release of ChatGPT in November 2022.

And following OpenAI’s mid-May release of GPT-4o, its new natively multimodal AI model, actor Scarlett Johansson sharply criticized the company and Altman. She said they had solicited her to voice the model’s new conversational interface and, after she declined, showcased a demo voice she thought sounded like her AI operating system character from the 2013 sci-fi drama film Her.

Yet a subsequent report in the Washington Post bolstered OpenAI’s claim that it recorded the voice with a separate voice actor and without any intention of imitating Johansson’s character.

Additional independent research has shown the OpenAI voice, “Sky,” more closely resembles other Hollywood actors such as Keri Russell, though it remains distinct from her as well. OpenAI has since removed the voice, in an apparent effort to avoid confusion and appease Johansson.

Additionally, the departures of high-profile figures involved in AI safety efforts, namely former superalignment team co-leaders Ilya Sutskever and Jan Leike, have further fueled concerns regarding OpenAI’s safety policies and practices.

The company has attempted to address these concerns on its own terms with the formation of a new Safety and Security Committee, which includes many of its current board members and was announced last week alongside the news that OpenAI has begun training its latest frontier model.

Full “Right to Warn” letter text:

A Right to Warn about Advanced Artificial Intelligence

We are current and former employees at frontier AI companies, and we believe in the potential of AI technology to deliver unprecedented benefits to humanity.

We also understand the serious risks posed by these technologies. These risks range from the further entrenchment of existing inequalities, to manipulation and misinformation, to the loss of control of autonomous AI systems potentially resulting in human extinction. AI companies themselves have acknowledged these risks [1, 2, 3], as have governments across the world [4, 5, 6] and other AI experts [7, 8, 9].

We are hopeful that these risks can be adequately mitigated with sufficient guidance from the scientific community, policymakers, and the public. However, AI companies have strong financial incentives to avoid effective oversight, and we do not believe bespoke structures of corporate governance are sufficient to change this.

AI companies possess substantial non-public information about the capabilities and limitations of their systems, the adequacy of their protective measures, and the risk levels of different kinds of harm. However, they currently have only weak obligations to share some of this information with governments, and none with civil society. We do not think they can all be relied upon to share it voluntarily.

So long as there is no effective government oversight of these corporations, current and former employees are among the few people who can hold them accountable to the public. Yet broad confidentiality agreements block us from voicing our concerns, except to the very companies that may be failing to address these issues. Ordinary whistleblower protections are insufficient because they focus on illegal activity, whereas many of the risks we are concerned about are not yet regulated. Some of us reasonably fear various forms of retaliation, given the history of such cases across the industry. We are not the first to encounter or speak about these issues.

We therefore call upon advanced AI companies to commit to these principles:

  1. That the company will not enter into or enforce any agreement that prohibits “disparagement” or criticism of the company for risk-related concerns, nor retaliate for risk-related criticism by hindering any vested economic benefit;
  2. That the company will facilitate a verifiably anonymous process for current and former employees to raise risk-related concerns to the company’s board, to regulators, and to an appropriate independent organization with relevant expertise;
  3. That the company will support a culture of open criticism and allow its current and former employees to raise risk-related concerns about its technologies to the public, to the company’s board, to regulators, or to an appropriate independent organization with relevant expertise, so long as trade secrets and other intellectual property interests are appropriately protected;
  4. That the company will not retaliate against current and former employees who publicly share risk-related confidential information after other processes have failed. We accept that any effort to report risk-related concerns should avoid releasing confidential information unnecessarily. Therefore, once an adequate process for anonymously raising concerns to the company’s board, to regulators, and to an appropriate independent organization with relevant expertise exists, we accept that concerns should be raised through such a process initially. However, as long as such a process does not exist, current and former employees should retain their freedom to report their concerns to the public.
Signed by (alphabetical order):
  • Jacob Hilton, formerly OpenAI
  • Daniel Kokotajlo, formerly OpenAI
  • Ramana Kumar, formerly Google DeepMind
  • Neel Nanda, currently Google DeepMind, formerly Anthropic
  • William Saunders, formerly OpenAI
  • Carroll Wainwright, formerly OpenAI
  • Daniel Ziegler, formerly OpenAI
  • Anonymous, currently OpenAI
  • Anonymous, currently OpenAI
  • Anonymous, currently OpenAI
  • Anonymous, currently OpenAI
  • Anonymous, formerly OpenAI
  • Anonymous, formerly OpenAI
Endorsed by (alphabetical order):
  • Yoshua Bengio
  • Geoffrey Hinton
  • Stuart Russell

June 4th, 2024


