
Anthropic enhances Claude 4 with safety precautions to minimize the risk of users creating harmful weapons

AI controls were tightened on Thursday for Claude Opus 4, the latest AI model from Anthropic.

On Thursday, Anthropic opted to enforce stricter controls on its latest AI model, Claude Opus 4. The new controls, designated AI Safety Level 3 (ASL-3), aim to restrict the model's potential misuse in the development or acquisition of chemical, biological, radiological, and nuclear (CBRN) weapons.

According to a blog post published by the company, the enhanced controls were put in place as a precaution: the team has not yet determined whether Opus 4 has crossed the capability threshold that would make such protection necessary. Anthropic noted that the new controls will not apply to Claude Sonnet 4.

Anthropic unveiled both Claude Opus 4 and Claude Sonnet 4 on Thursday, highlighting the models' ability to analyze vast amounts of data, complete long-running tasks, generate human-quality content, and carry out complex actions.

The company is backed by Amazon.

In addition to the ASL-3 controls, Anthropic employs several measures to keep its AI models secure and responsibly used. These include Constitutional Classifiers that block harmful or inappropriate requests, specialized reinforcement-learning training to resist prompt-injection attacks, and ethical guidelines that align model behavior with safety standards. The company also maintains protections against model weights being stolen or exploited. At a high level, safeguards of this kind wrap the model in filters on both sides of a request, as sketched below.
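The following minimal sketch illustrates the general classifier-gating pattern only; it is not Anthropic's implementation, and every name in it (classify_input, classify_output, generate, guarded_generate) is a hypothetical placeholder.

```python
# Illustrative sketch of a classifier-gated request pipeline, loosely modeled
# on the input/output filtering idea behind classifier-based safeguards.
# All functions here are hypothetical stand-ins, not Anthropic's actual API.

def classify_input(prompt: str) -> bool:
    """Return True if the prompt looks like a disallowed request (stub)."""
    blocked_topics = ("synthesize nerve agent", "enrich uranium")
    return any(topic in prompt.lower() for topic in blocked_topics)

def classify_output(text: str) -> bool:
    """Return True if the draft response contains disallowed content (stub)."""
    return "step-by-step synthesis" in text.lower()

def generate(prompt: str) -> str:
    """Stand-in for the underlying language model."""
    return f"Model response to: {prompt}"

def guarded_generate(prompt: str) -> str:
    # Screen the request before it ever reaches the model.
    if classify_input(prompt):
        return "Request declined by input classifier."
    draft = generate(prompt)
    # Screen the model's draft before it reaches the user.
    if classify_output(draft):
        return "Response withheld by output classifier."
    return draft

if __name__ == "__main__":
    print(guarded_generate("Explain how photosynthesis works."))
```

Filtering on both sides means a request can be refused either before it reaches the model or after the model drafts a response, whichever classifier fires first.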

These measures are designed to mitigate the risks associated with powerful AI models, preventing them from being misused or manipulated.

In other news:

  • Microsoft employees reportedly encountered issues sending emails with certain keywords due to internal filters.
  • OpenAI's CFO expressed optimism about the impact of AI hardware on ChatGPT subscriptions in the future.
  • The founders of Amazon's PillPack launched a new health-care marketplace startup, General Medicine.
  • Hinge Health's shares surged by 17% in its NYSE debut.


