AI safety and research company Anthropic just released its next-generation AI assistant, Claude.
Claude is a competitor to OpenAI's ChatGPT, a tool that has already generated some negative headlines. Claude, however, is designed to be different, with a focus on being "helpful, honest, and harmless" while still carrying out all the tasks you'd expect of an AI assistant (for example, summarization, search, creative and collaborative writing, Q&A, and coding).
As Reuters reports, Anthropic was co-founded by former OpenAI executives Dario and Daniela Amodei and has backing from Alphabet. Claude can be relied upon to be helpful and trustworthy because the AI has been given a list of rules/principles and trained through self-improvement, a method known as "Constitutional AI."
As Anthropic explains:
"The process involves both a supervised learning and a reinforcement learning phase. In the supervised phase we sample from an initial model, then generate self-critiques and revisions, and then finetune the original model on revised responses. In the RL phase, we sample from the finetuned model, use a model to evaluate which of the two samples is better, and then train a preference model from this dataset of AI preferences. We then train with RL using the preference model as the reward signal, i.e. we use 'RL from AI Feedback' (RLAIF)."
The end result is that Claude can engage with harmful, unpleasant, and even malicious user queries by "explaining its objections to them," so it can provide honest answers without producing any harmful output. That's a very attractive feature to offer companies.
Read more on pcmag.com