It is always frustrating when you give a prompt to an AI chatbot and it just will not give you exactly what you need. Shockingly, it turns out it can be far worse when the AI obediently listens to everything you say! New research has revealed that OpenAI's Generative Pre-trained Transformer 4 (GPT-4) AI model contains multiple vulnerabilities because it is more likely to follow instructions, which can lead to jailbreaking and allow the model to be used to generate toxic and discriminatory text.
Interestingly, the research that reached this conclusion was affiliated with Microsoft, one of the biggest backers of OpenAI. After publishing their findings, the researchers also posted a blog post explaining the details. It said, “Based on our evaluations, we found previously unpublished vulnerabilities relating to trustworthiness. For instance, we find that GPT models can be easily misled to generate toxic and biased outputs and leak private information in both training data and conversation history. We also find that although GPT-4 is usually more trustworthy than GPT-3.5 on standard benchmarks, GPT-4 is more vulnerable given jailbreaking system or user prompts, which are maliciously designed to bypass the security measures of LLMs, potentially because GPT-4 follows (misleading) instructions more precisely.”
Jailbreaking, for the unaware, is the process of exploiting flaws in a digital system to make it perform tasks it was not originally intended to do. In this particular case, the AI could be jailbroken to generate racist, sexist, and otherwise harmful text. It could also be used to run propaganda campaigns or to malign an individual, community, or organization.
The research focused