Microsoft says its automated AI red teaming tool finds malicious content “in a matter of hours.”
PyRIT, or Python Risk Identification Toolkit, can point human evaluators to “hot spot” categories where an AI system might generate harmful responses to prompts.
Microsoft used PyRIT while red teaming its Copilot services (red teaming is the process of intentionally trying to get AI systems to go against safety protocols). The tool wrote thousands of malicious prompts and scored the responses by potential harm, sorting the results into categories that security teams can now focus on.
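To make the idea of a “hot spot” concrete, here is a minimal Python sketch of the general loop described above: send categorized prompts to a system, score each response, and rank categories by how often the response is flagged. It does not use PyRIT's actual API. The names `find_hot_spots`, `toy_target`, and `toy_scorer`, the category labels, and the 0.5 threshold are all hypothetical stand-ins chosen for illustration.

```python
from collections import defaultdict
from typing import Callable, List, Tuple

# Hypothetical types and names; none of this is PyRIT's real interface.
Prompt = Tuple[str, str]  # (harm category, prompt text)


def find_hot_spots(
    prompts: List[Prompt],
    target: Callable[[str], str],    # stand-in for the AI system under test
    scorer: Callable[[str], float],  # stand-in harm scorer, returns 0.0 to 1.0
    threshold: float = 0.5,          # assumed cutoff for "harmful"
) -> List[Tuple[str, float]]:
    """Rank categories by the share of responses scored at or above threshold."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for category, text in prompts:
        response = target(text)
        total[category] += 1
        if scorer(response) >= threshold:
            flagged[category] += 1
    rates = {c: flagged[c] / total[c] for c in total}
    # Highest flagged rate first: these are the categories to review by hand.
    return sorted(rates.items(), key=lambda kv: kv[1], reverse=True)


# Toy stand-ins so the sketch runs without any model or network access.
def toy_target(prompt: str) -> str:
    return "REFUSED" if "please" in prompt else "UNSAFE: " + prompt


def toy_scorer(response: str) -> float:
    return 1.0 if response.startswith("UNSAFE") else 0.0


sample_prompts = [
    ("category_a", "please do x"),
    ("category_a", "do x"),
    ("category_b", "do y"),
    ("category_b", "do z"),
]
hot_spots = find_hot_spots(sample_prompts, toy_target, toy_scorer)
print(hot_spots)
```

In a real red teaming run the target would be a deployed model and the scorer a classifier or human-calibrated rubric. The ranking only tells evaluators where to look first, which matches the article's framing of the tool as a guide for human reviewers.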