ai-safety

WitnessAI

8 cybersecurity risks of agentic AI

WitnessAI

12h ago

Agentic AI systems call APIs, query databases, execute code, and modify production systems without waiting for human approval. That autonomy makes them useful and raises the stakes for security teams. Organizations deploying AI agents report behaviors such as improper data exposure and access to unauthorized resources. This article identifies eight cybersecurity risks specific to agentic ...

ai, ai-safety, cybersecurity

DEV Community

Securing LLM Agent Teams: Inside NRT-Defense v0.4.0

Fenix

1d ago

Multi-turn autonomous LLM agents are expanding rapidly in safety-critical systems. However, Lee et al. (2026) exposed a major vulnerability in the NRT-Bench paper: adaptive multi-turn attacks can exploit disjoint model vulnerabilities, causing an 8.7% to 12.1% loss of Critical Safety Functions (CSFs). To solve this, I am open-sourci…

ai, ai-safety, machine-learning

DEV Community

THE CLOUD AND AI SECURITY NEWSLETTER #3 - The Cloud Security Tool Your Resume is Missing (Part 2)

Mukhtar Kabir·CISSP

1d ago

Hi there and welcome back! Last week I talked about CIEM and why tools like IAM Access Analyzer matter for understanding who has access to what in your cloud environment. This week, I want to talk about a different tool entirely. The Scenario: A healthcare startup is scaling fast. They have a primary database holding patient records, properly encrypted, properly access controlled, everything by th…

ai, ai-safety

DEV Community

I spent 6 months building a runtime governance layer for AI agents: here's what survived testing

Sentinel compliance agent

1d ago

Agents are moving from demos to touching money, infrastructure, and customer data. Sentinel SCA is a runtime admissibility layer. Before an agent action executes, Sentinel evaluates the request and returns one of three verdicts: ALLOW, REVIEW, or DENY. Every decision is cryptographically signed and recorded in a tamper-evident audit ledger. This post is the validation report: what I built, what I teste…
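As a rough illustration of the pattern this teaser describes (a verdict gate plus a signed, tamper-evident log), here is a minimal Python sketch. It is not Sentinel's implementation: the policy rules, `SECRET_KEY`, `evaluate`, `Ledger`, and `gate` are all hypothetical names invented for the example, and it uses an HMAC (a symmetric MAC) with a SHA-256 hash chain as a stand-in for whatever signature scheme and ledger the real system uses.

```python
import hashlib
import hmac
import json

# Hypothetical demo key; a real deployment would use managed key material.
SECRET_KEY = b"demo-key"


def evaluate(action: dict) -> str:
    """Toy policy (an assumption for illustration, not Sentinel's rules)."""
    if action.get("tool") == "delete_database":
        return "DENY"
    if action.get("amount", 0) > 1000:
        return "REVIEW"
    return "ALLOW"


class Ledger:
    """Append-only log where each entry commits to the previous entry's hash."""

    def __init__(self):
        self.entries = []

    def record(self, action: dict, verdict: str) -> dict:
        prev_hash = self.entries[-1]["hash"] if self.entries else "0" * 64
        payload = json.dumps(
            {"action": action, "verdict": verdict, "prev": prev_hash},
            sort_keys=True,
        )
        entry = {
            "payload": payload,
            # HMAC stands in for a real digital signature here.
            "signature": hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest(),
            "hash": hashlib.sha256(payload.encode()).hexdigest(),
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Recompute every MAC and hash link; any edit breaks the chain."""
        prev_hash = "0" * 64
        for entry in self.entries:
            payload = entry["payload"].encode()
            expected = hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()
            if not hmac.compare_digest(expected, entry["signature"]):
                return False
            if json.loads(entry["payload"])["prev"] != prev_hash:
                return False
            if hashlib.sha256(payload).hexdigest() != entry["hash"]:
                return False
            prev_hash = entry["hash"]
        return True


def gate(ledger: Ledger, action: dict) -> str:
    """Evaluate an action before it runs and log the decision."""
    verdict = evaluate(action)
    ledger.record(action, verdict)
    return verdict
```

In this sketch the caller would only execute the action when `gate` returns `"ALLOW"`, and would route `"REVIEW"` to a human. Editing any stored payload after the fact makes `Ledger.verify()` return `False`, which is the sense in which the log is tamper-evident.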

ai, ai-safety, machine-learning

Effective Altruism Forum

A brief list of ways AI safety efforts could be net negative

Elias Schmied

2d ago

Published on June 19, 2026 4:17 PM GMT Here’s Holden Karnofsky: I tend to think it’s worse than 51/49. I tend to think we’re always going to be prone to overestimate how robustly good our actions are. And the more we learn about all the galaxy-brained considerations that one should have had in one’s head, the more it’s going to be like 50+ε%. I think AI safety is a great cause to work in. I’m ex…

ai, ai-safety

DEV Community

Claude Code Security: Permissions, Prompt Injection, and Secrets

Rishabh Poddar

3d ago

Claude Code is useful because it can actually do things. It can inspect a repo, follow instructions, run commands, and move work forward without turning every change into a copy-paste exercise. That is also where the security question starts. Once an agent can read files and execute actions, the real issue is not how clever it is, but what it can access and how much damage a bad input can do befo…

ai, ai-safety, machine-learning

Effective Altruism Forum

Latin America in search of its stake in AI safety: A Map

egertonneto

3d ago

Published on June 18, 2026 2:52 PM GMT A map of who is doing what on AI safety in Latin America, scoped to catastrophic risk, and an argument for digesting the Northern frameworks rather than copying them. I used Claude Opus 4.7 (Anthropic) for brainstorming, understanding and discussing concepts, finding initiatives, and grammar and spelling corrections on text I had already written. It likely c…

ai, ai-safety

DEV Community

AI Agents Today Aren't Secure. They're Just Clumsy

Elizabeth Adhiambo

3d ago

There is a quiet assumption running through most conversations about AI security: that the danger is coming, but it isn't here yet. That assumption is mostly right. What fewer people acknowledge is why. Today's AI agents are not safe because anyone made them safe. They are safe because they are not yet competent enough to be reliably dangerous. This is not a security posture. It is borrowed time…

ai, ai-safety

Effective Altruism Forum

Plugging the generalist gap in AI safety with freelancers

Emily Marais

3d ago

Published on June 18, 2026 12:22 PM GMT TLDR: The field of AI safety is bottlenecked on talent. Running recruitment processes is expensive and time-consuming. Freelancers are overlooked. Hiring freelancers can provide a way to quickly and cheaply test a person's fit within an org, and vice versa. Plus, real work gets completed, and freelancers get both compensation and a portfolio piece. Last wee…

ai, ai-safety

DEV Community

FIFA Hack Authentication Flaw, Chrome Ad Blocker End, AI Supply Chain Security

soy

4d ago

Today's Highlights: Today's top security news covers a critical real-world authentication vulnerability, significant changes impacting browser privacy and ad blockers, and evolving national security concerns in the AI supply chain. I Could've Rickrolled the Entire FIFA World Cup. All I Needed Was My ID (Lobste.rs) Sourc…

ai, ai-safety

Effective Altruism Forum

Thoughts on Likelihood of Existential Risks by Misaligned AIs

ishankhire

5d ago

Published on June 17, 2026 7:18 AM GMT TLDR: AI safety is confusing to navigate, because it is a pre-paradigmatic field composed of people making different, theoretical arguments for why x-risk is likely (or unlikely). Arguments that x-risk is likely are unfalsifiable and have little empirical evidence. This does not mean they’re wrong. Much of your probability of x-risk boils down to your priors…

ai, ai-safety

Effective Altruism Forum

Tactical and Operational Exploratory Modeling for AI Governance

Dawn Drescher

5d ago

Published on June 17, 2026 1:07 AM GMT Using computational methods to improve our preparedness via more robust and adaptive strategies in AI governance. A project proposal for a think tank, consultancy, or software. Overview: Over the years, I’ve come across or come up with a number of project ideas in AI safety and governance that I find promising. My top list has fewer than ten, but in total ther…

ai, ai-ethics, ai-safety

PhilPapers: Recent additions to PhilArchive

Li, Ziyang: Tanyuan: A Hardware-Grounded Mission Alignment Paradigm for Endogenous AI Safety

5d ago

This paper introduces Tanyuan, a native AI engineering paradigm that grounds mission alignment in formal information-theoretic axioms rather than ad-hoc ethical preferences. Starting from two irreducible fundamental postulates, we derive nine fully formalized core theorems, among which the Logical Desirelessness Theorem mathematically proves truth-seeking silicon agents possess no intrinsic incen…

ai, ai-safety, machine-learning

Google DeepMind News

Securing the future of AI agents

5d ago

Securing internal systems with an AI Control Roadmap, combining traditional safeguards and real-time monitoring.

ai, ai-safety

Effective Altruism Forum

How might neurotechnology impact AI safety for good or for ill?

Allan McCay

6d ago

Published on June 16, 2026 3:45 AM GMT Looking forward to participating in the Australian AI Safety Forum 2026 at the University of Sydney on 7-8 July in order to consider this idea with others. We would love to hear from others here on the blog about the neural democratisation of AI hypothesis and how it might relate to safety (see below…

ai, ai-safety, neurotechnology

DEV Community

I shipped 35 bugs in my AI chatbot. The scariest one was on the output side.

Rapls

6d ago

I ran my own AI chatbot plugin through a security review before release, and it came back with 35 bugs. Three were critical. The one that made my stomach drop was an HTML injection coming from unsanitized model output. I had spent all my worry on the input side: prompt injection, the path where a user types a malicious instruction. What actually bit me was the output. The model handed back a stri…
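The output-side risk in this teaser (model text rendered as HTML without sanitizing) can be shown with a small Python sketch. This is not the author's plugin code: `render_reply` and the `bot-reply` class name are hypothetical, and the example assumes the reply is meant to be shown as plain text, so escaping everything with the standard library's `html.escape` is acceptable. A plugin that needs to allow some markup would need an allowlist-based sanitizer instead.

```python
import html


def render_reply(model_output: str) -> str:
    """Treat model output as untrusted text and escape it before embedding in HTML."""
    # quote=True also escapes " and ', which matters if the text ever lands in an attribute.
    safe = html.escape(model_output, quote=True)
    return f'<div class="bot-reply">{safe}</div>'


# A reply containing markup is rendered inert instead of being interpreted by the browser.
print(render_reply("<img src=x onerror=alert(1)>"))
# prints: <div class="bot-reply">&lt;img src=x onerror=alert(1)&gt;</div>
```

The design point is that the model's reply gets the same treatment as user input: it is escaped at the point of rendering, regardless of how trustworthy the prompt was.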

ai, ai-safety, cybersecurity

Effective Altruism Forum

Can the Safety Tax Be Highly Concentrated?

Ozzie Gooen

6d ago

Published on June 15, 2026 6:50 PM GMT TLDR: We may capture much or most of the available AI safety benefit by reserving expensive, specialized agents for the <1% of tasks that carry catastrophic risk. This would mean that AI safety work on high-cost but highly safe systems could be very useful. The standard objection to compute-heavy AI safety measures is competitive: any lab paying a large alig…

ai, ai-safety

EdTech Innovation Hub

Google DeepMind and partners put $10M behind multi-agent AI safety research

Emma Thompson

7d ago

The funding call is open to researchers worldwide and focuses on the risks that may emerge when large populations of AI agents interact across shared digital systems. Google DeepMind and partners have opened a $10 million funding call for research into the safety of interacting AI agent systems. Google DeepMind, Schmidt Sciences, the Cooperative AI Foundation, the Advanced Research and Invention …

ai, ai-safety, autonomous-systems

WitnessAI

7 risks of AI in retail: how to mitigate them

WitnessAI

7d ago

A chatbot invents a refund policy. A dealership bot agrees to sell a car for a dollar. A pricing agent quietly drifts toward a competitor’s number. None of these started as security incidents. They started as AI features shipped faster than the controls around them. That’s the position most retailers are in right now. AI ...

ai, ai-safety

WitnessAI

What are Claude AI security risks?

WitnessAI

7d ago

In late December 2025, a single operator pointed Claude Code at 10 Mexican government agencies and a financial institution, walked out with 150 gigabytes of sensitive data, and watched Claude flag a SCADA interface as a high-value target on its own, without ever being asked to look for OT systems. The model scoped the engagement, ...

ai, ai-safety

research.io
