
ChatGPT ads and ambient devices & Google Gemini 3.1 Pro leap - AI News (Feb 21, 2026)
February 21, 2026 · 12m 23s
Show Notes
Please support this podcast by checking out our sponsors:
- Consensus: AI for Research. Get a free month - https://get.consensus.app/automated_daily
- KrispCall: Agentic Cloud Telephony - https://try.krispcall.com/tad
- Prezi: Create AI presentations fast - https://try.prezi.com/automated_daily
Support The Automated Daily directly:
Buy me a coffee: https://buymeacoffee.com/theautomateddaily
Today's topics:
- ChatGPT ads and ambient devices - OpenAI’s ChatGPT ads went live, colliding with rumors of a pocket-sized, always-on assistant device and raising incentives, privacy, and data-control questions.
- Google Gemini 3.1 Pro leap - Google rolls out Gemini 3.1 Pro with a verified 77.1% on ARC-AGI-2, positioning it for complex reasoning and agentic workflows via API, Vertex AI, and NotebookLM.
- NotebookLM meets Opal workflows - An internal build hints NotebookLM notebooks could become native Opal tiles, turning curated notes into a reusable knowledge source for no-code automation blocks.
- ARC-AGI harness shows gaps - A custom ARC-AGI-3-style harness suggests Gemini 3.1 Pro improves task identification but struggles with execution and memory, while Claude Opus performs stronger under constraints.
- Cooperation emerges from extortion - A new arXiv paper shows in-context co-player inference can yield cooperation in multi-agent RL: because agents adapt quickly, they become extortable, creating pressure to cooperate.
- Cord’s agent trees with context - Cord proposes agent coordination as dependency trees with explicit spawn vs fork context flow, using MCP tools and a shared SQLite store to enforce authority and results injection.
- GEPA optimizes any text artifact - GEPA’s optimize_anything generalizes evolutionary optimization to any text artifact (prompts, code, configs, SVG) using evaluator feedback as Actionable Side Information and Pareto search.
- Crusoe Managed Inference KV cache - Crusoe launches Managed Inference with a cluster-wide KV cache (MemoryAlloy), claiming up to 9.9x faster time-to-first-token and 5x throughput vs vLLM benchmarks.
- SANS AI Cybersecurity Summit 2026 - SANS announces the AI Cybersecurity Summit 2026 plus optional GIAC-track courses, emphasizing technical workshops on prompt injection, agent failures, and AI-powered attacks.
- Agent safety: sandboxes and bans - Cursor’s agent sandboxing reduces approval fatigue by containing autonomous terminal commands, while Meta’s AI-driven account security reportedly creates onboarding false positives at scale.
- Microsoft Gaming leadership reshuffle - Phil Spencer retires from Xbox leadership as Asha Sharma becomes CEO of Microsoft Gaming, promising human-made art, cross-platform expansion, and no ‘soulless AI slop’.
- Production lessons: prompts to observability - Operator experience reports highlight what works for agents: prototype with frontier models, fine-tune for stable tasks, use typed languages, run multi-model critique loops, and invest in tracing.
- https://www.sans.org/cyber-security-training-events/ai-summit-2026
- https://arxiv.org/abs/2602.16301
- https://juno-labs.com/blogs/every-company-building-your-ai-assistant-is-an-ad-company
- https://www.neowin.net/news/phil-spencer-is-exiting-microsoft-as-ai-executive-takes-over-xbox/
- https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-1-pro/
- https://www.june.kim/cord
- https://www.testingcatalog.com/google-test-notebooklm-integration-for-opal-workflows/
- https://x.com/scaling01/status/2024640940657246235
- https://tomtunguz.com/9-observations-using-ai-agents/
- https://daoudclarke.net/2026/02/19/repeating-prompt
- https://www.crusoe.ai/cloud/managed-inference
- https://www.sans.org/mlp/ai-security-blueprint
- https://cursor.com/blog/agent-sandboxing
- https://9to5mac.com/2026/02/19/duckduckgo-rolls-out-ai-powered-image-editing-on-duck-ai/
- https://mojodojo.io/blog/meta-is-systematically-killing-our-agency/
- https://gepa-ai.github.io/gepa/blog/2026/02/18/introducing-optimize-anything/
- https://fortune.com/2026/02/19/openai-anthropic-sam-altman-dario-amodei-refused-to-hold-hands-ai-super-bowl-ad-war-ceos-big-tech-conflict/
- https://thezvi.wordpress.com/2026/02/19/ai-156-part-1-they-do-mean-the-effect-on-jobs/
- https://www.ben-evans.com/benedictevans/2026/2/19/how-will-openai-compete-nkg2x
Episode Transcript
ChatGPT ads and ambient devices
Let’s start with the business model tension that keeps showing up in AI.
OpenAI quietly rolled out advertisements inside ChatGPT: announced in mid-January, and reportedly live by early February. On their own, ads are not shocking. What is more unsettling is the direction the broader market is heading: assistants that don’t wait for you to type, but instead stay “ambient,” always around, always sensing. Commentary on the rollout points to OpenAI’s acquisition of Jony Ive’s hardware startup io and the idea of a pocket-sized device with a microphone and camera, designed to be contextually aware, maybe even a phone replacement.
The crux of the argument is simple: privacy policies are promises, but architecture is enforcement. If a system is ad-funded, it’s structurally incentivized to learn more about you. And ambient audio and video inside a home is qualitatively different from scanning email—it captures arguments, health conversations, finances, and intimate moments. The proposed counterweight is edge inference: run the full pipeline locally so the assistant can “know everything” while sending nothing. Whether that becomes mainstream is unclear, but the incentive conflict is now out in the open.
That story also intersects with a very public rivalry: Sam Altman and Dario Amodei had a noticeably awkward onstage moment at India’s AI Impact Summit this week, after Anthropic’s Super Bowl campaign leaned hard into a message of “no ads in Claude.” The optics don’t matter as much as the positioning: one camp arguing for subsidized access at massive scale, the other selling the idea that attention-based monetization is a fundamental betrayal of the assistant concept.
Google Gemini 3.1 Pro leap
Now, to the model race—Google is pushing hard on reasoning.
Google announced Gemini 3.1 Pro, rolling out starting February 19 across consumer products like the Gemini app and NotebookLM, and developer channels like the Gemini API, Vertex AI, and Android Studio. Google frames it as the model you use when “a simple answer isn’t enough,” and says it’s the core intelligence behind recent “Deep Think” advances.
The headline number is a verified 77.1% on ARC-AGI-2, a benchmark designed to test whether a system can solve genuinely new logic patterns. Google claims that’s more than double Gemini 3 Pro’s reasoning performance on that test. The demos lean into synthesis and building: animated SVGs from prompts, a live dashboard that visualizes the International Space Station’s orbit from public telemetry, and interactive 3D experiences with hand-tracking and generative audio.
But the reality check comes from independent testing. One ARC-AGI-3-style harness report says Gemini 3.1 Pro is better at identifying what a puzzle wants, yet still fumbles execution: misreading visual cues, missing a 90-degree rotation, and running out of moves. The same tester says Claude Opus 4.6 (Thinking) looks stronger in planning and in how it uses memory, even if it still fails under tight action budgets. The interesting takeaway isn’t “who won.” It’s that memory structure and tool discipline are becoming first-class capabilities, not nice-to-haves.
NotebookLM meets Opal workflows
Staying with Google for a moment: there’s a quiet workflow story brewing.
An internal build suggests Google Labs is testing an integration where NotebookLM notebooks appear as native assets inside Opal, its no-code workflow builder. If that ships, NotebookLM stops being a passive research vault and becomes a persistent knowledge tile you can wire into automated flows—especially into Opal’s “Generate” block, where a prompt could directly reference your curated notebook.
That sounds small, but it’s a key pattern: durable, user-owned context feeding repeatable automations. Today, most “memory” in workflow tools is either temporary—or it’s spread across docs and tabs that humans have to shuttle manually. A NotebookLM tile could become a practical middle layer: not a full database, but a living, curated source of truth for analysts and researchers.
Agent cooperation, Cord, and GEPA
Let’s shift into agents: how they cooperate, how we coordinate them, and how we keep them from causing damage.
On the research side, a new arXiv paper—“Multi-agent cooperation through in-context co-player inference”—explores a tricky question: how do self-interested reinforcement-learning agents end up cooperating without hardcoded assumptions about each other? The authors’ key move is to use sequence models trained against a diverse set of co-players. That diversity seems to teach agents a fast, within-episode adaptation ability—basically in-context learning for game-theoretic behavior.
And here’s the twist: that in-context adaptability makes agents vulnerable to extortion. If you can be exploited, you now have an incentive to shape how the other party adapts to you. The paper argues that this “mutual shaping” pressure can settle into cooperation—an emergent outcome, not a rule.
In the builder world, June Kim introduced Cord, an open-source concept for coordinating not a single chain of agents, but a tree of agents with dependencies and parallel branches—closer to how real work actually looks. Cord’s distinguishing feature is explicit control over context flow: “spawn” gives a child a clean slate plus only what it needs, while “fork” inherits the accumulated context for synthesis. It’s implemented with MCP tools and a shared SQLite store, and it even makes the human an explicit node via an “ask” primitive that blocks downstream steps until you answer.
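To make the spawn versus fork distinction concrete, here is a minimal Python sketch. It is not Cord's actual API: the `Node` class, the `spawn` and `fork` method signatures, and the list-of-strings context are hypothetical names chosen only to illustrate the two kinds of context flow described above.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """One agent task in a coordination tree (hypothetical, not Cord's real API)."""
    goal: str
    context: list[str] = field(default_factory=list)
    children: list["Node"] = field(default_factory=list)

    def spawn(self, goal: str, needs: list[str]) -> "Node":
        # Clean slate: the child sees only what the parent explicitly hands over.
        child = Node(goal=goal, context=list(needs))
        self.children.append(child)
        return child

    def fork(self, goal: str) -> "Node":
        # Inherit: the child starts with a copy of everything accumulated so far.
        child = Node(goal=goal, context=list(self.context))
        self.children.append(child)
        return child


root = Node(goal="write market report", context=["brief", "style guide", "raw notes"])
researcher = root.spawn("collect pricing data", needs=["brief"])
synthesizer = root.fork("draft summary")

print(researcher.context)   # ['brief']
print(synthesizer.context)  # ['brief', 'style guide', 'raw notes']
```

The spawned child stays cheap and focused, while the forked child pays for the full context because its job is synthesis. Both copy the list, so a child's later additions do not leak back into the parent.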
Then there’s the meta-tooling wave: GEPA introduced optimize_anything, a declarative API that tries to optimize any text-representable artifact (prompts, code, agent designs, SVGs) by searching candidates, scoring them with your evaluator, and feeding diagnostic feedback back into the proposer model as “Actionable Side Information.” It fills the role gradient descent plays in training, improving a candidate step by step from feedback, but it does so with evolutionary search for messy, black-box objectives where no gradient is available. The demos range from improving an SVG illustration to evolving multi-stage agent architectures for ARC-style tasks.
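The propose, score, and feed-back loop can be sketched in a few lines of Python. This is not GEPA's optimize_anything API, and it swaps the Pareto-based evolutionary search for plain greedy hill-climbing to stay short. The function names, the keyword-based evaluator, and the rule-based proposer (standing in for an LLM call) are all assumptions made for illustration.

```python
from typing import Callable

Evaluator = Callable[[str], tuple[float, str]]   # artifact -> (score, textual feedback)
Proposer = Callable[[str, str], str]             # (artifact, feedback) -> new artifact


def optimize_text(seed: str, evaluate: Evaluator, propose: Proposer, budget: int = 10) -> str:
    """Greedy feedback-guided search over a text artifact (illustrative only)."""
    best = seed
    best_score, feedback = evaluate(best)
    for _ in range(budget):
        candidate = propose(best, feedback)       # an LLM call in a real system
        score, candidate_feedback = evaluate(candidate)
        if score > best_score:                    # keep only strict improvements
            best, best_score, feedback = candidate, score, candidate_feedback
    return best


# Toy evaluator: reward prompts that mention every required instruction.
REQUIRED = ["json", "cite sources", "be concise"]


def evaluate(prompt: str) -> tuple[float, str]:
    missing = [k for k in REQUIRED if k not in prompt.lower()]
    feedback = ("missing: " + ", ".join(missing)) if missing else ""
    return float(len(REQUIRED) - len(missing)), feedback


# Toy proposer: reads the diagnostic feedback and patches the first gap it names.
def propose(prompt: str, feedback: str) -> str:
    if not feedback:
        return prompt
    first_missing = feedback.removeprefix("missing: ").split(", ")[0]
    return f"{prompt} Requirement: {first_missing}."


result = optimize_text("Summarize the article.", evaluate, propose)
print(result)
```

The point of the sketch is the data flow: the evaluator returns more than a number, and the proposer uses that text to decide what to change next.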
Crusoe Managed Inference and production lessons
If you’re actually running models in production, speed and cost still set the rules.
Crusoe Cloud launched “Managed Inference,” pitching low-latency, high-throughput inference with managed scaling. The claimed performance gains are big—up to 9.9x faster time-to-first-token and up to 5x throughput versus vLLM on a Llama-3.3-70B benchmark—attributed largely to MemoryAlloy, a cluster-wide KV cache that avoids duplicate prefills and supports persistent sessions with smarter routing.
In other words: the caching layer is becoming the product. If you can reuse computation across requests and keep sessions warm, you can make agents feel snappy without brute-forcing everything with more GPUs. Crusoe also bundled the launch into an “Intelligence Foundry” hub—model catalog, keys, monitoring, provisioned throughput—the usual platform shape, but tuned for inference economics.
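Here is a toy Python sketch of why avoiding duplicate prefills matters. It is not how MemoryAlloy works internally, which the launch material summarized here does not describe. The `PrefixCache` class is hypothetical, it tracks only which token prefixes have been computed rather than any real key-value tensors, and it stores every prefix explicitly, where a real system would use something like a radix tree or block hashes.

```python
class PrefixCache:
    """Toy stand-in for a shared KV cache: remembers which token prefixes were prefilled."""

    def __init__(self) -> None:
        self._seen: set[tuple[str, ...]] = set()

    def prefill(self, tokens: list[str]) -> int:
        """Return how many tokens had to be computed for this request."""
        # Find the longest prefix that an earlier request already computed.
        reused = 0
        for n in range(len(tokens), 0, -1):
            if tuple(tokens[:n]) in self._seen:
                reused = n
                break
        # Record the newly computed prefixes so later requests can reuse them.
        for n in range(reused + 1, len(tokens) + 1):
            self._seen.add(tuple(tokens[:n]))
        return len(tokens) - reused


cache = PrefixCache()
system = "You are a helpful support agent .".split()
first = system + "Where is my order ?".split()
second = system + "Cancel my order".split()

print(cache.prefill(first))   # 12
print(cache.prefill(second))  # 3
```

The second request shares the seven-token system prompt with the first, so only its three new tokens need prefill work. Sharing that cache across a whole cluster extends the same saving to requests that land on different machines.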
And on the “how to build it” front, Tom Tunguz shared practical observations from a year of agent systems: prototype unpredictable inputs with frontier models first, fine-tune when the distribution is stable, and consider typed languages like Rust to reduce the ‘it compiled in the model’s imagination’ problem. He also advocates a multi-model braintrust—one drafts, others critique, then iterate—and notes a very real operational truth: in AI apps, traces are the documentation. His example of nightly mining the last hundred conversations for failure patterns is becoming a standard playbook.
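The nightly mining step can be as simple as counting error labels over a recent window. The sketch below assumes a hypothetical trace format, a list of dicts with an optional `error` label. That schema and the `failure_patterns` function are illustrative and not taken from Tunguz's write-up.

```python
from collections import Counter


def failure_patterns(traces: list[dict], window: int = 100, top: int = 3) -> list[tuple[str, int]]:
    """Count the most common error labels among the most recent traces."""
    recent = traces[-window:]
    errors = Counter(t["error"] for t in recent if t.get("error"))
    return errors.most_common(top)


# Synthetic data: 100 conversations, 10 of which failed in one of three ways.
traces = (
    [{"id": i, "error": None} for i in range(90)]
    + [{"id": 90 + i, "error": "tool_timeout"} for i in range(6)]
    + [{"id": 96 + i, "error": "bad_json"} for i in range(3)]
    + [{"id": 99, "error": "hallucinated_api"}]
)
print(failure_patterns(traces))
# [('tool_timeout', 6), ('bad_json', 3), ('hallucinated_api', 1)]
```

In practice the labels would probably come from a model reading each conversation rather than from a ready-made field, but the aggregation step looks the same.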
One more lightweight but provocative research note: a Google paper argues that simply repeating a prompt can boost performance for non-reasoning models. The commentary around it is basically: it’s wild that such a blunt trick still works—and maybe it points to training changes, like segment-based attention masking, that could bake those gains in without wasting tokens.
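Mechanically, the trick is about as simple as it sounds. A minimal sketch, assuming a plain-text prompt: the repetition count and the blank-line separator are guesses for illustration, since the episode does not give the paper's exact format.

```python
def repeat_prompt(prompt: str, times: int = 2, separator: str = "\n\n") -> str:
    """Return the prompt repeated `times` times, joined by `separator`."""
    if times < 1:
        raise ValueError("times must be at least 1")
    return separator.join([prompt] * times)


print(repeat_prompt("List three risks of ad-funded assistants."))
```

The cost is visible right in the function: input tokens scale linearly with `times`, which is why baking the gain into training would be the more attractive fix.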
Agent security: SANS summit, sandboxes, and bans
Now, security—because agentic software changes the threat model, and today’s news reflects that.
SANS announced its AI Cybersecurity Summit 2026, happening April 20th and 21st in Arlington, Virginia, with a live online option—and then optional training courses from April 22nd through 27th. The summit portion offers 12 CPE credits and is chaired by Rob T. Lee. The agenda is clearly leaning technical: highly detailed talks, hands-on workshops, a solutions expo, and evening networking under the “Summit@Night” banner.
What stands out is the workshop content. One lab frames a “smart pizza place” as the target, using the OWASP AI Exchange to walk through prompt injection, data leakage, poisoning, supply-chain risks, vector database issues, and agents that do too much. Another workshop is an “OWASP FinBot Lab” CTF focused on agentic workflow failures—goal hijacking, tool misuse, and the kind of unexpected remote code execution that happens when an agent has more permissions than its designer anticipated.
SANS is also pushing a broader “Secure AI Blueprint” built around three tracks: Protect AI, Utilize AI, and Govern AI—arguing that adoption is outrunning readiness, and that governance has to be as concrete as controls and detection.
That maps neatly to what Cursor just shipped: agent sandboxing. Cursor’s point is that auto-approving terminal commands is productive—until it isn’t. But if humans approve everything, they get tired and start rubber-stamping. Cursor’s approach is to let agents run freely inside a locked-down sandbox and only ask for approval when they need to escape—most often for internet access. They report sandboxed agents stop for approvals 40% less often, which is a productivity metric, but it’s also a security story: fewer opportunities for ‘approval fatigue’ to turn into a catastrophe.
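The approval logic behind that idea can be sketched as a small policy function. This is a simplified Python model, not Cursor's implementation: the `Action` fields and the `decide` function are hypothetical, and a real sandbox enforces these limits at the operating-system level instead of trusting declared flags.

```python
from dataclasses import dataclass
from enum import Enum, auto


class Decision(Enum):
    RUN_IN_SANDBOX = auto()
    ASK_HUMAN = auto()


@dataclass(frozen=True)
class Action:
    command: str
    needs_network: bool = False             # assumed flag: must reach the internet
    writes_outside_workspace: bool = False  # assumed flag: touches files beyond the project


def decide(action: Action) -> Decision:
    """Ask a human only when the action has to escape the sandbox."""
    if action.needs_network or action.writes_outside_workspace:
        return Decision.ASK_HUMAN
    return Decision.RUN_IN_SANDBOX


session = [
    Action("pytest -q"),
    Action("ruff check ."),
    Action("pip install requests", needs_network=True),
    Action("python scripts/migrate.py"),
    Action("git commit -am 'fix'"),
]
approvals = sum(decide(a) is Decision.ASK_HUMAN for a in session)
print(f"{approvals} approval prompt(s) for {len(session)} actions")
```

Under an approve-everything policy the same session would interrupt the user five times. Here the single prompt is the one that actually changes the risk profile.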
And finally, a cautionary tale from the ad ecosystem: an agency called Mojo Dojo says Meta’s AI-driven account security is repeatedly banning new employee accounts—after ID uploads and face scans—before those accounts even touch ad systems. The most worrying detail is the support dead-end: appeals require logging in, but banned users can’t log in. Whether this is policy, automation error, or both, it’s a real example of what happens when identity and enforcement become “almost entirely AI” without a reliable human override.
DuckDuckGo image editing and Microsoft Gaming reshuffle
Two quick hits before we close.
DuckDuckGo is rolling out AI image editing inside Duck.ai, free and without an account. It uses an OpenAI model under the hood, but DuckDuckGo says it strips metadata, removes IP addresses before sending prompts, and stores uploaded images locally on-device. Edited images are labeled with C2PA metadata. This is part of a broader positioning: AI features are optional, and there’s even an AI-free search interface at noai.duckduckgo.com.
And in gaming, Microsoft is reshaping Xbox leadership. Phil Spencer is retiring after decades at Microsoft and more than a decade leading Xbox. Sarah Bond is also leaving. Asha Sharma—previously President of Microsoft’s CoreAI product—becomes CEO of Microsoft Gaming. In her internal message, she emphasized investing in franchises while expanding Xbox across platforms, exploring new business models as monetization and AI evolve, and explicitly said Microsoft won’t flood games with “soulless AI slop.” That’s a notable line in an era where every entertainment executive is being pitched generative content at scale.
Visit our website at https://theautomateddaily.com/
Send feedback to [email protected]