Rethinking imitation learning with Predictive Inverse Dynamics Models

This research looks at why Predictive Inverse Dynamics Models often outperform standard Behavior Cloning in imitation learning. By using simple predictions of what happens next, PIDMs reduce ambiguity and learn from far fewer demonstrations. The post Rethinking imitation learning with Predictive Inverse Dynamics Models appeared first on Microsoft Research.

2026/2/6

Paza: Introducing automatic speech recognition benchmarks and models for low-resource languages

Microsoft Research unveils Paza, a human-centered speech pipeline, and PazaBench, the first leaderboard for low-resource languages. It covers 39 African languages and 52 models and is tested with communities in real settings.

2026/2/5

UniRG: Scaling medical imaging report generation with multimodal reinforcement learning

AI can help generate medical image reports, but today’s models struggle with varying reporting schemes. Learn how UniRG uses reinforcement learning to boost the performance of medical vision-language models.

2026/1/28

Multimodal reinforcement learning with agentic verifier for AI agents

Argos improves multimodal RL by evaluating whether an agent’s reasoning aligns with what it observes over time. The approach reduces visual hallucinations and produces more reliable, data-efficient agents for real-world applications.

2026/1/21

OptiMind: A small language model with optimization expertise

OptiMind is a small language model that converts business operation challenges, described in natural language, into mathematical formulations that optimization software can solve. It reduces formulation time and errors, and enables fast, privacy-preserving local use.

2026/1/15

Agent Lightning: Adding reinforcement learning to AI agents without code rewrites

By decoupling how agents work from how they’re trained, Agent Lightning turns each step an agent takes into data for reinforcement learning. This makes it easy for developers to improve agent performance with almost zero code changes.

2025/12/12

Promptions helps make AI prompting more precise with dynamic UI controls

Promptions helps developers add dynamic, context-aware controls to chat interfaces so users can guide generative AI responses. It lets users shape outputs quickly without writing long instructions.

2025/12/11

GigaTIME: Scaling tumor microenvironment modeling using virtual populations generated by multimodal AI

Using AI-generated virtual populations, Microsoft researchers uncovered hidden cellular patterns that could reshape how we understand and treat cancer.

2025/12/10

Ideas: Community building, machine learning, and the future of AI

As the Women in Machine Learning Workshop (WiML) marks its 20th annual gathering, cofounders, friends, and collaborators Jenn Wortman Vaughan and Hanna Wallach reflect on WiML’s evolution, navigating the field of ML, and their work in responsible AI.

2025/12/2

Reducing privacy leaks in AI: Two approaches to contextual integrity

New research explores two ways to give AI agents stronger privacy safeguards grounded in contextual integrity. One adds lightweight, inference-time checks; the other builds contextual awareness directly into models through reasoning and RL.

2025/11/26