Topic

Research

Academic papers, technical breakthroughs, and lab research findings

Featured

Funding & Startups

Alibaba's Qwen Lead Researcher Launches AI Lab, Targets $2B Valuation

4 days ago · The Information

All Stories

Generative AI

AI IQ Launches Model Scorecard, Sparks Precision vs. Simplicity Debate

A new site called AI IQ has launched a framework for scoring frontier language models on a single intelligence…

3 days ago · VentureBeat AI

AI Risk & Security

Anthropic's Mythos AI Shows Sharper Hacking Skills, U.K. Researchers Find

Researchers at the U.K.'s AI Security Institute reported Wednesday that Anthropic's latest version of Mythos AI…

3 days ago · The Information

AI Agents

Frontier LLMs Silently Corrupt 25% of Documents in Iterative Workflows

Microsoft researchers developed a benchmark showing that frontier LLMs silently corrupt an average of 25% of document…

3 days ago · VentureBeat AI

AI Agents

Sakana trains 7B model to orchestrate GPT, Claude, Gemini

Sakana AI has developed RL Conductor, a 7-billion-parameter language model trained via reinforcement learning to…

9 days ago · VentureBeat AI

Research

Mamba Proves Viable for Time Series Classification

Researchers propose MambaSL, a minimally modified single-layer Mamba architecture designed specifically for time series…

11 days ago · ArXiv (cs.AI)

AI Hardware

Magnitude Beats Phase in Hybrid Quantum ML for SAR

Researchers tested five different encoding strategies for using Synthetic Aperture Radar data in quantum machine…

11 days ago · ArXiv (cs.AI)

Data & Training

Faithful Reasoning Emerges from Multi-Move Training, Not Direct Prediction

Researchers studied how reasoning develops in language models across supervised fine-tuning and reinforcement learning…

11 days ago · ArXiv (cs.AI)

AI Agents

SUDP: A Protocol to Keep Agent Secrets Secret

Researchers propose SUDP, a three-role protocol that lets AI agents perform secret-backed operations (API calls, cloud…

12 days ago · ArXiv (cs.AI)

AI Safety & Alignment

Safety Routing Circuits Found Across Models, Vulnerable to Encoding Attacks

Researchers have localized the policy routing mechanism in alignment-trained language models, identifying specific…

12 days ago · ArXiv (cs.AI)

AI for Business

GAN Synthesizes Missing Brain MRI Scans While Preserving Tumors

Researchers propose 3D-MC-SAGAN, a generative model that synthesizes missing MRI brain scan modalities from a single…

12 days ago · ArXiv (cs.AI)

AI for Business

Harvard Study: AI Outperforms Doctors on ER Diagnoses

A Harvard study evaluating large language models across medical contexts found that at least one AI model delivered…

13 days ago · TechCrunch AI

AI Risk & Security

GPT-5.5 matches Mythos Preview on cybersecurity tests

OpenAI's newly released GPT-5.5 performs at parity with Anthropic's restricted Mythos Preview model on cybersecurity…

13 days ago · Ars Technica AI

AI Safety & Alignment

Warmer AI Models Trade Accuracy for Empathy

Researchers at Oxford University's Internet Institute found that large language models fine-tuned to appear warmer and…

13 days ago · Ars Technica AI

AI for Business

MIT's Physics-Based Violin Simulator Offers Luthiers a New Design Tool

MIT engineers have developed a physics-based virtual violin simulation tool that models the fundamental acoustics of…

13 days ago · Ars Technica AI

Gemini · Trending

Google Brings Gemini to Connected Vehicles via Software Update

Google is rolling out its Gemini AI assistant to vehicles equipped with Google built-in, replacing the current Google…

14 days ago · The Verge AI

Research

Data Sovereignty Becomes AI Strategy for Enterprises and Governments

A panel discussion from MIT Technology Review's EmTech AI conference explored how enterprises and governments are…

14 days ago · MIT Technology Review

AI Agents

Alibaba cuts AI agent tool calls 49x with decoupled optimization

Alibaba researchers introduced Hierarchical Decoupled Policy Optimization (HDPO), a reinforcement learning framework…

16 days ago · VentureBeat AI

AI Safety & Alignment

Goodfire's Silico Brings Mechanistic Interpretability to Model Development

Goodfire, a San Francisco startup, released Silico, a tool that lets developers inspect and adjust AI model parameters…

16 days ago · MIT Technology Review

AI Agents

Aggregating Zero-Shot LLMs Beats Single Models for Financial Disclosure Analysis

A new paper demonstrates that a lightweight supervised aggregator can effectively combine outputs from multiple…

16 days ago · ArXiv (cs.AI)

Research

NanoKnow: Mapping How LLMs Encode Knowledge

Researchers have released NanoKnow, a benchmark dataset that maps questions from Natural Questions and SQuAD to whether…

16 days ago · ArXiv (cs.AI)

AI Risk & Security

Personalized Calibration Makes Conformal Prediction Work in Clinical Settings

Researchers at the University of Illinois and collaborators demonstrate that personalized calibration strategies can…

16 days ago · ArXiv (cs.AI)

AI for Business · Trending

DeepMind Pursues AI Co-Clinician Model for Healthcare

Google DeepMind is researching an AI co-clinician model designed to augment healthcare delivery by working alongside…

17 days ago · Google DeepMind

AI Hardware

GPU Rental Performance Varies Wildly Within Same Model

Research from the College of William & Mary, Jefferson Lab, and Silicon Data reveals significant performance…

17 days ago · IEEE Spectrum AI

AI Agents

Evaluation costs now rival training costs for AI models

AI evaluation costs have become a major bottleneck as benchmarking has shifted from static LLM tests to agent-based…

17 days ago · Hugging Face Blog

AI Hardware

Multi-Task EEG Model Cuts Costs with Low-Rank Adaptation

Researchers propose MTEEG, a multi-task learning framework that adapts pre-trained EEG models to multiple downstream…

17 days ago · ArXiv (cs.AI)

AI Agents

Frontier Agents Now Autonomously Implement ML Pipelines, With Claude Outpacing Rivals

Researchers benchmarked frontier coding agents on their ability to autonomously implement an AlphaZero-style machine…

17 days ago · ArXiv (cs.AI)

Generative AI

Poly-DPO and ViPO: Scaling Visual Preference Optimization

Researchers introduced Poly-DPO, an algorithmic extension to preference optimization that adds a polynomial term to…

17 days ago · ArXiv (cs.AI)

Data & Training

Scaling Multi-Anchor Embeddings to LLMs with 40x Compression

Researchers introduce Adaptive Dictionary Embeddings (ADE), a framework that scales multi-anchor word representations…

17 days ago · ArXiv (cs.AI)

AI Agents

New Training Method Cuts Reasoning Model Costs for Enterprises

Researchers at JD.com and academic partners introduced RLSD (Reinforcement Learning with Verifiable Rewards with…

18 days ago · VentureBeat AI