{"ok":true,"capturedAt":"2026-06-01T23:39:34.996Z","venues":["ICLR 2025","NeurIPS 2025","ICML 2025","NeurIPS 2024"],"paper_count":120,"papers":[{"title":"ModHiFi: Identifying High Fidelity predictive components for Model Modification","authors":["Dhruva Kashyap","Chaitanya Murti","Pranav K Nayak","Tanay Narshana","Chiranjib Bhattacharyya"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Pruning","Machine Unlearning"],"abstract_snippet":"Open weight models, which are ubiquitous, rarely provide access to their training data or loss function. This makes modifying such models for tasks such as pruning or unlearning, which are constrained by this unavailability, an active ar...","forum_url":"https://openreview.net/forum?id=lClK4uBxSG","pdf_url":"https://openreview.net/pdf/540a7960e800d4bbd7de9a93e80daae00417007d.pdf","accepted_at":"2025-05-12T11:54:07.752Z"},{"title":"The Structure of Relation Decoding Linear Operators in Large Language Models","authors":["Miranda Anna Christ","Adrián Csiszárik","Gergely Becsó","Dániel Varga"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["large language models","relations","tensor networks","interpretability"],"abstract_snippet":"This paper investigates the structure of linear operators introduced in Hernandez et al. [2023] that decode specific relational facts in transformer language models. We extend their single-relation findings to a collection of relations a...","forum_url":"https://openreview.net/forum?id=XsBzmJzJ2l","pdf_url":"https://openreview.net/pdf/ed3dd009ac0424ecb2d27ab9be7041714b6d8359.pdf","accepted_at":"2025-05-12T11:51:37.869Z"},{"title":"KLASS: KL-Guided Fast Inference in Masked Diffusion Models","authors":["Seo Hyun Kim","Sunwoo Hong","Hojung Jung","Youngrok Park","Se-Young Yun"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Generative Models","Efficient Inference Methods"],"abstract_snippet":"Masked diffusion models have demonstrated competitive results on various tasks including language generation. However, due to its iterative refinement process, the inference is often bottlenecked by slow and static sampling speed. To ove...","forum_url":"https://openreview.net/forum?id=gOG9Zoyn4R","pdf_url":"https://openreview.net/pdf/e5d6b78c9b1c9df5c9f7d513c8e643f790928834.pdf","accepted_at":"2025-05-12T11:38:35.083Z"},{"title":"HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models","authors":["Yu Zhou","Xingyu Wu","Jibin Wu","Liang Feng","KC Tan"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Large language model","model merging","multi-objective optimization","architecture-level merging"],"abstract_snippet":"Model merging is a technique that combines multiple large pretrained models into a single model, enhancing performance and broadening task adaptability without original data or additional training. However, most existing model merging me...","forum_url":"https://openreview.net/forum?id=JeP0lpusYw","pdf_url":"https://openreview.net/pdf/4458c41d6973c01afd92fa290015347f767d7d67.pdf","accepted_at":"2025-05-12T11:35:10.537Z"},{"title":"Structured Sparse Transition Matrices to Enable State Tracking in State-Space Models","authors":["Aleksandar Terzic","Nicolas Menet","Michael Hersche","Thomas Hofmann","Abbas Rahimi"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["State-Space Models","Expressiveness","Efficiency","Matrix Parametrisation","State-Tracking","Finite-State Automata"],"abstract_snippet":"Modern state-space models (SSMs) often utilize structured transition matrices\nwhich enable efficient computation but pose restrictions on the model’s expressivity,\nas measured in terms of the ability to emulate finite-state automata (FSA...","forum_url":"https://openreview.net/forum?id=RDbuSCWhad","pdf_url":"https://openreview.net/pdf/0e83e1ab7e0aace90a87b9ec410932a6355e3af2.pdf","accepted_at":"2025-05-12T11:18:56.212Z"},{"title":"Generalized Linear Mode Connectivity for Transformers","authors":["Alexander Theus","Alessandro Cabodi","Sotiris Anagnostidis","Antonio Orvieto","Sidak Pal Singh"],"venue_group":"NeurIPS 2025","tier":"Oral","primary_area":"deep_learning","keywords":["Neural Network Merging","Linear Mode Connectivity","Model Re-basin","Parameter Space Geometry","Transformer","Permutation Invariance"],"abstract_snippet":"Understanding the geometry of neural network loss landscapes is a central question in deep learning, with implications for generalization and optimization. A striking phenomenon is $\\textit{linear mode connectivity}$ (LMC), where indepen...","forum_url":"https://openreview.net/forum?id=KurYdcCbjv","pdf_url":"https://openreview.net/pdf/74f91986c42f68d256ffcef1d4a4ee46f7530d3d.pdf","accepted_at":"2025-05-12T11:11:40.657Z"},{"title":"Twilight: Adaptive Attention Sparsity with Hierarchical Top-$p$  Pruning","authors":["Chaofan Lin","Jiaming Tang","Shuo Yang","Hanshuo Wang","Tian Tang"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Large Language Model","Sparse Attention","Decode","KV Cache"],"abstract_snippet":"Leveraging attention sparsity to accelerate long-context large language models (LLMs) has been of great importance recently. However, most existing sparse attention algorithms use a fixed budget of how many tokens to use in their computa...","forum_url":"https://openreview.net/forum?id=Ve693NkzcU","pdf_url":"https://openreview.net/pdf/d63b27c7918827915ee716011ba51753540879dd.pdf","accepted_at":"2025-05-12T11:09:50.464Z"},{"title":"An Analysis of Causal Effect Estimation using Outcome Invariant Data Augmentation","authors":["UZAIR AKBAR","Niki Kilbertus","Hao Shen","Krikamol Muandet","Bo Dai"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"probabilistic_methods","keywords":["Causal Inference","Data Augmentation","Instrumental Variables","Out-of-distribution Generalization","Causal Regularization"],"abstract_snippet":"The technique of data augmentation (DA) is often used in machine learning for regularization purposes to better generalize under i.i.d. settings. In this work, we present a unifying framework with topics in causal inference to make a cas...","forum_url":"https://openreview.net/forum?id=C1LVIInfZO","pdf_url":"https://openreview.net/pdf/8114367043ade3658c5e579778d50f4103ffcb12.pdf","accepted_at":"2025-05-12T10:40:34.667Z"},{"title":"Deciphering the Extremes: A Novel Approach for Pathological Long-tailed Recognition in Scientific Discovery","authors":["Zhe Zhao","HaiBin Wen","Xianfu Liu","Rui Mao","Pengkun Wang"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"general_machine_learning","keywords":["Long-tailed learning","Imbalanced datasets"],"abstract_snippet":"Scientific discovery across diverse fields increasingly grapples with datasets exhibiting pathological long-tailed distributions: a few common phenomena overshadow a multitude of rare yet scientifically critical instances. Unlike standar...","forum_url":"https://openreview.net/forum?id=E16vULI6AF","pdf_url":"https://openreview.net/pdf/30f7d83b3452994b583222d8187fc0b243850349.pdf","accepted_at":"2025-05-12T10:23:15.813Z"},{"title":"OpenCUA: Open Foundations for Computer-Use Agents","authors":["Xinyuan Wang","Bowen Wang","Dunjie Lu","Junlin Yang","Tianbao Xie"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["Computer-Use Agent","Visual Language Model","Planning","Reasoning","Scaling","Dataset"],"abstract_snippet":"Vision-language models have demonstrated impressive capabilities as computer-use agents (CUAs) capable of automating diverse computer tasks. As their commercial potential grows, critical details of the most capable CUA systems remain clo...","forum_url":"https://openreview.net/forum?id=6iRZvJiC9Q","pdf_url":"https://openreview.net/pdf/eb1bd0238abbc386303352dba1049a4d5d1fec83.pdf","accepted_at":"2025-05-12T10:08:33.986Z"},{"title":"Near-Optimal Experiment Design in Linear non-Gaussian Cyclic Models","authors":["Ehsan Sharifian","Saber Salehkaleybar","Negar Kiyavash"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"probabilistic_methods","keywords":["Causal Discovery","Adaptive Experiment Design","Linear Non-Gaussian SCMs","Cyclic Causal Models","Adaptive Submodularity","Greedy Optimization"],"abstract_snippet":"We study the problem of causal structure learning from a combination of observational and interventional data generated by a linear non-Gaussian structural equation model that might contain cycles. Recent results show that using mere obs...","forum_url":"https://openreview.net/forum?id=opAU0pYlcP","pdf_url":"https://openreview.net/pdf/d2236355d992c37c0a651c49ec71e676bf864a18.pdf","accepted_at":"2025-05-12T10:05:13.871Z"},{"title":"Escaping saddle points without Lipschitz smoothness: the power of nonlinear preconditioning","authors":["Alexander Bodard","Panagiotis Patrinos"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"optimization","keywords":["Nonconvex optimization","generalized smoothness","saddle point avoidance"],"abstract_snippet":"We study generalized smoothness in nonconvex optimization, focusing on $(L_0, L_1)$-smoothness and anisotropic smoothness. The former was empirically derived from practical neural network training examples, while the latter arises natura...","forum_url":"https://openreview.net/forum?id=7qrhHzZpTA","pdf_url":"https://openreview.net/pdf/d73a4f2b49a9cca19b750c8c17c0411132d8beca.pdf","accepted_at":"2025-05-12T09:46:41.051Z"},{"title":"Activation Control for Efficiently Eliciting Long Chain-of-thought Ability of Language Models","authors":["Zekai Zhao","Qi Liu","Kun Zhou","Zihan Liu","Yifei Shao"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Large Language Models","Long Chain of Thoughts"],"abstract_snippet":"Despite the remarkable reasoning performance, eliciting the long chain-of-thought(CoT) ability in large language models(LLMs) typically requires costly reinforcement learning or supervised fine-tuning on high-quality distilled data. We i...","forum_url":"https://openreview.net/forum?id=XNo4yS9n1k","pdf_url":"https://openreview.net/pdf/2b62626f959a636707dfb61efe094e4dcd7bd32f.pdf","accepted_at":"2025-05-12T09:29:10.134Z"},{"title":"Direct Fisher Score Estimation for Likelihood Maximization","authors":["Sherman Khoo","Yakun Wang","Song Liu","Mark Beaumont"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"probabilistic_methods","keywords":["Simulation-based inference","Likelihood-free inference","Score Matching","Gradient-Based Optimization","Likelihood Maximization"],"abstract_snippet":"We study the problem of likelihood maximization when the likelihood function is intractable but model simulations are readily available. We propose a sequential, gradient-based optimization method that directly models the Fisher score ba...","forum_url":"https://openreview.net/forum?id=2h8bFmEQwh","pdf_url":"https://openreview.net/pdf/932db12010d7d344e1b4ec2a731dc50fc4d0c713.pdf","accepted_at":"2025-05-12T09:23:15.224Z"},{"title":"Path-Enhanced Contrastive Learning for Recommendation","authors":["Haoran Sun","Fei Xiong","Yuanzhe Hu","Liang Wang"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["graph mining","recommender systems","self-supervised learning","Contrastive Learning","Data Augmentation"],"abstract_snippet":"Collaborative filtering (CF) methods are now facing the challenge of data sparsity in recommender systems. In order to reduce the effect of data sparsity, researchers proposed contrastive learning methods to extract self-supervised signa...","forum_url":"https://openreview.net/forum?id=xKmlBQhgI4","pdf_url":"https://openreview.net/pdf/f68206ec049476ef519d4d121f30a3a4b1176561.pdf","accepted_at":"2025-05-12T09:22:43.816Z"},{"title":"T-REGS: Minimum Spanning Tree Regularization for Self-Supervised Learning","authors":["Julie Mordacq","David Loiseaux","Vicky Kalogeiton","Steve Oudot"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"general_machine_learning","keywords":["self-supervised learning","minimum spanning tree","dimensional collapse","dimension estimation","topological data analysis"],"abstract_snippet":"Self-supervised learning (SSL) has emerged as a powerful paradigm for learning representations without labeled data, often by enforcing invariance to input transformations such as rotations or blurring.\nRecent studies have highlighted tw...","forum_url":"https://openreview.net/forum?id=jvObbvshjE","pdf_url":"https://openreview.net/pdf/76246d79c19aebe4013dc98f01ce1e1541d9c8ba.pdf","accepted_at":"2025-05-12T09:21:43.608Z"},{"title":"Generating Informative Samples for Risk-Averse Fine-Tuning of Downstream Tasks","authors":["Heasung Kim","Taekyun Lee","Hyeji Kim","Gustavo De Veciana"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"general_machine_learning","keywords":["Generative models","data augmentation","score-based generative models","risk","importance sampling","wireless communications"],"abstract_snippet":"Risk-averse modeling is critical in safety-sensitive and high-stakes applications. Conditional Value-at-Risk (CVaR) quantifies such risk by measuring the expected loss in the tail of the loss distribution, and minimizing it provides a pr...","forum_url":"https://openreview.net/forum?id=kfB5Ciz2XZ","pdf_url":"https://openreview.net/pdf/b25b05c74fe4ba2989d38be8b7a439cfe68c26e8.pdf","accepted_at":"2025-05-12T09:20:31.822Z"},{"title":"AceSearcher: Bootstrapping Reasoning and Search for LLMs via Reinforced Self-Play","authors":["Ran Xu","Yuchen Zhuang","Zihan Dong","Ruiyu Wang","Yue Yu"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["large language model","retrieval augmented generation","self-play"],"abstract_snippet":"Search-augmented LLMs often struggle with complex reasoning tasks due to ineffective multi-hop retrieval and limited reasoning ability. We propose AceSearcher, a cooperative self-play framework that trains a single large language model (...","forum_url":"https://openreview.net/forum?id=jSgCM0uZn3","pdf_url":"https://openreview.net/pdf/d73f4b0dc08ba840cfc15d8e7e9be6ad4a9b52ce.pdf","accepted_at":"2025-05-12T09:20:21.325Z"},{"title":"DeltaFlow: An Efficient Multi-frame Scene Flow Estimation Method","authors":["Qingwen Zhang","Xiaomeng Zhu","Yushan Zhang","Yixi Cai","Olov Andersson"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["Scene flow estimation","Point clouds","Efficient and scalable vision"],"abstract_snippet":"Previous dominant methods for scene flow estimation focus mainly on input from two consecutive frames, neglecting valuable information in the temporal domain. While recent trends shift towards multi-frame reasoning, they suffer from rapi...","forum_url":"https://openreview.net/forum?id=T9qNDtvAJX","pdf_url":"https://openreview.net/pdf/058d68c6055660f7a33942f5d8e662e65a5d68ac.pdf","accepted_at":"2025-05-12T09:17:17.027Z"},{"title":"Web-Shepherd: Advancing PRMs for Reinforcing Web Agents","authors":["Hyungjoo Chae","Sunghwan Kim","Junhee Cho","Seungone Kim","Seungjun Moon"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["Web Agent","Reward Model","LLM"],"abstract_snippet":"Web navigation is a unique domain that can automate many repetitive real-life tasks and is challenging as it requires long-horizon sequential decision making beyond typical multimodal large language model (MLLM) tasks.\nYet, specialized r...","forum_url":"https://openreview.net/forum?id=G2kMroO9UV","pdf_url":"https://openreview.net/pdf/db46564195dad40b9b174514d7d03b0336d2a8eb.pdf","accepted_at":"2025-05-12T08:57:45.450Z"},{"title":"How do Transformers Learn Implicit Reasoning?","authors":["Jiaran Ye","Zijun Yao","Zhidian Huang","Liangming Pan","Jinxin Liu"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Implicit Reasoning","Multi-hop Inference","Transformer Models","Interpretability"],"abstract_snippet":"Recent work suggests that large language models (LLMs) can perform multi-hop reasoning implicitly---producing correct answers without explicitly verbalizing intermediate steps---but the underlying mechanisms remain poorly understood.\nIn...","forum_url":"https://openreview.net/forum?id=19ygs48nOa","pdf_url":"https://openreview.net/pdf/ad38d1ecc76e644507fb0bca76d2143a06260e52.pdf","accepted_at":"2025-05-12T08:52:00.402Z"},{"title":"Deep Compositional Phase Diffusion for Long Motion Sequence Generation","authors":["Ho Yin Au","Jie Chen","Junkun Jiang","Jingyu Xiang"],"venue_group":"NeurIPS 2025","tier":"Oral","primary_area":"deep_learning","keywords":["Motion Generation","Phase Autoencoder","Long Term Motion Sequence Generation","Motion Inbetweening"],"abstract_snippet":"Recent research on motion generation has shown significant progress in generating semantically aligned motion with singular semantics. However, when employing these models to create composite sequences containing multiple semantically ge...","forum_url":"https://openreview.net/forum?id=jzPQRbGkAq","pdf_url":"https://openreview.net/pdf/1130d293ca50514da0c4fc6832964cee8756084b.pdf","accepted_at":"2025-05-12T08:51:13.627Z"},{"title":"On the sample complexity of semi-supervised multi-objective learning","authors":["Tobias Wegel","Geelon So","Junhyung Park","Fanny Yang"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"theory","keywords":["multi-objective learning","statistical learning","optimization"],"abstract_snippet":"In multi-objective learning (MOL), several possibly competing prediction tasks must be solved jointly by a single model. Achieving good trade-offs may require a model class $\\mathcal{G}$ with larger capacity than what is necessary for so...","forum_url":"https://openreview.net/forum?id=IrgQe6YjKm","pdf_url":"https://openreview.net/pdf/586c6b806ea9e377e7bbf7731cd69b6c16ceaa5c.pdf","accepted_at":"2025-05-12T08:35:19.835Z"},{"title":"Diversity-Aware Policy Optimization for Large Language Model Reasoning","authors":["Jian Yao","Ran Cheng","Xingyu Wu","Jibin Wu","KC Tan"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["LLM","Policy Optimization","Diversity"],"abstract_snippet":"The reasoning capabilities of large language models (LLMs) have advanced rapidly, particularly following the release of DeepSeek-R1, which has inspired a surge of research into data quality and reinforcement learning (RL) algorithms. Des...","forum_url":"https://openreview.net/forum?id=5eZ0iykpDU","pdf_url":"https://openreview.net/pdf/c884fab9a8271939813690f5af1e6cebe20f7a2a.pdf","accepted_at":"2025-05-12T08:34:40.672Z"},{"title":"Fixed-Point RNNs: Interpolating from Diagonal to Dense","authors":["Sajad Movahedi","Felix Sarnthein","Nicola Muca Cirone","Antonio Orvieto"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Deep Learning","Sequence Architecture","Recurrent Neural Network","State Space Model","Linear RNN","SSM"],"abstract_snippet":"Linear recurrent neural networks (RNNs) and state-space models (SSMs) such as Mamba have become promising alternatives to softmax-attention as sequence mixing layers in Transformer architectures. Current models, however, do not exhibit t...","forum_url":"https://openreview.net/forum?id=KT8y9pFgJE","pdf_url":"https://openreview.net/pdf/e76140ca8a27d51bcb73d98746f2e5ec88965a0d.pdf","accepted_at":"2025-05-12T08:28:17.211Z"},{"title":"Bridging Theory and Practice in Link Representation with Graph Neural Networks","authors":["Veronica Lachi","Francesco Ferrini","Antonio Longa","Bruno Lepri","Andrea Passerini"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Graph Neural Networks","Link Representation","Expressiveness"],"abstract_snippet":"Graph Neural Networks (GNNs) are widely used to compute representations of node pairs for downstream tasks such as link prediction. Yet, theoretical understanding of their expressive power has focused almost entirely on graph-level repre...","forum_url":"https://openreview.net/forum?id=WYnvP3DePZ","pdf_url":"https://openreview.net/pdf/124d69e8b54debfcfeb0e34773f9fb43350ff919.pdf","accepted_at":"2025-05-12T07:58:51.939Z"},{"title":"ErrorTrace: A Black-Box Traceability Mechanism Based on Model Family Error Space","authors":["Chuanchao Zang","Xiangtao Meng","Wenyu Chen","Tianshuo Cong","Zha Yaxing"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"social_and_economic_aspects_of_machine_learning","keywords":["LLM Intellectual Property Protection","LLM","LLM Safety","Error Space"],"abstract_snippet":"The open-source release of large language models (LLMs) enables malicious users to create unauthorized derivative models at low cost, posing significant threats to intellectual property (IP) and market stability. Existing IP protection m...","forum_url":"https://openreview.net/forum?id=3P3PL7aCXM","pdf_url":"https://openreview.net/pdf/d7ee6b9c50f47b267806e2463cc1c6fc31a5ac46.pdf","accepted_at":"2025-05-12T07:39:13.016Z"},{"title":"Protein Design with Dynamic Protein Vocabulary","authors":["Nuowei Liu","Jiahao Kuang","Yanting Liu","Tao Ji","Changzhi Sun"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"machine_learning_for_sciences","keywords":["function-based protein design"],"abstract_snippet":"Protein design is a fundamental challenge in biotechnology, aiming to design novel sequences with specific functions within the vast space of possible proteins. Recent advances in deep generative models have enabled function-based protei...","forum_url":"https://openreview.net/forum?id=MpJkAzwUtl","pdf_url":"https://openreview.net/pdf/80c2b5c41eda8d76419bc7f20cc82d3a85ec7a92.pdf","accepted_at":"2025-05-12T07:25:30.848Z"},{"title":"Towards Reliable Code-as-Policies: A Neuro-Symbolic Framework for Embodied Task Planning","authors":["Sanghyun Ahn","Wonje Choi","Junyong Lee","Jinwoo Park","Honguk Woo"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"reinforcement_learning","keywords":["neuro-symbolic","embodied task planning","large language model"],"abstract_snippet":"Recent advances in large language models (LLMs) have enabled the automatic generation of executable code for task planning and control in embodied agents such as robots, demonstrating the potential of LLM-based embodied intelligence. How...","forum_url":"https://openreview.net/forum?id=VaC4sa96EI","pdf_url":"https://openreview.net/pdf/5f58ec4c84b0853291bf90d7ce98babefbbbc313.pdf","accepted_at":"2025-05-12T07:15:36.154Z"},{"title":"AI-Researcher: Autonomous Scientific Innovation","authors":["Jiabin Tang","Lianghao Xia","Zhonghang Li","Chao Huang"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["LLM agents","AI for Science"],"abstract_snippet":"The powerful reasoning capabilities of Large Language Models (LLMs) in mathematics and coding, combined with their ability to automate complex tasks through agentic frameworks, present unprecedented opportunities for accelerating scienti...","forum_url":"https://openreview.net/forum?id=kQWyOYUAC4","pdf_url":"https://openreview.net/pdf/a1c63cdd0495de94664b1513f7d95a3aedcb483a.pdf","accepted_at":"2025-05-12T06:54:55.201Z"},{"title":"GnnXemplar: Exemplars to Explanations - Natural Language Rules for Global GNN Interpretability","authors":["Burouj Armgaan","Eshan Jain","Harsh Pandey","Mahesh Chandran","Sayan Ranu"],"venue_group":"NeurIPS 2025","tier":"Oral","primary_area":"social_and_economic_aspects_of_machine_learning","keywords":["graph neural network","graph machine learning","explainability","xai","global explanation","text-based explanation"],"abstract_snippet":"Graph Neural Networks (GNNs) are widely used for node classification, yet their opaque decision-making limits trust and adoption. While local explanations offer insights into individual predictions, global explanation methods—those that...","forum_url":"https://openreview.net/forum?id=eafIjoZAHm","pdf_url":"https://openreview.net/pdf/4c15b12cff38ede047ec44b0df28590572e447cc.pdf","accepted_at":"2025-05-12T06:54:01.960Z"},{"title":"Abstain Mask Retain Core: Time Series Prediction by Adaptive Masking Loss with Representation Consistency","authors":["Renzhao Liang","Sizhe Xu","Chenggang Xie","Jingru Chen","Feiyang Ren"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Time Series Forecasting","Redundant Feature Suppression","Adaptive Masking Loss"],"abstract_snippet":"Time series forecasting plays a pivotal role in critical domains such as energy management and financial markets. Although deep learning-based approaches (e.g., MLP, RNN, Transformer) have achieved remarkable progress, the prevailing \"lo...","forum_url":"https://openreview.net/forum?id=KrglRiOKYT","pdf_url":"https://openreview.net/pdf/69c0bd90c0a346f571aa8e64b71fc0f7c0db24d7.pdf","accepted_at":"2025-05-12T06:51:00.044Z"},{"title":"DexFlyWheel: A Scalable and Self-improving Data Generation Framework for Dexterous Manipulation","authors":["Kefei Zhu","Fengshuo Bai","YuanHao Xiang","Yishuai Cai","Xinglin Chen"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"reinforcement_learning","keywords":["robotic learning","dexterous manipulation","data flywheel"],"abstract_snippet":"Dexterous manipulation is critical for advancing robot capabilities in real-world applications, yet diverse and high-quality datasets remain scarce. Existing data collection methods either rely on human teleoperation or require significa...","forum_url":"https://openreview.net/forum?id=a49F7EAm6l","pdf_url":"https://openreview.net/pdf/afa7f5544d62b8d426e0a7ab40c16a45776f6efc.pdf","accepted_at":"2025-05-12T06:39:45.086Z"},{"title":"Learning Robust Vision-Language Models from Natural Latent Spaces","authors":["Zhangyun Wang","Ni Ding","Aniket Mahanti"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Adversarial Robustness","Prompt Learning","Vision-Language Models"],"abstract_snippet":"Pre-trained vision-language models (VLMs) exhibit significant vulnerability to imperceptible adversarial perturbations. Current advanced defense strategies typically employ adversarial prompt tuning to improve the adversarial robustness...","forum_url":"https://openreview.net/forum?id=7G9YKty2UZ","pdf_url":"https://openreview.net/pdf/e8595a31cf77a217d9396e59495cf5cba4c01d8e.pdf","accepted_at":"2025-05-12T06:36:04.266Z"},{"title":"Accelerating Diffusion LLMs via Adaptive Parallel Decoding","authors":["Daniel Mingyi Israel","Guy Van den Broeck","Aditya Grover"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["LLM","discrete diffusion","autoregression","sequential","fast","throughput"],"abstract_snippet":"The generation speed of LLMs are bottlenecked by autoregressive decoding, where tokens are predicted sequentially one by one. Alternatively, diffusion large language models (dLLMs) theoretically allow for parallel token generation, but i...","forum_url":"https://openreview.net/forum?id=xwqTt26NJf","pdf_url":"https://openreview.net/pdf/598ea86df6f94c4e13ef03da44b681605dacb6ea.pdf","accepted_at":"2025-05-12T06:36:03.777Z"},{"title":"Inner Speech as Behavior Guides: Steerable Imitation of Diverse Behaviors for Human-AI coordination","authors":["Rakshit Trivedi","Kartik Sharma","David C. Parkes"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"social_and_economic_aspects_of_machine_learning","keywords":["Imitation Learning","Inner Speech","Cognitive Processes","Diverse Behaviors","Steerable Imitation","Human Demonstrations"],"abstract_snippet":"Effective human-AI coordination requires artificial agents capable of exhibiting and responding to human-like behaviors while adapting to changing contexts. Imitation learning has emerged as one of the prominent approaches to build such...","forum_url":"https://openreview.net/forum?id=AwLRF1lZvI","pdf_url":"https://openreview.net/pdf/c9a503dade610541901e33f0b5f64674371b53be.pdf","accepted_at":"2025-05-12T06:24:33.863Z"},{"title":"Graph-Based Attention for Differentiable MaxSAT Solving","authors":["Sota Moriyama","Katsumi Inoue"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"optimization","keywords":["Maximum Satisfiability","Weighted Maximum Satisfiability","Graph Neural Network","Graph Attention","T-Norm","Neurosymbolic"],"abstract_snippet":"The use of deep learning to solve fundamental AI problems such as Boolean Satisfiability (SAT) has been explored recently to develop robust and scalable reasoning systems. This work advances such neural-based reasoning approaches by deve...","forum_url":"https://openreview.net/forum?id=g9XLUU3TaG","pdf_url":"https://openreview.net/pdf/5ccdb0f59f31fa1bf01fc44b0a5efc46e96d53cd.pdf","accepted_at":"2025-05-12T06:17:24.317Z"},{"title":"Pass@K Policy Optimization: Solving Harder Reinforcement Learning Problems","authors":["Christian Walder","Deep Tejas Karkhanis"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"reinforcement_learning","keywords":["reinforcement learning","llm","pass at k","inference time compute","monte carlo","gradient estimation"],"abstract_snippet":"Reinforcement Learning algorithms commonly sample multiple ($n>1$) solution attempts for each problem and reward them independently. This optimizes for pass@1 performance and prioritizes individual sample performance over the diversity a...","forum_url":"https://openreview.net/forum?id=W6WC6047X2","pdf_url":"https://openreview.net/pdf/ad9487f82f7b5b7ee075d300beb3ae17fa78d1d9.pdf","accepted_at":"2025-05-12T06:12:56.918Z"},{"title":"LogicTree: Improving Complex Reasoning of LLMs via Instantiated Multi-step Synthetic Logical Data","authors":["Zehao Wang","Lin Yang","Jie Wang","Kehan Wang","Hanzhu Chen"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["large language model","logical reasoning","data synthesis"],"abstract_snippet":"Despite their remarkable performance on various tasks, Large Language Models (LLMs) still struggle with logical reasoning, particularly in complex and multi-step reasoning processes. \nAmong various efforts to enhance LLMs' reasoning capa...","forum_url":"https://openreview.net/forum?id=z4AMrCOetn","pdf_url":"https://openreview.net/pdf/30059572044fffae201f1e0daf92fc78a5aef218.pdf","accepted_at":"2025-05-12T06:10:22.261Z"},{"title":"Multitask Learning  with Stochastic Interpolants","authors":["Hugo Negrel","Florentin Coeurdoux","Michael Samuel Albergo","Eric Vanden-Eijnden"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["generative models","flows","diffusions","conditional generation","pinpointing","fine-tuning"],"abstract_snippet":"We propose a framework for learning maps between probability distributions that broadly generalizes the time dynamics of flow and diffusion models. To enable this, we generalize stochastic interpolants by replacing the scalar time variab...","forum_url":"https://openreview.net/forum?id=9k9ZsDs9Vc","pdf_url":"https://openreview.net/pdf/4317d99e7d5807e6559461fc01627d2cd92de467.pdf","accepted_at":"2025-05-12T06:06:13.063Z"},{"title":"FUDOKI: Discrete Flow-based Unified Understanding and Generation via Kinetic-Optimal Velocities","authors":["Jin Wang","Yao Lai","Aoxue Li","Shifeng Zhang","Jiacheng Sun"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Multimodal Large Language Models (MLLMs); Discrete Flow Matching"],"abstract_snippet":"The rapid progress of large language models (LLMs) has catalyzed the emergence of multimodal large language models (MLLMs) that unify visual understanding and image generation within a single framework. However, most existing MLLMs rely...","forum_url":"https://openreview.net/forum?id=RSVdHXZN6D","pdf_url":"https://openreview.net/pdf/4e0fd77a94401dc935b3e0e27331f1297941e3d1.pdf","accepted_at":"2025-05-12T06:01:15.446Z"},{"title":"Fast MRI for All: Bridging Access Gaps by Training without Raw Data","authors":["Yasar Utku Alcalar","Merve Gulle","Mehmet Akcakaya"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"machine_learning_for_sciences","keywords":["Computational Imaging","Fast MRI","Unsupervised Learning","Compressed Sensing","Deep Learning","Access"],"abstract_snippet":"Physics-driven deep learning (PD-DL) approaches have become popular for improved reconstruction of fast magnetic resonance imaging (MRI) scans. Though PD-DL offers higher acceleration rates than existing clinical fast MRI techniques, the...","forum_url":"https://openreview.net/forum?id=ugBmWX3H1R","pdf_url":"https://openreview.net/pdf/da505502afc71f8c0e5e0d3134118993b6ea153d.pdf","accepted_at":"2025-05-12T05:59:25.782Z"},{"title":"Knowledge Insulating Vision-Language-Action Models: Train Fast, Run Fast, Generalize Better","authors":["Danny Driess","Jost Tobias Springenberg","brian ichter","LILI YU","Adrian Li-Bell"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["vision-language-action models","robotic manipulation","robot learning"],"abstract_snippet":"Vision-language-action (VLA) models provide a powerful approach to training control policies for physical systems, such as robots, by combining end-to-end learning with transfer of semantic knowledge from web-scale vision-language model...","forum_url":"https://openreview.net/forum?id=cb0xbZ3APM","pdf_url":"https://openreview.net/pdf/a125f5bc144a834ceef1946ec665a202b39c5b8c.pdf","accepted_at":"2025-05-12T05:53:13.255Z"},{"title":"Complete Structure Guided Point Cloud Completion via Cluster- and Instance-Level Contrastive Learning","authors":["Yang Chen","Yirun Zhou","WEIZHONG ZHANG","Cheng Jin"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["3D computer Vision","Point cloud","self-supervised point cloud completion","contrastive Learning"],"abstract_snippet":"Point cloud completion, aiming to reconstruct missing part from incomplete point clouds, is a pivotal task in 3D computer vision. Traditional supervised approaches often necessitate complete point clouds for training supervision, which a...","forum_url":"https://openreview.net/forum?id=4f6mEr1DQs","pdf_url":"https://openreview.net/pdf/898e52d701ce00afeb8ad0fde70d19f25cc91569.pdf","accepted_at":"2025-05-12T05:19:00.446Z"},{"title":"ARECHO: Autoregressive Evaluation via Chain-Based Hypothesis Optimization for Speech Multi-Metric Estimation","authors":["Jiatong Shi","Yifan Cheng","Bo-Hao Su","Hye-jin Shim","Jinchuan Tian"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["Speech Evaluation","Speech Assessment","Speech Profiling","Dynamic Classifier Chain"],"abstract_snippet":"Speech signal analysis poses significant challenges, particularly in tasks such as speech quality evaluation and profiling, where the goal is to predict multiple perceptual and objective metrics. For instance, metrics like PESQ (Perceptu...","forum_url":"https://openreview.net/forum?id=P2yIMJP5b1","pdf_url":"https://openreview.net/pdf/dd43f5f5899825b5b2a0670fde272dbc9a0c1879.pdf","accepted_at":"2025-05-12T04:52:48.004Z"},{"title":"Projective Equivariant Networks via Second-order Fundamental Differential Invariants","authors":["Yikang Li","Yeqing Qiu","Yuxuan Chen","Lingshen He","Lexiang Hu"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["equivariant networks","differential invariants","projective group"],"abstract_snippet":"Equivariant networks enhance model efficiency and generalization by embedding symmetry priors into their architectures. However, most existing methods, primarily based on group convolutions and steerable convolutions, face significant li...","forum_url":"https://openreview.net/forum?id=crczm2smVo","pdf_url":"https://openreview.net/pdf/624d31bafbb51c489f233e04c37422b38b7e4d44.pdf","accepted_at":"2025-05-12T04:46:02.113Z"},{"title":"ZeroS: Zero‑Sum Linear Attention for Efficient Transformers","authors":["Jiecheng Lu","Xu Han","Yan Sun","Viresh Pati","Yubin Kim"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"deep_learning","keywords":["Attention Mechanisms","Transformer","Sequence Modeling"],"abstract_snippet":"Linear attention methods offer Transformers $O(N)$ complexity but typically underperform standard softmax attention. We identify two fundamental limitations affecting these approaches: the restriction to convex combinations that only per...","forum_url":"https://openreview.net/forum?id=Ms6IXbfzzX","pdf_url":"https://openreview.net/pdf/53b99866f7a487c410012b2077f3c4dc78c72742.pdf","accepted_at":"2025-05-12T04:41:38.273Z"},{"title":"Blameless Users in a Clean Room: Defining Copyright Protection for Generative Models","authors":["Aloni Cohen"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"theory","keywords":["copyright","generative models","law","definitions","AI and law","differential privacy"],"abstract_snippet":"Are there any conditions under which a generative model’s outputs are guaranteed not to infringe the copyrights of its training data? This is the question of \"provable copyright protection\" first posed by Vyas, Kakade, and Barak [ICML 20...","forum_url":"https://openreview.net/forum?id=V8SndhCN0z","pdf_url":"https://openreview.net/pdf/c42408e8338af818f6a989bc659012264cb38df9.pdf","accepted_at":"2025-05-12T04:40:05.881Z"},{"title":"Adaptive 3D Reconstruction via Diffusion Priors and Forward Curvature-Matching Likelihood Updates","authors":["Seunghyeok Shin","Dabin Kim","Hongki Lim"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"applications","keywords":["3D reconstruction","point cloud generation","diffusion model"],"abstract_snippet":"Reconstructing high-quality point clouds from images remains challenging in computer vision. Existing generative models, particularly diffusion models, based approaches that directly learn the posterior may suffer from inflexibility—they...","forum_url":"https://openreview.net/forum?id=IJLqUjtrls","pdf_url":"https://openreview.net/pdf/4c00ad056aea5a8ef45ec41abc040560d1de88b6.pdf","accepted_at":"2025-05-12T04:38:15.964Z"},{"title":"From Shortcut to Induction Head: How Data Diversity Shapes Algorithm Selection in Transformers","authors":["Ryotaro Kawata","Yujin Song","Alberto Bietti","Naoki Nishikawa","Taiji Suzuki"],"venue_group":"NeurIPS 2025","tier":"Spotlight","primary_area":"theory","keywords":["Transformers","induction head","training dynamics"],"abstract_snippet":"Transformers can implement both generalizable algorithms (e.g., induction heads) and simple positional shortcuts (e.g., memorizing fixed output positions). In this work, we study how the choice of pretraining data distribution steers a s...","forum_url":"https://openreview.net/forum?id=n0QvMU2kON","pdf_url":"https://openreview.net/pdf/9c8c670660a860bd105aa1ba9eccc494261f7690.pdf","accepted_at":"2025-05-12T04:36:25.064Z"}],"notes":["Notable-tier (Oral and Spotlight) accepted papers from current top ML venues on OpenReview. The decision tier is the acceptance signal. Abstracts are clipped; follow forum_url for the full paper. The venue list is curated and refreshed each conference season."],"source":{"name":"OpenReview","url":"https://openreview.net","license":"OpenReview public submission metadata (v2 API). TensorFeed links and summarizes with a clipped abstract; full text and PDFs are not republished."}}