Overview
Personalization is key to understanding user behavior and has long been a central focus of knowledge discovery and information retrieval. Building personalized recommender systems is especially important now, given the vast amount of user-generated textual content, which offers deep insights into user preferences. Recent advancements in Large Language Models (LLMs) have significantly impacted research areas, particularly Natural Language Processing and Knowledge Discovery, giving these models the ability to handle complex tasks and learn from context.
However, the use of generative models and user-generated text for personalized systems and recommendation is relatively new and has already shown promising results. This workshop is designed to bridge the research gap between these fields and explore personalized applications and recommender systems. We aim to fully leverage generative models to develop AI systems that are not only accurate but also focused on meeting individual user needs. Building on the momentum of previous successful forums, this workshop seeks to engage a diverse audience from academia and industry, fostering a dialogue that incorporates fresh insights from key stakeholders in the field.
Call for papers
We welcome papers that leverage generative models for recommendation and personalization, on topics including but not limited to those listed in the Call for Papers. Papers can be submitted via OpenReview. Submissions may be 2-8 pages (excluding references and supplementary materials).
Information for the day of the workshop
Workshop at WSDM 2026
- Submission deadline: November 28, 2025 (extended from November 21, 2025)
- Author notifications: December 18, 2025
- Meeting: February 26, 2026
Schedule
| Time (MST) | Agenda |
|---|---|
| 8:55 - 9:00am | Opening remarks |
| 9:00 - 9:45am | Keynote by Dr. Meng Jiang (45 min) |
| 9:45 - 10:30am | Keynote by Dr. Yinglong Xia (45 min) |
| 10:30 - 11:00am | Coffee Break (30 min) |
| 11:00 - 11:45am | Oral Session 1 (45 min): AGP: Auto-Guided Prompt Refinement for Personalized Reranking in Recommender Systems • InsertRank: LLMs can Reason over BM25 Scores to Improve Listwise Reranking • Selective LLM-Guided Regularization for Enhancing Recommendation Models |
| 11:45am - 12:30pm | Keynote by Dr. Chirag Shah (45 min) |
| 12:30 - 1:45pm | Lunch (75 min) |
| 1:45 - 2:30pm | Keynote by Dr. Nathan Kallus (45 min) |
| 2:30 - 3:30pm | Oral Session 2 (60 min): Joint Evaluation: A Human+LLM+Multi-Agents Collaborative Framework for Comprehensive AI Safety • Multi-Agent Video Recommenders: Evolution, Patterns and Open Challenges • Large-Scale Retrieval for the LinkedIn Feed using Causal Language Models • Agentic Orchestration for Adaptive Educational Recommendations: A Multi-Agent LLM Framework for Personalized Learning Pathways |
| 3:30 - 4:00pm | Coffee Break (30 min) |
| 4:00 - 4:20pm | Invited Talk by Liam Collins (20 min): Sequential Data Augmentation for Generative Recommendation |
| 4:20 - 4:40pm | Invited Talk by Chengyi Liu (20 min): Continuous Time Discrete-space Diffusion Model for Recommendation |
| 4:40 - 4:50pm | Closing remarks |
Keynote Speakers
Chirag Shah
University of Washington
Beyond the Personalization-Privacy Pareto: Can AI Agents Break the Tradeoff?
Yinglong Xia
Meta
The Graph Trilogy: Unlocking Personalization at Scale
Meng Jiang
University of Notre Dame
What Does It Mean to Personalize a Language Model? Insights from RecSys to LLMs
Nathan Kallus
Cornell Tech, Netflix
LLM Post-Training and Reasoning via Efficient Value-Based RL
Accepted Papers
- InsertRank: LLMs can Reason over BM25 Scores to Improve Listwise Reranking
  Rahul Seetharaman
  Abstract: Large Language Models (LLMs) have demonstrated significant strides across various information retrieval tasks, particularly as rerankers, owing to their strong generalization and knowledge-transfer capabilities acquired from extensive pretraining. In parallel, the rise of LLM-based chat interfaces has raised user expectations, encouraging users to pose more complex queries that necessitate retrieval by reasoning over documents rather than through simple keyword matching or semantic similarity. While some recent efforts have exploited the reasoning abilities of LLMs for reranking such queries, considerable potential for improvement remains. In that regard, we introduce InsertRank, an LLM-based reranker that leverages lexical signals like BM25 scores during reranking to further improve retrieval performance. InsertRank demonstrates improved retrieval effectiveness on BRIGHT, a reasoning benchmark spanning 12 diverse domains, and R2MED, a specialized medical reasoning retrieval benchmark spanning 8 different tasks. We conduct an exhaustive evaluation and several ablation studies, and demonstrate that InsertRank consistently improves retrieval effectiveness across multiple families of LLMs, including GPT, Gemini, and Deepseek models.
- Selective LLM-Guided Regularization for Enhancing Recommendation Models
  Zhan Shi, Shanglin Yang
  Abstract: Large language models (LLMs) provide rich semantic priors and strong reasoning capabilities, making them promising auxiliary signals for recommendation. However, prevailing approaches either deploy LLMs as standalone recommenders or apply global knowledge distillation, both of which suffer from inherent drawbacks. Standalone LLM recommenders are costly, biased, and unreliable across large regions of the user-item space, while global distillation forces the downstream model to imitate LLM predictions even when such guidance is inaccurate. Meanwhile, recent studies show that LLMs excel particularly in re-ranking and challenging scenarios rather than uniformly across all contexts. We introduce Selective LLM-Guided Regularization (S-LLMR), a model-agnostic and computation-efficient framework that activates LLM-based pairwise ranking supervision only when a trainable gating mechanism, informed by user history length, item popularity, and model uncertainty, predicts the LLM to be reliable. All LLM scoring is done offline, transferring knowledge without increasing inference cost. Experiments across multiple datasets show that this selective strategy consistently improves overall accuracy and yields substantial gains in cold-start and long-tail regimes, outperforming global distillation baselines.
- AGP: Auto-Guided Prompt Refinement for Personalized Reranking in Recommender Systems
  Chen Wang, Mingdai Yang, Zhiwei Liu, Pan Li, Linsey Pang, Qingsong Wen, Philip S. Yu
  Abstract: Reranking plays a critical role in recommendation systems by refining initial predictions to better reflect user preferences. While large language models (LLMs) have shown promise in enhancing reranking through contextual reasoning, they still rely heavily on manually crafted prompts, an approach that is both labor-intensive and difficult to scale. Although prompt optimization has been studied in domains like question answering and news recommendation, its adaptation to general item recommendation remains limited due to the unstructured and inconsistent nature of item metadata. To address these challenges, we propose Auto-Guided Prompt Refinement (AGP), a novel framework that automatically refines user-profile generation prompts instead of reranking prompts directly. AGP leverages position-based feedback, which encodes item-level ranking misalignments, and introduces batched training with aggregated feedback to ensure robust and generalizable prompt updates. Experimental results on Amazon Movies and TV, Yelp, and Goodreads demonstrate AGP's effectiveness. With only 100 training users, AGP improves NDCG@10 by 5.61%, 2.46%, and 6.18% when reranking SASRec, and by 9.36%, 7.98%, and 20.68% when reranking LightGCN.
- Joint Evaluation Framework for Comprehensive AI Safety Assessment
  Himanshu Joshi, Shivani Shukla, Priyanka Kumar
  Abstract: Evaluating the safety and alignment of AI systems remains a critical challenge as foundation models grow increasingly sophisticated. Traditional evaluation methods rely heavily on human expert review, creating bottlenecks that cannot scale with rapid AI development. We introduce Jo-E (Joint Evaluation), a multi-agent collaborative framework that systematically coordinates large language model evaluators, specialized adversarial agents, and strategic human expert involvement for comprehensive safety assessments. Our framework employs a five-phase evaluation pipeline with explicit mechanisms for conflict resolution, severity scoring, and adaptive escalation. Through extensive experiments on GPT-4o, Claude 3.5 Sonnet, Llama 3.1 70B, and Phi-3-medium, we demonstrate that Jo-E achieves 94.2% detection accuracy, compared to 78.3% for single LLM-as-Judge approaches and 86.1% for Agent-as-Judge baselines, while reducing human expert time by 54% compared to pure human evaluation.
- Multi-Agent Video Recommenders: Evolution, Patterns and Open Challenges
  Srivaths Ranganathan, Abhishek Dharmaratnakar, Anushree Sinha, Debanshu Das
  Abstract: Video recommender systems are among the most popular and impactful applications of AI, shaping content consumption and influencing culture for billions of users. Traditional single-model recommenders, which optimize static engagement metrics, are increasingly limited in addressing the dynamic requirements of modern platforms. In response, multi-agent architectures are redefining how video recommender systems serve, learn, and adapt to both users and datasets. These agent-based systems coordinate specialized agents responsible for video understanding, reasoning, memory, and feedback to provide precise, explainable recommendations. In this survey, we trace the evolution of multi-agent video recommendation systems (MAVRS). We combine ideas from multi-agent recommender systems, foundation models, and conversational AI, culminating in the emerging field of large language model (LLM)-powered MAVRS. We present a taxonomy of collaborative patterns and analyze coordination mechanisms across diverse video domains, ranging from short-form clips to educational platforms.
- Large-Scale Retrieval for the LinkedIn Feed using Causal Language Models
  Sudarshan Srinivasa Ramanujam, Antonio Alonso, Saurabh Kataria, Siddharth Dangi, Akhilesh Gupta, Birjodh Singh Tiwana, Manas Somaiya, Luke Simon, David Byrne, Sojeong Ha, Sen Zhou, Andrei Akterskii, Zhanglong Liu, Samira Sriram, Zihan Xiong, Zhoutao Pei, Angela Shao, Alexander Li, Annie Xiao, Caitlin Kolb, Thomas Kistler, Zach Moore, Hamed Firooz
- Agentic Orchestration for Adaptive Educational Recommendations: A Multi-Agent LLM Framework for Personalized Learning Pathways
  Naina Chaturvedi, Ananda Gunawardena
  Abstract: Educational personalization represents a unique challenge for recommender systems: learners require not just content recommendations, but dynamic curriculum adaptation, real-time feedback, and proactive intervention strategies that evolve over extended timescales. We present a novel multi-agent architecture that treats educational personalization as an emergent property of specialized agent collaboration rather than a monolithic recommendation model. Our framework deploys 18+ coordinated agents organized in a four-tier hierarchy spanning perception, domain expertise, coordination, and strategic planning. Through deployment on a learning platform serving 6,000+ active users, we demonstrate that hierarchical agent orchestration enables recommendation capabilities unachievable by single-model approaches: parallel domain-specific analysis, temporal stratification from millisecond feedback to multi-month roadmap generation, and graceful degradation under partial failures. We present the architectural principles, coordination protocols, and preliminary evidence that agentic systems offer a promising paradigm for next-generation personalized learning systems. Our work contributes both a concrete implementation blueprint and theoretical foundations for applying multi-agent LLM orchestration to complex recommendation domains beyond education.
Organizers
Narges Tabari
AWS AI Labs
Aniket Deshmukh
Databricks
Wang-Cheng Kang
Google DeepMind
Neil Shah
Snap Research
Julian McAuley
University of California, San Diego
James Caverlee
Texas A&M University
George Karypis
University of Minnesota
Program Committee
- TBD Member 1 (TBD Organization)
- TBD Member 2 (TBD Organization)
- TBD Member 3 (TBD Organization)