Overview
Personalization is central to understanding user behavior and has long been a focus of knowledge discovery and information retrieval. Building personalized recommender systems is especially important now given the vast amount of user-generated textual content, which offers deep insight into user preferences. Recent advances in Large Language Models (LLMs) have reshaped research across Natural Language Processing and Knowledge Discovery, enabling these models to handle complex tasks and capture context.
However, using generative models and user-generated text for personalization and recommendation is still relatively new, and early results are promising. This workshop is designed to bridge the research gap between these fields and to explore personalized applications and recommender systems. We aim to fully leverage generative models to build AI systems that are not only accurate but also attentive to individual user needs. Building on the momentum of previous successful forums, the workshop seeks to engage a diverse audience from academia and industry, fostering a dialogue that incorporates fresh insights from key stakeholders in the field.
Call for papers
We welcome papers that leverage generative models for recommendation and personalization, on topics including but not limited to those listed in the CFP. Papers can be submitted via OpenReview.
Information for the day of the workshop
Workshop at KDD 2025
- Submission deadline: May 18th, 2025 (extended from May 8th, 2025)
- Author notifications: June 8th, 2025
- Meeting: August 4th, 2025
Schedule
| Time (EDT) | Agenda |
|---|---|
| 1:00-1:10pm | Opening remarks |
| 1:10-1:50pm | Keynote by Dr. Ed Chi (40 min) |
| 1:50-2:30pm | Keynote by Dr. Luna Dong (40 min) |
| 2:30-3:00pm | Poster Session (30 min) |
| 3:00-3:30pm | Coffee Break + Poster Spillover (30 min) |
| 3:30-4:10pm | Keynote by Dr. Dong Wang (40 min) |
| 4:10-5:00pm | Panel Discussion (50 min). Panelists: Ed Chi, Dong Wang, Jundong Li, Neil Shah |
Keynote Speakers
Ed Chi
Google DeepMind
Title TBD
Luna Dong
Meta
From Sight to Insight: Visual Memory for Smarter Assistants
Dong Wang
University of Illinois Urbana-Champaign
Harnessing Generative AI for Efficient Multimodal Recommender Systems and Privacy-preserving Personalized Image Generation
Panelists
Ed Chi
Google DeepMind
Dong Wang
University of Illinois Urbana-Champaign
Jundong Li
University of Virginia
Neil Shah
Snap Research
Accepted Papers
- Best Paper Award: LLM-based Conversational Recommendation Agents with Collaborative Verbalized Experience
Yaochen Zhu, Harald Steck, Dawen Liang, Yinhan He, Nathan Kallus, Jundong Li
Abstract: Large language models (LLM) have demonstrated impressive zero-shot capabilities in conversational recommender systems (CRS). However, effectively utilizing historical conversations remains a significant challenge. Current approaches either retrieve few-shot examples or extract global rules to augment the prompt for LLM-based CRSs, which fail to capture the implicit and preference-oriented knowledge. To address the above challenge, we propose LLM-based Conversational Recommendation Agents with Collaborative Verbalized Experience (CRAVE). CRAVE starts by sampling trajectories of LLM-based CRS agents on historical queries and establishing verbalized experience banks by reflecting the agents' actions on user feedback. Additionally, we introduce a collaborative retriever network finetuned with content-parameterized multinomial likelihood on query-items pairs to retrieve preference-oriented verbal experiences for new queries. Furthermore, we developed a debater-critic agent (DCA) system where each agent maintains an independent collaborative experience bank and works together to enhance the CRS recommendations. We demonstrate that the open-ended debate and critique nature of DCA benefits significantly from the collaborative experience augmentation with CRAVE.
- C-TLSAN: Content-Enhanced Time-Aware Long- and Short-Term Attention Network for Personalized Recommendation
Siqi Liang, Yudi Zhang, Yubo Wang
Abstract: Sequential recommender systems aim to model users' evolving preferences by capturing patterns in their historical interactions. Recent advances in this area have leveraged deep neural networks and attention mechanisms to effectively represent sequential behaviors and time-sensitive interests. In this work, we propose C-TLSAN (Content-Enhanced Time-Aware Long- and Short-Term Attention Network), an extension of the TLSAN architecture that jointly models long- and short-term user preferences while incorporating semantic content associated with items—such as product descriptions. C-TLSAN enriches the recommendation pipeline by embedding textual content linked to users' historical interactions directly into both long-term and short-term attention layers. This allows the model to learn from both behavioral patterns and rich item content, enhancing user and item representations across temporal dimensions. By fusing sequential signals with textual semantics, our approach improves the expressiveness and personalization capacity of recommendation systems. We conduct extensive experiments on large-scale Amazon datasets, benchmarking C-TLSAN against state-of-the-art baselines, including recent sequential recommenders based on Large Language Models (LLMs), which represent interaction history and predictions in text form. Empirical results demonstrate that C-TLSAN consistently outperforms strong baselines in next-item prediction tasks. Notably, it improves AUC by 1.66%, Recall@10 by 93.99%, and Precision@10 by 94.80% on average over the best-performing baseline (TLSAN) across 10 Amazon product categories. These results highlight the value of integrating content-aware enhancements into temporal modeling frameworks for sequential recommendation.
- Not Just What, But When: Integrating Irregular Intervals to LLM for Sequential Recommendation
Wei-Wei Du, Takuma Udagawa, Kei Tateno
Abstract: Time intervals between purchasing items are a crucial factor in sequential recommendation tasks, whereas existing approaches focus on item sequences and often overlook them by assuming the intervals between items are static. However, dynamic intervals serve as a dimension that describes user profiling on not only the history within a user but also different users with the same item history. In this work, we propose IntervalLLM, a novel framework that integrates interval information into LLM and incorporates the novel interval-infused attention to jointly consider information of items and intervals. Furthermore, unlike prior studies that address the cold-start scenario only from the perspectives of users and items, we introduce a new viewpoint, the interval perspective, to serve as an additional metric for evaluating recommendation methods on the warm and cold scenarios. Extensive experiments on 3 benchmarks with both traditional- and LLM-based baselines demonstrate that our IntervalLLM achieves not only 4.4% improvements on average but also the best-performing warm and cold scenarios across all users, items, and the proposed interval perspectives. In addition, we observe that the cold scenario from the interval perspective experiences the most significant performance drop among all recommendation methods. This finding underscores the necessity of further research on interval-based cold challenges and our integration of interval information in the realm of sequential recommendation tasks.
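For readers new to interval-aware sequential recommendation, the sketch below shows one generic way to turn irregular gaps between interactions into discrete features that a model could embed. It is only an illustration of the general idea, not IntervalLLM's interval-infused attention; the bucket count, log spacing, and second-based timestamps are assumptions.

```python
import numpy as np

def interval_bucket_ids(timestamps, num_buckets=16, max_days=365):
    """Map irregular gaps between consecutive interactions to discrete, log-spaced
    bucket ids, which could then be looked up in an embedding table."""
    ts = np.sort(np.asarray(timestamps, dtype=float))
    gaps_days = np.diff(ts) / 86400.0                     # seconds -> days
    edges = np.geomspace(1.0, max_days, num_buckets - 1)  # log-spaced boundaries
    return np.digitize(gaps_days, edges)                  # ids in [0, num_buckets - 1]

# Example: three purchases spaced 2 days and 40 days apart
print(interval_bucket_ids([0, 2 * 86400, 42 * 86400]))
```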
- LLM-Enhanced Reranking for Complementary Product Recommendation
Zekun Xu, Yudi Zhang
Abstract: Complementary product recommendation, which aims to suggest items that are used together to enhance customer value, is a crucial yet challenging task in e-commerce. While existing graph neural network (GNN) approaches have made significant progress in capturing complex product relationships, they often struggle with the accuracy-diversity tradeoff, particularly for long-tail items. This paper introduces a model-agnostic approach that leverages Large Language Models (LLMs) to enhance the reranking of complementary product recommendations. Unlike previous works that use LLMs primarily for data preprocessing and graph augmentation, our method applies LLM-based prompting strategies directly to rerank candidate items retrieved from existing recommendation models, eliminating the need for model retraining. Through extensive experiments on public datasets, we demonstrate that our approach effectively balances accuracy and diversity in complementary product recommendations, with at least 50% lift in accuracy metrics and 2% lift in diversity metrics on average for the top recommended items across datasets.
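As a rough illustration of model-agnostic LLM reranking (not the authors' prompts or pipeline), the sketch below asks an LLM to reorder candidates retrieved by an existing recommender; `complete` is a hypothetical stand-in for any text-completion client, and the prompt wording is an assumption.

```python
def rerank_with_llm(query_item, candidates, complete, top_k=5):
    """Ask an LLM (via the caller-supplied `complete(prompt) -> str`) to reorder
    candidate complementary items, favoring relevance and diversity."""
    numbered = "\n".join(f"{i + 1}. {c}" for i, c in enumerate(candidates))
    prompt = (
        f"A customer is viewing: {query_item}\n"
        f"Candidate complementary products:\n{numbered}\n"
        f"Return the numbers of the {top_k} best complements, most relevant first, "
        "favoring diverse product types. Answer with numbers separated by commas."
    )
    reply = complete(prompt)
    # Parse the numbered answer back into candidate indices.
    order = [int(tok) - 1 for tok in reply.replace(",", " ").split() if tok.isdigit()]
    ranked = [candidates[i] for i in order if 0 <= i < len(candidates)]
    return ranked[:top_k] or candidates[:top_k]  # fall back to the original order
```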
- Dynamic Context-Aware Prompt Recommendation for Domain-Specific AI Applications
Xinye Tang, Haijun Zhai, Chaitanya Belwal, Vineeth Thayanithi, Philip Baumann, Yogesh K Roy
Abstract: LLM-powered applications are highly susceptible to the quality of user prompts, and crafting high-quality prompts can often be challenging, especially for domain-specific applications. This paper presents a novel dynamic context-aware prompt recommendation system for domain-specific AI applications. Our solution combines contextual query analysis, retrieval-augmented knowledge grounding, hierarchical skill organization, and adaptive skill ranking to generate relevant and actionable prompt suggestions. The system leverages behavioral telemetry and a two-stage hierarchical reasoning process to dynamically select and rank relevant skills, and synthesizes prompts using both predefined and adaptive templates enhanced with few-shot learning. Experiments on real-world datasets demonstrate that our approach achieves high usefulness and relevance, as validated by both automated and expert evaluations.
- Robustness of LLM-Initialized Bandits for Recommendation Under Noisy Priors
Adam Bayley, Kevin H. Wilson, Yanshuai Cao, Raquel Aoki, Xiaodan Zhu
Abstract: Contextual bandits have proven effective for building personalized recommender systems, yet they suffer from the cold-start problem when little user interaction data is available. Recent work has shown that Large Language Models (LLMs) can help address this by simulating user preferences to warm-start bandits—a method known as Contextual Bandits with LLM Initialization (CBLI). While CBLI reduces early regret, it is unclear how robust the approach is to inaccuracies in LLM-generated preferences. In this paper, we extend the CBLI framework to systematically evaluate its sensitivity to noisy LLM priors. We inject both random and label-flipping noise into the synthetic training data and measure how these affect cumulative regret across three tasks generated from conjoint-survey datasets. Our results show that CBLI is robust to random corruption but exhibits clear breakdown thresholds under preference flipping: warm-starting remains effective up to 30% corruption, loses its advantage around 40%, and degrades performance beyond 50%. We further observe diminishing returns with larger synthetic datasets: beyond a point, more data can reinforce bias rather than improve performance under noisy conditions. These findings offer practical insights for deploying LLM-assisted decision systems in real-world recommendation scenarios.
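The kind of preference-flipping corruption studied here can be simulated in a few lines. The sketch below is a generic version that assumes the synthetic warm-start data is a list of (chosen, rejected) pairs; it is not the paper's exact protocol.

```python
import random

def flip_preferences(pairs, flip_rate, seed=0):
    """pairs: list of (chosen, rejected) synthetic preference pairs, e.g. from an LLM.
    Randomly swap a fraction of them to simulate preference-flipping noise."""
    rng = random.Random(seed)
    noisy = []
    for chosen, rejected in pairs:
        if rng.random() < flip_rate:
            noisy.append((rejected, chosen))  # corrupted label
        else:
            noisy.append((chosen, rejected))
    return noisy

# One would then warm-start the bandit with `noisy` and compare cumulative regret
# at, say, 0%, 30%, 40%, and 50% corruption.
```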
- Towards Large-scale Generative Ranking
Yanhua Huang, Yuqi Chen, Xiong Cao, Rui Yang, Mingliang Qi, Yinghao Zhu, Qingchang Han, Yaowei Liu, Zhaoyu Liu, Xuefeng Yao, Yuting Jia, Leilei Ma, Yinqi Zhang, Taoyu Zhu, Liujie Zhang, Lei Chen, Weihang Chen, Min Zhu, Ruiwen Xu, Lei Zhang
Abstract: Generative recommendation has recently emerged as a promising paradigm in information retrieval. However, generative ranking systems are still understudied, particularly with respect to their effectiveness and feasibility in large-scale industrial settings. This paper investigates this topic at the ranking stage of Xiaohongshu's Explore Feed, a recommender system that serves hundreds of millions of users. Specifically, we first examine how generative ranking outperforms current industrial recommenders. Through theoretical and empirical analyses, we find that the primary improvement in effectiveness stems from the generative architecture, rather than the training paradigm. To facilitate efficient deployment of generative ranking, we introduce GenRank, a novel generative architecture for ranking. We validate the effectiveness and efficiency of our solution through online A/B experiments. The results show that GenRank achieves significant improvements in user satisfaction with nearly equivalent computational resources compared to the existing production system.
- Enhancing Text Classification with a Novel Multi-Agent Collaboration Framework Leveraging BERT
Hediyeh Baban, Sai Abhishek Pidaparthi, Sichen Lu, Aashutosh Nema, Samaksh Gulati
Abstract: We present a multi-agent collaboration framework that enhances text classification by dynamically routing low-confidence BERT predictions to specialized agents—Lexical, Contextual, Logic, Consensus, and Explainability. This escalation mechanism enables deeper analysis and consensus-driven decisions. Across four benchmark datasets, our system improves classification accuracy by up to 5.5% over standard BERT, offering a scalable and interpretable solution for robust NLP.
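A minimal sketch of the confidence-based routing idea, assuming the base classifier exposes class probabilities; the specialist agents themselves are out of scope here and are represented only by a placeholder escalation path, and the threshold value is an arbitrary assumption.

```python
def route_prediction(probs, labels, threshold=0.9):
    """probs: class-probability list for one text from a base classifier (e.g. BERT).
    Accept confident predictions; escalate the rest to specialist agents."""
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] >= threshold:
        return {"label": labels[best], "source": "base-classifier"}
    # Escalation path: in the paper this consults Lexical / Contextual / Logic /
    # Consensus / Explainability agents; here it is just a placeholder hook.
    return {"label": None, "source": "escalate-to-agents", "candidates": labels}

print(route_prediction([0.55, 0.45], ["positive", "negative"]))
```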
- Optimizing Retrieval-Augmented Generation with Multi-Agent Hybrid Retrieval
Hediyeh Baban, Sai Abhishek Pidaparthi, Samaksh Gulati, Aashutosh Nema
Abstract: With the rapid growth of digital content and scientific literature, efficient information retrieval is increasingly vital for research automation, document management, and question answering. Traditional retrieval methods like BM25 and embedding-based search, though effective individually, often fall short on complex queries. We propose an Agentic AI workflow for Retrieval-Augmented Generation (RAG), integrating hybrid retrieval with multi-agent collaboration. Our system combines BM25 and semantic search, ensembles results via weighted cosine similarity, and applies contextual reordering using large language models. The workflow is powered by LangGraph, a multi-agent framework enabling dynamic agent coordination for document ranking and filtering. Experiments show a 4× reduction in retrieval latency (43s to 11s) and a 7% improvement in relevance accuracy. We also analyze weight sensitivity and discuss scalability.
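A generic sketch of hybrid score fusion in the spirit of the system described above: lexical (BM25) and dense scores are normalized and combined with a tunable weight. This is not the paper's LangGraph workflow or its exact ensembling formula; the weight and example scores are assumptions.

```python
import numpy as np

def fuse_scores(bm25_scores, dense_scores, weight=0.6):
    """Min-max normalize each score list, then blend with a lexical/semantic weight."""
    def norm(x):
        x = np.asarray(x, dtype=float)
        span = x.max() - x.min()
        return (x - x.min()) / span if span > 0 else np.zeros_like(x)
    return weight * norm(bm25_scores) + (1 - weight) * norm(dense_scores)

bm25 = [12.1, 3.4, 7.8]     # e.g. lexical scores from a BM25 index
dense = [0.82, 0.75, 0.31]  # e.g. cosine similarities from an embedding model
ranked = np.argsort(-fuse_scores(bm25, dense))
print(ranked)               # document indices, best first
```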
- End-to-End Personalization: Unifying Recommender Systems with Large Language Models
Danial Ebrat, Tina Aminian, Sepideh Ahmadian, Luis Rueda
Abstract: Recommender systems are essential for guiding users through the vast and diverse landscape of digital content by delivering personalized and relevant suggestions. However, improving both personalization and interpretability remains a challenge, particularly in scenarios involving limited user feedback or heterogeneous item attributes. In this article, we propose a novel hybrid recommendation framework that combines Graph Attention Networks (GATs) with Large Language Models (LLMs) to address these limitations. LLMs are first used to enrich user and item representations by generating semantically meaningful profiles based on metadata such as titles, genres, and overviews. These enriched embeddings serve as initial node features in a user–movie bipartite graph, which is processed using a GAT-based collaborative filtering model. To enhance ranking accuracy, we introduce a hybrid loss function that combines Bayesian Personalized Ranking (BPR), cosine similarity, and robust negative sampling. Post-processing involves reranking the GAT-generated recommendations using the LLM, which also generates natural-language justifications to improve transparency. We evaluate our model on benchmark datasets, including MovieLens 100k and 1M, where it consistently outperforms strong baselines. Ablation studies confirm that LLM-based embeddings and the cosine similarity term significantly contribute to performance gains. This work demonstrates the potential of integrating LLMs to improve both the accuracy and interpretability of recommender systems.
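For intuition, a minimal PyTorch sketch of a loss that mixes a BPR term with a cosine-alignment term. The equal-weight blend, the dot-product scoring, and the absence of the paper's negative-sampling scheme are simplifying assumptions, not the authors' exact formulation.

```python
import torch
import torch.nn.functional as F

def hybrid_bpr_cosine_loss(user_emb, pos_emb, neg_emb, alpha=0.5):
    """BPR pairwise term plus a cosine-alignment term between user and positive item."""
    pos_scores = (user_emb * pos_emb).sum(dim=-1)
    neg_scores = (user_emb * neg_emb).sum(dim=-1)
    bpr = -F.logsigmoid(pos_scores - neg_scores).mean()          # prefer pos over neg
    cos = 1.0 - F.cosine_similarity(user_emb, pos_emb, dim=-1).mean()  # pull directions together
    return bpr + alpha * cos

# Example with random embeddings for a batch of 8 users
u, p, n = (torch.randn(8, 64) for _ in range(3))
print(hybrid_bpr_cosine_loss(u, p, n))
```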
- PromptShield: A Hybrid Framework for Copyright-Safe Text-to-Image Generation
Shreya Garg
Abstract: Text-to-image diffusion models are increasingly used in commercial creative workflows, including automated design generation for gift cards. However, these models—trained on large-scale web data—are prone to unintentionally generating content that infringes on copyrighted or trademarked material, particularly in the form of stylistic mimicry or semantic similarity to known intellectual property (IP). We propose PromptShield, a hybrid, dataset-free framework for proactively mitigating copyright risk in generative pipelines. PromptShield integrates three lightweight components: (1) zero-shot sentence transformer-based prompt filtering to flag high-risk queries, (2) prompt rewriting using large language models (LLMs) to preserve creative intent while removing IP cues, and (3) style regularization at image generation time using negative prompting and classifier-free guidance. Applied to the domain of Amazon Gift Card design, PromptShield achieves a 92% reduction in IP-risky generations without degrading image quality or prompt-image alignment. Our method enables scalable, safe design generation.
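A minimal sketch of zero-shot similarity-based prompt filtering, assuming a caller-supplied `embed` function standing in for any sentence-embedding model and an illustrative threshold; PromptShield's actual filter, rewriting, and guidance stages are not reproduced here.

```python
import numpy as np

def is_ip_risky(prompt, reference_phrases, embed, threshold=0.6):
    """Flag a prompt whose embedding is too similar to known IP-risky reference phrases.
    `embed(list_of_texts)` is assumed to return a 2D numpy array of embeddings."""
    p = embed([prompt])[0]
    refs = embed(reference_phrases)
    sims = refs @ p / (np.linalg.norm(refs, axis=1) * np.linalg.norm(p) + 1e-9)
    return bool(sims.max() >= threshold)

# Flagged prompts would then be rewritten or blocked before image generation.
```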
Organizers
Narges Tabari
AWS AI Labs
Aniket Deshmukh
AWS AI Labs
Wang-Cheng Kang
Google DeepMind
Neil Shah
Snap Research
Julian McAuley
University of California, San Diego
James Caverlee
Texas A&M University
George Karypis
University of Minnesota