Reliable and Responsible Foundation Models: A Comprehensive Survey
Xinyu Yang, Junlin Han, Rishi Bommasani, Jinqi Luo, Wenjie Qu, Wangchunshu Zhou, Adel Bibi, Xiyao Wang, Jaehong Yoon, Elias Stengel-Eskin, Shengbang Tong, Lingfeng Shen, Rafael Rafailov, Runjia Li, Zhaoyang Wang, Yiyang Zhou, Chenhang Cui, Yu Wang, Wenhao Zheng, Huichi Zhou, Jindong Gu, Zhaorun Chen, Peng Xia, Tony Lee, Thomas Zollo, Vikash Sehwag, Jixuan Leng, Jiuhai Chen, Yuxin Wen, Huan Zhang, Zhun Deng, Linjun Zhang, Pavel Izmailov, Pang Wei Koh, Yulia Tsvetkov, Andrew Wilson, Jiaheng Zhang, James Zou, Cihang Xie, Hao Wang, Philip Torr, Julian McAuley, David Alvarez-Melis, Florian Tramèr, Kaidi Xu, Suman Jana, Chris Callison-Burch, Rene Vidal, Filippos Kokkinos, Mohit Bansal, Beidi Chen, Huaxiu Yao
TL;DR
This comprehensive survey analyzes reliable and responsible foundation models across nine dimensions (bias and fairness, alignment, security, privacy, hallucination, uncertainty, distribution shift, explainability, and AIGC detection) for four core model families (LLMs, MLLMs, image generative models, and video generative models). It clarifies definitions, surveys state-of-the-art methods, and outlines concrete future directions, emphasizing interactions and trade-offs among dimensions and modalities. The work highlights data- and model-centric approaches (SFT, RLHF, instruction tuning, visual instruction tuning), multi-modal alignment challenges, and cross-cutting concerns such as risk, ethics, and governance. By connecting mechanisms like retrieval-augmented generation, diffusion-based image synthesis, and watermark-based AIGC detection, the survey provides a holistic blueprint for building reliable, robust, and socially responsible foundation models with practical implications for industry, academia, and policy. The findings underscore the importance of cross-disciplinary research, standardized evaluation, and continuous auditing to ensure foundation models remain powerful yet trustworthy as they scale and permeate real-world domains.
Abstract
Foundation models, including Large Language Models (LLMs), Multimodal Large Language Models (MLLMs), Image Generative Models (i.e., Text-to-Image Models and Image-Editing Models), and Video Generative Models, have become essential tools with broad applications across domains such as law, medicine, education, finance, and science. As these models see increasing real-world deployment, ensuring their reliability and responsibility has become critical for academia, industry, and government. This survey addresses the reliable and responsible development of foundation models. We explore critical issues, including bias and fairness, security and privacy, uncertainty, explainability, and distribution shift. Our research also covers model limitations, such as hallucinations, as well as methods like alignment and Artificial Intelligence-Generated Content (AIGC) detection. For each area, we review the current state of the field and outline concrete future research directions. Additionally, we discuss the intersections between these areas, highlighting their connections and shared challenges. We hope our survey fosters the development of foundation models that are not only powerful but also ethical, trustworthy, reliable, and socially responsible.
