Revised: 2025-04-09
Abstract: The quantity and quality of training data are critical to the performance of artificial intelligence (AI) models. In China, however, training data production currently suffers from insufficient quantity, low quality, and fragmented distribution, and is constrained by the commercial ecosystem, by regulatory policy, and by limited development and utilization of public data. To address these problems, this study proposes a set of policy recommendations: encouraging research institutions to produce open-source datasets, building AI application scenarios, adopting a "lenient entry, strict exit" regulatory approach, introducing intellectual property exemption provisions, refining the implementing rules for personal information protection, and accelerating the construction of a unified national public data platform.
Keywords: artificial intelligence; training data; data element circulation; intellectual property; personal information protection; public data
Cite this article:
林韬.我国AI训练数据生产流通的制约因素与应对策略研究[J].中国科学院院刊,2025,40(4):672-680.
LIN Tao.Study on constraints and policy responses for production and circulation of AI training data in China[J].Bulletin of Chinese Academy of Sciences,2025,40(4):672-680.