
[1] GAO Feng, DONG Dong, LIU Juncheng. Data Audit for National Student Physical Health Standard Testing: Optimal Selection of Anomaly Detection Algorithms and Empirical Exploration[J]. Sports Science Research, 2025, (2): 59-70.

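Data Audit for National Student Physical Health Standard Testing: Optimal Selection of Anomaly Detection Algorithms and Empirical Exploration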

Sports Science Research [ISSN:1006-6977/CN:61-1281/TN]

Volume:
Issue:
2025, No. 2
Pages:
59-70
Column:
Sports Biological Science
Publication date:
2025-04-15

Article Info

Title:
Data Audit for National Student Physical Health Standard Testing: Optimal Selection of Anomaly Detection Algorithms and Empirical Exploration
Author(s):
GAO Feng (1, 3, 4); DONG Dong (2); LIU Juncheng (2)
1. College of Physical Education, Hebei Normal University, Shijiazhuang 050024, China; 2. College of Computer and Cyber Security, Hebei Normal University, Shijiazhuang 050024, China; 3. Hebei NSPFS Data Management and Service Center, Shijiazhuang 050024, China; 4. Hebei Province Key Lab of Human Movement Bioinformation Measurement and Evaluation, Shijiazhuang 050024, China
Keywords:
national students physical fitness standards testing; data auditing; anomaly detection; isolation forest; optimal algorithm selection; visualization
CLC number:
-
DOI:
-
Document code:
A
Abstract:
Anomaly detection can help auditors quickly screen a dataset for potential anomalies, but each algorithm is suited to only certain scenarios. Taking the college students' National Student Physical Fitness Standard (NSPFS) test dataset as the empirical subject, this study examines the applicability of univariate and multivariate anomaly detection methods from the perspectives of algorithm selection and visualization. Algorithms were selected on 6 benchmark datasets using a meta-learning recommendation tool and the area under the receiver operating characteristic curve (AUC). The results indicate that 1.5IQR univariate detection can provide auditors with audit clues about anomalies in individual test items; the isolation forest (IForest) algorithm is suitable for identifying anomalous points in large-scale NSPFS data; and the ranking of schools by anomaly score, together with grouped scatterplot matrices, can provide visual interpretation of the unsupervised IForest screening results. Anomaly detection that combines IForest with 1.5IQR can therefore help auditors quickly screen for abnormal individuals and high-risk schools in NSPFS data, providing empirical support for building an interactive, computer-aided NSPFS data auditing framework.
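
As a rough illustration of two ideas named in the abstract, the sketch below applies the 1.5IQR rule to a single test item and computes the area under the ROC curve for a set of anomaly scores. It is not the authors' implementation: the function names, the quartile convention, the deviation-from-median score, and the sprint times are all assumptions made up for this example, and the IForest step is not shown.

```python
from statistics import median, quantiles


def iqr_fences(values, k=1.5):
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    # Quartile conventions differ between tools; "inclusive" is an
    # assumption of this sketch, not something the abstract specifies.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def flag_outliers(values, k=1.5):
    """Mark each value that falls outside the k*IQR fences."""
    lower, upper = iqr_fences(values, k)
    return [v < lower or v > upper for v in values]


def roc_auc(labels, scores):
    """AUC as the share of (anomaly, normal) pairs ranked correctly.

    labels: 1 for a known anomaly, 0 for a normal record.
    scores: higher means more anomalous. Ties count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one anomaly and one normal record")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical 50 m sprint times in seconds; 3.1 s is implausibly fast.
sprint_times = [7.2, 7.5, 7.8, 8.0, 8.1, 8.3, 8.6, 9.0, 3.1]
flags = flag_outliers(sprint_times)

# Toy anomaly score: absolute deviation from the median.
center = median(sprint_times)
scores = [abs(t - center) for t in sprint_times]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 1]  # hypothetical ground truth
auc = roc_auc(labels, scores)
```

Here `flags` marks only the 3.1 s record, which an auditor could then follow up as a single-item clue. The pairwise AUC is quadratic in the number of records, so it suits small benchmark checks rather than a full-scale dataset.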

Memo:
Last Update: 2025-05-01