A text classification method for hazards in the ATM system based on NLP
GUO Jiuxia
Abstract:
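To prevent unsafe events in the air traffic management (ATM) system, this study investigates a text classification method for ATM system hazards using the Human Factors Analysis and Classification System (HFACS) model and natural language processing (NLP) techniques. A classification index system for ATM system hazards is built on an improved HFACS model. The hazard database of the civil aviation ATM safety management system is taken as the raw corpus, divided into five levels, and coded. To address the small sample size, multiple labels, and class imbalance of the ATM hazard database, experiments are run with a text classification method based on TFIDF-TextRank keyword extraction and with text classification methods based on the CNN and BERT models. The results show that the precision and recall of the TFIDF-TextRank keyword extraction method are clearly higher than those of the CNN- and BERT-based methods. Keyword extraction can therefore handle text classification on small corpora effectively, and it supports further study of the formation mechanism of unsafe events in the ATM system.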
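Keywords: safety social engineering; ATM system; hazards; HFACS model; TFIDF-TextRank method; text classification
Foundation: Civil Aviation Administration of China Safety Capability Building Project (CAAC Contract (2021) No. 87); China University Industry-University-Research Innovation Fund (2021ALA02025)
Authors: GUO Jiuxia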
DOI: 10.13637/j.issn.1009-6094.2021.1687
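The abstract names TFIDF-TextRank keyword extraction but does not say how the two scores are combined. The following is a minimal sketch of one plausible combination, not the author's implementation. It assumes each hazard record is already tokenized, and it blends max-normalized TF-IDF and TextRank scores with a hypothetical weight `alpha`. The toy corpus, the function names, and the parameters `window`, `damping`, and `alpha` are illustrative assumptions.

```python
import math
from collections import Counter, defaultdict


def tfidf_scores(doc, corpus):
    """TF-IDF of each token in `doc`; `corpus` is a list of token lists."""
    tf = Counter(doc)
    n_docs = len(corpus)
    scores = {}
    for term, count in tf.items():
        df = sum(1 for d in corpus if term in d)
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0  # smoothed IDF
        scores[term] = (count / len(doc)) * idf
    return scores


def textrank_scores(doc, window=3, damping=0.85, iterations=50):
    """TextRank over an undirected, unweighted co-occurrence graph."""
    neighbors = defaultdict(set)
    for i, term in enumerate(doc):
        # Link each token to the tokens that follow it inside the window.
        for other in doc[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    vocab = set(doc)
    rank = {t: 1.0 for t in vocab}
    for _ in range(iterations):
        new_rank = {}
        for t in vocab:
            incoming = sum(rank[n] / len(neighbors[n]) for n in neighbors[t])
            new_rank[t] = (1 - damping) + damping * incoming
        rank = new_rank
    return rank


def normalize(scores):
    """Scale scores so that the largest value becomes 1."""
    top = max(scores.values(), default=0.0)
    return {t: (s / top if top else 0.0) for t, s in scores.items()}


def extract_keywords(doc, corpus, top_k=3, alpha=0.5):
    """Rank tokens by a weighted sum of both scores (alpha is an assumption)."""
    tfidf = normalize(tfidf_scores(doc, corpus))
    trank = normalize(textrank_scores(doc))
    combined = {t: alpha * tfidf[t] + (1 - alpha) * trank[t] for t in tfidf}
    return sorted(combined, key=lambda t: (-combined[t], t))[:top_k]


# Hypothetical, pre-tokenized hazard descriptions (not from the paper's database).
corpus = [
    ["controller", "fatigue", "night", "shift", "fatigue", "controller"],
    ["radar", "display", "failure", "controller", "backup"],
    ["runway", "incursion", "controller", "readback", "error"],
]
keywords = extract_keywords(corpus[0], corpus, top_k=2)
print(keywords)
```

In a keyword-based classifier, the extracted keywords of a new record could then be matched against keyword sets built per HFACS category. For Chinese text, a word segmentation step would be needed before calling `extract_keywords`.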
References:
- [1] International Civil Aviation Organization (ICAO).Safety management manual[M].4th ed.Montreal:International Civil Aviation Organization,2018.
- [2] O'CONNOR P.HFACS with an additional layer of granularity:validity and utility in accident analysis[J].Aviation Space and Environmental Medicine,2008,79(6):599-606.
- [3] RASMUSSEN J.Risk management in a dynamic society:a modelling problem[J].Safety Science,1997,27(2/3):183-213.
- [4] SHAPPELL S A,WIEGMANN D A.The human factors analysis and classification system-HFACS[R].Oklahoma City:Federal Aviation Administration,2000.
- [5] LEVESON N.A new accident model for engineering safer systems[J].Safety Science,2004,42(4):237-270.
- [6] SALMON P M,CORNELISSEN M,TROTTER M J.Systems-based accident analysis methods:a comparison of Accimap,HFACS,and STAMP[J].Safety Science,2012,50(4):1158-1170.
- [7] SHORROCK S T,KIRWAN B.Development and application of a human error identification tool for air traffic control[J].Applied Ergonomics,2002,33(4):319-336.
- [8] WIEGMANN D A,SHAPPELL S A.A human error approach to aviation accident analysis:the human factors analysis and classification system[M].Burlington:Ashgate Publishing Company,2003.
- [9] WALLACE B,ROSS A.Beyond human error:taxonomies and safety science[M].Boca Raton:CRC Press,2006.
- [10] OLSEN N S.Coding ATC incident data using HFACS:inter-coder consensus[J].Safety Science,2011,49(10):1365-1370.
- [11] HUANG Y C.Research based on natural language processing for risk of construction accident reports[D].Wuhan:Huazhong University of Science & Technology,2019.(in Chinese)
- [12] ZHANG G C.Research on EW-LDA based railway accident contributor analysis method[D].Beijing:Beijing Jiaotong University,2019.(in Chinese)
- [13] WANG J N,ZHANG C J,ZHANG Y H.Causative safety model identification reports on the civil aviation incidents and accidents[J].Journal of Safety and Environment,2020,20(1):186-192.(in Chinese)
- [14] GAO Z L,ZHANG J P,TIAN X Q.Analysis of air traffic control accidents using the human factors analysis and classification system-interpretative structural modeling (HFACS-ISM)[J].Journal of Transportation Engineering and Information,2020,18(3):57-63.(in Chinese)
- [15] ZHAO J S,ZHU Q M,ZHOU G D,et al.Review of research on automatic keyword extraction[J].Journal of Software,2017,28(9):2431-2449.(in Chinese)
- [16] CHANG Y C,ZHANG Y X,WANG H,et al.Features oriented survey of state-of-the-art keyphrase extraction algorithms[J].Journal of Software,2018,29(7):2046-2070.(in Chinese)
- [17] SALTON G,BUCKLEY C.Term-weighting approaches in automatic text retrieval[J].Information Processing & Management,1988,24(5):513-523.
- [18] MIHALCEA R,TARAU P.TextRank:bringing order into texts[C]//WU D.Proceedings of the 2004 Conference on Empirical Methods in Natural Language Processing (EMNLP).Stroudsburg:Association for Computational Linguistics,2004:404-411.
- [19] KIM Y.Convolutional neural networks for sentence classification[C]//DAELEMANS W.Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP).Stroudsburg:Association for Computational Linguistics,2014:1746-1751.
- [20] DEVLIN J,CHANG M W,LEE K,et al.BERT:pre-training of deep bidirectional transformers for language understanding[C]//SOLORIO T.Human language technologies:Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics (NAACL).Stroudsburg:Association for Computational Linguistics,2019:4171-4186.