An empirical study on dependence clusters for effort-aware fault-proneness prediction

Book chapter


Yang, Yibiao, Harman, Mark, Krinke, Jens, Islam, S., Binkley, David, Zhou, Yuming and Xu, Baowen 2016. An empirical study on dependence clusters for effort-aware fault-proneness prediction. in: Lo, David, Apel, Sven and Khurshid, Sarfraz (ed.) ASE’16 Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering IEEE/ACM. pp. 296-307
Authors: Yang, Yibiao, Harman, Mark, Krinke, Jens, Islam, S., Binkley, David, Zhou, Yuming and Xu, Baowen
Editors: Lo, David, Apel, Sven and Khurshid, Sarfraz
Abstract

A dependence cluster is a set of mutually inter-dependent program elements. Prior studies have found that large dependence clusters are prevalent in software systems. It has been suggested that dependence clusters have potentially harmful effects on software quality. However, little empirical evidence has been provided to support this claim. The study presented in this paper investigates the relationship between dependence clusters and software quality at the function level with a focus on effort-aware fault-proneness prediction. The investigation first analyzes whether or not larger dependence clusters tend to be more fault-prone. Second, it investigates whether the proportion of faulty functions inside dependence clusters is significantly different from the proportion of faulty functions outside dependence clusters. Third, it examines whether functions that play a more important role inside dependence clusters are more fault-prone than others. Finally, based on two groups of functions (i.e., functions inside and outside dependence clusters), the investigation considers a segmented fault-proneness prediction model. Our experimental results, based on five well-known open-source systems, show that (1) larger dependence clusters tend to be more fault-prone; (2) the proportion of faulty functions inside dependence clusters is significantly larger than the proportion of faulty functions outside dependence clusters; (3) functions inside dependence clusters that play more important roles are more fault-prone; and (4) our segmented prediction model can significantly improve the effectiveness of effort-aware fault-proneness prediction in both ranking and classification scenarios. These findings help us better understand how dependence clusters influence software quality.

Keywords: Dependence clusters; fault-proneness; fault prediction; network analysis
Book title: ASE’16 Proceedings of the 31st IEEE/ACM International Conference on Automated Software Engineering
Page range: 296-307
Year: 2016
Publisher: IEEE/ACM
Publication dates
Print: 06 Oct 2016
Publication process dates
Deposited: 02 Mar 2017
Event: 31st IEEE/ACM International Conference on Automated Software Engineering (ASE)
ISBN: 978-1-4503-3845-5, 978-1-5090-5571-5
Funder: National Key Basic Research and Development Program of China
National Natural Science Foundation of China
Natural Science Foundation of Jiangsu Province
Engineering and Physical Sciences Research Council (EPSRC)
National Science Foundation (NSF)
Fulbright award
Web address (URL): http://ieeexplore.ieee.org/abstract/document/7582767/
Additional information

© 2016 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
Y. Yang et al., "An empirical study on dependence clusters for effort-aware fault-proneness prediction," 2016 31st IEEE/ACM International Conference on Automated Software Engineering (ASE), Singapore, 2016, pp. 296-307.

Accepted author manuscript
Permalink: https://repository.uel.ac.uk/item/84z2x

Related outputs

Semantic-Based Process Mining Technique for Annotation and Modelling of Domain Processes
Okoye, K., Islam, S., Naeem, U. and Sharif, S. 2020. Semantic-Based Process Mining Technique for Annotation and Modelling of Domain Processes. International Journal of Innovative Computing, Information and Control. 16 (3), pp. 899-921. https://doi.org/10.24507/ijicic.16.03.899
Improving Student Engagement and Performance in Computing Final Year Projects
Naeem, U., Islam, S. and Siddiqui, A. 2019. Improving Student Engagement and Performance in Computing Final Year Projects. IEEE TALE 2019. Yogyakarta - Indonesia 10 - 13 Oct 2019 IEEE. https://doi.org/10.1109/TALE48000.2019.9225860
Functional Connectivity Evaluation for Infant EEG Signals based on Artificial Neural Network
Sharif, M., Naeem, U., Islam, S. and Karami, A. 2018. Functional Connectivity Evaluation for Infant EEG Signals based on Artificial Neural Network. Arai, Kohei, Kapoor, Supriya and Bhatia, Rahul (ed.) Intelligent Systems Conference (IntelliSys) 2018. London, UK 06 - 07 Sep 2018 Springer, Cham. https://doi.org/10.1007/978-3-030-01057-7_34
The Application of a Semantic-Based Process Mining Framework on a Learning Process Domain
Okoye, Kingsley, Islam, S., Naeem, U., Sharif, M., Azam, Muhammad Awais and Karami, A. 2018. The Application of a Semantic-Based Process Mining Framework on a Learning Process Domain. Arai, Kohei, Kapoor, Supriya and Bhatia, Rahul (ed.) Intelligent Systems Conference (IntelliSys) 2018. London, UK 06 - 07 Sep 2018 Springer, Cham. https://doi.org/10.1007/978-3-030-01054-6_96
Authentication of Smartphone Users Based on Activity Recognition and Mobile Sensing
Ehatisham-ul-Haq, Muhammad, Azam, Muhammad Awais, Loo, Jonathan, Shuang, Kai, Islam, S., Naeem, U. and Amin, Yasar 2017. Authentication of Smartphone Users Based on Activity Recognition and Mobile Sensing. Sensors. 17 (9), p. 2043. https://doi.org/10.3390/s17092043
Taskification – Gamification of Tasks
Naeem, U., Islam, S., Sharif, M., Sudakov, Sergey and Azam, Awais 2017. Taskification – Gamification of Tasks. in: Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers Association for Computing Machinery (ACM). pp. 631-634
SignalSense - Towards Quality Service
Islam, S., Sharif, M., Naeem, U. and Geehan, James 2017. SignalSense - Towards Quality Service. in: Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers Association for Computing Machinery (ACM). pp. 627-630
CrimeSafe - Helping you stay safe
Islam, S., Naeem, U., Sharif, M. and Dovnarovic, Arnold 2017. CrimeSafe - Helping you stay safe. in: Proceedings of the 2017 ACM International Joint Conference on Pervasive and Ubiquitous Computing and Proceedings of the 2017 ACM International Symposium on Wearable Computers Association for Computing Machinery (ACM). pp. 642-645
Using semantic-based approach to manage perspectives of process mining: Application on improving learning process domain data
Okoye, Kingsley, Tawil, Abdel-Rahman H., Naeem, U., Islam, S. and Lamine, Elyes 2017. Using semantic-based approach to manage perspectives of process mining: Application on improving learning process domain data. in: 2016 IEEE International Conference on Big Data (Big Data) IEEE. pp. 3529-3538
Dependence Cluster Visualization
Islam, S., Krinke, Jens and Binkley, David 2010. Dependence Cluster Visualization. in: Proceedings of the 5th international symposium on Software visualization New York, NY, USA Association for Computing Machinery (ACM). pp. 93-102
Less is more: Temporal fault predictive performance over multiple Hadoop releases
Harman, Mark, Islam, S., Jia, Yue, Minku, Leandro L., Sarro, Federica and Srivisut, Komsan 2014. Less is more: Temporal fault predictive performance over multiple Hadoop releases. in: Goues, Claire Le and Yoo, Shin (ed.) Search-Based Software Engineering Springer International Publishing.
ORBS: Language-Independent Program Slicing
Binkley, David, Gold, Nicolas, Harman, Mark, Islam, S., Krinke, Jens and Yoo, Shin 2014. ORBS: Language-Independent Program Slicing. in: Proceedings of the 22nd ACM SIGSOFT International Symposium on Foundations of Software Engineering New York, NY, USA Association for Computing Machinery (ACM). pp. 109-120
Jolinar: Analysing the Energy Footprint of Software Applications (demo)
Noureddine, A., Islam, S. and Bashroush, R. 2016. Jolinar: Analysing the Energy Footprint of Software Applications (demo). in: Proceedings of the 25th International Symposium on Software Testing and Analysis New York, NY, USA Association for Computing Machinery (ACM). pp. 445-448
Semantic-Based Model Analysis Towards Enhancing Information Values of Process Mining: Case Study of Learning Process Domain
Okoye, Kingsley, Tawil, Abdel-Rahman H., Naeem, U., Islam, S. and Lamine, Elyes 2017. Semantic-Based Model Analysis Towards Enhancing Information Values of Process Mining: Case Study of Learning Process Domain. in: Abraham, Ajith, Cherukuri, Aswani Kumar, Madureira, Ana Maria and Muda, Azah Kamilah (ed.) Proceedings of the Eighth International Conference on Soft Computing and Pattern Recognition (SoCPaR 2016) Springer, Cham.
Assessing the impact of global variables on program dependence and dependence clusters
Binkley, David, Harman, Mark, Hassoun, Youssef, Islam, S. and Li, Zheng 2009. Assessing the impact of global variables on program dependence and dependence clusters. Journal of Systems and Software. 83 (1), pp. 96-107. https://doi.org/10.1016/j.jss.2009.03.038
Requirements for the formal representation of pathophysiology mechanisms by clinicians
de Bono, B., Helvensteijn, M., Kokash, N., Martorelli, I., Sarwar, D., Islam, S., Grenon, P. and Hunter, P. 2016. Requirements for the formal representation of pathophysiology mechanisms by clinicians. Interface Focus. 6 (2), p. 20150075. https://doi.org/10.1098/rsfs.2015.0099
Towards Cloud Security Monitoring: A Case Study
Ismail, Umar Mukhtar, Islam, S. and Islam, S. 2016. Towards Cloud Security Monitoring: A Case Study. in: 2016 Cybersecurity and Cyberforensics Conference (CCC) IEEE.
gUML: Reasoning about Energy at Design Time by Extending UML Deployment Diagrams with Data Centre Contextual Information
Jebraeil, Nigar, Noureddine, A., Doyle, J., Islam, S. and Bashroush, R. 2017. gUML: Reasoning about Energy at Design Time by Extending UML Deployment Diagrams with Data Centre Contextual Information. in: 2017 IEEE World Congress on Services (SERVICES) IEEE. In press.
Cloud Strife: Expanding the Horizons of Cloud Gaming Services
Doyle, J., Islam, S., Bashroush, R. and O'Mahony, Donal 2017. Cloud Strife: Expanding the Horizons of Cloud Gaming Services. in: 2017 IEEE World Congress on Services (SERVICES) IEEE.
Measuring energy footprint of software features
Islam, S., Noureddine, A. and Bashroush, Rabih 2016. Measuring energy footprint of software features. in: 2016 IEEE 24th International Conference on Program Comprehension (ICPC) IEEE.
PORBS: A parallel observation-based slicer
Islam, S. and Binkley, David 2016. PORBS: A parallel observation-based slicer. in: 2016 IEEE 24th International Conference on Program Comprehension (ICPC) IEEE.
Efficient Identification of Linchpin Vertices in Dependence Clusters
Binkley, David, Gold, Nicolas, Harman, Mark, Islam, S., Krinke, Jens and Li, Zheng 2013. Efficient Identification of Linchpin Vertices in Dependence Clusters. ACM Transactions on Programming Languages and Systems. 35 (2), pp. 1-35.
ORBS and the limits of static slicing
Binkley, David, Gold, Nicolas, Harman, Mark, Islam, S., Krinke, Jens and Yoo, Shin 2015. ORBS and the limits of static slicing. in: 2015 IEEE 15th International Working Conference on Source Code Analysis and Manipulation (SCAM) IEEE. pp. 1-10
Coherent clusters in source code
Islam, S., Krinke, Jens, Binkley, David and Harman, Mark 2013. Coherent clusters in source code. Journal of Systems and Software. 88, pp. 1-24.
Uncovering Dependence Clusters and Linchpin Functions
Binkley, David, Beszédes, Árpád, Islam, S., Jász, Judit and Vancsics, Béla 2015. Uncovering Dependence Clusters and Linchpin Functions. in: 2015 IEEE 31st International Conference on Software Maintenance and Evolution (ICSME) IEEE. pp. 141-150