Open Access Journal

ISSN : 2456-1290 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

International Journal of Engineering Research in Mechanical and Civil Engineering (IJERMCE)

Monthly Journal for Mechanical and Civil Engineering

Comparison of Boxplots for Outlier Detection in Performance Modelling

Authors: Joice Mary Philip, Abraham George

Date of Publication: 9th October 2021

Abstract: Distorted values that creep into data through sampling, experimental, instrumental, manual, data-handling or data-processing errors can mislead the prediction of performance. Misfits in observational data have to be diagnosed and treated before modelling. The quality of the data on material characteristics determines the accuracy of the performance prediction for a product. This paper considers the reported inadequacy of models fitted to a research data set and the reason for their inaccuracy. Examination of the data using Tukey’s traditional boxplot and two medcouple-based adjusted boxplots indicated the presence of outliers in the data on the characteristics of different types of fly ash. Skewness in the fly ash data, revealed through histograms and density plots, was addressed by transforming the data. The impact of data transformation on outlier detection is studied for the three boxplots. The suitability of each method for detecting outliers is assessed using sensitivity and specificity calculations. Sensitivity (true positive rate) is found to be highest for the modified adjusted boxplot, while specificity (true negative rate) is highest for the traditional boxplot. The adjusted boxplot showed the least variation between the results for transformed and non-transformed data, which suggests that it is suitable for non-transformed data. Performance models predicted well for data winsorised on the basis of adjusted boxplots.
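
The boxplot rules compared in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code. It assumes the usual 1.5 x IQR fences for Tukey's boxplot and the exponential medcouple fences of Hubert and Vandervieren (2008) for the adjusted boxplot. The "modified" adjusted boxplot is left out because the abstract does not define it. Winsorisation is assumed here to mean clamping values to the fences. All function names, the sample values and the "true outlier" labels are hypothetical; they are not the fly ash data and say nothing about the paper's results.

```python
import math
import statistics


def quartiles(values):
    """Return (Q1, Q3), interpolating linearly between order statistics."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return q1, q3


def tukey_fences(values, k=1.5):
    """Traditional boxplot fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def medcouple(values):
    """Naive O(n^2) medcouple, a robust skewness measure in [-1, 1].

    Simplification: pairs with x_i == x_j (ties at the median) are skipped
    instead of using the special tie kernel of the full definition.
    """
    m = statistics.median(values)
    lower = [x for x in values if x <= m]
    upper = [x for x in values if x >= m]
    kernel = [((xj - m) - (m - xi)) / (xj - xi)
              for xi in lower for xj in upper if xj != xi]
    return statistics.median(kernel) if kernel else 0.0


def adjusted_fences(values, k=1.5):
    """Adjusted boxplot fences: the IQR multiples are scaled by exp(c * MC)."""
    q1, q3 = quartiles(values)
    iqr = q3 - q1
    mc = medcouple(values)
    if mc >= 0:
        return q1 - k * math.exp(-4 * mc) * iqr, q3 + k * math.exp(3 * mc) * iqr
    return q1 - k * math.exp(-3 * mc) * iqr, q3 + k * math.exp(4 * mc) * iqr


def flag_outliers(values, fences):
    """Mark every value that falls outside the (low, high) fences."""
    low, high = fences
    return [x < low or x > high for x in values]


def winsorise(values, fences):
    """Clamp values to the fences (assumed meaning of winsorisation here)."""
    low, high = fences
    return [min(max(x, low), high) for x in values]


def sensitivity_specificity(flags, truth):
    """Return (true positive rate, true negative rate) against known labels."""
    tp = sum(f and t for f, t in zip(flags, truth))
    fn = sum((not f) and t for f, t in zip(flags, truth))
    tn = sum((not f) and (not t) for f, t in zip(flags, truth))
    fp = sum(f and (not t) for f, t in zip(flags, truth))
    sens = tp / (tp + fn) if tp + fn else float("nan")
    spec = tn / (tn + fp) if tn + fp else float("nan")
    return sens, spec


# Made-up right-skewed sample; only the last value is assumed to be a true outlier.
sample = [11, 11, 12, 12, 13, 14, 16, 18, 21, 33, 120]
assumed_truth = [x == 120 for x in sample]

for name, fences in [("tukey", tukey_fences(sample)),
                     ("adjusted", adjusted_fences(sample))]:
    flags = flag_outliers(sample, fences)
    print(name, fences, sensitivity_specificity(flags, assumed_truth))
```

Sensitivity and specificity can only be computed when the true outlier status of each observation is known or assumed, which is why the sketch takes an explicit list of labels. For real data, a tested implementation of the medcouple and the adjusted boxplot (for example in an established statistics package) would be preferable to this quadratic-time version.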
