PDFOS: PDF estimation based over-sampling for imbalanced two-class problems

Gao, Ming; Hong, Xia; Chen, Sheng; Harris, Chris J; Khalaf, Emad

Gao, M., Hong, X. ORCID: https://orcid.org/0000-0002-6832-2298, Chen, S., Harris, C. J. and Khalaf, E. (2014) PDFOS: PDF estimation based over-sampling for imbalanced two-class problems. Neurocomputing, 138. pp. 248-259. ISSN 0925-2312 doi: 10.1016/j.neucom.2014.02.006

Abstract/Summary

This contribution proposes a novel probability density function (PDF) estimation based over-sampling (PDFOS) approach for two-class imbalanced classification problems. The classical Parzen-window kernel function is adopted to estimate the PDF of the positive class. Synthetic instances are then generated according to the estimated PDF and used as additional training data. The essential concept is to re-balance the class distribution of the original imbalanced data set under the principle that the synthetic data samples follow the same statistical properties as the original positive-class data. Based on the over-sampled training data, a radial basis function (RBF) classifier is constructed by applying the orthogonal forward selection procedure, in which the classifier’s structure and the parameters of the RBF kernels are determined using a particle swarm optimisation algorithm based on the criterion of minimising the leave-one-out misclassification rate. The effectiveness of the proposed PDFOS approach is demonstrated by an empirical study on several imbalanced data sets.
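
The over-sampling step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the full text is not available on this page, so the details are assumptions: a Gaussian kernel, a kernel covariance proportional to the sample covariance of the positive class, and Silverman's rule of thumb for the bandwidth. The function name `parzen_oversample` and its parameters are hypothetical. Drawing from a Gaussian Parzen-window estimate amounts to picking one of the original positive samples uniformly at random and adding zero-mean Gaussian noise with the kernel's covariance.

```python
import numpy as np


def parzen_oversample(X_pos, n_new, bandwidth=None, seed=0):
    """Draw n_new synthetic points from a Gaussian Parzen-window estimate.

    X_pos     : (n, d) array of positive (minority) class samples.
    bandwidth : kernel width h; if None, Silverman's rule of thumb is
                used (an assumption, not necessarily the paper's choice).
    """
    rng = np.random.default_rng(seed)
    X_pos = np.asarray(X_pos, dtype=float)
    n, d = X_pos.shape

    if bandwidth is None:
        # Silverman's rule of thumb for a multivariate Gaussian kernel.
        bandwidth = (4.0 / (n * (d + 2))) ** (1.0 / (d + 4))

    # Kernel covariance: h^2 times the sample covariance of the class.
    # A small ridge keeps the matrix positive definite.
    cov = np.atleast_2d(np.cov(X_pos, rowvar=False))
    cov = cov + 1e-9 * np.eye(d)

    # Sampling from the mixture: choose a kernel centre uniformly,
    # then perturb it with noise drawn from that kernel.
    centres = X_pos[rng.integers(0, n, size=n_new)]
    noise = rng.multivariate_normal(np.zeros(d), bandwidth**2 * cov, size=n_new)
    return centres + noise


if __name__ == "__main__":
    # Toy data: 20 positive samples, over-sampled with 80 synthetic points.
    X_pos = np.random.default_rng(1).normal(size=(20, 2))
    X_new = parzen_oversample(X_pos, n_new=80)
    print(X_new.shape)
```

The synthetic points would then be appended to the positive class before training the classifier. The RBF classifier construction and the particle swarm optimisation step are not covered by this sketch.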

Item Type	Article
URI	https://centaur.reading.ac.uk/id/eprint/36567
Identification Number/DOI	10.1016/j.neucom.2014.02.006
Refereed	Yes
Divisions	Science > School of Mathematical, Physical and Computational Sciences > Department of Computer Science
Uncontrolled Keywords	Imbalanced classification, probability density function based over-sampling, radial basis function classifier, orthogonal forward selection, particle swarm optimisation
Publisher	Elsevier

Deposit Details


Date Deposited:	06 May 2014 13:44
Last Modified:	01 Mar 2026 05:52