Burnham, K. P., & Anderson, D. R. (2002). Model selection
and multimodel inference: A practical information-theoretic
approach. New York: Springer Science.
Chen, S., Hong, X., & Harris, C. J. (2009). Construction of
tunable radial basis function networks using orthogonal
forward selection. IEEE Transaction on Systems, Man and
Cybernetics, Part B: Cybernetics, 39(2), 457–466.
Chen, S., Hong, X., Harris, C. J., & Sharkey, P. M. (2004). Sparse
modelling using orthogonal forward regression with PRESS
statistic and regularization. IEEE Transaction on Systems,
Man and Cybernetics, Part B: Cybernetics, 34(2), 898–911.
Frank, A., & Asuncion, A. (2010). UCI machine learning repository.
Retrieved from http://archive.ics.uci.edu/ml.
Girolami, M., & He, C. (2003). Probability density estimation
from optimally condensed data samples. IEEE Transaction
on Pattern Analysis and Machine Intelligence, 25(10),
1253–1264.
Gun, S. R. (1998). Support vector machines for classification and
regression. Southampton: ISIS Research Group, Department
Electronics Computer Science, University of Southampton.
Hong, X., Chen, S., Qatawneh, A., Daqrouq, K., Sheikh, M.,
& Morfeq, A. (2013). Sparse probability density function
estimation using the minimum integrated square error. Neurocomputing,
114, 122–129.
Kennedy, J., & Eberhart, R. C. (2001). Swarm intelligence. San
Francisco, CA: Morgan Kaufmann.
Jiang, X., Gao, J., Wang, T., & Zheng, L. (2012). Supervised
latent linear Gaussian process latent variable model for
dimensionality reduction. IEEE Transactions on Systems,
Man and Cybernetics, Part B: Cybernetics, 42(6), 1620–
1632.
Kullback, S., & Leibler, R. A. (1951). On information and
sufficiency. The Annals of Mathematical Statistics, 22(1),
79–86.
Lawrence, N. (2005). Probabilistic non-linear principal component
analysis with Gaussian process latent variable models.
Journal of Machine Learning Research, 6, 1783–
1816.
Rasmussen, C. E. (2004). Gaussian processes in machine learning.
In Advanced lectures on machine learning. Lecture
notes in computer science (Vol. 3176, pp. 63–71). New
York: Springer.
Rasmussen, C. C. E., & Williams, C. K. I. (2006). Gaussian
processes for machine learning. Cambridge, MA: The MIT
Press.
Schölkopf, B., & Smola, A. (2002). Learning with Kernels.
Cambridge, MA: The MIT Press.
Scott, S. W. (2001). Parametric statistical modeling by minimum
integrated square error. Technometrics, 43(3), 274–285.
Silverman, B. W. (1986). Density estimation for statistics and
data analysis. London: Chapman and Hall.
Turner, R. D., Huber, M. F., Hanebeck, U. D., & Rasmussen,
C. E. (2012). Robust filtering and smoothing with gaussian
processes. IEEE Transactions of Automatic Control, 57(7),
1865–1871.
van der Vaart, A. W. (2000). Asymptotic statistics. Cambridge:
Cambridge University Press.
Venkataraman, P. (2002). Applied optimization with MATLAB
programming. New York, NY: Wiley-Interscience.