Package 'GMDHreg'

Title: Regression using GMDH Algorithms
Description: Regression using GMDH algorithms from Prof. Alexey G. Ivakhnenko. The Group Method of Data Handling (GMDH), also known as polynomial neural networks, is a family of inductive algorithms that builds polynomial models of gradually increasing complexity and selects the best solution according to an external criterion. In other words, inductive GMDH algorithms make it possible to find interrelations in data automatically and to select an optimal model or network structure. The package includes GMDH Combinatorial, GMDH MIA (Multilayered Iterative Algorithm), GMDH GIA (Generalized Iterative Algorithm) and GMDH Combinatorial with Active Neurons.
Authors: Manuel Villacorta Tilve
Maintainer: Manuel Villacorta Tilve <[email protected]>
License: GPL-3
Version: 0.2.3
Built: 2024-10-31 20:28:04 UTC
Source: https://github.com/cran/GMDHreg

Help Index


GMDH Combinatorial

Description

Build a regression model performing GMDH Combinatorial.
This is the basic GMDH algorithm. For more information, please read the package's vignette.

Usage

gmdh.combi(
  X,
  y,
  G = 2,
  criteria = c("PRESS", "test", "ICOMP"),
  x.test = NULL,
  y.test = NULL
)

Arguments

X

matrix with N > 1 columns and M rows containing the independent variables of the model.
Be careful: N > 4 combined with G = 2 can be computationally very expensive and time consuming.
The data must not contain NAs.

y

vector or matrix containing the dependent variable of the model.
The data must not contain NAs.

G

polynomial degree (a sketch of the polynomial forms follows this argument).
0: linear regression without quadratic and interaction terms.
1: linear regression with interaction terms.
2: original Ivakhnenko quadratic polynomial.
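
For orientation, the classic Ivakhnenko quadratic polynomial (G = 2) for a pair of inputs x_i and x_j is the textbook GMDH building block shown below; G = 1 drops the squared terms and G = 0 additionally drops the interaction term. This is the standard formulation, given here as a reference rather than taken from the package source.

y = a0 + a1*x_i + a2*x_j + a3*x_i*x_j + a4*x_i^2 + a5*x_j^2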

criteria

GMDH external criteria. Values:

  • PRESS: Predicted Residual Error Sum of Squares. It takes into account all the information in the data sample and is computed without refitting the model for each test point.

  • test: uses x.test and y.test to estimate the RMSE (Root Mean Square Error); a short sketch follows this list.

  • ICOMP: Index of Informational Complexity. Like PRESS, it is computed without refitting the model.
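
A minimal sketch of the "test" criterion, assuming the same simulated data as in the Examples section below; the hold-out split is illustrative and the call follows the Usage block above.

library(GMDHreg)
set.seed(123)
x <- matrix(rnorm(1050), ncol = 3)
colnames(x) <- c("a", "b", "c")
y <- 10 + x[, "a"] + x[, "b"]^2 + x[, "c"]^3
x.test <- x[1:10, ]   # external sample, kept out of X
y.test <- y[1:10]
x <- x[-(1:10), ]
y <- y[-(1:10)]
mod.test <- gmdh.combi(X = x, y = y, G = 2, criteria = "test",
                       x.test = x.test, y.test = y.test)
pred <- predict(mod.test, x.test)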

x.test

matrix with a sample randomly drawn from the initial data. This sample should not be included in X.
It is used when criteria = "test".

y.test

vector or matrix with the y values corresponding to the x.test values.

Value

An object of class 'combi'. This is a list with two elements: results and G.
results is a list with two elements:

  • coef: coefficients of the final selected GMDH Combinatorial model.

  • CV: external criterion value for the selected model.

G is the degree of the polynomial used in the GMDH Combinatorial model.
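
Based on this structure, a fitted object can be inspected along the following lines. The element names are taken from the Value description above; 'mod' is assumed to be the object fitted in the Examples section below.

str(mod)          # a list with components 'results' and 'G'
mod$G             # degree of the polynomial used in the final model
mod$results       # coefficients ('coef') and external criterion value ('CV')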

References

Bozdogan, H. and Haughton, D.M.A. (1998): "Information complexity criteria for regression models", Computational Statistics & Data Analysis, 28, pp. 51-76 <doi: 10.1016/S0167-9473(98)00025-5>

Hild, Ch. R. and Bozdogan, H. (1995): "The use of information-based model selection criteria in the GMDH algorithm", Systems Analysis Modelling Simulation, 20(1-2), pp. 29-50

Ivakhnenko, A.G. (1968): "The Group Method of Data Handling - A Rival of the Method of Stochastic Approximation", Soviet Automatic Control, 13(3), pp. 43-55

Müller, J.-A., Ivachnenko, A.G. and Lemke, F. (1998): "GMDH Algorithms for Complex Systems Modelling", Mathematical and Computer Modelling of Dynamical Systems, 4(4), pp. 275-316 <doi: 10.1080/13873959808837083>

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1050)), ncol = 3, nrow = 350)
colnames(x) <- c("a", "b", "c")
y <- matrix(data = c(10 + x[, "a"] + x[, "b"]^2 + x[, "c"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.combi(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))
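
The last line above summarizes the absolute prediction errors; a single RMSE figure over the hold-out rows can also be computed with base R:

sqrt(mean((pred - y.test)^2))   # root mean square error on the hold-out sample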

GMDH Twice-Multilayered Combinatorial

Description

Build a regression model performing GMDH Twice-Multilayered Combinatorial (TMC).
For more information, please read the package's vignette.

Usage

gmdh.combi.twice(
  X,
  y,
  criteria = c("PRESS", "test", "ICOMP"),
  G = 2,
  x.test = NULL,
  y.test = NULL
)

Arguments

X

matrix with N > 1 columns and M rows containing the independent variables of the model.
Be careful: N > 4 combined with G = 2 can be computationally very expensive and time consuming.
The data must not contain NAs.

y

vector or matrix containing the dependent variable of the model.
The data must not contain NAs.

criteria

GMDH external criteria. Values:

  • PRESS: Predicted Residual Error Sum of Squares. It takes into account all the information in the data sample and is computed without refitting the model for each test point.

  • test: uses x.test and y.test to estimate the RMSE (Root Mean Square Error).

  • ICOMP: Index of Informational Complexity. Like PRESS, it is computed without refitting the model.

G

polynomial degree.
0: linear regression without quadratic and interaction terms.
1: linear regression with interaction terms.
2: original Ivakhnenko quadratic polynomial.

x.test

matrix with a sample randomly drawn from the initial data. This sample should not be included in X.
It is used when criteria = "test".

y.test

vector or matrix with the y values corresponding to the x.test values.

Value

An object of class 'combitwice'. This is a list with two elements: results and G.
results is a list with two elements:

  • coef: coefficients of the final selected model.

  • CV: external criterion value for the selected model.

G is the degree of the polynomial used in the model.

References

Bozdogan, H. and Haughton, D.M.A. (1998): "Information complexity criteria for regression models", Computational Statistics & Data Analysis, 28, pp. 51-76 <doi: 10.1016/S0167-9473(98)00025-5>

Hild, Ch. R. and Bozdogan, H. (1995): "The use of information-based model selection criteria in the GMDH algorithm", Systems Analysis Modelling Simulation, 20(1-2), pp. 29-50

Ivakhnenko, A.G., Ivakhnenko, G.A. and Müller, J.-A. (1994): "Self-organization of Neural Networks with Active Neurons", Pattern Recognition and Image Analysis, 4(2), pp. 185-196

Ivakhnenko, A.G. (1968): "The Group Method of Data Handling - A Rival of the Method of Stochastic Approximation", Soviet Automatic Control, 13(3), pp. 43-55

Müller, J.-A., Ivachnenko, A.G. and Lemke, F. (1998): "GMDH Algorithms for Complex Systems Modelling", Mathematical and Computer Modelling of Dynamical Systems, 4(4), pp. 275-316 <doi: 10.1080/13873959808837083>

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1050)), ncol = 3, nrow = 350)
colnames(x) <- c("a", "b", "c")
y <- matrix(data = c(10 + x[, "a"] + x[, "b"]^2 + x[, "c"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.combi.twice(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

GMDH GIA

Description

Build a regression model performing GMDH GIA (Generalized Iterative Algorithm) with Active Neurons (Combinatorial algorithm).
For more information, please read the package's vignette.

Usage

gmdh.gia(
  X,
  y,
  prune = ncol(X),
  criteria = c("PRESS", "test", "ICOMP"),
  x.test = NULL,
  y.test = NULL
)

Arguments

X

matrix with N > 3 columns and M rows containing the independent variables of the model.
The data must not contain NAs.

y

vector or matrix containing the dependent variable of the model.
The data must not contain NAs.

prune

an integer whose recommended minimum value is the number of initial regressors.
The maximum value will depend on the available RAM.
prune is the number of neurons selected to pass from layer i to layer i+1. The resulting layer i+1 has prune*(prune-1)/2 neurons; for example, with prune = 150 the resulting number of neurons is 11,175 (see the check below).
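
The count stated above can be checked directly in base R (a simple verification of the formula, not a package function):

prune <- 150
prune * (prune - 1) / 2   # 11175 neurons in layer i + 1
choose(prune, 2)          # same value, counted as the number of neuron pairs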

criteria

GMDH external criteria. Values:

  • PRESS: Predicted Residual Error Sum of Squares.

  • test: uses x.test and y.test to estimate the RMSE (Root Mean Square Error).

  • ICOMP: Index of Informational Complexity. Like PRESS, it is computed without refitting the model.

x.test

matrix with a sample randomly drawn from the initial data.
It is used when criteria = "test".
This sample should not be included in X.

y.test

vector or matrix with the y values corresponding to the x.test values.

Value

An object of class 'gia'.

References

Bozdogan, H. and Haughton, D.M.A. (1998): "Information complexity criteria for regression models", Computational Statistics & Data Analysis, 28, pp. 51-76 <doi: 10.1016/S0167-9473(98)00025-5>

Farlow, S.J. (1981): "The GMDH algorithm of Ivakhnenko", The American Statistician, 35(4), pp. 210-215. <doi: 10.2307/2683292>

Hild, Ch. R. and Bozdogan, H. (1995): "The use of information-based model selection criteria in the GMDH algorithm", Systems Analysis Modelling Simulation, 20(1-2), pp. 29-50

Ivakhnenko, A.G. (1968): "The Group Method of Data Handling - A Rival of the Method of Stochastic Approximation", Soviet Automatic Control, 13(3), pp. 43-55

Müller, J.-A., Ivachnenko, A.G. and Lemke, F. (1998): "GMDH Algorithms for Complex Systems Modelling", Mathematical and Computer Modelling of Dynamical Systems, 4(4), pp. 275-316 <doi: 10.1080/13873959808837083>

Stepashko, V. Bulgakova, O. and Zosimov V. (2018): "Construction and Research of the Generalized Iterative GMDH Algorithm with Active Neurons", Advances in Intelligent Systems and Computing II, pp. 492-510 <doi:10.1007/978-3-319-70581-1_35>

Examples

set.seed(123)
x <- matrix(data = c(rnorm(500)), ncol = 4, nrow = 125)
colnames(x) <- c("a", "b", "c", "d")
y <- matrix(data = c(10 + x[, "a"] + x[, "d"]^2), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:5, ]
y.test <- y[1:5]
x <- x[-c(1:5), ]
y <- y[-c(1:5)]

mod <- gmdh.gia(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

GMDH MIA

Description

Build a regression model performing GMDH MIA (Multilayered Iterative Algorithm).
For more information, please read the package's vignette.

Usage

gmdh.mia(
  X,
  y,
  prune = ncol(X),
  criteria = c("PRESS", "test", "ICOMP"),
  x.test = NULL,
  y.test = NULL
)

Arguments

X

matrix with N > 3 columns and M rows containing the independent variables of the model.
The data must not contain NAs.

y

vector or matrix containing the dependent variable of the model.
The data must not contain NAs.

prune

an integer whose recommended minimum value is the number of initial regressors.
The maximum value will depend on the available RAM.
prune is the number of neurons selected to pass from layer i to layer i+1. The resulting layer i+1 has prune*(prune-1)/2 neurons; for example, with prune = 150 the resulting number of neurons is 11,175.

criteria

GMDH external criteria. Values:

  • PRESS: Predicted Residual Error Sum of Squares. It takes into account all the information in the data sample and is computed without refitting the model for each test point.

  • test: uses x.test and y.test to estimate the RMSE (Root Mean Square Error).

  • ICOMP: Index of Informational Complexity. Like PRESS, it is computed without refitting the model.

x.test

matrix with a sample randomly drawn from the initial data.
It is used when criteria = "test".
This sample should not be included in X.

y.test

vector or matrix with the y values corresponding to the x.test values.

Value

An object of class 'mia'.

References

Bozdogan, H. and Haughton, D.M.A. (1998): "Information complexity criteria for regression models", Computational Statistics & Data Analysis, 28, pp. 51-76 <doi: 10.1016/S0167-9473(98)00025-5>

Farlow, S.J. (1981): "The GMDH algorithm of Ivakhnenko", The American Statistician, 35(4), pp. 210-215. <doi: 10.2307/2683292>

Hild, Ch. R. and Bozdogan, H. (1995): "The use of information-based model selection criteria in the GMDH algorithm", Systems Analysis Modelling Simulation, 20(1-2), pp. 29-50

Ivakhnenko, A.G. (1968): "The Group Method of Data Handling - A Rival of the Method of Stochastic Approximation", Soviet Automatic Control, 13(3), pp. 43-55

Müller, J.-A., Ivachnenko, A.G. and Lemke, F. (1998): "GMDH Algorithms for Complex Systems Modelling", Mathematical and Computer Modelling of Dynamical Systems, 4(4), pp. 275-316 <doi: 10.1080/13873959808837083>

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1000)), ncol = 5, nrow = 200)
colnames(x) <- c("a", "b", "c", "d", "e")
y <- matrix(data = c(10 + x[, "a"] * x[, "e"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.mia(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

Predict GMDH Combinatorial

Description

Calculates GMDH Combinatorial model predictions for new data.

Usage

## S3 method for class 'combi'
predict(object, newdata, ...)

Arguments

object

an object of class 'combi'

newdata

matrix containing the independent variables of the model for which predictions are to be calculated.

...

other undocumented arguments

Value

A matrix with predictions.

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1050)), ncol = 3, nrow = 350)
colnames(x) <- c("a", "b", "c")
y <- matrix(data = c(10 + x[, "a"] + x[, "b"]^2 + x[, "c"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.combi(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

Predict GMDH Twice-Multilayered Combinatorial

Description

Calculates GMDH Twice-Multilayered Combinatorial model predictions for new data.

Usage

## S3 method for class 'combitwice'
predict(object, newdata, ...)

Arguments

object

an object of class 'combitwice'

newdata

matrix containing the independent variables of the model for which predictions are to be calculated.

...

other undocumented arguments

Value

A matrix with predictions.

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1050)), ncol = 3, nrow = 350)
colnames(x) <- c("a", "b", "c")
y <- matrix(data = c(10 + x[, "a"] + x[, "b"]^2 + x[, "c"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.combi.twice(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

Predict GMDH GIA object

Description

Calculates GMDH GIA model predictions for new data.

Usage

## S3 method for class 'gia'
predict(object, newdata, ...)

Arguments

object

an object of class 'gia'

newdata

matrix containing the independent variables of the model for which predictions are to be calculated.

...

other undocumented arguments

Value

A matrix with predictions.

Examples

set.seed(123)
x <- matrix(data = c(rnorm(500)), ncol = 4, nrow = 125)
colnames(x) <- c("a", "b", "c", "d")
y <- matrix(data = c(10 + x[, "a"] + x[, "d"]^2), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:5, ]
y.test <- y[1:5]
x <- x[-c(1:5), ]
y <- y[-c(1:5)]

mod <- gmdh.gia(X = x, y = y, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))

Predict GMDH MIA object

Description

Calculates GMDH MIA model predictions for new data.

Usage

## S3 method for class 'mia'
predict(object, newdata, ...)

Arguments

object

an object of class 'mia'

newdata

matrix containing the independent variables of the model for which predictions are to be calculated.

...

other undocumented arguments

Value

A matrix with predictions.

Examples

set.seed(123)
x <- matrix(data = c(rnorm(1000)), ncol = 5, nrow = 200)
colnames(x) <- c("a", "b", "c", "d", "e")
y <- matrix(data = c(10 + x[, "a"] * x[, "e"]^3), ncol = 1)
colnames(y) <- "y"
x.test <- x[1:10, ]
y.test <- y[1:10]
x <- x[-c(1:10), ]
y <- y[-c(1:10)]

mod <- gmdh.mia(X = x, y = y, prune = 5, criteria = "PRESS")
pred <- predict(mod, x.test)
summary(sqrt((pred - y.test)^2))