| Type: | Package |
| Title: | Stochastic Gradient Descent for Scalable Estimation |
| Version: | 1.1.3 |
| Maintainer: | Junhyung Lyle Kim <jlylekim@gmail.com> |
| Description: | A fast and flexible set of tools for large scale estimation. It features many stochastic gradient methods, built-in models, visualization tools, automated hyperparameter tuning, model checking, interval estimation, and convergence diagnostics. |
| URL: | https://github.com/airoldilab/sgd |
| BugReports: | https://github.com/airoldilab/sgd/issues |
| License: | GPL-2 |
| Imports: | ggplot2, MASS, methods, Rcpp (≥ 0.11.3), stats |
| Suggests: | bigmemory, glmnet, gridExtra, R.rsp, testthat, microbenchmark |
| LinkingTo: | BH, bigmemory, Rcpp, RcppArmadillo |
| LazyData: | yes |
| VignetteBuilder: | R.rsp |
| Encoding: | UTF-8 |
| RoxygenNote: | 7.3.3 |
| NeedsCompilation: | yes |
| Packaged: | 2025-10-21 04:16:50 UTC; jlylekim |
| Author: | Junhyung Lyle Kim [cre, aut], Dustin Tran [aut], Panos Toulis [aut], Tian Lian [ctb], Ye Kuang [ctb], Edoardo Airoldi [ctb] |
| Repository: | CRAN |
| Date/Publication: | 2025-10-21 11:00:02 UTC |
Extract Model Coefficients
Description
Extract model coefficients from sgd objects. coefficients is an alias for it.
Usage
## S3 method for class 'sgd'
coef(object, ...)

Arguments
object | object of class sgd. |
... | some methods for this generic require additional arguments. None are used in this method. |
Value
Coefficients extracted from the model object object.
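For example, a minimal sketch, where fit is a placeholder for any object returned by sgd():

# `fit` stands for an object returned by sgd().
coef(fit)           # extract the estimated coefficients
coefficients(fit)   # equivalent alias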
Extract Model Fitted Values
Description
Extract fitted values from sgd objects. fitted.values is an alias for it.
Usage
## S3 method for class 'sgd'
fitted(object, ...)

Arguments
object | object of class sgd. |
... | some methods for this generic require additional arguments. None are used in this method. |
Value
Fitted values extracted from the object object.
Plot objects of class sgd.
Description
Plot objects of class sgd.
Usage
## S3 method for class 'sgd'
plot(x, ..., type = "mse", xaxis = "iteration")

## S3 method for class 'list'
plot(x, ..., type = "mse", xaxis = "iteration")

Arguments
x | object of class sgd. |
... | additional arguments used for each type of plot. See 'Details'. |
type | character specifying the type of plot: "mse", "clf", or "mse-param". See 'Details'. |
xaxis | character specifying the x-axis of the plot; the default is "iteration". |
Details
Types of plots available (a usage sketch follows this list):

mse: Mean squared error in predictions, which takes the following arguments:
  x_test: test set
  y_test: test responses to compare predictions to

clf: Classification error in predictions, which takes the following arguments:
  x_test: test set
  y_test: test responses to compare predictions to

mse-param: Mean squared error in parameters, which takes the following argument:
  true_param: true vector of parameters to compare to
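A minimal sketch of these plot types, assuming an sgd fit such as sgd.theta from the 'Examples' section of sgd() below, with theta, X, and y the true parameters and data generated there:

# Sketch only: sgd.theta, theta, X, and y come from the linear regression
# example under sgd(); an intercept column is added to match the fitted model.
plot(sgd.theta, type = "mse-param", true_param = theta)
plot(sgd.theta, type = "mse", x_test = cbind(1, X), y_test = y)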
Model Predictions
Description
Form predictions using the estimated model parameters from stochastic gradient descent.
Usage
## S3 method for class 'sgd'
predict(object, newdata, type = "link", ...)

predict_all(object, newdata, ...)

Arguments
object | object of class sgd. |
newdata | design matrix to form predictions on |
type | the type of prediction required. The default "link" is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus for a default binomial model the default predictions are of log-odds (probabilities on logit scale) and type = "response" gives the predicted probabilities. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale. |
... | further arguments passed to or from other methods. |
Details
A column of 1's must be included in newdata if the parameters include a bias (intercept) term.
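A brief sketch of the note above, where sgd.theta is a fitted sgd object and newX is a hypothetical matrix of new observations with the same columns used in fitting:

# `newX` is a placeholder; prepend a column of 1's because the model has an intercept.
newdata <- cbind(1, newX)
pred_link <- predict(sgd.theta, newdata)                     # linear-predictor scale (default)
pred_resp <- predict(sgd.theta, newdata, type = "response")  # response scale, e.g. probabilities for a binomial glm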
Print objects of class sgd.
Description
Print objects of class sgd.
Usage
## S3 method for class 'sgd'
print(x, ...)

Arguments
x | object of class sgd. |
... | further arguments passed to or from other methods. |
Extract Model Residuals
Description
Extract model residuals from sgd objects. resid is an alias for it.
Usage
## S3 method for class 'sgd'
residuals(object, ...)

Arguments
object | object of class sgd. |
... | some methods for this generic require additional arguments. None are used in this method. |
Value
Residuals extracted from the object object.
Stochastic gradient descent
Description
Run stochastic gradient descent in order to optimize the induced loss function given a model and data.
Usage
sgd(x, ...)

## S3 method for class 'formula'
sgd(formula, data, model, model.control = list(), sgd.control = list(...), ...)

## S3 method for class 'matrix'
sgd(x, y, model, model.control = list(), sgd.control = list(...), ...)

## S3 method for class 'big.matrix'
sgd(x, y, model, model.control = list(), sgd.control = list(...), ...)

Arguments
x,y | a design matrix and the respective vector of outcomes. |
... | arguments to be used to form the default sgd.control arguments if it is not supplied directly. |
formula | an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. |
data | an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. |
model | character specifying the model to be used: "lm" (linear model), "glm" (generalized linear model), "cox" (Cox proportional hazards model), "gmm" (generalized method of moments), "m" (M-estimation). See 'Details'. |
model.control | a list of parameters for controlling the model. |
sgd.control | an optional list of parameters for controlling the estimation. |
Details
Models: The Cox model assumes that the survival data is ordered when passed in, i.e., such that the risk set of an observation i is all data points after it.
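For instance, a minimal sketch of enforcing this ordering before fitting, assuming the survival times live in a hypothetical column named time:

# Hypothetical column name `time`: sort rows so that the risk set of
# observation i is exactly the observations that follow it.
dat <- dat[order(dat$time), ]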
Methods (a usage sketch follows this list):

sgd: stochastic gradient descent (Robbins and Monro, 1951)

implicit: implicit stochastic gradient descent (Toulis et al., 2014)

asgd: stochastic gradient with averaging (Polyak and Juditsky, 1992)

ai-sgd: implicit stochastic gradient with averaging (Toulis et al., 2015)

momentum: "classical" momentum (Polyak, 1964)

nesterov: Nesterov's accelerated gradient (Nesterov, 1983)
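A hedged sketch of selecting one of these methods, assuming sgd.control accepts a method entry named as in the list above:

# Sketch: choose averaged implicit SGD ("ai-sgd") via sgd.control;
# `dat` is a data frame with response y, as in the Examples section.
fit <- sgd(y ~ ., data = dat, model = "lm",
           sgd.control = list(method = "ai-sgd"))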
Learning rates and hyperparameters (a usage sketch follows this list):

one-dim: scalar value prescribed in Xu (2011) as
  a_n = scale * gamma / (1 + alpha*gamma*n)^c
where the defaults are
  lr.control = (scale=1, gamma=1, alpha=1, c)
with c equal to 1 if implemented without averaging, 2/3 if with averaging

one-dim-eigen: diagonal matrix
  lr.control = NULL

d-dim: diagonal matrix
  lr.control = (epsilon=1e-6)

adagrad: diagonal matrix prescribed in Duchi et al. (2011) as
  lr.control = (eta=1, epsilon=1e-6)

rmsprop: diagonal matrix prescribed in Tieleman and Hinton (2012) as
  lr.control = (eta=1, gamma=0.9, epsilon=1e-6)
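Similarly, a sketch of choosing a learning rate, assuming sgd.control carries lr and lr.control entries matching the names above:

# Sketch: AdaGrad learning rate with the defaults listed above.
fit <- sgd(y ~ ., data = dat, model = "lm",
           sgd.control = list(method = "sgd",
                              lr = "adagrad",
                              lr.control = c(eta = 1, epsilon = 1e-6)))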
Value
An object of class "sgd", which is a list containing the following components (a short access sketch follows the table):
model | name of the model |
coefficients | a named vector of coefficients |
converged | logical. Was the algorithm judged to have converged? |
estimates | estimates from the algorithm stored at each iteration specified in pos |
fitted.values | the fitted mean values |
pos | vector of indices specifying the iteration number each estimate was stored for |
residuals | the residuals, that is response minus fitted values |
times | vector of times in seconds it took to complete the number of iterations specified in pos |
model.out | a list of model-specific output attributes |
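A short sketch of inspecting these components, assuming fit is the returned object and that estimates stores one column per entry of pos:

# Sketch: basic post-fit inspection of an sgd object `fit`.
fit$converged                         # did the algorithm converge?
fit$pos                               # iterations at which estimates were stored
fit$estimates[, ncol(fit$estimates)]  # estimate at the last stored iteration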
Author(s)
Dustin Tran, Tian Lan, Panos Toulis, Ye Kuang, Edoardo Airoldi
References
John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121-2159, 2011.
Yurii Nesterov. A method for solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2):372-376, 1983.
Boris T. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5):1-17, 1964.
Boris T. Polyak and Anatoli B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838-855, 1992.
Herbert Robbins and Sutton Monro. A stochastic approximation method. The Annals of Mathematical Statistics, pp. 400-407, 1951.
Panos Toulis, Jason Rennie, and Edoardo M. Airoldi. Statistical analysis of stochastic gradient methods for generalized linear models. In Proceedings of the 31st International Conference on Machine Learning, 2014.
Panos Toulis, Dustin Tran, and Edoardo M. Airoldi. Stability and optimality in stochastic gradient descent. arXiv preprint arXiv:1505.02417, 2015.
Wei Xu. Towards optimal one pass large scale learning with averaged stochastic gradient descent. arXiv preprint arXiv:1107.2490, 2011.
Examples

## Linear regression
set.seed(42)
# Dimensions
N <- 1e4
d <- 5
X <- matrix(rnorm(N*d), ncol=d)
theta <- rep(5, d+1)
eps <- rnorm(N)
y <- cbind(1, X) %*% theta + eps
dat <- data.frame(y=y, x=X)
sgd.theta <- sgd(y ~ ., data=dat, model="lm")
sprintf("Mean squared error: %0.3f",
        mean((theta - as.numeric(sgd.theta$coefficients))^2))

Wine quality data of white wine samples from Portugal
Description
This dataset is a collection of white "Vinho Verde" wine samples from the north of Portugal. Due to privacy and logistic issues, only physicochemical (input) and sensory (output) variables are available (e.g. there is no data about grape types, wine brand, wine selling price, etc.).
Usage
winequality

Format
A data frame with 4898 rows and 12 variables (a usage sketch follows this list):
fixed acidity.
volatile acidity.
citric acid.
residual sugar.
chlorides.
free sulfur dioxide.
total sulfur dioxide.
density.
pH.
sulphates.
alcohol.
quality (score between 0 and 10).
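A minimal usage sketch, assuming the loaded data frame's column names are syntactic versions of the variables listed above, with quality as the response:

library(sgd)
data("winequality", package = "sgd")
dim(winequality)  # 4898 x 12
# Regress quality on the physicochemical inputs (column names assumed).
fit <- sgd(quality ~ ., data = winequality, model = "lm")
head(fit$coefficients)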