Type: Package
Title: Stochastic Gradient Descent for Scalable Estimation
Version: 1.1.3
Maintainer: Junhyung Lyle Kim <jlylekim@gmail.com>
Description: A fast and flexible set of tools for large scale estimation. It features many stochastic gradient methods, built-in models, visualization tools, automated hyperparameter tuning, model checking, interval estimation, and convergence diagnostics.
URL: https://github.com/airoldilab/sgd
BugReports: https://github.com/airoldilab/sgd/issues
License: GPL-2
Imports: ggplot2, MASS, methods, Rcpp (>= 0.11.3), stats
Suggests: bigmemory, glmnet, gridExtra, R.rsp, testthat, microbenchmark
LinkingTo: BH, bigmemory, Rcpp, RcppArmadillo
LazyData: yes
VignetteBuilder: R.rsp
Encoding: UTF-8
RoxygenNote: 7.3.3
NeedsCompilation: yes
Packaged: 2025-10-21 04:16:50 UTC; jlylekim
Author: Junhyung Lyle Kim [cre, aut], Dustin Tran [aut], Panos Toulis [aut], Tian Lian [ctb], Ye Kuang [ctb], Edoardo Airoldi [ctb]
Repository: CRAN
Date/Publication: 2025-10-21 11:00:02 UTC

Extract Model Coefficients

Description

Extract model coefficients from sgd objects. coefficients is an alias for it.

Usage

## S3 method for class 'sgd'
coef(object, ...)

Arguments

object

object of class sgd.

...

some methods for this generic require additional arguments. None are used in this method.

Value

Coefficients extracted from the model object.


Extract Model Fitted Values

Description

Extract fitted values from sgd objects. fitted.values is an alias for it.

Usage

## S3 method for class 'sgd'
fitted(object, ...)

Arguments

object

object of class sgd.

...

some methods for this generic require additional arguments. None are used in this method.

Value

Fitted values extracted from the object.


Plot objects of class sgd.

Description

Plot objects of class sgd.

Usage

## S3 method for class 'sgd'
plot(x, ..., type = "mse", xaxis = "iteration")
## S3 method for class 'list'
plot(x, ..., type = "mse", xaxis = "iteration")

Arguments

x

object of class sgd.

...

additional arguments used for each type of plot. See ‘Details’.

type

character specifying the type of plot: "mse", "clf", "mse-param". See ‘Details’. Default is "mse".

xaxis

character specifying the x-axis of the plot: "iteration" plots the y values over the log-iteration of the algorithm; "runtime" plots the y values over the time in seconds to reach them. Default is "iteration".

Details

Types of plots available:

mse

Mean squared error in predictions, which takes the following arguments:

x_test

test set

y_test

test responses to compare predictions to

clf

Classification error in predictions, which takes the following arguments:

x_test

test set

y_test

test responses to compare predictions to

mse-param

Mean squared error in parameters, which takes the following arguments:

true_param

true vector of parameters to compare to
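As a minimal sketch of these options (assuming the sgd package is installed; including an intercept column in the design matrices mirrors the requirement documented for predict), a test-set MSE plot over runtime might look like:

```r
# Hypothetical sketch: fit a linear model by SGD on a training split,
# then plot test-set mean squared error against runtime.
library(sgd)

set.seed(1)
N <- 1e4; d <- 5
X <- cbind(1, matrix(rnorm(N * d), ncol = d))  # include an intercept column
theta <- rep(2, d + 1)
y <- X %*% theta + rnorm(N)

train <- 1:(N - 1000)                          # hold out the last 1000 rows
fit <- sgd(x = X[train, ], y = y[train], model = "lm")

# type = "mse" uses the x_test/y_test arguments described above.
plot(fit, x_test = X[-train, ], y_test = y[-train],
     type = "mse", xaxis = "runtime")
```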


Model Predictions

Description

Form predictions using the estimated model parameters from stochastic gradient descent.

Usage

## S3 method for class 'sgd'
predict(object, newdata, type = "link", ...)
predict_all(object, newdata, ...)

Arguments

object

object of class sgd.

newdata

design matrix to form predictions on

type

the type of prediction required. The default "link" is on the scale of the linear predictors; the alternative "response" is on the scale of the response variable. Thus, for a default binomial model, the default predictions are of log-odds (probabilities on the logit scale) and type = "response" gives the predicted probabilities. The "terms" option returns a matrix giving the fitted values of each term in the model formula on the linear predictor scale.

...

further arguments passed to or from other methods.

Details

A column of 1's must be included in newdata if the parameters include a bias (intercept) term.
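A hedged sketch of response-scale predictions for a logistic model (the column of 1's supplies the intercept, per the note above; data and parameter values here are arbitrary):

```r
# Hypothetical sketch: fit a logistic GLM by SGD, then predict on both
# the link and the response scales.
library(sgd)

set.seed(2)
N <- 5e3; d <- 3
X <- cbind(1, matrix(rnorm(N * d), ncol = d))  # column of 1's for the intercept
theta <- c(-1, rep(0.5, d))
p <- 1 / (1 + exp(-X %*% theta))
y <- rbinom(N, 1, p)

fit <- sgd(x = X, y = y, model = "glm",
           model.control = list(family = binomial()))

eta  <- predict(fit, newdata = X)                     # linear predictor ("link")
prob <- predict(fit, newdata = X, type = "response")  # predicted probabilities
```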


Print objects of class sgd.

Description

Print objects of class sgd.

Usage

## S3 method for class 'sgd'
print(x, ...)

Arguments

x

object of class sgd.

...

further arguments passed to or from other methods.


Extract Model Residuals

Description

Extract model residuals from sgd objects. resid is an alias for it.

Usage

## S3 method for class 'sgd'
residuals(object, ...)

Arguments

object

object of class sgd.

...

some methods for this generic require additional arguments. None are used in this method.

Value

Residuals extracted from the object.


Stochastic gradient descent

Description

Run stochastic gradient descent in order to optimize the induced loss function given a model and data.

Usage

sgd(x, ...)
## S3 method for class 'formula'
sgd(formula, data, model, model.control = list(), sgd.control = list(...), ...)
## S3 method for class 'matrix'
sgd(x, y, model, model.control = list(), sgd.control = list(...), ...)
## S3 method for class 'big.matrix'
sgd(x, y, model, model.control = list(), sgd.control = list(...), ...)

Arguments

x,y

a design matrix and the respective vector of outcomes.

...

arguments to be used to form the default sgd.control arguments if it is not supplied directly.

formula

an object of class "formula" (or one that can be coerced to that class): a symbolic description of the model to be fitted. The details can be found in "glm".

data

an optional data frame, list or environment (or object coercible by as.data.frame to a data frame) containing the variables in the model. If not found in data, the variables are taken from environment(formula), typically the environment from which glm is called.

model

character specifying the model to be used: "lm" (linear model), "glm" (generalized linear model), "cox" (Cox proportional hazards model), "gmm" (generalized method of moments), "m" (M-estimation). See ‘Details’.

model.control

a list of parameters for controlling the model.

family ("glm")

a description of the error distribution and link function to be used in the model. This can be a character string naming a family function, a family function, or the result of a call to a family function. (See family for details of family functions.)

rank ("glm")

logical. Should the rank of the design matrix be checked?

fn ("gmm")

a function g(theta, x) which returns a k-vector corresponding to the k moment conditions. It is a required argument if gr is not specified.

gr ("gmm")

a function to return the gradient. If unspecified, a finite-difference approximation will be used.

nparams ("gmm")

number of model parameters. This is automatically determined for other models.

type ("gmm")

character specifying the generalized method of moments procedure: "twostep" (Hansen, 1982), "iterative" (Hansen et al., 1996). Defaults to "iterative".

wmatrix ("gmm")

weighting matrix to be used in the loss function. Defaults to the identity matrix.

loss ("m")

character specifying the loss function to be used in the estimating equation. Default is the Huber loss.

lambda1

L1 regularization parameter. Default is 0.

lambda2

L2 regularization parameter. Default is 0.

sgd.control

an optional list of parameters for controlling the estimation.

method

character specifying the method to be used: "sgd", "implicit", "asgd", "ai-sgd", "momentum", "nesterov". Default is "ai-sgd". See ‘Details’.

lr

character specifying the learning rate to be used: "one-dim", "one-dim-eigen", "d-dim", "adagrad", "rmsprop". Default is "one-dim". See ‘Details’.

lr.control

vector of scalar hyperparameters one can set dependent on the learning rate. For hyperparameters intended to be left at their defaults, specify NA in the corresponding entries. See ‘Details’.

start

starting values for the parameter estimates. Default is random initialization around zero.

size

number of SGD estimates to store for diagnostic purposes (distributed log-uniformly over the total number of iterations).

reltol

relative convergence tolerance. The algorithm stops if it is unable to change the relative mean squared difference in the parameters by more than this amount. Default is 1e-05.

npasses

the maximum number of passes over the data. Default is 3.

pass

logical. Should tol be ignored and the algorithm run for all of npasses?

shuffle

logical. Should the algorithm shuffle the data set, including for each pass?

verbose

logical. Should the algorithm print progress?

Details

Models:

The Cox model assumes that the survival data is ordered when passed in, i.e., such that the risk set of an observation i is all data points after it.

Methods:

sgd

stochastic gradient descent (Robbins and Monro, 1951)

implicit

implicit stochastic gradient descent (Toulis et al., 2014)

asgd

stochastic gradient with averaging (Polyak and Juditsky, 1992)

ai-sgd

implicit stochastic gradient with averaging (Toulis et al., 2015)

momentum

"classical" momentum (Polyak, 1964)

nesterov

Nesterov's accelerated gradient (Nesterov, 1983)

Learning rates and hyperparameters:

one-dim

scalar value prescribed in Xu (2011) as

a_n = scale * gamma * (1 + alpha * gamma * n)^(-c)

where the defaults are lr.control = (scale=1, gamma=1, alpha=1, c), where c is 1 if implemented without averaging and 2/3 if with averaging.

one-dim-eigen

diagonal matrix; lr.control = NULL

d-dim

diagonal matrix; lr.control = (epsilon=1e-6)

adagrad

diagonal matrix prescribed in Duchi et al. (2011) as lr.control = (eta=1, epsilon=1e-6)

rmsprop

diagonal matrix prescribed in Tieleman and Hinton (2012) as lr.control = (eta=1, gamma=0.9, epsilon=1e-6)
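The method, learning rate, and hyperparameters above can be combined through sgd.control. A minimal sketch (hyperparameter values here are arbitrary, chosen only for illustration):

```r
# Hypothetical sketch: plain SGD with an AdaGrad learning rate.
library(sgd)

set.seed(3)
N <- 1e4; d <- 4
X <- cbind(1, matrix(rnorm(N * (d - 1)), ncol = d - 1))
y <- X %*% rep(1, d) + rnorm(N)

# lr.control follows the (eta, epsilon) order given above; an NA entry
# would leave that hyperparameter at its default.
fit <- sgd(x = X, y = y, model = "lm",
           sgd.control = list(method = "sgd",
                              lr = "adagrad",
                              lr.control = c(1, 1e-6),
                              npasses = 5))
```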

Value

An object of class "sgd", which is a list containing the following components:

model

name of the model

coefficients

a named vector of coefficients

converged

logical. Was the algorithm judged to have converged?

estimates

estimates from the algorithm stored at each iteration specified in pos

fitted.values

the fitted mean values

pos

vector of indices specifying the iteration number each estimate was stored for

residuals

the residuals, that is, the response minus the fitted values

times

vector of times in seconds it took to complete the number of iterations specified in pos

model.out

a list of model-specific output attributes

Author(s)

Dustin Tran, Tian Lan, Panos Toulis, Ye Kuang, Edoardo Airoldi

References

John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12:2121-2159, 2011.

Yurii Nesterov. A method for solving a convex programming problem with convergence rate O(1/k^2). Soviet Mathematics Doklady, 27(2):372-376, 1983.

Boris T. Polyak. Some methods of speeding up the convergence of iteration methods. USSR Computational Mathematics and Mathematical Physics, 4(5):1-17, 1964.

Boris T. Polyak and Anatoli B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM Journal on Control and Optimization, 30(4):838-855, 1992.

Herbert Robbins and Sutton Monro. A stochastic approximation method. The Annals of Mathematical Statistics, pp. 400-407, 1951.

Panos Toulis, Jason Rennie, and Edoardo M. Airoldi. "Statistical analysis of stochastic gradient methods for generalized linear models". In Proceedings of the 31st International Conference on Machine Learning, 2014.

Panos Toulis, Dustin Tran, and Edoardo M. Airoldi. "Stability and optimality in stochastic gradient descent". arXiv preprint arXiv:1505.02417, 2015.

Wei Xu. Towards optimal one pass large scale learning with averaged stochastic gradient descent. arXiv preprint arXiv:1107.2490, 2011.


Examples

## Linear regression
set.seed(42)
N <- 1e4
d <- 5
X <- matrix(rnorm(N*d), ncol=d)
theta <- rep(5, d+1)
eps <- rnorm(N)
y <- cbind(1, X) %*% theta + eps
dat <- data.frame(y=y, x=X)
sgd.theta <- sgd(y ~ ., data=dat, model="lm")
sprintf("Mean squared error: %0.3f", mean((theta - as.numeric(sgd.theta$coefficients))^2))
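The same linear regression can be sketched through the matrix interface described in ‘Usage’ (an illustration; cbind(1, X) supplies the intercept column, matching the bias-term convention used elsewhere in this package):

```r
## Linear regression via the matrix interface
library(sgd)

set.seed(42)
N <- 1e4; d <- 5
X <- matrix(rnorm(N * d), ncol = d)
theta <- rep(5, d + 1)
y <- cbind(1, X) %*% theta + rnorm(N)

fit <- sgd(x = cbind(1, X), y = y, model = "lm")
mean((theta - as.numeric(fit$coefficients))^2)
```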

Wine quality data of white wine samples from Portugal

Description

This dataset is a collection of white "Vinho Verde" wine samples from the north of Portugal. Due to privacy and logistic issues, only physicochemical (inputs) and sensory (the output) variables are available (e.g., there is no data about grape types, wine brand, wine selling price, etc.).

Usage

winequality

Format

A data frame with 4898 rows and 12 variables

Source

https://archive.ics.uci.edu/ml/datasets/Wine+Quality
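A hedged sketch of using the dataset with sgd (the response column name, quality, follows the UCI description of the sensory output and is an assumption here):

```r
# Hypothetical sketch: regress wine quality on the physicochemical inputs.
library(sgd)
data("winequality", package = "sgd")

fit <- sgd(quality ~ ., data = winequality, model = "lm")
head(coef(fit))
```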

