modelc is an R model object to SQL compiler. It generates SQL select statements from linear and generalized linear models.
Its interface currently consists of a single function, modelc, which takes a single input, namely an lm or glm model object.
It currently supports Gaussian and gamma family distributions using log or identity link functions.
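As a rough illustration of what the link function means for a compiled expression, the sketch below uses only base R (it does not call modelc, and the data and variable names are made up for the example). With a log link, the fitted response is the exponential of the linear predictor, which is why a SQL translation of such a model would wrap the linear terms in EXP(...); with an identity link no wrapper is needed.

```r
# Illustrative data (hypothetical, not from the modelc README).
set.seed(1)
x <- 1:10
y <- 2 * x + runif(10) + 1

# Gamma GLM with a log link.
fit <- glm(y ~ x, family = Gamma(link = "log"))

# Linear predictor built by hand from the fitted coefficients.
linear_predictor <- coef(fit)[["(Intercept)"]] + coef(fit)[["x"]] * x

# Inverse of the log link is exp(); this should match predict(type = "response").
manual <- exp(linear_predictor)
response <- unname(predict(fit, type = "response"))

all.equal(manual, response)
```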
To import linear models directly to your SQL Server database, consider using Castpack, which depends on modelc.
Suppose we have the following data:

    a <- 1:10
    b <- 2 * 1:10 + runif(1) * 1.5
    c <- as.factor(1:10)
    df <- data.frame(a, b, c)
    formula = b ~ a + c

A vanilla linear model

    linear_model <- lm(formula, data = df)
    modelc(linear_model)

generates the following SQL

    0.231808555545287 + 2 * `a` + (CASE
      WHEN c = 2 THEN -0.00000000000000193216758587821 * c
      WHEN c = 3 THEN -0.000000000000000776180314897008 * c
      WHEN c = 4 THEN -0.000000000000000665297412768863 * c
      WHEN c = 5 THEN -0.00000000000000055441451064072 * c
      WHEN c = 6 THEN -0.000000000000000887620818362638 * c
      WHEN c = 7 THEN -0.000000000000000332648706384432 * c
      WHEN c = 8 THEN -0.00000000000000110994422395641 * c
      WHEN c = 9 THEN -0.00000000000000188723974152839 * c
      WHEN c = 10 THEN 0 * c
    END)

GLMs are also supported with log or identity link functions

    glm_model <- glm(formula, data = df, family = Gamma(link = "log"))
    modelc(glm_model)

    EXP(0.557874070609732 + 0.244938197625494 * `a` + (CASE
      WHEN c = 2 THEN 0.394878990324516 * c
      WHEN c = 3 THEN 0.536977925025217 * c
      WHEN c = 4 THEN 0.570378881020516 * c
      WHEN c = 5 THEN 0.542936294999294 * c
      WHEN c = 6 THEN 0.476536561025273 * c
      WHEN c = 7 THEN 0.383038044594683 * c
      WHEN c = 8 THEN 0.269593156578649 * c
      WHEN c = 9 THEN 0.140849942185343 * c
      WHEN c = 10 THEN 0 * c
    END))

    glm_model_idlink <- glm(formula, data = df, family = Gamma(link = "identity"))
    modelc(glm_model_idlink)

    0.231808555545287 + 2 * `a` + (CASE
      WHEN c = 2 THEN 0.00000000000000139594865689472 * c
      WHEN c = 3 THEN -0.000000000000000581567338978993 * c
      WHEN c = 4 THEN -0.00000000000000111588502938831 * c
      WHEN c = 5 THEN 0.000000000000000967650035758108 * c
      WHEN c = 6 THEN -0.00000000000000149265067586469 * c
      WHEN c = 7 THEN -0.000000000000000100985345060517 * c
      WHEN c = 8 THEN -0.0000000000000000673235633736781 * c
      WHEN c = 9 THEN 0.00000000000000199047558220559 * c
      WHEN c = 10 THEN 0 * c
    END)

To avoid generating invalid SQL, modelc temporarily sets your scipen option to 999, which keeps coefficients from being printed in scientific notation.
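The effect of the scipen option can be seen with base R alone. The sketch below is not modelc's implementation; `format_with_scipen` is a hypothetical helper written for this example. It sets scipen temporarily, formats a number, and restores the previous value on exit, which is one common way to make a temporary option change safe.

```r
# Hypothetical helper (not part of modelc): format x under a given scipen
# value, restoring the caller's option when the function exits.
format_with_scipen <- function(x, scipen, digits = 15) {
  old <- options(scipen = scipen)
  on.exit(options(old), add = TRUE)
  format(x, digits = digits)
}

# Remember the current setting so we can confirm it is left untouched.
scipen_before <- getOption("scipen")

# A very small coefficient, taken from the linear model output above.
coef_value <- -0.00000000000000193216758587821

# With scipen = 0, R prefers scientific notation for this value.
sci_text <- format_with_scipen(coef_value, scipen = 0)

# With scipen = 999, R uses fixed notation instead.
fixed_text <- format_with_scipen(coef_value, scipen = 999)

print(sci_text)
print(fixed_text)
```

Restoring the option with on.exit means the caller's setting survives even if formatting fails partway through.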
Using devtools:

    install.packages("devtools")
    install.packages("remotes")
    remotes::install_github("sparkfish/modelc")

Note that you may encounter minor differences between the output of your R and generated SQL models depending on the precision with which your numeric types are represented in the database. To ensure parity between the two models, numeric types should have a precision of at least 17.
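The figure of 17 lines up with a general property of double-precision numbers: 17 significant decimal digits are enough to recover any double exactly, while fewer may not be. The sketch below shows this in base R using an arbitrary stand-in value; it does not touch a database and is not part of modelc.

```r
# An arbitrary double whose shortest exact decimal form needs 17 digits.
x <- 0.1 + 0.2

# Write the value with 15 and with 17 significant digits.
text_15 <- sprintf("%.15g", x)
text_17 <- sprintf("%.17g", x)

# Read each string back and compare with the original value.
round_trip_15 <- as.numeric(text_15) == x
round_trip_17 <- as.numeric(text_17) == x

print(c(text_15, text_17))
print(c(round_trip_15, round_trip_17))
```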
Tests are written using testthat. To run them, run:
devtools::test()