| SAS | |
|---|---|
| Paradigm | Multi-paradigm: data-driven, procedural programming |
| Designed by | Anthony James Barr |
| Developer | SAS Institute |
| First appeared | 1976 |
| OS | Windows and macOS |
| License | Proprietary commercial software |
| Filename extensions | .sas |
| Website | sas |
The SAS language is a fourth-generation computer programming language used for statistical analysis, created by Anthony James Barr at North Carolina State University.[1][2] Its primary applications include data mining and machine learning. The SAS language runs under compilers such as the SAS System, which can be used on Microsoft Windows, Linux, UNIX and mainframe computers.[3]
SAS was developed in the 1960s by Anthony James Barr, who built its fundamental structure,[4] and SAS Institute CEO James Goodnight, who developed a number of features including analysis procedures.[5] The language is currently developed and sponsored by the SAS Institute, of which Goodnight is founder and CEO.[6]
Base SAS is a fourth-generation procedural programming language designed for the statistical analysis of data.[7] It is Turing-complete and domain-specific, with many of the attributes of a command language. As an interpreted language, it is generally parsed, compiled, and executed step by step.[8] The SAS system was originally a single instruction, single data (SISD) engine, but single instruction, multiple data (SIMD) and multiple instruction, multiple data (MIMD) functionality was later added.[9] Most base SAS code can be ported between versions, but some functions and parameters are specific to certain operating systems and interfaces.[10]
All SAS programs are written in the SAS language, although some packages use menu-driven graphical user interfaces on the front end.[11] Various SAS editors use color coding to identify components like step boundaries, keywords and constants.[12] SAS can read in data from common spreadsheets and databases and output the results of statistical analyses in tables, graphs, and as RTF, HTML and PDF documents.[13]
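The import and report-output workflow described above can be sketched as follows. This is a minimal illustration, not a canonical recipe: the file paths and the data set name `work.sales` are hypothetical placeholders, and the CSV file is assumed to have a header row and at least one numeric column.

```sas
/* Hypothetical input path; replace with a real CSV file. */
proc import datafile="/path/to/sales.csv"
            out=work.sales          /* assumed output data set name */
            dbms=csv
            replace;
    getnames=yes;                   /* take variable names from the header row */
run;

/* Route procedure output to HTML and PDF (hypothetical output paths). */
ods html file="/path/to/report.html";
ods pdf  file="/path/to/report.pdf";

/* With no VAR statement, PROC MEANS summarizes every numeric variable. */
proc means data=work.sales n mean std;
run;

/* Close the destinations so the files are written out. */
ods pdf close;
ods html close;
```

An `ods rtf` destination can be opened and closed in the same way to produce an RTF document.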
The language consists of two main types of blocks: DATA blocks and PROC blocks.[14] DATA blocks can be used to read and manipulate input data, and create data sets. PROC blocks are used to perform analyses and operations on these data sets, sort data, and output results in the form of descriptive statistics, tables, results, charts and plots.[15][16] PROC SQL can be used to work with SQL syntax within SAS.[17]
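As a rough sketch of how these block types fit together, the following program builds a small data set in a DATA block and then processes it with three PROC blocks. The data set name `work.scores`, its variables and its values are invented for illustration.

```sas
/* DATA block: create a data set from inline records (illustrative values). */
data work.scores;
    input name $ team $ score;
    datalines;
Ann A 82
Bob B 75
Cid A 91
Dee B 68
;
run;

/* PROC block: sort the data set by team. */
proc sort data=work.scores;
    by team;
run;

/* PROC block: descriptive statistics of score for each team. */
proc means data=work.scores mean min max;
    class team;
    var score;
run;

/* PROC SQL: a comparable per-team average written in SQL syntax. */
proc sql;
    select team, avg(score) as mean_score
    from work.scores
    group by team;
quit;
```

Unlike most procedures, PROC SQL runs each statement as it is submitted and is ended with `quit;` rather than `run;`.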
Users can input both numeric and character data into base SAS. SAS statements must begin with a reserved keyword and end with a semicolon (;),[18] but the language is otherwise flexible in terms of formatting and most statements are case insensitive.[19] SAS statements can continue across multiple lines and do not require indenting, although indents can improve readability.[18] Comments are delimited by /* and */.[20]
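These formatting rules can be illustrated with a short sketch. The data set `work.people` and its contents are hypothetical; the mixed keyword casing is deliberate.

```sas
/* Keywords are case insensitive, so DATA, Data and data are equivalent. */
DATA work.people;
    INPUT name $ age;      /* the $ marks name as character data; age is numeric */
    datalines;
Ann 34
Bob 41
;
run;

* A statement-style comment, which also ends with a semicolon;
proc print
    data = work.people;    /* one statement continued across two lines */
run;
```

The indentation here is purely for readability; the program behaves the same if every statement starts in the first column.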
A standard SAS program typically entails the definition of data, the creation of a data set, and the performance of procedures such as analysis on that data set.[18] SAS scripts have the .sas extension.
A simple example of SAS code is the following:

    * COMMENT;
    Data TEMP;
    input X Y Z;
    datalines;
    1 2 3
    5 6 7
    ;
    run;
    PROC PRINT DATA = TEMP;
    RUN;
The SAS macro language is made available within base SAS software to reduce the amount of code, and create code generators for building more versatile and flexible programs.[21] The macro language can be used for functionalities as simple as symbolic substitution and as complex as dynamic programming.[8] SAS macro is considered to be a rich language,[22] although its overall syntax is very similar to that of base SAS. The names of macro variables in SAS are usually preceded by &, while macro program statements are usually preceded by %.[8]
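A minimal sketch of both uses follows: a macro variable for symbolic substitution and a small macro that generates a PROC step. The names `cutoff`, `print_above`, `ds` and `min`, and the data set `work.scores`, are hypothetical and chosen only for illustration.

```sas
/* Symbolic substitution: &cutoff resolves to 70 wherever it is referenced. */
%let cutoff = 70;

/* Small illustrative data set (invented values). */
data work.scores;
    input name $ score;
    datalines;
Ann 82
Bob 65
;
run;

/* A macro that generates a PROC PRINT step for a given data set and threshold. */
%macro print_above(ds=, min=);
    proc print data=&ds;
        where score >= &min;
    run;
%mend print_above;

/* The call expands into ordinary SAS code before that code is executed. */
%print_above(ds=work.scores, min=&cutoff)
```

Because the macro processor only generates text, the same macro can be called again with a different data set or threshold without duplicating the PROC step.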
SAS Institute develops a number of tools and software suites, also called SAS, which are used for creating programs in the language. These suites include JMP, SAS Viya, SAS Enterprise Guide and SAS Enterprise Miner.[3][9][17] In 2002, World Programming also developed software that allows the execution of most SAS scripts.[17]
The SAS language is used as a standard in many industries,[17] and was ranked #22 on the TIOBE index in February 2024.[23] It is especially widely used for machine learning,[24] data mining, and data warehousing in the finance, insurance, manufacturing, health care and pharmaceutical industries.[14] It has a high level of documentation and community support,[20] which has contributed to its uptake.[24]
SAS is used for preparing input data, and building and optimizing machine learning algorithms.[25] Various models, such as artificial neural networks (ANN), convolutional neural networks and deep learning models, are developed and trained in SAS.[26] These are applied to areas such as computer vision and fraud detection.[27] SAS has also been noted for its applications in the area of decision intelligence.[28]
While SAS was originally developed for data analysis, it became an important language for data storage.[5] SAS is one of the primary languages used for data mining in business intelligence and statistics.[29] According to Gartner's Magic Quadrant and Forrester Research, the SAS Institute is one of the largest vendors of data mining software.[24]