S-expression

From Wikipedia, the free encyclopedia

Data serialization format

Tree data structure representing the S-expression `(* 2 (+ 3 4))`

In computer programming, an S-expression (or symbolic expression, abbreviated as sexpr or sexp) is an expression in a like-named notation for nested list (tree-structured) data. S-expressions were invented for, and popularized by, the programming language Lisp, which uses them for source code as well as data.

Characteristics

In the usual parenthesized syntax of Lisp, an S-expression is classically defined^[1] as

an atom of the form x, or
an expression of the form (x . y) where x and y are S-expressions.

This definition reflects LISP's representation of a list as a series of "cells", each one an ordered pair. In plain lists, y points to the next cell (if any), thus forming a list. The recursive clause of the definition means that both this representation and the S-expression notation can represent any binary tree. However, the representation can in principle allow circular references, in which case the structure is not a tree at all but a cyclic graph, and cannot be represented in classical S-expression notation unless a convention for cross-reference is provided, analogous to SQL foreign keys, SGML/XML IDREFs, etc. Modern Lisp dialects such as Common Lisp^[2] and Scheme^[3] provide such syntax via datum labels, with which objects can be marked and then referred to elsewhere, indicating shared rather than duplicated structure. This enables the reader or printer to detect cycles and to evaluate or display them without recursing infinitely:

#n=(x y . #n#)
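The labelling idea can be illustrated with a small Python sketch. It is not how any particular Lisp printer is implemented; the `Cons` class, the use of `None` as the end-of-list marker, and the function name `write_with_labels` are all assumptions made for this illustration. A first pass finds pairs that are reachable more than once, and a second pass prints the structure, defining a label the first time such a pair is met and citing the label afterwards.

```python
class Cons:
    """A mutable pair; cdr=None marks the end of a list (an assumption of this sketch)."""

    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr


def write_with_labels(expr):
    """Render expr, labelling every pair that is reached more than once."""
    seen, shared = set(), set()

    def scan(node):
        # First pass: record pairs that are reachable along more than one path.
        if not isinstance(node, Cons):
            return
        if id(node) in seen:
            shared.add(id(node))
            return
        seen.add(id(node))
        scan(node.car)
        scan(node.cdr)

    labels = {}

    def emit(node):
        # Second pass: define a label on the first visit, cite it on later visits.
        if node is None:
            return "()"
        if not isinstance(node, Cons):
            return str(node)
        if id(node) in labels:
            return f"#{labels[id(node)]}#"
        prefix = ""
        if id(node) in shared:
            labels[id(node)] = len(labels)
            prefix = f"#{labels[id(node)]}="
        parts = []
        while True:
            parts.append(emit(node.car))
            rest = node.cdr
            if rest is None:
                break
            if not isinstance(rest, Cons) or id(rest) in shared:
                parts += [".", emit(rest)]  # improper or labelled tail
                break
            node = rest
        return prefix + "(" + " ".join(parts) + ")"

    scan(expr)
    return emit(expr)


cycle = Cons("x", Cons("y"))
cycle.cdr.cdr = cycle  # the tail points back at the first pair
print(write_with_labels(cycle))  # #0=(x y . #0#)
```

Without the first pass, a naive recursive printer would never terminate on `cycle`. The sketch itself recurses on nested structure, so very deep inputs would need an iterative scan.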

The definition of an atom varies per context; in the original definition by John McCarthy,^[1] it was assumed that there existed "an infinite set of distinguishable atomic symbols" represented as "strings of capital Latin letters and digits with single embedded blanks" (a subset of character string and numeric literals).

Most modern sexpr notations allow more general quoted strings (for example including punctuation or full Unicode), and use an abbreviated notation to represent lists with more than 2 members, so that

(x y z)

stands for

(x . (y . (z . NIL)))
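The correspondence between the abbreviated list notation and nested pairs can be made concrete with a short Python sketch. The names `Cons`, `from_list`, and `to_dotted`, and the choice of `None` to stand for NIL, are assumptions of this illustration rather than part of any Lisp implementation.

```python
from dataclasses import dataclass
from typing import Any

NIL = None  # assumption: Python's None stands in for the end-of-list object


@dataclass(frozen=True)
class Cons:
    """An ordered pair (car . cdr), the building block of lists."""
    car: Any
    cdr: Any


def from_list(items):
    """Build the chain of pairs that a proper list abbreviates."""
    result = NIL
    for item in reversed(items):
        result = Cons(item, result)
    return result


def to_dotted(expr):
    """Render an expression in fully dotted notation."""
    if expr is NIL:
        return "NIL"
    if isinstance(expr, Cons):
        return f"({to_dotted(expr.car)} . {to_dotted(expr.cdr)})"
    return str(expr)


print(to_dotted(from_list(["x", "y", "z"])))  # (x . (y . (z . NIL)))
```

Building the chain from the last element backwards means each new pair simply points at the list constructed so far.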

NIL is the special end-of-list object (alternatively written (), which is the only representation in Scheme^[4]).

In the Lisp family of programming languages, S-expressions are used to represent both source code and data. Other uses of S-expressions are in Lisp-derived languages such as DSSSL, and as mark-up in communication protocols like IMAP and John McCarthy's CBCL. They are also used as the text representation of WebAssembly. The details of the syntax and supported data types vary in the different languages, but the most common feature among these languages is the use of S-expressions and prefix notation.

Datatypes and syntax

There are many variants of the S-expression format, supporting a variety of different syntaxes for different datatypes. The most widely supported are:

Lists and pairs: (1 () (2 . 3) (4))
Symbols: with-hyphen ?@!$ |a symbol with spaces|
Strings: "Hello, world!"
Integers: -9876543210
Floating-point numbers: -0.0 6.28318 6.022e23

The character # is often used to prefix extensions to the syntax, e.g. #x10 for hexadecimal integers, or #\C for characters.
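A reader for a small subset of these datatypes can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a conforming reader for any dialect: it handles lists, symbols, double-quoted strings, integers, and floats, but not dotted pairs, |bar-quoted| symbols, or # extensions. Escape sequences inside strings are kept verbatim, and the names `Symbol`, `tokenize`, and `read` are chosen for this example.

```python
import re
from dataclasses import dataclass

# A token is a quoted string, a single parenthesis, or a run of other characters.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|[()]|[^\s()"]+')
INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?(?:\d+\.\d*|\.\d+)(?:[eE][+-]?\d+)?|[+-]?\d+[eE][+-]?\d+")


@dataclass(frozen=True)
class Symbol:
    """A symbol, kept distinct from a quoted string."""
    name: str


def tokenize(text):
    return TOKEN.findall(text)


def parse_atom(token):
    """Classify a token as a string, integer, float, or symbol."""
    if token.startswith('"'):
        return token[1:-1]  # strip the quotes; escapes are left untouched
    if INTEGER.fullmatch(token):
        return int(token)
    if FLOAT.fullmatch(token):
        return float(token)
    return Symbol(token)


def read_from(tokens):
    """Consume one expression from the front of the token list."""
    if not tokens:
        raise SyntaxError("unexpected end of input")
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens and tokens[0] != ")":
            items.append(read_from(tokens))
        if not tokens:
            raise SyntaxError("missing )")
        tokens.pop(0)  # discard the closing parenthesis
        return items
    if token == ")":
        raise SyntaxError("unexpected )")
    return parse_atom(token)


def read(text):
    """Parse exactly one S-expression into nested Python lists and atoms."""
    tokens = tokenize(text)
    expr = read_from(tokens)
    if tokens:
        raise SyntaxError("unexpected trailing input")
    return expr


print(read('(greet "Hello, world!" -9876543210 6.022e23 (nested with-hyphen?))'))
```

Lists are returned as Python lists rather than chains of pairs, which keeps the sketch short at the cost of not being able to express improper lists such as (2 . 3).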

Use in Lisp

When representing source code in Lisp, the first element of an S-expression is commonly an operator or function name and any remaining elements are treated as arguments. This is called "prefix notation" or "Polish notation". As an example, the Boolean expression written 4 == (2 + 2) in C is represented as (= 4 (+ 2 2)) in Lisp's s-expr-based prefix notation.
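How prefix notation maps onto evaluation can be sketched in a few lines of Python. The sketch assumes the expression has already been parsed into nested Python lists, with operator names as plain strings; the `OPERATORS` table and the `evaluate` function are illustrative names, and only four arithmetic and comparison operators are covered.

```python
import operator
from functools import reduce

# Assumed operator table for this sketch; a real Lisp has a far richer environment.
OPERATORS = {
    "+": lambda *args: sum(args),
    "*": lambda *args: reduce(operator.mul, args, 1),
    "-": operator.sub,
    "=": operator.eq,
}


def evaluate(expr):
    """Evaluate a nested-list expression written in prefix notation."""
    if not isinstance(expr, list):
        return expr  # numbers evaluate to themselves
    operator_name, *operands = expr
    function = OPERATORS[operator_name]
    # Evaluate the arguments first, then apply the operator to the results.
    return function(*[evaluate(operand) for operand in operands])


print(evaluate(["=", 4, ["+", 2, 2]]))  # True
print(evaluate(["*", 2, ["+", 3, 4]]))  # 14
```

Because the operator always comes first, the evaluator needs no precedence rules: the nesting of the lists already fixes the order of operations.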

As noted above, the precise definition of "atom" varies across LISP-like languages. A quoted string can typically contain anything but a quote, while an unquoted identifier atom can typically contain anything but quotes, whitespace characters, parentheses, brackets, braces, backslashes, and semicolons. In either case, a prohibited character can typically be included by escaping it with a preceding backslash. Unicode support varies.

The recursive case of the definition of S-expressions is traditionally implemented using cons cells.

S-expressions were originally intended only for data to be manipulated by M-expressions, but the first implementation of Lisp was an interpreter of S-expression encodings of M-expressions, and Lisp programmers soon became accustomed to using S-expressions for both code and data. This means that Lisp is homoiconic; that is, the primary representation of programs is also a data structure in a primitive type of the language itself.

Nested lists can be written as S-expressions: ((milk juice) (honey marmalade)) is a two-element S-expression whose elements are also two-element S-expressions. The whitespace-separated notation used in Lisp (and this article) is typical. Line breaks (newline characters) usually qualify as separators. This is a simple context-free grammar for a tiny subset of English written as an S-expression,^[5] where S = sentence, NP = Noun Phrase, VP = Verb Phrase, V = Verb:

(((S) (NP VP))
 ((VP) (V))
 ((VP) (V NP))
 ((V) died)
 ((V) employed)
 ((NP) nurses)
 ((NP) patients)
 ((NP) Medicenter)
 ((NP) "Dr Chan"))

Program code can be written in S-expressions, usually using prefix notation. Example in Common Lisp:

(defun factorial (x)
  (if (zerop x)
      1
      (* x (factorial (- x 1)))))

S-expressions can be read in Lisp using the function READ. READ reads the textual representation of an S-expression and returns Lisp data. The function PRINT can be used to output an S-expression. The output can then be read back with READ, provided that all printed data objects have a readable representation. Lisp has readable representations for numbers, strings, symbols, lists and many other data types. Program code can be formatted as pretty-printed S-expressions using the function PPRINT (note: with two Ps, short for pretty-print).

Lisp programs are valid S-expressions, but not all S-expressions are valid Lisp programs. (1.0 + 3.1) is a valid S-expression, but not a valid Lisp program, since Lisp uses prefix notation and a floating point number (here 1.0) is not valid as an operation (the first element of the expression).

An S-expression preceded by a single quotation mark, as in 'x, is syntactic sugar for a quoted S-expression, in this case (quote x).
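The expansion of the quotation mark happens when the text is read, which the following Python sketch illustrates. It assumes well-formed input, represents every atom as a plain string, and uses the illustrative names `TOKEN`, `read_from`, and `read`; it is not taken from any Lisp implementation.

```python
import re

# A token is a parenthesis, a quotation mark, or a run of other characters.
TOKEN = re.compile(r"[()']|[^\s()']+")


def read_from(tokens):
    """Consume one expression, expanding 'x into (quote x)."""
    token = tokens.pop(0)
    if token == "'":
        # The quotation mark wraps whatever expression follows it.
        return ["quote", read_from(tokens)]
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(read_from(tokens))
        tokens.pop(0)  # discard the closing parenthesis
        return items
    return token  # atoms are kept as plain strings in this sketch


def read(text):
    return read_from(TOKEN.findall(text))


print(read("'x"))                # ['quote', 'x']
print(read("(list 'a '(b c))"))  # ['list', ['quote', 'a'], ['quote', ['b', 'c']]]
```

After reading, no trace of the quotation mark remains: later stages only ever see the ordinary list form.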

Relation to XML

S-expressions are often compared to XML: one key difference is that S-expressions have just one form of containment, the dotted pair, while XML tags can contain simple attributes, other tags, or CDATA, each using different syntax. Another is that S-expressions do not define a reference mechanism, whereas XML provides a notion of unique identifiers and references to them. For simple use cases, S-expressions are simpler than XML, but for more advanced use cases, XML has a query language called XPath which many tools and third-party libraries use to simplify the handling of XML data.

Standardization

References

[1] John McCarthy (1960/2006). Recursive functions of symbolic expressions. Archived 2004-02-02 at the Wayback Machine. Originally published in Communications of the ACM.
[2] "Common Lisp HyperSpec: 22.4 - The Printer Dictionary: *PRINT-CIRCLE*". 2018-12-28.
[3] "Revised^7 Report on the Algorithmic Language Scheme: Section 2.4: Datum Labels" (PDF). 2013-07-06.
[4] "Revised^5 Report on the Algorithmic Language Scheme". schemers.org.
[5] G. Gazdar, Ch. Melish, Natural Language Processing in LISP.
[6] Sperber, Michael; Dybvig, R. Kent; Flatt, Matthew; Van Straaten, Anton; Findler, Robby; Matthews, Jacob (Aug 12, 2009). "Revised^6 Report on the Algorithmic Language Scheme". Journal of Functional Programming. 19 (S1): 1–301. CiteSeerX 10.1.1.372.373. doi:10.1017/S0956796809990074. S2CID 267822156.
[7] S-expressions, Network Working Group, Internet Draft, Expires November 4, 1997. R. Rivest, May 4, 1997, draft-rivest-sexp-00.txt. Ronald L. Rivest, CSAIL MIT website.
[8] rivest sexp, Google Scholar (search).
[9] "SEXP (S-expressions)". people.csail.mit.edu. Archived from the original on 2023-02-23. Retrieved 2023-05-05.

External links

sfsexp, the small, fast S-expression library for C/C++, on GitHub
minilisp, by Léon Bottou.
S-expressions on Rosettacode has implementations of readers and writers in many languages.
