Movatterモバイル変換


[0]ホーム

URL:


GitHub

Julia ASTs

Julia has two representations of code. First there is a surface syntax AST returned by the parser (e.g. theMeta.parse function), and manipulated by macros. It is a structured representation of code as it is written, constructed byjulia-parser.scm from a character stream. Next there is a lowered form, or IR (intermediate representation), which is used by type inference and code generation. In the lowered form there are fewer types of nodes, all macros are expanded, and all control flow is converted to explicit branches and sequences of statements. The lowered form is constructed byjulia-syntax.scm.

First we will focus on the AST, since it is needed to write macros.

Surface syntax AST

Front end ASTs consist almost entirely ofExprs and atoms (e.g. symbols, numbers). There is generally a different expression head for each visually distinct syntactic form. Examples will be given in s-expression syntax. Each parenthesized list corresponds to an Expr, where the first element is the head. For example(call f x) corresponds toExpr(:call, :f, :x) in Julia.

Calls

InputAST
f(x)(call f x)
f(x, y=1, z=2)(call f x (kw y 1) (kw z 2))
f(x; y=1)(call f (parameters (kw y 1)) x)
f(x...)(call f (... x))

do syntax:

f(x) do a,b    bodyend

parses as(do (call f x) (-> (tuple a b) (block body))).

Operators

Most uses of operators are just function calls, so they are parsed with the headcall. However some operators are special forms (not necessarily function calls), and in those cases the operator itself is the expression head. In julia-parser.scm these are referred to as "syntactic operators". Some operators (+ and*) use N-ary parsing; chained calls are parsed as a single N-argument call. Finally, chains of comparisons have their own special expression structure.

InputAST
x+y(call + x y)
a+b+c+d(call + a b c d)
2x(call * 2 x)
a&&b(&& a b)
x += 1(+= x 1)
a ? 1 : 2(if a 1 2)
a,b(tuple a b)
a==b(call == a b)
1<i<=n(comparison 1 < i <= n)
a.b(. a (quote b))
a.(b)(. a (tuple b))

Bracketed forms

InputAST
a[i](ref a i)
t[i;j](typed_vcat t i j)
t[i j](typed_hcat t i j)
t[a b; c d](typed_vcat t (row a b) (row c d))
t[a b;;; c d](typed_ncat t 3 (row a b) (row c d))
a{b}(curly a b)
a{b;c}(curly a (parameters c) b)
[x](vect x)
[x,y](vect x y)
[x;y](vcat x y)
[x y](hcat x y)
[x y; z t](vcat (row x y) (row z t))
[x;y;; z;t;;;](ncat 3 (nrow 2 (nrow 1 x y) (nrow 1 z t)))
[x for y in z, a in b](comprehension (generator x (= y z) (= a b)))
T[x for y in z](typed_comprehension T (generator x (= y z)))
(a, b, c)(tuple a b c)
(a; b; c)(block a b c)

Macros

InputAST
@m x y(macrocall @m (line) x y)
Base.@m x y(macrocall (. Base (quote @m)) (line) x y)
@Base.m x y(macrocall (. Base (quote @m)) (line) x y)

Strings

InputAST
"a""a"
x"y"(macrocall @x_str (line) "y")
x"y"z(macrocall @x_str (line) "y" "z")
"x = $x"(string "x = " x)
`a b c`(macrocall @cmd (line) "a b c")

Doc string syntax:

"some docs"f(x) = x

parses as(macrocall (|.| Core '@doc) (line) "some docs" (= (call f x) (block x))).

Imports and such

InputAST
import a(import (. a))
import a.b.c(import (. a b c))
import ...a(import (. . . . a))
import a.b, c.d(import (. a b) (. c d))
import Base: x(import (: (. Base) (. x)))
import Base: x, y(import (: (. Base) (. x) (. y)))
export a, b(export a b)

using has the same representation asimport, but with expression head:using instead of:import.

Numbers

Julia supports more number types than many scheme implementations, so not all numbers are represented directly as scheme numbers in the AST.

InputAST
11111111111111111111(macrocall @int128_str nothing "11111111111111111111")
0xfffffffffffffffff(macrocall @uint128_str nothing "0xfffffffffffffffff")
1111...many digits...(macrocall @big_str nothing "1111....")

Block forms

A block of statements is parsed as(block stmt1 stmt2 ...).

If statement:

if a    belseif c    delse    eend

parses as:

(if a (block (line 2) b)    (elseif (block (line 3) c) (block (line 4) d)            (block (line 6 e))))

Awhile loop parses as(while condition body).

Afor loop parses as(for (= var iter) body). If there is more than one iteration specification, they are parsed as a block:(for (block (= v1 iter1) (= v2 iter2)) body).

break andcontinue are parsed as 0-argument expressions(break) and(continue).

let is parsed as(let (= var val) body) or(let (block (= var1 val1) (= var2 val2) ...) body), likefor loops.

A basic function definition is parsed as(function (call f x) body). A more complex example:

function f(x::T; k = 1) where T    return x+1end

parses as:

(function (where (call f (parameters (kw k 1))                       (:: x T))                 T)          (block (line 2) (return (call + x 1))))

Type definition:

mutable struct Foo{T<:S}    x::Tend

parses as:

(struct true (curly Foo (<: T S))        (block (line 2) (:: x T)))

The first argument is a boolean telling whether the type is mutable.

try blocks parse as(try try_block var catch_block finally_block). If no variable is present aftercatch,var is#f. If there is nofinally clause, then the last argument is not present.

Quote expressions

Julia source syntax forms for code quoting (quote and:( )) support interpolation with$. In Lisp terminology, this means they are actually "backquote" or "quasiquote" forms. Internally, there is also a need for code quoting without interpolation. In Julia's scheme code, non-interpolating quote is represented with the expression headinert.

inert expressions are converted to JuliaQuoteNode objects. These objects wrap a single value of any type, and when evaluated simply return that value.

Aquote expression whose argument is an atom also gets converted to aQuoteNode.

Line numbers

Source location information is represented as(line line_num file_name) where the third component is optional (and omitted when the current line number, but not file name, changes).

These expressions are represented asLineNumberNodes in Julia.

Macros

Macro hygiene is represented through the expression head pairescape andhygienic-scope. The result of a macro expansion is automatically wrapped in(hygienic-scope block module), to represent the result of the new scope. The user can insert(escape block) inside to interpolate code from the caller.

Lowered form

Lowered form (IR) is more important to the compiler, since it is used for type inference, optimizations like inlining, and code generation. It is also less obvious to the human, since it results from a significant rearrangement of the input syntax.

In addition toSymbols and some number types, the following data types exist in lowered form:

Expr types

These symbols appear in thehead field ofExprs in lowered form.

Method

A unique'd container describing the shared metadata for a single method.

MethodInstance

A unique'd container describing a single callable signature for a Method. See especiallyProper maintenance and care of multi-threading locks for important details on how to modify these fields safely.

CodeInstance

CodeInfo

A (usually temporary) container for holding lowered source code.

Optional Fields:

Boolean properties:

UInt8 settings:

Settings


This document was generated withDocumenter.jl version 1.8.0 onWednesday 9 July 2025. Using Julia version 1.11.6.


[8]ページ先頭

©2009-2025 Movatter.jp