
Bytecode


From Wikipedia, the free encyclopedia


Form of instruction set designed to be run by a software interpreter

"Portable code" and "P-code" redirect here. For other uses, see software portability and P-Code (disambiguation).



Bytecode (also called portable code or p-code) is a form of instruction set designed for efficient execution by a software interpreter. Unlike human-readable^[1] source code, bytecodes are compact numeric codes, constants, and references (normally numeric addresses) that encode the result of a compiler parsing the source and performing semantic analysis of such things as the type, scope, and nesting depth of program objects.

The name bytecode stems from instruction sets that have one-byte opcodes followed by optional parameters. Intermediate representations such as bytecode may be output by programming language implementations to ease interpretation, or they may be used to reduce hardware and operating system dependence by allowing the same code to run cross-platform, on different devices. Bytecode may be either directly executed on a virtual machine (a p-code machine, i.e., an interpreter) or further compiled into machine code for better performance.
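The "one-byte opcode followed by optional parameters" layout can be sketched in a few lines of Python. The opcodes `PUSH`, `ADD` and `HALT`, their numeric values, and the 4-byte operand width are assumptions made up for this illustration; they do not correspond to any real virtual machine.

```python
import struct

# Hypothetical opcodes for illustration only; not taken from any real VM.
PUSH = 0x01   # followed by a 4-byte little-endian signed integer operand
ADD = 0x02    # no operand
HALT = 0xFF   # no operand

# Number of operand bytes that follow each opcode.
OPERAND_SIZE = {PUSH: 4, ADD: 0, HALT: 0}
NAMES = {PUSH: "PUSH", ADD: "ADD", HALT: "HALT"}


def assemble(program):
    """Encode (opcode, operand-or-None) pairs as a flat byte string."""
    out = bytearray()
    for opcode, operand in program:
        out.append(opcode)
        if OPERAND_SIZE[opcode]:
            out += struct.pack("<i", operand)
    return bytes(out)


def disassemble(code):
    """Decode a byte string back into (offset, name, operand) tuples."""
    result = []
    pc = 0
    while pc < len(code):
        opcode = code[pc]
        size = OPERAND_SIZE[opcode]
        operand = None
        if size:
            (operand,) = struct.unpack_from("<i", code, pc + 1)
        result.append((pc, NAMES[opcode], operand))
        pc += 1 + size
    return result


code = assemble([(PUSH, 2), (PUSH, 40), (ADD, None), (HALT, None)])
print(code.hex())
for row in disassemble(code):
    print(row)
```

Because every instruction starts with a single byte that determines how many operand bytes follow, a decoder can step through the stream without any separate index of instruction boundaries.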

Since bytecode instructions are processed by software, they may be arbitrarily complex, but are nonetheless often akin to traditional hardware instructions: virtual stack machines are the most common, but virtual register machines have also been built.^[2]^[3] Different parts may often be stored in separate files, similar to object modules, but dynamically loaded during execution.
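The difference between the two machine styles can be sketched by evaluating (2 + 3) * 4 on each. Both instruction formats below are hypothetical and use readable tuples rather than packed bytes; the function names `run_stack` and `run_register` are likewise assumptions for this example.

```python
# Hypothetical instruction formats for illustration; not any real VM's encoding.

def run_stack(program):
    """Stack machine: operands are implicit, popped from and pushed to a stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(args[0])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return stack.pop()


def run_register(program, num_registers=4):
    """Register machine: each instruction names its destination and sources."""
    regs = [0] * num_registers
    for op, dst, *src in program:
        if op == "load":
            regs[dst] = src[0]                      # src[0] is a constant
        elif op == "add":
            regs[dst] = regs[src[0]] + regs[src[1]]
        elif op == "mul":
            regs[dst] = regs[src[0]] * regs[src[1]]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs[0]


# (2 + 3) * 4 expressed in both styles.
stack_program = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
register_program = [
    ("load", 0, 2),
    ("load", 1, 3),
    ("add", 0, 0, 1),
    ("load", 1, 4),
    ("mul", 0, 0, 1),
]

print(run_stack(stack_program))
print(run_register(register_program))
```

Stack instructions stay short because their operands are implied, while register instructions carry explicit operand fields.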

Execution


A bytecode program may be executed by parsing and directly executing the instructions, one at a time. This kind of bytecode interpreter is very portable. Some systems, called dynamic translators or just-in-time (JIT) compilers, translate bytecode into machine code as necessary at runtime. This makes the virtual machine hardware-specific but does not lose the portability of the bytecode. For example, Java and Smalltalk code is typically stored in bytecode format, which is then typically JIT-compiled to machine code before execution. This introduces a delay before a program is run, while the bytecode is compiled to native machine code, but improves execution speed considerably compared to interpreting source code directly, normally by around an order of magnitude (10x).^[4]
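A minimal sketch of such an interpreter is a loop that fetches one opcode, decodes it, executes it, and moves on. The five opcodes below and their single-byte operands are invented for this example and are not those of Java, Smalltalk, or any other real system.

```python
# Hypothetical opcodes; PUSH and JNZ are each followed by one operand byte.
PUSH, SUB, DUP, JNZ, HALT = range(5)


def run(code):
    """Execute the byte string `code` one instruction at a time."""
    stack = []
    pc = 0  # program counter: index of the next byte to fetch
    while True:
        opcode = code[pc]
        pc += 1
        if opcode == PUSH:
            stack.append(code[pc])
            pc += 1
        elif opcode == SUB:
            b, a = stack.pop(), stack.pop()
            stack.append(a - b)
        elif opcode == DUP:
            stack.append(stack[-1])
        elif opcode == JNZ:
            target = code[pc]
            pc += 1
            if stack.pop() != 0:
                pc = target  # jump if the popped value is non-zero
        elif opcode == HALT:
            return stack
        else:
            raise ValueError(f"unknown opcode {opcode} at offset {pc - 1}")


# Count down from 3 to 0: the loop body starts at offset 2.
countdown = bytes([
    PUSH, 3,   # offset 0
    PUSH, 1,   # offset 2
    SUB,       # offset 4
    DUP,       # offset 5
    JNZ, 2,    # offset 6: jump back to offset 2 while the counter is non-zero
    HALT,      # offset 8
])

print(run(countdown))
```

The loop contains nothing specific to the host hardware, which is what makes this kind of interpreter portable. A JIT compiler would instead translate the same byte sequence into native instructions.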

Because of its performance advantage, many language implementations today execute a program in two phases, first compiling the source code into bytecode and then passing the bytecode to the virtual machine. There are bytecode-based virtual machines of this sort for Java, Raku, Python, PHP,^[a] Tcl, mawk and Forth (however, Forth is seldom compiled via bytecodes in this way, and its virtual machine is more generic instead). The implementations of Perl and Ruby 1.8 instead work by walking an abstract syntax tree representation derived from the source code.
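In CPython the two phases can be observed separately with the built-in `compile()` and `exec()` functions, and the intermediate bytecode can be listed with `dis.get_instructions()`. The snippet below is a small sketch assuming a CPython interpreter; the source string and the `<example>` filename are arbitrary choices for the illustration.

```python
import dis

source = "x = 6\nresult = x * 7"

# Phase 1: compile the source text into a code object that holds bytecode.
code_obj = compile(source, "<example>", "exec")
opnames = [ins.opname for ins in dis.get_instructions(code_obj)]
print(opnames)  # instruction names differ between Python versions

# Phase 2: hand the compiled code object to the virtual machine.
namespace = {}
exec(code_obj, namespace)
print(namespace["result"])
```

Separating the phases also means the compiled code object can be executed repeatedly without parsing the source again.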

The authors of V8^[1] and Dart^[7] challenged the notion that an intermediate bytecode is needed for a fast and efficient VM implementation. When those arguments were made, both implementations JIT-compiled source code directly to machine code with no bytecode intermediary;^[8] V8 has since added a bytecode interpreter, Ignition.

Examples


ActionScript executes in the ActionScript Virtual Machine (AVM), which is part of Flash Player and AIR. ActionScript code is typically transformed into bytecode format by a compiler. Examples of compilers include one built into Adobe Flash Professional and one built into Adobe Flash Builder and available in the Adobe Flex SDK.
Adobe Flash objects
BANCStar, originally bytecode for an interface-building tool but used also as a language
Berkeley Packet Filter
eBPF
Berkeley Pascal^[9]
Byte Code Engineering Library
C to Java virtual machine compilers
CLISP implementation of Common Lisp used to compile only to bytecode for many years; however, it now also supports compiling to native code with the help of GNU lightning
CMUCL and Scieneer Common Lisp implementations of Common Lisp can compile either to native code or to bytecode, which is far more compact
Common Intermediate Language executed by Common Language Runtime, used by .NET languages such as C#
Dalvik bytecode, designed for the Android platform, is executed by the Dalvik virtual machine
Dis bytecode, designed for the Inferno operating system, is executed by the Dis virtual machine
EiffelStudio for the Eiffel programming language
EM, the Amsterdam Compiler Kit virtual machine used as an intermediate compiling language and as a modern bytecode language
Emacs is a text editor with most of its functions implemented by Emacs Lisp, its built-in dialect of Lisp. These features are compiled into bytecode. This architecture allows users to customize the editor with a high-level language, which after compiling into bytecode yields reasonable performance.
Embeddable Common Lisp implementation of Common Lisp can compile to bytecode or C code
Common Lisp provides a disassemble function^[10] which prints to the standard output the underlying code of a specified function. The result is implementation-dependent and may or may not resolve to bytecode. Its inspection can be used for debugging and optimization purposes.^[11] Steel Bank Common Lisp, for instance, produces native x86 machine code rather than bytecode:

(disassemble '(lambda (x) (print x)))
; disassembly for (LAMBDA (X))
; 2436F6DF:       850500000F22     TEST EAX, [#x220F0000]     ; no-arg-parsing entry point
;       E5:       8BD6             MOV EDX, ESI
;       E7:       8B05A8F63624     MOV EAX, [#x2436F6A8]      ; #<FDEFINITION object for PRINT>
;       ED:       B904000000       MOV ECX, 4
;       F2:       FF7504           PUSH DWORD PTR [EBP+4]
;       F5:       FF6005           JMP DWORD PTR [EAX+5]
;       F8:       CC0A             BREAK 10                   ; error trap
;       FA:       02               BYTE #X02
;       FB:       18               BYTE #X18                  ; INVALID-ARG-COUNT-ERROR
;       FC:       4F               BYTE #X4F                  ; ECX

Ericsson implementation of Erlang uses BEAM bytecodes
Ethereum's Virtual Machine (EVM) is the runtime environment, using its own bytecode, for transaction execution in Ethereum (smart contracts).
Icon^[12] and Unicon^[13] programming languages
Infocom used the Z-machine to make its software applications more portable
Java bytecode, which is executed by the Java virtual machine
- ASM
- BCEL
- Javassist
Keiko bytecode used by the Oberon-2 programming language to make it and the Oberon operating system more portable.
KEYB, the MS-DOS/PC DOS keyboard driver with its resource file KEYBOARD.SYS containing layout information and short p-code sequences executed by an interpreter inside the resident driver.^[14]^[15]
LLVM IR
LSL, a scripting language used in virtual worlds, compiles into bytecode running on a virtual machine. Second Life has the original Mono version; Inworldz developed the Phlox version.
Lua language uses a register-based bytecode virtual machine
m-code of the MATLAB language^[16]
Malbolge is an esoteric machine language for a ternary virtual machine.
Microsoft P-code used in Visual C++ and Visual Basic
Multiplan^[17]
O-code of the BCPL programming language
OCaml language optionally compiles to a compact bytecode form
p-code of UCSD Pascal implementation of the Pascal language
Parrot virtual machine
Pick BASIC, also referred to as Data BASIC or MultiValue BASIC
The R environment for statistical computing offers a bytecode compiler through the compiler package, standard since R version 2.13.0. It is possible to compile this version of R so that the base and recommended packages exploit this.^[18]
Pyramid 2000 adventure game
Python source code is compiled to Python bytecode when it is executed, and the compiled form of imported modules is cached in .pyc files (in CPython 3, inside a __pycache__ directory next to the source file)

Compiled code can be inspected with the standard library's dis module, a disassembler for Python bytecode. It can be used from the interactive interpreter, for example (the exact instructions shown vary between Python versions):

>>> import dis  # "dis" - Disassembler of Python byte code into mnemonics.
>>> dis.dis('print("Hello, World!")')
  1           0 LOAD_NAME                0 (print)
              2 LOAD_CONST               0 ('Hello, World!')
              4 CALL_FUNCTION            1
              6 RETURN_VALUE

Scheme 48 implementation of Scheme using a bytecode interpreter
Bytecodes of many implementations of the Smalltalk language
The Spin interpreter built into the Parallax Propeller microcontroller
The SQLite database engine translates SQL statements into a bespoke byte-code format.^[19]
Apple SWEET16
Tcl
TIMI is used by compilers on the IBM i platform.
Tiny BASIC
Visual FoxPro compiles to bytecode
WebAssembly
YARV and Rubinius for Ruby
ZCODE
Zend Engine opcodes for PHP
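As an illustration of the SQLite entry above, the bytecode program that SQLite generates for a statement can be listed by prefixing the statement with `EXPLAIN`. The sketch below uses Python's standard `sqlite3` module; the table `t` and its columns are made up for the example, and the listing is a debugging aid whose opcodes may differ between SQLite versions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table used only for this illustration.
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, name TEXT)")

# EXPLAIN returns one row per instruction of the statement's bytecode program.
cursor = conn.execute("EXPLAIN SELECT name FROM t WHERE id = 1")
columns = [d[0] for d in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    addr, opcode = row[0], row[1]
    print(addr, opcode)

conn.close()
```

Each row gives an instruction address, an opcode name, and its operands, which is the same shape as the disassembly listings shown for other systems in this list.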

Notes


[a] PHP has just-in-time compilation as of PHP 8;^[5]^[6] before that, JIT compilation was not part of the default implementation but was available through alternatives such as HHVM. In older versions of PHP, opcodes are generated each time the program is launched and are always interpreted, not just-in-time compiled.

References


[1] "Dynamic Machine Code Generation". Google Inc. Archived from the original on 2017-03-05. Retrieved 2024-12-01.
[2] "The Implementation of Lua 5.0". (NB. This involves a register-based virtual machine.)
[3] "Dalvik VM". Archived from the original on 2013-05-18. Retrieved 2012-10-29. (NB. This VM is register based.)
[4] "Byte Code Vs Machine Code". www.allaboutcomputing.net. Retrieved 2017-10-23.
[5] O’Phinney, Matthew Weier. "Exploring the New PHP JIT Compiler". Zend by Perforce. Retrieved 2021-02-19.
[6] "PHP 8: The JIT - stitcher.io". stitcher.io. Retrieved 2021-02-19.
[7] Loitsch, Florian. "Why Not a Bytecode VM?". Google. Archived from the original on 2013-05-12.
[8] "JavaScript myth: JavaScript needs a standard bytecode". 2ality.com.
[9] G., Adam Y. (2022-07-11). "Berkeley Pascal". GitHub. Retrieved 2022-01-08.
[10] "CLHS: Function DISASSEMBLE". www.lispworks.com.
[11] Collective (2023-12-13). "The Common Lisp Cookbook – Performance Tuning and Tips". lispcookbook.github.io.
[12] "The Implementation of the Icon Programming Language" (PDF). Archived from the original (PDF) on 2016-03-05. Retrieved 2011-09-09.
[13] "The Implementation of Icon and Unicon a Compendium" (PDF). Archived (PDF) from the original on 2022-10-09.
[14] Paul, Matthias R. (2001-12-30). "KEYBOARD.SYS internal structure". Newsgroup: comp.os.msdos.programmer. Archived from the original on 2017-09-09. Retrieved 2016-09-17. […] In fact, the format is basically the same in MS-DOS 3.3 - 8.0, PC DOS 3.3 - 2000, including Russian, Lithuanian, Chinese and Japanese issues, as well as in Windows NT, 2000, and XP […]. There are minor differences and incompatibilities, but the general format has not changed over the years. […] Some of the data entries contain normal tables […] However, most entries contain executable code interpreted by some kind of p-code interpreter at *runtime*, including conditional branches and the like. This is why the KEYB driver has such a huge memory footprint compared to table-driven keyboard drivers which can be done in 3 - 4 Kb getting the same level of function except for the interpreter. […]
[15] Mendelson, Edward (2001-07-20). "How to Display the Euro in MS-DOS and Windows DOS". Display the euro symbol in full-screen MS-DOS (including Windows 95 or Windows 98 full-screen DOS). Archived from the original on 2016-09-17. Retrieved 2016-09-17. […] Matthias [R.] Paul […] warns that the IBM PC DOS version of the keyboard driver uses some internal procedures that are not recognized by the Microsoft driver, so, if possible, you should use the IBM versions of both KEYB.COM and KEYBOARD.SYS instead of mixing Microsoft and IBM versions […] (NB. What is meant by "procedures" here are some additional bytecodes in the IBM KEYBOARD.SYS file not supported by the Microsoft version of the KEYB driver.)
[16] "United States Patent 6,973,644". Archived from the original on 2017-03-05. Retrieved 2009-05-21.
[17] Microsoft C Pcode Specifications. p. 13. Multiplan wasn't compiled to machine code, but to a kind of byte-code which was run by an interpreter, in order to make Multiplan portable across the widely varying hardware of the time. This byte-code distinguished between the machine-specific floating point format to calculate on, and an external (standard) format, which was binary coded decimal (BCD). The PACK and UNPACK instructions converted between the two.
[18] "R Installation and Administration". cran.r-project.org.
[19] "The SQLite Bytecode Engine". Archived from the original on 2017-04-14. Retrieved 2016-08-29.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Bytecode&oldid=1294734401"
