ImportC is a C compiler embedded into the D implementation. It enables direct importation of C files, without needing to manually prepare a D file corresponding to the declarations in the C file. It directly compiles C files into modules that can be linked in with D code to form an executable. It can be used as a C compiler to compile and link 100% C programs.
C code in file hello.c:

    #include <stdio.h>
    int main()
    {
        printf("hello world\n");
        return 0;
    }

Compile and run:

    dmd hello.c
    ./hello
    hello world
C function in file functions.c:

    int square(int i)
    {
        return i * i;
    }

D program in file demo.d:

    import std.stdio;
    import functions;
    void main()
    {
        int i = 7;
        writefln("The square of %s is %s", i, square(i));
    }

Compile and run:

    dmd demo.d functions.c
    ./demo
    The square of 7 is 49
There are many versions of C. ImportC is an implementation of ISO/IEC 9899:2011, which will be referred to as C11. References to the C11 Standard will be C11 followed by the paragraph number. Prior versions, such as C99, C89, and K+R C, are not supported.
Further adjustment is made to take advantage of some of the D implementation's capabilities.
The ImportC compiler can be invoked:
ImportC files have one of the extensions .i, .h, or .c.

    dmd hello.c

will compile hello.c with ImportC and link it to create the executable file hello (hello.exe on Windows).

Use the D ImportDeclaration:

    import hello;

which will, if hello is not a D file, and has an extension .i, .h, or .c, compile hello with ImportC.
A C file that is imported through a package and gets compiled needs a module declaration that includes that package. This is done by labeling it as an ImportC module with the __module keyword, guarded by the __IMPORTC__ macro.

C file in project/net/hi.c:

    #if __IMPORTC__
    __module net.hi
    #endif
    int sqrt(int x) { return x * x; }

D file in project/hello.d:

    import net.hi; // C file you need in your D source
    void main()
    {
        assert(sqrt(3) == 9);
    }

Compile from the project directory:

    dmd hello.d net/hi.c
ImportC does not have a preprocessor. It is designed to compile C files after they have been first run through the C preprocessor. ImportC can automatically run the C preprocessor associated with the Associated C Compiler, or a preprocessor can be run manually.
If the C file has a .c extension, ImportC will run the preprocessor for it automatically.
The -v switch can be used to observe the command that invokes the preprocessor.
The -Ppreprocessorflag switch passes preprocessorflag to the preprocessor.
The druntime file src/importc.h will automatically be #included first. importc.h provides adjustments to the source code to account for various C compiler extensions not supported by ImportC.
On Posix systems, ImportC will pass the switch -Wno-builtin-macro-redefined to the C preprocessor used by gcc and clang. This switch does not exist in gcc preprocessors made before 2008. The workaround is to run the preprocessor manually.
If the C file has a .i extension, the file is presumed to be already preprocessed. Preprocessing can be run manually:

sppn.exe runs on Win32 and is invoked as:

    sppn file.c

and the preprocessed output is written to file.i.

The Gnu C Preprocessor can be invoked as:

    gcc -E file.c > file.i

The Clang Preprocessor can be invoked as:

    clang -E file.c -o file.i

The VC Preprocessor can be invoked as:

    cl /P /Zc:preprocessor file.c -Fifile.i

and the preprocessed output is written to file.i.

The dmpp C Preprocessor can be invoked as:

    dmpp file.c

and the preprocessed output is written to file.i.
ImportC collects all the #define macros from the preprocessor run when it is run automatically. Some can be made available to D code by interpreting them as declarations. The variety of macros that can be interpreted as D declarations may be expanded, but will never encompass all the metaprogramming uses of C macros.
Macros that look like manifest constants, such as:

    #define COLOR 0x123456
    #define HELLO "hello"

are interpreted as D manifest constant declarations of the form:

    enum COLOR = 0x123456;
    enum HELLO = "hello";
Many macros look like functions, and can be treated as template functions:

    #define ABC a + b
    #define DEF(a) (a + x)

are treated as:

    auto ABC() { return a + b; }
    auto DEF(T)(T a) { return a + x; }
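Some macro formulations, however, will not produce the same result. The C macro:

    #define ADD(a, b) a + b
    int x = ADD(1, 2) * 4; // sets x to 9

differs from the D template function:

    auto ADD(U, V)(U a, V b) { return a + b; }
    int x = ADD(1, 2) * 4; // sets x to 12

Parenthesizing the macro body and its arguments makes the C result 12 as well:

    #define ADD(a, b) ((a) + (b))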
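The precedence difference described above can be reproduced in plain C by comparing a bare macro, a parenthesized macro, and an ordinary function. This is a minimal sketch: the names ADD_BARE, ADD_PAREN, add_fn and the three wrapper functions are hypothetical and are not part of ImportC.

```c
/* Bare macro body: the expansion is pasted into the surrounding expression. */
#define ADD_BARE(a, b) a + b
/* Parenthesized body and arguments: the expansion acts as a single operand. */
#define ADD_PAREN(a, b) ((a) + (b))

/* Function form, comparable to the template function seen from D. */
static int add_fn(int a, int b) { return a + b; }

int bare_times_four(void)  { return ADD_BARE(1, 2) * 4; }   /* 1 + 2 * 4   */
int paren_times_four(void) { return ADD_PAREN(1, 2) * 4; }  /* (1 + 2) * 4 */
int fn_times_four(void)    { return add_fn(1, 2) * 4; }     /* 3 * 4       */
```

Only the bare macro yields 9; the parenthesized macro and the function both yield 12, which is why parenthesized macros translate more faithfully.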
Another area of trouble is side effects in the arguments:

    #define DOUBLE(x) ((x) + (x))

    int i = 0;
    DOUBLE(i++);
    assert(i == 2); // D result will be 1, C result will be 2

and treating arguments as references:

    #define INC(x) (++x)

    int i = 0;
    INC(i);
    assert(i == 1); // D result will be 0, C result will be 1
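The reference-versus-value difference can be seen in C alone by placing the macro next to a function with the same body. This is an illustrative sketch; inc_fn and the two wrapper functions are hypothetical names introduced here.

```c
/* Macro: the argument text is substituted, so the caller's variable changes. */
#define INC(x) (++x)

/* Function: the argument is copied, so only the local copy changes. */
static int inc_fn(int x) { return ++x; }

int value_after_macro(void)
{
    int i = 0;
    INC(i);           /* expands to (++i) */
    return i;
}

int value_after_function(void)
{
    int i = 0;
    (void)inc_fn(i);  /* increments a copy of i */
    return i;
}
```

The function form behaves like the by-value template function that D code sees, so macros that rely on modifying their arguments are poor candidates for this translation.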
ImportC does not predefine any macros.
To distinguish an ImportC compile vs some other C compiler, use:

    #if __IMPORTC__

__IMPORTC__ is defined in src/importc.h which is automatically included when the preprocessor is run. importc.h contains many macro definitions that are used to adapt various C source code vagaries to ImportC.
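A small sketch of how the guard might be used in a source file shared between ImportC and another C compiler. The function name compiled_by_importc is hypothetical. The sketch relies on the standard preprocessor rule that an undefined identifier in #if evaluates to 0.

```c
/* Returns 1 when compiled by ImportC (where importc.h defines __IMPORTC__),
   and 0 with a compiler that leaves __IMPORTC__ undefined. */
int compiled_by_importc(void)
{
#if __IMPORTC__
    return 1;
#else
    return 0;
#endif
}
```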
ImportC supports these preprocessor directives:
C11 6.10.4
linemarker directives are normally embedded in the output of C preprocessors.
The following pragmas are supported:
    #pragma attribute(push, [storage classes...])

The storage classes nothrow, nogc and pure are supported. Unrecognized attributes are ignored. Enabling a default storage class affects all function declarations after the pragma until it is disabled with another pragma. Declarations in includes are also affected. The following example enables @nogc and nothrow for a library:

    #pragma attribute(push, nogc, nothrow)
    #include <somelibrary.h>

The changed storage classes are pushed on a stack. The last change can be undone with the following pragma:

    #pragma attribute(pop)

This can also disable multiple default storage classes at the same time, if they were enabled with a single #pragma attribute(push, ...) directive.
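The push and pop pragmas can also bracket definitions in a single file. The sketch below is an assumption-laden illustration: the functions clamp_to_byte and twice are hypothetical, and wrapping the pragmas in #if __IMPORTC__ is a choice made here so that other compilers do not warn about an unknown pragma.

```c
#if __IMPORTC__
#pragma attribute(push, nogc, nothrow)  /* defaults for what follows */
#endif

/* Declared while the defaults are pushed. */
int clamp_to_byte(int v)
{
    if (v < 0)   return 0;
    if (v > 255) return 255;
    return v;
}

#if __IMPORTC__
#pragma attribute(pop)                  /* undo the push above */
#endif

/* Declared after the pop, so no default storage classes apply. */
int twice(int v) { return 2 * v; }
```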
The first thing the compiler does when preprocessing is complete is to import src/__builtins.di. It provides support for various builtins provided by other C compilers. __builtins.di is a D file.
The implementation defined characteristics of ImportC are:
enumeration-constants are always typed as integers.
The expression that defines the value of an enumeration-constant must be an integer. If an underlying int is specified in the enumeration, the defining expression must evaluate to a value that fits in an int. If no underlying int is specified, the compiler chooses a large enough integer type to accommodate the size of the expression.

    enum E { a = -10, b = 0x81231234 }; // ok
    enum F { c = 0x812312345678 };      // ok, c expands to 8 bytes to accommodate expression
    enum G : int { c = 0x80000000 };    // error, doesn't fit in int
    enum G { d = 1.0 };                 // error, not integral type
There are many implementation defined aspects of C11 bit fields. ImportC's behavior adjusts to match the behavior of the associated C compiler on the target platform.
Implicit function declarations:

    int main()
    {
        func(); // implicit declaration of func()
    }

were allowed in K+R C and C89, but were invalidated in C99 and C11. Although many C compilers still support them, ImportC does not.
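Code that relies on implicit declarations can usually be adapted by declaring the function before it is called, typically through a header. A minimal sketch, with hypothetical names func and call_func:

```c
/* Explicit declaration: makes the call below valid in C99 and C11. */
int func(void);

int call_func(void)
{
    return func();  /* resolved through the declaration above */
}

/* The definition may live in this file or in another translation unit. */
int func(void)
{
    return 42;
}
```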
This is described in C11 7.6.1.

    #pragma STDC FENV_ACCESS on-off-switch

    on-off-switch:
        ON
        OFF
        DEFAULT

It is completely ignored.
ImportC is assumed to never throw exceptions. setjmp and longjmp are not supported.
C11 specifies that const only applies locally. const in ImportC applies transitively, meaning that although:

    int *const p;

means in C11 that p is a const pointer to int, in ImportC it means p is a const pointer to a const int.
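The sketch below shows which kind of code depends on the C11 reading of const. Both functions are hypothetical illustrations. The text above does not state how ImportC handles the write in the first function, so the comments describe only the C11 behavior.

```c
/* C11 reading: p itself is const, but the int it points to is mutable. */
int write_through_const_pointer(void)
{
    int value = 1;
    int *const p = &value;
    *p = 5;                       /* valid C11: modifies value */
    return value;
}

/* Independent of either reading: the pointee is never written. */
int read_through_const_pointer(void)
{
    int value = 1;
    const int *const p = &value;  /* const pointer to const int */
    return *p;
}
```

Code in the style of the second function means the same thing whether const is local or transitive.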
The volatile type-qualifier (C11 6.7.3) is ignored. Use of volatile to implement shared memory access is unlikely to work anyway; _Atomic is for that. To access a device register that needs volatile, call a separately compiled function that does it, or use inline assembler.
The restrict type-qualifier (C11 6.7.3) is ignored.
The _Atomic type-qualifier (C11 6.7.3) is ignored. To do atomic operations, use an externally compiled function for that, or the inline assembler.
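One way to follow the "separately compiled function" advice is to keep register access in its own C file built with the associated C compiler, which honors volatile. The following is a sketch under that assumption; the names reg_read, reg_write, and reg_roundtrip_demo are hypothetical, and an ordinary variable stands in for real hardware.

```c
#include <stdint.h>

/* Read a 32-bit register; volatile keeps the load from being optimized away
   when this file is built with a compiler that honors the qualifier. */
uint32_t reg_read(const volatile uint32_t *addr)
{
    return *addr;
}

/* Write a 32-bit register. */
void reg_write(volatile uint32_t *addr, uint32_t value)
{
    *addr = value;
}

/* Stand-in for real hardware: a local variable plays the register. */
uint32_t reg_roundtrip_demo(uint32_t value)
{
    uint32_t fake_register = 0;
    reg_write(&fake_register, value);
    return reg_read(&fake_register);
}
```

Code compiled by ImportC would then call reg_read and reg_write through ordinary prototypes instead of dereferencing a volatile pointer itself.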
Compatible Types (C11 6.7.2) are identical types in ImportC.
While every effort is made to match up C and D so it "just works", the languages have some fundamental differences that appear now and then.
D and C use mostly the same keywords, but C has keywords that D doesn't have, and vice versa. This does not affect compilation of C code, but it can cause difficulty when accessing C variables and types from D. For example, the D version keyword is not uncommonly used as a struct member in C:

C code in file defs.c:

    struct S { int version; };

Accessing it from D:

    import defs;
    int tech(S* s)
    {
        return s.version; // fails because version is a D keyword
    }

A workaround is available:

    import defs;
    int tech(S* s)
    {
        return __traits(getMember, *s, "version");
    }
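When the C source can be edited or wrapped, another option is a small accessor function on the C side, so that D code never has to spell the member name. This is a sketch of that alternative, not something the text above prescribes; S_version is a hypothetical name.

```c
/* Same struct as in defs.c. */
struct S { int version; };

/* Accessor callable from D as S_version(s). */
int S_version(const struct S *s)
{
    return s->version;
}
```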
On some platforms, C long and unsigned long are the same size as int and unsigned int, respectively. On other platforms, C long and unsigned long are the same size as long long and unsigned long long. long double and long double _Complex can be the same size as double and double _Complex. In ImportC, types that have the same size and signedness are treated as the same type.
Generic selection expressions (C11 6.5.1.1) differ in ImportC. The types described under "Same only Different Types" are indistinguishable in the type-name parts of generic-association. Instead of giving an error for duplicate types per C11 6.5.1.1-2, ImportC will select the first compatible type-name in the generic-assoc-list.
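The following sketch shows a generic selection whose result depends on int and long being distinct types. KIND_OF and the three wrapper functions are hypothetical names. The comments describe the result under a conforming C11 compiler; under ImportC on a platform where long and int have the same size, the text above implies the first matching association would be selected instead, which has not been verified here.

```c
/* Classify an expression by type: 1 for int, 2 for long, 0 otherwise.
   C11 treats int and long as distinct types even when they have the
   same size, so listing both associations is allowed. */
#define KIND_OF(x) _Generic((x), int: 1, long: 2, default: 0)

int kind_of_int(void)    { return KIND_OF(0); }    /* int literal    */
int kind_of_long(void)   { return KIND_OF(0L); }   /* long literal   */
int kind_of_double(void) { return KIND_OF(0.0); }  /* falls to default */
```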
For the D language, asm is a standard keyword, and its construct is shared with ImportC. For the C language, asm is an extension (J.5.10), and the recommendation is to instead use __asm__. All alternative keywords for asm are translated by the druntime file src/importc.h during the preprocessing stage.
The asm keyword may be used to embed assembler instructions; its syntax is implementation defined. The Digital Mars D compiler only supports the dialect of inline assembly as described in the documentation of the D x86 Inline Assembler.
asm(mangleName) in a function or variable declaration may be used to specify the mangle name for a symbol. Its use is analogous to pragma mangle.

    char **myenviron asm("environ") = 0;
    int myprintf(char *, ...) asm("printf");

Using asm to associate registers with variables is ignored.
Any declarations in scope can be accessed, not just declarations that lexically precede a reference.
    Ta *p;                 // Ta is forward referenced
    struct Sa { int x; };
    typedef struct Sa Ta;  // Ta is defined

    struct S s;
    int* p = &s.t.x;                  // struct S definition is forward referenced
    struct S { int a; struct T t; };  // T still forward referenced
    struct T { int b; int x; };       // definition of struct T

In C++, struct, union or enum tag symbols can be accessed without needing to be prefixed with the struct, union or enum keywords, as long as there is no other declaration with the same name at the same scope. ImportC behaves the same way.
For example, the following code is accepted by both C++ and ImportC:

    struct s { int a; };

    void g(int s)
    {
        struct s* p = (struct s*)malloc(sizeof(struct s)); // unambiguous
        p->a = s;
    }

Whereas this is rejected by both C++ and ImportC, for the same reason.

    struct s { int a; };

    void g(int s)
    {
        s* p = (s*)malloc(sizeof(s)); // error: parser matches s to int s
        p->a = s;
    }

Evaluating constant expressions includes executing functions in the same manner as D's CTFE can. A constant-expression invokes compile-time evaluation.
Examples:

    _Static_assert("\x1"[0] == 1, "indexing should be 1");

    int mint1() { return -1; }
    _Static_assert(mint1() == -1, "call should be -1");

    const int a = 7;
    int b = a; // sets b to 7

Functions for which the function body is present can be inlined by ImportC as well as by the D code that calls them.
Enums are extended with an optional EnumBaseType:

    EnumDeclaration:
        enum Identifier : EnumBaseType EnumBody

    EnumBaseType:
        Type

which, when supplied, causes the enum members to be implicitly cast to the EnumBaseType.

    enum S : byte { A };
    _Static_assert(sizeof(A) == 1, "A should be size 1");

Objects with register storage class are treated as auto declarations.
Objects with register storage class may have their address taken. C11 6.3.2.1-2
Arrays can have register storage class, and may be enregistered by the compiler. C11 6.3.2.1-3
The typeof operator may be used as a type specifier:

    type-specifier:
        typeof-specifier

    typeof-specifier:
        typeof ( expression )
        typeof ( type-name )

Modules can be imported with a CImportDeclaration:

    CImportDeclaration:
        __import ImportList ;
Imports enable ImportC code to directly access D declarations and functions without the necessity of creating a .h file representing those declarations. The tedium and brittleness of keeping the .h file up-to-date with the D declarations is eliminated. D functions are available to be inlined.
Imports also enable ImportC code to directly import other C files without needing to create a .h file for them, either. Imported C functions become available to be inlined.
The ImportList works the same as it does for D.
The ordering of CImportDeclarations has no significance.
An ImportC file can be imported; the name of the C file to be imported is derived from the module name.
All the global symbols in the ImportC file become available to the importing module.
If a name referred to in the importing file is not found there, the global symbols in each imported file are searched for the name. If it is found in exactly one module, that becomes the resolution of the name. If it is found in multiple modules, it is an error.
Preprocessor symbols in the imported module are not available to the importing module, and preprocessing symbols in the importing file are not available to the imported module.
A D module can be imported, in the same manner as that of an ImportDeclaration.
Imports can be circular.
    __import core.stdc.stdarg; // get D declaration of va_list
    __import mycode;           // import mycode.c

    int foo()
    {
        va_list x;    // picks up va_list from core.stdc.stdarg
        return 1 + A; // returns 4
    }

mycode.c looks like:

    enum E { A = 3 };

A control-Z character \x1A in the source text means End Of File.
A signed integer constant with no suffix that is larger than a long long type, but will fit into an unsigned long long type, is accepted and typed as unsigned long long. This matches D behavior, and that of some C compilers.
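Source that must also build with compilers lacking this extension can write the suffix explicitly. The sketch below assumes a 64-bit long long, so that LLONG_MAX is 9223372036854775807; the function names are hypothetical.

```c
#include <limits.h>

/* LLONG_MAX + 1 written with an explicit suffix, so it is typed as
   unsigned long long without relying on the extension. */
unsigned long long just_past_llong_max(void)
{
    return 9223372036854775808ULL;
}

/* The same value computed from LLONG_MAX, for comparison. */
unsigned long long llong_max_plus_one(void)
{
    return (unsigned long long)LLONG_MAX + 1ULL;
}
```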
The . operator is used to designate a member of a struct or union value. The -> operator is used to designate a member of a struct or union value pointed to by a pointer. The extension is that . and -> can be used interchangeably on values and pointers. This matches D's behavior for the . operator.
gcc and clang are presumed to have the same behavior w.r.t. extensions, so gcc as used here refers to both.
The following __attribute__ extensions:
__attribute__((noreturn)) marks a function as never returning. gcc sets this as an attribute of the function; it is not part of the function's type. In D, a function that never returns has the return type noreturn. The difference can be seen with the code:

    __attribute__((noreturn)) int foo();
    size_t x = sizeof(foo());

This code is accepted by gcc, but makes no sense for D. Hence, although it works in ImportC, it is not representable as D code, meaning one must use judgement in creating a .di file to interface with C noreturn functions.
Furthermore, the D compiler takes advantage of noreturn functions by issuing compile time errors for unreachable code. Such unreachable code, however, is valid C11, and the ImportC compiler will accept it.
All the Digital Mars C Extensions.
__stdcall sets the calling convention for a function to the Windows API calling convention.
int __stdcall foo(int x);
The following __declspec extensions:
The following __pragma extensions:
The following __declspec extensions:
There is no one-to-one mapping of C constructs to D constructs, although it is very close. What follows is a description of how the D side views the C declarations that are imported.
The module name assigned to the ImportC file is the filename stripped of its path and extension. This is just like the default module name assigned to a D module that does not have a module declaration.
All C symbols are extern (C).
The C enum:

    enum E { A, B = 2 };

appears to D code as:

    enum E : int { A, B = 2 }
    alias A = E.A;
    alias B = E.B;

The .min and .max properties are available:

    static assert(E.min == 0 && E.max == 2);
Tag symbols are the identifiers that appear after the struct, union, and enum keywords (C11 6.7.2.3). In C, they are placed in a different symbol table from other identifiers. This means two different symbols can use the same name:

    int S;
    struct S { int a, b; };
    S = 3;
    struct S *ps;

D does not make this distinction. Given a tag symbol that is the only declaration of an identifier, that's what the D compiler recognizes. Given a tag symbol and a non-tag symbol that share an identifier, the D compiler recognizes the non-tag symbol. This is normally not a problem due to the common C practice of applying typedef, as in:

    typedef struct S { int a, b; } S;

The D compiler recognizes the typedef applied to S, and the code compiles as expected. But when typedef is absent, as in:

    int S;
    struct S { int a, b; };

the most pragmatic workaround is to add a typedef to the C code:

    int S;
    struct S { int a, b; };
    typedef struct S S_t; // add this typedef

Then the D compiler can access the struct tag symbol via S_t.
Many difficulties with adapting C code to ImportC can be resolved without editing the C code itself. Wrap the C code in another C file that #includes it. Consider the following problematic C file file.c:

    void func(int *__restrict p);
    int S;
    struct S { int a, b; };

The problems are that __restrict is not a type qualifier recognized by ImportC (or C11), and the struct S is hidden from D by the declaration int S;. To wrap file.c with a fix, create the file file_ic.c with the contents:

    #define __restrict restrict
    #include "file.c"
    typedef struct S S_t;

Then, import file_ic; instead of import file;, and use S_t when struct S is desired.
Sometimes it's desirable to go further than importing C code, to actually do a C source to D source conversion. Reasons include:
This can be done with the D compiler by using the -Hf switch:

    dmd -c mycode.c -Hf=mycode.di

which will convert the C source code in mycode.c to D source code in mycode.di. If the -inline switch is also used, it will emit the C function bodies as well, instead of just the function prototypes.
A precise mapping of C semantics, with all its oddities, to D source code is not always practical. ImportC uses C semantics in its semantic analysis to get much closer to exact C semantics than is expressible in D source code. Hence, the translation to D source code will be less than perfect. For example:
    int S;
    struct S { int a, b; };

    int foo(struct S s)
    {
        return S + s.a;
    }

will work fine in ImportC, because the int S and the struct S are in different symbol tables. But in the generated D code, both symbols would be in the same symbol table, and will collide. Such D source code translated from C will need to be adjusted by the user.
Nevertheless, reports from the field are that this conversion capability is a huge timesaver for users who need to deal with existing C code.
Many suspicious C constructs normally cause warnings to be emitted by default by typical compilers, such as:
int *p = 3; // Warning: integer implicitly converted to pointer
ImportC does not emit warnings. The presumption is the user will be importing existing C code developed using another C compiler, and it is written as intended. If C11 says it is legal, ImportC accepts it.
ImportC will not compile C++ code. For that, use dpp.
From the Article:
dpp is a compiler wrapper that will parse a D source file with the .dpp extension and expand in place any #include directives it encounters, translating all of the C or C++ symbols to D, and then pass the result to a D compiler (DMD by default).
Like DStep, dpp relies on libclang.
From the Article:
DStep is a tool for automatically generating D bindings for C and Objective-C libraries. This is implemented by processing C or Objective-C header files and outputting D modules. DStep uses the Clang compiler as a library (libclang) to process the header files.
htod converts a C .h file to a D source file, suitable for importing into D code. htod is built from the front end of the Digital Mars C and C++ compiler. It works just like a C or C++ compiler except that its output is source code for a D module rather than object code.
ImportC's implementation is based on the idea that D's semantics are very similar to C's. ImportC gets its own parser, which converts the C syntax into the same AST (Abstract Syntax Tree) that D uses. The lexer for ImportC is the same as for D, but with some modifications here and there, such as the keywords and integer literals being different. Where the semantics of C differ from D, there are adjustments in the semantic analysis code in the D compiler.
This co-opting of the D semantic implementation allows ImportC to be able to do things like handle forward references, CTFE (Compile Time Function Execution), and inlining of C functions into D code. Being able to handle forward references means it is not necessary to even write a.h file to be able to import C declarations into D. Being able to perform CTFE is very handy for testing that ImportC is working without needing to generate an executable. But, in general, the strong temptation to add D features to ImportC has been resisted.
The optimizer and code generator are, of course, the same as D uses.