This is a living document, describing the conceptual data model behind Wikibase. It is not a specification of any concrete binding, implementation, mapping, or serialization.
The data model of Wikibase describes the structure of the data that is handled in Wikibase. In particular, it specifies which kind of information users can contribute to the system. On a more abstract level, the Wikibase data model provides a metamodel or ontology for describing real world entities. Such descriptions are concrete models for real world entities.
This document describes a conceptual model ("Which information do we have to support?") and does not specify how this data should be represented technically ("Which data structures should the software use?") or syntactically ("How should the data be expressed in a file?"). Separate documents describe the serialization of the Wikibase data model in JSON and in RDF (Resource Description Framework).
This specification is technical. A primer to the data model is also available that is more accessible (however, it is more ambiguous and less complete).
This document is extended by other documents to describe the data model of Wikibase extensions. The current extensions are:
Editorial Note: This document contains a number of "Editorial Notes". These are remarks that have been left by the editor to record some open issue or known problem. Eventually, all such notes will be addressed and removed.
The data model aims to clarify which information is stored in Wikibase. The model is extensible, but at any point in time it should document everything that can possibly be stored in the system. It has two main goals:
Conceptual clarity: It should be clear what Wikibase can (and what it cannot) capture. It is not possible to capture all statements that one could make about the world (not even all that are important or reasonable). A balance must be found between expressive power and complexity/usability.
Technical documentation: Almost every component of Wikibase has to work with the data. To develop the software, it is therefore essential to have a common understanding of what the data is. Internally, the data can be represented quite differently (in objects, in a syntactic format, in a user interface, etc.): it is only important that each representation has a unique and unambiguous reading in terms of the data model.
There are a number of (sometimes conflicting) requirements that the data model should address in a balanced fashion:
The data model covers information that is expected to be relevant in the course of the Wikidata project. Initially, only a part of it needs to be implemented, but it is important to ensure that the data model can also support later requirements (at least to the extent that they are in scope of the Wikidata project). Therefore, the data model below is not separated into phases.
There are also a number of things that the data model is not supposed to do (or that are at least beyond this document), in particular:
Editorial Note: Provide documentation for (at least) the following bindings used by Wikibase: PHP, JavaScript, JSON, RDF. Additional bindings that may be particularly useful are Java and Python
The main purpose of Wikidata is to store data about things that are described by pages in Wikipedia (in any language). For example, one might want to store that the population of Berlin is 3,499,879. In this case, Berlin is the thing that is described, for example, by the article Berlin in English Wikipedia. In Wikidata, such a "thing" is represented as an Item. The Wikidata Item for Berlin would represent the thing that the Wikipedia article is about, not the Wikipedia article itself. Wikidata is concerned with recording facts about the subject of Wikipedia articles.
For every Item, various pieces of information are stored in Wikidata. First, there is some basic information that clarifies what the Item is about, such as the sitelink to a Wikipedia page in some language. There are also human readable labels and short descriptions that are used to help Wikidata users find the right Item. Second, there is a list of Statements that users have entered about the Item. Together, the information that is stored about one Item is called an ItemDescription.
Statements are the main approach of representing factual data, such as the population number in the above example. A Statement consists of two parts: a claim that something is the case (e.g., the claim "Berlin has a population of 3,499,879") and a list of references for that claim (e.g., a publication by the statistical office for Berlin-Brandenburg). The reference is given by a ReferenceRecord, and the list of references is allowed to be empty (like in Wikipedia, editors can add Statements without a reference, which might later be improved by others who know about a suitable reference).
The claim that is made in a Statement can have various forms. The most common form is a single assignment of a Value to a Property. For example, population is a Property and the number 3,499,879 is a Value. Property-Value pairs can express many different claims, and Values can be numbers, dates and times, geographic coordinates, and many more. An important special case are values that are Items. For example, one could state that Berlin is the capital of Germany, where Germany has its own Item in Wikidata, that the Property capital of refers to. Properties are defined by users, so any Property can be created. As opposed to Items, Properties do not refer to Wikipedia pages, but they do specify a Datatype for the data that they (usually) store. The data stored about Properties forms a PropertyDescription.
The individual things that Wikidata talks about, including Items and Properties, are called Entities. All Entities are Values, but many kinds of Values are not Entities (examples of the latter kind include Values for numbers, strings, and geographic coordinates). This is so since Wikidata does not intend to store Statements about individual data values, such as strings or numbers (but it could store Statements about a number as a concept that is discussed on a Wikipage, in which case the number is represented by a Wikidata Item).
Property-Value pairs are not the only kind of claims that can be given in a Statement. It is also possible to say, for example, that a Property has no Values for the given Item. For example, one can say that a circle has no angles. Stating this can be relevant to distinguish it from the (common) case that the property has simply not been entered into Wikidata yet. Other things that one can say are related to classification, for example to state that Berlin is a city (i.e., "an instance of the class of all cities"). This is treated in a specific way since classification is important in many areas, e.g., in biological taxonomies. For lack of a better name, any such basic assertion that one can make in Wikidata is called a Snak (which is small, but more than a byte). This term will not be relevant for using Wikidata (editors will not encounter it), but it is relevant for developers to avoid confusion with Statements or other claims.
For advanced usage, it is possible to make claims that consist of more than one Snak. For example, one might need to say that "the population of Berlin is 3,499,879, considering only the territory of the city, as estimated on 30 November 2011." Here, we have two additional Snaks that specify the territory the number refers to and the time when the measure was taken. It will be described below how exactly a claim can use additional Snaks.
This section explains our notation and general concepts that are used throughout this document.
The data structures that are specified in this document are usually described using UML class diagrams (see the Wikipedia page on UML for an introduction). We use only the following basic UML features:
The types of class members are either classes that are defined below, or one of the following basic datatypes:
| Datatype | Explanation |
|---|---|
| String | a sequence of characters, possibly empty, where each character represents a Unicode code point |
| integer | an integer number of arbitrarily large or small value |
| nonNegativeInteger | an integer number of arbitrarily large value greater than or equal to 0 |
| decimal | a decimal number of arbitrarily large or small value, and arbitrary precision |
| IRI | an absolute Internationalized Resource Identifier according to RFC 3987; we do not consider relative IRIs |
| GlobalSiteIdentifier | a short string for identifying external sites, e.g., the language-related identification scheme of Wikipedia sites. (Note that this is different from BCP 47, e.g., there is no "en-US" in Wikipedia, just "en") |
| UserLanguageCode | a short string for identifying languages, based on the language preference setting of logged in Wikipedia users. (This might be more similar to BCP 47 but is not necessarily the same either; it is more fine-grained than a GlobalSiteIdentifier) |
Numbers of arbitrarily large absolute value or precision can be represented as Strings, e.g., as described in the next section. For purposes of data access (e.g., retrieving values in numeric order), it will often be possible to approximate the value, e.g., by using a double value. However, technical formats such as float or double are not appropriate to represent user input accurately.
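The difference between exact and approximate representations can be illustrated with a minimal Python sketch. The sample inputs are hypothetical, and the standard library `decimal` module merely stands in for "a String-backed exact number"; the data model itself does not prescribe any such type.

```python
from decimal import Decimal

# Hypothetical user input, kept as a string exactly as it was entered.
user_input = "0.1"

exact = Decimal(user_input)   # preserves the decimal digits as typed
approx = float(user_input)    # nearest binary double, not exactly 0.1

# Converting the double back to Decimal exposes the rounding error.
float_is_exact = Decimal(approx) == exact

# A large integer survives as a string or Decimal, but not as a double.
big = "123456789012345678901234567890"
big_round_trips = int(float(big)) == int(big)

# An approximation is still good enough for retrieving values in numeric order.
ordered = sorted(["10.5", "2", "3.25"], key=float)
```

Here both `float_is_exact` and `big_round_trips` evaluate to `False`, while the sort still orders the values correctly.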
UML describes data structures in a rather abstract way. To talk about concrete instances of these data structures, it is useful to have a simple serialization syntax for objects, which we call Wikidata Object Notation (WON). The WON is not intended to be used in implementations, but it is useful to give examples and to describe how the data model maps to other syntaxes, such as JSON or RDF.
The WON is described in this text along with the data model, and it will use exactly the same format. We give its simple grammar in BNF notation, using the following standard notation:
| Construct | Syntax | Example |
|---|---|---|
| terminal symbols | strings in single quotes | 'PropertyDescription' |
| a set of terminal symbols described in English | italic | a nonempty finite sequence of digits between 0 and 9 |
| nonterminal symbols | boldface | Statement |
| zero or more | curly braces | {Statement} |
| zero or one | square brackets | [Statement] |
| alternative | vertical bar | Item \| Property |
The basic datatypes that were described above can be serialized in WON as follows:
quotedString := | a finite sequence of characters in which " and \ occur only in pairs of the form \" and \\, enclosed in a pair of " characters |
integer := | [ '-' ] nonNegativeInteger |
nonNegativeInteger := | a nonempty finite sequence of digits between 0 and 9 |
decimal := | integer [ '.' nonNegativeInteger ] |
IRI := | an IRI as defined in RFC 3987, enclosed in a pair of < and > characters |
GlobalSiteIdentifier := | a nonempty finite sequence of Latin characters between a and z, and - |
UserLanguageCode := | a nonempty finite sequence of Latin characters between a and z, and - |
We follow common conventions for escaping "-quoted strings, and of enclosing IRIs with < >.
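The serialization rules for the basic datatypes can be sketched in Python as follows. The function names are hypothetical and chosen for illustration only; WON is a notation for examples, so this is not part of any Wikibase implementation.

```python
def won_quoted_string(text: str) -> str:
    """Escape backslashes first, then double quotes, and wrap in quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def won_iri(iri: str) -> str:
    """Enclose an IRI in angle brackets; no RFC 3987 validation is attempted."""
    return "<" + iri + ">"


def won_integer(value: int) -> str:
    """str() already yields an optional '-' followed by decimal digits."""
    return str(value)
```

Escaping the backslash before the quote matters: doing it in the other order would double the backslashes that were just introduced for the quotes.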
Values are basic objects of Wikidata that each represent only one particular thing. Items represent topics of Wikipedia pages, Properties represent the properties that Items (or other Entities) can have, and DataValues represent individual values of a particular Datatype (a number, a geographic coordinate, etc.). The kinds of Values and their structure are shown in the following figure:

Various kinds of Values can be the subject of basic statements (Snaks): they are called Entities. Entities are identified in a uniform way using Uniform Resource Identifiers (URIs), or rather Internationalized Resource Identifiers (IRIs) that also allow Unicode symbols. Since an IRI is a global identifier, no two different Entities may have the same IRI. Hence, all entities can be represented by their IRI alone, without noting what kind of Entity they are. (Items have IRIs of the form https://www.wikidata.org/entity/Qnnn and Properties have IRIs of the form https://www.wikidata.org/entity/Pnnn)
Value := | DataValue | Entity |
Entity := | Datatype | Item | Property |
Datatype := | IRI |
Item := | IRI |
Property := | IRI |
In contrast to Entities, DataValues are not identified by an IRI but can simply be viewed as compound values that are identified by their content. Values without an IRI can still be named internally or in exports, but the identifiers that are used in this case will usually consist in the actual content (or a hash thereof).
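As a small illustration of identifying Entities by IRI alone, the following Python sketch recovers the kind of an Entity from the IRI forms given above. It assumes that "nnn" stands for a nonempty sequence of ASCII digits; the function name `entity_kind` is hypothetical.

```python
import re
from typing import Optional

# IRI forms taken from the text; "nnn" is assumed to be one or more digits.
ENTITY_IRI = re.compile(r"https://www\.wikidata\.org/entity/([QP])([0-9]+)")


def entity_kind(iri: str) -> Optional[str]:
    """Return 'Item' or 'Property' for a matching IRI, otherwise None."""
    match = ENTITY_IRI.fullmatch(iri)
    if match is None:
        return None
    return "Item" if match.group(1) == "Q" else "Property"
```

Datatype IRIs are not covered by this sketch, since their form is not given at this point in the text.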
Note that we distinguish single Entities (e.g., an Item about Berlin) from Descriptions of Entities (e.g., the collection of information that is stored about that Item about Berlin).
Items are Entities that are typically represented by a Wikipage (at least in some Wikipedia languages). They can be viewed as "the thing that a Wikipage is about," which could be an individual thing (the person Albert Einstein), a general class of things (the class of all Physicists), and any other concept that is the subject of some Wikipedia page (including things like History of Berlin).
The IRI of an Item will typically be closely related to the URL of its page on Wikidata. It is expected that Items store a shorter ID string (for example, as a title string in MediaWiki) that is used in both cases. ID strings might have a standardized technical format such as "Q1234567890" and will usually not be seen by users. The ID of an Item should be stable and not change after it has been created.
The exact meaning of an Item cannot be captured in Wikidata (or any technical system), but is discussed and decided on by the community of editors, just as it is done with the subject of Wikipedia articles now. It is possible that an Item has multiple "aspects" to its meaning. For example, the page Orca describes a species of whales. It can be viewed as a class of all Orca whales, and an individual whale such as Keiko would be an element of this class. On the other hand, the species Orca is also a concept about which we can make individual statements. For example, one could say that the binomial name (a Property) of the Orca species has the Value "Orcinus orca (Linnaeus, 1758)."
However, it is intended that the information stored in Wikidata is generally about the topic of the Item. For example, the Item for History of Berlin should store data about this history (if there is any such data), not about Berlin (the city). It is not intended that data about one subject is distributed across multiple Wikidata Items: each Item fully represents one thing. This also helps for data integration across languages: many languages have no separate article about Berlin's history, but most have an article about Berlin.
An Item can be linked to pages on other wikis via sitelinks. This is used by Wikipedia to link other language versions of an article (since the different language-specific instances of Wikipedia are technically separate wikis). Note that while an Item can have multiple sitelinks to different wikis, it cannot have multiple sitelinks to the same wiki. Sitelinks can additionally have a set of "badges" associated with the page (such as "featured article"). Badges are also represented as Items.
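The constraint of at most one sitelink per wiki can be sketched in Python by keying sitelinks on the site identifier. The class and attribute names below are hypothetical and are not taken from any Wikibase binding.

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class SiteLink:
    page_title: str
    badges: FrozenSet[str] = frozenset()  # IRIs of the Items used as badges


@dataclass
class SiteLinks:
    # Keyed by GlobalSiteIdentifier, so two links to one wiki cannot coexist.
    by_site: Dict[str, SiteLink] = field(default_factory=dict)

    def add(self, site: str, link: SiteLink) -> None:
        if site in self.by_site:
            raise ValueError(f"Item already has a sitelink to {site!r}")
        self.by_site[site] = link
```

For example, adding links for "en" and "de" succeeds, while a second `add("en", ...)` raises `ValueError`.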
Properties are Entities that describe a relationship between Items (or other Entities) and Values of the property. Typical properties are population (using numbers as values), binomial name (using strings as values), but also has father and author of (both using Items as values).
Like Items, Properties are identified by an IRI that will probably be closely related to their URL on Wikidata. However, the IDs will be based on a different naming scheme so that no confusion with Items is possible. For example, a typical identifier string used in a Property ID could be "P123456789". The ID of a Property should be stable and not change after it has been created.
Properties are treated differently to Items because they do not usually have a page in Wikipedia. While there is a page en:population, it does not describe the relationship between a region and its number of (human) inhabitants, but rather the noun population. This can be close to the property, but it can also lack important information. For example, the page en:parent describes what a parent is, but there are multiple related properties, especially parent of and has parent (which have a very different meaning). Wikipedias do not usually contain specific articles about such properties, only about the concepts that they relate to.
As another difference from Items, Properties can have a Datatype that specifies what kind of values users will normally enter for them. Note, however, that the data model does not require strict typing for Properties in Snaks (see below).
A Datatype is an Entity that determines the type and shape of the values that can be assigned to a Property. There are various common Datatypes, and each must be handled specifically by the software (for example, the user interface will be different depending on the type of data that is edited). Therefore, the Datatypes that are supported by Wikidata can only be extended by software developers, not by editors on the site. However, it might be possible to customize some Datatypes when using them for a Property (e.g., one might be able to say that a Property should only accept numbers without decimal digits, i.e., integers).
Most Datatypes are not primitive in the sense that their values consist of only one single value of a type that is commonly found in programming languages. For example, geographic coordinates are an important type of data in Wikidata, but they have an internal structure (e.g., specifying a latitude, longitude, and possibly a height).
More information about the Datatypes available in Wikidata is given in the respective section below.
DataValues are Values that are not Entities. They represent values of a particular Datatype, such as a particular number or point in time. Details on the available DataValues and their corresponding types are given in the respective section below.
Snaks are the basic information structures used to describe Entities in Wikidata. They are an integral part of each Statement (which can be viewed as a collection of Snaks about an Entity, together with a list of references).
Many of the Snaks are based on similar pieces of information, yet we distinguish Snaks that are intended to have a different meaning. This is useful in many places. Typically, Snaks of different meaning will be represented differently in the user interface.
Snak := | PropertySnak |
PropertySnak := | PropertyValueSnak | PropertySomeValueSnak | PropertyNoValueSnak |
Note that currently, all Snaks are PropertySnaks. Other types of Snaks that are not PropertySnaks may be defined in the future.
A PropertyValueSnak describes that an Entity has a certain Property with a given Value. Note that it is not required that Value belongs to the Datatype that is currently given to the Property in the system. In general, the UI and API of Wikidata will only allow Values that match the given Datatype, but if the Datatype is changed, then it will not be possible to update all stored data immediately. Moreover, if the Datatype is changed back to its earlier value, it might be possible to continue using existing data that was not changed. This is the main reason for not limiting the data model to strictly typed Properties.
Please also note that the data model does not actually define a unique Datatype for each Property: it just specifies how Datatype assignments would be represented; a unique Datatype is only obtained in a closed system where every Property has a globally unique Datatype assignment.
The Wikidata Object Notation for PropertyValueSnaks is as follows:
PropertyValueSnak := | 'PropertyValueSnak(' Property Value ')' |
Here and below, we omit the names of attributes (e.g., "subject") in WON, and simply encode their values positionally. We do not specify any delimiters between the arguments in this notation. It is silently assumed that whitespace is introduced to avoid ambiguities.
Example: Many basic kinds of data are naturally expressed by assigning Values to Properties. Some examples:
Obviously, each Value in these statements would refer to one clearly identified object (e.g., our label "Georgia" above is surely not precise enough). We omit such details for simplicity here. Also note that Snaks do not mention the subject to which they refer (Berlin, Georgia, Gandhi); this is given by the context in which a Snak is used (typically as part of a Statement).
A PropertyNoValueSnak describes that an Entity has no values for a certain Property.
PropertyNoValueSnak := | 'PropertyNoValueSnak(' Property ')' |
Example: In some cases, we want to emphasize that a property value has not just been left out (or not entered yet) but that it really does not exist. Some examples:
Such statements should only be made in cases where one could otherwise expect an incompleteness. It is not intended that Wikidata stores all things that are not the case (e.g., "The Pacific Ocean has no angle").
A PropertySomeValueSnak describes that an Entity has some value for a certain Property, without saying anything about this value. This can be used if the value of a property is unknown.
PropertySomeValueSnak := | 'PropertySomeValueSnak(' Property ')' |
Example: The information that a property has some value can be important and useful, even if the value is not known. For example:
Such statements should only be made if no concrete date is known. Wikidata does not support constraints on unknown values ("William of Ockham died in 1347 or 1348") but it does support precision on some types of data values ("William of Ockham died in the 1340s") and it does support different (possibly conflicting) values from multiple sources.
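The three kinds of PropertySnak and their WON forms can be summarized in a short Python sketch. The class layout and the `to_won` helper are hypothetical; the Value is assumed to be passed in already serialized, and a single space is used as the separator between arguments.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PropertyValueSnak:
    property_iri: str
    value_won: str  # the Value, already serialized in WON (an assumption)


@dataclass(frozen=True)
class PropertySomeValueSnak:
    property_iri: str


@dataclass(frozen=True)
class PropertyNoValueSnak:
    property_iri: str


def to_won(snak) -> str:
    """Serialize a snak positionally; the subject is never part of a Snak."""
    parts = ["<" + snak.property_iri + ">"]
    if isinstance(snak, PropertyValueSnak):
        parts.append(snak.value_won)
    return type(snak).__name__ + "(" + " ".join(parts) + ")"
```

Keeping the three kinds as separate types mirrors the intent that Snaks with different meanings are distinguished even when they carry similar information.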
Statements describe the claim of a statement and list references for this claim. Every Statement refers to one particular Entity, called the subject of the Statement. There is always one mainSnak that forms the most important part of the statement. Moreover, there can be zero or more additional PropertySnaks that describe the Statement in more detail. These qualifier Snaks (or "qualifiers" for short) store additional information that does not directly refer to the subject (e.g., the time at which the main part of the statement was valid). References are provided as a list (the order is significant in some contexts, especially for displaying a main reference). The complete structure is described as follows:

The individual components have the following meaning:
Statement := | 'Statement(' Entity Snak {PropertySnak} {ReferenceRecord} Rank ')' |
Example: A simple statement could just contain any of the Snaks in the above examples. The use of qualifier Snaks is illustrated in the following examples:
In each case, there are other ways to capture the respective information. Like in Wikipedia, it is left to the community to agree on uniform ways of expressing such things. Often, there are good reasons to prefer one representation over the other. For example, there are cases where a country is known to have inhabitants of some ethnic group, while the percentage of that group is not known; then the qualifier Snak could simply be omitted.
The ranks provide a simple selection/filtering criterion in cases where there are many Statements for some property. There are three possible ranks, which have roughly the following meaning:
This model is intentionally left coarse and simple. The three levels translate to different treatments in data access, UI (e.g., what is displayed by default), and export (one could, e.g., have an export with only the preferred and normal Statements). The ranks may also be useful for protecting Statements from editing (e.g., by protecting only preferred and normal statements). More fine-grained rankings do not seem to have such a clear interpretation and would thus increase the UI complexity unnecessarily. Having only two ranks (or no ranks at all), on the other hand, would make it harder to cope with Statements that are not trusted, known to contain wrong claims, or simply unpatrolled (if ranks are used for protection).
Another useful concept can be constructed based on the ranks defined above: the "best rank" for the Statements about a given Property with respect to a given Item. If there is at least one Statement with preferred rank about the property (in the context of a given Item), the best rank for that property is preferred. Otherwise, the best rank is normal. Correspondingly, the "best Statements" about a given Property in the context of a given Item are the ones that have the best rank for that Property.
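The "best rank" rule can be written down as a minimal Python sketch. The `RankedStatement` class and the property identifiers are hypothetical simplifications; the rank name "deprecated" for the third rank is taken from Wikibase usage and is an assumption as far as this text is concerned.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class RankedStatement:
    property_id: str
    value: str
    rank: str  # "preferred", "normal", or (assumed name) "deprecated"


def best_rank(statements: List[RankedStatement], property_id: str) -> str:
    """'preferred' if any Statement for the property has it, else 'normal'."""
    has_preferred = any(
        s.property_id == property_id and s.rank == "preferred"
        for s in statements
    )
    return "preferred" if has_preferred else "normal"


def best_statements(statements: List[RankedStatement], property_id: str) -> List[RankedStatement]:
    """All Statements for the property that carry its best rank."""
    rank = best_rank(statements, property_id)
    return [s for s in statements if s.property_id == property_id and s.rank == rank]
```

Note that the best rank is never the lowest rank: a property whose Statements all have the lowest rank has best rank normal and therefore no best Statements.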
ReferenceRecords are intended to store information about some source, represented as a set of Snaks. In the simplest case, the source can be represented by a single Snak, e.g., providing a URL. But ReferenceRecords can also be more complex, e.g., consisting of Snaks representing the title, author, and publisher of a book, along with chapter and page of the cited information.
ReferenceRecord := | 'ReferenceRecord(' Snak {Snak} ')' |
EntityDescriptions are collections of information about an entity, and they mainly serve as data containers that can be interpreted as sets of Snaks with some further attributes (that could also be represented as Snaks, if desired). In addition, EntityDescriptions may support lexical information that can be used for displaying, searching, or referencing the respective entity.
We define PropertyDescription and ItemDescription subtypes of EntityDescription that correspond to entities of the respective type, Property and Item. In particular, all Statements of an ItemDescription must use the expected Item as the subject of their mainSnak, and all Statements of a PropertyDescription must use the expected Property as the subject of their mainSnak.
EntityDescriptions can contain basic lexical information. Each ItemDescription and PropertyDescription supports internationalized labels, descriptions, and aliases. The overall structure of ItemDescription and PropertyDescription can be defined as follows:
EntityDescription := | ItemDescription | PropertyDescription |
ItemDescription := | 'ItemDescription(' Item [MultilingualTextValue] [MultilingualTextValue] [MultilingualMultiTextValue] {Statement} ')' |
PropertyDescription := | 'PropertyDescription(' Property [MultilingualTextValue] [MultilingualTextValue] [MultilingualMultiTextValue] {Statement} ')' |
The three types of lexical information supported by ItemDescription and PropertyDescription are labels, descriptions, and aliases. Labels and descriptions are MultilingualTextValues, aliases are MultilingualMultiTextValues: for any given language, an EntityDescription may have at most one label and at most one description, but any number of aliases. Their respective purposes are:
The lexical information in EntityDescriptions may be used as unique keys as follows:
Planned Feature:
Planned Feature:
The structure of PropertyDescriptions may be expanded in the future, to cover the following:
Note that this information could be expressed using Statements on the PropertyDescriptions, without extending the structure of PropertyDescriptions. In order to make use of such Statements, the processing software would have to have knowledge of the meaning of the properties used to make such statements.
Datatypes[1] are Entities that specify the format of Property Values. The set of Datatypes in Wikibase is system-defined (it can be extended, but only by developers). Every Datatype has a fixed IRI that is also system-defined.
For every Datatype, there is one particular form of Value that is used to represent Values of that type. Wikibase distinguishes between Values that can be the subject of Snaks, called Entities, and Values that are not the subject of Snaks, called DataValues. The following is an overview of all DataValues:

DataValue := | QuantityValue | StringValue | TimeValue | GeoCoordinateValue | GeoShapeValue | MediaValue | IriValue | MonolingualTextValue | MultilingualTextValue |
A QuantityValue represents a decimal number, together with information about the uncertainty interval of this number, and a unit of measurement. The decimal number is represented as a string using the lexical form of XML Schema decimal. The attributes are:
QuantityValue := | 'QuantityValue(' decimal [decimal] [decimal] IRI ')' |
The given amount is interpreted as the main value of the QuantityValue. The optional lowerBound and upperBound specify how far the true value of the represented quantity could deviate from the number in positive or negative direction. This makes it possible to capture expressions such as 12300 +/- 50. For many practical purposes, only the number might be used (e.g., for sorting and query answering), but the variance can provide valuable information for presentation (e.g., for selecting reasonable precision in unit conversions). If the lower and upper bound are not present, the uncertainty is unspecified.
The exact interpretation of the uncertainty interval provided with lowerBound and upperBound is unspecified. Depending on context, it may represent hard limits on the value, or the interval may just describe the 66 or 95 percentile interval of a normal distribution.
In the Wikibase UI, quantities and their bounds are input together as a string: for instance, "4~" will give an amount of 4, a lower bound of 3.5 and an upper bound of 4.5. Strings must currently match the following regular expression to be parsed in this way:
^\s*((?:[-+]\s*)?(?:[\d,]+\.\d*|\.?\d+)(?:[eE][-+]?\d+)?)\s*(?:([~!])|(?:\+/?-|±)\s*((?:[-+]\s*)?(?:[\d,]+\.\d*|\.?\d+)(?:[eE][-+]?\d+)?)|)\s*$
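The regular expression above can be tried out with a minimal Python sketch that handles only the explicit "+/-" or "±" form. The helper names are hypothetical, and the sketch assumes that Python's `re` module interprets this particular pattern in the same way as the engine used by Wikibase. The rule that derives bounds for the "~" and "!" shorthands is not implemented; the sketch returns no bounds in those cases.

```python
import re
from decimal import Decimal

# The pattern is copied verbatim from the text above.
QUANTITY = re.compile(
    r"^\s*((?:[-+]\s*)?(?:[\d,]+\.\d*|\.?\d+)(?:[eE][-+]?\d+)?)\s*"
    r"(?:([~!])|(?:\+/?-|±)\s*((?:[-+]\s*)?(?:[\d,]+\.\d*|\.?\d+)(?:[eE][-+]?\d+)?)|)\s*$"
)


def _to_decimal(text: str) -> Decimal:
    # Drop thousands separators and whitespace that may follow the sign.
    return Decimal(re.sub(r"[,\s]", "", text))


def parse_quantity(text: str):
    """Return (amount, lowerBound, upperBound); bounds are None if not explicit."""
    match = QUANTITY.match(text)
    if match is None:
        raise ValueError(f"not a quantity: {text!r}")
    amount = _to_decimal(match.group(1))
    if match.group(3) is None:
        return amount, None, None  # includes the "~" and "!" shorthands
    margin = abs(_to_decimal(match.group(3)))
    return amount, amount - margin, amount + margin
```

With this sketch, the expression "12300 +/- 50" from the text yields an amount of 12300 with bounds 12250 and 12350.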
The unit specifies a physical quantity that the number refers to. It is represented as an IRI rather than as a String, since a string like "m" might represent different units in different contexts. The value should be meaningful independently of the declaration information for its Property (from which more details about units could possibly be obtained), hence the unit is a full IRI. In practice, this IRI might be the IRI referring to an ItemDescription representing the desired unit, or be taken from a standard vocabulary for units, like QUDT[1].
Editorial Note: It is not clear yet how exactly the variance information is to be used to ensure "reasonable" unit conversion and display. There are special cases such as English body sizes that may need special treatment ("5 feet, 3 inches") but this should not affect the data model. Describe plans for unit conversion
The calendar model used for saving the data is always the proleptic Gregorian calendar according to ISO 8601, but the Calendar model used for displaying the data is given by the saved Calendar model.[2]
A TimeValue represents a point in time that might be imprecise (e.g., if only a year is given). For practical purposes (e.g., sorting values), the value will often be interpreted to be exact by filling the missing positions with more details. The structure of values of this type is as follows:
Interpretation of dates follows ISO 8601:[4]
If you have something like "between 1846 and 1855", you can use the "before" and "after" fields of the time value:
time: "+00000001850-00-00T00:00:00Z", precision: 9, before: 4, after: 5
This means the "main" value is 1850, given as a year, with a lower bound four years before and an upper bound five years after the "main" value (before and after are given in the unit specified by the precision value). The "main" value is what is going to be displayed per default; it will also be used for sorting query results (once we have queries).
This is a bit complicated, but should allow you to actually represent uncertain dates. We made it so you can be precise about the uncertainty.
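The arithmetic of the example above can be sketched in Python. The function name `year_interval` is hypothetical, and the sketch deliberately handles only year precision (the value 9 in the example); other precision values are not covered.

```python
import re

YEAR_PRECISION = 9  # the precision value used for "year" in the example above


def year_interval(time: str, precision: int, before: int = 0, after: int = 0):
    """Return (earliest_year, latest_year) for a year-precision TimeValue."""
    if precision != YEAR_PRECISION:
        raise NotImplementedError("this sketch only handles year precision")
    match = re.match(r"([+-]\d+)-\d\d-\d\dT", time)
    if match is None:
        raise ValueError(f"unexpected time format: {time!r}")
    year = int(match.group(1))  # int() accepts the sign and leading zeros
    return year - before, year + after
```

Applied to the example, `year_interval("+00000001850-00-00T00:00:00Z", 9, before=4, after=5)` gives the interval from 1846 to 1855.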
IriValue := | 'IriValue(' IRI ')' |
An IriValue represents an arbitrary IRI that follows RFC 3987. If the protocol part is supported by MediaWiki, a hyperlink might be displayed, but the Datatype as such does not require such protocols, and generally it is not required that all IRIs work as URLs. For example, the "tel:" protocol (RFC 3966) might also be allowed.
A coordinate is represented as:
GlobeCoordinateValue := | 'GlobeCoordinateValue(' decimal decimal decimal URI ')' |
Editorial Note: This needs to be specified. It is likely that Wikibase will simply refer to an existing standard for representing geographic shapes here, e.g., WKT or GeoJSON.
Items in Wikibase are represented by Item as explained in the section on Values above. While not subtypes of DataValue, we list them here to define the IRI for the respective datatype. It is not planned to have user-defined properties for other types of Entities for now.
Item attributes in Wikibase are represented by Properties as explained in the section on Values above. While not subtypes of DataValue, we list them here to define the IRI for the respective datatype. It is not planned to have user-defined properties for other types of Entities for now.
Editorial Note: Media is represented by a dedicated Datatype since Media items should be handled in a specific way. Moreover, it might be useful to have additional metadata for Media objects. To be defined.
StringValue := | 'StringValue(' String ')' |
Strings are represented by StringValues. All strings are considered as sequences of Unicode glyphs. As opposed to multilingual and monolingual texts, strings do not contain any language information, and are typically used directly only for strings that do not belong to a language, e.g., the post code of a UK city.
Note: Wikibase enforces that strings are at least one character long and disallows strings that match the regular expression ^\s|[\v\t]|\s$ (disallowing any strings that start or end with whitespace or contain vertical whitespace such as newlines).
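The constraint can be approximated in Python as sketched below. This assumes that the expression is meant in the PCRE sense, where \v matches any vertical whitespace; Python's `re` treats \v as the single character U+000B, so the vertical whitespace characters are spelled out explicitly. The function name `is_valid_string` is hypothetical.

```python
import re

# Assumption: PCRE's \v covers these vertical whitespace characters.
_VERTICAL = "\n\x0b\x0c\r\x85\u2028\u2029"

# Leading whitespace, any vertical whitespace or tab, or trailing whitespace.
_DISALLOWED = re.compile(r"^\s|[" + _VERTICAL + r"\t]|\s$")


def is_valid_string(text: str) -> bool:
    """True if the text is nonempty and contains no disallowed whitespace."""
    return len(text) >= 1 and _DISALLOWED.search(text) is None
```

Inner spaces remain allowed, so a value such as "SW1A 1AA" passes, while an empty string or a string with a trailing space does not.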
MonolingualTextValues are Values that represent a phrase in some language. In particular, their content could also be pronounced (and be associated with pronunciation information or audio versions). The attributes of MonolingualTextValues are:
MonolingualTextValue := | 'MonolingualTextValue(' UserLanguageCode String ')' |
MultilingualTextValue := | 'MultilingualTextValue(' {MonolingualTextValue} ')' |
MultilingualTextValues are Values that represent a phrase in many languages. This is different from representing individual Values for each language, since it also captures the information that all of the Values are direct translations (otherwise, if a Property has multiple MonolingualTextValues in each language, it would not be clear which values belong together). MultilingualTextValues store a list of MonolingualTextValues, but at most one for each UserLanguageCode.
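The "at most one per language" rule can be sketched in Python by building a mapping from language code to text and rejecting duplicates. The function name and the use of plain tuples for MonolingualTextValues are hypothetical simplifications.

```python
from typing import Dict, Iterable, Tuple


def multilingual_text(values: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Collect (UserLanguageCode, String) pairs, one entry per language."""
    result: Dict[str, str] = {}
    for language, text in values:
        if language in result:
            raise ValueError(f"more than one text for language {language!r}")
        result[language] = text
    return result
```

For instance, `multilingual_text([("en", "Germany"), ("de", "Deutschland")])` succeeds, whereas passing two "en" entries raises `ValueError`.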
MonolingualMultiTextValue := | 'MonolingualMultiTextValue(' UserLanguageCode {String} ')' |
MultilingualMultiTextValue := | 'MultilingualMultiTextValue(' {MonolingualMultiTextValue} ')' |
MultilingualMultiTextValues are Values that represent a list of phrases in several languages. There is no implied relationship between the list of phrases in the different languages. MultilingualMultiTextValues store a list of MonolingualMultiTextValues.