The Unified Code for Units of Measure in RDF: cdt:ucum and other UCUM Datatypes

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11155)

Abstract

Being able to describe quantity values and their units is a requirement common to many applications in several industrial sectors, such as manufacturing, transport and logistics, personal and public health, smart cities, energy, environment, buildings, and agriculture. Different ontologies have been developed to describe units, their relations, and quantities with their values. In this paper we propose an alternative approach that leverages the Unified Code for Units of Measure (UCUM), a code system intended to include all units of measures being contemporarily used in international science, engineering, and business. Our approach consists of a main UCUM datatype identified by the IRI http://w3id.org/lindt/custom_datatypes#ucum, abbreviated as cdt:ucum. This datatype can be used for lightweight encoding and querying of quantity values, in a wide range of applications where representing and reasoning with quantity kinds and values is more important than reasoning with units. We compare our approach with existing approaches, and demonstrate it with our implementation on top of Apache Jena and an online testing tool.

1 Introduction

Applications in many industry sectors rely on quantity values with units of measure for sensor observations, actuation, design calculations and simulations, quantitative general knowledge, etc. A typical way to convey the value of a quantity in RDF is to use a structure with one triple providing a numerical value as a literal of a standard datatype (xsd:float, xsd:double, xsd:decimal), and another triple with an IRI identifying the unit. A dedicated ontology can define the properties that connect the quantity value to the numerical value and the unit. An alternative approach relies on custom datatypes [5].

In this paper, we introduce an RDF datatype, cdt:ucum, that transposes to RDF the full expressive power of the Unified Code for Units of Measure (UCUM [8]), enabling lightweight descriptions and querying of physical quantities using a single datatype. We currently provide 32 more specific datatypes, such as cdt:speed and cdt:length, to further specify the quantity kind of quantity values, but more datatypes may be introduced in the future.

We first show in Sect. 2 how quantity values are typically described in RDF with existing ontologies or custom datatypes. Then, in Sect. 3, we introduce the cdt:ucum datatype, highlighting the conciseness of the representation with many examples. Finally, in Sect. 4, we describe our implementation as an extension of Apache Jena with support for cdt:ucum in SPARQL queries, with an online testing tool.

2 Related Work

We identify two approaches to represent physical quantities in RDF: using ontologies, or using custom datatypes.

Using ontologies of units of measurement. The classical approach consists in using an ontology to describe units, their relations, and measurements. A recent survey [4] compares and evaluates eight well-known ontologies for units of measurement, including MUO [6], QUDV [1], OM [7], and QUDT [3]. This survey also reports on the Wikidata corpus, which currently contains over 4.4k measurement units and 4.1k non-prefixed units. Using such ontologies, quantity values are usually represented as OWL individuals linked to some numeric value and to some individual representing a unit of measure. For example, Listing 1.1 represents the quantity value 29 °C using QUDT 1.1.

[Listing 1.1: the quantity value 29 °C represented using QUDT 1.1.]

Not all possible units of measurement are (or will be) defined in these ontologies. For example, QUDT 1.1 defines a unit for kilowatt hour, but not for megawatt hour. Application developers in the energy domain can either force themselves to use units they are not used to, or define the missing units using the definition mechanism provided by QUDT. This extension mechanism uses concepts such as base units, conversion offsets and multipliers, numerator and denominator. For example, Listing 1.2 illustrates how the unit megawatt hour may be defined. Even then, two energy operators may define the same unit using different URIs, leading to potential interoperability issues.

[Listing 1.2: a possible definition of the unit megawatt hour using the QUDT extension mechanism.]

Datasets using quantity values defined with such ontologies require 4 triples every time a quantity needs to be linked to a quantity value, and complex mechanisms are needed to canonicalize quantity values so as to query them uniformly. We are not aware of any existing support of QUDT or OM custom units in any RDF or SPARQL engine.
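
To make the triple count concrete, the following sketch models both encodings as plain Python tuples. It is only an illustration: the QUDT terms (qudt:quantityValue, qudt:numericValue, qudt:unit, unit:DegreeCelsius) and every ex: name are assumptions chosen for the example, since the original listing is not reproduced here. Only the cdt:ucum IRI is taken from the text.

```python
# Illustrative only: RDF triples as (subject, predicate, object) tuples.
# The QUDT terms and all ex: names are assumptions made for this sketch.

CDT_UCUM = "http://w3id.org/lindt/custom_datatypes#ucum"


def ontology_encoding(quantity, number, unit_iri):
    """Link a quantity to a value node that carries a number and a unit IRI."""
    value_node = "_:v"
    return [
        (quantity, "qudt:quantityValue", value_node),
        (value_node, "rdf:type", "qudt:QuantityValue"),
        (value_node, "qudt:numericValue", (str(number), "xsd:double")),
        (value_node, "qudt:unit", unit_iri),
    ]


def datatype_encoding(quantity, number, ucum_unit):
    """Link a quantity to a single literal typed with cdt:ucum."""
    return [(quantity, "ex:value", (f"{number} {ucum_unit}", CDT_UCUM))]


onto = ontology_encoding("ex:temperature", 29, "unit:DegreeCelsius")
lit = datatype_encoding("ex:temperature", 29, "Cel")  # "Cel" is the UCUM code for degree Celsius
print(len(onto), "triples versus", len(lit), "triple")
```

In the second encoding the unit travels inside the literal, so no unit individual has to be defined or agreed upon beforehand.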

Using datatypes. DBpedia has many datatypes, which are hard-coded in OntologyDatatypes.scala and listed in the DBpedia Mappings Wiki for reference. DBpedia defines datatypes for physical dimensions (e.g., http://dbpedia.org/datatype/Area) along with datatypes for specific units of measure (e.g., http://dbpedia.org/datatype/cubicInch). Yet these datatype IRIs do not dereference, so one cannot tell whether Inch here refers to, for example, international customary units, U.S. survey lengths, or British Imperial lengths. Again, not all possible units of measurement are (or will be) defined in the DBpedia ontology, and complex mechanisms are needed to canonicalize quantity values so as to query them uniformly.

We previously proposed an approach for RDF and SPARQL engines to support arbitrarily complex custom datatypes on-the-fly by dereferencing their URIs and retrieving specifications in JavaScript [5]. In this paper we are exclusively interested in datatypes for quantity values, and do not consider on-the-fly support capabilities.

3 Specification of cdt:ucum and other UCUM Datatypes

The Unified Code for Units of Measure (UCUM) [8] is a code system intended to include all units of measures being contemporarily used in international science, engineering, and business.

We define an RDF datatype UCUM identified by the IRI http://w3id.org/lindt/custom_datatypes#ucum, abbreviated as cdt:ucum. Its lexical space is the concatenation of the lexical form of an xsd:decimal, optionally followed by e or E and the lexical form of an xsd:integer, at least one space, and a unit chosen in the case-sensitive version of the UCUM code system. The value space corresponds to the set of measures, or quantity values, as defined by the International System of Quantities. The lexical-to-value mapping maps lexical forms with a UCUM unit to their corresponding measures according to the International System of Quantities.
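
The lexical space described above can be approximated with a regular expression. The sketch below is a minimal illustration, not the authors' implementation: it assumes the optional exponent marker is e or E, it treats any run of non-space characters as the unit, and it does not validate the unit against the UCUM grammar. The function name parse_ucum_literal is hypothetical.

```python
import re
from decimal import Decimal

# Lexical form: xsd:decimal, optional e/E plus xsd:integer, spaces, unit.
# Assumption: the unit is any whitespace-free token; UCUM validation is omitted.
_LEXICAL = re.compile(
    r"^(?P<mantissa>[+-]?(?:\d+(?:\.\d*)?|\.\d+))"  # xsd:decimal
    r"(?:[eE](?P<exponent>[+-]?\d+))?"              # optional exponent
    r" +"                                           # at least one space
    r"(?P<unit>\S+)$"                               # unit code (unchecked)
)


def parse_ucum_literal(lexical):
    """Split a cdt:ucum lexical form into (numeric value, unit code)."""
    match = _LEXICAL.match(lexical)
    if match is None:
        raise ValueError(f"not a cdt:ucum lexical form: {lexical!r}")
    value = Decimal(match.group("mantissa"))
    if match.group("exponent") is not None:
        # Shift the decimal exponent without losing precision.
        value = value.scaleb(int(match.group("exponent")))
    return value, match.group("unit")


print(parse_ucum_literal("29 Cel"))
print(parse_ucum_literal("1.5e3 m"))
```

Using Decimal rather than float keeps the numeric part exact, which matches the xsd:decimal basis of the lexical space.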

We also define a set of additional datatypes, such as cdt:length and cdt:speed, that further specify the quantity kind of quantity values. Their lexical spaces, value spaces, and lexical-to-value mappings are subsets of those of cdt:ucum. More such datatypes may be defined in the future. Table 1 lists examples of valid cdt:ucum literals, and their equivalent using more specific datatypes.

Table 1. Some valid UCUM literals.

4 Implementation of UCUM Datatypes on Apache Jena

The UCUM specification has implementations in different languages. We used the latest version of systems-ucum-java8, an implementation leveraging the recent Java Units of Measurement API 2.0 (JSR 385), to add support for the 33 datatypes specified above on top of Apache Jena. Our extension, named jena-ucum, is open source and available online. It overloads the native SPARQL comparison operators (=, <, etc.) to compare UCUM literals, and the arithmetic operators (+, −, *, /) to manipulate quantity value literals:

  1. Add two commensurable quantity value literals;
  2. Subtract a quantity value literal from a commensurable one;
  3. Multiply two quantity value literals, or a quantity value literal and a scalar (xsd:int, xsd:decimal, xsd:float, xsd:double);
  4. Divide a quantity value literal by a quantity value literal, a quantity value literal by a scalar, or a scalar by a quantity value literal.

We additionally define a custom SPARQL function with IRI http://w3id.org/lindt/custom_datatypes#sameDimension, which takes two parameters and returns true if they are commensurable quantity values.
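
The semantics of these overloaded operators can be sketched independently of Jena. The Python model below is an assumption-laden illustration and not how jena-ucum is implemented: it uses a small hand-written unit table (UNITS) instead of a UCUM parser, tracks only length and time dimensions, and the names Quantity and same_dimension are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction

# Hypothetical mini unit table: UCUM code -> (factor to SI, (length, time) exponents).
# A real implementation would parse arbitrary UCUM expressions instead.
UNITS = {
    "m": (Fraction(1), (1, 0)),
    "km": (Fraction(1000), (1, 0)),
    "s": (Fraction(1), (0, 1)),
    "h": (Fraction(3600), (0, 1)),
    "m/s": (Fraction(1), (1, -1)),
    "km/h": (Fraction(1000, 3600), (1, -1)),
}


@dataclass(frozen=True)
class Quantity:
    si_value: Fraction  # magnitude expressed in coherent SI units
    dim: tuple          # dimension exponents (length, time)

    @classmethod
    def of(cls, number, unit):
        factor, dim = UNITS[unit]
        return cls(Fraction(number) * factor, dim)

    def _require_commensurable(self, other):
        if self.dim != other.dim:
            raise TypeError("quantities are not commensurable")

    def __add__(self, other):
        self._require_commensurable(other)
        return Quantity(self.si_value + other.si_value, self.dim)

    def __sub__(self, other):
        self._require_commensurable(other)
        return Quantity(self.si_value - other.si_value, self.dim)

    def __lt__(self, other):
        self._require_commensurable(other)
        return self.si_value < other.si_value

    def __mul__(self, other):
        if isinstance(other, Quantity):
            dim = tuple(a + b for a, b in zip(self.dim, other.dim))
            return Quantity(self.si_value * other.si_value, dim)
        return Quantity(self.si_value * Fraction(other), self.dim)

    __rmul__ = __mul__  # scalar * quantity

    def __truediv__(self, other):
        if isinstance(other, Quantity):
            dim = tuple(a - b for a, b in zip(self.dim, other.dim))
            return Quantity(self.si_value / other.si_value, dim)
        return Quantity(self.si_value / Fraction(other), self.dim)

    def __rtruediv__(self, scalar):  # scalar / quantity
        return Quantity(Fraction(scalar) / self.si_value, tuple(-a for a in self.dim))


def same_dimension(a, b):
    """Mimic the sameDimension function: true for commensurable quantities."""
    return isinstance(a, Quantity) and isinstance(b, Quantity) and a.dim == b.dim


print(Quantity.of(1, "km") + Quantity.of(500, "m") == Quantity.of(1500, "m"))
print(Quantity.of(90, "km") / Quantity.of(1, "h") == Quantity.of(90, "km/h"))
print(same_dimension(Quantity.of(1, "km"), Quantity.of(1, "s")))
```

Normalizing every value to SI on construction is one possible design choice: equality and ordering then reduce to comparing a number and a dimension vector, at the cost of forgetting the unit the literal was originally written in.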

Fig. 1. Screenshot of the UCUM datatypes playground, https://w3id.org/lindt/playground.html.

5 Demonstration

We demonstrate the UCUM datatypes using a playground illustrated in Fig. 1 and accessible online. The user can enter a SPARQL CONSTRUCT or SELECT query, together with the default graph of the RDF dataset on which it is evaluated. The result is computed in real time and returned to the user over the WebSocket protocol.

Predefined queries progressively introduce the use of SPARQL comparison operators, arithmetic functions, and solution sequence modifiers (ORDER BY). Other predefined queries illustrate each of the 33 currently defined UCUM datatypes. We will also showcase how the UCUM datatypes may be used in combination with other vocabularies such as SOSA/SSN [2].
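
As a rough picture of what ORDER BY means for literals written in different units, the sketch below sorts length literals by their value in metres. It is an assumption-based illustration in Python rather than SPARQL: the TO_METRE table covers only four UCUM length codes, and sort_key is a hypothetical helper.

```python
from fractions import Fraction

# Assumed conversion table for a handful of UCUM length codes.
TO_METRE = {
    "mm": Fraction(1, 1000),
    "cm": Fraction(1, 100),
    "m": Fraction(1),
    "km": Fraction(1000),
}


def sort_key(lexical):
    """Map a '<number> <unit>' length literal to its value in metres."""
    number, unit = lexical.split()
    return Fraction(number) * TO_METRE[unit]


literals = ["1 km", "250 m", "30000 cm", "1.5 m"]
ordered = sorted(literals, key=sort_key)
print(ordered)
```

Sorting on the raw numbers alone would put "1 km" first. Comparing canonical values places it last, which is the ordering one would expect from unit-aware comparison operators.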

6 Conclusion

Using the UCUM datatypes, only one triple is required to link a quantity to a fully qualified value, and no custom mechanism is needed to canonicalize literals based on external descriptions of units of measurement. Datasets using UCUM datatypes are therefore drastically lighter, and queries are simpler. The UCUM datatype can inherently represent an infinite set of custom units, and is therefore suitable for an open set of application domains.

A similar datatype could be defined to support amounts of money, potentially in any currency and with a timestamp for that currency.

References

  1. Quantities, Units, Dimensions, Values (QUDV). SysML 1.2 Revision Task Force Working draft, Object Management Group, October 30 2009

  2. Haller, A., Janowicz, K., Cox, S.J.D., Le Phuoc, D., Taylor, K., Lefrançois, M.: Semantic Sensor Network Ontology. W3C Recommendation, W3C, 19 October 2017

  3. Hodgson, R., Keller, P.J., Hodges, J., Spivak, J.: QUDT - Quantities, Units, Dimensions and Data Types Ontologies. Technical report, NASA (2014)

  4. Keil, J.M., Schindler, S.: Comparison and evaluation of ontologies for units of measurement. Semant. Web J. (2018, to appear). http://www.semantic-web-journal.net/content/comparison-and-evaluation-ontologies-units-measurement-1

  5. Lefrançois, M., Zimmermann, A.: Supporting arbitrary custom datatypes in RDF and SPARQL. In: Sack, H., Blomqvist, E., d’Aquin, M., Ghidini, C., Ponzetto, S.P., Lange, C. (eds.) ESWC 2016. LNCS, vol. 9678, pp. 371–386. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-34129-3_23

  6. Polo, L., Berrueta, D.: MUO - Measurement Units Ontology, Working Draft - DD April 2008. Working draft, Fundación CTIC (2008)

  7. Rijgersberg, H., van Assem, M., Top, J.L.: Ontology of units of measure and related concepts. Semant. Web J. 4(1), 3–13 (2013)

  8. Schadow, G., McDonald, C.J.: The Unified Code for Units of Measure. Technical report, Regenstrief Institute Inc., 22 October 2013

Acknowledgments

This work has been partly funded by the ANR 14-CE24-0029 OpenSensingCity project.

Author information

Authors and Affiliations

  1. Univ Lyon, MINES Saint-Étienne, CNRS, Laboratoire Hubert Curien UMR 5516, 42023, Saint-Étienne, France

    Maxime Lefrançois & Antoine Zimmermann

Corresponding author

Correspondence to Maxime Lefrançois.

Editor information

Editors and Affiliations

  1. University of Bologna, Bologna, Italy

    Aldo Gangemi

  2. IBM Research - Almaden, San Jose, CA, USA

    Anna Lisa Gentile

  3. CNR-ISTC, Rome, Italy

    Andrea Giovanni Nuzzolese

  4. Technische Universität Dresden, Dresden, Germany

    Sebastian Rudolph

  5. Karlsruhe Institute of Technology, Karlsruhe, Germany

    Maria Maleshkova

  6. University of Mannheim, Mannheim, Germany

    Heiko Paulheim

  7. University of Aberdeen, Aberdeen, UK

    Jeff Z Pan

  8. CNR-ISTC, Rome, Italy

    Mehwish Alam

Copyright information

© 2018 Springer Nature Switzerland AG

Cite this paper

Lefrançois, M., Zimmermann, A. (2018). The Unified Code for Units of Measure in RDF: cdt:ucum and other UCUM Datatypes. In: Gangemi, A., et al. The Semantic Web: ESWC 2018 Satellite Events. ESWC 2018. Lecture Notes in Computer Science, vol 11155. Springer, Cham. https://doi.org/10.1007/978-3-319-98192-5_37
