- Notifications
You must be signed in to change notification settings - Fork3
ZigaSajovic/optimizing-the-memory-layout-of-std-tuple
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
In the last few years I have become increasingly interested in bringing higher order concepts of category theory closer to the bits that implement their instances. This leads one to languages likeC++, where the types have insight into the hardware, which gives the constructs control over how they are mapped onto it. On the way towards of such meta-endeavours I createdCppML, ametalanguage for C++, which I use when developing libraries.In this text, we will use it to optimize the memory layout ofstd::tuple
, at no runtime or cognitive cost on the end of the user.
Before we begin, have a look at the result.
Tuple<char,int,char,int,char,double,char> tup{'a',1,'c',3,'d',5.0,'e'}; std::cout <<"Size of Tuple:" <<sizeof(tup) <<" Bytes" << std::endl; std::tuple<char,int,char,int,char,double,char> std_tup{'a',1,'c',3,'d',5.0,'e'}; std::cout <<"Size of std::tuple:" <<sizeof(std_tup) <<" Bytes" << std::endl; std::cout <<"Actual size of data:" <<4 *sizeof(char) +2 *sizeof(int) +sizeof(double) <<" Bytes" << std::endl; std::cout << get<2>(tup) <<" ==" << std::get<2>(std_tup) << std::endl;assert(tup == std_tup);
Size of Tuple: 24 Bytes
Size of std::tuple: 40 Bytes
Actual size of data: 20 Bytes
c == c
We notice that thestd::tuple has20 Bytes ofwasted space (making ittwice as big as the actual data), whileTuple only has4 Bytes ofwasted space.
class | size [B] | efficiency |
---|---|---|
Data | 20 | 1 |
Tuple | 24 | 0.84 |
std::tuple | 40 | 0.5 |
The solution spans roughly70
lines of code, which we will build up step by step in thisREADME. Please note that it does not contain all the functionalities required ofstd::tuple
, but it does provide all the non-trivial implementations (hence others are trivially implementable in terms (or in light) of those provided). The entire code can be foundhere.
Please note that this text is not intended to be a tutorial onCppML, for that please see the in-depthCppML Tutorial
. Regardless, we will still include explanations and illustrative examples of the steps we take here. Note that all the links with theml::
prefix (likeml::ZipWith
) lead to the metafunctions entry in theCppML Reference
, which you are encouraged to follow.
You can imaginestd::tuple<T0, T1, ..., Tn>
as a class, with membersT0 t0
,T1 t1
, ...,Tn tn
, laid out in the specified (or reversed) order. As such, like in any class, padding needs to be inserted between the members, so that they arenaturally aligned, which generally means that the data's memory address is a multiple of the data size (have a look at theData structure alignment). This means that the order of the members will influence the size of the objects of that type. This is why tuples with same types, but in different order, can have different sizes.
std::tuple<char,int,char> t_big;std::tuple<int,char,char> t_small;std::cout <<"Size of t_big:" <<sizeof(t_big) <<std::endl;std::cout <<"Size of t_small:" <<sizeof(t_small) <<std::endl;
Size of t_big: 12
Size of t_small: 8
Given that the order of elements has an impact on the size of the objects of that class we can turn towards finding the permutation of the order of members that minimizes the needed padding. But checking each permutation hassuperexponential complexity (seeSterlings formula). But we can approximate this process, by sorting the elements by their alignment. You can see this, by imagining aligning elements, which have been sorted by size:
Lets say you are aligning the first elementT
, and it needs padding to be naturally aligned. All the elements with the same size asT
will follow it in the sequence, and wont need padding. The first elementU
that comes with a different size fromT
will need some padding. Than all the elements of same size that follow it, will not. And so on.
We implement a classTuple
, which is an interface wrapper aroundstd::tuple
. It works by approximating the optimal permutation by thePermutation
that sorts the types by theiralignment
. It than lays the objects out in memory in that order. It holds thePermutation
as its template argument, and uses it to internally redirect the users indexing (hence the user can be oblivious to the permutation).
We want aTupleBase
wrapper of astd::tuple
, which will
template<typename Permutation,typename StdTuple>structTupleBase;template<int ...Is,typename ...Ts>structTupleBase< ml::ListT<ml::Int<Is>...>, std::tuple<Ts...>> {/* Implementation*/};
have thePermuatation
that sorts the types by theiralignment
as its first template parameter, and the already permutedstd::tuple
as its second. Hence, we need aMakeBase
metafunction, which will allow us to implementTuple
class like
template<typename... Ts>structTuple : MakeBase<Ts...> {using MakeBase<Ts...>::MakeBase;};
On a concrete example, we wantMakeBase
using TB0 = MakeBase<char,int,char,int,char,double,char>;using TB1 = TupleBase<ml::ListT<ml::Int<5>, ml::Int<3>, ml::Int<1>, ml::Int<6>, ml::Int<4>, ml::Int<2>, ml::Int<0>>, std::tuple<double,int,int,char,char,char,char>>;static_assert( std::is_same_v<TB0, TB1>);
We will first write themetaprogram that computes the permutations, and than code theinterface indirecting wrapper.
To get the permutation and its inverse, we take the list of types, enumerate (zip with a tag) them (so we can track their position), and sort them by their alignment. We than extract the enumerates from the sequence as our permutation and extract the sorted types into astd::tuple
.
Below we specify themetaprogram in a bullet-list, which we will than translate step by step intoCppML
.
- Zip with
Tag
(usingml::ZipWith
) ml::Sort
the resulting parameter packTag<ml::Int<Is>, Ts>...
, with theComparator
that takes thealignment
of theT
.Comparator: P0, P1 -> Bool<t>
- We
ml::Map
(theP0
andP1
) by:ml::Unwrap
the parameter pack fromTag<Int<I>, T>
(seeUnwrapping template arguments into metafunctions
), and- extract the second element (
T
) usingml::Get
, and - pipe the extracted
T
intoml::AlignOf
- and pipe the alignments into
ml::Greater
- We than split the sorted parameter pack
Tag<ml::Int<Is>, Ts>...
intoTupleBase<ml::ListT<ml::Int<Is>...>, std::tuple<Ts...>>
by:- create a
ml::ProductMap
of:ml::Map
of extractors of theml::Int<i>
:ml::Unwrap
the parameter pack fromTag<Int<I>, T>
(seeUnwrapping template arguments into metafunctions
), and- extract the first element (
ml::Int<I>
) usingml::Get
, and - pipe into
ml::ToList
ml::Map
of extractors of theT
:ml::Unwrap
the parameter pack fromTag<Int<I>, T>
(seeUnwrapping template arguments into metafunctions
), and- extract the second element (
T
) usingml::Get
, and - pipe into the metafunction created from
std::tuple
(usingml::F
; seeLifting templates to metafunctions
),
- and
Pipe
into the metafunction created fromTupleBase
(usingml::F
; seeLifting templates to metafunctions
)
- create a
This sequence is easily translated intoCppML
:
template<typename ...Ts>using MakeBase = ml::f< ml::ZipWith< Param, ml::Sort<ml::Map<ml::Unwrap<ml::Get<1, ml::AlignOf<>>>, ml::Greater<>>, ml::Product<ml::Map<ml::Unwrap<ml::Get<0>>>, ml::Map<ml::Unwrap<ml::Get<1>>, ml::F<std::tuple>>, ml::F<TupleBase>>>>, ml::Range<>::f<0,sizeof...(Ts)>, ml::ListT<Ts...>>;
We will spend the rest of this post building theMakeBase
metafunction step by step.
We will also need a metafunction that will compute the inverse permutation for an indexI
, which will allow us to internally redirect users indexing. This is done by locating the index ofI
in the permutation (usingml::FindIdIf
.
Our goal is two-fold. We wish to generate a permutation of elements, which will optimize the object size, while ensuring theas if rule. This means that the user is to be oblivious to the permutation behind the scene, and can access elements in the same order he/she originally specified. To this purpose, we will enumerate the types (i.e.Zip them with appropriate index) before permuting them.
CppML provides theml::ZipWith
metafunction, which takesN
lists of types, and Zips them using the provided template. It also providesml::Range
, which generates a list of integer constantsml::ListT<ml::Int<0>, ..., ml::Int<N>>
.
template<typename Id,typename T>structTag {};template<typename... Ts>using TaggedList =ml::f<ml::ZipWith<Tag// Type with which to Zip// ml::ToList<> // is the implicit Pipe >,typename ml::TypeRange<>::template f<0,sizeof...(Ts)>,// Integer sequence from [0, |Ts...|) ml::ListT<Ts...>>
For example
using Tagged = ml::ListT<Tag<ml::Int<0>,int>, Tag<ml::Int<1>,double>>;static_assert(std::is_same_v<TaggedList<int,double>, Tagged>);
Note the implicitPipe (ml::ToList
) in the above code section.Pipe is a key concept inCppML, it is how metafunction execution can be chained (think bash pipes). In the above code section, the implicitPipe is (ml::ToList
), which returns the result in a list. We will be replacing it withml::Sort
later (i.e. we will pipe the result ofml::ZipWith
intoml::Sort
).
As specified, we will useml::Sort
as a greedy solution to the packing problem.ml::Sort
<Comparator>
provided byCppML operates on a parameter pack:
using SortedIs = ml::f<ml::Sort<ml::Greater<>>,// Predicate ml::Int<0>, ml::Int<2>, ml::Int<1>>;using Sorted = ml::ListT<ml::Int<2>, ml::Int<1>, ml::Int<0>>;static_assert(std::is_same_v<SortedIs, Sorted>);
We need to write a predicate metafunction appropriate for out list of tagged types.
The predicate is a metafunction mapping a pair of types toml::Bool<truth_val>
. The elements of the list are of the formTag<ml::Int<I0>, T0>
, and we wish to compare onAlignments ofT
s. Therefore our predicate is to be a metafunction that maps
(Tag<ml::Int<I0>, T0>, Tag<ml::Int<I1>, T1>)->(T0, T1)>->(ml::Int<alignment_of_T0>, ml::Int<alignment_of_T1>)->ml::Bool<truth_val>
We will achieve this bytaking a pack of two types and (ml::Map
)ing them by a metafunction that (ml::Unwrap
)s theTag
,pipes
the result toml::Get
<1>
, to extract the second element (beingTi
), and pipe the result toml::AlignOf
. We will thanpipe
the result of (ml::Map
) toml::Greater
.
using Predicate = ml::Map<// Map each of the two types: ml::Unwrap<// Unwrap the Tag ml::Get<1,// Extract the second element ml::AlignOf<>>>,// Take its alignment ml::Greater<>>;// And Pipe the result of Map to// Greater
For example
static_assert( std::is_same_v< ml::f<Predicate, Tag<ml::Int<2>,double>, Tag<ml::Int<5>,float>>, ml::Bool<true>>)
because alignment ofdouble
is greater than that offloat
(on my machine).
We can now put it all together.
template<typename Id,typename T>structTag{};template<typename... Ts>using TaggedPermutation = ml::f<ml::ZipWith<Tag,// Zip the input lists with Tag and pipe into ml::Sort<Predicate>>,// than compare the generated// elements (pipe-ing from the// Map) using Greater ml::Range<>::f<0,sizeof...(Ts)>,// generate a// range from// [0,// numOfTypes) ml::ListT<Ts...>>;
On a concrete example, this metafunction evaluates to
using TaggedPerm = TaggedPermutation<char,int,char,int,char,double,char>;using List = ml::ListT<Tag<ml::Int<5>,double>, Tag<ml::Int<3>,int>, Tag<ml::Int<1>,int>, Tag<ml::Int<6>,char>, Tag<ml::Int<4>,char>, Tag<ml::Int<2>,char>, Tag<ml::Int<0>,char>>static_assert( std::is_same_v<TaggedPerm, List>);
Examining theTaggedPerm
above, it is essentially a list of Tags,Tag<ml::Const<int, I>, T>
, withI
being the original position ofT
. We now need to split this list into two lists, where the first will only hold the integer constants, and the other only the types. We accomplish this by (ml::Map
)ing eachTag
with theml::Get
. Note that we will need to (ml::Unwrap
) theTag
, as (ml::Map
) operates on parameter packs.
using Extract0 = ml::Map<// Mapping ml::Unwrap<ml::Get<0>>>// Get N-th element
and for the types
using Extract1 = ml::Map<// Mapping ml::Unwrap<ml::Get<1>, ml::F<std::tuple>>>// Get N-th element
This metafunctions can now be used on the computedTaggedPerm
.Note that in theExtract1
we are using a lifted templatestd::tuple
as thePipe
of (ml::Map
).
using Permutation = ml::ListT<ml::Int<5>, ml::Int<3>, ml::Int<1>, ml::Int<6>, ml::Int<4>, ml::Int<2>, ml::Int<0>>using Perm = ml::f<ml::Unwrap<TaggedPerm>, Extract0>;static_assert( std::is_same_v< Permutation, Perm>);using PermutedTuple = std::tuple<double,int,int,char,char,char,char>using Tupl = ml::f<ml::Unwrap<TaggedPerm>, Extract1>;static_assert( std::is_same_v< PermutedTuple, Tupl>);
To run bothExtract0
andExtract1
on theTaggedPermutation
we will use(ml::ProductMap
), which takesN
metafunctions and executes all of them on the input parameter packTs...
. As mentioned, we willPipe
the results ofExtract0
toml::F<TupleBase>
.
using ExtractBoth = ml::ProductMap< Extract0, Extract1, ml::F<TupleBase>>;// Pipe results of both into here
We see that it correctly computes theTupleBase
instance:
using TupleBase_ = ml::f<ml::Unwrap<TaggedPerm>, ExtractBoth>;using TB1 = TupleBase<ml::ListT<ml::Int<5>, ml::Int<3>, ml::Int<1>, ml::Int<6>, ml::Int<4>, ml::Int<2>, ml::Int<0>>, std::tuple<double,int,int,char,char,char,char>>;static_assert( std::is_same_v< TB1, TupleBase_>);
which matches our wishes from theformulated solution
.
Putting it all together, theMakeBase
metafunction looks like so:
template<typename ...Ts>using MakeBase = ml::f< ml::ZipWith< Param, ml::Sort<ml::Map<ml::Unwrap<ml::Get<1, ml::AlignOf<>>>, ml::Greater<>>, ml::Product<ml::Map<ml::Unwrap<ml::Get<0>>>, ml::Map<ml::Unwrap<ml::Get<1>>, ml::F<std::tuple>>, ml::F<TupleBase>>>>, ml::Range<>::f<0,sizeof...(Ts)>, ml::ListT<Ts...>>;
The last thing to do, is to compute the inverse permutation. Each indexI
is inverted by finding position in the permutation.
CppML provides the metafunctionml::FindIdIf
<Predicate>
, which returns the index of the first element that satisfies the predicate. Hence, we only need to form the predicate. We do so, bypartially evaluating theml::IsSame
metafunction. For example,
using Is1 = ml::Partial<ml::IsSame<>, ml::Int<1>>;using T = ml::f<Is1, ml::Int<2>>;static_assert( std::is_same_v<T, ml::Bool<false>>);
This means that we can invert the indexI
by
template<typename I>using Invert = ml::f< ml::Unwrap<Permutation>, ml::FindIdIf< ml::Partial<ml::IsSame<>>, I>>;
With thepermutation and itsinverse computed, and the permuted elements extracted, we turn to coding the class interface indirection. The classTuple will contain:
- the permuted
std::tuple
member_tuple
- from its second template argument
Invert
alias which will compute the inverse permutation for an indexI
- a delegate constructor:
- It will forward the arguments
Us...
as a tuple to thework construcotr
- It will forward the arguments
- a
work constructor
:- it will initialize the
_tuple
member by:- permuting the arguments of the forwarding tuple into its initializer
std::get<Is>(fwd)...
- permuting the arguments of the forwarding tuple into its initializer
- it will initialize the
- the
get<I>()
friend function, which:- will use the
f
alias to invertI
in thePermutation
- and forward the inverted index to
std::get
- will use the
template<int... Is,typename... Ts>structTupleBase<ml::ListT<ml::Int<Is>...>, std::tuple<Ts...>> {private: std::tuple<Ts...> _tuple;template<typename... Us>TupleBase(ml::_, std::tuple<Us...> &&fwd)// work constructor : _tuple{static_cast<ml::f<ml::Get<Is>, Us...> &&>(std::get<Is>(fwd))...} {}public:template<typename... Us>TupleBase(Us &&... us)// delegate constructor : TupleBase{ml::_{},std::forward_as_tuple(static_cast<Us &&>(us)...)} {}template<typename I>// Compute the inverse indexusing f = ml::f<ml::FindIdIf<ml::Partial<ml::IsSame<>, I>>, ml::Int<Is>...>;template<int I,typename... Us>frienddecltype(auto) get(TupleBase<Us...> &tup);};template<int I,typename... Us>decltype(auto) get(TupleBase<Us...> &tup) {return std::get<ml::f<TupleBase<Us...>, ml::Int<I>>::value>(tup._tuple);}
Which allows us to implement theTuple
class, like so:
template<typename... Ts>structTuple : MakeBase<Ts...> {using MakeBase<Ts...>::MakeBase;};
Asget<N>
is the driving force behind the interface, in terms of which other functionalities are implemented, their implementation is trivial. Here we demonstrate the implementation of the equality operator.
template<int... Is,template<class...>classT,typename... Ts,template<class...>classU,typename... Us,typename F>decltype(auto) envoker(ml::ListT<ml::Int<Is>...>,const T<Ts...> &t,const U<Us...> &u, F &&f) {using std::get;return (... &&f(get<Is>(t), get<Is>(u)));}template<typename... Ts,typename... Us>autooperator==(const Tuple<Ts...> &lhs,const Tuple<Us...> &rhs) ->bool {using List = ml::TypeRange<>::f<0,sizeof...(Ts)>;returnenvoker(List{}, lhs, rhs, [](auto &&x,auto &&y) {return x == y; });};template<typename... Ts,typename... Us>autooperator==(const std::tuple<Ts...> &lhs,const Tuple<Us...> &rhs) ->bool {using List = ml::TypeRange<>::f<0,sizeof...(Ts)>;returnenvoker(List{}, lhs, rhs, [](auto &&x,auto &&y) {return x == y; });};template<typename... Ts,typename... Us>autooperator==(const Tuple<Us...> &lhs,const std::tuple<Ts...> &rhs) ->bool { rhs == lhs;};
We trust the reader would be able to implement the missing functions and methods, such as other comparison operators, or converting constructors (forTuple
andstd::tuple
).
About
Optimizing the memory layout of std::tuple
Topics
Resources
Uh oh!
There was an error while loading.Please reload this page.
Stars
Watchers
Forks
Releases
Packages0
Contributors2
Uh oh!
There was an error while loading.Please reload this page.