Posted on • Originally published atholmeshe.me on
Understanding The Memcached Source Code - LRU I
II,III) is the core module of the cache system, which largely determines how efficient the bottleneck resource, memory, can be utilized. The other 3 parts, namely,
LRU algorithm (I - this article, II) for entry expiration; and an
event driven model (not complete) based on libevent; and the
consistent harsh (not complete) for data distribution,
are built around it.
More often than not, theLRU algorithm is combined with ahash map , and is referred to as a
LRU Cache
In aLRU-cache , thehash map enables the fast accessing of thecached objects; andLRU avoids thecache to grow infinitely by marking expired, or so called,least recently used objects. Next we look at howLRU works from a high level standpoint.
Linked list
Technically,LRU algorithm works on alinked list, whenever a list entry is used (accessed or updated), it is removed from the list and be attached to the list head. In this way, the closer an element is to the list tail, theless recently used it is. Hence it is easy to remove irrelevant or “expired” elements from the tail based on a certain configuration.
Harsh map
Linked list is slow when it comes to element access, hence another data structure is employed. We have seen howlinked list “strings”chunks inslabs to makefree list_s. In anLRU cache , the mechanism is similar, however, it is thehash map entries instead of _chunks inslabs got wired up this time, which looks like:
We can also flatten the linked list, and make the structure a bit more clear,
Core data structure - item
typedefstruct_stritem{/* Protected by LRU locks */struct_stritem*next;struct_stritem*prev;/* Rest are protected by an item lock */struct_stritem*h_next;/* hash chain next */rel_time_ttime;/* least recent access */rel_time_texptime;/* expire time */intnbytes;/* size of data */unsignedshortrefcount;uint8_tnsuffix;/* length of flags-and-length string */uint8_tit_flags;/* ITEM_* above */uint8_tslabs_clsid;/* which slab class we're in */uint8_tnkey;/* key length, w/terminating null and padding *//* this odd type prevents type-punning issues when we do * the little shuffle to save space when not using CAS. */union{...// scr: cascharend;// scr: flexible array member indicating the item header "end"}data[];/* if it_flags & ITEM_CAS we have 8 bytes CAS *//* then null-terminated key *//* then " flags length\r\n" (no terminating null) *//* then data with terminating \r\n (no terminating null; it's binary!) */}item;
Properties in use
next
,prev
,slabs_clsid
- item_link_q, item_unlink_q
it_flags
- do_item_link, do_item_unlink
time
,refcount
- do_item_link
h_next
- assoc_insert, assoc_delete
nkey
- assoc_delete
Memory layout of an item chunk
We have mentioneditem chunk indo_slabs_free. With the help of this data structure, we can now examine the chunk more closely.
Next we read the relevant code that performs the above discussed LRU operations.
do_item_link
intdo_item_link(item*it,constuint32_thv){// scr: -------------------> 1)...it->it_flags|=ITEM_LINKED;// scr: -------------------> 2)it->time=current_time;...// scr: stat/* Allocate a new CAS ID on link. */...// scr: casassoc_insert(it,hv);// scr: -------------------> 3)item_link_q(it);// scr: -------------------> 4)refcount_incr(&it->refcount);// scr: -------------------> 5)...// scr: statreturn1;}
1)hv
is supposed to be the shortened “hashed value”.
2) SetITEM_LINKED
init->it_flags
, and set current time toit->time
.
The fieldit_flags
is used indo_slabs_free anddo_slabs_alloc
3) Insert theitem to hash map.
4) Insert theitem to linked list.
5) Increase thereference count.
This field is initialized as1
indo_slabs_alloc
It is worth noting here thatreference count indicates how many sub-modules are using the same resource, so as to determine when to actually deallocate the resource (In this particular case,item is referred by bothslab andLRU ). I have writtenthis article that explains a similar mechanism of C++.
item_link_q - add to linked list
item_link_q is a thread safe wrapper of the workhorse methoddo_item_link_q
.
staticvoiditem_link_q(item*it){pthread_mutex_lock(&lru_locks[it->slabs_clsid]);do_item_link_q(it);pthread_mutex_unlock(&lru_locks[it->slabs_clsid]);}
staticvoiddo_item_link_q(item*it){/* item is the new head */item**head,**tail;assert((it->it_flags&ITEM_SLABBED)==0);head=&heads[it->slabs_clsid];// scr: -------------------> 1)tail=&tails[it->slabs_clsid];assert(it!=*head);assert((*head&&*tail)||(*head==0&&*tail==0));it->prev=0;// scr: -------------------> 2)it->next=*head;if(it->next)it->next->prev=it;*head=it;if(*tail==0)*tail=it;sizes[it->slabs_clsid]++;// scr: -------------------> 3)return;}
1) Get the head and tail of the respectiveLRU linked list indicated byslabs_clsid
. Note that theslabs_clsid
issalted with the type of the queue, hence eachslab group may enlist multiplelists.
2) Standard operations of “adding an element to the front”.
3) Increase the global array sizes.
staticitem*heads[LARGEST_ID];staticitem*tails[LARGEST_ID];...staticunsignedintsizes[LARGEST_ID];
assoc_insert - add to hash map
intassoc_insert(item*it,constuint32_thv){// scr: again, hv -> hash valueunsignedintoldbucket;...// scr: expanding related operations}else{it->h_next=primary_hashtable[hv&hashmask(hashpower)];// scr: 1)primary_hashtable[hv&hashmask(hashpower)]=it;// scr: 2)}...// scr: expanding related operations}
1) Deal with potential conflict. If there is no, theh_next
will be set tonull
.
2) Set the item to the bucket in primary_hashtable.
...staticitem**primary_hashtable=0;...
The expanding logic omitted here will be covered in following articles.
do_item_unlink
voiddo_item_unlink(item*it,constuint32_thv){MEMCACHED_ITEM_UNLINK(ITEM_key(it),it->nkey,it->nbytes);if((it->it_flags&ITEM_LINKED)!=0){it->it_flags&=~ITEM_LINKED;// scr: -------------------> 1)...// scr: statassoc_delete(ITEM_key(it),it->nkey,hv);// scr: ---------------> 2)item_unlink_q(it);// scr: -------------------> 3)do_item_remove(it);// scr: -------------------> *)}}
1) ClearITEM_LINKED
init->it_flags
.
2) Remove theitem from hash map.
3) Remove theitem from linked list.
*) The actual releasing of anitem will be covered in the next article.
item_unlink_q - remove from linked list
Likewise, item_unlink_q is a thread safe wrapper of the workhorse methoddo_item_unlink_q
.
staticvoiditem_link_q(item*it){pthread_mutex_lock(&lru_locks[it->slabs_clsid]);do_item_link_q(it);pthread_mutex_unlock(&lru_locks[it->slabs_clsid]);}
staticvoiddo_item_unlink_q(item*it){item**head,**tail;head=&heads[it->slabs_clsid];// scr: -------------------> 1)tail=&tails[it->slabs_clsid];if(*head==it){// scr: -------------------> 2)assert(it->prev==0);*head=it->next;}if(*tail==it){assert(it->next==0);*tail=it->prev;}assert(it->next!=it);assert(it->prev!=it);if(it->next)it->next->prev=it->prev;if(it->prev)it->prev->next=it->next;sizes[it->slabs_clsid]--;// scr: -------------------> 3)return;}
1) Same, get the head and tail of the respectiveLRU linked list indicated byslabs_clsid
.
2) Standard operations of “removing an element from a linked list”.
3) Decrease the global array sizes.
staticitem*heads[LARGEST_ID];staticitem*tails[LARGEST_ID];...staticunsignedintsizes[LARGEST_ID];
assoc_delete - remove from hash map
staticitem**_hashitem_before(constchar*key,constsize_tnkey,constuint32_thv){item**pos;unsignedintoldbucket;...// scr: expanding related operations}else{pos=&primary_hashtable[hv&hashmask(hashpower)];// scr: -----> 1)}while(*pos&&((nkey!=(*pos)->nkey)||memcmp(key,ITEM_key(*pos),nkey))){pos=&(*pos)->h_next;// scr: ----------------------------------> 2)}returnpos;}...voidassoc_delete(constchar*key,constsize_tnkey,constuint32_thv){item**before=_hashitem_before(key,nkey,hv);if(*before){item*nxt;...nxt=(*before)->h_next;// scr: --------------------------------> 3)(*before)->h_next=0;/* probably pointless, but whatever. */*before=nxt;// scr: ------------------------------------------> 4)return;}/* Note: we never actually get here. the callers don't delete things they can't find. */assert(*before!=0);}
1) Get the hash bucket usinghv
.
2) Go through the conflict chain and compare thekey
. Note that the result value is theaddress of thenext
member of the elementbefore
the found one. When there is no conflict, the address is the bucket itself.
3) Set the next element after the found one to temporary variablenxt
.
4) Update thenext
member of the elementbefore
the found one.
Take home
Trythis.
Top comments(0)
For further actions, you may consider blocking this person and/orreporting abuse