Memory Manager For Small(ish) Microprocessors
rhempel/umm_malloc
This is a memory management library specifically designed to work with the ARM7 embedded processor, but it should work on many other 32 bit processors, as well as 16 and 8 bit devices.
You can even use it on a bigger project where a single process might want to manage a large number of smaller objects, and using the system heap might get expensive.
Joerg Wunsch and the avr-libc provided the first `malloc()` implementation that I examined in detail.
http://www.nongnu.org/avr-libc
Doug Lea's paper on malloc() was another excellent reference and provides a lot of detail on advanced memory management techniques such as binning.
http://gee.cs.oswego.edu/dl/html/malloc.html
Bill Dittman provided excellent suggestions, including macros to support using these functions in critical sections, and for optimizing `realloc()` further by checking to see if the previous block was free and could be used for the new block size. This can help to reduce heap fragmentation significantly.
Yaniv Ankin suggested that a way to dump the current heap condition might be useful. I combined this with an idea from plarroy to also allow checking a free pointer to make sure it's valid.
Dimitry Frank contributed many helpful additions to make things more robust including a user specified config file and a method of testing the integrity of the data structures.
GitHub user @devyte provided useful feedback on the nesting of functions as well as a fix for the problem that separates out the core free and malloc functionality.
GitHub users @d-a-v and @devyte provided great input on establishing a heap fragmentation metric which they graciously allowed to be used in umm_malloc.
Katherine Whitlock (@stellar-aria) extended the library for usage in scenarios where more than one heap or memory space is needed.
This library is designed to be included in your application as a submodule that has default configuration that can be overridden as needed by your application code.
The `umm_malloc` library can be initialized two ways. The first is at link time:
- Set `UMM_MALLOC_CFG_HEAP_ADDR` to the symbol representing the starting address of the heap. The heap must be aligned on the natural boundary size of the processor.
- Set `UMM_MALLOC_CFG_HEAP_SIZE` to the size of the heap in bytes. The heap size must be a multiple of the natural boundary size of the processor.
This is how the `umm_init()` call handles initializing the heap.
We can also call `umm_init_heap(void *pheap, size_t size)` where the heap details are passed in manually. This is useful in systems where you can allocate a block of memory at run time - for example in Rust.
For usage in a scenario that requires multiple heaps, the heap type `umm_heap` is exposed. All API functions (`malloc`, `free`, `realloc`, etc.) have a corresponding `umm_multi_*` variant that takes a pointer to this type as its first parameter.
Much like standard initialization, there are two methods:
- `umm_multi_init(umm_heap *heap)`, which initializes a given heap using linker symbols
- `umm_multi_init_heap(umm_heap *heap, void *ptr, size_t size)`, which will initialize a given heap using a known address and size.
`umm_malloc` is designed to be testable in standalone mode using `ceedling`. To run the test suite, just make sure you have `ceedling` installed and then run:
    ceedling clean
    ceedling test:all
⚠️ You MUST provide a file called `umm_malloc_cfgport.h` somewhere in your app, even if it's blank.
The reason for this is the way the configuration override hierarchy works. The priority for configuration overrides is as follows:
- Command line defines using `-D UMM_xxx`
- A custom config filename using `-D UMM_MALLOC_CFGFILE="<filename.cfg>"`
- The default config filename `path/to/config/umm_malloc_cfgport.h`
- The default configuration in `src/umm_malloc_cfg.h`
The following `#define`s are set to useful defaults in `src/umm_malloc_cfg.h` and can be overridden as needed.
The fit algorithm is defined as either:
- `UMM_BEST_FIT`, which scans the entire free list and looks for either an exact fit or the smallest block that will satisfy the request. This is the default fit method.
- `UMM_FIRST_FIT`, which scans the free list and takes the first block that satisfies the request.
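To make the difference between the two policies concrete, here is a minimal sketch in C. It does not use the library's data structures: it assumes the free list has been flattened into an array of block counts, and the names `model_first_fit` and `model_best_fit` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: sizes[i] is the size, in blocks, of the i-th
 * entry on the free list. Both helpers return an index, or -1 when
 * no entry is large enough. */

/* First fit: stop at the first entry that is large enough. */
int model_first_fit(const uint16_t *sizes, size_t count, uint16_t want)
{
    assert(sizes != NULL);
    for (size_t i = 0; i < count; i++) {
        if (sizes[i] >= want) {
            return (int)i;
        }
    }
    return -1;
}

/* Best fit: remember the smallest entry that is large enough and
 * stop early only on an exact match. */
int model_best_fit(const uint16_t *sizes, size_t count, uint16_t want)
{
    int best = -1;

    assert(sizes != NULL);
    for (size_t i = 0; i < count; i++) {
        if (sizes[i] < want) {
            continue;               /* too small, keep looking */
        }
        if (sizes[i] == want) {
            return (int)i;          /* exact fit cannot be improved */
        }
        if (best < 0 || sizes[i] < sizes[best]) {
            best = (int)i;          /* smallest candidate so far */
        }
    }
    return best;
}
```

With free sizes of 8, 3, 5 and 4 blocks and a request for 4 blocks, the first-fit sketch picks index 0 (the 8 block entry) while the best-fit sketch picks index 3 (the exact match).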
The following `#define`s are disabled by default and should remain disabled for production use. They are helpful when testing allocation errors (which are normally due to bugs in the application code) or for running the test suite when making changes to the code.
- `UMM_INFO` is used to include code that allows dumping the entire heap structure (helpful when there's a problem).
- `UMM_INTEGRITY_CHECK` is used to include code that performs an integrity check on the heap structure. It's up to you to call the `umm_integrity_check()` function.
- `UMM_POISON_CHECK` is used to include code that adds some bytes around the memory being allocated that are filled with known data. If the data is not intact when the block is checked, then someone has written outside of the memory block they have been allocated. It is up to you to call the `umm_poison_check()` function.
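The idea behind the poison check can be illustrated with a small standalone sketch. The guard length, the fill byte and both function names below are assumptions made for illustration only; the library's actual layout and values may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_GUARD_LEN  4      /* hypothetical: guard bytes on each side */
#define MODEL_GUARD_BYTE 0xA5   /* hypothetical: known fill pattern */

/* Fill the guard areas around a user region of `size` bytes that starts
 * MODEL_GUARD_LEN bytes into `raw`, and return the user region.
 * `raw` must hold at least size + 2 * MODEL_GUARD_LEN bytes. */
unsigned char *model_poison_wrap(unsigned char *raw, size_t size)
{
    assert(raw != NULL);
    memset(raw, MODEL_GUARD_BYTE, MODEL_GUARD_LEN);
    memset(raw + MODEL_GUARD_LEN + size, MODEL_GUARD_BYTE, MODEL_GUARD_LEN);
    return raw + MODEL_GUARD_LEN;
}

/* Return 1 if both guard areas still hold the known pattern,
 * 0 if either one has been overwritten. */
int model_poison_intact(const unsigned char *user, size_t size)
{
    const unsigned char *before = user - MODEL_GUARD_LEN;
    const unsigned char *after = user + size;

    for (size_t i = 0; i < MODEL_GUARD_LEN; i++) {
        if (before[i] != MODEL_GUARD_BYTE || after[i] != MODEL_GUARD_BYTE) {
            return 0;
        }
    }
    return 1;
}
```

Writing one byte past the end of the user region changes a guard byte, so the next call to the check reports the overrun.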
The following functions are available for your application:
    void *umm_malloc(size_t size)
    void *umm_calloc(size_t num, size_t size)
    void *umm_realloc(void *ptr, size_t size)
    void umm_free(void *ptr)
They have exactly the same semantics as the corresponding standard library functions.
To initialize the library there are two options:
    void umm_init(void)
    void umm_init_heap(void *ptr, size_t size)
For the case of multiple heaps, corresponding `umm_multi_*` functions are provided.
    void *umm_multi_malloc(umm_heap *heap, size_t size)
    void *umm_multi_calloc(umm_heap *heap, size_t num, size_t size)
    void *umm_multi_realloc(umm_heap *heap, void *ptr, size_t size)
    void umm_multi_free(umm_heap *heap, void *ptr)
As with the standard API, there are two options for initialization:
    void umm_multi_init(umm_heap *heap)
    void umm_multi_init_heap(umm_heap *heap, void *ptr, size_t size)
The memory manager assumes the following things:
- The standard POSIX compliant malloc/calloc/realloc/free semantics are used
- All memory used by the manager is allocated at link time, it is aligned on a 32 bit boundary, it is contiguous, and its extent (start and end address) is filled in by the linker.
- All memory used by the manager is initialized to 0 as part of the runtime startup routine. No other initialization is required.
The fastest linked list implementations use doubly linked lists so that it's possible to insert and delete blocks in constant time. This memory manager keeps track of both free and used blocks in a doubly linked list.
Most memory managers use a list structure made up of pointers to keep track of used - and sometimes free - blocks of memory. In an embedded system, this can get pretty expensive as each pointer can use up to 32 bits.
In most embedded systems there is no need for managing a large quantity of memory blocks dynamically, so a full 32 bit pointer based data structure for the free and used block lists is wasteful. A block of memory on the free list would use 16 bytes just for the pointers!
This memory management library sees the heap as an array of blocks, and uses block numbers to keep track of locations. The block numbers are 15 bits - which allows for up to 32767 blocks of memory. The high order bit marks a block as being either free or in use, which will be explained later.
The result is that a block of memory on the free list uses just 8 bytes instead of 16.
In fact, we go even one step further when we realize that the free block index values are available to store data when the block is allocated.
The overhead of an allocated block is therefore just 4 bytes.
Each memory block holds 8 bytes, and there are up to 32767 blocks available, for about 256K of heap space. If that's not enough, you can always add more data bytes to the body of the memory block at the expense of free block size overhead.
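The arithmetic above can be checked with a short sketch. The `model_block` type and the two helper functions are hypothetical stand-ins written for this example, not the library's own definitions; they only assume the 8 byte block with a 4 byte header described in the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one 8 byte block. */
typedef struct {
    uint16_t next;              /* n, high bit doubles as the free flag */
    uint16_t prev;              /* p */
    union {
        struct {
            uint16_t next_free; /* nf, meaningful only while free */
            uint16_t prev_free; /* pf, meaningful only while free */
        } links;
        uint8_t data[4];        /* application data once allocated */
    } body;
} model_block;

#define MODEL_MAX_BLOCKS 32767u /* number of blocks a 15 bit index allows */

/* Blocks needed for `size` bytes: the first block contributes only its
 * 4 byte body, every following block contributes all 8 of its bytes. */
size_t model_blocks_needed(size_t size)
{
    const size_t first = sizeof(((model_block *)0)->body);
    const size_t whole = sizeof(model_block);

    if (size <= first) {
        return 1;
    }
    return 1 + (size - first + whole - 1) / whole;
}

/* Largest heap the model can describe, in bytes. */
size_t model_max_heap_bytes(void)
{
    assert(sizeof(model_block) == 8);
    return (size_t)MODEL_MAX_BLOCKS * sizeof(model_block);
}
```

Under these assumptions a 12 byte request fits in two blocks (4 bytes in the first, 8 in the second), a 13 byte request needs three, and the largest heap is 32767 * 8 = 262136 bytes, which is the "about 256K" quoted above.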
There are a lot of little features and optimizations in this memory management system that make it especially suited to small systems, and the best way to appreciate them is to review the data structures and algorithms used, so let's get started.
We have a general notation for a block that we'll use to describe the different scenarios that our memory allocation algorithm must deal with:

        +----+----+----+----+
    c   |* n |  p | nf | pf |
        +----+----+----+----+
Where:
- c is the index of this block
- `*` is the indicator for a free block
- n is the index of the next block in the heap
- p is the index of the previous block in the heap
- nf is the index of the next block in the free list
- pf is the index of the previous block in the free list
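As a rough illustration of how the `*` indicator can share a 16 bit field with the 15 bit index n, the sketch below packs and unpacks the two. The macro and function names are hypothetical, chosen for this example only.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG  0x8000u  /* high order bit: block is free */
#define MODEL_INDEX_MASK 0x7FFFu  /* low 15 bits: index of next block */

/* Does the next field carry the free indicator? */
int model_is_free(uint16_t next_field)
{
    return (next_field & MODEL_FREE_FLAG) != 0;
}

/* Index of the next block, with the indicator stripped. */
uint16_t model_index(uint16_t next_field)
{
    return (uint16_t)(next_field & MODEL_INDEX_MASK);
}

/* Build a next field for a free block that is followed by `index`. */
uint16_t model_mark_free(uint16_t index)
{
    assert(index <= MODEL_INDEX_MASK);
    return (uint16_t)(index | MODEL_FREE_FLAG);
}

/* Clear the indicator, leaving the index untouched. */
uint16_t model_mark_used(uint16_t next_field)
{
    return (uint16_t)(next_field & MODEL_INDEX_MASK);
}
```

Because the indicator lives in the same field as n, marking a block free or used never disturbs the links that describe the heap order.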
The fact that we have forward and backward links in the block descriptors means that malloc() and free() operations can be very fast. It's easy to either allocate the whole free item to a new block or to allocate part of the free item and leave the rest on the free list without traversing the list from front to back first.
The entire block of memory used by the heap is assumed to be initialized to 0. The very first block in the heap is special - it's the head of the free block list. It is never assimilated with a free block (more on this later).
Once a block has been allocated to the application, it looks like this:

        +----+----+----+----+
    c   |  n |  p |   ...   |
        +----+----+----+----+
Where:
- c is the index of this block
- n is the index of the next block in the heap
- p is the index of the previous block in the heap
Note that the free list information is gone because it's now being used to store actual data for the application. If we had even 500 items in use, that would be 2,000 bytes for free list information. We simply can't afford to waste that much.
The address of the `...` area is what is returned to the application for data storage.
The following sections describe the scenarios encountered during the operation of the library. There are two additional notation conventions:
- `??` inside a pointer block means that the data is irrelevant. We don't care about it because we don't read or modify it in the scenario being described.
- `...` between memory blocks indicates zero or more additional blocks are allocated for use by the upper block.
While we're talking about "upper" and "lower" blocks, we should make a comment about addresses. In the diagrams, a block higher up in the picture is at a lower address. As the blocks grow downwards, their block index increases, as does their physical address.
Finally, there's one very important characteristic of the individual blocks that make up the heap - there can never be two consecutive free memory blocks, but there can be consecutive used memory blocks.
The reason is that we always want to have a short free list of the largest possible block sizes. By always assimilating a newly freed block with adjacent free blocks, we maximize the size of each free memory area.
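One way to picture this rule is as a check that walks the heap in address order and rejects two free neighbours. The sketch below is an illustrative model only: the `model_block` type, the flag macros and `model_heap_is_consistent` are hypothetical, and it assumes the last block in the heap has a next index of 0.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG  0x8000u
#define MODEL_INDEX_MASK 0x7FFFu

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

/* Walk the heap from block 0 in address order. Return 1 if every back
 * link matches and no two neighbouring blocks are both free, else 0. */
int model_heap_is_consistent(const model_block *heap)
{
    uint16_t cur = 0;
    int prev_was_free = 0;

    assert(heap != 0);
    for (;;) {
        uint16_t next = (uint16_t)(heap[cur].next & MODEL_INDEX_MASK);
        int cur_free = (heap[cur].next & MODEL_FREE_FLAG) != 0;

        if (cur_free && prev_was_free) {
            return 0;               /* two consecutive free blocks */
        }
        if (next == 0) {
            return 1;               /* reached the last block */
        }
        if (next <= cur || heap[next].prev != cur) {
            return 0;               /* indexes must grow, links must agree */
        }
        prev_was_free = cur_free;
        cur = next;
    }
}
```

A check of this kind only detects a violation; keeping the rule true is the job of the free operation described later.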
As part of the system startup code, all of the heap has been cleared.
During the very first malloc operation, we start traversing the free list starting at index 0. The index of the next free block is 0, which means we're at the end of the list!
At this point, the malloc has a special test that checks if the current block index is 0, which it is. This special case initializes the free list to point at block index 1 and then points block 1 to the last block (lf) on the heap.

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    0   |  0 |  0 |  0 |  0 |      0   |  1 |  0 |  1 |  1 |
        +----+----+----+----+          +----+----+----+----+
                                       +----+----+----+----+
                                   1   |*lf |  0 |  0 |  0 |
                                       +----+----+----+----+
                                                ...
                                       +----+----+----+----+
                                   lf  |  0 |  1 |  0 |  0 |
                                       +----+----+----+----+
The heap is now ready to complete the first malloc operation.
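The special case can be sketched as a small function that turns a zeroed array of blocks into the AFTER state shown in the diagram. This is a model written for illustration: `model_block`, `MODEL_FREE_FLAG` and `model_first_malloc_init` are hypothetical names, and the guard on block 0 is an assumption about how the empty state is detected.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG 0x8000u

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

/* Apply the first-malloc special case to a zeroed heap of num_blocks
 * blocks. Does nothing if the free list is already set up. */
void model_first_malloc_init(model_block *heap, uint16_t num_blocks)
{
    uint16_t lf;

    assert(heap != 0 && num_blocks >= 3 && num_blocks <= 0x7FFFu);
    lf = (uint16_t)(num_blocks - 1);

    /* Assumption: an all-zero block 0 means nothing was set up yet. */
    if (heap[0].next_free != 0) {
        return;
    }

    /* Block 0 is the free list head and now points at block 1. */
    heap[0].next = 1;
    heap[0].next_free = 1;
    heap[0].prev_free = 1;

    /* Block 1 is one free block stretching to the last block. */
    heap[1].next = (uint16_t)(lf | MODEL_FREE_FLAG);
    heap[1].prev = 0;
    heap[1].next_free = 0;
    heap[1].prev_free = 0;

    /* The last block ends the heap and links back to block 1. */
    heap[lf].next = 0;
    heap[lf].prev = 1;
}
```

For a 16 block heap this leaves block 1 free with its next index set to 15, and block 15 pointing back at block 1.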
Operation of malloc when we have reached the end of the free list and there is no block large enough to accommodate the request.
This happens at the very first malloc operation, or any time the free list is traversed and no free block large enough for the request is found.
The current block pointer will be at the end of the free list, and we know we're at the end of the list because the nf index is 0, like this:

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    pf  |*?? | ?? | cf | ?? |      pf  |*?? | ?? | lf | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    p   | cf | ?? |   ...   |      p   | cf | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    cf  |  0 |  p |  0 | pf |      c   | lf |  p |   ...   |
        +----+----+----+----+          +----+----+----+----+
                                       +----+----+----+----+
                                   lf  |  0 | cf |  0 | pf |
                                       +----+----+----+----+
As we walk the free list looking for a block of size b or larger, we get to cf, which is the last item in the free list. We know this because the next index is 0.
So we're going to turn cf into the new block of memory, and then create a new block that represents the last free entry (lf) and adjust the prev index of lf to point at the block we just created. We also need to adjust the next index of the new block (c) to point to the last free block.
Note that the next free index of the pf block must point to the new lf because cf is no longer a free block!
Operation of malloc when we have found a block (cf) that will fit the current request of b units exactly
This one is pretty easy, just clear the free list bit in the current block and unhook it from the free list.

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    pf  |*?? | ?? | cf | ?? |      pf  |*?? | ?? | nf | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    p   | cf | ?? |   ...   |      p   | cf | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+  Clear the free
    cf  |* n |  p | nf | pf |      cf  |  n |  p |   ...   |  list bit here
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    n   | ?? | cf |   ...   |      n   | ?? | cf |   ...   |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? | cf |      nf  | ?? | ?? | ?? | pf |
        +----+----+----+----+          +----+----+----+----+
Unhooking from the free list is accomplished by adjusting the next and prev free list index values in the pf and nf blocks.
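A minimal sketch of the exact-fit case, using an array of blocks as a stand-in for the heap, might look like this. The type, macro and function names are hypothetical, and the model relies on block 0 acting as both ends of the free list so that no special case is needed when cf is first or last.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG  0x8000u
#define MODEL_INDEX_MASK 0x7FFFu

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

/* Hand the whole free block cf to the application: unhook it from the
 * free list and clear its free indicator. */
void model_take_exact(model_block *heap, uint16_t cf)
{
    uint16_t pf;
    uint16_t nf;

    assert(heap != 0 && (heap[cf].next & MODEL_FREE_FLAG) != 0);
    pf = heap[cf].prev_free;
    nf = heap[cf].next_free;

    heap[pf].next_free = nf;    /* pf now skips over cf */
    heap[nf].prev_free = pf;    /* nf now points back at pf */

    heap[cf].next = (uint16_t)(heap[cf].next & MODEL_INDEX_MASK);
}
```

Only three fields change, which is why the exact-fit case costs the same no matter how long the free list is.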
Operation of malloc when we have found a block that will fit the current request of b units with some left over
We'll allocate the new block at the END of the current free block so we don't have to change ANY free list pointers.

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    pf  |*?? | ?? | cf | ?? |      pf  |*?? | ?? | cf | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    p   | cf | ?? |   ...   |      p   | cf | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    cf  |* n |  p | nf | pf |      cf  |* c |  p | nf | pf |
        +----+----+----+----+          +----+----+----+----+
                                       +----+----+----+----+  This is the new
                                   c   |  n | cf |   ...   |  block at cf+b
                                       +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    n   | ?? | cf |   ...   |      n   | ?? |  c |   ...   |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? | cf |      nf  | ?? | ?? | ?? | pf |
        +----+----+----+----+          +----+----+----+----+
This one is pretty easy too, except we don't need to mess with the free list indexes at all because we'll allocate the new block at the end of the current free block. We do, however, have to adjust the indexes in cf, c, and n.
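The index adjustments in cf, c and n can be sketched as follows. This is an illustrative model with hypothetical names; it computes the new block's index by counting back from n, which is one way of placing the allocation at the end of the free block.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG  0x8000u
#define MODEL_INDEX_MASK 0x7FFFu

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

/* Carve `blocks` blocks off the END of free block cf and return the
 * index of the new used block. The free list links are not touched. */
uint16_t model_split_tail(model_block *heap, uint16_t cf, uint16_t blocks)
{
    uint16_t n;
    uint16_t c;

    assert(heap != 0 && (heap[cf].next & MODEL_FREE_FLAG) != 0);
    n = (uint16_t)(heap[cf].next & MODEL_INDEX_MASK);
    /* There must be something left over for cf after the split. */
    assert(blocks > 0 && n > cf && (uint16_t)(n - cf) > blocks);

    c = (uint16_t)(n - blocks);

    heap[c].next = n;           /* new block is in use: no free flag */
    heap[c].prev = cf;
    heap[n].prev = c;
    heap[cf].next = (uint16_t)(c | MODEL_FREE_FLAG);  /* cf shrinks, stays free */

    return c;
}
```

Splitting 4 blocks off a free block that spans indexes 1 to 14 yields a new block at index 11, and the nf and pf fields of block 1 are exactly as they were before the call.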
That covers the initialization and all possible malloc scenarios, so now we need to cover the free operation possibilities...
The operation of free depends on the position of the current block being freed relative to free list items immediately above or below it. The code works like this:
    if next block is free
        assimilate with next block already on free list
    if prev block is free
        assimilate with prev block already on free list
    else
        put current block at head of free list
Step 1 of the free operation checks if the next block is free, and if it is, assimilates the next block with this one.
Note that c is the block we are freeing up, cf is the free block that follows it.

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    pf  |*?? | ?? | cf | ?? |      pf  |*?? | ?? | nf | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    p   |  c | ?? |   ...   |      p   |  c | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+  This block is
    c   | cf |  p |   ...   |      c   | nn |  p |   ...   |  disconnected
        +----+----+----+----+          +----+----+----+----+  from free list,
        +----+----+----+----+                                 assimilated with
    cf  |*nn |  c | nf | pf |                                 the next, and
        +----+----+----+----+                                 ready for step 2
        +----+----+----+----+          +----+----+----+----+
    nn  | ?? | cf | ?? | ?? |      nn  | ?? |  c |   ...   |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? | cf |      nf  |*?? | ?? | ?? | pf |
        +----+----+----+----+          +----+----+----+----+
Take special note that the newly assimilated block (c) is completely disconnected from the free list, and it does not have its free list bit set. This is important as we move on to step 2 of the procedure...
Step 2 of the free operation checks if the prev block is free, and if it is, assimilates it with this block.
Note that c is the block we are freeing up, pf is the free block that precedes it.

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+  This block has
    pf  |* c | ?? | nf | ?? |      pf  |* n | ?? | nf | ?? |  assimilated the
        +----+----+----+----+          +----+----+----+----+  current block
        +----+----+----+----+
    c   |  n | pf |   ...   |
        +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    n   | ?? |  c |   ...   |      n   | ?? | pf | ?? | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? | pf |      nf  |*?? | ?? | ?? | pf |
        +----+----+----+----+          +----+----+----+----+
Nothing magic here, except that when we're done, the current block (c) is gone since it's been absorbed into the previous free block. Note that the previous step guarantees that the next block (n) is not free.
Step 3 of the free operation only runs if the previous block is not free. It just inserts the current block at the head of the free list.
Remember, 0 is always the first block in the memory heap, and it's always the head of the free list!

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    0   | ?? | ?? | nf |  0 |      0   | ?? | ?? |  c |  0 |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    p   |  c | ?? |   ...   |      p   |  c | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    c   |  n |  p |   ...   |      c   |* n |  p | nf |  0 |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    n   | ?? |  c |   ...   |      n   | ?? |  c |   ...   |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? |  0 |      nf  |*?? | ?? | ?? |  c |
        +----+----+----+----+          +----+----+----+----+
Again, nothing spectacular here, we're simply adjusting a few pointers to make the most recently freed block the first item in the free list.
That's because finding the previous free block would mean a reverse traversal of blocks until we found a free one, and it's just easier to put it at the head of the list. No traversal is needed.
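Putting the three steps together, the free operation can be modelled on an array of blocks as shown below. This is a sketch under simplifying assumptions, not the library's code: all names are hypothetical, the block being freed is identified by index rather than by pointer, and block 0 is assumed never to carry the free indicator.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_FREE_FLAG  0x8000u
#define MODEL_INDEX_MASK 0x7FFFu

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

static uint16_t model_next(const model_block *h, uint16_t i)
{
    return (uint16_t)(h[i].next & MODEL_INDEX_MASK);
}

static int model_is_free(const model_block *h, uint16_t i)
{
    return (h[i].next & MODEL_FREE_FLAG) != 0;
}

/* Free the used block at index c. */
void model_free(model_block *heap, uint16_t c)
{
    uint16_t n;
    uint16_t p;

    assert(heap != 0 && c != 0 && !model_is_free(heap, c));
    n = model_next(heap, c);

    /* Step 1: unhook a free next block from the free list, absorb it. */
    if (model_is_free(heap, n)) {
        heap[heap[n].prev_free].next_free = heap[n].next_free;
        heap[heap[n].next_free].prev_free = heap[n].prev_free;
        n = model_next(heap, n);
        heap[c].next = n;
        heap[n].prev = c;
    }

    p = heap[c].prev;
    if (model_is_free(heap, p)) {
        /* Step 2: the free previous block swallows c. Its own free
         * list links stay exactly as they were. */
        heap[p].next = (uint16_t)(n | MODEL_FREE_FLAG);
        heap[n].prev = p;
    } else {
        /* Step 3: push c onto the head of the free list at block 0. */
        uint16_t old_head = heap[0].next_free;

        heap[old_head].prev_free = c;
        heap[c].next_free = old_head;
        heap[c].prev_free = 0;
        heap[0].next_free = c;
        heap[c].next = (uint16_t)(n | MODEL_FREE_FLAG);
    }
}
```

Note that the function contains no loops: every path touches a fixed number of fields, which matches the claim that no traversal is needed.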
Finally, we can cover realloc, which has the following basic operation.
The first thing we do is assimilate up with the next free block of memory if possible. This step might help if we're resizing to a bigger block of memory. It also helps if we're downsizing and creating a new free block with the leftover memory.
First we check to see if the next block is free, and we assimilate it to this block if it is. If the previous block is also free, and if combining it with the current block would satisfy the request, then we assimilate with that block and move the current data down to the new location.
Assimilating with the previous free block and moving the data works like this:

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    pf  |*?? | ?? | cf | ?? |      pf  |*?? | ?? | nf | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    cf  |* c | ?? | nf | pf |      c   |  n | ?? |   ...   |  The data gets
        +----+----+----+----+          +----+----+----+----+  moved from c to
        +----+----+----+----+                                 the new data area
    c   |  n | cf |   ...   |                                 in cf, then c is
        +----+----+----+----+                                 adjusted to cf
        +----+----+----+----+          +----+----+----+----+
    n   | ?? |  c |   ...   |      n   | ?? |  c | ?? | ?? |
        +----+----+----+----+          +----+----+----+----+
                 ...                            ...
        +----+----+----+----+          +----+----+----+----+
    nf  |*?? | ?? | ?? | cf |      nf  |*?? | ?? | ?? | pf |
        +----+----+----+----+          +----+----+----+----+
Once that's done, there are three scenarios to consider:
The current block size is exactly the right size, so no more work is needed.
The current block is bigger than the new required size, so carve off the excess and add it to the free list.
The current block is still smaller than the required size, so malloc a new block of the correct size and copy the current data into the new block before freeing the current block.
The only one of these scenarios that involves an operation that has not yet been described is the second one, and it's shown below:

        BEFORE                         AFTER

        +----+----+----+----+          +----+----+----+----+
    p   |  c | ?? |   ...   |      p   |  c | ?? |   ...   |
        +----+----+----+----+          +----+----+----+----+
        +----+----+----+----+          +----+----+----+----+
    c   |  n |  p |   ...   |      c   |  s |  p |   ...   |
        +----+----+----+----+          +----+----+----+----+
                                       +----+----+----+----+  This is the
                                   s   |  n |  c |   ...   |  new block at
                                       +----+----+----+----+  c+blocks
        +----+----+----+----+          +----+----+----+----+
    n   | ?? |  c |   ...   |      n   | ?? |  s |   ...   |
        +----+----+----+----+          +----+----+----+----+
Then we call free() with the address of the data portion of the new block (s) which adds it to the free list.
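The carve-off step for a shrinking realloc can be sketched in the same array model used for illustration elsewhere. The names are hypothetical, and the sketch stops at the point where the leftover block s exists as a used block; handing s to free is left to the caller.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_INDEX_MASK 0x7FFFu

/* Hypothetical model of a block: n, p, nf, pf. */
typedef struct {
    uint16_t next;
    uint16_t prev;
    uint16_t next_free;
    uint16_t prev_free;
} model_block;

/* Shrink used block c to `blocks` blocks. The excess becomes a new
 * used block s at c + blocks, whose index is returned so the caller
 * can free it. */
uint16_t model_shrink(model_block *heap, uint16_t c, uint16_t blocks)
{
    uint16_t n;
    uint16_t s;

    assert(heap != 0);
    n = (uint16_t)(heap[c].next & MODEL_INDEX_MASK);
    s = (uint16_t)(c + blocks);
    assert(blocks > 0 && s < n);    /* there must be excess to carve off */

    heap[s].next = n;   /* s sits between c and n */
    heap[s].prev = c;
    heap[n].prev = s;
    heap[c].next = s;   /* c now ends where s begins */

    return s;
}
```

Shrinking a block that spans indexes 1 to 5 down to 2 blocks creates s at index 3, linked between block 1 and block 6.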