This is my first C program I just wrote in like 30 minutes and I was wondering if you could give me any pointers on ways to improve this. I'm decent at programming (around 5 years experience), but have no experience with C so I imagine I'm not following the proper idioms and/or doing things properly.
Note: I didn't make it "generic", because learning templates was beyond the scope of this little adventure.
```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/** A dynamically sized array of `int` */
typedef struct {
    size_t capacity;
    size_t length;
    int *data;
} DynamicArray;

/** Appends `value` to the `DynamicArray` */
void dynamic_array_push(DynamicArray *arr, int value) {
    if (arr->length == arr->capacity) {
        int *new_data = malloc(arr->capacity * 2 * sizeof(int));
        memcpy(new_data, arr->data, arr->capacity * sizeof(int));
        free(arr->data);
        arr->data = new_data;
        arr->capacity *= 2;
    }
    arr->data[arr->length] = value;
    arr->length++;
}

/** Cleans up the `DynamicArray`. */
void dynamic_array_cleanup(DynamicArray *arr) {
    free(arr->data);
}

DynamicArray dynamic_array_new(size_t cap) {
    DynamicArray arr = {
        .capacity = cap,
        .length = 0,
        .data = malloc(cap * sizeof(int)),
    };
    return arr;
}

int main() {
    DynamicArray arr = dynamic_array_new(2);

    // Putting in some test values
    for (int i = 0; i < 30; i++) {
        dynamic_array_push(&arr, i);
    }

    // Testing that the data came in correctly
    for (int i = 0; i < arr.length; i++) {
        assert(i == arr.data[i]);
    }

    dynamic_array_cleanup(&arr);
    return 0;
}
```

- G. Sliepen (Dec 12, 2023): C doesn't have templates; that's only in C++.
- Ajinkya Kamat (Dec 12, 2023): A quick and dirty way of turning this into kind-of generic is to wrap it in a macro that takes a type; then you can easily create a `DynamicArray` for any type you like.
- Ben Voigt (Dec 13, 2023): Potential overflow in `cap * sizeof(int)` is a vulnerability.
- chux (Dec 14, 2023): @G.Sliepen C does have `_Generic`, which can be used in a C++-template-like fashion.
- Zachiah (Dec 15, 2023): I know you're not supposed to have "thank you" comments, but seriously, thanks for the overwhelming number of extremely helpful comments and answers. I wish I could accept multiple answers! This is honestly the best-received post I've had across all of Stack Exchange, and it has really made me happy!
5 Answers
Thank you for the very nice Doxygen comment. Consider renaming the struct to `DynamicIntArray`, and then I won't even have to read the comment. Or perhaps in the use case you're tackling, there is a concept in the business domain, such as Widget, that can be represented as an `int`. Then we'd have a `DynamicWidgetArray`.
magic number
Some code bases, such as BIND9, choose to start (some of) their structs with a distinct magic number. This is a debugging aid, so library code can choose to assert that an object of the expected type was passed in.

It's completely optional, but you might consider adopting such a practice. It tends to make debugging with gdb a bit easier.
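For illustration, a minimal sketch of the idiom. The constant value, the `magic` field, and the `dynamic_array_check` helper are all invented here, not part of the original code:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical magic value; any distinctive constant works.
   (BIND9 builds its magic constants from four ASCII characters,
   so they are recognizable in a hex dump.) */
#define DYNARRAY_MAGIC 0xDA7A0001u

typedef struct {
    unsigned magic;    /* always DYNARRAY_MAGIC for a live object */
    size_t capacity;
    size_t length;
    int *data;
} DynamicArray;

/* Library entry points can cheaply verify they were handed the right type. */
static void dynamic_array_check(const DynamicArray *arr) {
    assert(arr != NULL && arr->magic == DYNARRAY_MAGIC);
}
```

Constructors set the field; cleanup can scribble over it, so a stale pointer fails the check instead of corrupting memory silently.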
libc function reimplemented
```c
int *new_data = malloc(arr->capacity * 2 * sizeof(int));
memcpy(new_data, arr->data, arr->capacity * sizeof(int));
free(arr->data);
arr->data = new_data;
```

This is duplicating the functionality of a `realloc()` call. For the author's sake, may as well just reuse a well-tested, well-documented function. For the gentle reader's sake, write a single line instead of several, using a familiar contract.
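A sketch of the same grow step collapsed into one `realloc` call; the `dynamic_array_grow` helper name is mine, and the failure check is included because `realloc` can return NULL:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t capacity;
    size_t length;
    int *data;
} DynamicArray;

/* One realloc call replaces the malloc/memcpy/free trio.
   Returns false (leaving the array intact) if the allocation fails. */
static bool dynamic_array_grow(DynamicArray *arr) {
    int *new_data = realloc(arr->data, arr->capacity * 2 * sizeof(int));
    if (new_data == NULL)
        return false;
    arr->data = new_data;
    arr->capacity *= 2;
    return true;
}
```

On failure the original block is untouched, which is a nicer contract than the DIY version (whose `malloc` result was never checked at all).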
cleanup
In `dynamic_array_cleanup` the `free()` is very nice. Consider additionally zeroing out those three fields, so we don't have a dangling pointer that tempts some application to try a use-after-free blunder. Certainly it is accurate that `length` and `capacity` are zero after the `free()`.

Or write `0xDeadBeef` into them, for the benefit of someone examining memory with gdb.
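A sketch of that defensive cleanup (the struct is repeated here so the fragment stands alone):

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t capacity;
    size_t length;
    int *data;
} DynamicArray;

/* Zeroing the fields turns a use-after-cleanup into a reproducible
   NULL dereference on an empty array, rather than a silent use-after-free. */
void dynamic_array_cleanup(DynamicArray *arr) {
    free(arr->data);
    arr->data = NULL;
    arr->length = 0;
    arr->capacity = 0;
}
```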
public API
You have defined a Public API that is broken -- it is not fit for purpose.
Sometimes `malloc()` returns NULL. I know, it's sad. But it's a fact of life -- one which a library author must cope with. And C doesn't offer error reporting via exceptions.

Each time you make a call, your code must verify that `malloc()` succeeded. You should either document that a `malloc` failure will tear down the calling process, or you should return its status to the caller, so that it becomes the caller's problem to decide what to do.
multiply by zero
If the caller fails to supply `new()` with a positive number, then Bad Things can happen. Recycling an old struct after `free()` can also run afoul of this. @MatthieuM. observes that the growth function does not work with `cap == 0`, because `0 * 2 == 0`. `max(cap * 2, 1)` will work.

Even `cap * 2 + 1` would suffice.
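As a tiny sketch of that guard (the helper name is invented):

```c
#include <assert.h>
#include <stddef.h>

/* Doubling stalls at zero, so treat an empty capacity specially. */
static size_t grown_capacity(size_t cap) {
    return cap > 0 ? cap * 2 : 1;   /* cap * 2 + 1 would also break the stall */
}
```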
- Zachiah (Dec 13, 2023): Thanks for the advice. Zeroing out the fields makes sense. In Rust I would have just taken ownership of the object if I needed to make a cleanup function, and that would solve this, so I'm not used to thinking about it haha. With the magic number thing, you are saying each struct would have a field at the beginning with the number, and then we can essentially polymorphically check the magic number after receiving a pointer to verify it is of the correct type? That's very interesting... As for the malloc stuff, yeah, I completely forgot about that...
- Peter Cordes (Dec 14, 2023): `malloc`/copy/`free` isn't just reimplementing `realloc`, it's forcing the worst-case behaviour. The actual `realloc` function isn't required to suck the way C++ `std::vector` is (unless libraries + compilers are very clever in working around the C++ allocator API), especially for large reallocations.
- Matthieu M. (Dec 14, 2023): I don't feel like adding an answer for just one tiny remark, so I would appreciate it if you were to include it in your answer: the growth function does not work with `cap == 0`, because `0 * 2 == 0`. `max(cap * 2, 1)` will work.
Use `realloc` - it can be much more efficient than malloc+copy+free

Unlike C++, C's allocator API doesn't totally suck: it has `realloc`.
(Unfortunately there's no portable `aligned_realloc`, but `malloc`/`realloc` give you memory that's sufficiently aligned for any standard type up to `max_align_t`. MSVC has `_aligned_realloc` and `_aligned_calloc`, but I'm not aware of a POSIX or GNU equivalent.)
If there's free memory after the end of the existing allocation, `realloc` can just grow the existing allocation without having to copy anything. Or if the C library knows how to use a system call like Linux `mremap(MREMAP_MAYMOVE)` (see the man page), it can move it to a new virtual address without actually copying the physical pages. (This probably only works if the allocation started at the beginning of a page, but that's what glibc `malloc` does for large allocations. Actually it keeps the first 16 bytes of the page for bookkeeping data, so unfortunately you get a pointer that's aligned by 16 but misaligned by 32 and wider. But there isn't unrelated data that `mremap` would unmap from where it's supposed to be.) Glibc `realloc` does in fact use `mremap(MREMAP_MAYMOVE)`; I checked with `strace ./a.out` after bumping up the loop count for your test `main` to 2 billion. Hopefully other `malloc` implementations are similarly smart on the platforms they target.
Besides the copying time, the other downside of alloc+copy is that you need both copies allocated at once. For a 5 GiB array of `int`, you'd temporarily have 10 GiB of "dirty" anonymous pages, forcing the OS to find a lot of extra pages -- at best just dropping some potentially-useful disk cache, at worst thrashing swap space as it pages out pages from other processes.
Calling `realloc` is at worst like manually calling `malloc`/`memcpy`/`free`, but can be much better, especially for large allocations. Don't fall into the same trap that C++ `std::vector` is stuck in, of having to always alloc+copy+dealloc when growing.

As a GNU extension, there's a `reallocarray` which checks for multiply overflow like `calloc` does (but it doesn't zero the newly-allocated space when growing). (Available in glibc 2.26, OpenBSD 5.6, FreeBSD 11.0.) Portable C should do that check manually.
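A portable sketch of that manual check, mimicking `reallocarray`'s contract; the wrapper name is invented, and like the real extension it does not zero the new space:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Portable stand-in for GNU/BSD reallocarray: refuse requests whose
   byte count would wrap around, then fall through to realloc. */
static void *checked_reallocarray(void *ptr, size_t nmemb, size_t size) {
    if (size != 0 && nmemb > SIZE_MAX / size)
        return NULL;   /* nmemb * size would overflow */
    return realloc(ptr, nmemb * size);
}
```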
Some implementations, including GCC, limit the max object size to `PTRDIFF_MAX` bytes. If you're going to check manually, that's probably a good limit, unless you really care about bending the implementation limits and having arrays of 2 GiB or larger in 32-bit programs, for example, right up to `SIZE_MAX`. (Glibc `malloc` on 32-bit x86 Linux will allocate arrays larger than 2 GiB if you ask it to, for example. Under a 64-bit kernel you have the full 4 GiB of address-space. Subtracting `int *` pointers first subtracts the raw pointers and then does an arithmetic right shift to scale by `sizeof(int)` (Godbolt), so you could get a negative `ptrdiff_t` instead of e.g. 1 billion, but probably nothing breaks if you don't run any code that relies on that.)

@chux points out that `ptrdiff_t` could be wider than `size_t` in an implementation that wants to allow objects up to `SIZE_MAX` bytes. (Such an implementation would then have to effectively widen pointers before subtracting them and dividing or right-shifting, perhaps with the help of the carry flag on ISAs that have one.)

So perhaps `#define DYNAMIC_ARRAY_MAXBYTES (SIZE_MAX/2)` as a conservative lower bound on implementation-defined limits. I think if you're going to use `SIZE_MAX/2`, there's no point involving `PTRDIFF_MAX`, since I doubt there are any systems where `ptrdiff_t` is narrower than `size_t` but the max object size is still `PTRDIFF_MAX`. If you want to be paranoid about that, `#define DYNAMIC_ARRAY_MAXBYTES ((PTRDIFF_MAX <= SIZE_MAX/2) ? PTRDIFF_MAX : SIZE_MAX/2)`. A C implementation could arbitrarily set a smaller limit on max object sizes if it wanted, but there's no standard macro to expose that.
With overflow and allocation-failure checking
```c
#include <stdint.h>
#include <stdbool.h>
#include <stdlib.h>

bool dynamic_array_push(DynamicArray *arr, int value)
{
    size_t cap = arr->capacity;
    if (cap == arr->length) {
        cap *= 2;
        // can't overflow if size_t is at least as wide as ptrdiff_t, because it's unsigned.
        // But that's not technically guaranteed, so maybe check
        // cap < PTRDIFF_MAX / sizeof(int) / 2 before doubling.
        // (sizeof(int) == 1 is possible on some DSPs.)
        if (cap > PTRDIFF_MAX / sizeof(int)) {
            cap = PTRDIFF_MAX / sizeof(int);  // grow to the max object size for typical C implementations
            if (cap <= arr->length)
                return false;
        }
        int *new_data = realloc(arr->data, cap * sizeof(int));
        if (!new_data) {
            return false;  // original data is still there, but we didn't grow
        }
        arr->data = new_data;
        arr->capacity = cap;
    }
    arr->data[arr->length] = value;
    arr->length++;
    return true;
}
```

I used the same constant twice (`PTRDIFF_MAX / sizeof(int)`) instead of checking against `PTRDIFF_MAX / sizeof(int) / 2` before doubling the capacity. This hopefully helps compilers put only one 64-bit constant into a register, like x86-64 GCC does (Godbolt).
In your existing API which returns `void`, you could call `abort()` or something on failure, killing the whole process. That could be appropriate for a simplistic program, but it's perhaps better to provide that behaviour via a wrapper like `dynamic_array_push_or_abort`, so the documentation of the failure-case behaviour is right there in the code using it, easy to search for, and easy to start replacing on a case-by-case basis in a codebase that decides it sometimes wants to do something else.
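A sketch of what such a wrapper could look like; the fallible `dynamic_array_push` below is a simplified stand-in for the full overflow-checked version, only so the fragment is self-contained:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t capacity;
    size_t length;
    int *data;
} DynamicArray;

/* Minimal fallible push, standing in for the realloc-based version above. */
static bool dynamic_array_push(DynamicArray *arr, int value) {
    if (arr->length == arr->capacity) {
        size_t new_cap = arr->capacity ? arr->capacity * 2 : 1;
        int *new_data = realloc(arr->data, new_cap * sizeof(int));
        if (new_data == NULL)
            return false;
        arr->data = new_data;
        arr->capacity = new_cap;
    }
    arr->data[arr->length++] = value;
    return true;
}

/* Callers that are happy to die on OOM opt in explicitly at the call site. */
static void dynamic_array_push_or_abort(DynamicArray *arr, int value) {
    if (!dynamic_array_push(arr, value))
        abort();
}
```

Grepping for `_or_abort` later tells you exactly which call sites still have the tear-down-the-process policy.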
A microbenchmark: over 2x faster for huge arrays
For growing the array from size 3 up to 2 billion `int`s (8 GiB) and then reading them back, using `realloc` is more than twice as fast on my system, with 70x fewer page faults and 9x fewer TLB misses. (All the page faults are "soft", not paging to disk. TLB misses only counts second-level TLB misses that cause page-walks.) A good fraction of the extra time is spent in the kernel, zeroing new pages.

I timed this vs. the old version on my i7-6700k Skylake with dual-channel DDR4-2666 RAM, running Arch Linux, kernel 6.5, compiled with GCC 13.2.1 `-O3 -fno-plt` (no `-march` options, although it doesn't use any new instructions even if compiled with `-march=native`; the branching on size for every push defeats auto-vectorization of the fill pattern). My `energy_performance_preference` is `balance_performance`, so the CPU only runs at 3.9 GHz. Transparent hugepages are enabled (the default); `defrag` is `defer+madvise` (and we're not making `madvise` calls, although my system has 32 GiB RAM, much of it free, so hopefully it is making hugepages).
```
$ taskset -c 1 perf stat -e task-clock,context-switches,cpu-migrations,page-faults,cycles,instructions,uops_issued.any,dtlb_load_misses.miss_causes_a_walk,dtlb_store_misses.miss_causes_a_walk ./dynarray-slow

 Performance counter stats for './dynarray-slow':

          5,616.15 msec task-clock                       #  0.999 CPUs utilized
                17      context-switches                 #  3.027 /sec
                 0      cpu-migrations                   #  0.000 /sec
           779,398      page-faults                      #  138.778 K/sec
    21,628,006,730      cycles                           #  3.851 GHz
    31,442,121,299      instructions                     #  1.45 insn per cycle
    29,540,890,921      uops_issued.any                  #  5.260 G/sec
         1,710,792      dtlb_load_misses.miss_causes_a_walk   #  304.620 K/sec
        16,175,526      dtlb_store_misses.miss_causes_a_walk  #  2.880 M/sec

       5.624499953 seconds time elapsed
       3.514993000 seconds user
       2.093666000 seconds sys

$ taskset -c 1 perf stat -e task-clock,context-switches,cpu-migrations,page-faults,cycles,instructions,uops_issued.any,dtlb_load_misses.miss_causes_a_walk,dtlb_store_misses.miss_causes_a_walk ./dynarray-fast

 Performance counter stats for './dynarray-fast':

          2,459.69 msec task-clock                       #  1.000 CPUs utilized
                 8      context-switches                 #  3.252 /sec
                 0      cpu-migrations                   #  0.000 /sec
            10,601      page-faults                      #  4.310 K/sec
     9,517,533,410      cycles                           #  3.869 GHz
    27,308,580,581      instructions                     #  2.87 insn per cycle
    22,551,679,420      uops_issued.any                  #  9.169 G/sec
         1,831,019      dtlb_load_misses.miss_causes_a_walk   #  744.412 K/sec
           116,378      dtlb_store_misses.miss_causes_a_walk  #  47.314 K/sec

       2.459971145 seconds time elapsed
       1.574258000 seconds user
       0.881689000 seconds sys
```

The test `main` is the same for both. The starting capacity of 3 means we're about half way between growth points when we exit. (I was just randomly playing with different values for that; this isn't a recommendation, just what I happened to test with.)
```c
int main(void)
{
    DynamicArray arr = dynamic_array_new(3);

    // Putting in some test values
    for (int i = 0; i < 2000000000; i++) {
        dynamic_array_push(&arr, i);
    }

    // Testing that the data came in correctly
    for (int i = 0; i < arr.length; i++) {
        assert(i == arr.data[i]);
    }

    dynamic_array_cleanup(&arr);
    return 0;
}
```

The system calls it makes are:
```
... slow version
brk(NULL)                               = 0x560b599ad000
brk(0x560b599ce000)                     = 0x560b599ce000
brk(0x560b599fe000)                     = 0x560b599fe000
mmap(NULL, 200704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3717dd9000
brk(0x560b599ce000)                     = 0x560b599ce000
mmap(NULL, 397312, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3717b87000
munmap(0x7f3717dd9000, 200704)          = 0
mmap(NULL, 790528, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f3717ac6000
munmap(0x7f3717b87000, 397312)          = 0
...

... fast version
brk(NULL)                               = 0x5567b45c9000
brk(0x5567b45ea000)                     = 0x5567b45ea000
mmap(NULL, 200704, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7fbc9f7d4000   # switch to mmap, will have to copy this time
mremap(0x7fbc9f7d4000, 200704, 397312, MREMAP_MAYMOVE) = 0x7fbc9f582000
mremap(0x7fbc9f582000, 397312, 790528, MREMAP_MAYMOVE) = 0x7fbc9f4c1000
mremap(0x7fbc9f4c1000, 790528, 1576960, MREMAP_MAYMOVE) = 0x7fbc9f340000
mremap(0x7fbc9f340000, 1576960, 3149824, MREMAP_MAYMOVE) = 0x7fbc9f03f000
...
```

Glibc's default `malloc` starts with small allocations from the break (`brk`). This presumably just extends on `realloc`, vs. alloc+copy leading to some heap-shifting (as @Davislor's answer describes), leaving some allocated + dirty space on the free list, not given back to the OS (but which other small allocations could use). That's why we see two `brk` allocations (after the initial `brk(NULL)` to find the break address) from the slow version vs. only one from the `realloc` version before switching over to `mmap` (and copying that one time, at size 1.5 MiB I suppose).
Interestingly, `mremap` does keep moving the virtual address, forcing TLB invalidation for the existing part of the array. With our workload, which is append-only until the end, those invalidations don't cause extra TLB misses except in the final read-through after the last growth (when the array is so big that its start would probably already have been evicted from the TLBs anyway). And it's a single-threaded process, so there's no need for TLB shootdowns (inter-processor interrupts) to other cores. The downside could be bigger in other cases.

It might be interesting to check `/proc/<PID>/maps` to see if there was free virtual address-space after the mapping. If not, we're seeing the heap-shifting effect that @Davislor's answer talked about, except in reverse. If there is space, it's a missed optimization not to use it.

Reducing the growth factor to 1.5 for sizes above 512 MiB or something could make sense, especially on platforms like GNU/Linux that are known to have an efficient `realloc` which can use page-table tricks. Or depending on your use-case, even make growth linear instead of exponential above some threshold like 1 GiB or 1 billion `int`s. The amount of work to update N page-table entries is still O(N), but the constant factor is tiny. But more branching for decision-making on growth will bloat the code at each use of `push`, if it inlines like you want it to for efficiency in the non-growing case.
Partial inlining is possible:
```c
static inline  // in the .h
bool dynamic_array_push(DynamicArray *arr, int value) {
    if (arr->capacity == arr->length) {
        return dynamic_array_push___growth_needed(arr, value);  // defined in a separate .c file
    }
    arr->data[arr->length] = value;
    arr->length++;
    return true;
}
```

This keeps machine-code size small at each call-site. It makes every caller a non-leaf function, but that was already true because the growth path called `realloc`.

We could have just split the growth code into a separate function, not duplicating the `arr->data[arr->length] = value;` / `arr->length++;` there, and not passing `value` as an arg. But it's common to push values that are temporary at the call-site and not needed after the push. Passing it as an arg lets the caller avoid needing a call-preserved register for it, instead just computing it in a call-clobbered register since it's dead after the function call. (I haven't actually looked at an example caller in a loop to see what difference it makes. This makes the inlined store / increment code conditional on the inlined compare rather than on the return value of the `__growth_needed` slowpath, which I expect is probably about equal. A simplistic caller that just pushes random numbers or loops over another array might not be very representative anyway.)
- chux (Dec 14, 2023): `cap > PTRDIFF_MAX / sizeof(int)` is the wrong test. Consider that `ptrdiff_t` may have 2x the number of bits as `size_t`. Use `cap > SIZE_MAX / sizeof(int)`; `ptrdiff_t` is not needed in this task.
- Peter Cordes (Dec 15, 2023): @chux-ReinstateMonica: In theory `ptrdiff_t` could be wider, but it seems unrealistic unless some implementation defines the behaviour of pointer subtraction for pointers to different objects, but limits the size of any single object to much smaller than the address-space (or something with non-flat memory models). Perhaps `min(PTRDIFF_MAX, SIZE_MAX)`, since there are implementations (such as GCC) where the implementation-defined max object size is `PTRDIFF_MAX` bytes. (This means the compiler can assume pointer subtraction never has signed overflow, including before the `sar` for sizeof > 1.)
- chux (Dec 16, 2023): A wider `ptrdiff_t` than `size_t` ensures pointer subtraction never has signed overflow, even with objects near size `SIZE_MAX`.
A few points (without repeating that `realloc()` would fulfill the need better than a DIY version, or other points in current answers):

1. `int *new_data = malloc(arr->capacity * 2 * sizeof *new_data);` would be appropriate if/when the array is to become an array of `long` or `double`. Less likely to overlook changing the `sizeof(int)`, thereby creating a hard-to-find bug.
2. `..._push()` needs to return success or failure, and that return code needs to be checked by the caller.
3. This code mixes passing an entire `struct` (from `dynamic_array_new()`) with passing a pointer to the `struct`. I'd prefer to see code following one practice or the other; not mixing both.
4. (nit-picking) Better to use `size_t` (instead of `int`) when coding an array index (as in the loops of `main()`).
5. Read the part of the `realloc()` documentation that says something like "acts just like `malloc()` if the original pointer is NULL." In other words, there's no need for an initial allocation in `..._new()`. Just let the first `..._push()` with a NULL pointer perform the initial allocation. Maybe the caller will only create 1 element in the array. You probably already sense that `2` is an arbitrary initial allocation.
PS: Revising to use #5, above, would obviate #3. When you want/need an array, simply:

`array_t arrN = { 0 };`

would do the job. All struct members are set to zero (or NULL), and the first `..._push()` handles initialising the length (grows by 1) and the capacity (set to 1, then doubled on each successive call), and the `data` pointer is initialised then used. Simple...
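A sketch of that combination -- zero-initialised struct, no `..._new()`, and the first `..._push()` allocating via `realloc(NULL, ...)`; the function name and the int-as-status convention here are mine:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    size_t capacity;
    size_t length;
    int *data;
} array_t;

/* realloc(NULL, n) behaves like malloc(n), so a push on a
   zero-initialised struct performs the initial allocation itself.
   Returns 1 on success, 0 on allocation failure. */
static int array_push(array_t *arr, int value) {
    if (arr->length == arr->capacity) {
        size_t new_cap = arr->capacity ? arr->capacity * 2 : 1;
        int *p = realloc(arr->data, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        arr->data = p;
        arr->capacity = new_cap;
    }
    arr->data[arr->length++] = value;
    return 1;
}
```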
- chux (Dec 14, 2023): Using `void *new_data = ...` instead of `int *new_data = ...` is less maintenance, as it supports the "less likely to overlook changing the `sizeof(int)`..." concern.
- user272752 (Dec 14, 2023): @chux-ReinstateMonica Yes, that's an alternative. I've tended to prefer the datatype "over on the left" and `sizeof var_name` as being easier to see. Is there an advantage (that I'm not seeing) in the version you suggest? Cheers! :-)
- chux (Dec 14, 2023): This answer advocates "less likely to overlook changing the `sizeof(int)`, thereby creating a hard-to-find bug." By using `void *` on the left, we have one less piece of code to update should the type of member `.data` change, by similar reasoning.
- user272752 (Dec 14, 2023): @chux-ReinstateMonica Thanks :-) I was still in `malloc()` mindset, not `realloc()`, when writing that comment... Thank you :-) ...
- chux (Dec 15, 2023): I should have suggested `void *new_data = realloc(arr->data, sizeof arr->data[0] * arr->capacity * 2);` for a more complete idea.
Check for Overflow
At present, the algorithm to enlarge the vector does not detect if the capacity wraps around. It is actually possible, on some architectures, to overflow a `size_t` before getting an out-of-memory error (typically by allocating more than 2 GiB on a 32-bit memory model). Either compare the old size to `SIZE_MAX/2`, or (since it is unsigned, and overflow is guaranteed to truncate) check whether the new size is smaller than the old.
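Both checks can be sketched as follows; the helper name is invented, and for unsigned doubling the two conditions are equivalent:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Detect wraparound when doubling a byte count. size_t is unsigned,
   so overflow truncates, and the doubled value comparing smaller
   than the original means it wrapped. */
static bool doubling_overflows(size_t old_size) {
    size_t new_size = old_size * 2;
    return new_size < old_size;   /* equivalent to: old_size > SIZE_MAX / 2 */
}
```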
Beware of Heap-Shifting
Let's say you're working with big data and read the first megabyte into the vector. So far, so good; you've got plenty of memory in the heap. Then you run out of room and need to resize. The algorithm checks the heap and finds one free block, past the end of the first one you allocated. It allocates 2 MB of that, copies the entire array to its new location, and frees the original 1 MB.
You have more than 2 MB of data, so you need to resize again. The algorithm checks and finds one free block 1 MB in size, then the allocated block of 2 MB, then a big free block. It needs 4 MB, which is more than 1 MB, so it allocates 4 MB to the right of the current block, copies the entire array to that, then frees the 2 MB block. You now have a 3 MB free block followed by a 4 MB allocated block. You need to resize again. You now need 8 MB, which is too big for your first free block, so you move the array over to the right again and free the previous block to get a 7 MB one. This will keep happening each time, and you will never re-use any of the memory you freed. And copying a large array is expensive!
What happens if you increase the size by only 50% each time, instead of 100%? After five iterations of this, you will have a free block 8.125 times the size of your original, and you will only need one 7.59 times as large, so you will be able to go back to the beginning of the heap, not keep expanding it further and further to the right.
Of course, the real solution here is to userealloc(), which can detect if the current block can be enlarged without moving it. But that is a problem to be aware of, especially when using C++-style memory management without this feature.
- Zachiah (Dec 14, 2023): The way you explained why `realloc` is better made a lot of sense. It seems great to reuse the existing memory without moving if it can be done. Thanks!
- user272752 (Dec 14, 2023): 3 comments: 1) Good explanation. 2) Need to start with allocating `2` to gain traction for 50% increases. 3) "Grow without shifting" may only apply to 'pure' cases of accessing the heap. Intervening code (like an innocuous `printf()`) may lead to having to relocate the block anyway.
- Peter Cordes (Dec 15, 2023): Unless you keep large chunks in the free-list (instead of returning them to the OS), that virtual address-space "at the start of the heap" will be unmapped for a while before you use it again. It's likely that different physical pages will be allocated to map those virtual pages when you map them again, so the reuse doesn't help with L3 cache hits. (And it doesn't help with TLB hits, since those TLB entries had to get invalidated on unmap.)
- Peter Cordes (Dec 15, 2023): If you're worried about not being able to grow to more than half of virtual address-space with one such array that's the only thing you're allocating, then yeah, a good `realloc`, e.g. one that can use `mremap(MAYMOVE)` or equivalent, is important, like I showed in my answer, avoiding double-allocation and the actual memory traffic for copying. Or, like you said, by growing an existing allocation if there's nothing after it. (Or if there's enough space total before+after an allocation, allocate + `memmove`.) But smaller growth factors don't seem great if you typically still copy every time.
Clarifying `size_t` overflow concerns, etc.

Test before scaling, watch out for a 0 capacity, use `realloc()`, size to the referenced object rather than the type, and check for allocation success:
```c
// Prevent overflow
if (arr->capacity > SIZE_MAX / 2 / sizeof arr->data[0]) {
    // Handle overflow with TBD code
}
size_t new_capacity = (arr->capacity > 0) ? arr->capacity * 2 : 1;
void *new_data = realloc(arr->data, new_capacity * sizeof arr->data[0]);
if (new_data == NULL) {
    // Handle out-of-memory with TBD code
}
arr->data = new_data;
arr->capacity = new_capacity;
```