I implemented a basic dynamic array in C++.
When deleting an element and shifting the others down, the last slot ends up holding a duplicate of the final element (the original value still exists after the shift). I want to avoid leaking memory by deallocating that last element of the array. Currently, I do this by shrinking the array once after a number of deletions. Are there better solutions?
Finally, I am interested in some feedback on this implementation.
```cpp
#include <iostream>
#include <stdexcept>

using namespace std;

class DynamicArray {
protected:
    int size;
    int* arr;
    int length;

    void grow() { copyToNewSize(size * 2); }

    void shrink() { copyToNewSize(size / 2); }

    void copyToNewSize(int newSize) {
        int* temp = new int[newSize];
        if (newSize > size)
            for (int i = 0; i < size; i++) {
                temp[i] = arr[i];
            }
        else
            for (int i = 0; i < newSize; i++) {
                temp[i] = arr[i];
            }
        delete[] arr;
        size = newSize;
        arr = temp;
    }

    void checkIndex(int index) {
        if (index >= length || index < 0)
            throw "Index is out of the array range!";
    }

public:
    DynamicArray(int startingSize) {
        size = startingSize;
        arr = new int[size];
        length = 0;
    }

    ~DynamicArray() { delete[] arr; }

    int push(int value) {
        length++;
        if (length == (size + 1)) grow();
        arr[length - 1] = value;
        return length;
    }

    int pop() {
        length--;
        int value = arr[length];
        // memory leaking , delete arr[length]
        if (length <= size / 2) shrink();
        return value;
    }

    int insert(int index, int value) {
        checkIndex(index);
        length++;
        if (length == (size + 1)) grow();
        for (int i = length - 1; i > index; i--) {
            arr[i] = arr[i - 1];
        }
        arr[index] = value;
        return length;
    }

    int remove(int index) {
        checkIndex(index);
        int value = arr[index];
        length--;
        for (int i = index; i < length; i++)
            arr[i] = arr[i + 1];
        // memory leaking , delete arr[length]
        if (length <= (size / 2)) shrink();
        return value;
    }

    int get(int index) {
        checkIndex(index);
        return arr[index];
    }

    void set(int index, int value) {
        checkIndex(index);
        arr[index] = value;
    }

    int search(int value) {
        for (int i = 0; i < length; i++) {
            if (arr[i] == value) return i;
        }
        return -1;
    }

    int getLength() { return length; }
};
```

- Do you have any specific reason for not using `std::vector<int>` or one of the other standard containers? – jdt, Sep 20, 2021
- @JohanduToit Yes, learning to code. – Swedgin, Sep 20, 2021
3 Answers
First of all, there is a major problem with your class: it does not respect the rule of three/five/zero:

- it has a non-trivial destructor that destroys memory allocated in the constructor
- it neither defines nor deletes the copy constructor and assignment operator

Let us look at the following code:
```cpp
DynamicArray dynarr(16);
// initialize values in dynarr
if (...) {                          // opens a nested block
    DynamicArray dynarr2 = dynarr;  // copy construct the new DynamicArray (1)
    // use it
}                                   // dynarr2 goes out of scope (2)
```
At (1), the compiler-generated copy constructor will blindly copy the members, leaving `dynarr` and `dynarr2` sharing their inner int array.

At (2), the dtor for `dynarr2` will delete the shared inner array: ZAP! ... `dynarr` now has a dangling pointer; Undefined Behaviour is ensured from there...
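A sketch of what the missing copy operations could look like, on a trimmed-down version of the class (growth logic is omitted for brevity, and the copy-and-swap assignment shown here is one common approach, not the only one):

```cpp
#include <algorithm>
#include <utility>

// Trimmed-down DynamicArray showing only the rule-of-three members.
class DynamicArray {
    int size;
    int* arr;
    int length;
public:
    explicit DynamicArray(int startingSize)
        : size(startingSize), arr(new int[startingSize]), length(0) {}

    ~DynamicArray() { delete[] arr; }

    // Copy constructor: deep-copy the buffer instead of sharing it.
    DynamicArray(const DynamicArray& other)
        : size(other.size), arr(new int[other.size]), length(other.length) {
        std::copy(other.arr, other.arr + other.length, arr);
    }

    // Copy assignment via copy-and-swap: the parameter is already a
    // copy, so swapping leaves *this with its own buffer and lets the
    // temporary's destructor free our old one.
    DynamicArray& operator=(DynamicArray other) {
        std::swap(size, other.size);
        std::swap(arr, other.arr);
        std::swap(length, other.length);
        return *this;
    }

    // push without growth checks, for brevity only.
    int push(int value) { arr[length] = value; return ++length; }
    int get(int index) const { return arr[index]; }
};
```

With these two members defined, the `dynarr`/`dynarr2` example above deep-copies the buffer at (1), so the destruction at (2) is harmless.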
There is another possible error in `size` management. You consistently divide it by 2 when removing elements and multiply it by 2 when adding. Fine. Except that if you let the size reach 0 by removing the last element of a DynamicArray, 0 * 2 is still 0, so the size will be definitively stuck at 0! You must handle that corner case by preventing the size from reaching 0 in `shrink`.
What follows are only minor possible improvements.
Your code uses

```cpp
using namespace std;
```

This is not an error, because it is allowed by the language, and it is even common in many tutorials. It is a nice trick that lets you use all the goodies from the standard library without the `std::` prefix. But it does import a number of useless symbols into the current namespace, somewhat defeating the whole namespace concept. In production code, best practice is to never use `using namespace std` and to import only the relevant symbols. Only a cosmetic improvement.

In `checkIndex`, you use

```cpp
throw "Index is out of the array range!";
```

This actually throws a pointer to a string literal. Here again, it is not an error, since it is allowed by the language. But best practice recommends throwing only objects of (subclasses of) the `std::exception` class. Users of your class will probably not expect to have to catch a char pointer. You'd better use an `out_of_range` or a `runtime_error` exception if you think it is not worth declaring a custom exception.

You compare `size` and `newSize` in `copyToNewSize` to know how many elements should be copied. But in fact you only have to copy `length` elements... This one is only a tiny optimization.
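Picking up the exception point: a minimal sketch of a bounds check that throws `std::out_of_range` (written as a free function taking `length` as a parameter so it is self-contained here; the two-argument signature is illustrative, not from the original class):

```cpp
#include <stdexcept>
#include <string>

// Sketch: throw a standard exception type instead of a char pointer.
// Callers can now catch std::out_of_range (or std::exception).
void checkIndex(int index, int length) {
    if (index < 0 || index >= length) {
        throw std::out_of_range(
            "index " + std::to_string(index) +
            " is out of range [0, " + std::to_string(length) + ")");
    }
}
```

Inside the class this would keep the original single-argument shape; the key change is only the thrown type.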
For your initial question: there is nothing bad in shrinking only after a number of removals, because a shrink involves a reallocation and a copy of the elements, so it is an expensive operation. For that reason, you should even consider not shrinking below a threshold (for example 8 elements). As a bonus, it will directly prevent `size` from reaching 0, and you could even make that minimal size an optional parameter of your constructor.
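A sketch of that threshold idea; the `MIN_SIZE` name and its value of 8 are illustrative choices, not from the original. The helper computes the shrunken capacity and refuses to go below the floor, which also keeps the capacity from ever collapsing to 0:

```cpp
#include <algorithm>

// Illustrative floor: never shrink the capacity below this.
const int MIN_SIZE = 8;

// Compute the new capacity when shrinking. Clamping to MIN_SIZE also
// means the capacity can never reach 0 (where size * 2 would be
// stuck at 0 forever).
int shrunkCapacity(int size) {
    if (size <= MIN_SIZE) return size;       // already at or below the floor
    return std::max(size / 2, MIN_SIZE);     // halve, but not below it
}
```

Inside `shrink`, one would call `copyToNewSize` only when `shrunkCapacity(size)` differs from `size`.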
- *and you could even make that minimal size an optional parameter of your constructor* – Then you'd have to store it somewhere. That sounds unlikely to be good, although if you're implementing this for real (rather than as a learning exercise) then making it different from `std::vector` is a good idea; otherwise you'd just use `std::vector`. I'd be more inclined to make a tuning option like that an optional `template` parameter, or possibly a `static` member variable. – Peter Cordes, Sep 21, 2021
- @PeterCordes: I generally avoid integer templates, because they are a nightmare in a library: either you put the full definition in headers, or you have to instantiate a number of concrete classes and users will be restricted to them. I have never found a correct implementation of multidimensional arrays in C++ because of that (inner dimensions can only be constexpr). – Serge Ballesta, Sep 21, 2021
Consider using `size_t` instead of `int`. The actual type of `size_t` is platform-dependent; a common mistake is to assume `size_t` is the same as `unsigned int`, which can lead to problems, particularly as 64-bit architectures become more prevalent.

It is generally bad practice to repeat code that does the same thing. Consider changing `copyToNewSize` to something like this:

```cpp
void copyToNewSize(size_t newSize) {
    int* temp = new int[newSize];
    size_t copySize = std::min(size, newSize); // std::min needs <algorithm>
    for (size_t i = 0; i < copySize; i++) {
        temp[i] = arr[i];
    }
    delete[] arr;
    size = newSize;
    arr = temp;
}
```

I would suggest that you rename `size` to `allocatedSize` to avoid confusion between `size` and `length`.
- 64-bit architectures are already widespread; what's still left to change is that real-world memory capacities continue to grow, so it's becoming more realistic (and maybe less rare in real-world usage) to actually have arrays that large. – Peter Cordes, Sep 21, 2021
- Also, `std::copy` exists; it can be a good idea to use it instead of open-coding a copy loop. – Peter Cordes, Sep 21, 2021
- `#pragma omp parallel for` on a copy loop? Yeah, could be for very large copies (like maybe more than 4GiB), on systems like a big Xeon where a single core can't saturate memory-controller bandwidth. (Why is Skylake so much better than Broadwell-E for single-threaded memory throughput?) If you have so many large copies to do that it's worth considering threads just for that, though, probably algorithmic optimizations to avoid copying so much will be even better. Or see my comments on mdfst13's answer re: Linux `mremap` to avoid physical copying. – Peter Cordes, Sep 21, 2021
- (Or if other cores can be doing something useful while some are just copying, that might be good for overall throughput, unless multiple other threads are waiting for this copy to complete. On some CPUs there are diminishing returns for using even more threads to max out copy bandwidth, although on current Skylake-Xeon it's fairly linear for quite a few cores because the per-core memory bandwidth is so small, compared to a desktop and also as a fraction of the available aggregate memory-controller bandwidth.) – Peter Cordes, Sep 21, 2021
- You forgot to add any compiler flags on TIO, so that was `gcc -O0`. If you just meant to link the code but not try to run it on the server, godbolt.org/z/jW91snGdj lets you look at the asm. (Using `gcc -O3 -fopenmp`. Without `-fopenmp`, `#pragma omp` does nothing.) – Peter Cordes, Sep 21, 2021
Don't talk like a pirate
I would prefer the name `array` (another reason not to `using namespace std;`) to `arr`. And I'd prefer a more descriptive name like `data` or `numbers` to either.
Don't waste work
```cpp
int push(int value) {
    length++;
    if (length == (size + 1)) grow();
    data[length - 1] = value;
    return length;
}
```
You add and subtract three times. Consider
```cpp
int push(int value) {
    if (length == size) {
        grow();
    }
    data[length] = value;
    length++;
    return length;
}
```

That adds once (the increment).
In general, I would avoid the statement form of control structures (an `if` or loop body without braces). That can lead to odd, hard-to-find bugs. Worse, you use it inconsistently, which makes the code harder to read.
Fragile code
```cpp
if (length <= size / 2) shrink();
```

and

```cpp
void shrink() { copyToNewSize(size / 2); }
```

Combined, these are fragile. To work properly, you need `size / 2` to be the same in both places. Consider

```cpp
size_t bound = size / 2;
if (length <= bound) {
    copyToNewSize(bound);
}
```

Now it has to be the same in both places. Another alternative:

```cpp
void shrinkIfNecessary() {
    size_t bound = size / 2;
    if (length <= bound) {
        copyToNewSize(bound);
    }
}
```

Then replace the `if` and call to `shrink` with `shrinkIfNecessary`.
In general, you should try to avoid parallel logic. If you need the same logic in two places, try to abstract it out into at least a variable (e.g. `bound` here) or a function (e.g. `shrinkIfNecessary`).

It's natural to think that `grow` should have a `shrink`. But here, the shrinking operation is triggered differently. They aren't mirror images of each other. You have to grow any time you exceed the bounds. You don't need to shrink every time. So it actually makes sense to have `grow` and `shrinkIfNecessary` instead.

You also might consider moving the 2 to a constant, e.g. `GROWTH_FACTOR`. But that will work fine even if inconsistent between growing and shrinking, so it's less important.
std::copy
```cpp
if (newSize > size)
    for (int i = 0; i < size; i++) {
        temp[i] = arr[i];
    }
else
    for (int i = 0; i < newSize; i++) {
        temp[i] = arr[i];
    }
```

In this, you copy manually. But you don't need to do that. C++ has `std::copy`. So you could just say

```cpp
std::copy(arr, arr + std::min(size, newSize), temp);
```

The `std::min` was already posted in another answer.

Note that `std::copy` requires `#include <algorithm>`.

This would get around the entire question of `int` vs. `size_t` vs. `auto` as regards the loop index variables. No more loops means no more loop indexes.
The C++ way
C++ also has `std::vector`, which would get around the entire class.
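For comparison, a sketch of the question's operations expressed with `std::vector<int>` (the `demo` function is just a container for the example):

```cpp
#include <vector>

// Sketch: the question's push/insert/set/pop expressed with
// std::vector<int>. The vector grows automatically, and at() does the
// bounds checking that checkIndex did (throwing std::out_of_range).
std::vector<int> demo() {
    std::vector<int> v;
    v.push_back(5);          // push          -> [5]
    v.insert(v.begin(), 7);  // insert at 0   -> [7, 5]
    v.at(0) = 9;             // set (checked) -> [9, 5]
    v.pop_back();            // pop           -> [9]
    return v;
}
```

`erase(v.begin() + i)` would similarly replace `remove`, shifting the tail down just like the hand-written loop.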
The C way
If you prefer to reinvent the wheel, you might consider whether you want to go all the way and return to C memory management. Why C? Because in C, you have `realloc`, which can make arrays smaller or larger. Usually, when making an array smaller, it will use the same memory and free just what is at the end. C++ doesn't have that capability (except in that C++ has access to the C routines). Of course, if you do that, you want to stop using `new` and `delete` in favor of `calloc`/`malloc` and `free`.

The `realloc` function also handles the copying for you if necessary. But it won't always be necessary, because `realloc` will usually reuse the existing array when making it smaller. And it will sometimes use the existing array when making it bigger as well.
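A sketch of that approach for a plain `int` buffer (the `resizeBuffer` helper is hypothetical; this only works because `int` is trivially copyable, and the buffer must come from `malloc`/`calloc`/`realloc`, never from `new[]`):

```cpp
#include <cstdlib>

// Sketch: resize a malloc'd int buffer with realloc. realloc may
// reuse the block in place (common when shrinking) or move it and
// copy the contents for us.
int* resizeBuffer(int* buf, std::size_t newCount) {
    int* resized = static_cast<int*>(std::realloc(buf, newCount * sizeof(int)));
    if (resized == nullptr && newCount > 0) {
        std::free(buf);  // realloc failed: the original block is untouched
        return nullptr;
    }
    return resized;
}
```

Mixing the two allocator families (freeing a `new[]` buffer with `free`, or a `malloc` buffer with `delete[]`) is undefined behaviour, so a class using this has to commit fully to the C routines.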
Duplicates
> When deleting an element and shifting the others, the last element will be twice repeated (the original still exists after copying).

In one sense, this is harmless. If you track the size correctly and initialize the memory before using it, this might be true, but you'd never know.

If you are concerned about leaving traces in your program, note that `realloc` can make an array smaller efficiently enough that you could call it on every `pop` or `remove`. However, you'd still likely prefer to grow less frequently, because at least some of the time growing would cause the array to need to be copied.

You might also consider manually clearing the entry, e.g. in `remove`:

```cpp
data[length] = 0;
```

That's the kind of thing you'd do if the data were sensitive for some reason.
- `std::vector` unfortunately can't use `realloc` because it has to work on non-trivially-copyable types (i.e. whose object representation may need to change when copied to a new address, or just whose copy constructor has a side effect such as debug logging). But even more unfortunately, doing it even in a template specialization for trivially-copyable types isn't easy: C++ has replaceable `new`, and there's a guarantee that if `std::vector` allocates, it's via calls to `new`, so user-defined `new` will run. This replacement might be in a different compilation unit, so you can't template on it. – Peter Cordes, Sep 21, 2021
- IDK why C++ has refused to add a `try_realloc` allocator interface that could avoid copying, but it makes `std::vector` pretty garbage for huge arrays, especially compared to what it could be doing on Linux, where the `mremap(2)` system call does this at page granularity, trying to extend an existing mapping. And with `MREMAP_MAYMOVE` it can always avoid copying by remapping the existing physical pages to the start of a larger extent of virtual address space if there wasn't room to grow at the original spot. (Still TLB misses, though.) – Peter Cordes, Sep 21, 2021
- So basically C++ requires `std::vector` to suck at clever reallocation optimizations, although if an implementation could detect that `new` wasn't replaced (or promise via a command line arg), the as-if rule would let `std::vector` use smarter realloc. Howard Hinnant said on SO that his attempts to improve things for C++11 didn't get traction. IMO adding a `try_realloc` interface to the language would have been easy; implementations could always define it as `new` + copy + `delete`, or return false. – Peter Cordes, Sep 21, 2021
- So yeah, writing your own container that used `realloc` or even `mmap`/`mremap` (intended for large allocations where bypassing a free-list makes sense) could be a case where it's actually interesting for real use, not just as a learning exercise. – Peter Cordes, Sep 21, 2021
- @PeterCordes `std::vector` is a template (and so is `std::allocator`); templates can have specialisations. `std::vector<TriviallyCopyable>` can use `realloc`. – Caleth, Sep 21, 2021