I implemented a basic dynamic array in C++.
When deleting an element and shifting the others down, the last slot ends up holding a duplicate of the final element (the original value still exists after the shift). I want to avoid leaking memory by deallocating that last element of the array. Currently, I do this by shrinking the array once after a number of deletions. Are there better solutions?
Finally, I am interested in some feedback on this implementation.
```cpp
#include <iostream>
#include <stdexcept>

using namespace std;

class DynamicArray {
protected:
    int size;
    int* arr;
    int length;

    void grow() { copyToNewSize(size * 2); }

    void shrink() { copyToNewSize(size / 2); }

    void copyToNewSize(int newSize) {
        int* temp = new int[newSize];
        if (newSize > size)
            for (int i = 0; i < size; i++) {
                temp[i] = arr[i];
            }
        else
            for (int i = 0; i < newSize; i++) {
                temp[i] = arr[i];
            }
        delete[] arr;
        size = newSize;
        arr = temp;
    }

    void checkIndex(int index) {
        if (index >= length || index < 0)
            throw "Index is out of the array range!";
    }

public:
    DynamicArray(int startingSize) {
        size = startingSize;
        arr = new int[size];
        length = 0;
    }

    ~DynamicArray() { delete[] arr; }

    int push(int value) {
        length++;
        if (length == (size + 1)) grow();
        arr[length - 1] = value;
        return length;
    }

    int pop() {
        length--;
        int value = arr[length];
        // memory leaking , delete arr[length]
        if (length <= size / 2) shrink();
        return value;
    }

    int insert(int index, int value) {
        checkIndex(index);
        length++;
        if (length == (size + 1)) grow();
        for (int i = length - 1; i > index; i--) {
            arr[i] = arr[i - 1];
        }
        arr[index] = value;
        return length;
    }

    int remove(int index) {
        checkIndex(index);
        int value = arr[index];
        length--;
        for (int i = index; i < length; i++)
            arr[i] = arr[i + 1];
        // memory leaking , delete arr[length]
        if (length <= (size / 2)) shrink();
        return value;
    }

    int get(int index) {
        checkIndex(index);
        return arr[index];
    }

    void set(int index, int value) {
        checkIndex(index);
        arr[index] = value;
    }

    int search(int value) {
        for (int i = 0; i < length; i++) {
            if (arr[i] == value) return i;
        }
        return -1;
    }

    int getLength() { return length; }
};
```

- Do you have any specific reason for not using `std::vector<int>` or one of the other standard containers? – jdt, Sep 20, 2021
- @JohanduToit Yes, learning to code. – Swedgin, Sep 20, 2021
3 Answers
First of all, there is a major problem with your class: it does not respect the rule of three/five/zero:

- it has a non-trivial destructor that destroys memory allocated in the constructor
- it neither defines nor deletes the copy constructor and assignment operator

Let us look at the following code:
```cpp
DynamicArray dynarr(16);
// initialize values in dynarr
if (...) {                          // opens a nested block
    DynamicArray dynarr2 = dynarr;  // copy construct the new DynamicArray (1)
    // use it
}                                   // dynarr2 goes out of scope (2)
```
At (1), the compiler-generated copy constructor will blindly copy the members, leaving `dynarr` and `dynarr2` sharing their inner int array.

At (2), the dtor for `dynarr2` will delete the shared inner array: ZAP! ... `dynarr` now has a dangling pointer; Undefined Behaviour is ensured from there...
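A sketch of what the missing copy operations could look like, on a trimmed-down version of the class (growth logic is omitted for brevity, and the copy-and-swap assignment shown here is one common approach, not the only one):

```cpp
#include <algorithm>
#include <utility>

// Trimmed-down DynamicArray showing only the rule-of-three members.
class DynamicArray {
    int size;
    int* arr;
    int length;
public:
    explicit DynamicArray(int startingSize)
        : size(startingSize), arr(new int[startingSize]), length(0) {}

    ~DynamicArray() { delete[] arr; }

    // Copy constructor: deep-copy the buffer instead of sharing it.
    DynamicArray(const DynamicArray& other)
        : size(other.size), arr(new int[other.size]), length(other.length) {
        std::copy(other.arr, other.arr + other.length, arr);
    }

    // Copy assignment via copy-and-swap: the parameter is already a
    // copy, so swapping leaves *this with its own buffer and lets the
    // temporary's destructor free our old one.
    DynamicArray& operator=(DynamicArray other) {
        std::swap(size, other.size);
        std::swap(arr, other.arr);
        std::swap(length, other.length);
        return *this;
    }

    // push without growth checks, for brevity only.
    int push(int value) { arr[length] = value; return ++length; }
    int get(int index) const { return arr[index]; }
};
```

With these two members defined, the `dynarr`/`dynarr2` example above deep-copies the buffer at (1), so the destruction at (2) is harmless.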
There is another possible error in `size` management. You consistently divide it by 2 when removing elements and multiply it by 2 when adding. Fine. Except that if you let the size reach 0 by removing the last element of a DynamicArray, 0 * 2 is still 0, so the size will be definitively stuck at 0! You must handle that corner case by preventing the size from reaching 0 in `shrink`.
What follows are only minor possible improvements.
Your code uses

```cpp
using namespace std;
```

This is not an error, because it is allowed by the language, and it is even common in many tutorials. It is a nice trick that lets you use all the goodies from the standard library without the `std::` prefix. But it does import a number of useless symbols into the current namespace, somewhat defeating the whole namespace concept. In production code, best practice is to never use `using namespace std` and to import only the relevant symbols. Only a cosmetic improvement.

In `checkIndex`, you use

```cpp
throw "Index is out of the array range!";
```

This actually throws a pointer to a string literal. Here again, it is not an error, since it is allowed by the language. But best practice recommends throwing only objects of (subclasses of) the `std::exception` class. Users of your class will probably not expect to have to catch a char pointer. You'd better use an `out_of_range` or a `runtime_error` exception if you think it is not worth declaring a custom exception.

You compare `size` and `newSize` in `copyToNewSize` to know how many elements should be copied. But in fact you only have to copy `length` elements... This one is only a tiny optimization.
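Picking up the exception point: a minimal sketch of a bounds check that throws `std::out_of_range` (written as a free function taking `length` as a parameter so it is self-contained here; the two-argument signature is illustrative, not from the original class):

```cpp
#include <stdexcept>
#include <string>

// Sketch: throw a standard exception type instead of a char pointer.
// Callers can now catch std::out_of_range (or std::exception).
void checkIndex(int index, int length) {
    if (index < 0 || index >= length) {
        throw std::out_of_range(
            "index " + std::to_string(index) +
            " is out of range [0, " + std::to_string(length) + ")");
    }
}
```

Inside the class this would keep the original single-argument shape; the key change is only the thrown type.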
For your initial question: there is nothing bad in shrinking only after a number of removals, because a shrink involves a reallocation and a copy of the elements, so it is an expensive operation. For that reason, you should even consider not shrinking below a threshold (for example 8 elements). As a bonus, it will directly prevent `size` from reaching 0, and you could even make that minimal size an optional parameter of your constructor.
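A sketch of that threshold idea; the `MIN_SIZE` name and its value of 8 are illustrative choices, not from the original. The helper computes the shrunken capacity and refuses to go below the floor, which also keeps the capacity from ever collapsing to 0:

```cpp
#include <algorithm>

// Illustrative floor: never shrink the capacity below this.
const int MIN_SIZE = 8;

// Compute the new capacity when shrinking. Clamping to MIN_SIZE also
// means the capacity can never reach 0 (where size * 2 would be
// stuck at 0 forever).
int shrunkCapacity(int size) {
    if (size <= MIN_SIZE) return size;       // already at or below the floor
    return std::max(size / 2, MIN_SIZE);     // halve, but not below it
}
```

Inside `shrink`, one would call `copyToNewSize` only when `shrunkCapacity(size)` differs from `size`.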
- *and you could even make that minimal size an optional parameter of your constructor* – Then you'd have to store it somewhere. That sounds unlikely to be good, although if you're implementing this for real (rather than as a learning exercise) then making it different from `std::vector` is a good idea; otherwise you'd just use `std::vector`. I'd be more inclined to make a tuning option like that an optional `template` parameter, or possibly a `static` member variable. – Peter Cordes, Sep 21, 2021
- @PeterCordes: I generally avoid integer templates, because they are a nightmare in a library: either you put the full definition in headers, or you have to instantiate a number of concrete classes and users will be restricted to them. I have never found a correct implementation of multidimensional arrays in C++ because of that (inner dimensions can only be constexpr). – Serge Ballesta, Sep 21, 2021
Consider using `size_t` instead of `int`. The actual type of `size_t` is platform-dependent; a common mistake is to assume `size_t` is the same as `unsigned int`, which can lead to problems, particularly as 64-bit architectures become more prevalent.

It is generally bad practice to repeat code that does the same thing. Consider changing `copyToNewSize` to something like this:

```cpp
void copyToNewSize(size_t newSize) {
    int* temp = new int[newSize];
    size_t copySize = std::min(size, newSize); // std::min needs <algorithm>
    for (size_t i = 0; i < copySize; i++) {
        temp[i] = arr[i];
    }
    delete[] arr;
    size = newSize;
    arr = temp;
}
```

I would suggest that you rename `size` to `allocatedSize` to avoid confusion between `size` and `length`.
- 64-bit architectures are already widespread; what's still left to change is that real-world memory capacities continue to grow, so it's becoming more realistic (and maybe less rare in real-world usage) to actually have arrays that large. – Peter Cordes, Sep 21, 2021
- Also, `std::copy` exists; it can be a good idea to use it instead of open-coding a copy loop. – Peter Cordes, Sep 21, 2021
- `#pragma omp parallel for` on a copy loop? Yeah, could be for very large copies (like maybe more than 4GiB), on systems like a big Xeon where a single core can't saturate memory-controller bandwidth. (Why is Skylake so much better than Broadwell-E for single-threaded memory throughput?) If you have so many large copies to do that it's worth considering threads just for that, though, probably algorithmic optimizations to avoid copying so much will be even better. Or see my comments on mdfst13's answer re: Linux `mremap` to avoid physical copying. – Peter Cordes, Sep 21, 2021
- (Or if other cores can be doing something useful while some are just copying, that might be good for overall throughput, unless multiple other threads are waiting for this copy to complete. On some CPUs there are diminishing returns for using even more threads to max out copy bandwidth, although on current Skylake-Xeon it's fairly linear for quite a few cores because the per-core memory bandwidth is so small, compared to a desktop and also as a fraction of the available aggregate memory-controller bandwidth.) – Peter Cordes, Sep 21, 2021
- You forgot to add any compiler flags on TIO, so that was `gcc -O0`. If you just meant to link the code but not try to run it on the server, godbolt.org/z/jW91snGdj lets you look at the asm. (Using `gcc -O3 -fopenmp`. Without `-fopenmp`, `#pragma omp` does nothing.) – Peter Cordes, Sep 21, 2021
Don't talk like a pirate
I would prefer the name `array` (another reason not to `using namespace std;`) to `arr`. And I'd prefer a more descriptive name like `data` or `numbers` to either.
Don't waste work
```cpp
int push(int value) {
    length++;
    if (length == (size + 1)) grow();
    data[length - 1] = value;
    return length;
}
```
You add and subtract three times. Consider
```cpp
int push(int value) {
    if (length == size) {
        grow();
    }
    data[length] = value;
    length++;
    return length;
}
```

That adds once (the increment).
In general, I would avoid the statement form of control structures (an `if` or loop body without braces). That can lead to odd, hard-to-find bugs. Worse, you use it inconsistently, which makes the code harder to read.
Fragile code
```cpp
if (length <= size / 2) shrink();
```

and

```cpp
void shrink() { copyToNewSize(size / 2); }
```

Combined, these are fragile. To work properly, you need `size / 2` to be the same in both places. Consider

```cpp
size_t bound = size / 2;
if (length <= bound) {
    copyToNewSize(bound);
}
```

Now it has to be the same in both places. Another alternative:

```cpp
void shrinkIfNecessary() {
    size_t bound = size / 2;
    if (length <= bound) {
        copyToNewSize(bound);
    }
}
```

Then replace the `if` and call to `shrink` with `shrinkIfNecessary`.
In general, you should try to avoid parallel logic. If you need the same logic in two places, try to abstract it out into at least a variable (e.g. `bound` here) or a function (e.g. `shrinkIfNecessary`).

It's natural to think that `grow` should have a `shrink`. But here, the shrinking operation is triggered differently. They aren't mirror images of each other. You have to grow any time you exceed the bounds. You don't need to shrink every time. So it actually makes sense to have `grow` and `shrinkIfNecessary` instead.

You also might consider moving the 2 to a constant, e.g. `GROWTH_FACTOR`. But that will work fine even if inconsistent between growing and shrinking, so it's less important.
std::copy
```cpp
if (newSize > size)
    for (int i = 0; i < size; i++) {
        temp[i] = arr[i];
    }
else
    for (int i = 0; i < newSize; i++) {
        temp[i] = arr[i];
    }
```

In this, you copy manually. But you don't need to do that. C++ has `std::copy`. So you could just say

```cpp
std::copy(arr, arr + std::min(size, newSize), temp);
```

The `std::min` was already posted in another answer.

Note that `std::copy` requires `#include <algorithm>`.

This would get around the entire question of `int` vs. `size_t` vs. `auto` as regards the loop index variables. No more loops means no more loop indexes.
The C++ way
C++ also has `std::vector`, which would get around the entire class.
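For comparison, a sketch of the question's operations expressed with `std::vector<int>` (the `demo` function is just a container for the example):

```cpp
#include <vector>

// Sketch: the question's push/insert/set/pop expressed with
// std::vector<int>. The vector grows automatically, and at() does the
// bounds checking that checkIndex did (throwing std::out_of_range).
std::vector<int> demo() {
    std::vector<int> v;
    v.push_back(5);          // push          -> [5]
    v.insert(v.begin(), 7);  // insert at 0   -> [7, 5]
    v.at(0) = 9;             // set (checked) -> [9, 5]
    v.pop_back();            // pop           -> [9]
    return v;
}
```

`erase(v.begin() + i)` would similarly replace `remove`, shifting the tail down just like the hand-written loop.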
The C way
If you prefer to reinvent the wheel, you might consider whether you want to go all the way and return to C memory management. Why C? Because in C, you have `realloc`, which can make arrays smaller or larger. Usually, when making an array smaller, it will use the same memory and free just what is at the end. C++ doesn't have that capability (except in that C++ has access to the C routines). Of course, if you do that, you want to stop using `new` and `delete` in favor of `calloc`/`malloc` and `free`.

The `realloc` function also handles the copying for you if necessary. But it won't always be necessary, because `realloc` will usually reuse the existing array when making it smaller. And it will sometimes use the existing array when making it bigger as well.
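A sketch of that approach for a plain `int` buffer (the `resizeBuffer` helper is hypothetical; this only works because `int` is trivially copyable, and the buffer must come from `malloc`/`calloc`/`realloc`, never from `new[]`):

```cpp
#include <cstdlib>

// Sketch: resize a malloc'd int buffer with realloc. realloc may
// reuse the block in place (common when shrinking) or move it and
// copy the contents for us.
int* resizeBuffer(int* buf, std::size_t newCount) {
    int* resized = static_cast<int*>(std::realloc(buf, newCount * sizeof(int)));
    if (resized == nullptr && newCount > 0) {
        std::free(buf);  // realloc failed: the original block is untouched
        return nullptr;
    }
    return resized;
}
```

Mixing the two allocator families (freeing a `new[]` buffer with `free`, or a `malloc` buffer with `delete[]`) is undefined behaviour, so a class using this has to commit fully to the C routines.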
Duplicates
> When deleting an element and shifting the others, the last element will be twice repeated (the original still exists after copying).

In one sense, this is harmless. If you track the size correctly and initialize the memory before using it, this might be true, but you'd never know.

If you are concerned about leaving traces in your program, note that `realloc` can make an array smaller efficiently enough that you could call it on every `pop` or `remove`. However, you'd still likely prefer to grow less frequently, because at least some of the time growing would cause the array to need to be copied.

You might also consider manually clearing the entry, e.g. in `remove`:

```cpp
data[length] = 0;
```

That's the kind of thing you'd do if the data were sensitive for some reason.
- `std::vector` unfortunately can't use `realloc` because it has to work on non-trivially-copyable types (i.e. whose object representation may need to change when copied to a new address, or just whose copy constructor has a side effect such as debug logging). But even more unfortunately, doing it even in a template specialization for trivially-copyable types isn't easy: C++ has replaceable `new`, and there's a guarantee that if `std::vector` allocates, it's via calls to `new`, so user-defined `new` will run. This replacement might be in a different compilation unit, so you can't template on it. – Peter Cordes, Sep 21, 2021
- IDK why C++ has refused to add a `try_realloc` allocator interface that could avoid copying, but it makes `std::vector` pretty garbage for huge arrays, especially compared to what it could be doing on Linux, where the `mremap(2)` system call does this at page granularity, trying to extend an existing mapping. And with `MREMAP_MAYMOVE` it can always avoid copying by remapping the existing physical pages to the start of a larger extent of virtual address space if there wasn't room to grow at the original spot. (Still TLB misses, though.) – Peter Cordes, Sep 21, 2021
- So basically C++ requires `std::vector` to suck at clever reallocation optimizations, although if an implementation could detect that `new` wasn't replaced (or promise via a command line arg), the as-if rule would let `std::vector` use smarter realloc. Howard Hinnant said on SO that his attempts to improve things for C++11 didn't get traction. IMO adding a `try_realloc` interface to the language would have been easy; implementations could always define it as `new` + copy + `delete`, or return false. – Peter Cordes, Sep 21, 2021
- So yeah, writing your own container that used `realloc` or even `mmap`/`mremap` (intended for large allocations where bypassing a free-list makes sense) could be a case where it's actually interesting for real use, not just as a learning exercise. – Peter Cordes, Sep 21, 2021
- @PeterCordes `std::vector` is a template (and so is `std::allocator`); templates can have specialisations. `std::vector<TriviallyCopyable>` can use `realloc`. – Caleth, Sep 21, 2021