I have implemented Binary search code for a given sorted array, written unit test too, Please review my BST unit test code and let me know your comments for improvements.
class BinarySearch { public int binSearch(int[] sortedArray, int key) { return search(0, sortedArray.length - 1, key, sortedArray); } private static int search(int start, int end, int key, int[] sortedArray) { int mid = start + ((end-start)/2); if (mid >= start && mid <= end) { if (sortedArray[mid] == key) { return mid; } else if (sortedArray[mid] > key) { return search(start, mid-1, key, sortedArray); } else if(sortedArray[mid] < key){ return search(mid+1, end, key, sortedArray); } } return -1; }}public class BinarySearchTest { private BinarySearch binarySearchCUT; private static final int UNSUCCESSFUL = -1; @Before public void setUp(){ binarySearchCUT = new BinarySearch(); } @Test public void testShouldReturnUnsuccessfulOnEmptyArray() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{}, 0)); } @Test public void testShouldReturnUnsuccessfulOnLeftBound() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 0)); } @Test public void testShouldReturnUnsuccessfulOnRightBound() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 101)); } @Test public void testShouldReturnSuccessfulOnLeftBound() { assertEquals(0, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 1)); } @Test public void testShouldReturnSuccessfulOnRightBound() { assertEquals(12, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 100)); } @Test public void testShouldReturnSuccessfulOnMid() { assertEquals(7, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 19)); } @Test public void testShouldReturnSuccessfulOnMidGreaterThanGivenNumber() { assertEquals(5, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 12)); } @Test public void testShouldReturnSuccessfulOnMidLesserThanGivenNumber() { assertEquals(10, binarySearchCUT.binSearch(new int[]{1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100}, 69)); }}- 1\$\begingroup\$DRY: You could create a helper-method
testSearch(int[] arr, int searchFor, int expected)and put your test-array into a constant. The individual tests would then just betestSeach(normArray, 4, normArray[4])much shorter to read and reason about.\$\endgroup\$Falco– Falco2020-01-02 15:21:26 +00:00CommentedJan 2, 2020 at 15:21 - \$\begingroup\$Consider using
Optionals.\$\endgroup\$Bex– Bex2020-01-02 20:29:32 +00:00CommentedJan 2, 2020 at 20:29 - \$\begingroup\$I have to disagree with @Falco . In unit testing, repeating yourself is fine. Martin Fowler actually mentions this in his Refactoring book, that repetition and copy-paste is acceptable in unit tests. (Seriously, read that book!)\$\endgroup\$markspace– markspace2020-01-03 06:40:22 +00:00CommentedJan 3, 2020 at 6:40
- \$\begingroup\$@markspace I did read the Fowler book - but I think there is a difference between acceptable and recommended. Unit Tests should be easy to read, understand, refactor and adapt. If repeating yourself makes the Tests more readable and easier to adapt it is fine. But in this case I think a more concise signature would make it easier to verify all Tests actually do what they're supposed to and prevent copy&paste errors with future tests.\$\endgroup\$Falco– Falco2020-01-03 09:16:00 +00:00CommentedJan 3, 2020 at 9:16
4 Answers4
BinarySearch
TheBinarySearch class should bepublic since it contains utility methods that are generally useful. To do this, writepublic class BinarySearch instead ofclass BinarySearch.
ThebinSearch method should bestatic since it does not access any fields from theBinarySearch class. After all, that class doesn't have any fields that could be accessed. To do this, writepublic static int binSearch instead ofpublic int binSearch.
When you make thebinSearch method static, you no longer need thenew BinarySearch() in the unit test. This is good since it sounds weird to have a "new binary search". It's confusing to hear this, since "binary search" is an algorithm and not a countable thing. It makes more sense to speak of "the binary search algorithm", avoiding the word "new". On the other hand, "new house" or "new sheet of paper" sounds much more natural.
The helper method currently is:
private static int search(int start, int end, int key, int[] sortedArray) {The order of the parameters is not the usual one. Usually the array comes first, before its indexes. In this case, this would mean:
private static int search(int[] sortedArray, int start, int end, int key) {When you compare your code withthe predefined Arrays.binarySearch andthe helper function, you can see that the code is quite similar, which is a good sign.
One useful thing that the predefinedArrays.binarySearch does is that if the element is not found, it returns a hint about the index where it would be found. This is useful to see the closest existing value.
BinarySearchTest
Your existing unit tests look good. You can leave out the wordtest at the beginning of the method names to make the names a little shorter. They are quite long, but that's ok.
There are test cases missing:
- An element from the middle cannot be found, for example 5 in
[1, 2, 6, 10]. - The array contains duplicate values, for example 3 in
[1, 2, 3, 3, 3, 3, 3, 3, 3].
In the latter case, it is unspecified whetherbinSearch should return 2 or 3 or any other index. All you can do in your unit test is to check that`array[binSearch(array, key)] == key.
- 2\$\begingroup\$
it returns the index where it would be found.But then you lose information about whether or not the object is included or not. The C# standard library solves this by returning the binary complement of the index (which would be a negative number).\$\endgroup\$JAD– JAD2020-01-02 15:08:13 +00:00CommentedJan 2, 2020 at 15:08 - \$\begingroup\$@JAD That's exactly what Java is doing as well.\$\endgroup\$Roland Illig– Roland Illig2020-01-02 16:34:11 +00:00CommentedJan 2, 2020 at 16:34
- \$\begingroup\$Well look at that. I initially misread
return -(low + 1);as if it would just return-1.\$\endgroup\$JAD– JAD2020-01-03 07:05:23 +00:00CommentedJan 3, 2020 at 7:05
Your code looks good but just wanted to suggest below
if (mid >= start && mid <= end)- this condition will never fail becausemidwill either equal tostartorendso you can skip this because this is taking unnecessary condition check time- You can check if
startis less than or equal toendotherwise you will end up with wrongmidvalue. By putting this condition method will not calculatemidvalue but return -1 and it will improve performance.
`
private static int search(int start, int end, int key, int[] sortedArray) { if(start <= end) { int mid = start + ((end - start) / 2); //if (mid >= start && mid <= end) { if (sortedArray[mid] == key) { return mid; } else if (sortedArray[mid] > key) { return search(start, mid - 1, key, sortedArray); } else if (sortedArray[mid] < key) { return search(mid + 1, end, key, sortedArray); } //} } return -1;}`
I have some suggestions for you.
1) In thesearch method, extract thesortedArray[mid] into a variable.
//[...]int currentMid = sortedArray[mid];if (currentMid == key) {//[...]} else if (currentMid > key) {//[...]} else if(currentMid < key){//[...]}For the array part, instead of passing an array, you can use thevarargs. In my opinion, it will be easier to test since you don’t need to create a new array each time, but it can be uglier since the other parameters are also integers.
public class BinarySearchTest { private BinarySearch binarySearchCUT; private static final int UNSUCCESSFUL = -1; @Before public void setUp(){ binarySearchCUT = new BinarySearch(); } @Test public void testShouldReturnUnsuccessfulOnEmptyArray() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(0)); } @Test public void testShouldReturnUnsuccessfulOnLeftBound() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(0, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnUnsuccessfulOnRightBound() { assertEquals(UNSUCCESSFUL, binarySearchCUT.binSearch(101, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnSuccessfulOnLeftBound() { assertEquals(0, binarySearchCUT.binSearch(1, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnSuccessfulOnRightBound() { assertEquals(12, binarySearchCUT.binSearch(100, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnSuccessfulOnMid() { assertEquals(7, binarySearchCUT.binSearch(19, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnSuccessfulOnMidGreaterThanGivenNumber() { assertEquals(5, binarySearchCUT.binSearch(12, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } @Test public void testShouldReturnSuccessfulOnMidLesserThanGivenNumber() { assertEquals(10, binarySearchCUT.binSearch(69, 1, 2, 4, 7, 8, 12, 15, 19, 24, 50, 69, 80, 100)); } static class BinarySearch { public int binSearch(int key, int... sortedArray) { return search(0, sortedArray.length - 1, key, sortedArray); } private static int search(int start, int end, int key, int... sortedArray) { int mid = start + ((end-start)/2); if (mid >= start && mid <= end) { int currentMid = sortedArray[mid]; if (currentMid == key) { return mid; } else if (currentMid > key) { return search(start, mid-1, key, sortedArray); } else if(currentMid < key){ return search(mid+1, end, key, sortedArray); } } return -1; } }}```- \$\begingroup\$I think a vararg signature makes the whole method less easy to read and understand. And since he is using the same Array in all tests, he could just put it in a constant in his test-class. I think it is more error prone this way - e.g. counting of indexes in the array is harder. I would prefer the Tests to read like this:
arr=[]...; assert(binSearch(arr[10], arr)).is(10)it is easier to see what the expected result represents.\$\endgroup\$Falco– Falco2020-01-03 09:20:46 +00:00CommentedJan 3, 2020 at 9:20
One significant issue that has not been addressed in previous answers (all of which make excellent points), is that of deterministic behaviour.
Your code is going to produce unexpected, and inconsistent results if the data in the sorted array is not unique. You clearly indicate that the array is pre-sorted, but you do not specify any restraints on the uniqueness of the values.
It is standard in search routines to return the index of thefirst value in the array that matches the search term, but your code will return the index of "some" matching value.
When you find the match, scan backwards to find the first instance..., so your match code would change from:
if (sortedArray[mid] == key) { return mid;} else if
to more like:
if (sortedArray[mid] == key) { // Ensure we return the first matching instance in the array. while (mid > 0 && sortedArray[mid -1] == key) { mid--; } return mid;} else if- \$\begingroup\$Interesting. Wouldn't that make the worst case running time O(N) [since if the array is all the same number, you'd loop through N/2 or so times] versus O(log(N))?\$\endgroup\$Foon– Foon2020-01-02 22:10:17 +00:00CommentedJan 2, 2020 at 22:10
- \$\begingroup\$Note that Java at least specifically says: " If the list contains multiple elements equal to the specified object, there is no guarantee which one will be found. " It's a valid thing to think about, but I disagree it's a universal standard to return the first (docs.oracle.com/javase/7/docs/api/java/util/… for Java, though counter point isdocs.python.org/2/library/bisect.html shows how to get the first (or last) with bisect)\$\endgroup\$Foon– Foon2020-01-02 22:18:47 +00:00CommentedJan 2, 2020 at 22:18
- \$\begingroup\$(Check ofgithub.com/python/cpython/blob/master/Lib/bisect.py suggests slightly tweaking the algorithm will support getting the first element for uniques but still be O(log(N)))\$\endgroup\$Foon– Foon2020-01-02 22:32:05 +00:00CommentedJan 2, 2020 at 22:32
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.
