I accomplished the following task to practice allocating on the heap.
- reading words from stdin
- sorting them in lexicographical order
- printing the sorted words to stdout
To accomplish this is allocate the size of each word and the count of words.
I would like to hear any suggestions for improvements. Is there maybe an easier way to sort the words?
#include <stdio.h>#include <stdlib.h>#include <stdbool.h>#include <ctype.h>struct Word { char* signs; // pointer to a word int sz; int capacity;};// allocates factor 2 of the current capacityvoid reserve_space_for_letter(char **c, int *capacity) { if (*capacity == 0) { // allocate the first time *capacity = 1; *c = malloc(sizeof(char) * ((*capacity))); } else { *capacity *= 2; // double the new capacity *c = realloc(*c, sizeof(char) * (*capacity)); }}void reserve_space_for_words(struct Word** w, int *capacity) { if (*capacity == 0) { // allocate the first time *capacity = 1; *w = malloc(sizeof(struct Word) * ((*capacity))); } else { *capacity *= 2; // double the new capacity *w = realloc(*w, sizeof(struct Word) * (*capacity)); }}void print_words(struct Word* words,int sz) { int i = 0; for (i = 0; i < sz; ++i) { printf("%s ", words[i].signs); }}int cmp(void const *lhs, void const *rhs) { struct Word left = *(const struct Word *)lhs; struct Word right = *(const struct Word *)rhs; return strcmp(left.signs, right.signs);}void deallocate_words(struct Word* words, int sz){ int i = 0; for (i = 0; i < sz; ++i) { free(words[i].signs); } free(words);}int main(){ struct Word* words; // Pointer to several words int sz = 0; int capacity = 0; int ch=0; bool in_word = true; reserve_space_for_words(&words, &capacity); // reserve space for first word ++sz; words[sz - 1].capacity = 0; words[sz - 1].sz = 0; while ((ch = getc(stdin)) != EOF) { if (words[sz - 1].sz == words[sz - 1].capacity) { // if current letter sz = capacity reserve_space_for_letter(&words[sz - 1].signs, &words[sz - 1].capacity); } if (!isalnum(ch) && in_word) { // end of current word; words[sz - 1].signs[words[sz - 1].sz] = '\0'; in_word = false; continue; } else if (!isalnum(ch) && !in_word) { // not in a word so dont bother doing sth continue; } else if (isalnum(ch) && !in_word) { // the next word is reached in_word = true; if (sz == capacity) { // maximum size of words reached reserve_space_for_words(&words, &capacity); // reserve space for first word } ++sz; // iterate to the new word words[sz - 1].capacity = 0; words[sz - 1].sz = 0; reserve_space_for_letter(&words[sz - 1].signs, &words[sz - 1].capacity); } words[sz - 1].signs[words[sz - 1].sz] = ch; // append the sign in the array ++words[sz - 1].sz; } qsort(words, sz, sizeof(struct Word), cmp); print_words(words, sz); deallocate_words(words, sz); getchar(); return 0;}- \$\begingroup\$one thing i found myself i use
strcmpbut i didnt included#include <string.h>the c compiler of visual studio 2017 didnt cared. i turned it to cpp and got more warnings. is it a good practice to do (just for debug turn to c++) that because c++ catches more errors even in c?\$\endgroup\$Sandro4912– Sandro49122018-06-20 18:33:01 +00:00CommentedJun 20, 2018 at 18:33
3 Answers3
Bug
If the last word is followed byEOF, then anull character is not append nor space made for it.words[sz - 1] will not be astring andstrcmp() isundefined behavior.
Simplifymain()
Suggest a re-write of themain() loop, (which should be in a helper function).
// while ((ch = getc(stdin)) != EOF) {int ch = ' '; // dummy non-isalnum characterwhile (ch != EOF) { while (!isalnum(ch)) && ch != EOF) { ch = getc(stdin); } if (ch == EOF) break; // Beginning of new word found, allocate for it. new_word(...); while (isalnum(ch))) { append_letter(ch, ...); ch = getc(stdin); } append_letter('\0', ...);}Simplifications possible if your care to usedo {} while loops.
do { do { ch = getc(stdin); } while (!isalnum(ch)) && ch != EOF); if (ch == EOF) break; // Beginning of new word found, allocate for it. new_word(...); do { append_letter(ch, ...); ch = getc(stdin); } while (isalnum(ch)); append_letter('\0', ...);} while (ch != EOF);main() splits out thewords aswords , sz, capacity and has a lot of code for reading. It has simple calls for sorting and printing. I'd expect amain() more along this:
int main(void) { Words word_list = { 0 }; if (Words_Read(&word_list)) { Words_Sort(&word_list); Words_Print(&word_list); } Words_Free(&word_list);}isalpha() vs.isalnum()
Unclear why code accepts digits for "words". I'd expectisalpha() or at least a comment about digits in words.
sizeof type vs. sizeof object
Using the size of the object rather than the size of the type is easier to code right, review and maintain.
// *w = malloc(sizeof(struct Word) * ((*capacity)));*w = malloc(sizeof **w * (*capacity));// *c = malloc(sizeof(char) * ((*capacity)));*c = malloc(sizeof **c * (*capacity));// qsort(words, sz, sizeof(struct Word), cmp);qsort(words, sz, sizeof words[0], cmp);No need to copy the whole structure
// struct Word left = *(const struct Word *)lhs;const char *left = ((const struct Word *)lhs)->signs;...// return strcmp(left.signs, right.signs);return strcmp(left, right);- \$\begingroup\$thanks for the effort of revieing. I found maybe one issue.
qsort(words, sz, sizeof sz[0], cmp);doesnt seem to be a valid expression\$\endgroup\$Sandro4912– Sandro49122018-06-20 18:12:02 +00:00CommentedJun 20, 2018 at 18:12 - \$\begingroup\$it markes the '0' and gives
SeverityCodeDescriptionProjectFileLineSuppression State Error (active)E0142expression must have pointer-to-object type\$\endgroup\$Sandro4912– Sandro49122018-06-20 18:34:23 +00:00CommentedJun 20, 2018 at 18:34 - \$\begingroup\$yes now it works as intended\$\endgroup\$Sandro4912– Sandro49122018-06-20 19:14:33 +00:00CommentedJun 20, 2018 at 19:14
When allocating an array to hold arbitrary numbers of things, consider usingcalloc andreallocarray (reallocarray is not standard C, but there isn't a standard analog tocalloc).
The reason is thatcalloc will fail if you try to allocate more memory thansize_t can hold. If, for example on a 32-bit system, you decide you want to allocate 500 million of some struct, you could do this:
typedef struct { char a; } little_struct_t;...ptr = malloc(500000000 * sizeof(little_struct_t));and it might work. But later, when you decide that your struct actually needs more stuff, you'll end up with something like this:
typedef struct { char a; int b; bool c } little_struct_t;...ptr = malloc(500000000 * sizeof(little_struct_t));and boom, you've just caused a multiplication overflow without thinking about it. If instead you had usedcalloc, you'd either see the error when you checked thatptr was notNULL (see my next comment) or a happy segfault when you try to dereferenceptr. Instead, now you have a 1705032704-byte ((sizeof(little_struct_t) * half a billion) & 0xffffffff)) chunk of memory, and if you're lucky you'll get a segfault when you write off the end of it. If you're not lucky, you'll corrupt a bunch of other memory before your program crashes.
My second comment is that you should alwaysalways check the return value ofmalloc. If it's notNULL, no harm done. If it is, either you want to crashright now or you want to take a different path. Whatever happens, you don't want your program to happily keep chugging along as though nothing's wrong.
Third, and this isn't super relevant to your example (though I would apply it indeallocate_words after thefree call inside the loop), but it's good practice to null out your pointers after you free them. I think I've spent too many words already, so I'll just say that it will simplify debugging and get rid of double free issues in the future.
- \$\begingroup\$thank you very much for youre review. So in conclusion its always better to use
callocfor user defined data structures? when to usemallocthen?\$\endgroup\$Sandro4912– Sandro49122018-06-19 20:30:18 +00:00CommentedJun 19, 2018 at 20:30 - \$\begingroup\$If you're only allocating space for 1 instance of a struct, not an array of them, and you don't care about having your memory zeroed out, use
malloc. In general, if you're multiplying inside themalloccall, consider usingcalloc.\$\endgroup\$nmichaels– nmichaels2018-06-19 20:49:34 +00:00CommentedJun 19, 2018 at 20:49 - \$\begingroup\$i tryed to find the reallocarray function but i couldnt find a source for it. is it available already in visual studio 2017?\$\endgroup\$Sandro4912– Sandro49122018-06-20 17:30:00 +00:00CommentedJun 20, 2018 at 17:30
- \$\begingroup\$Sorry,
reallocarrayis a Linux/glibc thing. I don't know about windows libraries.\$\endgroup\$nmichaels– nmichaels2018-06-20 17:39:08 +00:00CommentedJun 20, 2018 at 17:39
Put the logic where it belongs. The main loop should only adds characters to the word, while
Wordshall manage itself, e.g.:if (isalnum(ch)) { add_character(word, ch);}A possible implementation would be
add_character(struct Word * word, char ch) { if (word->size == word->capacity) { expand_capacity(word); } word->signs[word->size++] = ch;}The same applies to the list of words.
As a side note, they logic of
reserve_space_for_letterandreserve_space_for_wordsis identical, and it may be worthy of effort to unify expansion mechanism along the lines ofvoid * expand_array(void * array, size_t capacity, size_t element_size);The main logic could be streamlined:
do { struct Word * word = get_empty_word_from_list(....); while (ch = getchar() && isalnum(ch) { add_character(word, ch); } add_character(word, '\0'); while ((ch = getchar()) != EOF && !isalnum(ch)) { ; }} while (ch != EOF);Of course both inner loops would benefit if factored out into functions, say
read_wordandskip_non_word.
- \$\begingroup\$1) The "streamlined" version looks like it loses letters. When
while ((ch = getchar()) != EOF && !isalnum(ch))stops due to a letter like'e', that'e'is lost. Perhapsunget()or something to fix? 2) if 1st charter is a non-alnum, code will make word"". Yet you are very correct that themain()loop needs improvement.\$\endgroup\$chux– chux2018-06-20 04:11:07 +00:00CommentedJun 20, 2018 at 4:11 - \$\begingroup\$@chux Agreed. It was somewhat half baked. I should be clear that it is not a working code (CR is not about that) but a guideline.\$\endgroup\$vnp– vnp2018-06-20 05:23:09 +00:00CommentedJun 20, 2018 at 5:23
- \$\begingroup\$i tryed the generalisation to
expand arraybut is it possible? it always gives a corrput heap with realloc. malloc seems to kinda work. I do this: Giving the array to the functionword_list->words = (struct Word*) expand_array(&word_list->words, &word_list->capacity, sizeof(struct Word));then malloc and looks like this:array = malloc(element_size * (*capacity));array = realloc(array, element_size * (*capacity));I changed the parameter capacity to a pointer to be able to return it by reference.\$\endgroup\$Sandro4912– Sandro49122018-06-20 18:52:54 +00:00CommentedJun 20, 2018 at 18:52
You mustlog in to answer this question.
Explore related questions
See similar questions with these tags.

