C Strings: Character Arrays, Null Terminator & Beyond (2026)

What Are Strings in C?

C does not have a built-in string type. A “string” in C is simply a character array terminated by a null character ('\0'). That is it — no objects, no methods, no automatic memory management. Every string operation you perform is ultimately manipulating an array of char values.

If you have completed our lesson on C Arrays, you already understand the underlying data structure. Strings add one critical rule: the last meaningful character must be followed by '\0' (ASCII value 0) so that functions know where the string ends.

This design is simple, fast, and gives you total control, but it also means you are responsible for managing memory, tracking sizes, and preventing buffer overflows. Mishandled C strings have been a leading source of security vulnerabilities for decades.

The Null Terminator

The null terminator '\0' is the invisible sentinel that marks the end of a string:

char greeting[] = "Hello";
// Memory: ['H']['e']['l']['l']['o']['\0']
// Indices:  0    1    2    3    4     5
// Array size: 6 (5 characters + null terminator)

C string functions such as printf (with %s), strlen, and strcpy scan forward until they find '\0'. If the null terminator is missing, the function reads past the end of your array into whatever memory happens to follow. This is undefined behavior and a common source of bugs.

char bad[5] = {'H', 'e', 'l', 'l', 'o'};  // NOT a valid string — no \0!
printf("%s\n", bad);  // reads past the array — undefined behavior

char good[6] = {'H', 'e', 'l', 'l', 'o', '\0'};  // valid string
printf("%s\n", good);  // prints "Hello" then stops
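
When a buffer comes from code you do not control, it can help to confirm that a terminator exists before treating the buffer as a string. The sketch below shows one way to do that with the standard memchr function. The helper name is_terminated is hypothetical, not a library function.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if buf holds a '\0' within its first
   size bytes, 0 otherwise. It never reads beyond buf[size - 1]. */
int is_terminated(const char *buf, size_t size) {
    return memchr(buf, '\0', size) != NULL;
}
```

With the two arrays above, is_terminated(bad, sizeof bad) returns 0 and is_terminated(good, sizeof good) returns 1.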

Declaring and Initializing Strings

Using String Literals (Most Common)

char name[] = "Alice";        // compiler allocates 6 bytes (5 + \0)
char city[20] = "New York";   // 20 bytes allocated, 9 used (8 + \0)
char empty[10] = "";          // empty string: just \0, rest is zeros

When you use double quotes, the compiler automatically appends '\0'. This is the preferred way to initialize strings.
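
The difference between the bytes allocated and the bytes used can be checked in code. This is a minimal sketch, assuming the buffer really is null-terminated within its size. The helper name spare_bytes is made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns how many bytes of a size-byte buffer are
   not occupied by the string and its terminator.
   Assumes buf contains a '\0' somewhere within its first size bytes. */
size_t spare_bytes(const char *buf, size_t size) {
    return size - (strlen(buf) + 1);   /* +1 accounts for the '\0' */
}
```

For city above, spare_bytes(city, sizeof city) is 11 (20 allocated minus 9 used). For name it is 0, because the compiler sized the array to fit exactly.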

Character-by-Character

char manual[6];
manual[0] = 'H';
manual[1] = 'e';
manual[2] = 'l';
manual[3] = 'l';
manual[4] = 'o';
manual[5] = '\0';  // MUST add this!

String Pointer vs Array

char arr[] = "Hello";    // mutable array on the stack
char *ptr = "Hello";     // pointer to read-only string literal

arr[0] = 'J';   // OK — "Jello"
// ptr[0] = 'J';  // UNDEFINED BEHAVIOR — string literals are read-only

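This distinction is critical. char arr[] creates a modifiable copy of the literal. char *ptr points at the string literal itself, which compilers typically place in read-only memory. Attempting to modify a string literal through a pointer is undefined behavior: it may crash or silently corrupt data. Declaring such pointers as const char * lets the compiler reject accidental writes.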
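
When you need to change text that starts out as a literal, the usual pattern is to copy it into a writable buffer first. The following is only a sketch of that pattern. The function name capitalize_copy and its return convention are assumptions made for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst and uppercases the first
   character of the copy. Returns 0 on success, -1 if dst is too small.
   src is const, so passing a string literal is safe: only dst is written. */
int capitalize_copy(const char *src, char *dst, size_t dst_size) {
    size_t len = strlen(src);
    if (len + 1 > dst_size) {
        return -1;                      /* no room for the text plus '\0' */
    }
    memcpy(dst, src, len + 1);          /* len + 1 copies the terminator too */
    dst[0] = (char)toupper((unsigned char)dst[0]);
    return 0;
}
```

Calling capitalize_copy("hello", buf, sizeof buf) with a 16-byte buf leaves "Hello" in buf and never touches the literal.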

String Input and Output

Output with printf and puts

char msg[] = "Hello, World!";

printf("%s\n", msg);       // format specifier for strings
puts(msg);                 // prints string + newline automatically
printf("%.5s\n", msg);     // print first 5 chars: "Hello"
printf("%-20s|\n", msg);   // left-aligned, padded to 20 chars

Input with scanf

char name[50];
printf("Enter your name: ");
scanf("%49s", name);  // reads one word, max 49 chars + \0
printf("Hello, %s\n", name);

scanf with %s stops at the first whitespace. It cannot read full sentences. Also note the 49 width limit — without it, a long input overflows the buffer. As discussed in C Input & Output, always limit scanf input.
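
The same width rule can be tried without typing any input by using sscanf, which parses a string instead of stdin but treats %s the same way. This is a small sketch. The helper first_word and its fixed 10-byte buffer are assumptions chosen for the example.

```c
#include <stdio.h>

/* Hypothetical helper: reads the first whitespace-delimited word of
   input into word, which must have room for at least 10 bytes.
   Returns 1 if a word was read, 0 otherwise. */
int first_word(const char *input, char word[10]) {
    /* %9s stores at most 9 characters, then appends '\0' */
    return sscanf(input, "%9s", word) == 1;
}
```

Given "Ada Lovelace" it stores "Ada", because %s stops at the space. Given "Supercalifragilistic" it stores only "Supercali", because the width limit cuts the word at 9 characters.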

Input with fgets (Safer)

char line[100];
printf("Enter a sentence: ");
if (fgets(line, sizeof(line), stdin) != NULL) {
    // fgets reads until newline, end of input, or the buffer is full
    // It INCLUDES the newline character if one was read

    // Remove trailing newline
    size_t len = strlen(line);
    if (len > 0 && line[len - 1] == '\n') {
        line[len - 1] = '\0';
    }

    printf("You said: %s\n", line);
}

fgets is the safer way to read strings. It limits how many characters are read (preventing overflow) and can read spaces. It returns NULL at end of input or on error, so check the result before using the buffer. The one gotcha is that it keeps the newline character, so strip it if you don't want it.
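
Reading a line and stripping its newline is common enough to wrap in a function. The sketch below uses strcspn as a compact alternative to the strlen check. The name read_line and its return values are assumptions for this example, not a standard API.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line from in into buf (size bytes)
   and removes the trailing newline if fgets stored one.
   Returns 1 on success, 0 on end of input or read error.
   Assumes size is at least 2 and no larger than INT_MAX. */
int read_line(FILE *in, char *buf, size_t size) {
    if (fgets(buf, (int)size, in) == NULL) {
        return 0;
    }
    /* strcspn returns the index of the first '\n', or the string
       length if there is none, so this write is always in bounds */
    buf[strcspn(buf, "\n")] = '\0';
    return 1;
}
```

A typical call is read_line(stdin, line, sizeof line), followed by a check of the return value before line is used.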

Manual String Operations

Before diving into the standard library functions (next lesson), it is valuable to implement basic string operations yourself. This builds your understanding of how strings really work:

String Length

size_t my_strlen(const char *s) {
    size_t len = 0;   // size_t (from <stddef.h>) matches the standard strlen
    while (s[len] != '\0') {
        len++;
    }
    return len;
}

String Copy

void my_strcpy(char *dest, const char *src) {
    int i = 0;
    while (src[i] != '\0') {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';  // don't forget the terminator!
}

String Compare

int my_strcmp(const char *a, const char *b) {
    while (*a && *a == *b) {
        a++;
        b++;
    }
    // compare as unsigned char, as the standard strcmp does; keep const in the casts
    return *(const unsigned char *)a - *(const unsigned char *)b;
    // returns 0 if equal, negative if a < b, positive if a > b
}

String Concatenate

void my_strcat(char *dest, const char *src) {
    // Find end of dest
    while (*dest) dest++;
    // Copy src to end
    while (*src) {
        *dest = *src;
        dest++;
        src++;
    }
    *dest = '\0';
}

String Reverse

void reverse_string(char *s) {
    int len = 0;
    while (s[len]) len++;

    for (int i = 0; i < len / 2; i++) {
        char temp = s[i];
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = temp;
    }
}

String Literals vs Character Arrays

// String literal: usually stored in a read-only data section
"Hello"   // type: char[6] in C (const char[6] in C++); modifying it is undefined behavior

// Character array — modifiable, on the stack
char arr[] = "Hello";   // type: char[6], copy of the literal

// Single character — NOT a string
char c = 'A';           // type: char (1 byte)
// 'A' != "A"
// 'A' is the integer 65
// "A" is a char array: {'A', '\0'} — 2 bytes

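The distinction between single quotes (character) and double quotes (string) is one of the most fundamental things in C. 'A' is an integer value (a character constant has type int in C, and 'A' equals 65 in ASCII). "A" is a two-byte array that decays to a pointer to its first element in most expressions. Mixing them up causes type errors or subtle bugs.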
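
The practical consequence shows up whenever you inspect a string one element at a time. This short sketch uses a made-up helper, count_char, to show the comparison.

```c
#include <stddef.h>

/* Hypothetical helper: counts how many times target occurs in s.
   Each s[i] is a single char, so it is compared with ==
   against a character constant such as 'a', never against "a". */
size_t count_char(const char *s, char target) {
    size_t count = 0;
    for (size_t i = 0; s[i] != '\0'; i++) {
        if (s[i] == target) {
            count++;
        }
    }
    return count;
}
```

count_char("banana", 'a') returns 3. Writing count_char("banana", "a") instead passes a pointer where a char is expected, which compilers diagnose.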

Arrays of Strings

2D Character Array

char names[4][20] = {
    "Alice",
    "Bob",
    "Charlie",
    "Diana"
};

for (int i = 0; i < 4; i++) {
    printf("%s\n", names[i]);
}

Each row is a fixed 20-byte buffer, even if the name is shorter. This wastes memory but is simple.

Array of Pointers (More Flexible)

const char *fruits[] = {
    "Apple",
    "Banana",
    "Cherry",
    "Date"
};
int count = sizeof(fruits) / sizeof(fruits[0]);

for (int i = 0; i < count; i++) {
    printf("%s\n", fruits[i]);
}

Each pointer points to a string literal of a different length, so no row is padded to a fixed width (the array itself stores only one pointer per string). But the strings are read-only since they are literals.
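
A common task with an array of pointers is looking up an entry by its contents. This is a minimal linear-search sketch. find_string is a hypothetical name, and returning -1 for "not found" is just one possible convention.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns the index of target in items,
   or -1 if no element has the same contents. */
int find_string(const char *const items[], size_t count, const char *target) {
    for (size_t i = 0; i < count; i++) {
        if (strcmp(items[i], target) == 0) {   /* compare contents, not addresses */
            return (int)i;
        }
    }
    return -1;
}
```

With the fruits array above, find_string(fruits, 4, "Cherry") returns 2 and find_string(fruits, 4, "Fig") returns -1.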

Command-Line Arguments

int main(int argc, char *argv[]) {
    printf("Program: %s\n", argv[0]);
    for (int i = 1; i < argc; i++) {
        printf("Arg %d: %s\n", i, argv[i]);
    }
    return 0;
}

argv is an array of strings — the command-line arguments passed to your program. This is one of the most practical uses of string arrays.
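
Because each argv[i] is an ordinary C string, checking for an option comes down to strcmp. The sketch below assumes a hypothetical helper named has_flag. Real programs often use a dedicated option parser instead.

```c
#include <string.h>

/* Hypothetical helper: returns 1 if flag appears among the arguments
   after the program name, 0 otherwise. */
int has_flag(int argc, char *argv[], const char *flag) {
    for (int i = 1; i < argc; i++) {      /* start at 1 to skip argv[0] */
        if (strcmp(argv[i], flag) == 0) {
            return 1;
        }
    }
    return 0;
}
```

Inside main, a call such as has_flag(argc, argv, "-v") reports whether the user passed -v anywhere on the command line.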

Buffer Overflows and Safety

Buffer overflows are the most dangerous consequence of C’s string model. When you write more data into a buffer than it can hold, you overwrite adjacent memory:

char buffer[10];

// DANGEROUS — no bounds checking:
strcpy(buffer, "This is way too long for the buffer");
// Writes 36 bytes (35 characters + \0) into a 10-byte buffer → undefined behavior

// SAFE alternatives:
strncpy(buffer, "This is way too long", sizeof(buffer) - 1);
buffer[sizeof(buffer) - 1] = '\0';  // ensure null termination

snprintf(buffer, sizeof(buffer), "This is way too long");
// snprintf automatically null-terminates and never overflows

Buffer overflow vulnerabilities have been exploited in real-world attacks for decades, from the Morris Worm (1988) to modern zero-day exploits. Prefer bounded functions (snprintf, fgets, or strncpy with explicit termination) in production code, and remember that strncpy does not add '\0' when the source fills the buffer.
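
Bounded copying prevents the overflow, but silent truncation can be a bug of its own. This sketch detects it through the return value of snprintf. The wrapper name copy_checked is an assumption for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: copies src into dst without overflowing.
   Returns 1 if the whole string fit, 0 if it was truncated or
   snprintf reported an error. Assumes dst_size is at least 1. */
int copy_checked(char *dst, size_t dst_size, const char *src) {
    int needed = snprintf(dst, dst_size, "%s", src);
    /* snprintf returns the length the full output would have had,
       so a value >= dst_size means some characters were dropped */
    return needed >= 0 && (size_t)needed < dst_size;
}
```

With a 10-byte buffer, copy_checked(buffer, sizeof buffer, "This is way too long") returns 0 and leaves the terminated prefix "This is w" in buffer, so the caller can decide how to handle the cut-off text.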

Common Mistakes

Mistake 1: Forgetting the Null Terminator

char name[5] = {'A', 'l', 'i', 'c', 'e'};
printf("%s\n", name);  // reads past array — add '\0'!

Mistake 2: Comparing Strings with ==

char a[] = "hello";
char b[] = "hello";

if (a == b) { }  // WRONG — compares addresses, not content
if (strcmp(a, b) == 0) { }  // CORRECT

Mistake 3: Modifying String Literals

char *s = "Hello";
s[0] = 'J';  // undefined behavior, often a crash

Mistake 4: Not Allocating Space for \0

char buf[5];
strcpy(buf, "Hello");  // needs 6 bytes (5 + \0), buffer is only 5!

Mistake 5: Using scanf Without Width Limit

char name[20];
scanf("%s", name);     // no limit — overflow risk!
scanf("%19s", name);   // correct — max 19 chars + \0

Practice Exercises

  1. Palindrome Check: Write a function that checks if a string is a palindrome (reads the same forwards and backwards).
  2. Word Counter: Count the number of words in a sentence (words are separated by spaces).
  3. Caesar Cipher: Implement encryption and decryption with a shift of N positions.
  4. Vowel Counter: Count vowels and consonants in a string.
  5. Title Case: Convert a string to title case (first letter of each word uppercase, rest lowercase).

Summary

C strings are character arrays with a null terminator — nothing more, nothing less. This simplicity gives you speed and control, but demands vigilance about buffer sizes, null termination, and bounds checking. You now understand how to declare, initialize, read, and manipulate strings safely.

In the next lesson, we will explore the C standard library string functions (strlen, strcpy, strcat, strcmp, strstr, and more), the power tools that make working with C strings practical.
