Back to Dashboard

Strings

A sequence of characters, usually terminated by a null character in C or stored as an object in C++.

Operations & Variations

Core Concepts

What is a String?

String Representation

A String is a sequence of characters, primarily used to represent text. Under the hood, a string is generally implemented as an array of characters. It is one of the most widely used data structures across all programming domains, from basic standard output to complex text-processing algorithms.


Memory Layout

In C, a string is stored as a contiguous array of characters, ending with a special null-terminator character ('\0'). This terminator lets functions know where the string ends.

Index:   [ 0 ][ 1 ][ 2 ][ 3 ][ 4 ][ 5 ]
Char:    [ H ][ e ][ l ][ l ][ o ]['\0' ]
Address: [100][101][102][103][104][105]   (1 byte per char)

In modern high-level languages like C++, Java, and Python, strings are often implemented as objects. These objects manage the underlying character array dynamically, keeping track of the string's length without needing a null-terminator, and providing safety against buffer overflows.


Core Operations & Time Complexities

OperationTime ComplexityNotes
Access by indexO(1)Like an array, direct access to the character
Search (Naive)O(N * M)N=text length, M=pattern length
Search (Optimized)O(N + M)KMP or Rabin-Karp algorithms
ConcatenationO(N + M)Copying both strings into a new one
SubstringO(K)Extracting K characters
LengthO(1) or O(N)O(1) in C++/Java (stored), O(N) in C (counted)

Space Complexity: O(N) — proportional to the number of characters.


Mutability

A critical concept with strings is mutability.

1. Mutable Strings

In C and C++, strings can be modified in place. Changing a character does not require allocating new memory.

std::string s = "hello";
s[0] = 'H'; // String is now "Hello"

2. Immutable Strings

In languages like Java, Python, and JavaScript, strings are immutable. Once created, a string's content cannot be changed. Operations like concatenation or replacing characters create entirely new string objects in memory.

String s = "hello";
s = s.toUpperCase(); // Creates a new string "HELLO"

Immutability provides security and thread-safety but can lead to performance overhead if many string manipulations are performed in a loop (which is why classes like StringBuilder exist).


Key Algorithmic Patterns

Mastering strings involves many of the same patterns as arrays, plus some text-specific ones:

Two Pointers

Used for problems like reversing a string or checking if a string is a palindrome.

Example: Checking if a string reads the same forwards and backwards.

Sliding Window

Ideal for finding substrings that meet certain criteria.

Example: Longest substring without repeating characters.

Hashing / Frequency Counting

Using a hash map or an array of size 256 (or 26 for alphabets) to count character occurrences.

Example: Checking if two strings are anagrams of each other.

Tries (Prefix Trees)

A specialized tree structure used for fast string retrieval and prefix matching.

Example: Autocomplete systems or spell checkers.


Advanced String Algorithms

When performance is critical, specialized algorithms are used for pattern matching:

  • Knuth-Morris-Pratt (KMP): Uses a prefix table (LPS) to avoid redundant comparisons, finding a pattern in O(N + M).
  • Rabin-Karp: Uses a rolling hash function to quickly eliminate non-matching substrings.
  • Z Algorithm: Finds all occurrences of a pattern by constructing a Z-array in O(N + M) time.

Real-World Applications

  • Text Processing & Compilers: Parsing code, spell checking, and formatting.
  • Search Engines & Databases: Indexing keywords and performing fast text searches.
  • Bioinformatics: DNA sequencing heavily relies on string matching algorithms to find genetic patterns (A, C, G, T).
  • Data Serialization: Formats like JSON, XML, and HTML are entirely string-based and require efficient parsing.
  • Cryptography: Encrypting and decrypting data involves complex string and byte-array manipulations.

Interview Tips 💡

  1. Ask about the character set — is it just lowercase English letters (a-z)? ASCII? Unicode? This determines if you can use a fixed-size array (like int count[26]) for frequency counting.
  2. Watch out for immutability. If you are building a string dynamically in a loop in Java/Python, use a StringBuilder or a list to avoid O(N²) time complexity from repeated copying.
  3. Empty strings ("") and null values are common edge cases. Always handle them first!
  4. For anagram problems, sorting strings takes O(N log N), but frequency counting takes O(N) time and O(1) space.