A JMH Dive into Java String Concatenation

The String Concatenation Reality Check

Every developer has written this at least once:

String result = "";
for (String item : items) {
    result += item;
}

It works. It’s readable. It passes tests. And at 100 iterations, it’s fast enough.

But scale that to 10,000 items and the loop re-copies everything built so far on every iteration. With strings averaging 25 characters, that adds up to roughly a billion characters copied in total. What looked harmless becomes a performance bottleneck.

This is exactly why string concatenation makes a perfect micro-benchmark.

⚡ TL;DR (Quick Recap)

  • char[] and pre-sized StringBuilder are the fastest
  • StringBuilder is the best practical default
  • String.join() is clean and nearly as fast
  • +, .concat(), and .format() become hundreds to thousands of times slower at scale

Benchmark Setup

All implementations follow a simple contract:

String concat(List<String> items);

Test parameters:

  • Input sizes: 100 (baseline), 1,000 (real-world), 10,000 (stress test)
  • String length: 15–35 characters
  • JMH 1.37 on Java 25
  • JVM: -Xms1g -Xmx1g -XX:+UseG1GC
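
As a rough illustration of inputs matching these parameters, the sketch below generates a fixed-seed list of strings that are 15 to 35 characters long. The class name `BenchmarkInputs`, the alphabet, and the seed parameter are assumptions made for this example; the original benchmark may build its data differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Hypothetical helper, not taken from the benchmark repository.
final class BenchmarkInputs {
    private static final String ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789";

    // Builds `count` pseudo-random strings, each 15 to 35 characters long (inclusive).
    static List<String> generate(int count, long seed) {
        Random random = new Random(seed); // a fixed seed keeps runs comparable
        List<String> items = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            int length = 15 + random.nextInt(21); // nextInt(21) yields 0..20, so 15..35
            StringBuilder sb = new StringBuilder(length);
            for (int j = 0; j < length; j++) {
                sb.append(ALPHABET.charAt(random.nextInt(ALPHABET.length())));
            }
            items.add(sb.toString());
        }
        return items;
    }
}
```

Calling `BenchmarkInputs.generate(10_000, 42L)` would produce the stress-test size. Generating the list once, outside the measured method, keeps data creation out of the timings.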

The Naive Trap: +, .concat(), .format()

result = result + item;
// Java 9+ compiles this to an invokedynamic call site bootstrapped by StringConcatFactory.
// Inside a loop, the compiler does NOT fuse concatenations across iterations.
// Each iteration still runs the concatenation and produces a new String.
// The O(n²) copying and allocation behavior is preserved at loop scale.

Looks innocent. It isn’t.

What actually happens:

  • Each iteration creates a new String
  • Copies previous content
  • Discards the old object
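
The cost of those three steps can be estimated with a simple model that counts copied characters. The sketch below is an approximation under stated assumptions: it ignores object headers, GC work, and JIT effects, and the class name `CopyCost` is made up for this example.

```java
import java.util.List;

// Hypothetical cost model: counts characters copied, nothing else.
final class CopyCost {
    // `result += item` copies the old result plus the new item on every iteration.
    static long naive(List<String> items) {
        long copied = 0;
        long currentLength = 0;
        for (String item : items) {
            currentLength += item.length();
            copied += currentLength; // the new String holds everything built so far
        }
        return copied;
    }

    // Writing each item exactly once into a correctly sized buffer.
    static long singlePass(List<String> items) {
        long copied = 0;
        for (String item : items) {
            copied += item.length();
        }
        return copied;
    }
}
```

For 10,000 items of 25 characters each, `naive` returns 1,250,125,000 and `singlePass` returns 250,000, a gap of about 5,000×. The model is quadratic in the number of items, which is what the timings below reflect.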

Results (N=10,000):

  • + operator: ~28,000,000 ns/op
  • String.concat(): similar
  • String.format(): even worse (~80M–190M ns/op)

These approaches scale catastrophically.

The Workhorse: StringBuilder

StringBuilder sb = new StringBuilder();
for (String item : items) {
    sb.append(item);
}

Why it works:

  • Mutable buffer
  • Avoids repeated allocations
  • Grows dynamically

Results (N=10,000):

  • Stable O(n)
  • ~44,000 ns/op

This is the default choice for almost every real-world scenario.

The Optimization: Pre-Sized StringBuilder

StringBuilder sb = new StringBuilder(totalLength);

What changes:

  • No internal resizing
  • No array copying

Results (N=10,000):

  • ~40,000 ns/op
  • Roughly 10% faster than the default builder at this size (~40,000 vs. ~44,000 ns/op)

When to use: You know (or can estimate) total size. Small tweak, measurable gain.
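
A minimal sketch of the full method, assuming the total length is computed with an extra pass over the list (the class name `PreSizedConcat` is hypothetical):

```java
import java.util.List;

// Hypothetical implementation of the concat(List<String>) contract.
final class PreSizedConcat {
    static String concat(List<String> items) {
        // First pass: measure, so the builder never has to grow.
        int totalLength = 0;
        for (String item : items) {
            totalLength += item.length();
        }

        // The constructor argument is an initial capacity, not a length.
        StringBuilder sb = new StringBuilder(totalLength);

        // Second pass: append into the already-sized buffer.
        for (String item : items) {
            sb.append(item);
        }
        return sb.toString();
    }
}
```

The extra pass over the list is cheap compared with the array copies it avoids. If the exact size is unknown, passing a reasonable estimate as the capacity still cuts down the number of resizes.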

The Speed King: char[]

public String concat(List<String> items) {
    int totalLength = 0;
    for (String item : items) {
        totalLength += item.length();
    }

    char[] buffer = new char[totalLength];
    int position = 0;
    for (String item : items) {
        int len = item.length();
        item.getChars(0, len, buffer, position);
        position += len;
    }

    return new String(buffer);
}

Why it wins:

  • Single allocation
  • No intermediate objects
  • No resizing

Results (N=10,000):

  • ~34,000 ns/op (fastest across all datasets)

Trade-offs:

  • Verbose
  • Error-prone
  • Harder to maintain

Use only when profiling proves you need it. Manual char[] manipulation carries two real risks:

  1. Index arithmetic errors are easy to introduce and hard to debug.
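  2. Code that slices or indexes the buffer by char position can split a Unicode surrogate pair. Copying whole strings with getChars keeps pairs intact, but a single off-by-one cut through an emoji or rare CJK character silently produces corrupted output.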
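
To see where surrogate pairs are actually at risk, here is a minimal sketch. The class and method names are hypothetical and not part of the benchmark code. It copies two whole strings into a char[] and, separately, checks whether a given cut position would land between the two halves of a pair.

```java
// Hypothetical demo class for surrogate-pair handling in char[] code.
final class SurrogateCheck {
    // Copies both strings whole into one char[] and builds the result.
    static String joinWhole(String a, String b) {
        char[] buffer = new char[a.length() + b.length()];
        a.getChars(0, a.length(), buffer, 0);
        b.getChars(0, b.length(), buffer, a.length());
        return new String(buffer);
    }

    // True if cutting `s` at char index `index` would separate a surrogate pair.
    static boolean splitsPair(String s, int index) {
        return index > 0 && index < s.length()
                && Character.isHighSurrogate(s.charAt(index - 1))
                && Character.isLowSurrogate(s.charAt(index));
    }
}
```

For the string "a\uD83D\uDE00" (the letter a followed by one emoji, three chars in total), `joinWhole` keeps the emoji intact, while `splitsPair` reports that a cut at index 2 would separate its two halves. A guard like this is only needed once the code starts slicing the buffer at computed positions.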

The Clean Option: String.join()

return String.join("", items);

Results (N=10,000):

  • ~5–10% slower than StringBuilder

Why it matters:

  • Clean
  • Idiomatic
  • Perfect for most business code

This is the “clean enough + fast enough” sweet spot. Both StringBuilder and String.join() are aware of Compact Strings (JDK 9+) and work with the internal byte[] representation, so prefer them over hand-rolled buffers.

The Over-Engineered: Streams & Parallelism

items.stream().collect(Collectors.joining());

or

items.parallelStream().collect(Collectors.joining());

Results (N=10,000):

  • Streams: ~5–10% slower
  • Parallel: 2–3× slower

Parallelism does not help here.
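
Timing comparisons between these variants only mean something if they all produce the same string. A quick equivalence check along these lines could run before any benchmark; the class name `JoinVariants` and its method names are assumptions made for this sketch.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical side-by-side versions of the approaches discussed above.
final class JoinVariants {
    static String withBuilder(List<String> items) {
        StringBuilder sb = new StringBuilder();
        for (String item : items) {
            sb.append(item);
        }
        return sb.toString();
    }

    static String withJoin(List<String> items) {
        return String.join("", items);
    }

    static String withStream(List<String> items) {
        return items.stream().collect(Collectors.joining());
    }

    // A List-backed parallel stream is ordered, so joining keeps element order.
    static String withParallelStream(List<String> items) {
        return items.parallelStream().collect(Collectors.joining());
    }

    // True when every variant returns exactly the same string.
    static boolean allAgree(List<String> items) {
        String expected = withBuilder(items);
        return expected.equals(withJoin(items))
                && expected.equals(withStream(items))
                && expected.equals(withParallelStream(items));
    }
}
```

The parallel version still has to build partial strings per chunk and then merge them, which adds copying and coordination that a sequential append never pays for.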

The Misfit: ByteBuffer

public String concat(List<String> items) {
    int totalBytes = 0;
    byte[][] encoded = new byte[items.size()][];

    for (int i = 0; i < items.size(); i++) {
        encoded[i] = items.get(i).getBytes(StandardCharsets.UTF_8);
        totalBytes += encoded[i].length;
    }

    ByteBuffer buffer = ByteBuffer.allocate(totalBytes);
    for (byte[] bytes : encoded) {
        buffer.put(bytes);
    }

    return new String(buffer.array(), StandardCharsets.UTF_8);
}

Problem:

  • Requires UTF-8 encoding/decoding
  • Extra conversion step

Results (N=10,000):

  • ~2× slower than StringBuilder

Only useful in I/O-heavy pipelines — not for pure string work.

The Real Insight: It’s About Growth, Not Speed

At small scale (N=100):

  • Everything looks fine
  • Even bad practices “work”

At medium scale (N=1,000):

  • Differences become visible

At large scale (N=10,000):

  • Bad approaches collapse
  • Good ones remain stable

This isn’t about nanoseconds — it’s about how your algorithm scales.

Final Takeaways

String concatenation in Java is a perfect example of how “it works” is not the same as “it scales.” What looks clean and harmless in a small loop can quietly introduce serious performance issues once the input grows. The biggest trap isn’t writing incorrect code — it’s writing code that behaves correctly, but inefficiently under real workloads.

  • Simple code can hide serious performance pitfalls
  • O(n²) patterns are invisible — until they aren’t
  • StringBuilder remains the most reliable tool in Java
  • String.join() is the clean option when readability matters most
  • Micro-benchmarks reveal truths intuition often misses

You can find all the code on GitHub.

Originally posted on marconak-matej.medium.com.