C, C#, Go, TS via WebAssembly

This article compares the performance of a brute-force solution to the traveling salesman problem as implemented in C, C#, and Go compiled to WebAssembly, versus a TypeScript implementation compiled to JavaScript. Feel the need to try it immediately yourself? Skip to the results!

The Contenders

C

Dennis Ritchie designed C in 1972 and it’s still hugely popular today, though I believe it’s more effective than it is beloved. I’ve heard Python, the most popular language, described as just a user-friendly way to execute libraries written in C. C happens to be the second programming language I learned after Visual Basic.

C#

For a long time C# suffered from a “Windows-only” stigma (even CoPilot suggested that phrase) which left it underrated and mostly unnoticed in the broader software development world. Since 2014, when Microsoft open sourced .NET Core, Microsoft and the open-source community have been unafraid to evolve the language through an open process (the meeting notes are great). Dare I say that through this approach, C# has left Java behind?

Go

The Go developer experience around testing, benchmarking, and profiling is unmatched, and the way packages are source all the way down facilitates understanding, learning, and performance. The way Go interfaces are implemented implicitly by any type with the required methods often leads to cleaner code. The Go team at Google and the Go community have been dependably improving the language and runtime of Go over time.

TypeScript

Anders Hejlsberg designed TypeScript in 2012; in my imagination after becoming enraged by the idiosyncrasies of JavaScript while implementing some random side project (2012’s weather.bingo?) TypeScript became a platform to implement features (classes, async/await, etc.) then drive them into JavaScript itself. Of the languages used this article, TypeScript is the only non-WebAssembly approach.

The Traveling Salesman Problem (TSP)

The TSP is a famous, easy-to-understand problem — given a set of cities and the distances between them, what is the shortest path that visits each city exactly once and returns to the starting city? Variants of this problem describe tons of practical real-world situations and it has been intensely studied for decades.

The brute-force algorithm for solving TSP has $O(n!)$ time complexity. This makes it highly impractical even for, say, 20 cities, which would take years to compute.

Better algorithms like the Held–Karp algorithm can achieve $O(n^{2}2^{n})$ time complexity at the expense of $\Theta(n2^{n})$ space complexity.

For this article, I’ll just use a naive brute-force approach and limit the number of cities to 10. The algorithm will generate all permutations of cities and pick one with the lowest total distance. While it’s a brute-force approach, it will use one optimization which is to generate the permutations in lexicographic order and reuse distance sums when possible (so, for cities A, B, C, D, E, the next permutation will be A, B, C, E, D, and the sum of distances from A to B to C will be reused.)

The Input

We’ll provide the input as a 2D array of distances between cities, of length $n^2$ . For example:

	A	B	C	D
A	000	150	100	100
B	150	000	100	100
C	100	100	000	150
D	100	100	150	000

Here the distance from A to B is the same as the distance from B to A so this table contains some redundant information, justified because lookups are simpler.

The Output

Each function will output the length of the shortest path and shortest path itself as an array of indices. For example, if an optimum route here is A->C->B->D, the output would be [0, 2, 1, 3], and the length would be 400.

Results

Clicking the button below will run the brute-force solution for a 10-city TSP problem 10 times for each language. These are executed one-at-a-time so they don’t interfere with each other on machines with few cores. The total time taken for each language is displayed below.

⚠️ Warning: Benchmarking is hard! Make sure your machine is idle, not overheated, and you don’t have the developer tools open (that slows things down significantly and affects relative performance of WebAssembly versus TypeScript). My recommendation is to run it multiple times and count the first few as warm-up.

Your Results

Language	Time (ms)
C	0
C# (NativeAOT-LLVM)	0
C# Unsafe (NativeAOT-LLVM)	0
Go (TinyGo)	0
TypeScript	0

My Results

Gathered on an AMD Ryzen 7 7735HS machine running Windows 11.

Language	Time (ms)
C	214
C# (NativeAOT-LLVM)	290
C# Unsafe (NativeAOT-LLVM)	256
Go (TinyGo)	413
TypeScript	429

For further comparison, natively-compiled versions of both the C# and Go code finished on this machine within ~50ms or so of their WebAssembly counterparts. From what I’ve read and previously observed, a 20-30% slowdown compared to native is typical.

Remarks

Using higher-level languages like Go and C# compiled to WebAssembly can have an advantage over JavaScript. C achieves the best performance here, but not by a huge margin.

You might wonder - why NativeAOT-LLVM and not Blazor? I previously wrote about NativeAOT-LLVM and the same reasons still apply. In addition, it has gotten even better now: this .wasm file is about 700 KiB uncompressed.

Why TinyGo and not Go? I tried both, and Go produced a 1.5 MiB .wasm instead of TinyGo’s 135 KiB .wasm, and took ~3x the time to execute, so I didn’t bother including it here. TinyGo remains the champion for the WebAssembly use case for now.

If you have any suggestions for improving the implementations in any of these languages, please let me know on X. With my next article I’ll move back toward more real-world practical topics 😂.

Code Listings

C#

public static long SolveTspBruteForce(
    ReadOnlySpan<int> input,
    Span<int> output)
{
    int n = output.Length;
    long best = long.MaxValue;

    // The 'a' span contains the current permutation of cities
    // Start with the lexicographically smallest permutation
    Span<int> a = stackalloc int[n];
    for (int i = 0; i < a.Length; i++)
    {
        a[i] = i;
    }

    // The 'p' span has the cumulative sum of the distances between cities
    Span<long> p = stackalloc long[n];
    for (int i = 1; i < p.Length; i++)
    {
        p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
    }

    while (true)
    {
        // Check the sum of the current permutation
        // This is the final entry in the cumulative sum span 'p' plus
        // the distance from the last city back to the first
        long sum = input[a[n - 1] * n + a[0]] + p[^1];
        if (sum < best)
        {
            best = sum;
            a.CopyTo(output);
        }

        // Advance the permutation to the next lexicographically larger
        int x = a.Length - 2;
        while (x >= 0 && a[x] >= a[x + 1]) x--;

        if (x < 0)
        {
            // We were already at the lexicographically largest permutation
            break;
        }

        int y = a.Length - 1;
        while (a[y] <= a[x]) y--;

        (a[x], a[y]) = (a[y], a[x]);
        for (int i = x + 1, j = n - 1; i < j; i++, j--)
        {
            (a[i], a[j]) = (a[j], a[i]);
        }

        // Update the cumulative sums based on how the 'a' span was modified
        if (x == 0)
        {
            x = 1;
        }
        for (int i = x; i < p.Length; i++)
        {
            p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
        }
    }

    return best;
}

C# Unsafe

private static unsafe long SolveTspBruteForceUnsafe(
    int* input,
    int* output,
    int n)
{
    long best = long.MaxValue;

    int* a = stackalloc int[n];
    for (int i = 0; i < n; i++)
    {
        a[i] = i;
    }

    long* p = stackalloc long[n];
    for (int i = 1; i < n; i++)
    {
        p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
    }

    while (true)
    {
        long sum = input[a[n - 1] * n + a[0]] + p[n - 1];
        if (sum < best)
        {
            best = sum;
            for (int i = 0; i < n; i++)
            {
                output[i] = a[i];
            }
        }

        int x = n - 2;
        while (x >= 0 && a[x] >= a[x + 1]) x--;

        if (x < 0)
        {
            break;
        }

        int y = n - 1;
        while (a[y] <= a[x]) y--;

        (a[x], a[y]) = (a[y], a[x]);
        Reverse(a + x + 1, n - x - 1);

        if (x == 0)
        {
            x = 1;
        }

        for (int i = x; i < n; i++)
        {
            p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
        }
    }

    return best;
}

private static unsafe void Reverse(int* start, int length)
{
    int* end = start + length - 1;
    while (start < end)
    {
        int temp = *start;
        *start = *end;
        *end = temp;
        start++;
        end--;
    }
}

Go

func solveTspBruteForce(input []int32, output []int32) int64 {
	n := int32(len(output))
	best := int64(math.MaxInt64)

	a := make([]int32, n)
	for i := 0; i < len(a); i++ {
		a[i] = int32(i)
	}

	p := make([]int64, n)
	for i := 1; i < len(p); i++ {
		p[i] = p[i-1] + int64(input[a[i-1]*n+a[i]])
	}

	for {
		sum := int64(input[a[n-1]*n+a[0]]) + p[len(p)-1]
		if sum < best {
			best = sum
			copy(output, a)
		}

		x := len(a) - 2
		for x >= 0 && a[x] >= a[x+1] {
			x--
		}

		if x < 0 {
			break
		}

		y := len(a) - 1
		for a[y] <= a[x] {
			y--
		}

		a[x], a[y] = a[y], a[x]
		reverse(a[x+1:])

		if x == 0 {
			x = 1
		}
		for i := x; i < len(p); i++ {
			p[i] = p[i-1] + int64(input[a[i-1]*n+a[i]])
		}
	}

	return best
}

func reverse(a []int32) {
	for i, j := 0, len(a)-1; i < j; i, j = i+1, j-1 {
		a[i], a[j] = a[j], a[i]
	}
}

C

int64_t solve_tsp_brute_force(const int32_t *input, int32_t *output, int32_t n)
{
    int64_t best = INT64_MAX;
    int a[n];
    for (int i = 0; i < n; i++)
    {
        a[i] = i;
    }

    int64_t p[n];
    p[0] = 0;
    for (int i = 1; i < n; i++)
    {
        p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
    }

    while (true)
    {
        long sum = input[a[n - 1] * n + a[0]] + p[n - 1];
        if (sum < best)
        {
            best = sum;
            memcpy(output, a, n * sizeof(int));
        }

        int x = n - 2;
        while (x >= 0 && a[x] >= a[x + 1])
            x--;

        if (x < 0)
        {
            break;
        }

        int y = n - 1;
        while (a[y] <= a[x])
            y--;

        int temp = a[x];
        a[x] = a[y];
        a[y] = temp;

        for (int i = x + 1, j = n - 1; i < j; i++, j--)
        {
            temp = a[i];
            a[i] = a[j];
            a[j] = temp;
        }

        if (x == 0)
        {
            x = 1;
        }
        for (int i = x; i < n; i++)
        {
            p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
        }
    }

    return best;
}

TypeScript

function solveTspBruteForce(input: number[], output: number[]): number {
  const n = output.length;
  let best = Number.MAX_SAFE_INTEGER;
  let a: number[] = new Array(n);

  for (let i = 0; i < n; i++) {
    a[i] = i;
  }

  const p: number[] = Array(n).fill(0);
  for (let i = 1; i < p.length; i++) {
    p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
  }

  while (true) {
    const sum = input[a[n - 1] * n + a[0]] + p[p.length - 1];
    if (sum < best) {
      best = sum;
      a.forEach((val, index) => (output[index] = val));
    }

    let x = a.length - 2;
    while (x >= 0 && a[x] >= a[x + 1]) x--;

    if (x < 0) {
      break;
    }

    let y = a.length - 1;
    while (a[y] <= a[x]) y--;

    [a[x], a[y]] = [a[y], a[x]];
    reverseArray(a, x + 1, a.length - 1);

    if (x === 0) {
      x = 1;
    }
    for (let i = x; i < p.length; i++) {
      p[i] = p[i - 1] + input[a[i - 1] * n + a[i]];
    }
  }

  return best;
}

function reverseArray(a: number[], start: number, end: number): void {
  while (start < end) {
    [a[start], a[end]] = [a[end], a[start]];
    start++;
    end--;
  }
}