How to Run C on an Apple Device (And Why You'd Want To)
I ported a tiny GPT to an Apple Watch in both Swift and C. The C version was 50x faster. Here's the bridging setup and why the gap exists.
I recently ported Andrej Karpathy's microgpt — a tiny from-scratch GPT in a single Python file, no dependencies! — to run on an Apple Watch. I (well, me and my bestie Claude) wrote it three ways: pure Swift, C called from Swift, and Objective-C. The Swift version took ~100 seconds to train and run inference. The C version? ~2 seconds. Same device, same model, same dataset.
That's a ~50x speedup for the same algorithm on identical hardware. Here's how to bridge C into a watchOS (or iOS/tvOS/visionOS) app, and why the performance difference makes it worth the effort.
Why C?
Swift is great for UI, app architecture, and rapid iteration. But for tight numerical loops — ML training, signal processing, physics sims, codecs — it carries overhead that C doesn't.
Swift carries ARC for class instances, protocol witness tables for generics, and compiler-inserted retain/release traffic that adds up in hot loops. C gives you flat memory, no reference counting, and decades of battle-tested compiler optimizations (-O2/-O3) that know exactly how to vectorize a for loop over a double array.
In this project, the workload is a from-scratch GPT with a custom autograd engine. Thousands of small heap-allocated Value objects linked in a computation graph, with a backward pass doing topological sort and gradient accumulation. In Swift, each Value is a class (reference type), so every creation, every link, every traversal touches ARC. In C, the same structure is a flat arena of structs indexed by int. Zero allocation overhead per node, zero reference counting, cache-friendly sequential access.
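To make the arena idea concrete, here's a minimal sketch of heap-free node allocation. The struct fields and function names are illustrative, not copied from the microgpt source — the point is the shape: allocation is just bumping an integer.

```c
// Illustrative sketch: a flat arena of autograd nodes indexed by int,
// instead of one heap allocation (and refcount) per node.
#define ARENA_SIZE 4096

typedef struct {
    double data;      // forward value
    double grad;      // accumulated gradient
    int    lhs, rhs;  // indices of child nodes in the arena, -1 if none
} Value;

static Value arena[ARENA_SIZE];
static int arena_top = 0;

// "Allocation" is a bump of an integer: no malloc, no ARC, no free.
static int value_new(double data, int lhs, int rhs) {
    int i = arena_top++;
    arena[i].data = data;
    arena[i].grad = 0.0;
    arena[i].lhs  = lhs;
    arena[i].rhs  = rhs;
    return i;
}

// An op links nodes by index, building the computation graph in place.
static int value_add(int a, int b) {
    return value_new(arena[a].data + arena[b].data, a, b);
}
```

Because nodes are contiguous and referenced by index, the backward pass walks sequential memory instead of chasing pointers through the heap.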
The Setup
Standard Xcode watchOS project. The goal: call a C function microgpt_c_run() from Swift and get back an inference result string.
File structure inside the Watch App target:
microgpt-watchos Watch App/
    CMicroGPTBridge.h                     // C header declaring the public API
    CMicroGPTBridge.c                     // The full C implementation
    microgpt-watchos-Bridging-Header.h    // Tells Swift about the C header
    CMicroGPTRunner.swift                 // Swift wrapper that calls the C function
    ContentView.swift                     // SwiftUI view
Step 1: Write Your C Code with a Clean Interface
The key constraint: your C function needs to be callable from Swift. That means a simple C signature with primitive types and pointers. No FILE *, no printf to stdout (there is no stdout on watchOS), no main().
The header (CMicroGPTBridge.h):
#ifndef CMICROGPTBRIDGE_H
#define CMICROGPTBRIDGE_H

#ifdef __cplusplus
extern "C" {
#endif

enum {
    MG_C_SUCCESS = 0,
    MG_C_ERROR_INVALID_ARGS = 1,
    MG_C_ERROR_NO_DATASET = 2,
    MG_C_ERROR_VOCAB_TOO_LARGE = 3,
    MG_C_ERROR_ARENA_OVERFLOW = 4,
    MG_C_ERROR_ALLOCATION = 5,
};

int microgpt_c_run(const char *dataset_utf8,
                   int num_steps,
                   double temperature,
                   char *out_inference,
                   int out_inference_capacity,
                   double *out_last_loss);

#ifdef __cplusplus
}
#endif

#endif
A few things to note:
- Pass data in, get data out. The dataset comes in as a UTF-8 C string. The inference result is written to a caller-provided buffer. No file I/O.
- Return an error code. Swift can check this and throw a proper Swift error.
- An extern "C" guard. Harmless if you're not mixing C++, but good hygiene.
The implementation (CMicroGPTBridge.c) is the full microgpt training loop: arena-allocated autograd nodes, forward pass, backward pass, Adam optimizer, then a single inference sample. The main adaptation from the standalone microgpt.c was replacing fopen("input.txt", "r") with accepting the dataset as a parameter, and swapping printf output for writing results to the output buffer.
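The shape of those two adaptations looks roughly like this. The function names and buffer sizes are my guesses at the bridge's internals, not verbatim from the repo:

```c
#include <stdio.h>
#include <string.h>

// Sketch of the two watchOS adaptations: accept the dataset as a string
// instead of fopen()-ing a file, and snprintf into the caller's buffer
// instead of printf()-ing to stdout. Names are illustrative.

// Instead of: FILE *f = fopen("input.txt", "r"); ...
static int load_dataset_from_utf8(const char *dataset_utf8,
                                  char *docs, int capacity) {
    if (dataset_utf8 == NULL || dataset_utf8[0] == '\0')
        return 2; // MG_C_ERROR_NO_DATASET
    // Copy at most capacity-1 bytes and always NUL-terminate.
    strncpy(docs, dataset_utf8, (size_t)(capacity - 1));
    docs[capacity - 1] = '\0';
    return 0; // MG_C_SUCCESS
}

// Instead of: printf("sample: %s\n", sample);
static void write_inference(char *out, int capacity, const char *sample) {
    // snprintf never writes past capacity and always terminates.
    snprintf(out, (size_t)capacity, "%s", sample);
}
```

The capacity-guarded writes matter because the Swift side hands over a fixed-size buffer; the C side must never assume it's big enough.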
The core of the adapted entry point:
int microgpt_c_run(const char *dataset_utf8,
                   int num_steps,
                   double temperature,
                   char *out_inference,
                   int out_inference_capacity,
                   double *out_last_loss) {
    if (!dataset_utf8 || !out_inference || out_inference_capacity <= 0 || !out_last_loss)
        return MG_C_ERROR_INVALID_ARGS;

    out_inference[0] = '\0';
    *out_last_loss = 0.0;

    reset_run_state();
    srand48(42);

    int status = load_dataset_from_utf8(dataset_utf8);
    if (status != MG_C_SUCCESS) return status;

    status = init_params();
    if (status != MG_C_SUCCESS) { reset_run_state(); return status; }

    // ... training loop (same as standalone microgpt.c) ...
    // ... inference: write result to out_inference ...

    free(m_adam); free(v_adam);
    reset_run_state();
    return status;
}
One thing worth noting for embedded/watch targets: the original microgpt.c uses a 2-million-element static arena (static Value arena[ARENA_SIZE]). This works fine here — a static array lives in the binary's zero-filled static storage, not on the stack, and watchOS gives apps a reasonable memory budget. But very large static arrays can be a problem on more constrained targets, so keep that in mind.
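If the static arena ever becomes a problem, one hedge is to allocate it from the heap once at the start of a run. This is a sketch of that alternative — microgpt itself keeps the static array, and the struct layout here is illustrative:

```c
#include <stdlib.h>

// Sketch: heap-allocated arena as an alternative to a large static array.
// Value's fields mirror the kind of node the bridge stores; illustrative only.
#define ARENA_SIZE (2 * 1000 * 1000)

typedef struct { double data, grad; int lhs, rhs; } Value;

static Value *arena = NULL;

static int arena_init(void) {
    // calloc zero-initializes, matching static-storage semantics.
    arena = calloc(ARENA_SIZE, sizeof(Value));
    return arena ? 0 : 5; // MG_C_ERROR_ALLOCATION
}

static void arena_free(void) {
    free(arena);
    arena = NULL;
}
```

The trade-off: one calloc at startup (and a failure path to handle) in exchange for keeping the binary's static footprint small.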
Step 2: Create the Bridging Header
Create a file called <YourTarget>-Bridging-Header.h:
#ifndef MICROGPT_WATCHOS_BRIDGING_HEADER_H
#define MICROGPT_WATCHOS_BRIDGING_HEADER_H
#include "CMicroGPTBridge.h"
#endif
Then tell Xcode about it. In your target's Build Settings, set:
SWIFT_OBJC_BRIDGING_HEADER = microgpt-watchos Watch App/microgpt-watchos-Bridging-Header.h
This is the piece that makes your C function visible to Swift. Once it's set, you can call microgpt_c_run(...) directly from any Swift file in the target. No import needed. Apple's Imported C and Objective-C APIs docs cover the full details of how Swift sees C types, enums, pointers, and function signatures through this bridge.
Step 3: Write a Swift Wrapper
The raw C call works, but it's nicer to wrap it in a Swift type with proper error handling:
struct CMicroGPTRunner: Sendable {
    struct Config {
        var numSteps: Int32 = 1000
        var temperature: Double = 0.5
    }

    private let config: Config

    init(config: Config = Config()) {
        self.config = config
    }

    func run() throws -> String {
        let content = try loadDatasetContent()
        var inferenceBuffer = [CChar](repeating: 0, count: 256)
        var lastLoss = 0.0

        let resultCode = content.withCString { datasetPointer in
            microgpt_c_run(
                datasetPointer,
                config.numSteps,
                config.temperature,
                &inferenceBuffer,
                Int32(inferenceBuffer.count),
                &lastLoss
            )
        }

        guard resultCode == 0 else {
            throw CMicroGPTRunnerError.cEngineFailed(code: resultCode)
        }
        return String(cString: inferenceBuffer)
    }

    private func loadDatasetContent() throws -> String {
        if let localURL = Bundle.main.url(forResource: "input", withExtension: "txt") {
            return try String(contentsOf: localURL, encoding: .utf8)
        }
        let data = try Data(contentsOf: namesURL)
        return String(decoding: data, as: UTF8.self)
    }
}
Key details:
- content.withCString gives you a const char * pointer that's valid for the duration of the closure. This is the correct way to pass a Swift String to C.
- A [CChar] buffer for the output. The C function writes into it, then String(cString:) converts it back.
- Sendable conformance because we dispatch this to a background thread. The C function is a pure computation with no shared mutable state across calls.
Step 4: Call It from SwiftUI
In the view model, both runners are used identically. The only difference is that the C runner blocks synchronously (it finishes in ~2s, so we just dispatch it off the main actor):
case .c:
    self.latestStatus = "C: training + inference..."
    let cRunner = self.cRunner
    inference = try await Task.detached(priority: .userInitiated) {
        try cRunner.run()
    }.value
The Swift runner uses async callbacks to report per-step progress because it runs long enough (~100s) that you'd want to see it. The C runner is fast enough that "training + inference..." is the only status you need.
The Result
On an Apple Watch (Series 10, watchOS 26), training a 1-layer, 16-dim, 4-head transformer on a names dataset for 1000 steps with Adam optimizer:
| Engine | Time |
|---|---|
| Swift | ~100s |
| Objective-C | ~25s |
| C | ~2s |
Same model architecture. Same hyperparameters. Same dataset. Same device.
The full comparison app and all three ports are on GitHub: microgpt-comparisons.
What About Objective-C?
Objective-C is a natural thought here. It's been on Apple platforms since the beginning, it's a superset of C, and you can drop into raw C within an .m file whenever you want.
For numerical compute like this though, Objective-C's language features actively work against you. Every method call goes through objc_msgSend, a dynamic dispatch lookup that checks the class's method table at runtime. In a tight training loop that calls thousands of small operations per step, that overhead is real. C function calls resolve at compile time to a direct jump.
Data representation matters too. An NSNumber wrapping a double is a heap-allocated object with an isa pointer, a reference count, and type metadata. A C double is 8 bytes in a register. An NSArray of autograd nodes carries per-element retain/release and pointer indirection. A C array of structs is a flat, contiguous block of memory that the CPU's cache prefetcher can eat for breakfast.
You can write "C with Objective-C syntax" by avoiding Foundation types, skipping message sends, and using plain C functions. But at that point you're just writing C in a .m file, and you might as well use a .c file with the cleaner tooling and mental model.
When to Reach for C
Not every problem needs this. If your bottleneck is network I/O, Core Data queries, or UI layout, C won't help. But if you have:
- Tight numerical loops (ML training, audio DSP, image processing)
- Custom data structures with millions of small nodes (autograd graphs, particle systems)
- Code that's already written in C and battle-tested
...then a bridging header and a clean C API are a straightforward way to get native C performance inside any Apple platform app. Including watchOS.
The bridging header approach works identically for iOS, macOS, tvOS, and visionOS. The only thing that changes is the SDKROOT in your build settings.