Post-Mortem: MulleCG / MulleCGNanoVG Split

· nat's blog


Date: 2026-03-15
Duration: ~1 day of focused work
Result: 83/83 tests passing across 8 libraries. Milestone complete.


1. What We Did

Split the monolithic MulleCG library into two: MulleCG (the abstract drawing API) and MulleCGNanoVG (the nanovg backend).

This is the foundation for supporting multiple GPU backends (D3D11, Metal, Vulkan) without dragging nanovg into every consumer of the drawing API.

2. Architecture After the Split

MulleCG (abstract)
├── CGContext, CGColor, CGImage, CGFramebuffer (base classes)
├── CGBackendFunctionTable (vtable: createFramebuffer, bindFramebuffer, etc.)
├── Type definitions (CGTextAlign*, CARotationDirection*, CGImageFlag*, etc.)
└── No #include of nanovg, no GL headers, no GPU code

MulleCGNanoVG (backend)
├── nanovg-specific/
│   ├── CGContext+nanovg.m          (NVG context creation, the ONE file with NANOVG_GL_IMPLEMENTATION)
│   ├── CGBackendFunctionTable+nanovg.m  (vtable implementation: _createFramebuffer → nvgluCreateFramebuffer)
│   ├── CGFramebuffer+nanovg.m     (framebuffer init via vtable)
│   └── include-private.h          (GL headers + nanovg_gl.h + nanovg_gl_utils.h)
├── gpu/                           (backend-neutral GPU headers + per-backend .m)
└── mulle-nanovg/                  (embedded nanovg source: nanovg.c, nanovg_gl.h, etc.)

The key design: CGContext holds a CGBackendFunctionTable with function pointers. MulleCGNanoVG populates that table with nanovg implementations. A future D3D11 backend would provide a different table. Consumer code calls through the vtable and never touches nanovg directly.
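The function-table arrangement can be sketched in plain C. This is only an illustration of the idea, not the MulleCG source: the struct fields, the two toy backends and the helper `context_create_and_bind` are hypothetical names invented for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical framebuffer record; the real CGFramebuffer is an Objective-C class. */
typedef struct {
    int width;
    int height;
    int backend_id;   /* which backend created it (illustration only) */
} Framebuffer;

/* Function table in the spirit of CGBackendFunctionTable (fields are assumptions). */
typedef struct {
    int (*createFramebuffer)(Framebuffer *fb, int width, int height);
    int (*bindFramebuffer)(Framebuffer *fb);
} BackendFunctionTable;

/* The context holds the table and never names a concrete backend. */
typedef struct {
    BackendFunctionTable backend;
    Framebuffer         *bound;
} Context;

/* Backend 1: stands in for the nanovg implementation. */
static int nanovg_like_create(Framebuffer *fb, int width, int height)
{
    fb->width = width;
    fb->height = height;
    fb->backend_id = 1;
    return 0;
}

/* Backend 2: stands in for a future backend such as D3D11. */
static int other_create(Framebuffer *fb, int width, int height)
{
    fb->width = width;
    fb->height = height;
    fb->backend_id = 2;
    return 0;
}

/* Binding is identical in this toy example, so both tables share it. */
static int common_bind(Framebuffer *fb)
{
    return fb != NULL ? 0 : -1;
}

static const BackendFunctionTable nanovg_like_table = { nanovg_like_create, common_bind };
static const BackendFunctionTable other_table       = { other_create, common_bind };

/* Consumer code: goes through the table only. */
static int context_create_and_bind(Context *ctx, Framebuffer *fb, int width, int height)
{
    if (ctx->backend.createFramebuffer(fb, width, height) != 0)
        return -1;
    if (ctx->backend.bindFramebuffer(fb) != 0)
        return -1;
    ctx->bound = fb;
    return 0;
}

/* The same consumer call works against two different backends. */
static void vtable_demo(void)
{
    Context a, b;
    Framebuffer fa, fb;

    a.backend = nanovg_like_table;
    a.bound = NULL;
    b.backend = other_table;
    b.bound = NULL;

    assert(context_create_and_bind(&a, &fa, 64, 32) == 0);
    assert(context_create_and_bind(&b, &fb, 64, 32) == 0);
    assert(fa.backend_id == 1 && fb.backend_id == 2);
    assert(a.bound == &fa && b.bound == &fb);
}
```

Swapping the backend means swapping the table; `context_create_and_bind` does not change.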

3. What Broke and Why

3.1 Leftover nanovg Symbols in Consumer Code (7 files)

After the split, several libraries still referenced raw nanovg constants and types that now live exclusively in MulleCGNanoVG's private headers.

| Symbol | Replacement | Files |
|---|---|---|
| NVG_ALIGN_CENTER, NVG_ALIGN_MIDDLE, etc. | CGTextAlignCenter, CGTextAlignMiddle | MulleCycleButtonLayer, MulleSegmentedControlLayer, MulleCheckboxLayer |
| NVG_CW, NVG_CCW | CARotationDirectionClockWise, CARotationDirectionCounterClockWise | MulleSliderLayer |
| NVG_IMAGE_FLIPY | CGImageFlagFlipY | UIBitmapImage+CGImage |
| NVGpaintWithBounds: | deleted (dead code) | UIControlPaint |

These were straightforward — the abstract equivalents already existed in MulleCG, with MULLE_C_ASSERT compile-time checks ensuring the numeric values match.
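A sketch of what such compile-time checks look like, using C11 `_Static_assert` as a stand-in for MULLE_C_ASSERT (whose exact form is not shown here). The NVG values below are the ones from upstream nanovg.h; the embedded mulle-nanovg fork and the real MulleCG definitions may differ, so treat both enum sets as assumptions.

```c
#include <assert.h>

/* Stand-ins for the backend's private constants (upstream nanovg.h values). */
enum { NVG_ALIGN_CENTER = 1 << 1, NVG_ALIGN_MIDDLE = 1 << 4 };
enum { NVG_CCW = 1, NVG_CW = 2 };
enum { NVG_IMAGE_FLIPY = 1 << 3 };

/* Hypothetical abstract constants; the real MulleCG definitions may differ. */
enum { CGTextAlignCenter = 1 << 1, CGTextAlignMiddle = 1 << 4 };
enum { CARotationDirectionCounterClockWise = 1, CARotationDirectionClockWise = 2 };
enum { CGImageFlagFlipY = 1 << 3 };

/* The build fails here if the two sets of values ever drift apart. */
_Static_assert(CGTextAlignCenter == NVG_ALIGN_CENTER, "text align center mismatch");
_Static_assert(CGTextAlignMiddle == NVG_ALIGN_MIDDLE, "text align middle mismatch");
_Static_assert(CARotationDirectionClockWise == NVG_CW, "rotation CW mismatch");
_Static_assert(CARotationDirectionCounterClockWise == NVG_CCW, "rotation CCW mismatch");
_Static_assert(CGImageFlagFlipY == NVG_IMAGE_FLIPY, "image flip-y mismatch");

/* Because the values are checked above, the backend can pass them through unchanged. */
static int backend_text_align(int cgTextAlign)
{
    return cgTextAlign;
}

static void constant_mapping_demo(void)
{
    assert(backend_text_align(CGTextAlignCenter | CGTextAlignMiddle)
           == (NVG_ALIGN_CENTER | NVG_ALIGN_MIDDLE));
}
```

With the checks in place, consumer code only needs the abstract names, and the backend does not need a translation table.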

3.2 Wrong Argument Order: CGContextSetFillBoxGradient (1 file)

MulleScrollIndicatorGraphicsLayer.m called CGContextSetFillBoxGradient with arguments in the wrong order. The correct signature:

CGContextSetFillBoxGradient(context, rect, cornerRadius, color, feather, featherColor)
//                                         ^^^^^^^^^^^^  ^^^^^  ^^^^^^^  ^^^^^^^^^^^^
//                                         NOT feather   NOT featherColor

The old code had feather and color swapped. This compiled without warnings because both are numeric types.
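Why this kind of swap goes unnoticed can be shown with a toy function. The sketch assumes a colour packed into a 32-bit integer and a float feather; `BoxGradient` and `make_box_gradient` are hypothetical and are not the MulleCG API.

```c
#include <assert.h>

/* Hypothetical gradient record: packed RGBA colours and float geometry. */
typedef struct {
    float    corner_radius;
    unsigned color;
    float    feather;
    unsigned feather_color;
} BoxGradient;

static BoxGradient make_box_gradient(float corner_radius, unsigned color,
                                     float feather, unsigned feather_color)
{
    BoxGradient g;

    g.corner_radius = corner_radius;
    g.color = color;
    g.feather = feather;
    g.feather_color = feather_color;
    return g;
}

static void swapped_arguments_demo(void)
{
    unsigned red = 0xFF0000FFu;
    float feather = 8.0f;

    /* Intended call. */
    BoxGradient ok = make_box_gradient(4.0f, red, feather, 0u);

    /* Swapped call: C converts float to unsigned and unsigned to float implicitly,
       so this is accepted and typically stays silent at default warning levels. */
    BoxGradient bad = make_box_gradient(4.0f, feather, red, 0u);

    assert(ok.color == red && ok.feather == 8.0f);
    assert(bad.color == 8u);        /* the feather value ended up as the colour */
    assert(bad.feather > 1.0e9f);   /* the packed colour ended up as a huge feather */
}
```

Flags such as `-Wconversion` can surface some of these cases, at the cost of extra noise elsewhere.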

3.3 CACircleMake API Mismatch (1 file)

MulleSliderLayer.m called CACircleMake(CGPoint, radius) — but the actual signature is CACircleMake(x, y, radius) (three floats, not a point + float). This was masked before because the old code went through nanovg's nvgArc directly.

3.4 Files That Couldn't Survive the Split (4 files deleted)

3.5 Missing UIWindowStyleNoDrawing → UIGraphicsNoDrawing Mapping (1 file, 2 locations)

This was the most impactful bug. UIWindow+UIGraphicsContext.m in MulleUIGraphicsWindow had two context creation paths (main window and offscreen window). Neither mapped UIWindowStyleNoDrawing to UIGraphicsNoDrawing, so the flag never reached the backend and the context was always a real GL context.

The fix was two lines, one in each path:

options |= (_styleMask & UIWindowStyleNoDrawing) ? UIGraphicsNoDrawing : 0;

3.6 UINullWindow initWithOSWindow: Loses styleMask (test bug)

[UINullWindow initWithOSWindow:osw] internally calls initWithOSWindow:styleMask:0. The zero styleMask drops UIWindowStyleNoDrawing. Tests had to be changed to call initWithOSWindow:styleMask: explicitly.

3.7 ClearType Assert in Headless Tests (6 test files)

CATextLayer always calls CGContextClearType(context, YES), which asserts that the context was created with ClearType support. Tests using UIWindowStyleNoDrawing alone didn't include UIWindowStyleClearType, so the assert fired. Fix: add UIWindowStyleClearType to the test style masks.

3.8 Missing Protocol Members in Test Glue (1 file)

MulleUIWindow/test/fake-ui-application-glue.inc was missing inputState, debuggingFlags, os_pollInputState:, and renderObjects: — protocol requirements that had been added to UIApplication during development but never propagated to the test fake.

3.9 ASLR-Dependent Test Output (3 test files)

Three nvgtrace tests in MulleUIWidgets had .stdout files containing nvgTextMetrics pointer addresses. These addresses change every run due to ASLR. The .stdout files were deleted — the tests pass by checking return code, not stdout content.

4. The Phantom Crash: noleak-cglayer

This consumed the most debugging time and turned out to be a non-issue.

Symptom: SIGSEGV in glnvg__findTexture; gl->textures was 0x100000200, clearly not a valid heap pointer.

Investigation path:

  1. Verified struct layout (GLNVGcontext, GLNVGshader) — correct, GLNVG_MAX_LOCS=3 always
  2. Verified no conditional compilation differences between translation units
  3. Disassembled glnvg__allocTexture and glnvg__findTexture — both use identical offsets (0x18 for textures, 0x28 for ntextures)
  4. Dumped raw memory at the GLNVGcontext* address before and after nvgCreateImageRGBA: the textures pointer was garbage from the start
  5. Discovered nvgCreateGLES2 was called with flags=164 = NVG_NO_DRAWING|NVG_CLEARTYPE|NVG_DEBUG
  6. NVG_NO_DRAWING creates a DummyNVGcontext (a flat struct with textures[256] inline) instead of GLNVGcontext (which has GLNVGtexture* textures as a heap pointer)
  7. nvglImageHandleGLES2 casts userPtr to GLNVGcontext* regardless — reading offset 0x18 of a DummyNVGcontext gives you bytes from the middle of the inline texture array, interpreted as a pointer
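The flag value from step 5 can be decoded mechanically. In the sketch below, NVG_NO_DRAWING is 1<<7 as described in this post and NVG_DEBUG is 1<<2 as in upstream nanovg_gl.h; NVG_CLEARTYPE = 1<<5 is an assumption, inferred only from 164 - 128 - 4 = 32. `decode_flags` is a hypothetical helper.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum {
    NVG_ANTIALIAS       = 1 << 0,   /* upstream nanovg_gl.h */
    NVG_STENCIL_STROKES = 1 << 1,   /* upstream nanovg_gl.h */
    NVG_DEBUG           = 1 << 2,   /* upstream nanovg_gl.h */
    NVG_CLEARTYPE       = 1 << 5,   /* ASSUMPTION: inferred from 164 - 128 - 4 */
    NVG_NO_DRAWING      = 1 << 7    /* value given in this post */
};

static const struct { int bit; const char *name; } flag_names[] = {
    { NVG_NO_DRAWING,      "NVG_NO_DRAWING" },
    { NVG_CLEARTYPE,       "NVG_CLEARTYPE" },
    { NVG_DEBUG,           "NVG_DEBUG" },
    { NVG_STENCIL_STROKES, "NVG_STENCIL_STROKES" },
    { NVG_ANTIALIAS,       "NVG_ANTIALIAS" }
};

/* Writes the known flag names into out, separated by '|'.
   Returns the bits that matched no known flag, or -1 if out is too small. */
static int decode_flags(int flags, char *out, size_t size)
{
    int    remaining = flags;
    size_t used = 0;
    size_t i;

    if (size == 0)
        return -1;
    out[0] = '\0';

    for (i = 0; i < sizeof flag_names / sizeof flag_names[0]; i++) {
        int n;

        if (!(flags & flag_names[i].bit))
            continue;
        n = snprintf(out + used, size - used, "%s%s",
                     used ? "|" : "", flag_names[i].name);
        if (n < 0 || (size_t) n >= size - used)
            return -1;
        used += (size_t) n;
        remaining &= ~flag_names[i].bit;
    }
    return remaining;
}

static void decode_flags_demo(void)
{
    char text[128];

    assert(decode_flags(164, text, sizeof text) == 0);
    assert(strcmp(text, "NVG_NO_DRAWING|NVG_CLEARTYPE|NVG_DEBUG") == 0);
}
```

A non-zero return value would mean that the flag word contains bits this table does not know about.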

Root cause: The .exe was stale — compiled against older libraries before our UIWindowStyleNoDrawing mapping fix. The old libraries didn't pass NVG_NO_DRAWING through, so the context was always a real GL context. After our fix, the mapping works correctly, but the stale binary still had the old behavior baked in. When the test runner recompiled from source, the test passed immediately.

Lesson: Always delete stale .exe files before debugging. mulle-sde test run recompiles; running .exe directly does not.

5. Key Technical Details Learned

nanovg Compilation Model

nanovg_gl.h is a header-only implementation. All struct definitions (GLNVGcontext, GLNVGshader, GLNVGtexture) and function implementations are inside #ifdef NANOVG_GL_IMPLEMENTATION. Only ONE .m file defines this: CGContext+nanovg.m (via DEFINE_NANOVG_GL_IMPLEMENTATION → NANOVG_GLES2_IMPLEMENTATION → NANOVG_GL_IMPLEMENTATION).

nanovg_gl_utils.h (framebuffer utilities) has its implementations inside #ifdef NANOVG_GL_IMPLEMENTATION as well, guarded by #ifdef NANOVG_FBO_VALID (which is defined inside the same #ifdef). So nvgluCreateFramebuffer is also compiled only in CGContext+nanovg.m.

Other .m files that include these headers only see declarations and link to the implementations.
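The pattern is easier to see in miniature. The following single-file sketch inlines a hypothetical header, `tiny_fb.h`, to show which parts every includer sees and which parts are compiled only where the implementation macro is defined. All `tiny_fb` names are invented for this example.

```c
#include <assert.h>

/* In a real project this define sits in exactly ONE .c/.m file,
   immediately before the #include of the header. */
#define TINY_FB_IMPLEMENTATION

/* ---- begin: contents of the hypothetical header "tiny_fb.h" ---- */
#ifndef TINY_FB_H
#define TINY_FB_H

/* Declaration: visible to every file that includes the header. */
int tiny_fb_pixel_count(int width, int height);

#endif /* TINY_FB_H */

#ifdef TINY_FB_IMPLEMENTATION

/* Struct definitions and function bodies: compiled only where
   TINY_FB_IMPLEMENTATION is defined. */
typedef struct {
    int width;
    int height;
} TinyFramebuffer;

int tiny_fb_pixel_count(int width, int height)
{
    TinyFramebuffer fb;

    fb.width = width;
    fb.height = height;
    return fb.width * fb.height;
}

#endif /* TINY_FB_IMPLEMENTATION */
/* ---- end: contents of "tiny_fb.h" ---- */

static void header_only_demo(void)
{
    assert(tiny_fb_pixel_count(4, 2) == 8);
}
```

A second file that includes the header without the define gets only the prototype and relies on the linker to find the body. It also cannot see `TinyFramebuffer`, which mirrors why the GLNVG structs are invisible outside the implementing file.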

NVG_NO_DRAWING and DummyNVGcontext

When NVG_NO_DRAWING is set, nvgCreateGLES2 allocates a DummyNVGcontext instead of GLNVGcontext:

typedef struct {
    int flags;
    int nextTextureId;
    int textureCount;
    int maxTextures;
    struct { int id; int w, h, type, flags; } textures[256];
} DummyNVGcontext;

This is a completely different layout from GLNVGcontext (which starts with GLNVGshader shader followed by GLNVGtexture* textures as a pointer). Any code that casts userPtr to GLNVGcontext* without checking for the dummy backend will crash. The nvglImageHandleGLES2 function does exactly this — it's only safe to call on a real GL context.
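The mismatch can be worked through without any GL code. The sketch below copies the 8 bytes at offset 0x18 (the textures offset reported by the disassembly) out of a DummyNVGcontext. It assumes 4-byte int and a little-endian 64-bit machine. The texture values (height 512, type 1) are an assumption chosen because they reproduce the observed 0x100000200; the post does not say which texture was actually in slot 0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    int flags;
    int nextTextureId;
    int textureCount;
    int maxTextures;
    struct { int id; int w, h, type, flags; } textures[256];
} DummyNVGcontext;

/* Offset at which the GL code expects its textures pointer (from the disassembly). */
#define GL_TEXTURES_OFFSET 0x18

/* Returns the 8 bytes a GLNVGcontext* cast would interpret as the textures pointer. */
static uint64_t bytes_read_as_textures_pointer(const DummyNVGcontext *dummy)
{
    uint64_t raw;

    memcpy(&raw, (const unsigned char *) dummy + GL_TEXTURES_OFFSET, sizeof raw);
    return raw;
}

static void miscast_demo(void)
{
    static DummyNVGcontext dummy;   /* static storage: zero-initialised */

    /* Four ints of header, so the inline array starts at byte 16. */
    assert(offsetof(DummyNVGcontext, textures) == 16);

    /* Byte 24 (0x18) is textures[0].h and byte 28 is textures[0].type.
       ASSUMPTION: values picked to match the crash report. */
    dummy.textures[0].h = 0x200;
    dummy.textures[0].type = 1;

    /* Little-endian: low word = h, high word = type. */
    assert(bytes_read_as_textures_pointer(&dummy) == UINT64_C(0x100000200));
}
```

So the "pointer" in the crash is consistent with two adjacent int fields of the first inline texture entry being read as one 64-bit value.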

The Style Mask Pipeline

UIWindowStyleNoDrawing (0x20000)
  → UIWindow+UIGraphicsContext.m maps to UIGraphicsNoDrawing (0x20)
    → CGContext+nanovg.m maps to NVG_NO_DRAWING (1<<7 = 128)
      → nvgCreateGLES2 creates DummyNVGcontext instead of GLNVGcontext

If any link in this chain is missing, you get a real GL context when you expected a dummy one (or vice versa).
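The chain can be modelled as two small mapping functions. The numeric values are the ones listed above; the function names are hypothetical, and the real code lives in Objective-C methods rather than free functions.

```c
#include <assert.h>

enum { UIWindowStyleNoDrawing = 0x20000 };
enum { UIGraphicsNoDrawing    = 0x20 };
enum { NVG_NO_DRAWING         = 1 << 7 };

/* Link 1: window style mask to graphics options (the mapping that was missing). */
static unsigned graphics_options_for_style(unsigned styleMask)
{
    unsigned options = 0;

    options |= (styleMask & UIWindowStyleNoDrawing) ? UIGraphicsNoDrawing : 0;
    return options;
}

/* Link 1 as it behaved before the fix: the flag is silently dropped. */
static unsigned graphics_options_for_style_unfixed(unsigned styleMask)
{
    (void) styleMask;
    return 0;
}

/* Link 2: graphics options to nanovg creation flags. */
static int nvg_flags_for_options(unsigned options)
{
    int flags = 0;

    if (options & UIGraphicsNoDrawing)
        flags |= NVG_NO_DRAWING;
    return flags;
}

/* Non-zero if the backend would end up creating the dummy context. */
static int creates_dummy_context(int nvgFlags)
{
    return (nvgFlags & NVG_NO_DRAWING) != 0;
}

static void style_mask_pipeline_demo(void)
{
    unsigned style = UIWindowStyleNoDrawing;

    /* Complete chain: the request for no drawing arrives at the backend. */
    assert(creates_dummy_context(
        nvg_flags_for_options(graphics_options_for_style(style))));

    /* One broken link: the backend builds a real GL context instead. */
    assert(!creates_dummy_context(
        nvg_flags_for_options(graphics_options_for_style_unfixed(style))));
}
```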

mulle-sde Test Workflow

6. Final Tally

| Library | Tests | Status |
|---|---|---|
| MulleCG | | ✅ builds |
| MulleCGNanoVG | 9/9 | |
| MulleGraphics | 21/21 | |
| MulleGraphicsImage | 4/4 | |
| MulleUIWidgets | 16/16 | |
| MulleUI | | ✅ builds |
| MulleUIWindow | 3/3 | |
| MulleUIOS | 30/30 | |
| Total | 83/83 | |

7. What Went Well

8. What Could Have Gone Better
