I haven’t done a series of posts about anything, only small snippets of things I’ve played with and topics that I am interested in. Originally I thought about doing a video recording about this, but I feel writing the progress is a much better way of sharing.
I don’t plan to make a tutorial on how to “correctly” build a game or a game engine. My plan is to keep track of the development of a small game and inspect in detail interesting topics that I enjoy focusing on.
The game is a 2D platformer with very simple “physics”. I still haven’t decided if I want to create it with pixel art or high resolution sprites. My original prototype was done with pixel art so I’ll probably stick to that for now.
I am currently developing this game on a MacBook Pro, and the design targets mobile phones. For fun I still want to add other platforms like the Apple TV, macOS, Windows and browsers.
The only external tools I am using are small libraries. Currently the project only uses stb_image.h for loading and decoding image files into raw pixels.
The first thing I wanted to do was to get something on the screen. Of course a “Hello, World!” isn’t enough; I wanted to draw sprites. So I started by designing a very simple interface for my drawing API. Since I’ll only be rendering sprites, I don’t need to overcomplicate the specification. Things I need from this drawing interface:
- Clear background.
- Draw texture.
- Draw a tinted texture.
- Draw a portion of a texture (frame).
- Apply simple 2D transformations to the rendered objects.
I want this interface to be platform agnostic and easy to use. I’ve been using immediate-mode drawing APIs for a while and I really enjoy the design. The last one I used was Dear ImGui, a library for rendering GUIs which I highly recommend. I love how easy it is to prototype and have something running with very little code.
To make this design fast I’ll need to implement batching of sprites. Ideally this should also be combined with texture atlases instead of individual textures.
This is how the current graphics and drawing API looks.
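Roughly, the interface boils down to a small C header. This is a sketch, not the exact header: gfx_init, gfx_destroy, Color and Rect are names I’m assuming; only the gfx_begin/gfx_flush/gfx_end and gfx_draw_* names appear in the text below.

```c
// Sketch of the drawing interface. gfx_init, gfx_destroy, Color and Rect
// are assumed names; the gfx_draw_* family matches the functions described.
#include <stdbool.h>

typedef void *TextureID;  // opaque pointer to a platform texture

typedef struct Color { float r, g, b, a; } Color;
typedef struct Rect  { float x, y, width, height; } Rect;

bool gfx_init(void);     // create pipeline state, blend states, buffers...
void gfx_destroy(void);  // clean up every acquired resource

void gfx_begin(void);    // set up the rendering state for the frame
void gfx_flush(void);    // emit the batched draw calls
void gfx_end(void);      // present the current frame

void gfx_set_clear_color(Color color);  // color used to clear the screen

// 2D transformations, applied to each vertex during gfx_draw_xxx() calls
void gfx_translate(float x, float y);
void gfx_rotate(float radians);
void gfx_scale(float x, float y);

// the drawing flavors
void gfx_draw_texture(TextureID texture, float x, float y);
void gfx_draw_texture_with_color(TextureID texture, float x, float y, Color color);
void gfx_draw_texture_frame(TextureID texture, float x, float y, Rect frame);
void gfx_draw_texture_frame_with_color(TextureID texture, float x, float y,
                                       Rect frame, Color color);
```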
A quick overview of the functions.
This function initializes the platform’s graphics API. It allocates and binds the required resources like the render pipeline state, blend states, buffers, etc.
This destroys the graphics API instance and cleans up the acquired resources.
These functions handle setting up of the rendering state, emitting draw calls and doing the presentation of the current frame.
This indicates which color will be used for clearing the screen.
These functions are for 2D transformations. Transformations are applied to each vertex during gfx_draw_xxx() function calls.
Finally, we have three flavors for drawing an image on the screen.
TextureID is just an opaque pointer. I’ve decided to use an opaque pointer so it’s easier to port the implementation.
gfx_draw_texture_with_color just renders an image at an X and Y coordinate with a color. The color is multiplied with the texture’s base color, which makes white the default (identity) color.
gfx_draw_texture_frame_with_color renders a portion of an image. That portion is a rectangular section with an x, y, width and height. All of these values must be inside the bounds of the whole texture.
gfx_draw_texture is the same as gfx_draw_texture_with_color but with the default color.
gfx_draw_texture_frame is the same as gfx_draw_texture_frame_with_color but with the default color.
Since Apple is deprecating OpenGL on all their devices, I’ve decided to use their own graphics API, Metal. The good thing about this is that Apple provides a nice list of resources to get started with it.
The Metal backend has been very simple to implement. If you’ve worked with any other API apart from OpenGL, Metal will seem very familiar. Before using Metal I had worked with OpenGL, WebGL, D3D11 and GX (Nintendo 3DS), and the last two have a lot of similarities with Apple’s graphics API. It’s important to point out that Metal seems to be a lot more explicit than D3D11.
Since I am targeting Apple devices, I need to use either Objective-C or Swift. Objective-C seems to work better with C, so I chose that. I’ve only programmed in this language once, and it was a very long time ago. The good thing is that Objective-C is a superset of C, which means I can use C and interoperate with it. Of course there are some weird things with this interoperability which I’ll mention later.
One thing that really helped me get this up and running was MetalKit. MetalKit is a framework that helps with the creation of Metal applications by setting up a lot of stuff for you. It’s kind of like SDL or GLFW, but for Metal. The only things I’ll be using from MetalKit are MTKView and MTKViewDelegate.
To start using MetalKit I only need to link MetalKit.framework and change the class of my View on the ViewController to MTKView. After that I create a view delegate that extends MTKViewDelegate. Having done that, I’m ready to initialize Metal and start working on the implementation of my drawing interface.
The code for initializing Metal is very simple. It’s only a couple of lines:
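A sketch of that initialization, done in the view controller’s setup; GfxViewDelegate here is a hypothetical delegate class implementing MTKViewDelegate, not necessarily the real name:

```objc
#import <MetalKit/MetalKit.h>

- (void)viewDidLoad {
    [super viewDidLoad];
    MTKView *view = (MTKView *)self.view;
    view.device = MTLCreateSystemDefaultDevice();       // create the Metal device
    self.viewDelegate = [[GfxViewDelegate alloc] init]; // hypothetical delegate class
    view.delegate = self.viewDelegate;                  // bind delegate to the view
}
```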
This will create an instance of a Metal device and bind the view delegate to the view. All of this is done in the view controller: a UIViewController on iOS/tvOS and an NSViewController on macOS.
The Metal Shading Language is very similar to HLSL but with a C++ flavor. Luckily a “Sprite” shader is very simple. Here you can see the code:
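A sketch of what such a sprite shader looks like in MSL; only `resolution` and `TextureVertexOut` are names from the renderer, everything else is my assumption:

```metal
#include <metal_stdlib>
using namespace metal;

struct Uniforms {
    float2 resolution;   // width/height of the current view, in pixels
};

struct TextureVertex {
    packed_float2 position;  // packed_* types to match the C-side vertex layout
    packed_float2 texcoord;
    packed_float4 color;
};

struct TextureVertexOut {
    float4 position [[position]];
    float2 texcoord;
    float4 color;
};

vertex TextureVertexOut sprite_vertex(const device TextureVertex *vertices [[buffer(0)]],
                                      constant Uniforms &uniforms [[buffer(1)]],
                                      uint vid [[vertex_id]]) {
    TextureVertex in = vertices[vid];
    TextureVertexOut out;
    // pixel coordinates -> clip space: normalize, center, flip vertically
    float2 clip = (float2(in.position) / uniforms.resolution) * 2.0 - 1.0;
    out.position = float4(clip.x, -clip.y, 0.0, 1.0);
    out.texcoord = float2(in.texcoord);
    out.color = float4(in.color);
    return out;
}

fragment float4 sprite_fragment(TextureVertexOut in [[stage_in]],
                                texture2d<float> tex [[texture(0)]]) {
    constexpr sampler s(filter::nearest);  // nearest filtering for crisp pixels
    return tex.sample(s, in.texcoord) * in.color;
}
```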
Let’s split this code into sections.
This is the representation of the constant/uniform data. We could add more values here, but for now resolution should be enough. Here resolution is just the width and height of the current view.
TextureVertexOut is the layout definition for the data the vertex program outputs and the fragment program receives as input. If you’ve worked with GLSL, this is the equivalent of varying (or out/in) variables.
This is the vertex function.
This is the only line that really matters in this function.
Here we transform from world space to clip space. First we normalize the vertex position, then we center it on the screen, and finally we invert the vertical coordinate. We do this because our renderer’s coordinate system uses pixels as units with the origin at the top-left corner, which is simply what I’m used to.
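The same mapping can be sketched in plain C; pixel_to_clip is a hypothetical helper, not part of the actual shader:

```c
// Pixel coordinates (origin at the top-left) to clip space (origin at the
// center, Y pointing up): normalize to [0,1], remap to [-1,1], flip Y.
typedef struct { float x, y; } Vec2;

Vec2 pixel_to_clip(Vec2 p, Vec2 resolution) {
    Vec2 clip;
    clip.x = (p.x / resolution.x) * 2.0f - 1.0f;  // normalize and center
    clip.y = 1.0f - (p.y / resolution.y) * 2.0f;  // center and invert Y
    return clip;
}
// e.g. on a 320x240 view, pixel (0, 0) maps to (-1, 1)
// and pixel (320, 240) maps to (1, -1)
```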
The fragment function is also very simple. First we define a sampler with nearest filtering, then we take a sample at the current interpolated texture coordinate and multiply that value by the vertex color.
Sprites in this game are represented by two triangles forming a quad. I know I could’ve used indexed buffers, but I didn’t, for no specific reason.
Each vertex is multiplied by the current transform matrix before being added to the array of vertices.
The function that handles this is very simple:
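A minimal sketch of these helpers, assuming a 3x2 affine matrix and a plain vertex struct; only the names _transform_vertex and _push_quad come from the real code:

```c
#include <stddef.h>

typedef struct { float x, y; } Vec2;
typedef struct { float u, v; } UV;
typedef struct { float r, g, b, a; } Color;

typedef struct {
    Vec2 position;
    UV texcoord;
    Color color;
} Vertex;

// 3x2 affine matrix: | a c tx |
//                    | b d ty |
typedef struct { float a, b, c, d, tx, ty; } Mat2D;

#define MAX_VERTICES 4096
static Vertex gVertices[MAX_VERTICES];
static size_t gVertexCount = 0;
static Mat2D gTransform = {1, 0, 0, 1, 0, 0};  // identity

// Multiply a vertex position by the current transform matrix.
static Vec2 _transform_vertex(Vec2 p) {
    Vec2 out;
    out.x = gTransform.a * p.x + gTransform.c * p.y + gTransform.tx;
    out.y = gTransform.b * p.x + gTransform.d * p.y + gTransform.ty;
    return out;
}

// Generate the six vertices of a quad (two triangles, no index buffer)
// and append them to the vertex array. No overflow check, for brevity.
static void _push_quad(float x, float y, float w, float h,
                       float u0, float v0, float u1, float v1, Color color) {
    Vec2 p[4] = {
        _transform_vertex((Vec2){x,     y}),
        _transform_vertex((Vec2){x + w, y}),
        _transform_vertex((Vec2){x + w, y + h}),
        _transform_vertex((Vec2){x,     y + h})
    };
    UV t[4] = {{u0, v0}, {u1, v0}, {u1, v1}, {u0, v1}};
    int order[6] = {0, 1, 2, 2, 3, 0};  // corner indices of the two triangles
    for (int i = 0; i < 6; ++i) {
        gVertices[gVertexCount++] = (Vertex){p[order[i]], t[order[i]], color};
    }
}
```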
These functions are not exposed via an interface and are only used in the implementation.
The first function, _transform_vertex, multiplies the vertex position by the current transform matrix. The second function, _push_quad, generates all the vertices for the current quad and adds them to the array of vertices. These functions are used internally by the gfx_draw_xxx() functions to generate vertices.
This is how the implementation for drawing a texture frame looks:
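The core of it is mapping the pixel-space frame rectangle to normalized UV coordinates before pushing the quad. A hedged sketch; frame_to_uvs is a hypothetical helper:

```c
typedef struct { float x, y, width, height; } Rect;
typedef struct { float u0, v0, u1, v1; } UVRect;

// Convert a frame rectangle given in pixels into normalized [0,1] UVs
// for a texture of the given size. Computed per draw call for now;
// these could be cached per frame as an optimization.
UVRect frame_to_uvs(Rect frame, float texWidth, float texHeight) {
    UVRect uv;
    uv.u0 = frame.x / texWidth;                    // left edge
    uv.v0 = frame.y / texHeight;                   // top edge
    uv.u1 = (frame.x + frame.width) / texWidth;    // right edge
    uv.v1 = (frame.y + frame.height) / texHeight;  // bottom edge
    return uv;
}
```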
I can probably optimize this function by caching the UV coordinates or something, but for now this works for me.
Dynamic batching is still very relevant when dealing with sprite rendering even with modern APIs like Metal. The way I am internally handling this is checking if there is any texture change. If there is one I create a new batch entry indicating the texture id, starting vertex index and the vertex count for that batch.
This is how the batching structure looks:
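A hypothetical reconstruction of that structure, plus the texture-change check that starts a new batch entry (names beyond the described fields are mine):

```c
#include <stddef.h>

typedef void *TextureID;

typedef struct {
    TextureID texture;   // texture bound for this batch
    size_t vertexStart;  // first vertex of this batch in the shared array
    size_t vertexCount;  // number of vertices in this batch
} Batch;

#define MAX_BATCHES 256
static Batch gBatches[MAX_BATCHES];
static size_t gBatchCount = 0;

// Called before pushing a quad: if the texture changed (or no batch exists
// yet), open a new batch entry; then account for the quad's six vertices.
static void _require_batch(TextureID texture, size_t currentVertex) {
    if (gBatchCount == 0 || gBatches[gBatchCount - 1].texture != texture) {
        gBatches[gBatchCount++] = (Batch){texture, currentVertex, 0};
    }
    gBatches[gBatchCount - 1].vertexCount += 6;
}
```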
In the gfx_flush() function I process the batch collection and emit the draw calls. This is how it looks:
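Sketched in Objective-C; all the gGfxState fields and type names here are assumptions:

```objc
// copy the CPU-side vertex array into this frame's vertex buffer
memcpy(gGfxState.vertexBuffers[gGfxState.frameIndex].contents,
       gGfxState.vertices, gGfxState.vertexCount * sizeof(Vertex));

id<MTLRenderCommandEncoder> encoder = gGfxState.renderEncoder;
[encoder setRenderPipelineState:gGfxState.pipelineState];
[encoder setVertexBuffer:gGfxState.vertexBuffers[gGfxState.frameIndex]
                  offset:0 atIndex:0];
[encoder setVertexBytes:&gGfxState.uniforms
                 length:sizeof(Uniforms) atIndex:1];  // update constants

// one draw call per batch, binding each batch's texture first
for (size_t i = 0; i < gGfxState.batchCount; ++i) {
    Batch *batch = &gGfxState.batches[i];
    [encoder setFragmentTexture:(__bridge id<MTLTexture>)batch->texture
                        atIndex:0];
    [encoder drawPrimitives:MTLPrimitiveTypeTriangle
                vertexStart:batch->vertexStart
                vertexCount:batch->vertexCount];
}
```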
This section of code copies the contents of the vertex array into the vertex buffer, sets the render pipeline state, updates the constant buffer, and finally iterates through all the batches, binding each batch’s texture and encoding its draw call into the current command buffer.
Triple buffering is part of Metal’s Best Practices, but it’s more than just a good practice: it’s necessary if you want something running decently. It also applies to other APIs, so it’s not limited to Metal. In Metal you need to add sync points by hand; APIs like OpenGL generally handle this internally.
Sync points are important because they prevent us from writing over data that is still being used by the GPU. Instead of stalling until the GPU has finished processing the current frame, we allocate three buffers, and once the current command buffer is committed we swap to the next buffer. That way we don’t have to wait for that frame’s work to be done before filling the next buffer. If you’ve done AZDO in OpenGL, you’ve probably done triple buffering too.
The recommended way of adding sync points in Metal is with the semaphore primitive dispatch_semaphore_t. In gfx_begin() I call dispatch_semaphore_wait(gGfxState.frameSemaphore, DISPATCH_TIME_FOREVER); if the semaphore hasn’t been signaled, this stalls and waits for it. At the end of my frame, in gfx_end(), I call:
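Something along these lines; commandBuffer and the gGfxState fields are assumed names:

```objc
// signal the semaphore once the GPU has finished this command buffer
__block dispatch_semaphore_t semaphore = gGfxState.frameSemaphore;
[commandBuffer addCompletedHandler:^(id<MTLCommandBuffer> buffer) {
    dispatch_semaphore_signal(semaphore);  // frame done, release a slot
}];
[commandBuffer presentDrawable:view.currentDrawable];
[commandBuffer commit];

// advance to the next of the three per-frame buffers
gGfxState.frameIndex = (gGfxState.frameIndex + 1) % 3;
```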
This registers a callback for when the current command buffer has finished executing. In the callback I signal the semaphore, unlocking it and ending any stall in gfx_begin().
The stupid pointer bug
Finally, I wanted to end this post with a very stupid bug that took me hours to fix. I had never mixed C with Objective-C before, so I had no idea how their interoperability worked. I just went with it and used Objective-C like I would use C. One important thing I didn’t know is that Objective-C is, by default, a memory-managed language. This can be disabled, but by default it uses an approach called Automatic Reference Counting, or ARC. C doesn’t, which means any interaction between a C API and Objective-C needs to take that into account.
The problem I encountered while developing the Metal backend was that I was casting an id<MTLTexture> to a void*. I do this so I can pass the texture pointer around in my game code, which is all written in C. The problem is that Objective-C needs to know what happens to the object’s ownership when you cast it. Since I wasn’t properly communicating that information to the Objective-C runtime, as soon as I tried to access that texture I would get an EXC_BAD_INSTRUCTION (code=EXC_I386_INVOP, subcode=0x0) error, with a log on the console saying [MTLDebugTexture retain]: message sent to deallocated instance 0x6000001a25a0. This of course meant that my texture was being released and I was trying to access a stale texture pointer. The solution to this problem was basically RTFM. I hadn’t read how bridge casts worked, and as soon as I did, the “You IDIOT!” message started echoing in my head. Clang’s Automatic Reference Counting documentation explains how bridge casts work. What I was doing originally was this:
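In sketch form; the texture creation call is illustrative:

```objc
// the original (buggy) version
id<MTLTexture> texture = [device newTextureWithDescriptor:descriptor];
TextureID textureID = (__bridge void *)texture;  // no ownership transfer!
// ... once ARC releases `texture`, textureID is a stale pointer
```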
and then I tried to read it back by casting it to an id<MTLTexture>. As soon as I cast it from void* I got the EXC_BAD_INSTRUCTION. The solution was easy: when creating the texture, instead of using the __bridge modifier I should’ve used __bridge_retained. This keyword increments the reference count for that specific object, which means that if I pass the pointer to my C game code I can use it without problems, and if I pass it back to Objective-C I can just use __bridge, which won’t modify the reference count. The only moment I need to decrement the reference count is when I want to release the texture, and for that I just cast back to id<MTLTexture> using the keyword __bridge_transfer. After making this change I stopped getting the bad instruction error and the renderer worked perfectly.
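Putting it together, the lifecycle looks roughly like this:

```objc
// creation: move ownership out of ARC (reference count +1)
TextureID textureID = (__bridge_retained void *)texture;

// usage from Objective-C: no reference count change
id<MTLTexture> tex = (__bridge id<MTLTexture>)textureID;

// release: hand ownership back to ARC so it can free the object
id<MTLTexture> doomed = (__bridge_transfer id<MTLTexture>)textureID;
doomed = nil;
```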
The end result was what I wanted: a simple 2D renderer that can draw frames and tint them. Here is how it looks on an iPhone 7.
More stuff added to my tiny renderer. Support for sprite sheets and tinting pic.twitter.com/XMN1vsrQn3— felipe (@bitnenfer) September 20, 2018
The cool thing is that since it was developed with Metal, I can also run it on my Apple TV. It honestly performs better than I expected.