Real-Time Image Processing on iOS

PrismaCam was a 'weekend hack' project I did last year, and it was briefly on the App Store. I took it down after deciding that it wasn't quite a minimum viable product. At some point I'm going to revive it with a few basic updates, because I still think it's a really fun app to play with.

I'm hoping to have some time this year to dust it off and put out a more finished version.

Camera In, Kaleidoscope Out

The project started as a way to explore the iPhone video processing facilities. The AVFoundation library is extremely powerful and allows a programmer to process video frames as they are captured in real-time through the camera.

Actually saving out a video from the processed images requires a ton of boilerplate code, but fortunately Brad Larson's brilliant GPUImage library makes this very easy.

Later, I may add some more specific details, but here's a high-level overview of how the product works.

Early Steps

plotting out the effect

I unearthed this Photoshop file I made early on in the process. My plan was to crop the source video frame (the sunflower in the lower right of the example) into a single facet, which would then be transformed and mirrored in some interesting pattern.

I looked at a lot of vintage kaleidoscopes and decided that my favorites were the kind that had repeating hexagonal blocks composed of smaller, triangular facets. The Photoshop diagram is a rough illustration of what the geometry would look like. It shows how texture coordinates in the source image would be applied to the final polygon mesh.
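To make the facet layout concrete, here is a minimal sketch in C of one hexagonal block built from six triangular facets. It is not the app's actual mesh, which was modeled by hand. The names FacetVertex, SourceFacet, and buildHexagon are hypothetical, and flipping every other facet is just one assumed way to get the mirrored look. Each facet samples the same triangular crop of the source frame, with the two outer texture coordinates swapped on alternating facets.

```c
/* One mesh vertex: a 2D position plus the texture coordinate sampled there.
 * (Hypothetical type, for illustration only.) */
typedef struct {
    float x, y;   /* position in the output mesh */
    float u, v;   /* coordinate in the source video frame */
} FacetVertex;

/* A triangular crop of the source frame (hypothetical type). */
typedef struct {
    float apexU, apexV;    /* mapped to the hexagon's center */
    float leftU, leftV;    /* mapped to one outer corner */
    float rightU, rightV;  /* mapped to the other outer corner */
} SourceFacet;

/* Unit directions of the six hexagon corners, 60 degrees apart. */
static const float kHexCorners[6][2] = {
    { 1.0f,  0.0f},       { 0.5f,  0.8660254f}, {-0.5f,  0.8660254f},
    {-1.0f,  0.0f},       {-0.5f, -0.8660254f}, { 0.5f, -0.8660254f}
};

/* Fills `out` with 6 triangles (18 vertices) forming a hexagon centered at
 * (cx, cy). Odd-numbered facets swap the left/right texture coordinates so
 * that each facet is a mirror image of its neighbors. */
void buildHexagon(float cx, float cy, float radius,
                  SourceFacet src, FacetVertex out[18])
{
    for (int k = 0; k < 6; k++) {
        const float *c0 = kHexCorners[k];
        const float *c1 = kHexCorners[(k + 1) % 6];
        int mirrored = k % 2;
        FacetVertex *tri = &out[k * 3];

        /* The center of the hexagon always samples the apex of the crop. */
        tri[0] = (FacetVertex){cx, cy, src.apexU, src.apexV};

        tri[1] = (FacetVertex){cx + radius * c0[0], cy + radius * c0[1],
                               mirrored ? src.rightU : src.leftU,
                               mirrored ? src.rightV : src.leftV};

        tri[2] = (FacetVertex){cx + radius * c1[0], cy + radius * c1[1],
                               mirrored ? src.leftU : src.rightU,
                               mirrored ? src.leftV : src.rightV};
    }
}
```

Because adjacent facets end up with the same texture coordinate at every corner they share, the image reflects across each seam instead of jumping.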

Building the Geometry

wireframe of prismacam's geometry

After figuring out the process, I built a polygonal mesh in a 3D modeling program and exported it as an OBJ model. OBJ is an old-school, text-based format that is easy to understand and script around. I converted the mesh points into an array that would work with OpenGL ES, and then took a painstaking couple of hours to assign the proper texture coordinates to each point.
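As a rough illustration of that conversion step, the following C sketch reads a small subset of OBJ text and flattens it into an interleaved x, y, z, u, v array of the kind that can be handed to OpenGL ES as vertex data. It assumes triangulated faces written as f v/vt v/vt v/vt, no indented lines, and a fixed capacity. The names objToInterleaved and MAX_ITEMS are hypothetical, and in the original project the texture coordinates were assigned by hand rather than read from the file.

```c
#include <stdio.h>
#include <string.h>

#define MAX_ITEMS 256  /* illustrative fixed capacity for positions and UVs */

/* Parses OBJ text and writes 5 floats (x, y, z, u, v) per triangle corner
 * into `out`. Only "v", "vt", and triangular "f a/b c/d e/f" lines are
 * handled; everything else is skipped. Returns the number of vertices
 * written, never more than `maxVerts`. */
int objToInterleaved(const char *obj, float *out, int maxVerts)
{
    float pos[MAX_ITEMS][3];
    float tex[MAX_ITEMS][2];
    int posCount = 0, texCount = 0, written = 0;

    const char *line = obj;
    while (*line) {
        float a, b, c;
        int p[3], t[3];

        if (posCount < MAX_ITEMS &&
            sscanf(line, "v %f %f %f", &a, &b, &c) == 3) {
            pos[posCount][0] = a;
            pos[posCount][1] = b;
            pos[posCount][2] = c;
            posCount++;
        } else if (texCount < MAX_ITEMS &&
                   sscanf(line, "vt %f %f", &a, &b) == 2) {
            tex[texCount][0] = a;
            tex[texCount][1] = b;
            texCount++;
        } else if (sscanf(line, "f %d/%d %d/%d %d/%d",
                          &p[0], &t[0], &p[1], &t[1], &p[2], &t[2]) == 6) {
            if (written + 3 > maxVerts) break;  /* output buffer is full */

            /* OBJ indices are 1-based; reject anything out of range. */
            int valid = 1;
            for (int i = 0; i < 3; i++) {
                if (p[i] < 1 || p[i] > posCount ||
                    t[i] < 1 || t[i] > texCount) valid = 0;
            }
            for (int i = 0; valid && i < 3; i++) {
                float *v = &out[written * 5];
                memcpy(v, pos[p[i] - 1], 3 * sizeof(float));
                memcpy(v + 3, tex[t[i] - 1], 2 * sizeof(float));
                written++;
            }
        }

        /* Advance to the start of the next line, if there is one. */
        line = strchr(line, '\n');
        if (!line) break;
        line++;
    }
    return written;
}
```

Expanding every face corner into its own entry wastes some memory compared with an indexed mesh, but it keeps the array simple enough to edit by hand, which matters when the texture coordinates are being tweaked one point at a time.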

Reading the Video Data

The very first version was hacked around this method of the AVAssetWriterInput class:

- (void)requestMediaDataWhenReadyOnQueue:(dispatch_queue_t)queue usingBlock:(void (^)(void))block

As data becomes available to the video input, this block is called and a CVPixelBufferPool can be created. The pixel data can be passed to OpenGL ES, at which point it's available for use as a texture map. The texture map is then rendered onto the mesh with the appropriate coordinates, and the user sees a real-time kaleidoscope made out of whatever the camera is pointed at.
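One practical detail when handing pixel data to OpenGL ES is row padding: a Core Video pixel buffer can report a bytes-per-row value larger than width × 4, while a texture upload generally expects tightly packed rows. The C sketch below shows one way to repack such a buffer before uploading it. It assumes a 32-bit BGRA layout, and packBGRARows is a hypothetical helper, not something taken from the app or from any Apple API.

```c
#include <stddef.h>
#include <string.h>

/* Copies a 4-bytes-per-pixel image whose rows start `bytesPerRow` bytes
 * apart into `dst`, which must hold width * height * 4 bytes with no
 * padding between rows. Returns 0 on success, or -1 if the arguments are
 * invalid (null pointers, or a stride too small for the given width). */
int packBGRARows(const unsigned char *src, size_t width, size_t height,
                 size_t bytesPerRow, unsigned char *dst)
{
    const size_t packedRow = width * 4;

    if (!src || !dst || bytesPerRow < packedRow) {
        return -1;
    }
    for (size_t y = 0; y < height; y++) {
        /* Copy only the visible pixels and skip the padding at row end. */
        memcpy(dst + y * packedRow, src + y * bytesPerRow, packedRow);
    }
    return 0;
}
```

When the stride already equals width × 4, the copy is unnecessary and the original buffer can be uploaded directly.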

While this was a workable solution, it needed a lot more work before it could actually save out clips. I had other projects to work on, and it sat on the shelf for a while, until GPUImage was released and I realized it would only take a day or two to complete.

So, that's the high-level overview of how it all works. At some point, I'll re-release it. For now, here's a video made with the app: