In this project we are going to design and code a visual synthesizer, Image Performer. The Image Performer visual synthesizer will run on iPads. It will be developed using Apple’s state-of-the-art declarative UI framework, SwiftUI, and take advantage of Apple’s other recent technologies, including the reactive framework Combine and the low-level graphics engine Metal. While designed to run as a stand-alone app, it will allow for MIDI controllers and the use of resources created by other graphics applications.
What we will learn
The purpose of the project is to serve as an environment for designers (including myself) to see how a substantial integrated app designed around SwiftUI, CoreData, Metal and Combine can be built from the ground up. We will learn a bit about designing modern software and about visual music synthesizer design in particular.
What you will need
Our development platform is Xcode. You will also need a fairly recent iPad running the most current iOS. We will design Image Performer to be played on iPads of any size, though experience with prototypes suggests that playing a visual synthesizer benefits from the larger form factors.
The modules assume familiarity with computer programming, Swift, and Xcode development. If you don’t have those, you might want to go through one of the many excellent online books or tutorials designed to help with that.
What is a visual synthesizer?
A visual synthesizer is a musical instrument that produces moving graphic images rather than sounds. Unlike a video sequencer, which plays back prerecorded or camera-captured video, a synth relies upon oscillators to produce changing digital images mathematically. The oscillators control the colors, form and motion of “drawn” objects in real time.
Image Performer’s specific character
The particular model used in Image Performer is based on ideas from modernist painters of the early and mid 20th century, a period that represents a high-water mark of interest in creating a visual art like the art of music. You can learn more about that on this web site. My lecture at the recent Expanded Animation symposium gives a quick (20 minute) introduction.
Image Performer’s architecture is organized around a spine in which CoreData is used to store information about individual objects, referred to as lumis (logical units for manipulating images) and Metal is used to display them. Groups of lumis are collected into chords, as musical notes are. Chords are organized into chordsets, much as presets are used in audio synthesizers.
The properties of lumis cluster around defining and mutating forms, altering colors, and affecting motion. There are a dozen or so oscillators at work in each of these three domains: color, form and motion. Each oscillator controls selected mathematical functions, whose parameters are in turn controlled by sliders, triggers, rhythm pads, and other interfaces.
Image Performer’s design will address connection with music generation programs through MIDI, so it will be possible to create integrated (visual and sonic) artworks. But it will also be playable with nothing more than an iPad. My own thinking places Image Performer in a context where the visual performance is the product of a visual artist interacting live with musicians, including through improvisation.
Our design approach
The approach we will take is iterative prototyping with successive refinement. This is a software design strategy with both top down and bottom up aspects. It represents a nice combination of reflection and action, thinking and doing. Because of its iterative nature, the big picture is always evident. Because of its commitment to successive refinement, we are free to develop rough versions of something that we know will be replaced by a more refined version.
The approach allows for a back and forth between having a plan and taking advantage of situational opportunities. Having built several visual synths, I have many thoughts about architecture, but being relatively new to declarative and reactive programming and the opportunities provided by Metal, SwiftUI, and Combine, I am, like you, first and foremost on a learning journey.
My expectation is that by taking this approach, design issues will be brought into sharp focus so that each of you can pursue alternatives to the design work done here. Indeed, if this project is successful, you should be able to build the visual synthesizer of your dreams, rather than being stuck with only that which we build here.
How long will it take?
I truly do not know. One goal is to produce something elegant and easy to maintain and extend. Among the things I have learned about design are that great designs require extended reflection and that design often generates surprises. The interaction of these forces makes time frames unpredictable. For example, I don’t know if doing the project in this public way will reduce or extend the amount of time it takes. There will be the obvious costs of writing narratives, sharing code, and responding to questions. But there may also be accelerating effects of other minds at work. I look forward to learning how this works out.
Because this project is being conducted in real time and based in part on what we learn along the way, what follows is a general guide which will certainly change. It gives an idea of the topics and the order I plan on addressing them in, but as I have indicated, the actual designing will be more iterative than this might suggest, with us returning to earlier modules to refine them.
Image Performer is organized around eight user interface modules, three for defining and updating its basic drawing objects and others for collecting those objects into chords, for triggering their appearance on the display, for saving, organizing and sharing them, and for preserving user preferences.
Metal is Apple’s high-powered, low-level graphics engine. We will use it to render Image Performer’s output, first in the canvas panel and later in chordset views and on external displays. In this module, we create a connection to the Metal engine. While keeping things simple, the module provides enough to get us started with Metal and to begin building a simple color organ.
Image Performer’s high-level visual model is based around controlling color, form, and motion. Each of these areas requires a model. The color model was chosen to be intuitive, flexible, and easy to manipulate in real time. Two important controls are the ribbons for controlling hue and purity.
To make the color model more playable, we draw on a trick from musical instrument design: tempered scales.
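One possible reading of a tempered scale for color, offered here as an assumption and not as Image Performer’s actual implementation: divide the hue circle into equal steps, much as equal temperament divides the octave, and snap a continuous ribbon position to the nearest step. The function name `temperedHue` and the default of 12 steps are hypothetical.

```swift
import Foundation

/// Snaps a continuous hue (0..<1 around the color wheel) to the nearest of
/// `steps` equally spaced hues. Hypothetical helper for illustration only.
func temperedHue(_ hue: Double, steps: Int = 12) -> Double {
    precondition(steps > 0, "steps must be positive")
    // Wrap into 0..<1 so values past either end of the ribbon stay on the wheel.
    let wrapped = hue - floor(hue)
    // Round to the nearest step index; the top step wraps back to index 0.
    let index = Int((wrapped * Double(steps)).rounded()) % steps
    return Double(index) / Double(steps)
}
```

With 12 steps, a ribbon position of 0.26 snaps to 0.25, so a player’s finger does not have to land exactly on a hue to hit it.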
To further facilitate movement through color spaces, we add controllers to allow for mixing combinations of brightness and saturation to vary color purity, for combining collections of hues, and for accessing the monochromatic space of black, white and grays.
Chords are used to manage the elements that Image Performer uses to create and change images. There are color chords, form chords, motion chords and others. What they hold in common is that they organize the complexities of drawing in many places and with many objects simultaneously. Chords are in turn organized into chordsets. Chordsets determine what flexibilities a player has at her fingertips at the moment. They are often associated with a tune or a movement. In this module we create containers for chords and chordsets as a foundation for our persistence model in CoreData.
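The containment described above can be sketched with plain Swift value types. This is only a rough stand-in: the real containers would be CoreData entities, and every name here (`ChordKind`, `Chord`, `Chordset`, `settings`) is hypothetical. `Codable` conformance is included solely so the sketch can be saved and restored without CoreData.

```swift
import Foundation

// Hypothetical value types standing in for CoreData entities.
enum ChordKind: String, Codable {
    case color, form, motion
}

struct Chord: Codable, Equatable {
    var name: String
    var kind: ChordKind
    /// Named parameter values the chord applies when triggered (assumed shape).
    var settings: [String: Double]
}

struct Chordset: Codable, Equatable {
    var title: String
    var chords: [Chord] = []

    /// Returns only the chords of one kind, in their stored order.
    func chords(of kind: ChordKind) -> [Chord] {
        chords.filter { $0.kind == kind }
    }
}
```

A chordset built this way can be encoded with `JSONEncoder` and decoded back to an equal value, which is a convenient check on the shape of the data before committing to a CoreData schema.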
We start adding some form to the colors. To get started we will allow the canvas to contain multiple panes. I think of these as akin to the pieces of glass that are arranged to make up a stained glass panel. At first, each pane will be filled with a single, possibly changing color. In time, we will add moving forms to them.
The metronome controls Image Performer’s heartbeat. Once created, we’ll use the metronome to enable each chord’s background color to change independently.
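The arithmetic behind a metronome is small enough to sketch. The version below is an assumption about how a heartbeat might be modeled, not the project’s design: it converts elapsed seconds into a beat index and a phase within that beat, which a chord could use to time its color changes. The names `Metronome`, `bpm`, and `beat(at:)` are hypothetical.

```swift
/// Hypothetical metronome model: pure arithmetic, no timers.
struct Metronome {
    var bpm: Double  // beats per minute, assumed to be greater than zero

    var secondsPerBeat: Double { 60 / bpm }

    /// Returns the beat number (counting from 0) and the fraction of that
    /// beat already elapsed, for a time measured in seconds from the start.
    func beat(at elapsed: Double) -> (index: Int, phase: Double) {
        let beats = elapsed / secondsPerBeat
        let index = Int(beats.rounded(.down))
        return (index, beats - Double(index))
    }
}
```

At 120 beats per minute, 1.25 seconds falls halfway through the third beat (index 2, phase 0.5).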
Image Performer is used to produce a performance which can be viewed live on an external device or recorded. We will use scenes in SwiftUI to support that requirement.
Functors as oscillators
A functor is a software-based oscillator. Synthesizers are made up of oscillators that alter connected inputs to produce outputs. In our case, the inputs are values that emerge from controllers and the outputs will be changes in colors, forms, and motions.
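As a minimal sketch of the idea, assuming a functor can be modeled as something that maps a controller value and a time to an output value: the protocol and the sine oscillator below are hypothetical illustrations, not the functor design the later modules will arrive at.

```swift
import Foundation

/// Hypothetical protocol: a functor turns a controller input and the
/// current time into an output value that can drive color, form, or motion.
protocol Functor {
    func output(input: Double, time: Double) -> Double
}

/// A sine oscillator whose swing around `center` is scaled by the input.
struct SineFunctor: Functor {
    var frequency: Double        // cycles per second
    var center: Double = 0.5     // resting value when input is 0

    func output(input: Double, time: Double) -> Double {
        center + 0.5 * input * sin(2 * Double.pi * frequency * time)
    }
}
```

With `input` at 0 the output rests at `center`; raising the controller to 1 lets the value swing across the full 0 to 1 range once per cycle.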
Lumis, the basic drawing units
The lumi (logical unit for manipulating images) is Image Performer’s most basic unit. Each lumi has a form, color, and motion associated with it. These change in real time in response to the controls we are building. Even the simple version of a lumi we start with here will transform what we have been doing from a mere color organ into a more compelling instrument.
Among the most familiar ways of playing a musical instrument is through the use of keyboards. A simple eight-key unit provides a model controller for playing Image Performer alongside music. With this addition Image Performer will begin to feel like an instrument.
There is an infinity of form models that can be used in visual synthesis. The first ones we build in this module are based on manipulating basic geometric objects. Functors are used to manipulate their line sizes, number of edges, aspect ratios, and other features.
Our motion model, which imitates the one used in animation and cinema more generally, provides oscillators for controlling rotation, scaling and translation. With these capabilities in place we’ll have a substantial expressive palette.
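Rotation, scaling, and translation combine in the familiar way from 2D animation. The sketch below assumes one particular order (scale, then rotate about the origin, then translate) and uses hypothetical names (`Point`, `Motion`, `apply(to:)`); the project’s own motion model may order or parameterize things differently.

```swift
import Foundation

struct Point {
    var x: Double
    var y: Double
}

/// Hypothetical motion state for one drawn object.
struct Motion {
    var rotation: Double = 0              // radians, counterclockwise
    var scale: Double = 1                 // uniform scale factor
    var translation = Point(x: 0, y: 0)   // offset applied last

    /// Applies scale, then rotation about the origin, then translation.
    func apply(to p: Point) -> Point {
        let sx = p.x * scale
        let sy = p.y * scale
        let c = cos(rotation)
        let s = sin(rotation)
        return Point(x: sx * c - sy * s + translation.x,
                     y: sx * s + sy * c + translation.y)
    }
}
```

An oscillator driving `rotation` while another drives `scale` would make a form spin and pulse at once, which is the kind of layered motion the paragraph above describes.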
Layers and listening
When playing many objects, one wishes to select individually which objects respond to control value changes. Multiple layers and a listening model provide that capability.
Sketchpads and rhythm pads
Alternative models can be added to provide for other kinds of interaction, including sketched forms and more complex rhythmic patterns.
Until now we have relied upon surface-based motions to play Image Performer, but the iPad has motion sensing capabilities that add variety to the performance styles that can be achieved, even with nothing more than the iPad.
In the visual domain, there are a lot of dimensions to control (even more than in the sonic domain). So, sequencers enable the player to offload some repeating patterns, thereby expanding the complexity of the available visual palette, particularly in improvisational settings.
A constraint of our design has been that it play on a simple and unmodified iPad. We can expand interface options by providing access to Image Performer’s oscillators through MIDI controllers.
Sharing models and performance data
A file sharing protocol and access to iCloud will enable users to share chordsets with one another.
As of: March 8, 2021