In this project we are going to design and code a visual synthesizer, Imager. The Imager Visual Synthesizer will run on iPads. It will be developed using Apple’s state-of-the-art declarative UI framework, SwiftUI, and take advantage of its latest technologies, including the reactive framework Combine and the low-level graphics engine Metal. While designed to run as a stand-alone app, it will allow for MIDI controllers and the use of resources created by other graphics applications.
What we will learn
The purpose of the project is to serve as an environment for designers (including myself) to see how a substantial integrated app designed around SwiftUI, CoreData, Metal and Combine can be built from the ground up. We will learn a bit about designing modern software and about visual music synthesizer design in particular.
What you will need
Our development platform is Xcode. You will also need a fairly recent iPad running the most current iOS. We will design Imager to be played on iPads of any size, though experience with prototypes suggests that playing a visual synthesizer benefits from the larger form factors.
The modules assume familiarity with computer programming, Swift, and Xcode development. If you don’t have those, you might want to go through one of the many excellent online books or tutorials designed to help with that.
What is a visual synthesizer?
A visual synthesizer is a musical instrument that produces moving graphic images rather than sounds. Unlike a video sequencer, which plays back prerecorded or camera-captured video, a visual synthesizer relies upon oscillators to produce changing digital images mathematically. The oscillators control the colors, forms, and motion of “drawn” objects in real time.
Imager’s specific character
The particular model used in Imager is based on ideas from modernist painters of the early and mid-20th century, a period that represents a high-water mark of interest in creating a visual art akin to the art of music. You can learn more about that on this web site. My lecture at the recent Expanded Animation symposium gives a quick (20-minute) introduction.
Imager’s architecture is organized around a spine in which CoreData is used to store information about individual objects, referred to as lumis (logical units for manipulating images), and Metal is used to display them. Groups of lumis are collected into chords, much as musical notes are. Chords are organized into chordsets, much as presets are grouped in audio synthesizers.
The properties of lumis cluster around defining and mutating forms, altering colors, and affecting motion. There are a dozen or so oscillators at work in each of these three domains: color, form and motion. Each oscillator controls selected mathematical functions, whose parameters are in turn controlled by sliders, triggers, rhythm pads, and other interfaces.
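To make that control chain concrete, here is a minimal sketch of a slider feeding an oscillator parameter. The OscillatorSettings type and its frequency property are placeholders for illustration, not Imager’s eventual model:

```swift
import SwiftUI
import Combine

// A hypothetical container for one oscillator parameter. Changes publish
// automatically, so any view or functor observing it reacts in real time.
final class OscillatorSettings: ObservableObject {
    @Published var frequency: Double = 1.0   // cycles per second
}

// A slider bound directly to that parameter.
struct FrequencySlider: View {
    @ObservedObject var settings: OscillatorSettings

    var body: some View {
        VStack(alignment: .leading) {
            Text("Frequency: \(settings.frequency, specifier: "%.1f") Hz")
            Slider(value: $settings.frequency, in: 0.1...10)
        }
        .padding()
    }
}
```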
Imager’s design will address connection with music generation programs through MIDI. So it will be possible to create integrated (visual and sonic) artworks. But it will also be playable with nothing more than an iPad. My own thinking places Imager in a context where the visual performance is the product of a visual artist interacting live with musicians, including through improvisation.
Our design approach
The approach we will take is iterative prototyping with successive refinement. This is a software design strategy with both top down and bottom up aspects. It represents a nice combination of reflection and action, thinking and doing. Because of its iterative nature, the big picture is always evident. Because of its commitment to successive refinement, we are free to develop rough versions of something that we know will be replaced by a more refined version.
The approach allows for a back and forth between having a plan and taking advantage of situational opportunities. Having built several visual synths, I have many thoughts about architecture, but being relatively new to declarative and reactive programming and the opportunities provided by Metal, SwiftUI, and Combine, I am, like you, first and foremost on a learning journey.
My expectation is that by taking this approach, design issues will be brought into sharp focus so that each of you can pursue alternatives to the design work done here. Indeed, if this project is successful, you should be able to build the visual synthesizer of your dreams, rather than being limited to the one we build here.
How long will it take?
I truly do not know. One goal is to produce something elegant and easy to maintain and extend. Among the things I have learned about design are that great designs require extended reflection and that design often generates surprises. The interaction of these forces makes time frames unpredictable. For example, I don’t know if doing the project in this public way will reduce or extend the amount of time it takes. There will be the obvious costs of writing narratives, sharing code, and responding to questions. But there may also be accelerating effects of other minds at work. I look forward to learning how this works out.
Because this project is being conducted in real time and based in part on what we learn along the way, what follows is a general guide that will certainly change. It lists the topics in the order I currently plan to address them, but as I have indicated, the actual designing will be more iterative than this might suggest, with us returning to earlier modules to refine them.
The chordset is an organizing unit for managing the elements with which Imager creates and changes images. At any given moment the player has access to the chords in a chordset. A chord is in turn composed of a background and lumis, the basic units that control drawing. We will create containers for chords and chordsets as well as a basic persistence model in CoreData.
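As a preview of the persistence spine, here is a minimal CoreData stack sketch. The model name “Imager” and the assumption that Chordset, Chord, and Lumi entities exist in it are placeholders; the real model will be worked out in the module itself:

```swift
import CoreData

// Minimal persistence layer. The "Imager" data model, with Chordset, Chord,
// and Lumi entities, is assumed here and will be designed in this module.
final class PersistenceController {
    static let shared = PersistenceController()

    let container: NSPersistentContainer

    init(inMemory: Bool = false) {
        container = NSPersistentContainer(name: "Imager")
        if inMemory {
            // Useful for SwiftUI previews and tests: nothing is written to disk.
            container.persistentStoreDescriptions.first?.url =
                URL(fileURLWithPath: "/dev/null")
        }
        container.loadPersistentStores { _, error in
            if let error = error {
                fatalError("Failed to load store: \(error)")
            }
        }
    }
}
```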
Metal is Apple’s low-level, high-powered graphics engine. We will use it to render Imager’s output on both the iPad and external displays. We will create a connection to the Metal engine, much as we did to CoreData in the first module.
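As a rough sketch of the kind of connection we will build, here is one common way to host an MTKView inside SwiftUI. The Renderer class is a placeholder that only clears the screen; the actual drawing of lumis comes later:

```swift
import SwiftUI
import MetalKit

// A placeholder delegate that clears the view each frame. A production
// renderer would create the command queue once rather than every frame.
final class Renderer: NSObject, MTKViewDelegate {
    func mtkView(_ view: MTKView, drawableSizeWillChange size: CGSize) {}

    func draw(in view: MTKView) {
        guard let device = view.device,
              let drawable = view.currentDrawable,
              let descriptor = view.currentRenderPassDescriptor,
              let queue = device.makeCommandQueue(),
              let buffer = queue.makeCommandBuffer(),
              let encoder = buffer.makeRenderCommandEncoder(descriptor: descriptor)
        else { return }
        // Draw calls for lumis would be encoded here.
        encoder.endEncoding()
        buffer.present(drawable)
        buffer.commit()
    }
}

// Bridge MTKView into the SwiftUI view hierarchy. The caller keeps the
// renderer alive, since MTKView holds its delegate weakly.
struct MetalCanvas: UIViewRepresentable {
    let renderer: Renderer

    func makeUIView(context: Context) -> MTKView {
        let view = MTKView()
        view.device = MTLCreateSystemDefaultDevice()
        view.clearColor = MTLClearColor(red: 0, green: 0, blue: 0, alpha: 1)
        view.delegate = renderer
        return view
    }

    func updateUIView(_ view: MTKView, context: Context) {}
}
```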
Ultimately Imager will have a number of modules that the player will want to show and hide. We’ll create some placeholder hooks to manage the view hierarchy, learning how Combine will allow us to manage all of the asynchronous processing that will make Imager responsive as an instrument.
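Here is a minimal sketch of what those placeholder hooks might look like, using an ObservableObject (which relies on Combine under the hood) so panels appear and disappear as published values change. The PanelState and ControlSurface names are hypothetical:

```swift
import SwiftUI
import Combine

// Hypothetical flags for which control panels are currently on screen.
final class PanelState: ObservableObject {
    @Published var showKeyboard = false
    @Published var showRhythmPads = false
}

// A placeholder surface that reacts to those flags.
struct ControlSurface: View {
    @ObservedObject var panels: PanelState

    var body: some View {
        VStack {
            if panels.showKeyboard {
                Text("Keyboard placeholder")
            }
            if panels.showRhythmPads {
                Text("Rhythm pads placeholder")
            }
            Button("Toggle keyboard") { panels.showKeyboard.toggle() }
        }
    }
}
```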
Just as lumis, chords, and chordsets form Imager’s spine, the Metronome controls its heartbeat. Once created, we’ll use the metronome to enable each chord’s background color to change independently.
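Here is one simple way a metronome could be sketched with Combine’s timer publisher. The Metronome name, the beat counter, and the bpm parameter are assumptions for illustration:

```swift
import Combine
import Foundation

// Publishes a beat count at a tempo given in beats per minute.
final class Metronome: ObservableObject {
    @Published private(set) var beat = 0
    private var cancellable: AnyCancellable?

    func start(bpm: Double) {
        let interval = 60.0 / bpm
        cancellable = Timer.publish(every: interval, on: .main, in: .common)
            .autoconnect()
            .sink { [weak self] _ in self?.beat += 1 }
    }

    func stop() {
        cancellable = nil   // cancelling the subscription stops the ticks
    }
}
```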
Imager is used to produce a performance which can be viewed live on external displays or recorded. We will use scenes in SwiftUI to support that requirement.
Imager’s high-level visual model is based around controlling color, form, and motion. Each of these areas requires a model. The color model we will use was chosen to be intuitive, flexible, and easy to manipulate in real time.
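For illustration, here is a minimal color model sketch assuming hue, saturation, and brightness components; whether Imager ends up with exactly this model is an open design question:

```swift
import SwiftUI

// One possible color model: hue, saturation, brightness, each of which an
// oscillator can nudge in real time. The HSB choice is an assumption.
struct LumiColor {
    var hue: Double        // 0...1, wraps around the color wheel
    var saturation: Double // 0...1
    var brightness: Double // 0...1

    var swiftUIColor: Color {
        Color(hue: hue, saturation: saturation, brightness: brightness)
    }

    // Offset the hue and wrap, so continuous oscillation never clips.
    func shiftingHue(by delta: Double) -> LumiColor {
        var copy = self
        copy.hue = (hue + delta).truncatingRemainder(dividingBy: 1)
        if copy.hue < 0 { copy.hue += 1 }
        return copy
    }
}
```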
Functors as oscillators
A functor is a software-based oscillator. Synthesizers are made up of oscillators that alter connected inputs to produce outputs. In our case, the inputs are values that emerge from controllers and the outputs will be changes in colors, forms, and motions.
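A rough sketch of what a functor could look like in Swift: a protocol mapping time to a normalized value, with a sine oscillator as one concrete case. The protocol and type names are assumptions, not Imager’s final design:

```swift
import Foundation

// A functor maps elapsed time to a normalized output in 0...1.
protocol Functor {
    func value(at time: TimeInterval) -> Double
}

// A sine oscillator whose parameters would be set by sliders, triggers,
// rhythm pads, or MIDI controls.
struct SineFunctor: Functor {
    var frequency: Double   // cycles per second
    var phase: Double = 0   // radians

    func value(at time: TimeInterval) -> Double {
        (sin(2 * .pi * frequency * time + phase) + 1) / 2
    }
}
```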
Lumis, the basic drawing units
The lumi (logical unit for manipulating images) is Imager’s most basic unit. Each lumi has a form, color, and motion associated with it. These change in real time in response to the controls we are building. Even the simple version of a lumi we start with here will transform what we have been doing from a mere color organ into a more compelling instrument.
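A minimal in-memory sketch of a lumi and its three property clusters. The field names are placeholders, and LumiColor refers to the color model sketched earlier; the CoreData-backed version will look different:

```swift
import CoreGraphics

// A lumi bundles form, color, and motion into one drawing unit.
struct Lumi {
    var form: LumiForm     // a simple geometric description
    var color: LumiColor   // see the color model sketch above
    var motion: LumiMotion // rotation, scale, translation
}

struct LumiForm {
    var edgeCount: Int        // 3 = triangle, 4 = square, ...
    var lineWidth: CGFloat
    var aspectRatio: CGFloat
}

struct LumiMotion {
    var rotation: CGFloat     // radians
    var scale: CGFloat
    var translation: CGVector
}
```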
One of the most familiar ways of playing a musical instrument is through a keyboard. A simple eight-key unit provides a model controller for playing Imager alongside music. With this addition Imager will begin to feel like an instrument.
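A hypothetical eight-key controller is easy to sketch in SwiftUI; the play callback stands in for whatever chord-triggering mechanism we end up with:

```swift
import SwiftUI

// Eight tappable keys in a row; each triggers a chord by index.
struct KeyboardView: View {
    var play: (Int) -> Void   // callback into the instrument

    var body: some View {
        HStack(spacing: 4) {
            ForEach(0..<8, id: \.self) { index in
                Button(action: { play(index) }) {
                    RoundedRectangle(cornerRadius: 6)
                        .frame(height: 80)
                }
            }
        }
        .padding()
    }
}
```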
There is an infinity of form models that can be used in visual synthesis. The first ones we build in this project are based on manipulating basic geometric objects. Functors are used to manipulate their line sizes, number of edges, aspect ratios, and other features.
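As one small example of a functor manipulating a form feature, here is a sketch that makes line width breathe between two bounds. It reuses the SineFunctor sketched above; the modulator type and its names are placeholders:

```swift
import CoreGraphics
import Foundation

// Maps a functor's 0...1 output onto a line-width range.
struct LineWidthModulator {
    var oscillator = SineFunctor(frequency: 0.5)   // one cycle every two seconds
    var minWidth: CGFloat = 1
    var maxWidth: CGFloat = 12

    func width(at time: TimeInterval) -> CGFloat {
        minWidth + CGFloat(oscillator.value(at: time)) * (maxWidth - minWidth)
    }
}
```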
Our motion model, which imitates the one used in animation and cinema more generally, provides oscillators for controlling rotation, scaling and translation. With these capabilities in place we’ll have a substantial expressive palette.
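Here is a sketch of how the three motion operations might be composed into a single transform, reusing the LumiMotion placeholder from the lumi sketch above. With CGAffineTransform chained this way, scale is applied first, then rotation, then translation:

```swift
import CoreGraphics

extension LumiMotion {
    // Compose translation, rotation, and scale into one drawing transform.
    var transform: CGAffineTransform {
        CGAffineTransform.identity
            .translatedBy(x: translation.dx, y: translation.dy)
            .rotated(by: rotation)
            .scaledBy(x: scale, y: scale)
    }
}
```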
Layers and listening
When playing many objects, one wants to select precisely which objects respond to control-value changes. Multiple layers and a listening model provide that capability.
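A sketch of the listening idea: a control change is routed only to layers that are currently listening. Layer and ControlChange are placeholder types, and Lumi refers to the earlier sketch:

```swift
// A layer groups lumis and opts in or out of incoming control changes.
struct Layer {
    var isListening = true
    var lumis: [Lumi] = []
}

// A control change is just a mutation applied to a lumi.
struct ControlChange {
    var apply: (inout Lumi) -> Void
}

// Apply the change only to layers that are listening.
func route(_ change: ControlChange, to layers: inout [Layer]) {
    for i in layers.indices where layers[i].isListening {
        for j in layers[i].lumis.indices {
            change.apply(&layers[i].lumis[j])
        }
    }
}
```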
Sketchpads and rhythm pads
Alternative models can be added to provide for other kinds of interaction, including sketched forms and more complex rhythmic patterns.
Until now we have relied upon surface-based motions to play Imager, but the iPad has motion sensing capabilities that add variety to the performance styles that can be achieved, even with nothing more than the iPad.
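A sketch of how device motion could become two more control values using CoreMotion. The MotionSource name and the choice of roll and pitch are assumptions for illustration:

```swift
import CoreMotion
import Combine

// Publishes device roll and pitch at 60 Hz so they can drive oscillators.
final class MotionSource: ObservableObject {
    @Published var roll: Double = 0
    @Published var pitch: Double = 0

    private let manager = CMMotionManager()

    func start() {
        guard manager.isDeviceMotionAvailable else { return }
        manager.deviceMotionUpdateInterval = 1.0 / 60.0
        manager.startDeviceMotionUpdates(to: .main) { [weak self] motion, _ in
            guard let attitude = motion?.attitude else { return }
            self?.roll = attitude.roll
            self?.pitch = attitude.pitch
        }
    }

    func stop() { manager.stopDeviceMotionUpdates() }
}
```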
In the visual domain, there are a lot of dimensions to control (even more than in the sonic domain). So sequencers enable the player to offload some repeating patterns, thereby expanding the complexity of the available visual palette, particularly in improvisational settings.
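A step sequencer can be sketched as a repeating pattern advanced once per metronome beat (see the metronome sketch above); the StepSequencer type here is a placeholder:

```swift
// A repeating pattern of control values, one step per beat.
struct StepSequencer {
    var steps: [Double]   // e.g. hue offsets or scale factors

    func value(forBeat beat: Int) -> Double {
        guard !steps.isEmpty else { return 0 }
        return steps[beat % steps.count]
    }
}
```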
A constraint of our design has been that it play on a simple and unmodified iPad. We can expand interface options by providing access to Imager’s oscillators through MIDI controllers.
Sharing models and performance data
A file sharing protocol and access to iCloud will enable users to share chordsets with one another.
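One plausible sketch of a shareable chordset: encode it as JSON so files can travel through the Files app or iCloud Drive. The document structure and field names here are placeholders, not the final sharing format:

```swift
import Foundation

// A flattened, Codable description of a chordset for export and import.
struct ChordsetDocument: Codable {
    var name: String
    var chords: [ChordDescription]
}

struct ChordDescription: Codable {
    var backgroundHue: Double
    var lumiCount: Int
}

// Encode to pretty-printed JSON so shared files are easy to inspect.
func exportData(for chordset: ChordsetDocument) throws -> Data {
    let encoder = JSONEncoder()
    encoder.outputFormatting = [.prettyPrinted, .sortedKeys]
    return try encoder.encode(chordset)
}
```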
As of: December 6, 2020