Digital Control ideas for Modular Synths
Modular synths are increasingly coming out of the studio and into performance settings, and not just for experimental music. For pop, rock, and even dance-related genres, the ability to switch context quickly between songs would expand the range of use for modular synthesis. The ability to "recall state" would allow the capture of the often fleeting moments in modular exploration where an expressive or unique patch is created. To accomplish this, many things are required: digitally addressable and controllable modules, a "patch" sequencer of some sort, a digital communications standard, and a way of digitally repatching. Not all of these things currently exist, or are perhaps even wanted. For some discussion on the relevant pros/cons of these systems, and the technical limitations, here are a few forum posts:
Some of these features already exist in the Select Bus protocol:
This is MIDI and a clock(?) over the CV and gate bus lines on the Doepfer power header. In this protocol, the modules store their own data locally, and the sequencer merely sends a CC message to all modules on the line. There is no individual addressing, real-time control, back channel communication, or repatching options. But, since it's MIDI, addressing and real-time control are possible.
In the forum posts above, the lack of digital modules is cited as one of the main roadblocks to patch recall. If there are no modules to control, what's the point of a digital control standard? But, some of those posts are almost ten years old at this point, and modules with a microcontroller in them are now quite common. The tasks these microcontrollers perform range from UI control (reading pots/switches and changing LEDs/displays) to CV generation, LFO and envelope generation, full sound synthesis, and sequencing. Any of these tasks could be globally stored and recalled. And, although there is a "chicken and egg" problem involved, no progress will be made if no one works towards a solution. If a single manufacturer created an ecosystem of products that enabled this control, and open-sourced the protocol and hardware, others might choose to opt in as a way of increasing sales for a small increase in development costs.
There are many forms the patch sequencer module might take, and I will merely sketch out a few ideas here to set the background for how such an ecosystem might work. The main task would be to query the connected modules for their state, and save it in a menu system of some sort. These states could be written to flash cards for ease of transportation or computer composition. There could also be real-time recording to save certain knob movements (for example) and automate repetitive performance aspects. A global clock signal could be sent to all modules, and sound generation and parameter changes would be synced to this clock. If the controlled modules reported human-readable parameters, these could be shown on a display for editing.
Digital Communications Standard
This is often seen as the bottleneck for the creation of this kind of system, although I would argue that the "chicken and egg" problem stated earlier is the real issue. Having access to the internet wasn't that useful when no one else was on the internet. We already have a number of communication standards that could be shoehorned into service here, but none are ideal. General requirements would be: low cost (<2$ of added hardware cost), easy and cheap access to cables and jacks, bi-directional, high data throughput (~1Mbps), tolerant of mispatching, and simple to work with on a code level. This communication could go through either a front-panel connection or a back-plane connection. The former would be more versatile, and allow patching across rigs, but the latter could run at a higher data rate. A third option would be wireless communication over 802.15.4 or Wi-Fi. This would probably only work as a front-panel antenna system, as not all modules are designed to be robust against RF interference. The RF power level could be quite low, as the signal only has to go a few inches, but back-plane RF would also make inter-rig communication difficult for entirely metal enclosures. Just from a cost and ease of implementation perspective, I would not select a wireless method as the main mode of communication. Some possible options are:
- MIDI_1.0 - A good standard that is simple, inexpensive, and already well adopted by the music community. But, it is slow and not bi-directional. A second MIDI network could be used for the back channel, but then MIDI mergers would be required at each node, and they may not be able to handle the network traffic. A modified MIDI back channel could be used (see below).
- MIDI_2.0 - This fixes a lot of general complaints with MIDI_1.0, but seems focused on USB MIDI, which does not help in this case as a USB hub would be required on each module to daisy chain the signal. You can run MIDI_2.0 over the 1.0 hardware standard, but then you lose the bi-directionality and speed improvements. This is a more costly and complicated option.
- RS232/UART - This is basically what MIDI runs over, and is a generic physical layer that does not require synchronous clocks between nodes. It is inexpensive, simple, and bi-directional, and can be used at speeds up to 1Mbps. It would require a custom protocol to be designed.
- SPI - This is very robust, fast, inexpensive, bi-directional, and simple. The downside is that it requires a clock and chip select line. This would be a fine option for back-plane communication at speeds in excess of 1Mbps. I have used SPI on runs of over 20', in noisy environments, with no problems, and only using the microcontroller's onboard I/O drivers. It would require a custom protocol to be designed.
- I2C - This is a slow and unreliable form of communication that should only be used on runs less than 4". It also requires a clock, and is only half duplex. It would require a custom protocol to be designed.
- RS485/CAN bus - These systems are similar to one another, and to RS232, with the exception that RS485 and CAN bus are designed to allow multiple transmitters and receivers to share the same bus. This, along with the differential signalling (and associated reduced EMI), makes them a strong option. CAN bus is probably a bit heavy on protocol, and RS485 does not specify a protocol, so something new would probably need to be designed.
- Ethernet - High data rates, but expensive and power hungry. I mostly have it on the list for completeness' sake. OSC can be run over Ethernet.
These all presume a bussed network, rather than a star topology. The latter is possible, but makes the control module quite complicated, and wiring a bit difficult. For a bussed topology, addressing is required, and as such the limitations of microcontroller hardware need to be considered. Almost all microcontrollers have an addressing mode for SPI, and will ignore packets not assigned to them. This makes SPI an even more promising option for back-plane communication, but the multi-wire requirement precludes it from being useful on front-panels. Unfortunately, not all microcontrollers support addressing for UARTs. The ATMEGA series has a wake-on-any-address feature, which would cut down on interrupts greatly, and the STM32F series has a wake-on-address feature, but only allows for 4b addresses, which is too few to be useful here. It also has a "mute mode" which will ignore all subsequent packets in a frame, which could serve the same purpose as the ATMEGA wake-on-any-address feature. This wouldn't allow interspersing of "real-time" messages into data frames, though. The ATSAMD series does not have addressing features in UART mode.
Most microcontroller lines have an option for a CAN controller, which could work, but again, the overhead of CAN is probably 4x of what is needed here. CAN is more suited to large payloads that can handle a bit of latency. This would probably be fine for simple state recall, but would not work for any real-time control over a large number of modules. Ultimately, for a front-panel interface, the options seem to be: 1. Using a specific microcontroller that has an address peripheral, 2. Spending a lot of time in interrupts checking if packets are for you, 3. Having a separate microcontroller that just handles the bus, and passes relevant data back to another controller via SPI, I2C, etc. This last option is more appealing than it first might seem. It allows the protocol to be packaged up and sold as a single unit, creating a secondary revenue stream, and ensuring compatibility between modules. A small microcontroller can be had for 0.25$ to 0.50$, and would take up very little board space.
The connectors should be inexpensive, readily available, and allow for cabling which is shielded. It would also be good if they didn't allow for other audio connectors to be plugged in. A review of low cost connectors on DigiKey gave the following (all are similar in cost):
- Modular (RJ45, etc) - Low cost, available, and used for high data rates with CAT5/6 cables, which can come in shielded options, and many different lengths. The cables themselves are also cheap. Unfortunately, this would not make a good front-panel interconnect, as it takes up a lot of panel real-estate, which is precious in the Eurorack format. The exception to this is if SPI were to be used for a front-panel, as then the extra wires would be required.
- SATA - Low cost, available, used for very high data rates, and has 2 separately shielded transmission pairs. The cables come in various lengths with both vertical and horizontal connectors. The connectors would be a bit cumbersome for front-panel, as they don't extend far enough to go from PCB to the panel, and are a bit awkward to plug in vertically. But, they would make an excellent choice for an RS485 back-plane connection.
- IDC - The standard Eurorack power headers, but in a lower pin count, so as to preclude incorrect connections. These are common and cheap, and are easy to route. The connections are sometimes flaky, but doubling up pins can help here. This would be a decent option for a back-plane interconnect that used a single bus (not daisy-chained).
- 3.5mm TRS phone - Although problematic for incorrect patching, these are otherwise a good choice for front-panel applications. They are cheap, familiar, and already being integrated into panels. Shielded cables can be easily found, leaving the remaining lines for TX and RX. It is also used for MIDI, if that protocol were chosen.
Front-panel physical layer
There is no ideal solution, but the MIDI style repeater network over 3.5mm phone cables is my personal pick for a front-panel solution. Since all of the modules already share a ground, there is no need for optocouplers, but +/-12V tolerant I/O is needed. For data rates of 250kbps or slower, these could just be TL072 opamps or 74HC14 logic with some protection diodes. An example of what this might look like is below:
The specifics of the drivers are not shown, but they will be current limited (10mA), +/-12V tolerant, and use 0V to 3.3V signalling. The low voltage reduces EMI but still allows compatibility with 5V microcontrollers. The inputs will also be +/-12V tolerant, which will mean some sort of Schmitt trigger input to deal with floating mid-range voltages. The inputs will be tied high so the line is idle when no plug is inserted.
The sequencer output is straightforward, and each module will buffer the signal once for its own use, and a second time to protect the microcontroller. This signal then gets passed along to the second module in the chain. In this way, the sequencer can talk to all modules at once, with each responding only to its own address. The back channel is a bit more complicated as each module needs the ability to take control of the line. Typically this is done with open-collector drivers, but that requires a shared line, which could have very large capacitance (>10nF) with just a handful of modules plugged in. This would severely limit the data rate. Instead, each module uses CSMA with random backoff to determine if the line is free and take control. This process is simplified with a small bit of hardware that holds a line low for the duration of a packet, so the microcontroller doesn't mistakenly take control during a transmission while the data line is high (sending a 1).
The CTS hardware would not be required if the modules only responded to requests from the sequencer. In this fashion, the sequencer would control all data flow. If it is assumed that a module might want to asynchronously send data, then the sequencer protocol would need to round-robin check all modules to see if any data is available. This would consume bandwidth, and probably negate the savings of sequencer control. But, if this is not a foreseeable need, then sequencer control would be much easier and more bandwidth efficient.
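The carrier-sense idea above can be sketched as a small slot-based simulation. This is only an illustration under stated assumptions, not a proposed implementation: time is divided into equal slots, a single "busy" flag stands in for the hardware that holds a line low for the duration of a packet, and the names `backoff_slots` and `simulate_back_channel` are hypothetical.

```python
import random


def backoff_slots(attempt, rng, max_exponent=5):
    """Random wait in slots; the window doubles with each failed attempt."""
    window = 2 ** min(attempt, max_exponent)
    return rng.randrange(window)


def simulate_back_channel(num_modules, packet_slots, rng, max_slots=10_000):
    """Return the order in which modules win the line, one packet each."""
    ready_at = {m: 0 for m in range(num_modules)}  # next slot each module tries
    attempts = {m: 0 for m in range(num_modules)}
    busy_until = 0  # the line is held low for every slot before this one
    order = []
    if not ready_at:
        return order
    for slot in range(max_slots):
        contenders = [m for m, t in ready_at.items() if t == slot]
        if not contenders:
            continue
        if slot >= busy_until and len(contenders) == 1:
            # Line sensed free and nobody else tried in this slot: take control.
            winner = contenders[0]
            order.append(winner)
            del ready_at[winner]
            busy_until = slot + packet_slots
            if not ready_at:
                return order
            continue
        # Line busy, or several modules tried at once: each one backs off.
        for m in contenders:
            attempts[m] += 1
            ready_at[m] = slot + 1 + backoff_slots(attempts[m], rng)
    raise RuntimeError("modules did not all get the line in time")


order = simulate_back_channel(num_modules=4, packet_slots=3, rng=random.Random(1))
print("order of access:", order)
```

The random window is what keeps two modules that collide once from colliding forever; the cap on the window (`max_exponent`, an assumed parameter) bounds the worst-case wait.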
Front-panel data link layer
This would be similar to MIDI, having the benefits of variable sized packets optimized for their specific needs, but allow for addressing and back channel communication. Packets would be composed of 8b bytes, and could be any length from 1 to infinite bytes. The infinite byte allowance is merely in there to enable firmware or wavetable uploads; most packets not for this purpose would be limited to 16B in order to keep delays under a millisecond.
Bytes would actually be 9b long, with the 9th bit being an address bit. If this bit is set, the 8 preceding bits represent an address, otherwise they are data. This could be truncated to 7 bits to replicate MIDI, but almost all UARTs support 9b packets, so it would not come at any cost. The first 2b would be real-time message options, enabling low latency, 1B packets, and the remaining 6b would address 63 different modules. The 64th address would be used as a global address that all modules would respond to. Similarly, there would only be 3 real-time message options, with the fourth indicating that more bits were to come.
The next byte would be a command byte, with 127 single-byte options. If the MSB is set, then the command extends into the next byte (2 bytes in total), giving 127x256 additional options. The first 127 commands will be standardized commands, and the remainder can be module dependent (equivalent to sysex). The data then follows the commands, and can be anywhere from 0 to infinite bytes. A zero-byte data field allows for commands that do not require data, such as a request to send certain data back to the sequencer, or to flip a switch on. MIDI techniques such as real-time messages and running status can also be applied here.
Since there are only 63 addresses, there will need to be a way for the sequencer to assign addresses to the modules. Each module will have a manufacturer, module, and unique ID. These can be requested from all modules on the network via a global address command, and upon receipt each module will take control of the back channel, and begin transmitting its unique ID. Since seizing the back channel cuts off any upstream data, only the first module on the network will reach the sequencer. Once the sequencer acknowledges this module and assigns it an address (1), the module releases the line, and the second module becomes connected to the sequencer. It then gets an address (2), and the process continues until no more data is received by the sequencer. In this fashion, the sequencer assigns all modules, and builds a map of the network (if that is useful for some reason).
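As a rough check on the 16B limit, the time a packet spends on the wire can be worked out from the baud rate. The sketch below assumes standard UART framing of 1 start bit and 1 stop bit around 9 data bits (the 9b framing proposed for this layer); the function name `packet_time_us` is hypothetical, and per-module repeater delay is ignored.

```python
def packet_time_us(num_bytes, baud, data_bits=9, start_bits=1, stop_bits=1):
    """Time on the wire for one packet, in microseconds."""
    bits_on_wire = num_bytes * (start_bits + data_bits + stop_bits)
    return bits_on_wire * 1_000_000 / baud


# MIDI 1.0 rate, the 250kbps front-panel figure, and the ~1Mbps target.
for baud in (31_250, 250_000, 1_000_000):
    print(f"{baud:>9} baud: 16B packet takes {packet_time_us(16, baud):.0f} us")
```

Under these assumptions a 16B packet takes about 0.7 ms at 250kbps and about 5.6 ms at MIDI's 31.25 kbaud, so the sub-millisecond goal depends on running well above the MIDI 1.0 rate.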
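A minimal sketch of this 9b frame layout follows. Several details are assumptions made only for illustration: the 2 real-time bits are placed in the upper bits of the byte, the code `0b11` is taken as the "more to come" option, and address 0 is taken as the global address so that individual addresses run from 1 to 63. All names are hypothetical.

```python
ADDRESS_FLAG = 0x100    # 9th bit set: the low 8 bits are a header, not data
RT_MORE_FOLLOWS = 0b11  # assumed code for "not real-time, more to come"
GLOBAL_ADDRESS = 0      # assumed value of the address every module accepts


def encode_header(module, realtime=RT_MORE_FOLLOWS):
    """Pack the 2b real-time field and 6b module address into a 9b frame."""
    if not 0 <= module <= 0x3F:
        raise ValueError("module address must fit in 6 bits")
    if not 0 <= realtime <= 0b11:
        raise ValueError("real-time field must fit in 2 bits")
    return ADDRESS_FLAG | (realtime << 6) | module


def decode_frame(word):
    """Split a 9b frame into ("header", realtime, module) or ("data", value)."""
    if word & ADDRESS_FLAG:
        return ("header", (word >> 6) & 0b11, word & 0x3F)
    return ("data", word & 0xFF)


def is_for_me(word, my_address):
    """True if the frame is a header aimed at this module or at all modules."""
    kind, *fields = decode_frame(word)
    return kind == "header" and fields[1] in (my_address, GLOBAL_ADDRESS)


print(hex(encode_header(5)), decode_frame(encode_header(5)))
```

A check like `is_for_me` is what a UART with hardware address matching would do for free; without it, this test runs in the receive interrupt for every header frame on the bus.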
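One possible reading of the command field is sketched below: a command is one byte when the MSB is clear, and two bytes when the MSB of the first byte is set. This is an interpretation, not a specification. The sketch uses every 7-bit value (128 short commands and 128x256 extended ones), whereas the text counts 127, and which value would be reserved is not stated. The names `encode_command` and `decode_command` are hypothetical.

```python
SHORT_LIMIT = 0x80              # commands below this fit in a single byte
EXTENDED_COUNT = 0x80 * 0x100   # 7 bits of the first byte x 8 bits of the second


def encode_command(command):
    """Return the command as a list of 1 or 2 byte values."""
    if 0 <= command < SHORT_LIMIT:
        return [command]
    extended = command - SHORT_LIMIT
    if 0 <= extended < EXTENDED_COUNT:
        # MSB set on the first byte marks a two-byte command.
        return [0x80 | (extended >> 8), extended & 0xFF]
    raise ValueError("command number out of range")


def decode_command(payload):
    """Return (command, bytes_consumed) from the start of a byte sequence."""
    first = payload[0]
    if first & 0x80:
        return SHORT_LIMIT + (((first & 0x7F) << 8) | payload[1]), 2
    return first, 1


print(encode_command(5), encode_command(0x80 + 0x1234))
```

Whatever follows the consumed bytes is the data field, which may be empty for commands that carry no data.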
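The assignment procedure above can be modelled in a few lines. In this sketch the chain is a plain list ordered from the module nearest the sequencer outward, and "seizing the back channel" is reduced to the rule that the sequencer only hears the nearest unassigned module. The `Module` class and function names are hypothetical, and the unique ID is shown as a simple string.

```python
from dataclasses import dataclass
from typing import Optional

MAX_ADDRESS = 63  # 6-bit field with one value reserved as the global address


@dataclass
class Module:
    unique_id: str
    address: Optional[int] = None


def heard_by_sequencer(chain):
    """The unassigned module nearest the sequencer blocks those behind it."""
    for module in chain:  # chain[0] is closest to the sequencer
        if module.address is None:
            return module
    return None


def assign_addresses(chain):
    """Assign addresses in chain order and return {address: unique_id}."""
    network_map = {}
    next_address = 1
    while (module := heard_by_sequencer(chain)) is not None:
        if next_address > MAX_ADDRESS:
            raise RuntimeError("more modules than available addresses")
        module.address = next_address  # sequencer acknowledges and assigns
        network_map[next_address] = module.unique_id
        next_address += 1
    return network_map


chain = [Module("vco-0001"), Module("vcf-0002"), Module("env-0003")]
print(assign_addresses(chain))
```

Because addresses are handed out in the order modules are heard, the resulting map doubles as a record of the physical order of the chain.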
Digitally Controlled Patch Options
This is probably one of the most difficult aspects of the system to achieve, and perhaps the least wanted. The idea here is that a person could create a patch, and then via some electronics, recall those cable settings with a push of a button. The reason why this might not be wanted is that it negates one of the main benefits of modular synthesis: the tangible and visible representation of data flow. You can grab and move cables around quickly to explore the sonic possibilities. If this is abstracted away, then why use such a large cabinet of gear? Why not just use software synthesis? But maybe there is a way to have our cake and eat it, too. To that end I present some options:
- Back-plane switching - There would be 8 or 16 (maybe more?) "virtual" patch cables which physically exist on a back-plane connected to all relevant modules. These can be connected between 2 or more modules, in any combination. The front-panel connection would be one of the options, to return manual control. This means that each I/O that is allowed to connect to the bus must have a MUX associated with it with as many switches as "virtual" patch cables. If a module has 6 I/O lines, and it's an 8 cable bus, then a 6x9 MUX is required (+1 for the front-panel). This isn't a bad option, but would be pretty limited in terms of patch cables, and wouldn't be able to patch outside of the case. It would allow for front-panel exploration in the studio, and quick repatch capability on the stage. For a discussion of costs and technical issues, see below.
- X-Point module - These exist, and are expensive. You can plug a cable from a module into a series of "Ins" along the X-axis, and "Outs" along the Y-axis, with a grid in the middle that controls and displays the patched connections. If the price could be brought down, would they be popular? It's hard to say. The organization of cables breaks up the visual information that the grouping of modules once displayed: a series of lines showing what is connected. It is now an abstract grid, making it hard to connect the dots. There are some famous examples of this working well, like the EMS Synthi, but this works because the patch points are fixed, and one can learn the mapping. And in the same way, a user could learn the mapping if it remained fixed in their setup.
- Digital abstraction - This is perhaps the most radical solution, and would not be backwards compatible. It would treat the front panel as a control surface, with all processing and data sharing happening on a digital bus. Each module would detect where its cables go, and relay this back to a central processor. These could then be switched virtually. Cables wouldn't even be required, per se, and just touching two modules at once (for example) could make the connection. This has the issues of the other formats of breaking the meaning of patch cables, while at the same time not allowing for much (if any) analog processing. Systems like this exist, but are a fixed ecosystem, and it's easy to see why that's the case: it doesn't allow for easy growth as it doesn't work with analog modules as well. A system using sigma-delta data streams could be both analog and digital at the same time, but there would be many compromises involved.
- Patch visualization - The system does not repatch itself, the human still does this, but the system does show you what connects to what. Each jack would have an LED associated with it. This LED would light up for all connections on a single cable, and the user would then patch those points together. After a patch is made, the sequencer unit can be indexed to the next cable, and the patching continued. In this way, it's more of a memory unit than a repatcher - a snapshot of your rig. This could be inexpensive and simple, requiring only an LED per jack, a way of detecting connections, and a microcontroller that talks on a bus to the sequencer. But, it's not so useful unless every module has this capability, and it's unclear how easy retrofitting would be.
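For the patch visualization option, the stored snapshot could be as simple as a list of cables, each cable being the set of jacks it joins. The sketch below assumes each module can only report whether a plug is present in a jack (not where the other end goes), and steps to the first cable that is not yet fully plugged. The function name `next_cable` and the (module, jack) naming are hypothetical.

```python
def next_cable(patch, plugged):
    """Return the jacks to light for the first cable not yet fully plugged.

    patch   -- list of frozensets of (module, jack) pairs, one per cable
    plugged -- set of (module, jack) pairs where a plug is detected
    """
    for cable in patch:
        if not cable <= plugged:  # some jack on this cable is still empty
            return cable
    return None  # every stored cable has been patched


patch = [
    frozenset({("vco", "out"), ("vcf", "in")}),
    frozenset({("lfo", "out"), ("vcf", "cutoff_cv"), ("vca", "cv")}),
]

plugged = set()
print(sorted(next_cable(patch, plugged)))  # LEDs for the first cable
plugged |= patch[0]                        # the user patches it
print(sorted(next_cable(patch, plugged)))  # LEDs for the second cable
```

Advancing automatically when plugs are detected is one design choice; indexing to the next cable by hand at the sequencer, as described above, would need no plug detection logic at all.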
There are two ways to place a signal on a bus: voltage or current. A typical voltage mode will have low impedance drivers sending signals to high impedance receivers, and current mode will do the reverse. This means that voltage mode can only have one transmitter per line, and multiple receivers, whereas current mode allows multiple transmitters, but only a single receiver. Of these two options, voltage mode most closely replicates how patch cables currently work in a modular setup, and is therefore more easily integrated. Unfortunately, voltage mode is more likely to have cross-talk between lines on the bus (as the receivers are high impedance). This can be reduced by lowering the receiver impedance.
For voltage mode, each transmitter should have the ability to connect to zero or more bus lines, and each receiver should be able to connect to zero or one bus line. If there are T transmitters, R receivers, and B bus lines, then a total of (T+R)*B switches are needed, or a T+R by B cross-point switch. There are a few constraints on these switches, such as crosstalk, common mode voltage range, and bleed-through. But, perhaps the most important is cost, as this would limit accessibility. Since a simple 8x8 cross-point switch has 64 switches, these costs can escalate quickly. An overview of the cost per switch of various options is shown below:
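The switch count and the voltage-mode connection rules can be checked with a short sketch. The function names and the string labels for jacks are hypothetical; the rules encoded are the ones stated above (at most one transmitter per bus line, at most one bus line per receiver).

```python
from collections import Counter


def switch_count(transmitters, receivers, bus_lines):
    """Switches needed so every jack can reach every bus line: (T+R)*B."""
    return (transmitters + receivers) * bus_lines


def check_voltage_mode(tx_links, rx_links):
    """Return a list of rule violations for a set of closed switches.

    tx_links, rx_links -- iterables of (jack, bus_line) pairs
    """
    drivers = Counter(line for _, line in tx_links)
    listens = Counter(jack for jack, _ in rx_links)
    errors = [f"bus line {line} has {n} transmitters"
              for line, n in drivers.items() if n > 1]
    errors += [f"receiver {jack} is on {n} bus lines"
               for jack, n in listens.items() if n > 1]
    return errors


print(switch_count(4, 4, 16))  # module with 4 outs and 4 ins on a 16-line bus
print(check_voltage_mode(
    tx_links=[("vco.out", 0), ("lfo.out", 0)],  # two drivers on line 0
    rx_links=[("vcf.in", 0), ("vcf.in", 1)],    # one input on two lines
))
```

For a module with 4 outputs and 4 inputs on a 16-line bus this gives 128 switches; at an assumed 0.10$ per switch that is 12.80$ for the switches alone.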
[Table not recovered: cost per switch for each option. Surviving fragments: "8x6 unidirectional xpoint"; supply rails "+/-6V or +/-9V".]
From a cost perspective, the 4051 looks good, but for a simple 8x8 cross-point, 8 ICs would be required. Although this would bring down costs as larger quantities were purchased, it would also consume a great deal more PCB area, with all the costs and crosstalk associated with that. The common mode voltage range is limited, and +/-5V rails (HC series) would have to be supplied (+/-9V for 4000 series). This is the advantage of current mode, as the bus lines are always kept at 0V so common mode range doesn't matter. The DG range of switches allow for the full +/-12V rails, but cost upwards of 0.10$/switch. The V411 allows for the full rails as well, but would require twice as many ICs and a great deal more digital DMUX lines for control.
The needs of the back-plane communication system are very similar to those of the front-panel system, except for the cabling. This has three major effects: 1. it allows for less readily available connectors, 2. it allows for non-buffered communication, and 3. it allows for non-protected I/O. Standard IDC cables would probably be the easiest here, allow for shorter runs, and allow for new connections to be made on an existing bus. All modules could share the Tx and Rx lines due to the short run and associated low capacitance. RS485 could work well here, or SPI. The latter would be more robust and allow for less expensive hardware.
I think it would be helpful if both the back-plane and front-panel communication methods used the same data packet format, if not the same hardware layer. This would allow for greater interoperability. A module designed to speak on the front-panel could do so on the back-plane without much modification. This would save code size and programming time.
Conclusions (for now)
Ultimately, what I have described for the front-panel scheme is essentially MIDI++. If I could convince myself that 16 addresses were enough, I would just use MIDI at a faster baud rate. Similarly, if I could convince myself that people would want to use modular jacks, I would use SPI, but I really don't think that would be popular. And between the front vs back setup, the front is more accessible and might lead to wider adoption, but the back could be faster and less expensive. That makes the front a better early option, and the back a better final option.
As for a digitally controlled patch interface, I am not yet convinced that the pros outweigh the cons. As a standalone module this makes some sense, and these already exist. But, for a whole rig, there just wouldn't be enough patch cables. With a scant 16 cables, a module with 4 ins and 4 outs would require 128 switches, which would add 5$ to the BOM at a minimum. Using 16 1x8 MUXs doesn't seem feasible here, and implementing some scheme to put signals on one of the +/-6V rail cross-point ICs would probably be better.