Behind the Scenes: Interview with Serato R&D

Remember our article on timecode and MIDI control a little while back? We explained at the time that it only scratched the surface of what it takes to measure platter control. A huge number of factors go into the development of digital vinyl systems and turntable emulations, so rather than wrestle with the theory on my own I had a chat with Dylan Wood from Research and Development at Serato about some of the technical aspects of Scratch Live and Itch.

Firstly, Dylan makes an important point about the nature of interfacing with a computer system: “A DVS is a relatively complex signal which is encoded and then transmitted through an audio stream, and audio streams are an accepted means for transferring a large amount of data in a realtime manner… and MIDI isn’t, in comparative terms. The amount of data you’re sending in an audio stream is significant (1.4 Mbit/s for 16 bit/CD quality audio), and realtime transfer of this much data was well established with Pro Tools and other DAWs in the 90s so computer systems are built to deal with it  – modern machines are able to deal with large amounts of realtime audio data pretty well.”
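The 1.4 Mbit/s figure follows directly from the CD audio parameters. Here is a minimal sketch of the arithmetic, assuming stereo audio at 44.1 kHz. For comparison it uses the classic 5-pin DIN MIDI serial rate of 31,250 baud, with 10 bits on the wire per byte. That comparison is our own addition rather than something Dylan quoted, and MIDI over USB is faster, so treat it as a rough guide only.

```python
# CD-quality audio: 44.1 kHz sample rate, 16 bits per sample, 2 channels (stereo).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2

audio_bits_per_second = SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS

# Classic DIN MIDI (assumption for comparison): 31,250 baud serial link,
# where each byte costs 10 bits on the wire (start bit + 8 data bits + stop bit).
MIDI_BAUD = 31_250
BITS_PER_MIDI_BYTE_ON_WIRE = 10
BYTES_PER_MESSAGE = 3  # e.g. one control change message without running status

midi_bytes_per_second = MIDI_BAUD // BITS_PER_MIDI_BYTE_ON_WIRE
midi_messages_per_second = midi_bytes_per_second / BYTES_PER_MESSAGE

print(f"Audio: {audio_bits_per_second / 1e6:.2f} Mbit/s")
print(f"MIDI:  {MIDI_BAUD / 1e3:.2f} kbit/s, "
      f"about {midi_messages_per_second:.0f} three-byte messages per second")
print(f"Audio carries roughly {audio_bits_per_second / MIDI_BAUD:.0f} times more data")
```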


So the fact that high-bandwidth audio support has been built into operating systems for a long time works in DVS’s favour. “Yeah, also when DVS first came out, MIDI technology was still fairly basic; the most advanced gear we had was things like the Mackie Logic Control. Even in that case you’re basically talking about a dozen faders only sending data when they’re being moved. They’re not continuously moving, they’re just updating their positions as frequently as needed.” The whole concept of real-time MIDI control with a moving platter just didn’t really exist at the time; people were still using hardware MIDI interfaces with really low bandwidth back then.


This had an impact on the development of the Scratch Live concept. “Basically, MIDI as a protocol was never meant to deal with anything as complicated as a moving platter, so DVS using an audio stream was the only real solution at the time. DJs weren’t using MIDI controllers as part of their setups when DVS was born, so in terms of the question of why we made things the way we did, basically we had to look at what the DJs were already using and replicate it as closely as possible.”


And when it comes to timecode, ideas were already being bandied about. “In regards to timecode itself… people have been trying to put timecode on records for a really long time. If you go back through PhD theses you’ll see people have been doing it for a long time, and the way it used to be thought of is you’d have something like SMPTE.”


SMPTE is a form of timecode that has traditionally been used in studios to keep time, since way back in the days of tape. But as Dylan tells us, Serato use a different technology on their control vinyl: the unique Serato NoiseMap Control Tone. “The difference between timecode and our noisemap is that our noisemap is not a sequential series of markers that the software counts to know where it is, it’s more a pattern. Imagine that timecode is like going along a road, and you [need to] pass a certain number of markers until you know where you are. Our unique noise map is more like an island. If you’re a fisherman and you know that island really really well, you can be plonked at any point on that island and know immediately where you are; it’s more like a map, a picture that our software knows REALLY well.”
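The road-versus-island contrast can be illustrated with a toy example. To be clear, this is not Serato's NoiseMap, whose encoding is not described in the interview. It is a generic sketch using a maximal-length shift-register sequence, a pattern in which every run of four consecutive bits occurs exactly once per cycle. A short window read anywhere in the pattern therefore identifies its own position, with no need to count markers from the start. The function names are made up for the illustration.

```python
def m_sequence(length=15):
    """One period of a maximal-length bit sequence.

    Uses the recurrence s[n] = s[n-4] XOR s[n-3] (polynomial x^4 + x + 1),
    which repeats every 15 bits.
    """
    bits = [1, 0, 0, 0]  # any non-zero starting state works
    while len(bits) < length:
        bits.append(bits[-4] ^ bits[-3])
    return bits


def read_window(bits, start, window=4):
    """Read `window` consecutive bits beginning at `start`, wrapping around."""
    n = len(bits)
    return tuple(bits[(start + k) % n] for k in range(window))


def build_position_map(bits, window=4):
    """Map every possible window of the pattern to the position it starts at."""
    return {read_window(bits, i, window): i for i in range(len(bits))}


track = m_sequence()
position_of = build_position_map(track)

# Every window is unique, so the map has one entry per position.
assert len(position_of) == len(track)

# "Drop the needle" at an arbitrary point and recover the position
# from the four bits seen there, without counting from the start.
needle = 9
seen = read_window(track, needle)
assert position_of[seen] == needle
print(f"window {seen} -> position {position_of[seen]}")
```

A counting scheme, by contrast, would have to follow the markers from a known starting point and would lose its place if any were missed.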

For all the theory though, the implementation needs to be just so in order to capitalise on the numbers. “The technology on the record is not the only thing that defines its accuracy, the other thing is the way you measure it and interpret it. It’s something that we’ve struck when working with hardware companies with MIDI devices. Depending on how the firmware interprets the data you can get varying levels of accuracy from the same kind of sensor; we’ve proposed to manufacturers that they monitor the interval in various different ways to get more accuracy out of the same piece of hardware, so it’s not just a matter of how accurate the raw data is, it’s how it’s interpreted as well.  Our software can interpret 7200 positions per revolution with our control records – and the frequency of the tone doesn’t necessarily define the accuracy, and that’s something which has been difficult to make clear to people for a long time.”

So changing, for instance, the control tone down to 500Hz wouldn’t necessarily translate to halving the number of positions you can record? “There are limitations to the way that you can cut signals to vinyl: as you change the frequency, the amount of information that you can encode in it without losing fidelity on the vinyl changes, and there are different harmonics generated. On CD, sure, it’s simpler, but on vinyl there are more considerations, and just because a tone is lower doesn’t mean it’s less accurate. In fact, the converse can be true. When it comes to vinyl it really depends on what kind of performance you’re trying to interpret as to how well it works.”

With that in mind, I ask if it requires close collaboration with Rane to get the hardware right. “We work with Rane really closely, but not so much around the encoding and decoding. With a good quality audio interface that’s 16 or 24 bit and has a decent sample rate you have everything you need from a DVS point of view. The thing we have worked quite closely with Rane on is the Rane Sixty-Eight mixer. The audio stream that comes down from the computer to the Sixty-Eight is 32-bit. Now there are very few things in the music industry, never mind the DJ industry, that actually use 32-bit floating point audio right through to the hardware. All the audio path through the Sixty-Eight is 32-bit floating point, and that includes from the software sources as well, so you could clip everything in the signal path the whole way through, and as long as you turn down the master at the end it hasn’t really clipped at any point, and it stays clean and dynamic the whole way through.”
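The headroom point can be shown numerically. This is a minimal sketch, not a model of the Sixty-Eight: it assumes samples normalised to ±1.0 and an invented gain chain (a 4x boost followed by a master trim of 0.25). It compares a 16-bit integer path, which clips at every stage, with a floating point path whose values are free to exceed full scale in the middle of the chain. Python's built-in floats are 64-bit rather than 32-bit, but the principle is the same.

```python
INT16_MAX = 32767
INT16_MIN = -32768


def clip_int16(x):
    """Round to the nearest integer and clamp to the 16-bit range."""
    return max(INT16_MIN, min(INT16_MAX, int(round(x))))


def fixed_point_chain(samples, boost, trim):
    """Integer path: anything above full scale is lost at the boost stage."""
    out = []
    for s in samples:
        boosted = clip_int16(s * boost)          # clipping happens here
        out.append(clip_int16(boosted * trim))   # turning down cannot undo it
    return out


def float_chain(samples, boost, trim):
    """Float path: intermediate values may exceed 1.0 without damage."""
    return [(s * boost) * trim for s in samples]


signal_float = [0.0, 0.5, 0.9, -0.9]
signal_int = [clip_int16(s * INT16_MAX) for s in signal_float]

print("int16 in :", signal_int)
print("int16 out:", fixed_point_chain(signal_int, 4, 0.25))
print("float in :", signal_float)
print("float out:", float_chain(signal_float, 4.0, 0.25))
```

In the integer path the two louder samples both come out at the same flattened level, while the float path returns the input unchanged once the master is turned down.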

And how about with external partners when it comes to Itch? Is there a ‘blueprint’ for Itch’s feature set? “The thing is you can define a software feature set to a certain level, but as soon as you get into the different ways manufacturers do MIDI and that kinda stuff, things change. The MIDI standard itself as a whole was never intended for DJing, so we found there was a lot of difference between the ways hardware companies approached the same controls. Over the course of Itch we’ve settled on more standard ways of doing things, so in some cases the hardware manufacturer might do something a certain way and we’ll make sure that our software supports that, or in another case we might suggest a way that would be better for the user.”

Something we’ve touched on in past articles is the idea of the fully digital turntable; I have to ask whether a 12” platter, to all intents and purposes a deck we know and love, is in the realms of possibility. “Absolutely. It’s not something that has come to the top of the pile from any hardware manufacturer yet.  The closest thing we have is probably the V7 and NS7 with the 7” motorised platters. We worked really closely with Numark on those to figure out the best way to get the most data out of those encoders, and that technology could definitely be used for a 12” platter. For the most part the Itch Hardware Partner defines the ballpark for the project they want to work on.”

When it comes to accuracy, looking purely at the numbers, a 14-bit MIDI message used to report the encoders on a platter appears to leave more accuracy to be squeezed out than the 7200 positions per rotation that Serato get from the noise map. In real life though, how do those numbers stand up? “Because the transport method is different, it’s really hard to compare. 14-bit MIDI has 16,384 values and most hardware encoders have a lot less than this. Also the difference between whether the platter is motorised or not is significant in terms of the way the control data is handled, because if you have a motorised platter that’s not touch sensitive, the software is continuously interpreting the position of the record to determine the playback of the audio, so not only the number of steps but their timing becomes really important. If you look at something like the VCI 300 that has a touch sensitive platter though, if you’re not touching the hardware the software just controls the record speed by itself. With the NS7 and V7 it wasn’t just about rotational accuracy, it was about time stamping that data, to make sure that when we got a bunch of information from the device we didn’t just know where it was, we knew when it was. In audio, for example, sampling at 44.1kHz gets you 44,100 samples per second, and you know exactly when sample 1000 occurred. You have to do a lot more work to make sure MIDI data is interpreted accurately.”
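The numbers in this answer can be put side by side. The sketch below is only illustrative. Joining two 7-bit data bytes is the standard way a 14-bit MIDI value is formed, but the encoder readings, the timestamps and the function names are invented for the example, and position wrap-around at the end of a revolution is ignored.

```python
def combine_14bit(msb, lsb):
    """Join two 7-bit MIDI data bytes into one 14-bit value (0 to 16383)."""
    return ((msb & 0x7F) << 7) | (lsb & 0x7F)


def degrees_per_step(steps_per_rev):
    """Angular size of one position step."""
    return 360.0 / steps_per_rev


def speed_rpm(pos_a, time_a, pos_b, time_b, steps_per_rev):
    """Platter speed from two timestamped position readings (times in seconds)."""
    revolutions = (pos_b - pos_a) / steps_per_rev
    return revolutions / (time_b - time_a) * 60.0


print("largest 14-bit value:", combine_14bit(0x7F, 0x7F))
print("7200 positions/rev  :", degrees_per_step(7200), "degrees per step")
print("16384 positions/rev :", round(degrees_per_step(16384), 4), "degrees per step")

# Hypothetical 3600-step encoder: two readings 20 steps apart.
# With accurate timestamps (10 ms apart) the speed comes out right.
true_rpm = speed_rpm(0, 0.000, 20, 0.010, 3600)

# If the second reading is instead stamped on arrival, 2 ms late,
# the same positions give a noticeably wrong speed.
late_rpm = speed_rpm(0, 0.000, 20, 0.012, 3600)

print(f"timestamped at source: {true_rpm:.2f} RPM")
print(f"stamped 2 ms late    : {late_rpm:.2f} RPM")
```

The positions are identical in both cases; only the timing differs, which is why knowing when a reading was taken matters as much as knowing where the platter was.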

“As soon as you start to transmit something via MIDI it all starts to fall apart. Firstly because MIDI’s not designed to send a continuous stream of data like that, and the operating systems are just not designed to handle MIDI in the same way they’re designed to handle audio. For example when we first designed the VCI 300 and sent the 3600 clicks per revolution through to Windows Vista, the entire system ground to a halt just by plugging it in and spinning the platter. The operating system just didn’t know what to do with that much data. In the case of the VCI 300 we ended up using “HIDI”, Serato’s combination of MIDI and HID, to bypass the operating system. In the case of the NS7 and V7 we’re using their own driver, so that information doesn’t go through the operating system. There’s quite a lot of variables when dealing with MIDI data. It’s not just about how much, it’s how accurate it is.”
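To get a feel for the load described here, a back-of-the-envelope sketch helps. It assumes one message per click and a platter turning at 33⅓ RPM, the speed of a 12” record. Neither assumption comes from the interview, and a fast hand spin or scratch would push the rate higher.

```python
CLICKS_PER_REV = 3600      # figure quoted for the VCI 300
ASSUMED_RPM = 100 / 3      # 33 1/3 RPM, an assumption for illustration


def messages_per_second(clicks_per_rev, rpm):
    """Message rate if every click produces one message."""
    return clicks_per_rev * rpm / 60.0


rate = messages_per_second(CLICKS_PER_REV, ASSUMED_RPM)
print(f"{rate:.0f} messages per second")
print(f"one message every {1000 / rate:.2f} ms")
```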


In a way it seems that MIDI is a rather makeshift solution to the issue at hand. We wonder whether Serato has considered a custom interface for Itch, given the all-in-one nature of its primary function. “We could, but it’s not something we’d want to develop to keep closed, we don’t think it would benefit the users. For something like that to thrive the standard has to be open, and that’s why MIDI is the standard it is: lots of people know how to use it, no one controls it. The closest thing is something like OSC, but OSC is hard to configure and difficult for people to even figure out what it’s for. I guess half the problem is that MIDI was designed to digitise the performance information from a piano, which is an instrument that has been around for hundreds of years and has a well established set of controls/inputs. DJing? Not so much, it’s new and constantly changing, everyone does it differently so a standard would be a challenge. Not to say we shouldn’t dream!”

As Itch and Scratch Live become increasingly similar in both looks and features, we thought we’d clear something up: is the sound engine in the two exactly the same? “Yes, for the most part. The resampling and the vinyl playback emulation is the same, obviously the control mechanisms are different (DVS vs MIDI/HIDI) and Itch has internal mixing. The way that it resamples and the way that it sounds when you scratch and so on though, they use the same thing.”

You may or may not know that Serato are hugely famous in the pro AV world for their Pitch ’n Time timestretching and pitch shifting software. With real-time timestretching and pitching making leaps and bounds in studio software (Ableton Live and Propellerhead Record are two very good examples), we thought that to wrap up, Dylan would be the perfect person to ask how long it’d be before we started seeing the same kind of progress in DJing software. “Not long, I don’t think. The biggest challenge is the difference between a realtime time stretching algorithm and a non realtime one. Pitch ’n Time is still considered by many professionals to be the best thing for something like timestretching an entire 5.1 Dolby mix and making sure it sounds perfect, but they’ll have an eight core Mac in the corner, leave to make a cup of tea and it’ll be ready when they get back. Right now that’s a lot of power, and as soon as you need to cut CPU cycles to get realtime performance you need to start making compromises. I don’t think it’ll be too long though. Funnily enough our realtime keylock algorithm, despite taking much less CPU than Pitch ’n Time, has been found to be really good for particular things, like RnB acapellas, funnily enough.”

Nothing better than getting some information straight from the source, eh? Many thanks to Dylan and everyone at Serato for their help. Let us know who else you’re interested in hearing from in the industry for more Behind the Scenes.
