Video production is hard but fun

I’ve been producing a bunch of new videos for future iterations of Play With Your Music, with the help of the good people at the NYU Blended Learning Lab. So far, we’ve done two sets. There’s a series of tutorials on producing samples, beats and melodies using the in-browser digital audio workstation Soundation:

And there’s a series giving an in-depth analysis of “Sledgehammer” by Peter Gabriel:

All of these are remakes of videos I did, pathetically, on my laptop, without any video editing software. I do not recommend this method.

If you want to make your own zero-budget educational videos, there are a few things you should know. It’s quite easy to record screen captures on a Mac using QuickTime, and you can record voiceovers over them too. However, QuickTime can only take one audio input at a time: you can have the system sound in your video, or your voiceover, but not both. If you’re trying to talk about music production, this is a severe limitation.

There are workarounds. You can route both the system sound and the mic input into QuickTime’s lone input using Soundflower. This is neither easy nor a great solution; I never did figure out how to route sound back to my headphones so I could monitor. Alternatively, if you have an audio interface with multiple outputs, you can use one for your headphones and run a cable from another output into an input, and record from there. This isn’t ideal either; some interfaces don’t perform well recording from their own outputs. You could also spend a few hundred dollars on Camtasia or some other professional screen-capture application, but, yeah.
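
One more zero-budget route, sketched here but not battle-tested: the free command-line tool ffmpeg can grab the screen and mix two audio devices in a single pass, which sidesteps QuickTime’s one-input limit. This assumes ffmpeg is installed and Soundflower is set as the system output device, and the device indices are placeholders:

```python
# Sketch: one-pass screen capture with two audio inputs via ffmpeg.
# The device indices below are made up -- list your real ones with:
#   ffmpeg -f avfoundation -list_devices true -i ""
import subprocess

subprocess.run([
    "ffmpeg",
    "-f", "avfoundation", "-framerate", "30",
    "-i", "3:0",                    # screen (video 3) + built-in mic (audio 0)
    "-f", "avfoundation",
    "-i", ":1",                     # Soundflower device carrying the system sound
    "-filter_complex", "[0:a][1:a]amix=inputs=2[a]",  # mix mic + system audio
    "-map", "0:v", "-map", "[a]",   # keep the screen video, use the mixed track
    "tutorial.mov",
], check=True)
```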

The solution I finally arrived at, after many hours of frustration, was not to try to record everything at once. For the Soundation tutorials, I recorded the screen captures with system sound. Then I put the video into Ableton Live and overdubbed my narration. I could do basic video edits that way and sync up my voice with no problem. But I couldn’t figure out how to size the screen the right way, so the resolution ended up too low for anything to be visible. Lame.
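
In hindsight, one way to dodge the resolution problem would have been to leave the video alone entirely: export just the audio mixdown from Live, then use ffmpeg to swap it onto the original full-resolution capture without re-encoding. A sketch, with hypothetical file names:

```python
# Sketch: marry the overdubbed audio to the original full-resolution
# screen capture. File names are hypothetical.
import subprocess

subprocess.run([
    "ffmpeg",
    "-i", "screencap.mov",   # original capture with system sound
    "-i", "mixdown.wav",     # narration + system sound, mixed down in Live
    "-map", "0:v",           # video from the capture...
    "-map", "1:a",           # ...audio from the Live mixdown
    "-c:v", "copy",          # never re-encode, so the resolution is untouched
    "final.mov",
], check=True)
```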

For the Peter Gabriel videos, I opted for a different tactic: just sit in front of the computer and talk, then edit in the music I was talking about afterwards. The quality you get from the laptop camera and built-in mic is nothing to write home about, but it worked. I had a merry time elaborately syncing parts of the song to the moments in my narration where I mentioned them. Unfortunately, clever though it was, the result worked poorly. It’s annoying to listen to someone talk while music is playing, especially if the music has lyrics. Also, while I tried to keep things concise, the video ended up being almost twenty minutes long. That makes for an inconveniently huge file and an unwieldy viewing experience. The video went over well content-wise, but it’s nothing I’m eager to put in the portfolio.

Enter Phil Servati and the NYU Blended Learning Lab. We only learned that this facility existed after shooting a ton of low-quality video, but better late than never. The lab has a nice high-definition camera, proper lighting and sound, a great big smartboard, and best of all, a whole team of people to coach you on your presentation and run the equipment. The lab has a rule that no video can be longer than seven minutes, for two sensible reasons. First, the resulting file size is about two gigabytes, which is about the most you can reliably upload through a browser. Second, there’s only so much information people can absorb at a time, and seven minutes is a reasonable serving size.
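
Those two numbers hang together, as a quick back-of-the-envelope calculation shows (taking “two gigabytes” and “seven minutes” at face value):

```python
# Sanity check on the lab's numbers: a ~2 GB file over a 7-minute cap
# implies roughly 38 Mbit/s, a plausible bitrate for high-quality HD.
size_bits = 2e9 * 8          # ~2 GB, in bits
duration_s = 7 * 60          # the seven-minute limit, in seconds
print(f"{size_bits / duration_s / 1e6:.0f} Mbit/s")  # -> 38 Mbit/s
```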

We shot the Soundation videos first. I had largely improvised the first versions, planning to just edit out my stumbles and false moves later. I’m a jazz guy, and I prefer improvisation to reading scripts. Doing the videos off the cuff gives the viewer a sense of my actual thought process in putting a track together. For the new versions, I wanted to use the same approach.

The Blended Learning people don’t do postproduction, and I wasn’t eager to do it either, so I wanted to just get it right live. After a couple of practice runs, I found I could make it through my semi-spontaneous music production performance smoothly, and had some nice little moments of pedagogical discovery along the way. For example, I placed a drum loop in the wrong spot on one take, with a rhythmically awkward result. But talking through the correction of that problem made for valuable material, so on subsequent takes I placed the loop wrong intentionally.

The new videos aren’t perfect. My gaze is directed at some weird random place above and to the left of the camera. There are a couple of awkward seconds at the top of each video that we should probably edit out. I bump the lavalier mic now and then. The mix of the voiceover and music isn’t quite right. I’ll probably take another swing at them after getting the process down a little tighter.
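
If we do get around to trimming those awkward opening seconds, it’s a one-liner with ffmpeg. A sketch, with a hypothetical file name and an assumed two seconds of dead air:

```python
# Sketch: lop off the first couple of awkward seconds without re-encoding.
# The file name and the two-second figure are assumptions.
import subprocess

subprocess.run([
    "ffmpeg",
    "-ss", "2",              # skip the first two seconds (snaps to a keyframe)
    "-i", "lesson1.mov",
    "-c", "copy",            # stream copy: fast, no quality loss
    "lesson1-trimmed.mov",
], check=True)
```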

The Peter Gabriel videos are more traditional presentations. Normally I do my slide shows with OmniGraffle and then export them as PDFs for the presentation itself. But you can’t embed sounds in the slides that way, so I had to grit my teeth and use PowerPoint instead. By default, sounds in PowerPoint slides are represented by tiny speaker icons. If you make the icons enormous, they look great on the smartboard and make for nice big targets. I think the slides look pretty good, but for future videos, I want to have more slides, each with a single big image or word on it, that I zip through quickly.
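
If you ever need to build slides like these in bulk, the python-pptx library can do the embedding programmatically. Fair warning: feeding audio to add_movie() is a community workaround rather than an official audio API, and the file names here are hypothetical:

```python
# Sketch: generate a slide with an embedded sound via python-pptx.
# Passing an mp3 to add_movie() is an unofficial workaround; whether
# PowerPoint plays it may depend on your version. File names are made up.
from pptx import Presentation
from pptx.util import Inches

prs = Presentation()
slide = prs.slides.add_slide(prs.slide_layouts[6])  # blank layout

# Give the media object a huge footprint: a nice big target on a smartboard.
slide.shapes.add_movie(
    "sledgehammer_clip.mp3",
    left=Inches(2), top=Inches(2),
    width=Inches(6), height=Inches(3),
    mime_type="audio/mpeg",
)
prs.save("sledgehammer_talk.pptx")
```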

The Platonic ideal of the educational YouTube video is represented by the work of Vi Hart.

Her process is labor-intensive, as you can see in this behind-the-scenes video, but the results are so compelling. I have serious Vi Hart envy. The question is how to approach her level of multisensory storytelling without dropping every other thing in my life. I’ll keep you posted.