Z CAM V1 goes out west. Experience a conservancy ranch by visiting with cowboy Billy.

This is a short 7-minute experience that’s best viewed on a headset in the SamsungVR app, which is available for the Samsung GearVR and Oculus Go headsets. In the SamsungVR app, please search for my last name “Yost” to find my channel (remember to click on the channel button). Before you watch this in the app, PLEASE download it locally so you can view it without the resolution artifacts inherent in streaming. (The few extra minutes it’ll take to download the file are more than worth it, and you can delete it after watching to free up space.) If you don’t have a headset you can order one here for only $200. Until then, you can watch on your phone in “magic window” mode or in your browser by clicking on the thumbnail above.

This blog post is one of four about a series of videos I'm making for the University of California.  The final deliverable is four pairs of traditional and 360 video combos, each about a different aspect of sustainability in the USA's most populous state.

Capturing video on location: Shooting with the Z CAM V1 is incredibly simple... just hook up the battery and router, point the A camera at whatever you want the exposure based on, hit the record button and go.  If you want to check the exposure first, you can fire up the mobile app to see whether you need to EV compensate up or down a bit, and then initiate recording from the app.

I’ve been using the (now discontinued) battery operated TP-Link MR3040 router/access-point to connect to the camera, and while it’s good for short ranges, this shoot sometimes required that I hide 30 or 40 yards away.  To get more range in the future I just ordered the TP-Link AC750 dual band travel router, which I can use along with a dtap->usb converter to provide power for the router in the field.  Will be testing that next week. Of course I’d prefer it if Zcam put an access point inside the unit itself, but space and heat requirements on this little monster of a camera made that challenging. (I’m happy to give up some convenience for the incredibly natural quality of the camera’s stereoscopy, which is due to the very small interocular distance between the lenses. Life is a series of compromises, right?)

One thing I love about shooting 360 video is that the amount of gear is minimal compared to a traditional video shoot.  Everything fit into the Polaris ATV with plenty of room to spare and the shoot itself was incredibly fun, driving through herds of cows up and down the side of a mountain.

Me nearing the end of a very long 13 hour day of shooting (8 hours for the traditional video component and 5 hours for the 360 video).

Location audio: I used one of my Sennheiser AVX MKE2 wireless lav mics for getting Billy’s on-camera audio, and my Zoom H2n for getting wild “room tone.”  The AVX system is rated to 50 meters but that’s with line-of-sight and I was hiding behind rocks and trees. In practice I was only getting about 20 meters if I was hiding behind stuff, and even then I was hearing occasional dropouts (one of them even made it into the final release in the scene where he’s speaking to us while on his horse Titan).  This was disappointing and I’m going to test the Sennheiser against my Sony wireless mic because I have a suspicion that the Sony will have better range. This whole issue of having to hide while shooting is so new and strange to me… it’s definitely one of the bigger challenges, especially when shooting outdoors in big open spaces without many obvious hiding spots.

I use the Zoom H2n for room tone only after capturing video and talent lav audio.  Once the shot is completed I quiet down the set and record a few minutes of spatial tone to provide 360° ambience to the audio track.  

Stitching: I captured a ton of video for this shoot because we did a bunch of takes, mostly due to the horse and it not always cooperating (I will try not to use a male horse again, say no more!) and sometimes due to getting audio dropouts.  Probably came home with 50 minutes of 6k stereo captures. In order to start editing quickly I used a MistikaVR V1 profile that Roman Dudek kindly made for my pre-production camera and rendered a set of very fast proxy stitches overnight which got me into FCPX 10.4.1 the next morning.  I edited the entire piece with proxies and then took the start and end frame numbers of each shot, plugged those into Zcam’s Wonderstitch batch processor and stitched that new set of physically accurate shots overnight as well. I then replaced the proxies with the Wonderstitch output, giving me an online edit that looked great in the Vive Pro.  However there were two shots that had ghosting issues in one of the eyes (the opening shot when he greets us, and the shot when he’s on the horse and the horse backs him up so he becomes silhouetted by the sun). I went back into Mistika and used its stereo edge point feature to manipulate how optical flow looks at those areas and eliminated the ghosting.  It’s amazing to have all these incredibly powerful tools to solve challenging shots like this when they inevitably come up.
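For anyone who wants to script the "start and end frame numbers" step, here is a minimal Python sketch of the arithmetic. It assumes non-drop-frame HH:MM:SS:FF timecode at an integer 30 fps and frame counts that start at zero. The shot name and timecodes are hypothetical, and how Wonderstitch's batch processor actually expects its ranges is not covered here.

```python
def timecode_to_frame(tc: str, fps: int = 30) -> int:
    """Convert a non-drop-frame 'HH:MM:SS:FF' timecode to a zero-based frame count."""
    hours, minutes, seconds, frames = (int(part) for part in tc.split(":"))
    if frames >= fps:
        raise ValueError(f"frame field {frames} is out of range for {fps} fps")
    return ((hours * 60 + minutes) * 60 + seconds) * fps + frames


def shot_frame_range(tc_in: str, tc_out: str, fps: int = 30) -> tuple:
    """Return (start_frame, end_frame) for one shot in the locked edit."""
    start = timecode_to_frame(tc_in, fps)
    end = timecode_to_frame(tc_out, fps)
    if end < start:
        raise ValueError("out point is earlier than in point")
    return start, end


# Hypothetical shot list: name -> (source in, source out).
shots = {"opening_greeting": ("00:00:12:00", "00:00:47:15")}

for name, (tc_in, tc_out) in shots.items():
    start, end = shot_frame_range(tc_in, tc_out)
    print(f"{name}: frames {start} to {end}")
```

If the footage is really 29.97 fps with drop-frame timecode, this simple multiplication will drift and a drop-frame-aware conversion is needed instead.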

Note that Wonderstitch right now only produces HEVC files, so I used the super useful ff-works front end for ffmpeg to batch convert them to ProRes LT, which is the format I use for onlining the piece (the data rate for LT at that res is still very high... about 1.8 Gb/s at 6144x6144).  Editing ProRes 6144 stereo 360 video in FCPX 10.4.1 on a 10-core iMac Pro is totally seamless, real time all the way. In fact the entire FCPX 360 editing experience was very fluid with no problems of any kind. Thanks again to the Final Cut Pro X team for bringing such great tools into my favorite NLE.
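As a rough sketch of what that conversion looks like without a front end, the Python below builds a plain ffmpeg command line using the prores_ks encoder (profile 1 is LT) and estimates the resulting data rate. The estimate scales Apple's published ProRes 422 LT target of 102 Mb/s at 1920x1080, 29.97 fps linearly with pixel count, which is an approximation and not a measurement. The file and folder names are hypothetical, and these are not necessarily the settings ff-works generates.

```python
from pathlib import Path

# Apple's published ProRes 422 LT target rate at 1920x1080, 29.97 fps (Mb/s).
LT_MBPS_1080P_2997 = 102.0


def estimate_prores_lt_mbps(width: int, height: int, fps: float = 29.97) -> float:
    """Scale the 1080p target rate linearly by pixel count and frame rate."""
    scale = (width * height) / (1920 * 1080) * (fps / 29.97)
    return LT_MBPS_1080P_2997 * scale


def prores_lt_command(src: Path, dst_dir: Path) -> list:
    """Build (but do not run) an ffmpeg command for an HEVC -> ProRes LT transcode."""
    dst = dst_dir / (src.stem + "_proresLT.mov")
    return [
        "ffmpeg", "-i", str(src),
        "-c:v", "prores_ks",   # ffmpeg's ProRes encoder
        "-profile:v", "1",     # 0=proxy, 1=LT, 2=standard, 3=HQ
        str(dst),              # audio handling is left at ffmpeg's defaults
    ]


print(round(estimate_prores_lt_mbps(6144, 6144)), "Mb/s (estimate)")
print(" ".join(prores_lt_command(Path("shot01.mp4"), Path("online"))))
```

For 6144x6144 at 29.97 fps the estimate comes out at roughly 1.86 Gb/s, in line with the 1.8 Gb/s figure above. To run the conversion for real, pass each command list to subprocess.run.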

Rig Removal:  I used a combination of the 360 Patch Tool in FCPX for the simple shots where there was only a circular nadir patch needed and MochaVR for the more complex shots where the monopod and camera shadow were obvious.  My buddy David Lawrence walked me through the process of using MochaVR in After Effects to generate clean plates and comp them over the rig shadows, and although it’s slightly tricky to learn the first time, once you understand it the process only takes a few minutes to set up for each shot.  (Note that David turned me on to using the Affinity photo editor for making clean plates because it has an incredibly well-written patch tool… way better than the patch tool in Photoshop. Plus Affinity has a simple control for creating the 360 projection needed to set up for painting… it's an incredibly useful photo editor for 360 plates.)  The trickiest rig removal shot was the one of Billy on the horse because as the horse backs up into the sun, the shadow of the horse bisects the rig shadow (of course). That horse had explicit instructions not to move off its mark but yeesh, you can’t trust animals to do what you tell them! So if you look closely at the horse’s shadow you’ll see that part of it disappears as he backs up.  Such is life.

Editing:  Not much to say here except editing with 360 video is the simplest part of the entire production workflow.  All I had to do was put the shot fragments in place, use the Pan Reorient function to line up the 0° spot where I wanted each shot to be, and add titles to the beginning and end.  I used the motionVFX 360 Title package at the head to give the opening title sequence a bit of subtle animation.

I experimented with Mettle's MantraVR a bit in After Effects, trying out a subtle push with their Moebius transform effect on the last shot of the horse being put into the paddock at night.  It was interesting but I felt that in this case it broke the immersive feeling, so I took it out. That effect definitely has potential and I hope to be able to work it into future projects.

Each audio track that will go into Logic is assigned to its own Audio Role in FCPX, which makes managing and exporting to the DAW simple.

Prepping the edit for the spatial audio mix: Once I had a locked edit, I detached all the audio tracks and put each set into its own audio role, as you can see from the color coding of the audio tracks in the screenshot above.  This allowed me to export each role as a separate track for import into Logic.

At this phase I also export the final video movie file in ProRes LT format that I’ll use both in Logic for keyframing and finally with the FB360 encoder to mux the video plus audio after the spatial mix is completed.  One big advantage of Logic over Reaper: Reaper couldn’t handle the full 6144x6144 stereoscopic files, so I had to create very low-res proxies to use for keyframing, but Logic has no problem with the high-res files.  Note that the FB360 encoder requires an .mp4 file, so I use ff-works/ffmpeg to transcode an .mp4/h.265 version for the encoding process.
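For reference, here is a hedged Python sketch of an equivalent bare ffmpeg command for the .mp4/h.265 transcode. The CRF value, the 4:2:0 pixel format and dropping the audio track are all assumptions (the spatial mix gets muxed in later), and the file name is hypothetical. What the FB360 encoder requires beyond "an .mp4 file" is not known here.

```python
import shlex
from pathlib import Path


def hevc_mp4_command(src: Path, crf: int = 18) -> list:
    """Build (but do not run) an ffmpeg command for a ProRes master -> HEVC .mp4."""
    dst = src.with_suffix(".mp4")
    return [
        "ffmpeg", "-i", str(src),
        "-c:v", "libx265",      # software HEVC encoder
        "-crf", str(crf),       # assumed quality target; lower means higher quality
        "-pix_fmt", "yuv420p",  # assumed 8-bit 4:2:0 for broad player support
        "-tag:v", "hvc1",       # tag that Apple players expect for HEVC in .mp4
        "-an",                  # assumption: audio is added later from the spatial mix
        str(dst),
    ]


print(shlex.join(hevc_mp4_command(Path("ranch_master.mov"))))
```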

On my way into Logic I first brought the dialog track into the iZotope RX6 DAW for some quick breath and sibilance reduction.  Probably could’ve done that in Logic also but I love working in the RX6 DAW and am used to it.

Creating the spatial audio mix: As you know if you’ve been reading my blog, FCPX doesn’t support spatial audio yet, either directly or via plugins.  Although my previous spatial audio experiments that you can read about here on the blog used a combination of Reaper and Audioease’s 360 Pan Suite, I was not thrilled with the Reaper UI, did not want to purchase Protools, and the Pan Suite wasn’t capable of functioning within Logic.  I also wanted a set of spatial tools that I could use within Premiere for my various experiments with that NLE. David Lawrence turned me on to the AmbiPan/AmbiHead plugins from Noisemakers in France because those tools work both in Premiere and Logic. Noisemakers has an affordable price for indie filmmakers and so I decided to continue my experiments with spatial audio with their tools.  

Apple Logic UI showing the dialog track with azimuth keyframes.

AmbiPan and AmbiHead dialogs, showing Width and Distance parameters. In the movie view on the right you can see the Azimuth marker, with the red dots displaying the Width.

OK... the Logic GUI is infinitely superior to Reaper... it transforms the experience of creating a spatial multi-track mix from something daunting to a really fun process.  The Logic UI is straightforward to learn and once you get the hang of it, incredibly fast to do complex keyframing. The process is basically thus:

  1. Start with a 360 spatial template.  You can find one here.  It sets up the system with a surround first-order ambisonic mix so you don’t have to deal with learning how to do that.  This template assumes that your audio assets have all been recorded with a sample rate of 48 kHz. If you’ve recorded at 44.1 kHz you’ll have to change that in Project/Settings/Audio.

  2. Go to File/Movie and import your movie file.

  3. Turn on the Show Animation icon in the menu bar.

  4. Add AmbiHead to the audio effects slot of the Master track.  You’ll use this to simulate head-turning while creating your mix.  Important note: Remember to turn OFF AmbiHead for the master track before bouncing to your final spatial audio track.  If you don’t do this your track will be totally screwed up.

  5. Import an audio track, name it and assign AmbiPan to the Audio Effects slot of that track.  If your track was shot indoors and was close-mic’d so that there’s no indoor reverb ambience, you can assign the new AmbiVerb plugin to a second audio effect slot on that track.  I did this as an effect for the gate closing at the end of the video to provide a bit of surreal warmth to the very last shot, and it worked great.

  6. Make sure your track is in Read mode.  I set up the Left Click tool as the selection tool and the Cmd-left-click tool as the Pencil tool.  This makes setting a keyframe with the pencil tool explicit (by Cmd-clicking) and keyframe selection implicit (just clicking and dragging).

  7. You’ll see an AmbiPan dialog appear once it’s installed on a track, and by clicking on the gear icon in that dialog you can bring up the Ambi Scene overlay that you'll place directly over the Top portion of your stereo video window.  (You can see that the overlay window has a handy grid in the screenshot above.)  

  8. Select the AmbiPan parameter you want to keyframe by clicking on the Automation Parameter popup menu in the track, then selecting the specific parameter you want to keyframe in the AmbiPan submenu.  You can see in the screenshot that I have the Azimuth parameter in the Dialog track selected so I can animate the position of the dialog. Each AmbiPan track can have differently colored and labeled position pucks, which show you the center of the signal (plus sign) and width of the signal (colored dots).

  9. At this point it’s just a fun process of moving the playhead to where your talent is speaking from and setting keyframes with cmd-click.  You can also set Gain parameters to change their volume, and Reverb parameters if your talent is moving around indoors. Note that the width parameter is very important to creating a natural feeling of spatiality.  I’m typically using a width of about 30%, which creates a feeling that the audio is slightly wider than a pure point source. For outdoor scenes this feels right to me, and for indoor scenes I might pump it up to 40-50% for a dry signal, or leave it at 30% and use AmbiVerb to increase the width for a wetter signal.

  10. For narration tracks, I set the width to 100% to head-lock the audio track, making it aurally obvious to the viewer that they are hearing narration instead of a point source.

  11. Once all of my keyframing is done on the mono and stereo tracks, I bring in my spatial audio “room tone” tracks, set them to 100% width in AmbiPan and mix that spatial audio in to provide deeper aural immersion ambience.  

  12. When happy with your mix, turn off AmbiHead and use File/Bounce/Project to write the PCM/Wave/24-bit/48k/Surround mix to the folder where your master movie file is.  
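To give a feel for what the azimuth keyframes in the steps above are doing under the hood, here is a minimal Python sketch of textbook first-order ambisonic panning in the AmbiX convention (channel order W, Y, Z, X with SN3D normalization, positive azimuth to the left). This is the standard encoding math, not AmbiPan's actual implementation. It ignores the width and distance parameters entirely, and the keyframe values are hypothetical.

```python
import math


def encode_first_order(sample: float, azimuth_deg: float, elevation_deg: float = 0.0):
    """Encode one mono sample to first-order AmbiX (W, Y, Z, X)."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    w = sample                                  # omnidirectional component
    y = sample * math.sin(az) * math.cos(el)    # left/right
    z = sample * math.sin(el)                   # up/down
    x = sample * math.cos(az) * math.cos(el)    # front/back
    return (w, y, z, x)


def azimuth_at(time_s: float, keyframes: list) -> float:
    """Linearly interpolate azimuth between sorted (time_s, azimuth_deg) keyframes."""
    if not keyframes:
        raise ValueError("need at least one keyframe")
    if time_s <= keyframes[0][0]:
        return keyframes[0][1]
    for (t0, a0), (t1, a1) in zip(keyframes, keyframes[1:]):
        if t0 <= time_s <= t1:
            if t1 == t0:
                return a1
            return a0 + (time_s - t0) / (t1 - t0) * (a1 - a0)
    return keyframes[-1][1]  # hold the last value after the final keyframe


# Hypothetical dialog keyframes: talent walks from straight ahead to 40 degrees left.
dialog_keys = [(0.0, 0.0), (2.0, 40.0)]
print(encode_first_order(1.0, azimuth_at(1.0, dialog_keys)))
```

The interpolation is deliberately naive and does not handle wrapping across the ±180° seam behind the viewer, which a real panner has to deal with.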

The only steps left are:

  1. Bring up the FB360 encoder to mux the audio and video.  Select YouTube output format, B-format spatial audio format, load the audio file, then select your Video Layout (in this case Top/Bottom stereo) and load the .mp4 (hevc) video file.  Write that to disk.

  2. Bring up the Google 360 Spatial Media Metadata Injector, load the muxed .mp4 file and inject it with the necessary metadata.

  3. Upload to YouTube!  Now wait a few hours for YT to correctly encode the spatial audio.  It takes a long time.
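Since the upload wait is long, it can be worth sanity-checking the bounced audio file before muxing. The sketch below uses Python's standard wave module to confirm channel count, sample rate and bit depth. It assumes a four-channel first-order bounce at 48 kHz/24-bit, and the write_silent_wav helper exists only to make the example self-contained. Note that older Python versions may refuse WAV files with an extensible header, which some DAWs write for multichannel audio.

```python
import os
import tempfile
import wave


def describe_wav(path: str) -> dict:
    """Read the basic format fields from a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_rate": wav.getframerate(),
            "bit_depth": wav.getsampwidth() * 8,
        }


def check_bounce(info: dict, channels: int = 4, sample_rate: int = 48000,
                 bit_depth: int = 24) -> list:
    """Return a list of human-readable mismatches (empty list means OK)."""
    expected = {"channels": channels, "sample_rate": sample_rate, "bit_depth": bit_depth}
    return [f"{key}: expected {want}, found {info[key]}"
            for key, want in expected.items() if info[key] != want]


def write_silent_wav(path: str, channels: int = 4, sample_rate: int = 48000,
                     sample_width: int = 3, frames: int = 480) -> None:
    """Write a short silent file, standing in for a real bounce in this demo."""
    with wave.open(path, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * channels * sample_width * frames)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "spatial_mix_demo.wav")
    write_silent_wav(demo)
    print(check_bounce(describe_wav(demo)) or "bounce format looks right")
```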

In summary: There is so much potential for giving people the means to “visit” with a person in a place that they’d normally never meet or see… I’m particularly interested in providing experiences to people who are bedridden in hospitals or at home and don’t physically have the opportunity to get outside, but there are so many more potentials with 3D 360 video. I’ve never been so excited about a medium before (and I’ve worked in a lot of high tech mediums!).  The Z CAM V1 is what has really charged me up so much... the quality of the stereoscopic imagery is just so beautiful and once I get the production version that supports 60fps shooting (my prototype is 30fps only), my last concern (motion cadence) will be laid to rest.  Very soon now because I need 60fps for the Carnaval SF 40th anniversary dance project I'm starting tonight at the coronation of this year's Carnaval king and queen @ Mission High School in SF. 

I want to express how grateful I am to all of the brilliant creative people who have made these immersive tools possible.  From the Z CAM V1 team (Kinson Loo, Jason Zhang, Bang Fu and their coworkers) to the Logic and FCPX teams at Apple, Charles Verron at Noisemakers, the folks at BorisFX who make MochaVR, and the list goes on and on.  Tools like these don’t just happen by themselves and this stuff is really hard to make.  (Inventing the future is always difficult, right?)  Bravo to the tool makers!!!!

Plus thanks to Professor Michael Dawson at the University of California for providing the funding for this series of projects.  Each one of these four projects (you can watch the Hydrogen Lab video here) is actually a combination of a traditional 16:9 “framed” video plus an additional 3D/360 immersive experience and the entire set will be finished by mid-August 2018.  I can’t say that I’ve ever seen anything like this before, in which a traditional video is accompanied by a 360 video experience, and it’s a powerful combination being made to introduce incoming university students to these subjects in a unique way.

And special thanks to my wife Sondra.  These UC projects are being shot all over central California and she’s been a huge support system for me. Without her I wouldn’t be able to get all this done as a “one man band filmmaker” and I’m forever grateful.

Waiting patiently for me to finish... she was very popular with the cows.

Gary Yost