Mindful Audio


How I edit and master my field recordings

The studio side of field recording


I love talking about field recording, as you can gather from reading my blog or following me on social media. I’m deeply passionate about all aspects of the discipline and I also love to inspire others to pursue it. Something I haven’t talked much about is the editing and mastering side, probably because it isn’t as glamorous as teetering on the edge of a volcano or being chased by an orangutan in the rainforest. It may also be because I do my best to escape the studio whenever I can, and this would mean more time spent indoors. At any rate, with this blog post I’m trying to fix that.

Before I start, I have to preface this with an important point. If I mention gear or software, it’s because that’s what works for me. Same goes for workflows. It’s always very tempting to buy more kit or to chase the next magic bullet, but that isn’t what I’m trying to achieve. Instead, I suggest you take all this as data points and make your own judgments. Try to use what you already own and only if that doesn’t work, upgrade. I’m always happy to answer questions so feel free to comment below.

In the field

Editing field recordings on expedition

My editing process starts in the field. Even before I press record, I give a lot of thought to what the final result will sound like and how it will be experienced. If I’m recording soundscapes, I want them to be as flexible as possible so I will record long takes: never less than 10 minutes, sometimes as long as 2 or 3 days. My reasoning here is simple – it’s better to have more options and it’s much easier to cut out parts of the recording than to somehow lengthen it.

Recording long takes gives me a lot of flexibility when I’m back in the studio. From the same recording session I can derive sound effects libraries, albums for listening, soundscapes for upload to YouTube, material for music composition, focused sound effects, environmental data etc. I make sure to slate each session at the start and at the end (if there’s any battery left when I collect the rigs) with the equipment I’m using, time of day, location, interesting information about the surroundings, etc.

I record soundscapes in stereo, quad or double Mid-Side. Of all these, DMS is the most flexible as it can be decoded to mono, stereo or surround. It doesn’t sound as immersive as a recording made with spaced omnis or omnis as “tree ears” (a technique I’m very fond of, as are Thomas Rex Beverly and Andy Martin), but the very compact rig and the fact that it can be decoded into so many formats make up for it. AB or tree ears with omni mics is lovely and immersive to listen to, but is limited to stereo (unless you do quad tree ears). Sometimes it’s easier to tape two lav mics like the Lom mikroUsis to branches on a tree and plug them into a minuscule recorder such as the Sony A10. A DMS rig would be much more conspicuous in these situations, so the benefits of surround sound would be outweighed by the risk of having the equipment stolen or destroyed.
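For anyone curious what “decoding” means in practice, here’s a minimal sketch of a plain Mid-Side decode to stereo using NumPy. This is my own illustration rather than a tool I use; a DMS rig adds a rear-facing mid capsule and repeats the same trick for the rear pair.

```python
import numpy as np

def ms_to_stereo(mid: np.ndarray, side: np.ndarray, width: float = 1.0) -> np.ndarray:
    """Decode a Mid-Side pair into left/right stereo.

    L = M + width * S, R = M - width * S.
    width = 0 collapses to mono, 1.0 is the nominal stereo width.
    """
    left = mid + width * side
    right = mid - width * side
    return np.stack([left, right], axis=-1)  # shape (samples, 2)
```

In practice I do this in Reaper with a dedicated decoder so nothing is baked in until the last moment; the snippet is only meant to show why the format stays so flexible.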

I prefer not to fiddle with the stereo image once I’ve made the recording. I spend a good amount of time deciding on where to record, and stereo or surround imaging is a big part of these decisions. I also think about distance to a certain subject or sound source at this stage, trying to have a nice balance between foreground, background, possible distant elements, reflections etc. This isn’t a straightforward task and requires a lot of listening, but it gets easier with practice and can make a big difference in the final recording.

I usually record at 24-bit/96 kHz. I will record at 192 kHz if I use mics that reach high into the frequency spectrum, if I’m recording sources with a lot of high-frequency content and/or if I need to pitch shift the recordings drastically in post. In some situations I also record in 32-bit, which allows for greater dynamic range than 24-bit, although it’s not a magical solution that will prevent just any recording from clipping. Outside these edge cases though, 24/96 is perfectly suited to my recording purposes and offers enough flexibility while keeping file sizes and power consumption manageable.
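As a back-of-the-envelope comparison (my own numbers, not from any manual): fixed-point PCM gives roughly 6 dB of dynamic range per bit, while the 32-bit files most recorders produce are floating point, which is why they’re so hard to clip in the file itself.

```python
def pcm_dynamic_range_db(bits: int) -> float:
    """Approximate dynamic range of fixed-point PCM: ~6.02 dB per bit."""
    return 6.02 * bits

print(pcm_dynamic_range_db(16))  # ~96 dB
print(pcm_dynamic_range_db(24))  # ~144 dB
# 32-bit float is different: a 24-bit mantissa plus a moving exponent means
# the file format itself effectively never clips, but the microphone,
# preamp and converters still can.
```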

In the field I’m very deliberate about data management. This isn’t necessarily relevant to editing and mastering, but it makes my studio work much easier if everything is neatly organised and labelled. At least once a day, if possible, I try to copy data over from the recording media to my laptop and to a minimum of two SSDs. Sometimes I’m on a long hike in places like the rainforest so this isn’t always easy, but I bring dozens of SD cards (or whatever recording media my devices take). This way I can simply replace them after every session instead of using them continuously and risking losing more than one session if they fail.

Field recording data management

As soon as I get back to basecamp, I copy everything onto proper drives with redundancy in mind, even if this sometimes takes many hours and I’d much rather be relaxing or exploring the landscape. I pair this with charging batteries and testing equipment so I can be more efficient with my time. I also like to have very quick listens to my recordings while doing the data management, to identify faults, note places or things I’d like to record some more, and to get a rough idea of what I’m capturing.
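For what it’s worth, the redundancy step doesn’t need anything fancier than a checksum-verified copy. Here is a rough Python sketch of the idea, with entirely hypothetical paths:

```python
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path, chunk: int = 1 << 20) -> str:
    """Hash a file in chunks so large WAV files never need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        while data := f.read(chunk):
            h.update(data)
    return h.hexdigest()

def copy_verified(src_card: Path, destinations: list[Path]) -> None:
    """Copy every file from a card to each backup drive and verify the hashes match."""
    for src in src_card.rglob("*"):
        if not src.is_file():
            continue
        src_hash = sha256(src)
        for dest_root in destinations:
            dest = dest_root / src.relative_to(src_card)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)
            if sha256(dest) != src_hash:
                raise IOError(f"Checksum mismatch for {dest}")

# Hypothetical paths, purely for illustration:
# copy_verified(Path("/Volumes/SD_CARD"), [Path("/Volumes/SSD_A"), Path("/Volumes/SSD_B")])
```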

In the studio

I’ve been working with digital audio in one shape or another since 2003, and I started field recording many years later. This insight into the studio side of audio work has helped me avoid some mistakes and has informed my work in the field.

Regardless of purpose, I try to listen to most if not all of the recordings I make. That’s not an easy task given the hundreds of hours of soundscapes I capture on an expedition with multiple drop rigs. It’s not impossible either, especially as I love listening to natural sounds and I can do it while reading, doing my taxes, writing or performing other tasks. I listen to DMS recordings in Reaper as I can decode them to stereo or surround in real time, while still having access to a waveform and spectrogram view (the latter is a bit limited in Reaper though). I listen to stereo recordings in iZotope RX, which offers a powerful spectrogram view.


Before I listen to everything thoroughly though, I will have a few cursory listens and looks at the waveform/spectrogram. This way I can identify interesting events, useful parts or anything that I might want to go back to. When I come across something I want to explore in more detail, I drop a marker and name it something short and easy to understand. For example, I would tag the start of the dawn chorus, a particular species, weather events, a particularly pleasant soundscape and so on.

Once I’m back in the studio and I’ve copied everything over to internal drives, had a few listens and know more or less what I’ve recorded, it’s time to focus on individual purposes. Sometimes these will overlap, so I try to keep track of everything in case I have to go back and export more parts.

Speaking of exporting, I never do destructive editing and I always keep a copy of the original recordings (plus the backup versions, of course). I make separate copies when working in waveform editors (Adobe Audition, for example) and I export separate versions when I work in DAWs like Reaper. This way I keep the recordings as flexible as possible and avoid optimising them for one purpose at the expense of others.

Sound effects libraries

One of the more important purposes for my field recordings has always been sound effects libraries. While this type of content isn’t used for research or scientific purposes, I like to keep it as transparent and unedited as possible. In places like the Congo basin rainforest, I was able to record for 48 hours without capturing any man-made sound. Those recordings are easy to edit and master compared to the soundscapes I got in the Amazon or in parts of Senegal. In these other locations, the amount of man-made noise was surprisingly high. Acoustic ecology and anthropophony are different topics worth discussing separately though.

Cutting parts of a long recording seems straightforward enough, but it might not be obvious where to start and end. I try not to break the natural flow of a soundscape by cutting midway through a song phrase or mammal call, for example.

My libraries are mostly made up of tracks that are 5 to 10 minutes in length. Within these constraints there is a lot of leeway to include or exclude material. I try to cut or export audio with homogeneity in mind. If a bird sings prominently for 4 minutes, I will try to include that entire part instead of beginning or ending the file midway through the song. Similarly, if there’s a sudden change in the soundscape (e.g. a storm starts, wind stops completely or a very loud insect starts singing nearby), I will cut parts either before or after this event. There’s not much use in having two vastly different and disparate parts of a soundscape in the same file for library use.

Setting up a drop rig in Borneo

Regardless of how good I’ve become at setting gain levels (and I don’t think I’m that good), the loudness of a habitat changes over time. This inevitably results in parts of a long recording session that are too quiet. I avoid clipping by setting levels conservatively, using high quality, low noise equipment and sometimes using limiters or recording in 32-bit.

The first thing I look at when preparing a recording for use in a library is levels. I play it by ear and try to replicate the loudness I heard on location, so most of the time I don’t add more than a few dB of gain. Sometimes I have to take the levels down a bit, which is generally better than raising them as it doesn’t raise the noise floor.
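If you prefer to think in numbers, gain in dB maps onto a simple amplitude scale. A small sketch, mine and purely for illustration:

```python
import numpy as np

def apply_gain_db(samples: np.ndarray, gain_db: float) -> np.ndarray:
    """Scale audio samples by a gain given in decibels (negative values attenuate)."""
    return samples * (10.0 ** (gain_db / 20.0))

# +3 dB is roughly a 1.41x amplitude boost, -3 dB roughly 0.71x,
# and +6 dB doubles the amplitude (noise floor included).
```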

I never engage the low-cut (high-pass) filters on mics or recorders, precisely so I can keep my recordings as flexible as possible. Some microphones pick up a lot of low end, like the Sennheiser MKH8020s for example. A lot of that low-frequency energy isn’t relevant or useful in an ambience sfx library, and this is where EQ/filtering can be very useful.

I generally remove everything below 20 Hz and gently roll off content below 80 Hz or so. This varies depending on what I’m recording though. If there’s relevant content between 20 and 80 Hz (think thunder, a volcano, low frequency mammal calls, some bird calls etc) I will be more careful and won’t reduce it as much. The aim of EQ and/or filtering in this case is not to get rid of content, but rather to tame some frequencies so they don’t hog the dynamic range unnecessarily.
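Outside a DAW, that kind of low-end cleanup could look something like the following SciPy sketch; the 20 Hz and 80 Hz corners come from the text, but the filter orders are my own assumption:

```python
import numpy as np
from scipy.signal import butter, sosfilt

def clean_low_end(audio: np.ndarray, fs: int = 96_000) -> np.ndarray:
    """Remove subsonic rumble below ~20 Hz and gently roll off content below ~80 Hz."""
    # Steeper high-pass at 20 Hz: drop inaudible rumble entirely.
    hp_20 = butter(4, 20, btype="highpass", fs=fs, output="sos")
    # First-order high-pass at 80 Hz: a gentle ~6 dB/octave roll-off, not a hard cut.
    hp_80 = butter(1, 80, btype="highpass", fs=fs, output="sos")
    return sosfilt(hp_80, sosfilt(hp_20, audio))
```

Swapping btype for "lowpass" at around 30 kHz would give the ultrasonic cleanup described below for the MKH mics, in the cases where bats and call overtones aren’t a concern.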

Sennheiser Double Mid Side in Gabon

Sennheiser’s MKH-series mics exhibit some high-frequency noise, roughly above 30 kHz. That can be annoying when the soundscape is very subtle, as the noise gets brought up along with everything else when the recording is amplified. One way to deal with it is to filter it out, but this will obviously also remove relevant content in that range (bats, overtones of certain animal calls etc), so I only do it sparingly.

I also use EQs to filter out parts of some calls that can be excessively loud. Again, I feel like I have some freedom here since this content isn’t used for scientific purposes, and as long as I don’t completely change the sound of these places, I can make the recordings feel a bit more balanced and less grating. The usual culprits are insects, especially in tropical forests. Their calls can be deafening and need to be tamed before I include the recordings in a library.

Compression and limiting are a bit more controversial. I try not to mess with the dynamic range of a soundscape unless something is very much out of balance. Even then, I first try to find a different part of the recording where things are more even, and only as a last resort will I use gentle compression. I’m careful not to create so-called pumping effects, and I try to avoid harsh, clipping-like noise in the louder parts of the recording.

Spectral editing can be useful in certain circumstances. It’s not something I like to do though, and I only employ it when there’s no other option. For example, parts of the Borneo rainforest are surrounded by oil palm plantations. It’s an environmental tragedy worth a separate blog post (maybe I’ll write about it sometime), but its effects on the soundscape are pretty dire too.

At any rate, going back to the soundscape in Borneo, I was surprised by how much noise carried from the plantations into the rainforest. Combustion engines seemed to carry the farthest, even if only between 20 and 80 Hz. There was a similar story in a mangrove forest on the Atlantic coast of Senegal. People would go fishing with motorboats before dawn, and that ended up ruining a lot of recordings for me. There was no way to avoid capturing distant man-made sounds there.

In these cases, the only way to use the recordings without including the man-made sounds is to replace part of the soundscape with bits that aren’t marred by anthropophony. This isn’t ideal, but it can work if there isn’t other relevant content overlapping with the man-made sound I’m trying to replace. My approach is to splice in bits recorded in the same location with the same microphone, so that I only have to replace the “dirty” part of the recording. iZotope RX has an especially useful tool that allows me to copy slices of the spectrum and paste them where needed.

Just to be clear, I only replace low frequencies (below 100 Hz) this way, and only in situations where there’s no relevant content there (like a lion’s bellow, deadfall thuds, elephant rumbles etc). Trying to replace long parts higher up in the frequency spectrum will most of the time result in an unnatural, odd-sounding recording, so I just avoid it.
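This isn’t how RX’s spectral tools work internally, but the low-band replacement idea can be approximated with a simple crossover: keep everything above the split frequency from the affected take and everything below it from a clean section of the same recording. A hedged sketch:

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def replace_low_band(dirty: np.ndarray, clean: np.ndarray,
                     fs: int = 96_000, split_hz: float = 100.0) -> np.ndarray:
    """Keep the dirty take above split_hz and substitute the clean take below it."""
    lowpass = butter(4, split_hz, btype="lowpass", fs=fs, output="sos")
    highpass = butter(4, split_hz, btype="highpass", fs=fs, output="sos")
    n = min(len(dirty), len(clean))  # trim to the shorter of the two takes
    # Zero-phase filtering (sosfiltfilt) avoids smearing around the crossover.
    return sosfiltfilt(highpass, dirty[:n]) + sosfiltfilt(lowpass, clean[:n])
```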

Noise reduction is a controversial one. I would use it on studio recordings of spot effects, but never on soundscapes/ambience recordings. Noise is inherent to the natural world and is never constant. Some amount of noise is perfectly fine in a recording of a space or environment. Using noise reduction would suck all the life out of it and often result in weird artefacts that might not be apparent immediately.

Sony PCM A10 - one of my favourite recorders for drop rig use

Equipment noise, on the other hand, is a different story. Recorders like the Zoom H4n will exhibit plenty of it, and the best way to reduce it is to get rid of the recorder itself and buy better gear. Some microphones can be noisy too, especially cheaper ones. Humidity, heat, cold etc. can also cause noise or unwanted artefacts in recordings. Noise reduction can sometimes help, but generally does more harm than good.

Removing/replacing individual bits manually is something I’ve warmed up to over the years. Realism is pretty important in the context of a sound effects library, but sometimes a bird will briefly fly too close, a mosquito will land on the mics, or a drop of humidity will hit the blimp. This isn’t useful or relevant to sound editors or designers, and I don’t think it’s worth cutting the part out or discarding the entire recording. Moreover, cutting a few seconds midway through a soundscape will break the natural rhythm, and even if it isn’t immediately obvious, it might still sound odd on a longer listen.

Spectral editing allows me to splice in a clean bit of soundscape from a different part of the recording. If I’m careful I can make the transition into and back out of this pasted element 100% seamless. I’m ok with doing this a few times per 5-to-10-minute recording, but not more. Even if the transitions are seamless, something will sound off when parts of the soundscape are spliced in too often.

I never mix recordings in the context of sound effects libraries. Even if the recordings are captured in the same space, cause-and-effect events will be thrown off. The natural rhythm will be lost, or more precisely there will be two natural rhythms, possibly overlapping and causing confusion for the listener. There’s no situation I can imagine in which this would be a good idea. I prefer to leave mixing to the sound editor or designer while I focus on delivering clean, natural, realistic takes of what environments and habitats sound like.

All my libraries come with detailed metadata included. I prefer human-readable filenames and descriptions to random lists of keywords. I’ve recently started to use the UCS metadata system but I’m not 100% convinced it helps, especially when it comes to filenames. It’s extremely easy for these to balloon to hundreds of characters.

Going back to my metadata process, I don’t think it’s too complicated. I try to think like a user and I include relevant data about the recording. I keep filenames short and only include things like general location (Amazon rainforest), detailed location (oxbow lake), time of day (afternoon) and main theme or subject (dusk chorus), plus the occasional adjective (lush, calm, piercing) to differentiate the track from similar ones, with all these elements separated by hyphens.

In the description field I will mention keywords like rainforest or dusk chorus while adding more detail and nuance. Instead of separating the various parts with hyphens, I use human-readable phrases. A description for the file above would read something like: “Calm dusk chorus at oxbow lake in the Amazon rainforest. Constant insect calls, occasional birdsong and water dripping from vegetation. Relaxed atmosphere.” The remaining fields are straightforward enough to not require explanation.
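To make the naming scheme concrete, here’s a toy helper that assembles those hyphen-separated elements; the function and the underscore handling are my own invention, not part of UCS or any other standard:

```python
def build_filename(general: str, detailed: str, time_of_day: str,
                   subject: str, adjective: str = "") -> str:
    """Join naming elements with hyphens to form a library track filename."""
    parts = [general, detailed, time_of_day, subject]
    if adjective:
        parts.append(adjective)
    # Replace internal spaces so hyphens remain the only separators.
    return "-".join(p.strip().replace(" ", "_") for p in parts) + ".wav"

print(build_filename("Amazon rainforest", "oxbow lake", "afternoon", "dusk chorus", "calm"))
# Amazon_rainforest-oxbow_lake-afternoon-dusk_chorus-calm.wav
```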

Field recording in the Amazon rainforest

There are a few options with regard to metadata software. I used Basehead for several years and then switched to Soundminer when a recurring bug in BH started to cause annoyance and corrupt files. I’m happy with SM and I can recommend it, although obviously BH, Soundly or even Reaper might work for others.

I should mention here that I don’t use metadata to organise my own recordings. My data management in the studio isn’t too complex: I organise my recordings by year, expedition, device and day, and I do manual backups on separate physical drives. I used to have cloud backups, but my broadband speed has not kept up with file sizes so that’s out of the question at the moment. There is definite room for improvement in this workflow and I’m currently looking into it.
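As a sketch of that year/expedition/device/day layout (the folder names are invented for illustration):

```python
from pathlib import Path

def session_folder(archive_root: Path, year: int, expedition: str,
                   device: str, day: str) -> Path:
    """Build (and create) the destination folder for one recording session."""
    folder = archive_root / str(year) / expedition / device / day
    folder.mkdir(parents=True, exist_ok=True)
    return folder

# Hypothetical example:
# session_folder(Path("/Volumes/Archive"), 2023, "Congo_Basin", "Sony_A10_1", "day_03")
```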

An interesting question came from the Field Recording Facebook group. Zacha Rosen asked how much of my work for sfx libraries is done because users expect it versus how much of it is my own initiative. I’ve experimented with this over the years and my conclusion is that there’s no pleasing everyone. When I did things that I thought the market expected, the process wasn’t as rewarding, and it was difficult to set myself apart from the ever-increasing number of sound recordists while doing things exactly as they did. At the moment I’m at the opposite end of the spectrum, where I do things because I enjoy the process and I charge as much as I feel is right for my work.

As content creators, we can convince the market and audience to go in a certain direction. I’m fortunate enough to be in a position where I don’t rely on my income from field recording so I can take risks and be bold where others might prefer a more cautious approach. I don’t think about things in terms of return on investment either. This gives me a lot of freedom and allows me to record the subjects I want, to release them as I want and to carve my own niche even if I might be losing sales or spending more than I earn in the short term.

I won’t release individual sound effects for sale because that would feel cheap. I’d rather spend a few more months working on my recordings than release fast to make some extra money. Most importantly, my prices are not low, because my content is rare and the work I do is complex. I know I could get a lot of traction by sharing my recordings for free or next to nothing, but that would just be a silly race to the bottom. I believe in my work too much to devalue it like that.

To conclude the sfx library part, I should mention overall sound. In my view, an ambience sound effects library should reflect reality on the ground. This is why I record at least 24 if not 48 or more hours in the same spot. I want to be able to portray that place at any possible time of day, time of the year, weather or mix thereof. As a consequence I won’t try to make the sound glossy and ideal, and I’ll focus on realism instead. This will be different for other purposes which I will go into shortly. Thanks to Stijn for the question.

Slow listening

There’s no clear definition of slow listening, but I like to think of it as any listening activity that takes longer than an hour and is done deliberately. I started sharing long soundscapes through my YouTube channel a few years ago, but I only did it rarely at first. Last year I started seeing some traction and decided to upload a long soundscape video every week. While there’s plenty of “meditation”, “relaxation”, “ASMR” or otherwise calming soundscapes on there already, most of it is looped, canned, poorly recorded content that is mediocre at best.

Most aspects I mentioned in the context of sfx libraries apply, but there are some key differences. As mentioned already, these soundscapes range from one hour to more than two hours. The focus is the listening experience, so I try to remove elements that can cause confusion. While I try to preserve the natural rhythm, I occasionally cut out parts that can break the immersion (like animals crashing in the vegetation, unless that’s what I want the soundscape to revolve around).

I’m slightly more generous with compression and limiting since the recordings will be experienced through YouTube. This doesn’t mean the dynamic range is going to suffer much though. I use filters and EQs as well, and occasionally I will perform spectral editing if it makes the listening experience smoother. I won’t do any mixing though, as that would confuse the listener and could be regarded as cheating.

Albums on Bandcamp

This is a similar approach to slow listening, the main difference being duration. When I’m working on an album for Bandcamp, I cut my recordings into 5-to-10-minute tracks and group them into themed albums. Everything else stays the same, and the purpose is listening, as one would do with an album of music. Sound is music and music is sound, after all, including field recordings.

Listening to the desert. Photo by Yigong Zhang

Sound design

A big part of my day-to-day work is sound design for video games. I often use my own recordings as material, but going deeper into my process would be a complex task similar to a sound design workshop. In short, I only edit my recordings for sound design purposes as I go, without preparing them in advance.

To give you an example, let’s say I have to design a sonic background for a video game. I decide on a calm bed like soft wind, which I can then go and find in my original recordings folder. Once that’s set, I will start adding point sources like birdsong, animal calls or tree creaks. I will go and select these individually from my original recordings, probably using spectral editing. This workflow serves me well as long as I organise and label my recordings collection carefully.
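A stripped-down sketch of that layering idea (a wind bed plus point sources dropped in at offsets), assuming mono NumPy arrays at the same sample rate; the function is mine and purely illustrative:

```python
import numpy as np

def layer_ambience(bed: np.ndarray,
                   spots: list[tuple[np.ndarray, float, float]],
                   fs: int = 48_000) -> np.ndarray:
    """Mix point-source clips into a background bed.

    Each spot is (clip, start_seconds, gain); clips running past the
    end of the bed are trimmed.
    """
    mix = bed.astype(np.float64).copy()
    for clip, start_s, gain in spots:
        start = int(start_s * fs)
        end = min(start + len(clip), len(mix))
        if start < len(mix):
            mix[start:end] += gain * clip[: end - start]
    return mix

# e.g. layer_ambience(wind_bed, [(birdsong, 2.5, 0.5), (tree_creak, 10.0, 0.3)])
```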

Conservation and research

Tracking anacondas in the Amazon

I sometimes work with research initiatives and conservation NGOs. There’s a lot I can do to help in these situations, if only I had more time. For now I mostly share media (sound recordings, photos, footage) and advice about sound recording. It’s crucial that the recordings I share in these situations not be edited in any way though. The main reason for this is that important data could be hidden in the rhythm of a soundscape, in frequencies that might seem unimportant or even in an apparent lack of wildlife calls. Any sort of editing or processing has the potential to cause confusion or erroneous results. It’s also particularly important to offer extra information like date and time of recording, exact location, any interesting notes – basically things you would slate on your recording anyway.

Music composition


I also compose music using sound recordings. This is quite the opposite of the research approach: here, pretty much anything goes. My main focus while composing music is to convey feelings, so it doesn’t matter if I pair up animal species from very distant parts of the world or process them beyond recognition. As with the sound design approach, I only go in and select material from my recordings as needed.

———————————

I hope you’ve found this blog post interesting and useful. It’s taken me several days to write but I think it was worth it given how often I receive these questions. Check out a few options to support me if you want to say thanks:

- become a patron: https://www.patreon.com/georgevlad
- buy my sound effects libraries: https://mindful-audio.com/sound-effects-libraries
- buy me a coffee: https://ko-fi.com/georgevlad
- buy my soundscape albums: https://wildaesthesia.bandcamp.com

Listening to the savanna in Kenya. Photo by William Nkumum