AAC (Advanced Audio Coding). The HE-AAC format, its versions and their differences

I recently received the following letter:

Hello site, MP3 is the most popular audio format, but there are so many others such as AAC, FLAC, OGG and WMA that I'm not really sure which one I should use. What is the difference between them and which one should I use to store my music?

This is a common question, so I will try to answer it simply but clearly.

We've already talked about the difference between lossless and lossy, but in short, there are two types of audio formats:

  • lossless: FLAC, ALAC, WAV;
  • lossy: MP3, AAC, OGG, WMA.

The lossless format preserves full audio quality, in most cases CD-level, while the lossy format compresses files to save space (of course, the audio quality is degraded).

Lossless storage formats: FLAC, ALAC, WAV and others

  • WAV and AIFF: Both WAV and AIFF store audio uncompressed, meaning they are exact copies of the original audio. The two formats offer essentially the same quality; they just store the data a little differently. AIFF was created by Apple, so you may see it more often in Apple products, while WAV is pretty much universal. However, since they are uncompressed, they take up a lot of space. If you don't edit audio, you don't need to store it in these formats.
  • FLAC: Free Lossless Audio Codec (FLAC) is the most popular lossless audio format, making it a good choice. Unlike WAV and AIFF, it compresses the data, so it takes up less space. Because the compression is lossless, the music stays identical in quality to the original source, which makes FLAC more efficient to use than WAV and AIFF. It is free and open source.
  • Apple Lossless: Also known as ALAC, Apple Lossless is similar to FLAC. It is a compressed format, but the music is preserved without any loss of quality. Its compression is not quite as efficient as FLAC's, so your files may be a little larger, but it is fully supported by iTunes and iOS (while FLAC is not). So if iTunes and iOS are your main way of listening to music, this is the lossless format to use.
  • APE: Monkey's Audio (APE) has the most aggressive compression of the lossless formats listed here, so it gives the greatest space savings. Its sound quality is the same as FLAC and ALAC, but there are often compatibility issues. In addition, because the data is so highly compressed, decoding it puts a much higher load on the processor. In general, I would not recommend this format unless you are very short on storage space and have no software compatibility issues.

Lossy storage formats: MP3, AAC, OGG and others


If you just want to listen to music here and now, chances are you'll be using a lossy format. Lossy formats save a ton of space, leaving you more room for songs on your portable player, and if the bitrate is high enough, they are indistinguishable from the original source. Here are the formats you are likely to encounter:

  • MP3: MPEG Audio Layer III, or MP3, is the most common lossy audio format, so much so that it has become synonymous with downloadable music. MP3 is not the most efficient format of all, but it is certainly the best supported, which makes it the best choice for storing compressed audio.
  • AAC: Advanced Audio Coding, also known as AAC, is similar to MP3, although it is slightly more efficient. This means you can have files that take up less space but have the same sound quality as MP3. The format's best evangelist today is Apple's iTunes, which made AAC so popular that it has become almost as widely known as MP3. I've only had one device in a very long time that couldn't play AAC, and that was a few years ago, so you can safely use this format to store your music.
  • Ogg Vorbis: The Vorbis format, known as Ogg Vorbis because it uses the Ogg container, is a free alternative to MP3 and AAC. Its main feature is that it is not restricted by patents, although that makes no practical difference to you as an end user. Despite its openness and similar quality, it is much less popular than MP3 and AAC, which means that fewer programs support it. We therefore do not recommend it, to avoid software compatibility issues.
  • WMA: Windows Media Audio is Microsoft's own proprietary format, similar to MP3 or AAC. It doesn't offer any advantages over other formats, and it is not very well supported outside the Windows platform. We do not recommend ripping CDs to this format unless you know for sure that all your music will be played on Windows or on players compatible with this format.

So what should you use?

Now that you understand the difference between the formats, which should you use to rip or download music? In general, we recommend MP3 or AAC. They are compatible with almost every player, and both are indistinguishable from the original if encoded at a high enough bitrate. Unless you have special needs that dictate otherwise, MP3 and AAC are your best bet.

However, there is something to be said for storing your music in a lossless format like FLAC. While you probably won't notice higher quality, lossless is great for storing music if you plan to convert it to other formats later, since converting one lossy format to another (such as AAC to MP3) produces files of noticeably lower quality. Therefore, for archival purposes we recommend FLAC. However, you can use any lossless format, because you can convert between lossless formats without changing the quality of the file.

In this article I look at an effective method for compressing audio files. This is the second part in a series on optimizing content for mobile phones.

Audio files, as a rule, take up the most space: each track averages 3-5 megabytes. Storing that much in the memory of a mobile phone is wasteful.

The most popular format is still MP3, but in terms of encoding "efficiency" it is far from ideal. One alternative is AAC, which can deliver higher quality than MP3 at a similar file size.

In practice, this lets you compress audio files to an average size of 1.5-2 MB while sounding only slightly different from the original. This article is a guide to converting audio files to AAC using foobar2000.
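As a rough sanity check on these sizes, file size follows directly from average bitrate and duration. The sketch below is illustrative only: it assumes a constant average bitrate, ignores container and tag overhead, and the function name and the four-minute track length are made up for the example.

```python
def audio_size_mb(bitrate_kbps: float, duration_s: float) -> float:
    """Approximate stream size in megabytes (10**6 bytes) at an average bitrate."""
    total_bits = bitrate_kbps * 1000 * duration_s
    return total_bits / 8 / 1_000_000


# Hypothetical four-minute track at a few average bitrates.
for kbps in (128, 64, 48):
    print(f"{kbps} kbps -> {audio_size_mb(kbps, 240):.2f} MB")
```

Under these assumptions a four-minute track at 128 kbps lands in the 3-5 MB range quoted above, and halving the bitrate halves the size.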

Advanced Audio Coding (AAC)

This is a wideband audio encoding algorithm that supports from 1 to 48 channels at sampling rates from 8 to 96 kHz. AAC operates at bitrates ranging from 8 kbps for mono speech up to 160 kbps per channel for high-quality encoding that can withstand multiple encoding/decoding cycles.

The format was developed jointly by several companies: AT&T Bell Laboratories, Fraunhofer IIS, Dolby Laboratories, Sony Corporation and Nokia. AAC is actively promoted by its patent holders, above all through mobile devices that have hardware support for the format. Recall how the Sony Ericsson Walkman series phones were positioned: as models for people who attach great importance to sound quality. The format is also used in the iTunes online store and in many other media-related areas.

Key Benefits of AAC

  • Up to 48 audio channels;
  • Greater encoding efficiency at both constant and variable bitrates;
  • Sampling rates from 8 kHz to 96 kHz (MP3: 8 kHz to 48 kHz);
  • More flexible joint stereo mode.

AAC encoding

To do this we will use foobar2000.

foobar2000 has a minimalistic, extensible interface and includes many features for metadata support and high-quality audio playback. There are both official components and third-party components with a wide range of additional functions.

Key features of foobar 2000

  • Supported audio formats: MP3, MP4, AAC, Vorbis, FLAC, WAV, Audio CD, etc.;
  • Full Unicode support;
  • Volume equalization (ReplayGain);
  • Easily customizable interface design;
  • Advanced capabilities for working with tags;
  • Support for ripping Audio CD, as well as transcoding all supported audio formats using the component converter;
  • Full ReplayGain support;
  • Open architecture that allows third-party developers to expand the functionality of the player.

Operating system: Windows XP SP2 or higher, Vista, 7.

To get started, download the latest stable version of foobar2000 from the official site. You can also download additional components and plugins there. For foobar2000 to encode audio files into AAC, you need to download a free codec and place it in the folder where the program files are located.

You can download the codec from the developers' sites. There are two popular alternatives: the AAC codec from Nero, or QuickTime AAC from Apple.

There have long been heated discussions on professional forums about which AAC codec is better, and they often conclude that the psychoacoustic algorithms in the Nero codec are better implemented. For this article the codec chosen is Nero's (neroAacEnc.exe); once you have mastered the encoding procedure, you can try QT AAC (qaac.exe).

Launch foobar2000 and open the file that needs to be converted (File - Open...). Select the track and choose Convert from the drop-down list.


We are interested in the Output format item.


The following window will open


Convert Setup Menu


Go to the AAC (Nero) item and click Edit to launch the semi-automatic settings mode.


In this menu you can set the parameters of the AAC encoder (Encoder): the encoding mode (Mode) and the quality (Quality). The most efficient mode is variable bitrate (VBR), which is what foobar2000 recommends. Quality determines the quality of the output file: the higher the resulting bitrate, measured in kilobits/s, the higher the quality of the final audio file and the larger its size.
Here you need to find a compromise between quality and size, and this can only be determined experimentally. From my own experience, for many music files on a mobile phone, q in the range from 23 to 30 is quite enough. It all depends on the complexity of the music.

Exit the settings - click OK, then Back and finally Convert. A window will appear warning you that you are encoding into a lossy format.


Since this AAC file is going to be played on a mobile phone, some reduction in quality is inevitable. We agree and start encoding.


After a few minutes, if everything was done correctly, a file with the .m4a extension (AAC audio in an MP4 container) will be created. This file should play on your phone without any problems, but if your model refuses to play it, you can try simply changing the extension from .m4a to .aac.

There are also additional commands, so-called switches, that let you fine-tune the codec.

Let's look at the most important ones when encoding in VBR mode.

-ignorelength - ignore the length stated in the header of the input file; recommended.

-q - sets the sound quality: 0 is minimum quality, 1 is maximum. You can find a suitable value using the AAC (Nero) preset discussed above.

The remaining commands can be copied from the example below.

To be able to enter switches, you need to create a new profile in foobar2000. To do this, in the Convert Setup menu, click Add New and set your values.


The switches must be given in a certain order.

Example of a valid line: -ignorelength -q 0.52 -if - -of %d
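If you want to generate such parameter lines for several quality values, the small sketch below may help. It only assembles the string in the order shown above and does not run the encoder; the helper name is made up for this example, and it uses only the switches mentioned in this article, with `-` meaning standard input and `%d` being the foobar2000 placeholder for the destination file.

```python
def nero_vbr_args(quality: float, src: str = "-", dst: str = "%d") -> list[str]:
    """Build the switch list for VBR encoding in the order used in the article."""
    if not 0.0 <= quality <= 1.0:
        raise ValueError("quality must be between 0 (minimum) and 1 (maximum)")
    return ["-ignorelength", "-q", f"{quality:g}", "-if", src, "-of", dst]


# Parameter line to paste into the foobar2000 encoder profile.
print(" ".join(nero_vbr_args(0.52)))
```

Keeping the switches in a list rather than a single string also makes it easy to pass them to a process launcher later without worrying about quoting.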

There are variations of the format, HE-AAC and HE-AACv2, in which the AAC codec uses special algorithms for ultra-low bitrates. The codec selects the optimal encoding mode by itself, so there is no need to use the -lc, -he and -hev2 switches.

You can view the obtained characteristics of the audio file in the program

Today, the AAC format has still not achieved mass distribution on audio media, but in a number of parameters it surpasses all types of audio compression existing today, which means it is worthy of our attention.

What is it?

Let's start with a definition: AAC is a proprietary (patented) lossy audio compression format. At the same bitrate it loses less quality during encoding than MP3. AAC is a wideband audio encoding algorithm that uses two main encoding principles to significantly reduce the amount of data required to transmit high-quality digital audio. It is recognized as one of the highest-quality formats based on lossy compression, and it is supported by most modern equipment, including portable devices. Ringtones in AAC format can be purchased from the iTunes Store, and the store sells music compressed exclusively in this format. AAC was originally created as a successor to MP3 that could provide improved encoding quality. It was released back in 1997 as a new, seventh part of the MPEG-2 family.

Principle of operation

When encoding in this format, the following steps are performed. Signal components that cannot be perceived are removed, and redundancy is eliminated from the encoded audio signal. The data is then processed with the MDCT according to its complexity. Next, codes are added to correct internal errors. Finally, the signal is transmitted or stored.

All the details

The AAC format supports sampling frequencies in the range of 8-96 kHz and from 1 to 48 channels. MP3 uses a hybrid filter bank. AAC, in turn, uses the Modified Discrete Cosine Transform (MDCT) with an increased window size of up to 2048 points.

Thus, compared to MP3, AAC is much better suited to encoding audio that contains streams of complex pulses and square waves. The format can dynamically switch the MDCT block length between 2048 and 256 points. When a short-lived or one-off change occurs, a small 256-point window is used to achieve better time resolution. Otherwise the large 2048-point window is used by default to maximize encoding efficiency.

AAC has a number of advantages over conventional MP3: a large number of audio channels (up to 48), greater encoding efficiency at both constant and variable bitrates, sampling rates from 8 kHz to 96 kHz (for MP3, 8 kHz to 48 kHz), and a more flexible joint stereo mode.

As for AAC+, this is a codec focused on low bitrates. It combines SBR with AAC LC, which gives good sound already in the range of 32-48 kbps.

2009-09-30T20:52

Audiophile's Software

The first ideas about using psychoacoustic masking to compress audio data date back to 1979. However, the corresponding audio encoders only became widespread in the mid-90s, when the computing power of personal computers became sufficient to play compressed audio in real time and the MPEG-1 Audio Layer 3 standard, better known as MP3, appeared. Compressed audio formats have become indispensable for transmitting audio over the Internet, providing “virtually transparent” stereo sound quality (that is, the encoded signal is indistinguishable from the original to most listeners) at bitrates above 128 kbps. The basic principles of the MP3 format can be found in the articles by K. Glasman (2...8/2005).

The development of data compression methods and psychoacoustics gradually made the MP3 standard too “cramped” for new ideas in audio encoding. As a result, by 1997 the Fraunhofer Institute (Fraunhofer IIS), which created MP3 in the early 90s, together with Dolby, AT&T, Sony and Nokia, developed a new audio compression method, Advanced Audio Coding (AAC), which was included in the MPEG-2 and MPEG-4 standards. The main differences from the MP3 standard are:

  • support for a wider range of formats (up to 48 channels) and audio sampling frequencies (from 8 kHz to 96 kHz);
  • more efficient and simpler filter bank: the hybrid MP3 filter bank has been replaced by the conventional MDCT (modified discrete cosine transform);
  • wider limits for varying the frequency-time resolution in the filter bank - eight times (in MP3 - three times) - led to improved coding of transients (transient processes) and stationary sections of the audio signal;
  • better encoding of frequencies above 16 kHz;
  • a more flexible stereo coding mode, allowing you to switch to M/S (“joint stereo”) mode independently in different frequency bands;
  • additional features of the standard that increase compression efficiency: temporal noise shaping (TNS), long-term prediction of MDCT coefficients, parametric stereo coding, perceptual noise substitution (PNS), and high-frequency reconstruction (SBR, spectral band replication).

Thanks to these features, the AAC standard achieves more flexible and efficient, and therefore higher-quality, audio encoding. Because MP3 is so widely used, AAC has not yet gained comparable popularity. Nevertheless, AAC is the main format of the popular iTunes Store, the iPod, iTunes, the iPhone, PlayStation 3, Nintendo Wii and DAB+/DRM digital broadcasting.

Let's take a closer look at the main features of AAC.

Filter bank

Like other psychoacoustic audio encoders, AAC works according to the following scheme. The input signal is passed through a bank of filters - a transformation that transfers the signal from the time domain to the time-frequency domain (similar to constructing a spectrogram). In parallel, the psychoacoustic model analyzes the signal and determines the psychoacoustic masking thresholds. Next, the spectral coefficients of the signal at the output of the filter bank are quantized so that the noise spectrum, if possible (if the bitrate allows), is below the masking thresholds and is not audible. The quantized coefficients are losslessly compressed into an AAC output file. Thus, the filter bank itself does not compress the signal, it only converts it into a form more suitable for compression.
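To illustrate the last point, that the filter bank only changes the representation, here is a minimal MDCT sketch in plain Python. It is not the AAC filter bank itself: it assumes a sine window and a tiny block size, uses a direct O(N²) formula, and all function names are made up for this example. With no quantization between analysis and synthesis, overlap-adding consecutive frames gives back the input.

```python
import math


def sine_window(length):
    """Sine window; satisfies w[n]^2 + w[n + N]^2 = 1 for length = 2N."""
    return [math.sin(math.pi * (n + 0.5) / length) for n in range(length)]


def mdct(block):
    """Windowed MDCT: 2N time samples -> N spectral coefficients."""
    two_n = len(block)
    n = two_n // 2
    w = sine_window(two_n)
    return [
        sum(w[i] * block[i] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
            for i in range(two_n))
        for k in range(n)
    ]


def imdct(coeffs):
    """Windowed inverse MDCT: N coefficients -> 2N aliased time samples."""
    n = len(coeffs)
    w = sine_window(2 * n)
    return [
        w[i] * (2.0 / n) * sum(coeffs[k] * math.cos(math.pi / n * (i + 0.5 + n / 2) * (k + 0.5))
                               for k in range(n))
        for i in range(2 * n)
    ]


def analyse_and_resynthesise(signal, n):
    """Run MDCT then IMDCT on 50%-overlapping frames and overlap-add the results."""
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * n + 1, n):
        frame = imdct(mdct(signal[start:start + 2 * n]))
        for i, value in enumerate(frame):
            out[start + i] += value
    return out
```

Each frame of 2N samples yields only N coefficients, yet the aliasing introduced by one frame cancels against its neighbours, so every sample covered by two frames is recovered to within rounding error. The first and last N samples of the signal are covered by a single frame and are not reconstructed in this sketch.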

A feature of each filter bank is its frequency resolution, that is, the number of frequency bands into which it divides the signal spectrum. Most filter banks used for audio compression have several hundred bands. Because of the uncertainty principle, such filter banks have a time resolution on the order of several tens of milliseconds. When the spectral coefficients of a signal are quantized, the quantization error introduced is spread in time, on decoding, over the entire length of the filter bank window. In some cases this produces an undesirable effect called pre-echo: the quantization error from a transient (a sharp burst of energy in the signal) spreads into the time segment preceding the transient and becomes audible (Fig. 1). To reduce this effect, filter banks with variable time-frequency resolution are used. For example, MP3 switches the filter bank time resolution between 26 and 9 ms. For stationary signals, 26 ms windows are used to give good frequency resolution, and for transients, 9 ms windows are used to reduce pre-echo (see Fig. 1).

The AAC algorithm also uses MDCT window size switching. At the same time, the difference in the size of the windows is eightfold: 6 and 48 ms (256 and 2048 samples). Thanks to this, the algorithm is able to adapt to a wider range of signals and achieve a better degree of compression.
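The window durations quoted here follow from the window length in samples and the sampling rate. The snippet below is a small worked check; the 44.1 kHz sampling rate is an assumption (the article does not state which rate its millisecond figures refer to), and the function name is made up.

```python
def window_ms(samples: int, sample_rate_hz: int = 44_100) -> float:
    """Duration in milliseconds of a window of the given length."""
    return 1000.0 * samples / sample_rate_hz


short, long_ = window_ms(256), window_ms(2048)
print(f"short: {short:.1f} ms, long: {long_:.1f} ms, ratio: {long_ / short:.0f}x")
```

At 44.1 kHz this gives roughly 5.8 ms and 46.4 ms, close to the rounded 6 and 48 ms above; the eightfold ratio does not depend on the sampling rate.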

TNS technology - shaping the amplitude envelope of the noise

One of the problems of modern psychoacoustic audio encoders is handling transients (transient processes in an audio signal). To achieve transparent encoding, the quantization noise must stay below a time-dependent masking threshold. In practice, however, this requirement is difficult to satisfy near transients, because the quantization noise generated during encoding is spread in time over the entire length of the MDCT window during decoding. This can result in quantization noise significantly exceeding the temporal masking thresholds.

TNS (temporal noise shaping) technology in the AAC standard makes it possible to control how quantization noise spreads in time within each MDCT window. It is based on the similarity (time-frequency duality) between the amplitude envelope of a signal and the envelope of its spectrum, and on applying linear prediction (LPC) along the frequency axis when quantizing the spectrum.

It is well known that for signals whose spectrum is very different from white (for example, tones), linear prediction (LPC) in the time domain can effectively “whiten” the spectrum and encode such signals by decomposing them into prediction coefficients and a prediction error (residual) of relatively small amplitude. During decoding, the linear prediction filter shapes the error spectrum to follow the spectrum of the original signal.

An AAC encoder uses linear prediction in the opposite way: to predict spectral samples in the frequency domain. The difference between the original and predicted MDCT coefficients is quantized according to masking thresholds (in traditional encoders, the original MDCT coefficients are quantized). Linear prediction coefficients are also written to the output file. When decoding a signal, a linear prediction filter applied to a difference signal in the frequency domain (including quantization error) produces an amplitude envelope of the original signal (and quantization error) in the time domain. Thus, the amplitude envelope of the quantization errors becomes close to the amplitude envelope of the original signal (Fig. 2).

TNS reduces pre-echo and makes quantization errors less noticeable on some harmonic signals whose sound is produced in pulses (speech, some wind and bowed instruments). Fig. 2 compares the quantization errors introduced into a vocal signal by the AAC and MP3 algorithms at the same bitrate. Along with a general decrease in quantization error (due to the greater efficiency of AAC), the amplitude envelope of the quantization error can be seen to follow the envelope of the original signal.

In the AAC standard, TNS technology can be applied to individual frequency bands of the spectrum independently or disabled completely.
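The encoder/decoder filter pair that TNS relies on can be sketched in a few lines. This is a simplified illustration, not the procedure from the standard: the prediction coefficients are taken as given (a real encoder derives them from the spectrum), no quantization is applied to the residual, and the function names are made up. It only shows that prediction along the frequency axis in the encoder is undone by the matching recursive filter in the decoder.

```python
def tns_analysis(spectrum, lpc):
    """Encoder side: prediction residual along the frequency axis (FIR filter)."""
    residual = []
    for k, x in enumerate(spectrum):
        # Predict coefficient k from the lower-frequency coefficients before it.
        pred = sum(a * spectrum[k - 1 - i] for i, a in enumerate(lpc) if k - 1 - i >= 0)
        residual.append(x - pred)
    return residual


def tns_synthesis(residual, lpc):
    """Decoder side: all-pole filter along frequency restores the coefficients."""
    out = []
    for k, e in enumerate(residual):
        # Same prediction, but built from already reconstructed coefficients.
        pred = sum(a * out[k - 1 - i] for i, a in enumerate(lpc) if k - 1 - i >= 0)
        out.append(e + pred)
    return out
```

In a real codec it is the residual that gets quantized; because the decoder's filter is also applied to the quantization error, that error acquires the time-domain envelope of the signal, as described above.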

SBR technology - high frequency restoration

Reliable transmission of a wide frequency range is an important requirement for high-quality encoding. However, the transmission of each subsequent octave of the audio range increases the bitrate requirements for a traditional audio encoder by one and a half to two times. To reduce the bitrate and still maintain high frequencies in the encoded material, the technology of artificial synthesis of high frequencies SBR (spectral band replication) was created.

The technology is based on the fact that our hearing analyzes high frequencies with less accuracy than mid and low frequencies. To create the effect of the presence of high frequencies, it is not necessary to mathematically accurately reconstruct the waveform, but rather only to restore some essential psychoacoustic parameters of the signal at high frequencies. These essential parameters include the time-frequency distribution (envelope) of the signal energy and the degree of its tonality/noise.

The idea of the algorithm is this. During encoding, the high frequencies of the original audio signal are analyzed and their parameters are extracted, first of all the amplitude envelope in several (usually eight) frequency bands. The high frequencies are then removed from the recording, and only the remaining low and mid frequencies are encoded. A relatively small stream of information about the parameters of the removed high frequencies is also added to the output file.

During playback, the low and mid frequency signal is decoded first. Then the SBR decoder, if the player has one, starts working. Its first step is to synthesize a high-frequency signal by transposing (or rather, frequency shifting) the existing mid frequencies. Since the degree of tonality/noisiness of the spectrum is approximately the same at mid and high frequencies, this step produces a high-frequency signal with a plausible spectral structure. In the second step, the SBR decoder uses the additional stored high-frequency information to give this signal the desired amplitude envelope in each frequency band. The result is a signal whose high frequencies are synthesized entirely from the mid frequencies but still sound like the original high frequencies.
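The two decoder steps can be sketched on a plain list of spectral coefficients. This is a toy model under several assumptions: it works directly on one block of coefficients with equal-width bands, whereas real SBR operates on a separate filter bank and also transmits tonality information; the function names are made up, and the low band is assumed to be at least as wide as the high band being rebuilt.

```python
def band_energies(coeffs, band_size):
    """Sum of squares in consecutive bands of band_size coefficients."""
    return [sum(c * c for c in coeffs[i:i + band_size])
            for i in range(0, len(coeffs), band_size)]


def sbr_encode(spectrum, crossover, band_size):
    """Keep the low band; store only per-band energies of the high band."""
    return spectrum[:crossover], band_energies(spectrum[crossover:], band_size)


def sbr_decode(low, envelope, band_size):
    """Rebuild the high band from the low band and the stored envelope."""
    n_high = len(envelope) * band_size
    # Step 1: shift the top of the low band upward to serve as raw high band.
    patch = low[len(low) - n_high:]
    # Step 2: scale each band so its energy matches the stored envelope.
    high = []
    for b, target in enumerate(envelope):
        band = patch[b * band_size:(b + 1) * band_size]
        energy = sum(c * c for c in band)
        gain = (target / energy) ** 0.5 if energy > 0 else 0.0
        high.extend(c * gain for c in band)
    return low + high
```

The decoded high band has the right energy in every band but not the original waveform, which is exactly the trade-off the text describes: a few numbers per band instead of the coefficients themselves.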

SBR technology can be applied to many existing audio encoding methods. For example, SBR in combination with MP3 is called MP3 PRO, and SBR in combination with AAC is called HE-AAC (high efficiency AAC). Basically, SBR is used when encoding with relatively low bitrates: 64 kbit/s and below. The technology allows you to significantly expand the frequency range of the audio signal with a minimal increase in bitrate (several kbit/s).

Parametric stereo technology

Transmitting a stereo signal usually requires almost twice the bitrate of a mono signal. Stereo channels can be encoded either independently or after M/S conversion, where M is the sum (mid) of the two channels and S is their difference (side). In the latter case, the S channel often consumes less bitrate than the M channel. This encoding mode is also called joint stereo. In the AAC standard, the encoder can turn this mode on and off independently for each frequency band.
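M/S conversion itself is a simple, exactly reversible change of variables. The sketch below uses the common sum/difference definition with a factor of one half; the scaling convention and function names are assumptions for illustration, since the article does not give a formula.

```python
def to_mid_side(left, right):
    """M = (L + R) / 2 carries the common part, S = (L - R) / 2 the difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Inverse conversion: L = M + S, R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right
```

When the two channels are similar, the side values are small, which is why the S channel tends to need fewer bits than the M channel.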

For more efficient encoding of stereo signals at very low bitrates (16...32 kbit/s), parametric stereo encoding technology was developed. It consists in the fact that the stereo signal is reduced to mono before encoding, but a small stream (2...3 kbit/s) is added to the output file, containing information about the stereo panorama of the original stereo file. This stream contains (in compressed form) a kind of “panorama map” for the time-frequency plane.

In the decoding stage, frequency-dependent panning is applied to the resulting mono signal. This can be done simultaneously with decoding by applying appropriate amplitude multipliers to the initially equal MDCT coefficients of the left and right channels.
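A toy version of this scheme is sketched below. It is an illustration under stated assumptions rather than the HE-AAC v2 algorithm: the “panorama map” is reduced to one pair of amplitude gains per band for a single block of coefficients, bands have equal width, and all names are made up.

```python
def ps_encode(left, right, band_size):
    """Downmix to mono and store per-band gains describing the panorama."""
    mono = [(l + r) / 2 for l, r in zip(left, right)]
    params = []
    for i in range(0, len(mono), band_size):
        e_l = sum(x * x for x in left[i:i + band_size])
        e_r = sum(x * x for x in right[i:i + band_size])
        e_m = sum(x * x for x in mono[i:i + band_size])
        if e_m > 0:
            params.append(((e_l / e_m) ** 0.5, (e_r / e_m) ** 0.5))
        else:
            params.append((0.0, 0.0))
    return mono, params


def ps_decode(mono, params, band_size):
    """Apply the stored per-band gains to the mono coefficients."""
    left, right = [], []
    for b, (g_l, g_r) in enumerate(params):
        band = mono[b * band_size:(b + 1) * band_size]
        left.extend(g_l * m for m in band)
        right.extend(g_r * m for m in band)
    return left, right
```

For a source that is simply panned by amplitude, this reproduces both channels; anything that differs between the channels in phase or content is lost in the downmix and cannot be restored by gains alone.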

Parametric stereo technology gives a good impression of the original stereo sound at the cost of only a slight increase in bitrate compared to mono encoding. However, it does not allow you to achieve completely transparent sound, since it is unable to take into account all the nuances of the stereo panorama, for example, phase shifts between stereo channels.

Parametric stereo technology was included in the HE-AAC v2 standard.

PNS technology - noise generation

To further increase the coding efficiency of noise signals, the AAC standard provides PNS (perceptual noise substitution) technology for noise synthesis. It is known that our ear is more sensitive to the amplitude spectrum of a signal than to the phase spectrum. Therefore, instead of encoding the MDCT coefficients of the original signal in noise regions, you can only transmit the noise parameters: its power depending on frequency and time.

This is how PNS technology works. During encoding, regions of the spectrum that represent noise are identified, and the corresponding groups of MDCT coefficients are excluded from the encoding process. The frequency band is marked as noise and the total noise energy for it is stored.

When decoding, pseudo-random MDCT coefficients with the required total power are substituted into frequency bands marked as noise. As a result, in the indicated frequency ranges noise is synthesized that is close in sound to the original noise.
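The encode/decode pair for a single noise band can be sketched as follows. This is a simplified illustration: the detection of which bands count as noise is left out, the uniform random generator is an arbitrary choice, and the function names are made up.

```python
import random


def pns_encode_band(coeffs):
    """Encoder: keep only the total energy of a band marked as noise."""
    return sum(c * c for c in coeffs)


def pns_decode_band(energy, width, rng):
    """Decoder: pseudo-random coefficients rescaled to the stored energy."""
    noise = [rng.uniform(-1.0, 1.0) for _ in range(width)]
    noise_energy = sum(c * c for c in noise)
    gain = (energy / noise_energy) ** 0.5 if noise_energy > 0 else 0.0
    return [c * gain for c in noise]


# Hypothetical 16-coefficient noise band.
band = [random.Random(0).gauss(0.0, 1.0) for _ in range(16)]
stored = pns_encode_band(band)
synthetic = pns_decode_band(stored, len(band), random.Random(1))
```

The synthetic band shares nothing with the original except its energy, so a whole band is represented by a single number.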

Long-term prediction technology - prediction in time

Psychoacoustic coding of tonal signals requires a higher local signal-to-noise ratio than coding of noise-like signals (e.g., 20 dB and 6 dB, respectively). This, in turn, requires a higher bitrate. However, the MDCT coefficients of tones are predictable over time, which makes it possible to exploit their time dependence to reduce the bitrate.

The AAC standard provides a long-term prediction mode, in which MDCT coefficients are additionally encoded in time using linear prediction. The term “long term” means that the prediction is made not from adjacent samples, but from samples separated by the most probable tone period at a given frequency.
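The idea of predicting from samples one tone period back can be illustrated in the time domain. The sketch below is a deliberate simplification and not the procedure from the standard: it searches for a single lag and gain on raw samples by brute force, and all names are made up. The lag is restricted to be at least one frame long so that the prediction uses only past samples.

```python
import math


def ltp_predict(history, frame, min_lag, max_lag):
    """Find the lag and gain that best predict `frame` from past samples."""
    best_lag, best_gain, best_score = 0, 0.0, 0.0
    for lag in range(max(min_lag, len(frame)), max_lag + 1):
        start = len(history) - lag
        if start < 0:
            break
        past = history[start:start + len(frame)]
        energy = sum(p * p for p in past)
        if energy == 0:
            continue
        corr = sum(f * p for f, p in zip(frame, past))
        score = corr * corr / energy  # reduction in residual energy
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    if best_lag:
        start = len(history) - best_lag
        prediction = [best_gain * p for p in history[start:start + len(frame)]]
    else:
        prediction = [0.0] * len(frame)
    residual = [f - p for f, p in zip(frame, prediction)]
    return best_lag, best_gain, residual


# Hypothetical pure tone with a period of 20 samples.
tone = [math.sin(2 * math.pi * i / 20) for i in range(116)]
lag, gain, residual = ltp_predict(tone[:100], tone[100:], 16, 60)
```

For a steady tone the residual is close to zero, so only the lag and gain need to be transmitted; for noise-like material no lag helps and the residual stays as large as the signal.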

Quantization and compression of MDCT coefficients

Similar to the MP3 standard, AAC uses nonlinear quantization of MDCT coefficients and compression using the Huffman method. MDCT coefficients are quantized after raising to the 0.75 power, allowing the quantization error to be increased for strong signals and reduced for weak signals within each frequency band. In this way, additional implicit formation of the noise spectrum is carried out.
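The effect of the 0.75 power can be seen in a stripped-down quantizer. This sketch assumes a single step size and plain rounding; the actual standard uses per-band scale factors and a specific rounding offset, and the function names here are made up.

```python
def quantize(x: float, step: float) -> int:
    """Compress the magnitude with a 0.75 power, then round to an integer."""
    sign = -1 if x < 0 else 1
    return sign * round(abs(x) ** 0.75 / step)


def dequantize(q: int, step: float) -> float:
    """Undo the power law: raise the scaled index to the 4/3 power."""
    sign = -1 if q < 0 else 1
    return sign * (abs(q) * step) ** (4.0 / 3.0)


# Spacing between neighbouring reconstruction levels for small and large indices.
for q in (1, 10, 100):
    print(q, dequantize(q + 1, 1.0) - dequantize(q, 1.0))
```

The gap between neighbouring reconstruction levels widens as the index grows, so large coefficients are quantized more coarsely than small ones, which is the implicit noise shaping mentioned above.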

After quantization, the MDCT coefficients are compressed using a set of fixed Huffman tables. In the AAC standard there are more of these tables than in MP3, and there are wider possibilities for grouping coefficients. This results in an additional increase in compression.

Sound quality

When assessing the sound quality of audio encoders, subjective tests are usually used. Listeners are presented with fragments of recordings compressed by different encoders, and they rate the sound purity of each fragment on a scale from 1 to 5. The best codec is considered to be the one that is able to achieve higher sound quality compared to its competitors at a given bitrate.

A fairly authoritative Internet source that publishes the results of such tests is the site http://www.rjamorim.com/test/. It presents tests of various codecs at a variety of bitrates. The results are generally in good agreement with other sources. Here are some results for MP3 and AAC encoders to help compare their quality.

The best MP3 encoder is the free Lame. However, at most bitrates it is inferior in quality to newer compression standards. At high bitrates (above 128 kbps) this lag is small, and the leader is the Ogg Vorbis encoder.

At a bitrate of 64 kbps, the advantage of AAC is already noticeable. In the HE-AAC variant, the algorithm earns a score of 3.68. This roughly corresponds to Lame at a bitrate of 96 kbps, which means that AAC is about 1.5 times more efficient than MP3 (96/64 = 1.5). Lame's score at 128 kbps is 4.29.

At a bitrate of 32 kbit/s, the AAC encoder from Nero has a significant improvement in quality compared to MP3: scores of 3.23 and 1.72, respectively. However, AAC is only slightly ahead of the MP3PRO format, which received a score of 3.08. This indicates that SBR technology does significantly improve quality at low bitrates.

Conclusions

Thanks to the new technologies used in the AAC standard, this format has a noticeable advantage over MPEG-1 Layer 3 (MP3), achieving better sound quality at the same bitrates. The gain is particularly strong at low bitrates: 96 kbit/s and below. This confirms the promise of the AAC format for digital broadcasting.

The popularity of AAC for distributing music on the Internet today remains low compared to the MP3 format. Users continue to prefer the better portability of MP3 over the stronger compression of AAC. A significant part of the music archives on sites that distribute music are already initially in MP3 format, and providers do not have access to uncompressed recordings. This means that there is little point in transcoding such recordings into AAC format - the quality is often already lost. However, new pocket players and some online stores already support the AAC format, often with verification of the legality of the content (which also discourages users who prefer not to limit themselves in copying music).

Although very promising, the AAC format is not the only high-quality audio compression format. At high bitrates (above 128 kbps), AAC is often inferior in quality to the Ogg Vorbis and Musepack encoders. At the lowest bitrates (less than 32 kbit/s), AAC may be inferior to parametric audio encoders, including specialized encoders for speech compression. However, in the medium-low bitrate range AAC currently holds the lead.

Alexey Lukin
Magazine "Sound Engineer" 2008 #1

AAC and ALAC both use the same container, but with ALAC no information is lost.

AAC (Advanced Audio Coding) was originally created as a successor to MP3 with improved encoding quality. The AAC format, officially known as ISO/IEC 13818-7, was released in 1997 as the seventh member of the MPEG-2 family. There is also an AAC format known as MPEG-4 Part 3.

How does AAC work?

  1. Signal components that are not perceived by humans are removed.
  2. Redundancy in the encoded audio signal is removed.
  3. The signal is then processed using the MDCT method according to its complexity.
  4. Internal error correction codes are added.
  5. The signal is stored or transmitted.

AAC files in an MP4 container use the following extensions:

  • .m4a - standard extension;
  • .m4b - AAC file that supports bookmarks; used for audiobooks and podcasts;
  • .m4p - protected AAC file; used to protect a file from being copied when legally downloading copyrighted music from online stores such as the iTunes Store;
  • .m4r - ringtone file used on the Apple iPhone.
