A free Radio World ebook explores “Trends in Codecs 2024.”
Hartmut Foerster is an audio codec specialist and product manager for WorldCast Systems. He has 25 years of experience in the field, from the early days of ISDN to current trends in virtualization. This interview is excerpted from the ebook.
Radio World: What do you consider the most important current or recent trend in codecs?
Hartmut Foerster: The increasing integration of IP-based technologies. A vital aspect of this trend is the merging — or rather, porting — of the benefits of synchronous lines with those of IP-based connections. This means we are transferring the reliability, low latency and high quality we know from traditional synchronous transmissions to the flexible, scalable and cost-efficient world of IP networks.
Combining the strengths of both worlds leads to a significant improvement in the efficiency and adaptability of broadcast systems, which is crucial for the future of broadcasting and media production. These efforts are ongoing, as seen in the more recently launched SMPTE standard series ST 2022 and ST 2110, as well as proprietary and open approaches such as APT SureStream, SRT and RIST, among others.
Another trend is the transmission of the composite/MPX signal in place of baseband audio links, fueled by developments such as µMPX and the more recent APTmpX. The advantages of sending a fully processed and composed signal at a low bitrate will prevail in the medium term.
Another emerging trend, building on the previous one, relates to the actual design of codecs. Moving away from the traditional approach of using separate hardware for each function, the shift towards codec virtualization is gaining ground. This means running codecs purely as software instances on universal server platforms, offering increased flexibility and scalability. When combined with studio LAN formats such as AES67, to name one, this approach is bringing the “All IP” vision for production environments and broadcasting closer to reality.
RW: How does the growing use of the cloud influence codecs and how they are deployed?
Foerster: Cloud use in everyday broadcasting can bring considerable advantages. However, regular use of cloud services places considerable demands on connections, especially if they are not within a managed network under the organization’s control. As described initially, protection mechanisms (SureStream, SRT, RIST) must be used in the gateway codecs (hardware or software). The aspect of signal security is also a growing topic. Encryption, stream authentication and VPN are some of the keywords associated with this.
If these aspects are considered, cloud usage can fully exploit the following advantages:
- Decentralized distribution: Cloud technologies enable decentralized codec deployment, eliminating the need for physical hardware at each transmission location. Codecs can be managed centrally in the cloud and distributed to different locations, enabling greater agility and efficiency in broadcast production.
- Flexibility in deployment planning: The cloud allows radio codecs to adapt quickly and flexibly to different transmission requirements and formats. This is particularly useful for broadcasters working with different standards and formats.
- Improved accessibility and operability: The cloud facilitates remote access and control of codecs, which is beneficial for broadcasters with multiple sites and for conducting live broadcasts from the field.
- Reducing cost and complexity: The cloud reduces the cost and complexity of maintaining physical codec hardware. Broadcasters can save costs by moving to cloud-based services that require less maintenance and are easier to manage.
- Faster implementation of new technologies: The cloud allows broadcasters to move quickly to new codec technologies and standards without investing in new hardware. This allows them to adapt to market changes more quickly.
RW: How will virtualization and software-integrated air chains change how engineers deploy codecs?
Foerster: Despite all the simplification in handling and daily operations, engineers must consider special requirements when selecting virtual codecs.
It starts with the type of installation: a pure software installation on possibly unfamiliar platforms, or even execution in a cloud. Questions arise about the installation effort and the underlying architecture, such as a virtual machine on a server or a Docker runtime with Kubernetes orchestration. These are all typical IT topics that the broadcast engineer must first get used to.
Further questions arise about handling errors; in particular, engineers must know how the system behaves in exceptional situations.
This leads me to another important point, namely the presentation of the installation in a monitoring system. In my experience, real-time visualization of the status of the virtual air chain, including all affected components, is essential.
The question of the monitoring system, therefore, takes center stage for daily operation. Various providers of software appliances in the broadcast industry have understood this and combined the actual codec instances with a convenient monitoring system.
At this point, and as an example, I would like to mention the system from WorldCast Systems, which has focused its SNMP monitoring system Kybio on these virtual applications. It has been designed as a central management and monitoring console for APT virtual codecs and all physical components in the air chain or other applications. Comprehensive overview synoptics facilitate the immediate detection of status changes or automatic redundancy actions, to name but a few.
Another critical point is the manufacturer’s continuous software maintenance. Broadcast engineers are well advised to stick to manufacturers with many years of experience with codecs and accept the support contracts they offer. Unlike hardware codecs, software codecs need to be adapted to changing generic IT infrastructures. These support contracts make sense and should not just be seen as a revenue stream for the manufacturer.
RW: How do today’s codecs avoid problems with dropped packets?
Foerster: The problem of lost packets is a constant companion in broadcasting over IP networks.
The best prevention is to avoid such situations with carefully managed networks — unfortunately, these are relatively rare. Compensating for packet loss and keeping its effects to a minimum starts with choosing a suitable audio format. Frame-based algorithms (e.g., all MPEG formats) require large IP packets, and the larger the packet, the more audio is lost with it. Lost frames are usually audible. Algorithms that are not frame-based, such as linear PCM or Enhanced aptX, are better suited. Both have proven their worth. A loss of 2 ms or 4 ms of audio is not perceptible, even less so if error concealment is used (an interpolation between two level states).
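The interpolation-based concealment mentioned above can be sketched in a few lines. This is a minimal illustration, not any vendor's actual algorithm; the function name and sample values are invented for the example.

```python
# Sketch of simple error concealment for a lost PCM packet: the gap is
# filled by linearly interpolating between the last good sample before
# the loss and the first good sample after it.

def conceal_gap(prev_samples, next_samples, gap_len):
    """Return gap_len samples ramping between the two known levels."""
    start = prev_samples[-1]            # last level before the loss
    end = next_samples[0]               # first level after the loss
    step = (end - start) / (gap_len + 1)
    return [start + step * (i + 1) for i in range(gap_len)]

# A 2 ms gap at 48 kHz is 96 samples; a short ramp like this bridges
# the level jump instead of leaving an audible discontinuity.
filled = conceal_gap([0.0, 0.1, 0.2], [0.8, 0.7], gap_len=4)
```

This also shows why small packets matter: the shorter the gap, the less the interpolation has to guess.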
Redundancy solutions: One prevention is forward error correction. FEC adds a configurable number of redundant packets to the data stream to replace lost packets to a certain extent. This works well as long as only a few single packets get lost and latency does not play a role in the transmission, as FEC adds high latency. For satellite connections, FEC is a good and proven protection.
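The principle behind FEC can be sketched with the simplest scheme, a single XOR parity packet per group. Real broadcast FEC (e.g., the row/column schemes standardized for IP transport) is more elaborate; the function names here are illustrative.

```python
# Minimal XOR-parity FEC sketch: one parity packet protects a group of
# media packets, so any single loss within the group can be rebuilt.

def xor_parity(packets):
    """Byte-wise XOR of all packets in the group."""
    parity = bytearray(len(packets[0]))
    for pkt in packets:
        for i, b in enumerate(pkt):
            parity[i] ^= b
    return bytes(parity)

def recover(received, parity):
    """Rebuild the one missing packet (marked None) from the parity."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) != 1:
        return None                     # zero losses, or too many to fix
    rebuilt = bytearray(parity)
    for pkt in received:
        if pkt is not None:
            for i, b in enumerate(pkt):
                rebuilt[i] ^= b
    return bytes(rebuilt)

group = [b"pkt0", b"pkt1", b"pkt2", b"pkt3"]
parity = xor_parity(group)
recovered = recover([b"pkt0", None, b"pkt2", b"pkt3"], parity)  # b"pkt1"
```

Note that the receiver must buffer the entire group before it can recover anything, which is exactly the added latency the interview points out.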
Other formats, such as SRT and RIST, rely on ARQ, or automatic repeat request. This mechanism is standard in the TCP protocol and is used in a modified form over UDP in the formats mentioned. ARQ requires a point-to-point connection between the codecs, which can lead to complications in practice. The ARQ method is also not latency-neutral; time windows (latencies) are required to wait for resent packets. Even so, SRT and RIST are accepted and widely used protocols.
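The latency cost of ARQ can be illustrated with toy receiver logic: when a sequence gap is detected, a NACK would be sent and playout is held for a retransmission window before the packet is given up on. This is illustrative pseudologic, not the actual SRT or RIST state machine; the class and parameter names are invented.

```python
# Toy ARQ receiver: the `wait_ms` retransmission window is precisely
# the extra latency the protocol must budget for.

class ArqReceiver:
    def __init__(self, wait_ms):
        self.wait_ms = wait_ms
        self.buffer = {}          # seq -> payload
        self.next_seq = 0
        self.gap_since = None     # time we first stalled on a missing packet

    def on_packet(self, seq, payload):
        self.buffer[seq] = payload

    def playout(self, now_ms):
        """Release in-order packets; skip a gap once its window expires."""
        out = []
        while True:
            if self.next_seq in self.buffer:
                out.append(self.buffer.pop(self.next_seq))
                self.next_seq += 1
                self.gap_since = None
            else:
                if self.gap_since is None:
                    self.gap_since = now_ms   # a NACK would be sent here
                if now_ms - self.gap_since >= self.wait_ms:
                    self.next_seq += 1        # give up; concealment takes over
                    self.gap_since = None
                    continue
                break                          # still hoping for the resend
        return out
```

If the resend arrives inside the window, playout resumes with no audible loss; either way, every listener pays the window as delay.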
The most pragmatic solution for avoiding packet loss is redundant streaming, as introduced by APT’s SureStream. This means identical content is sent multiple times over one or more network connections. This should not be confused with bonding, as each stream contains 100% of the payload.
This method has been widely copied since the introduction of APT SureStream and is a latency-neutral, reliable way to avoid packet loss. It is data-agnostic and can be used in any network. ST 2022-7, for example, implements a similar method.
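The receive side of redundant streaming reduces to deduplication by sequence number, which is why it adds no waiting window. A minimal sketch of the idea behind SureStream-style or ST 2022-7-style merging, with invented names and payloads:

```python
# Sketch of redundant-stream merging: identical streams arrive over
# different paths, the receiver keeps the first copy of each sequence
# number, and a packet is only truly lost if it is missing from every path.

def merge_streams(*streams):
    """Each stream is a list of (seq, payload); duplicates are dropped."""
    seen = {}
    for stream in streams:
        for seq, payload in stream:
            seen.setdefault(seq, payload)   # first arrival wins
    return [seen[s] for s in sorted(seen)]

path_a = [(0, "p0"), (2, "p2"), (3, "p3")]  # packet 1 lost on path A
path_b = [(0, "p0"), (1, "p1"), (3, "p3")]  # packet 2 lost on path B
audio = merge_streams(path_a, path_b)        # complete: p0, p1, p2, p3
```

Unlike FEC or ARQ, nothing here waits on a timer: each packet is released as soon as its first copy arrives, which is what makes the method latency-neutral.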
RW: What is considered reasonably low latency at this point?
Foerster: This must be assessed from the perspective of the application.
In the case of communication in the sense of a dialog via IP codecs, the ITU recommendation states that one-way latency should not exceed 150 ms if possible.
If a radio presenter uses off-air monitoring, a latency of up to 50 ms is still acceptable. Higher values make off-air monitoring difficult or impossible.
In signal distribution, i.e., in the distribution networks to transmitter sites, the absolute latency is not of great relevance as long as it remains below approximately 1000 ms, though opinions differ somewhat here. More important than the objective latency in the distribution network is the temporal alignment of the same program when it is broadcast over different frequencies and the RF zones of the transmitters overlap. Here, the difference should not exceed 20 ms to avoid syllable repetition or jumps in the modulation.
How do I achieve reliable transmission with low latency in IP networks? Codecs that combine low-latency compression formats with packet-loss protection that does not increase latency are best equipped. In addition to bandwidth-hungry linear PCM and the old ITU formats such as G.711 or G.722 — both of which are still occasionally considered for voice applications only — low-latency formats today include Enhanced aptX (2 ms), Opus (20 ms) and perhaps AAC-ELD (~50 ms). All other formats cause too much coding delay.
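A back-of-envelope latency budget makes the codec choice concrete. The coding delays are the figures quoted above; the network and jitter-buffer values are purely illustrative assumptions.

```python
# End-to-end latency budget for an IP contribution link, using the
# coding delays quoted in the interview. Network path and jitter-buffer
# figures are invented for the example.

CODING_DELAY_MS = {"Enhanced aptX": 2, "Opus": 20, "AAC-ELD": 50}

def end_to_end_ms(codec, network_ms, jitter_buffer_ms):
    return CODING_DELAY_MS[codec] + network_ms + jitter_buffer_ms

# With a 15 ms network path and a 10 ms jitter buffer, Enhanced aptX
# lands at 27 ms, inside the ~50 ms off-air monitoring limit, while
# AAC-ELD at 75 ms would already be too slow for monitoring.
budget_aptx = end_to_end_ms("Enhanced aptX", network_ms=15, jitter_buffer_ms=10)
budget_eld = end_to_end_ms("AAC-ELD", network_ms=15, jitter_buffer_ms=10)
```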
A latency-neutral protection method against packet loss must be used if managed networks are unavailable. Methods such as forward error correction or ARQ-based formats are rather unsuitable for low-latency applications. Redundant streaming, as offered in various codecs, is well suited; such methods include APT SureStream and ST 2022-7, among other implementations. SureStream can help reduce the buffer size to a minimum because it requires no headroom for burst-like packet losses or packet comparisons. This redundancy method, therefore, also has a latency-reducing effect.
RW: What tools are available for sending audio to multiple locations at once?
Foerster: There are various starting points from which to answer this question.
Essentially, the most suitable approach depends on the geographical conditions. If an area is poorly developed, i.e., the sites are not connected to a suitable WAN or the internet, over-the-air access is the most suitable option. This usually means satellite connections or the mobile internet (4G/5G), though the latter still presents particular challenges.
However, if a suitable WAN connection is available, be it the internet or dedicated links, distribution via a cloud data center is a very innovative option. The data center has sufficient bandwidth to send hundreds of unicast streams. The codec instances in the cloud are software appliances, preferably in a high-availability architecture. This makes it comparatively easy to meet all requirements.
These requirements include redundant streaming, redundant codecs for high availability, and high scalability. The security aspects are also easy to fulfill, and there are no restrictions concerning the data format: baseband audio or linear MPX can be distributed in the same way as compressed formats like µMPX and APTmpX. Again, when connecting a broadcast center to the cloud, precautions must be taken with the gateway codec.