Published on November 27, 2007
The Twilight of Videotape: New Ways to Migrate Video
Uncompressed, Lossy, and Lossless Compression in Digital Media Formats
Jim Lindner, Managing Member, Media Matters, LLC

Video as Data: This session focuses not on the how of storage, but on the what of storage. It presents the results of a study, the Dance Heritage Coalition project “Digital Video Preservation Reformatting Project”.

What is the Impact of Compression on Video: Technical issues relating to compression have the potential to:
Cause changes (artifacts) to the piece that may or may NOT be within the artistic intent of the piece.
Create new “classes” of masters and copies at different resolutions and quality levels.

There is No Free Lunch: Compression saves valuable resources, but there are almost always tradeoffs.
Loss of information (quantitative): how critical was the loss?
Loss of quality: did anyone notice?
Time and processing power to encode and decode: software encoders are slower than hardware encoders. Generally “real time” is fast enough, but may not be as efficient.

Case Study for Video Preservation File Format: Dance Heritage Coalition Preservation File Format.

Dance Heritage Coalition: Towards a Preservation File Format: Partially funded by a grant from the Mellon Foundation. Discussion of electronic media preservation, specifically video and audio.

Test Objectives: Quality, Usability, Preservability.

Test Objective: Quality: The quality of the picture and sound, including resolution, chroma bandwidth, luminance, sync, and a lack of phase shifts. A copy will pass the quality test if the measurement of these elements shows little or no diminishment or degradation when compared to the measurements of the original.
Test Objective: Usability: The end product (the resulting preservation master copy, or working copies made from that master) must support the following performance measures:
a. It must be possible to edit the copy.
b. The copy must retain any information that allows users to run processes on the footage, such as search engines.
c. The copy must allow output that can produce an HDTV copy.
d. The copy must permit tape-to-film transfer, and must allow freeze framing. (Freeze frame capability is important for the dance community, where users must be able to view single frames clearly to study choreographic details.)

Test Objective: Preservability: Preservability of the end product (i.e., the end product must be migratable and must avoid technical protection, such as encryption). The format must be open source, public, well documented, and should carry no fee or very low fees.

Method:
Selected 22 clips for analysis, representing diverse motion and footage that we believe may be problematic.
Prepared for analysis: transferred uncompressed to .AVI using strict control over levels.
Compressed each clip using 6 different codecs.
Analyzed using 14 different metrics.

Exhaustive Analysis: Several weeks of computer time. 22 clips x 6 compression types x 14 different metrics = 1848 different clip analyses.

Clips: Jacob’s Pillow 1992 Gala, Ted Shawn Thtr. Presentation. Hi-8.

Clips: NYPL. CATHY WEIS PROJECTS. NOVA PRODUCTIONS FROM SKOPJE, MACEDONIA. NOT SO FAST, KID!
Excerpt from Show Me. The Kitchen, New York City, January 11, 2001. DVCam.

Clips: NYPL. Choreography and text by NEIL GREENBERG. Performed by ELLEN BARNABY, CHRISTOPHER BATENHORST, NEIL GREENBERG, JUSTINE LYNCH, JO MCKENDRY. NOT-ABOUT-AIDS-DANCE, excerpt. Performed by Dance by Neil Greenberg. The Kitchen, New York, December 15, 1994. ¾" U-matic.

Compression Types:
.mov files: Sorenson Video 3, 640 x 480, millions of colors, 29.97 fps, interlaced bottom field first, key frame every 300 frames, aspect ratio 4:3, bitrate limit 1200 kbps, spatial quality 50, image smoothing on.
.mp4 files: MPEG-4 Video, 640 x 480, millions of colors, 29.97 fps, interlaced bottom field first, key frame every 300 frames, aspect ratio 4:3, bitrate 1229 kbps.
.rm files: Real Media 9, 640 x 480, millions of colors, bitrate 1067 kbps, constant bit rate, 29.97 fps, 4:3 aspect ratio, progressive (no option for interlaced), key frame every 300 frames.
.wmv files: Windows Media Video 9 Professional, bitrate ~1340, variable bit rate, 29.97 fps, 4:3 aspect ratio, interlaced bottom field first, key frame interval 300 frames.
mpeg-2: 20 Megabit, 640 x 480, 29.97 fps, interlaced bottom field first, constant bit rate, 4:3 aspect ratio, GOP pattern IPBBIPBB, long GOP, sequence headers for each GOP, high motion search range.
Jpeg2000: Motion JPEG2000 (Kakadu), variable bit rate (lossless), 29.97 fps, interlaced bottom field first, 4:3 aspect ratio, 5/3 reversible, millions of colors.

Video Quality Metrics: Genista’s “Media Optimacy”.

Analysis Tools: Tools were needed to examine the files at the signal level in order to establish where and when in a file artifacts appear as a result of compression. Perceptual quality measurement tools, such as Genista’s Media Optimacy, have enabled content providers to develop associated network delivery mechanisms for the best possible audience experience.
Video Quality Metrics: Genista has developed a set of metrics for measuring the quality of digital video and still images. Genista's quality metrics measure the typical artifacts introduced by processing (notably compression) and transport of digital video. Additionally, a metric exists to predict the Mean Opinion Score (MOS), i.e., to reproduce the results of human subjective tests on overall image quality.

Video Quality Metrics: The metrics are not merely based on network statistics or network performance parameters such as packet loss. They take into account the image content and frame data of the video resulting from the given coding and transmission conditions. The metrics can be divided into spatial and temporal metrics.

Spatial Metrics: Spatial metrics, such as blockiness, perform their measurements on a frame-by-frame basis, returning a result for each frame measured. Temporal metrics, such as jerkiness, look at two or more consecutive frames simultaneously to obtain a measurement. MOS prediction takes into account both spatial and temporal aspects.

Relative and Absolute Metrics: Video quality measures can be divided into relative (full-reference, FR) metrics and absolute (non-reference, NR) metrics. FR metrics compare a compressed or otherwise processed video directly with the original, whereas NR metrics analyze any video without the need for a reference, using only the data contained in the clip under test.

Non-reference Metrics: Non-reference metrics target real-time measurement of streaming video. They are useful for monitoring quality variations due to network problems, as well as for applications where service level agreements and quality control are required. Another use is characterization of the reference content prior to encoding or processing. Currently non-reference metrics exist to measure jerkiness, blockiness, blur, and MOS.
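To make the full-reference, frame-by-frame idea concrete, here is a minimal sketch of two widely used arithmetic distance measures, root mean square error (RMSE) and peak signal-to-noise ratio (PSNR), computed on luminance. It is not Genista's implementation. It assumes 8-bit luminance samples (peak value 255) stored as nested lists, one list per row, and the function names are illustrative only.

```python
import math


def rmse(reference, processed):
    """Root mean square error between two equally sized luminance frames."""
    if len(reference) != len(processed):
        raise ValueError("frames must have the same number of rows")
    total = 0
    count = 0
    for ref_row, proc_row in zip(reference, processed):
        if len(ref_row) != len(proc_row):
            raise ValueError("rows must have the same number of samples")
        for r, p in zip(ref_row, proc_row):
            total += (r - p) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    return math.sqrt(total / count)


def psnr(reference, processed, peak=255):
    """PSNR in dB; infinite when the processed frame is identical."""
    error = rmse(reference, processed)
    if error == 0:
        return math.inf
    return 20 * math.log10(peak / error)


def per_frame_psnr(reference_frames, processed_frames, peak=255):
    """Spatial metric: one result per frame pair, as described above."""
    return [psnr(r, p, peak) for r, p in zip(reference_frames, processed_frames)]
```

A mathematically lossless copy would return infinite PSNR for every frame, while a lossy copy returns one finite value per frame that can be plotted over the length of the clip.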
Metrics Categories: Fidelity metrics measure the mathematical difference between processed and reference video. Spatiotemporal metrics are as defined by the ANSI standard. Perceptual metrics include a prediction of MOS, which provides an overall perceptual quality score on the MOS scale. Each of

Fidelity Metrics: These metrics are widely used and represent arithmetic measures of the distance between processed and reference video. They are full-reference metrics by definition. They do not take into account human perception.

Fidelity Metrics:
PSNR (FR, spatial): Peak signal-to-noise ratio (luminance).
SNR (FR, spatial): Signal-to-noise ratio (luminance).
RMSE (FR, spatial): Root mean square error (luminance).
Color PSNR (FR, spatial): PSNR from CIE

Metric (Type): Description
Motion energy difference (FR, temporal): Added motion energy indicates error blocks, noise. Lost motion energy indicates jerkiness.
Repeated frames (FR, temporal): Indicates dropped or repeated frames.
Edge energy difference (FR, spatial): Added edge energy indicates edge noise, blockiness, and noise. Lost edge energy indicates blur.
Horizontal and vertical edges (FR, spatial)
Spatial frequencies difference

Spatiotemporal Metrics: These rely on algorithms defined by recommendations from the American National Standards Institute (ANSI). This recommendation represents an attempt by a standards body to define objective measures that serve as a basis for the measurement of video quality.

Perceptual Metrics: Perceptual quality metrics measure specific artifacts introduced into the video as perceived by a human viewer. These artifacts are well known, and are easily recognized even by non-experts. The metrics provide an automatic measure of those artifacts that viewers will perceive, in a way that is correlated with human perception. Additionally, a metric exists to make a prediction of Mean Opinion Score (MOS), (i.e.
reproducing the results of human subjective tests).

Jerkiness: Perceptual measure of frozen pictures or motion that does not look smooth. The primary causes of jerkiness are network congestion and/or packet loss. It can also be introduced by the encoder dropping or repeating entire frames in an effort to meet the given bit-rate constraints. A reduced frame rate can also create the perception of jerky video.

Blockiness: Perceptual measure of the block structure that is common to all DCT-based image compression techniques. The DCT is typically performed on 8x8 blocks in the frame, and the coefficients in each block are quantized separately, leading to artificial horizontal and vertical borders between these blocks. Blockiness can also be caused by transmission errors, which often affect entire blocks in the video. Genista has developed both FR and NR blockiness metrics.

Blur: Perceptual measure of the loss of fine detail and the smearing of edges in the video. It is due to the attenuation of high frequencies at some stage of the recording or encoding process. It is one of the main artifacts of wavelet-based compression techniques, such as JPEG2000, where transmission errors or packet loss can also induce blur. Other important sources of blur are low-pass filtering (e.g., analog VHS tape recording), out-of-focus cameras, or high motion (leading to motion blur).

Noise: Perceptual measure of high-frequency distortions in the form of spurious pixels. It is most noticeable in smooth regions and around edges (edge noise). It can arise from noisy recording equipment (analog tape recordings are usually quite noisy), from the compression process, where certain types of image content introduce noise-like artifacts, or from transmission errors (especially uncorrected bit errors).
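The block structure described under Blockiness can be illustrated with a deliberately simple no-reference measure: compare the average luminance step across 8x8 block borders with the average step everywhere else. This is an illustrative assumption, not Genista's metric and not a perceptual model. For brevity it looks only at horizontal steps (that is, vertical block borders), and the function name is hypothetical.

```python
def blockiness_ratio(frame, block=8):
    """Mean absolute luminance step at block borders divided by the
    mean step inside blocks. Values well above 1 suggest visible blocks."""
    border_sum = border_n = 0
    inner_sum = inner_n = 0
    for row in frame:
        for x in range(1, len(row)):
            step = abs(row[x] - row[x - 1])
            if x % block == 0:
                # Step between the last sample of one block and the
                # first sample of the next block.
                border_sum += step
                border_n += 1
            else:
                inner_sum += step
                inner_n += 1
    if border_n == 0 or inner_n == 0:
        raise ValueError("frame is too narrow for this block size")
    border_mean = border_sum / border_n
    inner_mean = inner_sum / inner_n
    if inner_mean == 0:
        # Perfectly flat blocks: any border step at all is pure blockiness.
        return float("inf") if border_mean > 0 else 1.0
    return border_mean / inner_mean
```

A smooth gradient gives a ratio of 1, because border steps look like any other step, while a frame made of flat 8-pixel-wide blocks at different levels gives a very large ratio.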
Ringing: Perceptual measure of ripples typically seen around high-contrast edges in otherwise smooth regions (the technical cause for this is referred to as the Gibbs phenomenon). Ringing artifacts are very common in wavelet-based compression schemes (e.g., JPEG2000), but also appear to a slightly lesser extent in DCT-based compression techniques (e.g., JPEG, MPEG).

Colorfulness: Describes the intensity or saturation of colors as well as the spread and distribution of individual colors in the image. The range and saturation of colors often suffer due to compression.

Mean Opinion Score: MOS prediction. MOS is the Mean Opinion Score obtained from experiments with human subjects. The metrics correlate with human perception of video quality, and thus with the results of subjective tests, yielding a metric that represents the perceived quality of video content.

What Did We Learn?

Almost impossible to predict Codec performance in advance: There was little consistency in codec performance from clip to clip or within a clip. Sometimes performance is divergent, and other times…

Performance is Bad for all

Smooth Sailing Followed by Disaster

In this example, MPG2 is clearly better than WM, but is the large variation in quality distracting and actually worse than a more even performance?
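One way to put numbers on the "better on average, but uneven" question is to summarize a per-frame score series by its mean, its spread, and its worst frame. The sketch below is an assumption about how one might do that, not part of the study's method. The two score lists are made-up values for illustration only, not results from the study, and higher scores are assumed to mean better quality.

```python
from statistics import mean, pstdev


def summarize_scores(scores):
    """Mean, spread, and worst value of a per-frame quality series."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "mean": mean(scores),
        "stdev": pstdev(scores),  # population standard deviation
        "worst": min(scores),
    }


# Hypothetical per-frame scores (illustrative only).
steady = [3.0, 3.1, 2.9, 3.0, 3.0]
erratic = [4.5, 4.6, 1.0, 4.4, 4.5]

for name, series in (("steady", steady), ("erratic", erratic)):
    print(name, summarize_scores(series))
```

In this made-up case the erratic series has the higher mean but also the larger spread and the lower worst frame, which is the tradeoff the slide asks about.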
There was no clear leader over a wide variety of material: Each codec has its own problems. Bitrate is not a good overall predictor of quality. Artifacts can be generated by any codec at any bitrate; some are perceptually significant, others are not. Putting business issues and marketing dynamics aside, there was no clear performance leader, even among systems that are very similar, like WM and MP4.

Our Opinion: From an archival point of view, lossy compression is UNACCEPTABLE, at any flavor and any rate, because there is no way to reliably predict performance over a wide spectrum of material unless you know the material in advance and can literally do scene-by-scene compression. Even for short sequences, there was no combination that showed outstanding performance. We believe that lossLESS compression is a viable and acceptable option for video preservation.

There is a STANDARD! JPEG 2000 includes mathematically lossless compression in the standard. Hardware to encode and decode mathematically lossless video in REAL TIME will soon be available (3rd Q 2004). Cost-effective hardware: < $10,000. Hard disk storage continues to decline in cost and increase in density. Current cost is less than $1 per gigabyte: $.83 US as of 3/17/04, $.79 US as of 5/28/04.

Mathematically LossLESS offers many advantages: No artifacts are added by the compression process. Frames are available as discrete units (not true with MPEG). The additional cost of lossless compression compared to lossy becomes small relative to overall costs. The continuing decrease in hard drive cost drives the cost advantage further as time goes on.

Mathematically Lossless Compression: With mathematically lossless 3:1 compression, 72 gigabytes becomes 24 gigabytes with NO loss in quality. We forecast that in 2010 the RAW storage cost for 1 hour of content on hard drive will be less than $1.50 US (in 2004 dollars).
The savings of lossy compression become meaningless compared to overall costs and the tradeoff for future use.
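The storage arithmetic on the closing slides can be checked with a short sketch. It uses only the figures quoted above (72 gigabytes, 3:1 lossless compression, $.83 and $.79 per gigabyte). The slides do not say what duration the 72 gigabytes represents, so the sketch treats it simply as one unit of content, and the function names are illustrative.

```python
def compressed_size_gb(uncompressed_gb, ratio):
    """Size after compression at the given ratio (3.0 means 3:1)."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return uncompressed_gb / ratio


def storage_cost_usd(size_gb, price_per_gb):
    """Raw disk cost for the given size at the given price per gigabyte."""
    return size_gb * price_per_gb


UNCOMPRESSED_GB = 72   # figure quoted on the slide
LOSSLESS_RATIO = 3.0   # 3:1 mathematically lossless

lossless_gb = compressed_size_gb(UNCOMPRESSED_GB, LOSSLESS_RATIO)

# Disk prices quoted on the slide, in US dollars per gigabyte.
for date, price in (("3/17/04", 0.83), ("5/28/04", 0.79)):
    raw = storage_cost_usd(UNCOMPRESSED_GB, price)
    lossless = storage_cost_usd(lossless_gb, price)
    print(f"{date}: uncompressed ${raw:.2f}, lossless ${lossless:.2f}")
```

At the 5/28/04 price the difference between uncompressed and lossless storage for this content is a few tens of dollars, and it shrinks in absolute terms as the price per gigabyte falls.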