Releases: xiph/rav1e
Weekly pre-release
Optimize the buffer size in inverse_transform_add
Fix a clippy warning
fuzz: Increase coverage of encode
fuzz: Explicit constants for fields of EncoderConfig
fuzz: Increase coverage of encode_decode
aarch64 HBD CDEF stride/argument fixes
Rustfmt/Clippy
Add HBD assembly for arm64 CDEF
Enable full aarch64 assembly for 8-bit CDEF
Add first of ARM 64 bit assembly for CDEF
Weekly pre-release
CI: Avoid automatically deploying binaries on forks
CI: Pin js_api Rust version to 1.46
Weekly pre-release
Simplify compute_motion_vectors
Fix width/height limits in configuration errors
Cleanups based on clippy suggestions
Weekly pre-release
Refactor to use iterators
Use Plane::row() and PlaneSlice::row()
Test Plane::iter()
Add PlaneSlice::row()
Add Plane::row()
Do not use Plane::p() to dump the images
CI: Update nasm to 2.15.05
Move render size validation to Config::validate()
Assert that render_size is non-zero in both dimensions
Detect invalid use of enable_timing_info with still_picture
Assert that render_size fits in 32 bits
Weekly pre-release
fuzz: Increase coverage of encode
fuzz: Increase coverage of encode_decode
fuzz: Use new API Config::with_encoder_config()
fuzz: Use Unstructured::int_in_range()
fuzz: Use a single thread for encode_decode
Weekly pre-release
Avoid use of Plane::p in Plane::downsample
Update fuzzing section of contribution guide
CI: Build fuzzers with stable rustc
fuzz: Implement Structure-Aware Fuzzing
v0.4.0-alpha
This is a big new release of rav1e, arriving after 7 months, that makes the encoder noticeably faster and better.
Video | PSNR | PSNR HVS | SSIM | CIEDE 2000 | APSNR | MS SSIM | VMAF |
---|---|---|---|---|---|---|---|
Average | -2.38 | -2.02 | -3.06 | -3.04 | -2.51 | -2.68 | -1.84 |
Since 0.3 there have been around 435 new commits, with around 50,000 additions and 17,000 deletions, from 29 contributors.
Improvements
- Enable Open-Partition on frame boundary, gives ~2% rd gains.
- Use av-metrics in CLI to compute PSNR, PSNR-HVS, SSIM, MS-SSIM, CIEDE2000 (see --metrics)
- Unwaffle Rebase for Loop-filter: Now deblocking is enabled to loopfilter RDO giving 0.5 to 1.5% gains
- Thread CDEF with tiles giving ~1.2% performance using 2x2 Tiles
- New Rate Control API that is less error-prone to use.
- Full Monochrome Support
- Enabling CDEF, Restoration Filter for 4:2:2, decreasing encoding time by ~37% and giving overall improvements between 0.8% and 5%
- Added compound prediction mode variants for drl=2 and drl=3
- Enable NEAR_NEAR1MV, NEAR_NEAR2MV Compound mode
- Support arbitrary SAR anamorphic video
- Enforce a frame limit of 1 in STILL_PICTURE_MODE
- Quiet Mode in CLI with -q or --quiet
- Ensure all mv predictors are converted to fullpel
- Update non-broken Motion Estimation Predictors giving ~0.28% gains
- Substantially rework initial motion estimation: 9% improved performance
- Optimise Predictors for multipass motion estimation giving 0.3-0.4% gains
- Optimize Chroma quantizer offsets for subset3 4:4:4 giving 31% for Luma Metrics and 14% BD-Rate Improvement for CIEDE2000 for 4:4:4 clips
- Opaque data can be pinned to frames and retrieved from the matching packet.
- Merge of dav1d 0.6.0, 0.7.0 and 0.7.1 Assembly for both x86 and AArch64
- Naive x86_64 intrinsics for get_satd HBD
- Added NEON assembly for dist::get_sad on aarch64 giving ~66% improved encoding time
- Integration of 200+ 16BPC AArch64 Functions from dav1d resulting in an overall speedup of ~9.5%
- Added x86 SIMD for weighted SSE computation giving 5-7% speedup on PSNR
- Derive quantizers using linear models giving ~0.7 to 1.7% gains in metrics for 4:2:0
- Pruned Intra Mode list by SATD reducing encoding time by between 5.5% and 12.2% at default speed level
- Optimization of rdo_loop_decision reducing total allocation count by 25% and encoding time by 1%
- Removal of Initial Allocation for lookahead_intra_costs
- Avoid temporary allocation for inter pruning, significantly reducing allocations
- Reduce manual indexing in for_each in TileBlocks giving 1.5% speedup
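Several items in the list above speed up block distortion metrics such as the sum of absolute differences (SAD) behind dist::get_sad. As a rough illustration of what such a kernel computes, here is a minimal scalar sketch. The function name `sad_scalar`, its signature, and the 8-bit row-major layout are assumptions made for this example and are not rav1e's actual API; the NEON and x86 paths mentioned above aim to produce the same kind of result much faster.

```rust
/// Scalar sum of absolute differences over a `width` x `height` block.
///
/// Hypothetical helper for illustration only (not rav1e's `dist::get_sad`).
/// `src` and `reference` are row-major 8-bit buffers; each has its own
/// stride, measured in samples.
fn sad_scalar(
    src: &[u8],
    src_stride: usize,
    reference: &[u8],
    ref_stride: usize,
    width: usize,
    height: usize,
) -> u32 {
    let mut sum = 0u32;
    for y in 0..height {
        // Take the visible part of each row, skipping any stride padding.
        let a = &src[y * src_stride..y * src_stride + width];
        let b = &reference[y * ref_stride..y * ref_stride + width];
        // Accumulate |a - b| per sample, widened to u32 to avoid overflow.
        sum += a
            .iter()
            .zip(b)
            .map(|(&p, &q)| u32::from(p.abs_diff(q)))
            .sum::<u32>();
    }
    sum
}
```

A motion search would call a function like this once per candidate position and keep the candidate with the lowest sum, which is why vectorizing it has a visible effect on total encoding time.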
Bug Fixes
- Fixed the rebuild with fresh assembly output
- Fixed the Chroma Desync for narrow-frames
- Abort pass encoding without a bitrate target in CLI
- Fixed the -v cli option
- Fixed a crash when using 4 tiles for 1080p 4:2:2 input
- Fixed the 4:2:0 assumption in IEF block context selection
- Fixed the symbol redefinition error for AArch64 builds using Clang
- Fixed LRF choosing different LRU sizes in Y and UV when not 4:2:0
- Fixed the broken borrow checker for tile_blocks
- Fixed the quantizer index clamping
- Fixed cross-compiling from macOS to mingw-w64
- Avoids a buffer underflow condition in CDEF pad_into_tmp16()
- Properly validate minimum rdo_lookahead_frames value
Changes
- Bumped minimum version of NASM to 2.14.0
- Updated Speed Preset Settings
- Full SGR Search is enabled for Speed Levels till 4 instead of 8
- Enabled Fine Directional Intra Preset for all speed levels
- Removed Diamond Motion Estimation
- Reduced TX_Set preset is now enabled from Speed 6 instead of Speed 5
- Disabled TX-Type RDO for inter frames.
- Rename of Native CPU Feature level to Rust: Use RAV1E_CPU_TARGET=rust from rav1e 0.4.0-alpha instead of RAV1E_CPU_TARGET=NATIVE
- Removed in-library psnr computation facility
- Moved Frame related data structures to a separate crate (v_frame)
- Extended dump_lookahead_data
  - Now the frame_subtype is exported
  - Use the RAV1E_DATA_PATH env to place the output file.
- Major Refactoring in CDEF, both to allow easier import of dav1d CDEF assembly and to simplify bitdepth and [re-]buffering requirements in LR.
- Removal of leftover libaom code
- Remove unused diamond motion estimation
- Reduced Build Time:
  - do not enable LTO by default,
  - use as many codegen unit
  - allow incremental builds for the release profile
  - in-lined various functions
  - removed large stack allocation, improved HBD SATD for x86 targets
  - split large modules in multiple submodules
Unstable features
- Channel-based API
- A means to use a pre-allocated threadpool and share it across multiple encoders.
Weekly pre-release
tiling: Determine tile_cols_log2 from selected column count
The column width is adjusted for 4:2:2 content, which in certain cases results in an actual column count that could be represented by uniform tiling.
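To make the idea in this commit concrete, the sketch below derives a log2 tile-column value from the column count that results after the column width has been chosen, rather than from the originally requested value. It is a minimal illustration under stated assumptions, not rav1e's code: `tile_log2` follows the helper of the same name in the AV1 specification (smallest k with `blk_size << k >= target`), while `actual_tile_cols`, `tile_cols_log2_from_count`, and the superblock-based units are hypothetical names and choices made for this example.

```rust
/// Smallest `k` such that `(blk_size << k) >= target`, modelled on the
/// `tile_log2` helper in the AV1 specification.
/// Assumes `blk_size > 0` and small targets (tile counts), so the shift
/// cannot overflow in practice.
fn tile_log2(blk_size: usize, target: usize) -> usize {
    let mut k = 0;
    while (blk_size << k) < target {
        k += 1;
    }
    k
}

/// Hypothetical helper: number of tile columns actually produced when a
/// frame that is `sb_cols` superblocks wide is split into columns of
/// `tile_width_sb` superblocks (the last column may be narrower).
fn actual_tile_cols(sb_cols: usize, tile_width_sb: usize) -> usize {
    (sb_cols + tile_width_sb - 1) / tile_width_sb
}

/// Hypothetical helper: derive the log2 column value from the column
/// count that was actually selected, not from the requested one.
fn tile_cols_log2_from_count(sb_cols: usize, tile_width_sb: usize) -> usize {
    tile_log2(1, actual_tile_cols(sb_cols, tile_width_sb))
}
```

For instance, if a frame is 8 superblocks wide and the column width ends up widened from 2 to 4 superblocks, the column count drops from 4 to 2, and the value derived from the count is 1 rather than the 2 that the original request would suggest.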
v0.3.4
Weekly pre-release
p20200908 Enforce a frame limit of 1 in still picture mode