Some muxers, notably MakeMKV and, in certain configurations, mkvmerge, write a small primary SeekHead at the start of the segment that contains a single entry referencing a secondary SeekHead near the end of the file. The secondary SeekHead carries the actual entries for Info, Tracks, Tags, Chapters, and Attachments.
A new Matroska::File::save(WriteStyle style) overload is provided to
control how tags, attachments and chapters are written to the file.
- Compact: Write tags, attachments and chapters as compactly as possible.
This is the default mode.
- DoNotShrink: Do not shrink elements; add void padding when content
gets smaller. Allow inserts when content gets larger.
- AvoidInsert: Like DoNotShrink but also avoid inserts for non-last
elements: replace a growing non-last element with a void of the old
size and append the new element at the end of the segment.
For very large files and/or slow (network) filesystems, using this
mode will reduce write time significantly.
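
The three modes can be read as a decision over an element's old and
new size. The following is only a sketch of that decision, not the
library's implementation: the Action enum and planWrite() are
hypothetical names introduced for illustration, and the local
WriteStyle enum merely mirrors the mode names listed above.

```cpp
#include <cstddef>

// Hypothetical mirror of the three write styles described above.
enum class WriteStyle { Compact, DoNotShrink, AvoidInsert };

// Hypothetical outcome of rewriting a single element.
enum class Action {
  Overwrite,     // same size, rewrite in place
  RemoveBytes,   // shrink in place, moving the following data up
  PadWithVoid,   // keep the old footprint, fill the rest with a void element
  InsertBytes,   // grow in place, shifting the following data down
  VoidAndAppend  // void the old element, append the new one at the segment end
};

// Decide what to do with one element (assumed simplification).
Action planWrite(WriteStyle style, std::size_t oldSize, std::size_t newSize,
                 bool isLastElement)
{
  if(newSize == oldSize)
    return Action::Overwrite;
  if(newSize < oldSize)
    return style == WriteStyle::Compact ? Action::RemoveBytes
                                        : Action::PadWithVoid;
  // The element grows.
  if(style == WriteStyle::AvoidInsert && !isLastElement)
    return Action::VoidAndAppend;
  return Action::InsertBytes;
}
```

Under these assumptions, AvoidInsert never shifts the data that follows
a non-last element, which is where the write-time saving on large files
would come from.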
Co-authored-by: Copilot <copilot@github.com>
The iXML chunk in BWF/WAV files is specified as UTF-8 (per the EBU
Tech 3285 supplement and the iXML spec). The reader was constructing
the String without an encoding hint, which falls back to Latin-1 and
mangles any non-ASCII bytes (e.g. Unicode in <NOTE>, <PROJECT>, or
<TRACK_LIST> entries written by Sound Devices, Zaxcom, etc.).
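
To illustrate the kind of mangling described above, here is a small
standalone sketch (it does not use TagLib). latin1ToUtf8() is a
hypothetical helper that re-encodes bytes as UTF-8 under the wrong
assumption that they are Latin-1, which is roughly what a reader sees
after a Latin-1 fallback.

```cpp
#include <string>

// Re-encode a byte string as UTF-8, treating every input byte as one
// Latin-1 code point. Applying this to bytes that are already UTF-8
// produces the familiar two-characters-per-byte mojibake.
std::string latin1ToUtf8(const std::string &bytes)
{
  std::string out;
  for(unsigned char c : bytes) {
    if(c < 0x80) {
      out += static_cast<char>(c);                 // ASCII is unchanged
    }
    else {
      out += static_cast<char>(0xC0 | (c >> 6));   // lead byte
      out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
    }
  }
  return out;
}
```

For example, the UTF-8 bytes C3 A9 (an e with acute accent) come out as
C3 83 C2 A9, while pure ASCII content such as tag names is unaffected.
That is why the problem only shows up with non-ASCII metadata.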
Adds 6 public methods on FLAC::File mirroring RIFF::WAV::File's existing
iXML/BEXT API: iXMLData/setiXMLData/hasiXMLData and the BEXT equivalents.
Reads APPLICATION blocks (RFC 9639 § 8.4) carrying either the IANA-
registered "riff" foreign-metadata wrapper or the direct "iXML" / "bext"
application IDs used by some third-party tools (e.g. Sequoia). Writes
the spec-blessed "riff"-wrapped form. Unrecognized application IDs and
"riff"-wrapped chunks other than iXML/bext (e.g. "fmt ", "JUNK") flow
through unmodified, so existing files round-trip without churn.
Test coverage: read direct + riff-wrapped for both iXML and BEXT,
write+reread round-trip, empty-clears-block, and an unknown-application-
block preservation guard.
Fix: Handle orphan ChapterAtom elements not wrapped in EditionEntry
The Matroska specification requires every ChapterAtom to be inside an
EditionEntry. However, some muxers (older FFmpeg versions, some streaming
tools) produce files with ChapterAtom elements directly under Chapters,
without an EditionEntry wrapper.
MKVToolNix and FFmpeg both handle this case gracefully by treating orphan
atoms as belonging to an implicit default edition. Previously, TagLib
silently ignored these chapters, returning an empty ChapterEditionList.
This change:
- Collects orphan ChapterAtom elements encountered directly under Chapters
- Wraps them in an implicit default edition (UID = 0, isDefault = true,
isOrdered = false) so they are exposed through the existing
chapterEditionList() API
- Extracts the atom-parsing logic into a private parseChapterAtom() helper
to avoid code duplication between the two call sites
No existing behavior is changed - files that already conform to the spec
(chapters inside an EditionEntry) parse identically.
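
The collection step can be sketched as follows. This is a simplified
model, not the parser itself: the structs and collectEditions() are
hypothetical, and appending the implicit edition after the regular
editions is an assumption made for the sketch.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct ChapterAtom {
  std::string title;
  std::uint64_t startNs;
};

struct ChapterEdition {
  std::uint64_t uid;
  bool isDefault;
  bool isOrdered;
  std::vector<ChapterAtom> atoms;
};

// One direct child of the Chapters element: either a complete
// EditionEntry or a ChapterAtom that lacks its EditionEntry wrapper.
struct ChaptersChild {
  bool isEdition;
  ChapterEdition edition;  // meaningful when isEdition is true
  ChapterAtom orphan;      // meaningful when isEdition is false
};

std::vector<ChapterEdition> collectEditions(const std::vector<ChaptersChild> &children)
{
  std::vector<ChapterEdition> editions;
  std::vector<ChapterAtom> orphans;
  for(const auto &child : children) {
    if(child.isEdition)
      editions.push_back(child.edition);
    else
      orphans.push_back(child.orphan);
  }
  // Wrap orphans in an implicit default edition:
  // UID = 0, isDefault = true, isOrdered = false.
  if(!orphans.empty())
    editions.push_back(ChapterEdition{0, true, false, orphans});
  return editions;
}
```

Spec-conforming input contains no orphans, so the result is the same
list of editions as before.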
An equality operator is added for the chapters. The chapters are
only written to the file if they were really modified, so just
reading the chapters without modifying them will not affect
the save operation.
Six new tests exercise corners of the chapter implementation that the
orphaned-mdat fix did not reach:
testQTChapterListUnicodeTitles / testChapterListUnicodeTitles --
Round-trip Japanese, German (umlaut), and Russian titles through the
QT text-sample serialisation and the Nero length-prefixed UTF-8 path
respectively. These are separate paths in the code and benefit from
separate coverage.
testQTChapterListEmptyTitleStripped --
A multi-chapter list whose first entry is empty at t=0 matches the QT
dummy-marker pattern; read() must drop it. Test documents the rule so
a regression is immediately detectable.
testQTChapterListSingleEmptyTitleNotStripped --
The stripping rule only applies when size > 1. A single empty-title
chapter at t=0 is valid and must be preserved.
testNeroAndQTChaptersAreIndependent --
Both formats can coexist; removing one leaves the other intact.
Validates the lazy saveChaptersIfModified contract in mp4file.cpp.
testNeroChaptersAloneWhenNoQT --
Writing one format must not create atoms for the other.
All 47 MP4 tests pass.
The previous fix for orphaned chapter mdats assumed the chapter text
mdat was dedicated and derived its location from stco[0] - 8. In
audiobooks that co-locate chapter text at the start of the primary
audio mdat (stco[0] == audioMdat.offset + 8), that arithmetic lands
on the audio mdat header, the "mdat" signature check passes, and the
full audio payload gets removed -- shrinking a 484 MB audiobook to
5.4 MB.
Fix: resolve the chapter mdat by finding the top-level mdat whose
data range contains stco[0], then re-parse after the trak/tref
removals and confirm no other track's stco/co64 points into that
mdat before deleting it. Shared mdats are left intact; the dead
chapter text bytes remain as harmless padding.
Add a regression test that writes a chapter track, patches its
stco[0] to point into the primary audio mdat (simulating the
audiobook layout), removes the chapter track, and verifies the
audio mdat is byte-identical afterwards.
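
The lookup and the sharing check can be sketched as below. This is a
standalone model under stated assumptions: TopLevelAtom,
findMdatContaining() and isSharedMdat() are hypothetical names, and
every atom is assumed to use a plain 8-byte header (no 64-bit extended
size).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

struct TopLevelAtom {
  std::int64_t offset;  // file offset of the atom header
  std::int64_t length;  // total length including the header
  bool isMdat;
};

// True when fileOffset lies inside the atom's payload
// (assumes an 8-byte header).
bool payloadContains(const TopLevelAtom &atom, std::int64_t fileOffset)
{
  return fileOffset >= atom.offset + 8 &&
         fileOffset < atom.offset + atom.length;
}

// Index of the top-level mdat whose payload contains chunkOffset, or -1.
int findMdatContaining(const std::vector<TopLevelAtom> &atoms,
                       std::int64_t chunkOffset)
{
  for(std::size_t i = 0; i < atoms.size(); ++i) {
    if(atoms[i].isMdat && payloadContains(atoms[i], chunkOffset))
      return static_cast<int>(i);
  }
  return -1;
}

// True when a chunk offset of any remaining track points into the mdat,
// in which case the mdat must not be deleted.
bool isSharedMdat(const TopLevelAtom &mdat,
                  const std::vector<std::int64_t> &remainingChunkOffsets)
{
  for(std::int64_t off : remainingChunkOffsets) {
    if(payloadContains(mdat, off))
      return true;
  }
  return false;
}
```

In the audiobook layout, stco[0] of the chapter track resolves to the
audio mdat, isSharedMdat() reports true for it, and the atom is left
alone.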
Adds testQTChapterListNoOrphanedMdat which performs three add/remove
cycles and asserts that the top-level mdat count is identical before and
after. Without the fix, each cycle leaves an orphaned mdat at EOF, so
three cycles produce originalCount + 3 atoms.
Uses TagLib's own MP4::Atoms parser as the primary check, with
AtomicParsley as an optional cross-validation when installed.
write() appends a new mdat at EOF to hold chapter text samples but the
removal code (both remove() and the replace-existing path in write())
only deleted the chapter trak and tref atoms from inside moov. Each
add/remove cycle left the previous chapter mdat behind, causing orphaned
mdat atoms to accumulate.
Fix: extract a removeQTChapterTrack() helper that performs all three
removals atomically. Before deleting the chapter trak, the helper reads
the first stco chunk offset (which points 8 bytes past the chapter mdat
header) to locate the mdat. After removing the trak and tref (both
inside moov, which precedes the mdat at EOF), it adjusts the mdat offset
by -(chapterLen + trefLen) and removes the atom, leaving no orphaned data.
The updateChunkOffsets() function in mp4qtchapterlist.cpp and
mp4chapterlist.cpp duplicates code from mp4tag.cpp, so it needs
the same patch that was applied to mp4tag.cpp.
Changes made
mp4chapterlist.h
• Added (MP4::File*) overloads for read, write, remove
• Replaced broken class File; forward declaration with #include "mp4file.h" (fixed a subtle C++ name-resolution linker bug where Atoms(File*) resolved to MP4::File* instead of TagLib::File*)
mp4chapterlist.cpp
• Refactored: path-based overloads are now thin wrappers that delegate to file-based overloads
• File-based overloads construct Atoms locally — no Atoms* in the public API
• Removed chplHeaderSize = 9 constant; replaced the minimum-size guard in parseChplData with a correct 5-byte check (the old constant was version-1 specific and would reject valid version-0 atoms)
mp4qtchapterlist.h
• Added (MP4::File*) overloads for read, write, remove
• Removed Atoms* parameters entirely from the public API
mp4qtchapterlist.cpp
• Same refactor: path-based overloads delegate; file-based overloads construct Atoms locally
• Added empty-chapter guard: write(MP4::File*, {}) delegates to remove(file) instead of writing a 0-sample chapter track
tests/test_mp4.cpp
• Added testChapterListFileAPI and testQTChapterListFileAPI — exercise the full write/read/remove cycle via the file-based API
• Updated test bodies to use the simplified (MP4::File*) API (no MP4::Atoms construction in test code)
QuickTime-style chapter tracks are the native chapter format for
Apple's ecosystem. They use a disabled text track (hdlr type "text")
referenced by a chap track-reference in the audio track's tref box.
This format is recognized by QuickTime, iTunes/Music, Final Cut Pro,
Logic Pro, DaVinci Resolve, VLC, and most other MP4/M4A players. It
is also the format that AVFoundation reads natively via
AVAssetChapterMetadataGroup.
The implementation produces output that matches ffmpeg's chapter track
structure byte-for-byte: per-sample stts entries (required by
AVFoundation), encd atoms for UTF-8 text encoding, edts/elst edit
lists, gmhd with gmin+text media information, and disabled tkhd flags
(track_in_movie only).
Key behaviors:
- write() inserts tref + chapter trak as a single contiguous block,
then appends text samples in an mdat atom at EOF
- Handles non-zero first chapter times by prepending a dummy chapter
at time 0 (stripped on read)
- Overwrite support: removes existing chapter track before writing
- Preserves existing metadata tags and audio data integrity
- Uses timescale=1000 (milliseconds) for chapter track timing
7 new tests covering write/read round-trip, remove, overwrite, tag
preservation, empty file read, timestamp precision, and non-zero
first chapter handling.
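
The dummy-chapter handling can be sketched as a pair of helpers. The
names QtChapter, withLeadingDummy() and stripLeadingDummy() are
hypothetical; the strip rule (more than one entry, first entry empty at
time 0) follows the behavior described in the tests.

```cpp
#include <string>
#include <vector>

struct QtChapter {
  long long startMs;  // chapter track timescale is 1000
  std::string title;
};

// On write: if the first chapter does not start at 0, prepend an
// empty dummy chapter at time 0.
std::vector<QtChapter> withLeadingDummy(std::vector<QtChapter> chapters)
{
  if(!chapters.empty() && chapters.front().startMs != 0)
    chapters.insert(chapters.begin(), QtChapter{0, ""});
  return chapters;
}

// On read: drop an empty first entry at time 0, but only when other
// chapters follow. A single empty chapter at 0 is kept.
std::vector<QtChapter> stripLeadingDummy(std::vector<QtChapter> chapters)
{
  if(chapters.size() > 1 && chapters.front().startMs == 0 &&
     chapters.front().title.empty())
    chapters.erase(chapters.begin());
  return chapters;
}
```

With these two rules a list whose first chapter starts at, say, 5000 ms
round-trips unchanged.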
Implement read/write/remove of Nero-style chapter markers (chpl atom)
in MP4 files. The chpl atom lives at moov/udta/chpl, storing up to 255
chapter entries with 100-nanosecond timestamps and UTF-8 titles.
Includes CppUnit tests covering round-trip read/write, remove, tag
preservation, and reading from files with no chapters.
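
As an illustration of the chpl payload, here is a minimal parsing
sketch. It is not the library's parser. The assumed layout is: 1 byte
version, 3 bytes flags, 4 reserved bytes only when the version is
non-zero, 1 byte chapter count, then per chapter an 8-byte big-endian
start time in 100-nanosecond units, a 1-byte title length and the UTF-8
title. NeroChapter and parseChpl() are hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

struct NeroChapter {
  std::int64_t startTime100ns;
  std::string title;  // UTF-8 bytes, at most 255
};

// Parse the payload of a chpl atom (the bytes after the atom header).
// Returns false on truncated input; out may then hold a partial list.
bool parseChpl(const std::vector<std::uint8_t> &data,
               std::vector<NeroChapter> &out)
{
  out.clear();
  if(data.size() < 5)           // version + flags + count
    return false;
  std::size_t pos = 4;          // skip version and flags
  if(data[0] != 0)
    pos += 4;                   // reserved bytes (assumed, non-zero version)
  if(pos >= data.size())
    return false;
  const unsigned count = data[pos++];
  for(unsigned i = 0; i < count; ++i) {
    if(data.size() - pos < 9)   // 8-byte time + 1-byte length
      return false;
    std::uint64_t t = 0;
    for(int b = 0; b < 8; ++b)
      t = (t << 8) | data[pos++];
    const std::size_t len = data[pos++];
    if(data.size() - pos < len)
      return false;
    std::string title;
    for(std::size_t k = 0; k < len; ++k)
      title += static_cast<char>(data[pos + k]);
    out.push_back(NeroChapter{static_cast<std::int64_t>(t), title});
    pos += len;
  }
  return true;
}
```

A start time of 10000000 corresponds to one second, so dividing by
10000 yields milliseconds.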
Make MP4 AtomDataType descriptions visible in the generated documentation.
Convert the ID3v2 text frame listing into a table.
Convert the shorten `fileType()` documentation into a table.
Fix some typos.
Add link to specification in `EventType` for consistency with other headers.
* Shorten: Reject out-of-range k in getRiceGolombCode
k values outside [0, 31] cause undefined behavior: a left shift by 32
on int32_t (UB in C++) when bitsAvailable reaches 32 after a buffer
refill. Guard against this at the top of getRiceGolombCode and return
false (invalid file) for any k outside the valid range.
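
The underlying rule is that a shift count for a 32-bit type must be
validated before the shift is performed. A minimal sketch of such a
guard follows; lowBitsMask() is a hypothetical helper and not the
decoder's getRiceGolombCode().

```cpp
#include <cstdint>

// Compute a mask with the k lowest bits set. Returns false instead of
// shifting when k is outside [0, 31], because shifting a 32-bit value
// by 32 or more (or by a negative count) is undefined behavior.
bool lowBitsMask(int k, std::uint32_t &mask)
{
  if(k < 0 || k > 31)
    return false;
  mask = (std::uint32_t{1} << k) - 1;
  return true;
}
```

Callers treat a false return as an invalid file, in the same spirit as
the guard described above.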
This will fix a DoS with a crafted MP4 file causing too many offsets
to be written when updating the stco or co64 tables in MP4 files.
Credits for the discovery of this bug go to Yuen Ying Ng (Ruth)
(Cyber Security Researcher at PwC Hong Kong).
Concurrent calls to propertyKeyForName() and handlerTypeForName() (e.g.
via batchMap during import) could race on the isEmpty() guard used for
first-call lazy initialization.
Replace isEmpty() guards with std::call_once / std::once_flag so that
each map is initialized exactly once in a thread-safe manner. Using
call_once (rather than eager construction in the base class constructor)
preserves virtual dispatch, allowing ItemFactory subclasses to override
nameHandlerMap() and namePropertyMap() correctly.
Both property maps are initialized together in a single once_flag since
nameForPropertyKey is derived from namePropertyMap.
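
A reduced sketch of the pattern, with a single map for brevity.
SketchFactory and CustomFactory are hypothetical stand-ins and do not
reproduce the real ItemFactory interface; only std::call_once and
std::once_flag are actual standard library names.

```cpp
#include <map>
#include <mutex>
#include <string>

class SketchFactory {
public:
  virtual ~SketchFactory() = default;

  std::string handlerTypeForName(const std::string &name) const
  {
    // Runs the initializer exactly once, even with concurrent callers.
    // The virtual call happens here, after construction has finished,
    // so a subclass override is honored.
    std::call_once(handlerOnce_, [this] { handlerMap_ = nameHandlerMap(); });
    const auto it = handlerMap_.find(name);
    return it != handlerMap_.end() ? it->second : std::string();
  }

protected:
  virtual std::map<std::string, std::string> nameHandlerMap() const
  {
    std::map<std::string, std::string> m;
    m["covr"] = "CoverArt";
    return m;
  }

private:
  mutable std::once_flag handlerOnce_;
  mutable std::map<std::string, std::string> handlerMap_;
};

class CustomFactory : public SketchFactory {
protected:
  std::map<std::string, std::string> nameHandlerMap() const override
  {
    std::map<std::string, std::string> m;
    m["covr"] = "CoverArt";
    m["xtra"] = "Custom";
    return m;
  }
};
```

Had the map been filled in the base class constructor, the call to
nameHandlerMap() would have resolved to the base version and the
"xtra" entry would be missing.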
mpegheader.cpp: ADTS bitrate divided by 1024 (binary kilo) instead of
1000 (decimal kilo), causing ~2.3% underreporting for all AAC streams.
mp4properties.cpp: ESDS averageBitrate double-rounded via both +500 and
+0.5 before int cast, causing standard bitrates (128000, 192000, etc.)
to read 1 kbps too high.
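
Both effects are easy to reproduce with plain arithmetic. The functions
below are hypothetical; in particular toKbpsDoubleRounded() is an
assumed reconstruction of the faulty expression based on the
description above, not a copy of the original code.

```cpp
// Round bits per second to the nearest kb/s using decimal kilo (1000).
int toKbps(long long bitsPerSecond)
{
  return static_cast<int>((bitsPerSecond + 500) / 1000);
}

// Assumed shape of the double rounding: +500 before the division and
// +0.5 after it, so an exact multiple of 1000 ends up one too high.
int toKbpsDoubleRounded(long long bitsPerSecond)
{
  return static_cast<int>((bitsPerSecond + 500) / 1000.0 + 0.5);
}

// Binary kilo (1024) underreports compared to decimal kilo.
int toKbpsBinaryKilo(long long bitsPerSecond)
{
  return static_cast<int>(bitsPerSecond / 1024);
}
```

For 128000 bits per second, single rounding gives 128, the
double-rounded variant gives 129, and dividing by 1024 gives 125.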
Some encoders write a valid data chunk but with a slightly too-large
declared chunkSize, or place the data chunk beyond the declared RIFF
boundary. The previous behaviour called break, abandoning all remaining
chunks and making the file appear empty to TagLib.
Lenient parsers (ffmpeg, QuickTime) handle this case by clamping the
chunk size to the bytes that actually remain in the file. Adopt the
same strategy: when chunkSize would exceed the file length, clamp it
and continue parsing rather than stopping early.
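
The clamping itself is a small computation. A sketch, with
clampChunkSize() as a hypothetical helper name:

```cpp
#include <cstdint>

// Limit a declared chunk size to the bytes that actually remain in the
// file after the chunk's data offset. Returns 0 when the data offset
// is already at or past the end of the file.
std::uint32_t clampChunkSize(std::int64_t chunkDataOffset,
                             std::uint32_t declaredSize,
                             std::int64_t fileLength)
{
  if(chunkDataOffset >= fileLength)
    return 0;
  const std::int64_t remaining = fileLength - chunkDataOffset;
  if(static_cast<std::int64_t>(declaredSize) > remaining)
    return static_cast<std::uint32_t>(remaining);
  return declaredSize;
}
```

A data chunk at offset 44 that declares 1000 bytes in a 544-byte file
is treated as 500 bytes long, and parsing continues from there.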
Read, write, and remove Broadcast Audio Extension (BEXT, EBU Tech 3285)
and iXML metadata chunks in WAV files. BEXT is widely used in broadcast
and professional audio for originator, description, time reference, and
loudness metadata. iXML is used by field recorders and DAWs for scene,
take, and track metadata.
MPEG::File::isSupported() scans for frame sync bytes that can appear
in other files, causing them to be misidentified as MP3.
This also includes a test with such a file.
When using, for example,
    examples/tagwriter -C GENRE \
      "name=GENRE,targetTypeValue=50,value=Soft Rock;name=GENRE,targetTypeValue=50,value=Classic Rock" \
      path/to/file.mka
the GENRE key was included twice and tagreader displayed the two genre
tags twice.
A crafted file can have blockSamples set to 0 and a blockSize so large
that adding 8 overflows and yields an offset of 0, so the parser
returns to the same position and loops forever.
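
One way to guard against this kind of wrap-around is to check the
addition before performing it. The sketch below is not the actual fix;
advancePastBlock() is a hypothetical helper that assumes 32-bit
arithmetic for the step and an 8-byte prefix before the block payload.

```cpp
#include <cstdint>
#include <limits>

// Advance offset past a block of blockSize bytes plus an 8-byte prefix.
// Returns false and leaves offset untouched when blockSize + 8 would
// wrap around in 32 bits, which would otherwise yield a step of 0 (or
// a tiny value) and let a scanning loop revisit the same position.
bool advancePastBlock(std::int64_t &offset, std::uint32_t blockSize)
{
  const std::uint32_t prefix = 8;
  if(blockSize > std::numeric_limits<std::uint32_t>::max() - prefix)
    return false;
  const std::uint32_t step = blockSize + prefix;
  offset += step;
  return true;
}
```

A scanning loop can then stop and report the file as invalid as soon
as the helper returns false.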
When building for macOS < 10.14, the API for std::optional and
std::variant is restricted:
    error: 'value' is unavailable: introduced in macOS 10.14
    error: 'get<..>' is unavailable: introduced in macOS 10.14
There was also an issue with Android armeabi-v7a, where long is not
64 bits wide and a static assertion failed.
Make AttachedFile immutable. This is consistent with SimpleTag and
Chapter and avoids using attached files which do not have all required
attributes.
Provide methods to insert and remove a single simple tag, so that
they can be modified without setting all of them while still not
exposing internal lists to the API.
Use DATE_RECORDED instead of DATE_RELEASED for year() and the "DATE"
property. This is more consistent with other tag formats, e.g. for ID3v2
"TDRC" is used, which is the recording time.