Validate chunk extension in dechunk filter #22036
Open
Sjord wants to merge 2 commits into
added 2 commits
May 13, 2026 11:55
Filters work on bucket brigades consisting of multiple buckets of data. For the dechunk filter this means that php_dechunk is called on every bucket. This test splits the data into two buckets. It checks that parsing continues correctly in the next bucket, and that the end of a bucket is handled at any point in the data.
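To illustrate why a split across buckets matters, here is a minimal sketch of a resumable dechunker. It is not the php_dechunk implementation: the state names, the `dechunk_t` struct and both functions are hypothetical, chunk extensions and trailers are ignored, and size overflow is not checked. The point is only that all parser state lives in a struct that survives between calls, so the input can end at any byte.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical states for this sketch; not the names used in php-src. */
typedef enum { S_SIZE, S_SIZE_LF, S_DATA, S_DATA_CR, S_DATA_LF, S_DONE, S_ERROR } state_t;

typedef struct {
    state_t state;
    size_t remaining;  /* payload bytes left in the current chunk */
    size_t size;       /* chunk size accumulated so far */
    int have_digit;    /* at least one hex digit seen on this size line */
} dechunk_t;

static int hex_val(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Feed one buffer ("bucket"). Decoded bytes go to out; returns how many
 * were written. The parser state persists in *d between calls. */
size_t dechunk_feed(dechunk_t *d, const char *in, size_t len, char *out)
{
    size_t n = 0;
    assert(d != NULL);
    for (size_t i = 0; i < len; i++) {
        char c = in[i];
        switch (d->state) {
        case S_SIZE: {
            int v = hex_val(c);
            if (v >= 0) { d->size = d->size * 16 + (size_t)v; d->have_digit = 1; }
            else if (c == '\r' && d->have_digit) d->state = S_SIZE_LF;
            else d->state = S_ERROR;
            break;
        }
        case S_SIZE_LF:
            if (c != '\n') { d->state = S_ERROR; break; }
            d->remaining = d->size;
            d->state = (d->size == 0) ? S_DONE : S_DATA;  /* size 0 ends the body */
            break;
        case S_DATA:
            out[n++] = c;
            if (--d->remaining == 0) d->state = S_DATA_CR;
            break;
        case S_DATA_CR:
            d->state = (c == '\r') ? S_DATA_LF : S_ERROR;
            break;
        case S_DATA_LF:
            if (c == '\n') { d->state = S_SIZE; d->size = 0; d->have_digit = 0; }
            else d->state = S_ERROR;
            break;
        case S_DONE:
        case S_ERROR:
            return n;  /* stop consuming input */
        }
    }
    return n;
}

/* Decode `in` as two buffers split at `split`. Returns the decoded
 * length, or (size_t)-1 if the final state is not S_DONE. */
size_t dechunk_split(const char *in, size_t len, size_t split, char *out)
{
    dechunk_t d = { S_SIZE, 0, 0, 0 };
    size_t n = dechunk_feed(&d, in, split, out);
    n += dechunk_feed(&d, in + split, len - split, out + n);
    return d.state == S_DONE ? n : (size_t)-1;
}
```

Calling `dechunk_split` for every split position from 0 to the input length mirrors what the test in this commit aims for: the decoded output should not depend on where the bucket boundary falls.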
Chunked transfer encoding has a chunk-size field that can optionally be
followed by a chunk-ext field. php_dechunk already skipped the chunk-ext,
by skipping anything that was not a hex character. This change makes
php_dechunk a little stricter in what it considers a chunk-ext: if the
extension does not start with optional whitespace followed by a
semicolon, it is treated as an error and decoding stops.
The dechunk filter is used in some filter chain attacks as an oracle
that determines whether a string starts with a hex character. Anything
after the hex character would be interpreted as the chunk-ext. This
change makes that a bit narrower by also requiring a semicolon, thus
reducing the usefulness of the dechunk filter for attackers.
This only changes behavior on invalid chunked encoding; the happy flow
for valid chunked encoding remains unchanged.
The state CHUNK_SIZE_EXT is renamed to CHUNK_MAYBE_EXT: in this state the
size of the chunk has been read and may be followed by a chunk-ext. In
CHUNK_VALID_EXT the semicolon has been detected and the rest of the line
is simply skipped.
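The two states could fit into a byte-wise state machine roughly as sketched below. Only the names CHUNK_MAYBE_EXT and CHUNK_VALID_EXT come from the description above; every other state name, both function names and the exact transitions (for example accepting a bare LF) are assumptions made for illustration, and the sketch covers the chunk-size line only.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

typedef enum {
    CHUNK_SIZE_START,  /* assumed: expecting the first hex digit */
    CHUNK_SIZE,        /* assumed: reading further hex digits */
    CHUNK_MAYBE_EXT,   /* from the PR: size read, a chunk-ext may follow */
    CHUNK_VALID_EXT,   /* from the PR: ';' seen, skip the rest of the line */
    CHUNK_SIZE_LF,     /* assumed: CR seen, expecting LF */
    CHUNK_LINE_DONE,   /* assumed: size line fully consumed */
    CHUNK_ERROR
} chunk_state;

/* Advance the state machine by one input byte. */
chunk_state size_line_step(chunk_state s, char c)
{
    switch (s) {
    case CHUNK_SIZE_START:
        return isxdigit((unsigned char)c) ? CHUNK_SIZE : CHUNK_ERROR;
    case CHUNK_SIZE:
        if (isxdigit((unsigned char)c)) return CHUNK_SIZE;
        return size_line_step(CHUNK_MAYBE_EXT, c);  /* size ended here */
    case CHUNK_MAYBE_EXT:
        if (c == ' ' || c == '\t') return CHUNK_MAYBE_EXT;
        if (c == ';')  return CHUNK_VALID_EXT;
        if (c == '\r') return CHUNK_SIZE_LF;
        if (c == '\n') return CHUNK_LINE_DONE;
        return CHUNK_ERROR;  /* not a chunk-ext: stop decoding */
    case CHUNK_VALID_EXT:
        if (c == '\r') return CHUNK_SIZE_LF;
        if (c == '\n') return CHUNK_LINE_DONE;
        return CHUNK_VALID_EXT;  /* extension content is skipped */
    case CHUNK_SIZE_LF:
        return c == '\n' ? CHUNK_LINE_DONE : CHUNK_ERROR;
    case CHUNK_LINE_DONE:
    case CHUNK_ERROR:
        return s;
    }
    return CHUNK_ERROR;
}

/* Feed a buffer and return the resulting state, so the caller can carry
 * it over to the next buffer the way a filter would across buckets. */
chunk_state size_line_feed(chunk_state s, const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len && s != CHUNK_LINE_DONE && s != CHUNK_ERROR; i++)
        s = size_line_step(s, buf[i]);
    return s;
}
```

Feeding `"a "` and then `";x\r\n"` leaves the machine in CHUNK_MAYBE_EXT after the first buffer and in CHUNK_LINE_DONE after the second, while `"1fxyz\r\n"` ends in CHUNK_ERROR at the `x`.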
#21983
https://httpwg.org/specs/rfc9112.html#chunked.encoding