Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.
Remove XML_PARSE_XINCLUDE. This option is only honored by the XML Reader
interface, which is now fuzzed in reader.c.
Don't validate in XInclude fuzzer. This doesn't increase coverage after
moving the Reader fuzzer.
Implement a custom mutator that takes a list of fixed-size chunks which
are mutated with a given probability. This makes sure that values like
parser options or failure position are mutated regularly even as the
fuzz data grows large. Values can also be adjusted temporarily to make
the fuzzer focus on failure injection, for example.
Thanks to David Kilzer for the idea.
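
The chunk idea can be sketched roughly as follows. This is a minimal
illustration, not the fuzzer's actual code: `ChunkDesc`, `nextRandom` and
`mutateChunks` are hypothetical names, the probability is assumed to be
given in percent, and a chunk is "mutated" by simply overwriting it with
pseudo-random bytes.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one fixed-size chunk at the start of the
 * fuzz data, e.g. parser options or a failure position. */
typedef struct {
    size_t size;    /* chunk size in bytes */
    unsigned prob;  /* mutation probability in percent, 0..100 */
} ChunkDesc;

/* Small xorshift PRNG so the sketch has no external dependencies. */
uint32_t
nextRandom(uint32_t *state) {
    uint32_t x = *state;

    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/*
 * Walk the chunk list and, with the configured probability, overwrite
 * each chunk with random bytes. Stops at the first chunk that doesn't
 * fit into the data. Returns the number of chunks mutated.
 */
size_t
mutateChunks(const ChunkDesc *chunks, size_t numChunks,
             uint8_t *data, size_t size, uint32_t seed) {
    uint32_t state = seed ? seed : 1; /* xorshift state must be nonzero */
    size_t offset = 0, mutated = 0;
    size_t i, j;

    for (i = 0; i < numChunks; i++) {
        if (chunks[i].size > size - offset)
            break;
        if (nextRandom(&state) % 100 < chunks[i].prob) {
            for (j = 0; j < chunks[i].size; j++)
                data[offset + j] = (uint8_t) nextRandom(&state);
            mutated++;
        }
        offset += chunks[i].size;
    }

    return mutated;
}
```

Because the probability applies per chunk and not per byte of input, a
small chunk keeps the same chance of being mutated no matter how large
the rest of the fuzz data grows. Raising one chunk's probability
temporarily would shift the focus to that value.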
xmlSchemaItemListAdd can reallocate the items array. Update local
variables after adding an item in
- xmlSchemaIDCFillNodeTables
- xmlSchemaBubbleIDCNodeTables
Fixes #828.
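
The underlying pattern, sketched with a hypothetical `ItemList` type
instead of the real schema structures: a local copy of the array pointer
becomes stale as soon as the add function reallocates, so it has to be
reloaded after every add.

```c
#include <stdlib.h>

/* Hypothetical growable list whose backing array may move on add. */
typedef struct {
    void **items;
    int nbItems;
    int sizeItems;
} ItemList;

/* Append an item, growing the array if needed. Returns 0 on success,
 * -1 on allocation failure. May change list->items. */
int
itemListAdd(ItemList *list, void *item) {
    if (list->nbItems >= list->sizeItems) {
        int newSize = list->sizeItems ? list->sizeItems * 2 : 4;
        void **tmp = realloc(list->items,
                             (size_t) newSize * sizeof(void *));

        if (tmp == NULL)
            return -1;
        list->items = tmp;
        list->sizeItems = newSize;
    }
    list->items[list->nbItems++] = item;
    return 0;
}

/* Append a copy of every item that is currently in the list. */
int
duplicateAll(ItemList *list) {
    void **items = list->items; /* cached local copy */
    int count = list->nbItems;
    int i;

    for (i = 0; i < count; i++) {
        if (itemListAdd(list, items[i]) < 0)
            return -1;
        /* The add may have reallocated the array: refresh the local. */
        items = list->items;
    }

    return 0;
}
```

Without the refresh inside the loop, `items[i]` would read from freed
memory once the array has been moved.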
If the input file size is a multiple of the page size, the byte after the
file's content is on a new page and accessing it will lead to SIGBUS.
Remove XML_INPUT_BUF_ZERO_TERMINATED hint for mmapped files.
Regressed with a221cd78.
Fixes #864.
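
To illustrate the condition only (the fix itself drops the hint for all
mmapped files): a mapping has a readable, zero-filled byte after the
content only when the last page is partially used. The helper name below
is hypothetical.

```c
#include <stddef.h>

/*
 * Returns 1 if a file of fileSize bytes leaves spare bytes in the last
 * page of its mapping, 0 if the content ends exactly on a page boundary
 * (or if pageSize is invalid). In the latter case, the byte after the
 * content lies outside the mapped pages and must not be read.
 */
int
mappingHasTrailingByte(size_t fileSize, size_t pageSize) {
    if (pageSize == 0)
        return 0;
    return (fileSize % pageSize) != 0;
}
```

On POSIX systems the page size would typically come from
`sysconf(_SC_PAGESIZE)`. With a page size of 4096, a file of 4095 bytes
has a trailing byte in its mapping while a file of 8192 bytes does not.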
libxml2's HTML parser adds <p> start tags in some situations. This
behavior, which doesn't follow any standard, was added in 2000, see
here: http://veillard.com/XML/messages/0655.html
Text nodes that only contain whitespace don't imply a <p> tag, but the
whitespace check cannot work reliably if we're parsing partial text data,
which can happen with both the pull and the push parser.
The logic in `areBlanks` is hard to follow. The checks involving `CUR`
depend on the position of the input pointer and seem dubious. It's also
possible that the behavior changed inadvertently with a later commit.
As a result, it's hard to come up with good test cases.
We now process leading whitespace before creating implied tags. This is
more in line with HTML5 and should avoid at least some issues with
partial text data.
For example, parsing the string "<head> x" used to result in:
<html>
<head></head>
<body><p> x</p></body>
</html>
And now results in:
<html>
<head> </head>
<body><p>x</p></body>
</html>
Except for the implied <p> tag, this matches HTML5.
The initial clang patch to support __counted_by__ landed and was
reverted several times. There are some clang toolchains (e.g. the
Android toolchain) that report themselves as version 18 but do not
support __counted_by__. While it is debatable whether Android should be
shipping a pre-release clang, using __has_attribute should be a bit
simpler overall.
Note that this doesn't migrate everything else to use __has_attribute:
while clang has always supported __has_attribute, gcc didn't support
it until a bit later.
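
A sketch of what such a feature test can look like. `MY_ATTR_COUNTED_BY`,
`IntArray` and `intArrayNew` are hypothetical names used for
illustration only, and the fallback definition of `__has_attribute` is
one common way to handle compilers that lack it.

```c
#include <stddef.h>
#include <stdlib.h>

/* Compilers without __has_attribute treat every query as unsupported. */
#ifndef __has_attribute
  #define __has_attribute(x) 0
#endif

/* Ask the compiler directly instead of guessing from version numbers. */
#if __has_attribute(__counted_by__)
  #define MY_ATTR_COUNTED_BY(field) __attribute__((__counted_by__(field)))
#else
  #define MY_ATTR_COUNTED_BY(field)
#endif

typedef struct {
    size_t len;
    int values[] MY_ATTR_COUNTED_BY(len);
} IntArray;

/* Allocate a zeroed array of len ints. Returns NULL on failure.
 * Overflow of the size computation is ignored in this sketch. */
IntArray *
intArrayNew(size_t len) {
    IntArray *arr = malloc(sizeof(*arr) + len * sizeof(arr->values[0]));
    size_t i;

    if (arr == NULL)
        return NULL;
    arr->len = len; /* set the count before touching the array */
    for (i = 0; i < len; i++)
        arr->values[i] = 0;
    return arr;
}
```

A toolchain that reports version 18 but lacks the attribute simply takes
the empty definition, so no version check is needed.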
While looking over the code in the fallback method for `vstateVPush` in
valid.c when `LIBXML_REGEXP_ENABLED` is not defined, I noticed that
there is an ungated `return(-1)` after attempting to allocate memory.
I believe this return should be inside a check for malloc failure.
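
The intended shape, sketched with a hypothetical `StateStack` type and
not the actual valid.c code: the early return only fires when the
allocation really failed.

```c
#include <stdlib.h>

/* Hypothetical stack of validation states, stored as plain ints. */
typedef struct {
    int *states;
    int nb;
    int max;
} StateStack;

/* Push a value. Returns the index of the new entry, or -1 if memory
 * allocation failed. */
int
statePush(StateStack *stack, int value) {
    if (stack->max == 0) {
        stack->states = malloc(8 * sizeof(int));
        /* Gate the error return on the allocation result. An
         * unconditional return(-1) here would fail every first push. */
        if (stack->states == NULL)
            return -1;
        stack->max = 8;
    } else if (stack->nb >= stack->max) {
        int *tmp = realloc(stack->states,
                           (size_t) stack->max * 2 * sizeof(int));

        if (tmp == NULL)
            return -1;
        stack->states = tmp;
        stack->max *= 2;
    }

    stack->states[stack->nb] = value;
    return stack->nb++;
}
```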