Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.
Remove XML_PARSE_XINCLUDE. This is only honored by the XML Reader
interface which is now fuzzed in reader.c.
Don't validate in XInclude fuzzer. This doesn't increase coverage after
moving the Reader fuzzer.
Implement a custom mutator that takes a list of fixed-size chunks which
are mutated with a given probability. This makes sure that values like
parser options or failure position are mutated regularly even as the
fuzz data grows large. Values can also be adjusted temporarily to make
the fuzzer focus on failure injection, for example.
Thanks to David Kilzer for the idea.
xmlSchemaItemListAdd can reallocate the items array. Update local
variables after adding item in
- xmlSchemaIDCFillNodeTables
- xmlSchemaBubbleIDCNodeTables
Fixes#828.
If the input file size is a multiple of page size, the byte after the
file's content is on a new page and accessing it will lead to SIGBUS.
Remove XML_INPUT_BUF_ZERO_TERMINATED hint for mmapped files.
Regressed with a221cd78.
Fixes#864.
libxml2's HTML parser adds <p> start tags in some situations. This
behavior, which doesn't follow any standard, was added in 2000, see
here: http://veillard.com/XML/messages/0655.html
Text nodes that only contain whitespace don't imply a <p> tag, but the
whitespace check cannot work reliably if we're parsing partial text data
which can happen with both pull and push parser.
The logic in `areBlanks` is hard to follow. The checks involving `CUR`
depend on the position of the input pointer and seem dubious. It's also
possible that the behavior changed inadvertently with a later commit.
As a result, it's hard to come up with good test cases.
We now process leading whitespace before creating implied tags. This is
more in line with HTML5 and should avoid at least some issues with
partial text data.
For example, parsing the string "<head> x" used to result in:
<html>
<head></head>
<body><p> x</p></body>
</html>
And now results in:
<html>
<head> </head>
<body><p>x</p></body>
</html>
Except for the implied <p> tag, this matches HTML5.