xmlCleanupParser is dangerous and shouldn't be called in most cases.
Being part of the examples led many people to use it incorrectly.
xmlMemoryDump is an obsolete way to test for memory leaks.
Removing version information caused problems when relinking with shared
libraries depending on libxml2. It also broke the ABI on Android.
Revert libxml2.syms to the 2.10.0 version.
Fixes#526.
Revert another change from commit 98840d40.
Decode the whole buffer when reading from memory and switching to the
initial encoding. Add some comments about potential improvements.
* parser.c:
(xmlParseCharData):
- Check if the parser has stopped before advancing
`ctxt->input->cur`. This only occurs if a custom SAX error
handler calls xmlStopParser() on fatal errors.
Fixes#518.
Revert some changes from commit 98840d40.
WebKit/Chromium can actually switch from ISO-8859-1 to UTF-16 in the
middle of parsing. This is a bad idea, but we have to keep supporting
this use case.
When hashing empty strings which aren't null-terminated,
xmlDictComputeFastKey could produce inconsistent results. This could
lead to various logic or memory errors, including double frees.
For consistency the seed is also taken into account, but this shouldn't
have an impact on security.
Found by OSS-Fuzz.
Fixes#510.
Covered by: test/VC/ElementValid5
This only affects XML Reader API with LIBXML_REGEXP_ENABLED and
LIBXML_VALID_ENABLED turned on.
* result/VC/ElementValid5.rdr:
- Update result to add missing error message.
* python/tests/reader2.py:
* result/VC/ElementValid6.rdr:
* result/VC/ElementValid7.rdr:
* result/valid/781333.xml.err.rdr:
- Update result to fix grammar issue.
* valid.c:
(xmlValidatePopElement):
- Check return value of xmlRegExecPushString() to handle -1, and
assign 'ret = 0;' to return 0 from xmlValidatePopElement().
This change affects xmlTextReaderValidatePop() from
xmlreader.c.
- Fix grammar of error message by changing 'child' to
'children'.
In commit 21ca8829, we started to ignore namespaces in HTML element
names but we still called xmlSplitQName, effectively stripping the
namespace prefix. This would cause elements like <o:p> being parsed
as <p>. Now we leave the name untouched.
Fixes#508.
To detect EBCDIC code pages, we used to switch the encoding twice and
had to be very careful not to decode data after the XML declaration
before the second switch. This relied on a hard-coded expected size of
the XML declaration and was complicated and unreliable.
Now we convert the first 200 bytes to EBCDIC-US and parse the encoding
declaration manually.
Don't try to grow the input buffer in xmlParserShrink. This makes sure
that no memory allocations are made and the function always succeeds.
Remove unnecessary invocations of SHRINK. Invoke SHRINK at the end of
DTD parsing loops.
Shrink before growing.
The input buffer is now grown reliably when calling CUR_CHAR
(xmlCurrentChar) or NEXT (xmlNextChar). This allows to remove many
other invocations of GROW.