When using xmlreader, XPointer expressions in XIncludes simply cannot
work. Expressions can reference nodes which weren't parsed yet or which
were already deleted.
After fixing nested XIncludes, we reference includes which were parsed
previously. When streaming, these nodes could have been deleted, leading
to use-after-free errors.
Disallow XPointer expressions and truncate the include table in
streaming mode.
Remove explicit integer casts as final operation
- in assignments
- when passing arguments
- when returning values
Remove casts
- to the same type
- from certain range-bound values
The main motivation is that these explicit casts don't change the result
of operations and only render UBSan's implicit-conversion checks
useless. Removing these casts allows UBSan to detect cases where
truncation or sign-changes occur unexpectedly.
Document some explicit casts as truncating and add a few missing ones.
Private functions were previously declared
- in header files in the root directory
- in public headers guarded with IN_LIBXML
- in libxml.h
- redundantly in source files that used them.
Consolidate all private header files in include/private.
When creating an xmlTextReaderPtr using xmlReaderForMemory(),
there are two optional API functions that can be used:
- xmlTextReaderClose() may be called prior to calling
xmlFreeTextReader() to free parsing resources and close the
xmlTextReaderPtr without freeing it.
- xmlTextReaderCurrentDoc() may be called to return an
xmlDocPtr that's owned by the caller, and must be free using
xmlFreeDoc() after calling xmlFreeTextReader().
The use-after-free issues occur when calling
xmlTextReaderClose() before xmlFreeTextReader(), with different
issues occurring depending on whether xmlTextReaderCurrentDoc()
is also called.
* xmlreader.c:
(xmlFreeTextReader):
- Move code to xmlTextReaderClose(), remove duplicate code, and
call xmlTextReaderClose() if it hasn't been called yet.
(xmlTextReaderClose):
- Move call to xmlFreeNode(reader->faketext) from
xmlFreeTextReader() to fix a use-after-free bug when calling
xmlTextReaderClose() before xmlFreeTextReader(), but not when
using xmlTextReaderCurrentDoc(). The bug was introduced in
2002 by commit beb70bd39. In 2009 commit f4653dcd8 fixed the
use-after-free that occurred every time xmlFreeTextReader()
was called, but not the case where xmlTextReaderClose() was
called first.
- Move post-parsing validation code from xmlFreeTextReader() to
fix a second use-after-free when calling xmlTextReaderClose()
before xmlFreeTextReader(). This regressed in v2.9.10 with
commit 57a3af56f.
In most places, we really need the double-it scheme to avoid quadratic
behavior. The hybrid scheme still can cause many reallocations and the
bounded scheme doesn't seem to provide meaningful protection in
xmlreader.c.
This code has been broken and deprecated since version 2.6.0, released
in 2003. Because of a bug in commit 961b535c, DOCBparser.c was never
compiled since 2012. I couldn't find a Debian package using any of its
symbols, so it seems safe to remove this module.
Now that no references to ID and IDREF attributes are stored in
streaming validation mode, there's no need to try and remove them.
Also remove xmlTextReaderFreeIDTable which was identical to
xmlFreeIDTable.
Don't process XIncludes in the result of another inclusion to avoid
infinite recursion resulting in a call stack overflow.
This is something the XInclude engine shouldn't allow but correct
handling of intra-document includes would require major changes.
Found by OSS-Fuzz.
xml:id creates ID attributes even in documents without a DTD, so the
check in xmlTextReaderFreeProp must be changed to avoid use after free.
Found by OSS-Fuzz.
Keeping objects on a free list can hide memory errors. Only allow a
single node on free lists used by the XML reader when fuzzing. This
should hide fewer errors while still exercising the free list logic.
Just like IDs, IDREF attributes must be removed from the document's
refs table when they're freed by a reader. This bug is often hidden
because xmlAttr structs are reused and strings are stored in a
dictionary unless XML_PARSE_NODICT is specified.
Found by OSS-Fuzz.
The getEntity handler was already invoked by xmlParseReference, so it's
useless to call it again. After the recent change, xmlSAX2GetEntity
won't load any kind of entities anyway.
The parser context stores some options both in the "options" bits and
extra members like "validate" or "replaceEntities". Which of these
are actually read is inconsistent, so make sure to also update the
bit field.
Recent commit 1fbcf40 caused a use-after-free read because it didn't
account for the fact that xmlTextReaderFreeDoc frees entities before
freeing entity references via xmlTextReaderFreeNodeList.
Found by OSS-Fuzz.
Fix a regression caused by commit 39fbfb4f. If xmlTextReaderReadOuterXml
is called on a pristine xmlReader, the current node is NULL and must not
be dereferenced. Move the call to xmlTextReaderExpand to the start of
the function to make sure that we have a valid node.
Fixes#43.
In error cases, there might still be elements in the vstate table.
Since vstateVPop in valid.c is private, we have to pop the elements
with xmlValidatePopElement. This inspects nodes of the document, so
the reader doc must be freed after the clearing the vstate table.
Found by OSS-Fuzz.
Otherwise the encoding of the document is ignored and non-ASCII
characters are serialized as numeric references even if the encoding
is specified as UTF-8.
This fixes the traversal of parent nodes using xmlTextReaderNext()
when the reader is based on a preparsed document (created using
xmlReaderWalker(doc)).
Without this fix the parser will abort even though there are parent
nodes it should traverse to, if it is not currently on an element or
attribute node. This is incorrect, since it can be for example on a
text node when it needs to enter backtracking.
Make sure that all parameters and return values of hash callback
functions exactly match the callback function type. This is required
to pass clang's Control Flow Integrity checks and to allow compilation
to asm.js with Emscripten.
Fixes bug 784861.
One of the operation on the reader could resolve entities
leading to the classic expansion issue. Make sure the
buffer used for xmlreader operation is bounded.
Introduce a new allocation type for the buffers for this effect.