34 Commits

Author SHA1 Message Date
Nick Wellnhofer
d944a41515 parser: Fix in-parameter-entity and in-external-dtd checks
Use in ctxt->input->entity instead of ctxt->inputNr to determine whether
we are inside a parameter entity.

Stop using ctxt->external to check whether we're in an external DTD.
This is signaled by ctxt->inSubset == 2.
2023-12-29 01:19:56 +01:00
Nick Wellnhofer
b76d81dab3 parser: Fix regression when push parsing parameter entities
Short-lived regression from 834b8123.

Also shrink parameter entity buffers when push parsing.
2023-10-06 13:11:19 +02:00
Nick Wellnhofer
0ba22c0513 parser: Support encoded external PEs in entity values
Corner case which was never supported.
2023-10-06 12:28:59 +02:00
Nick Wellnhofer
f480f7509c Update NewsML DTD in test suite
Switch to version 1.2 which has a clearer license.

Fixes #291.
2022-02-03 14:43:17 +01:00
Nick Wellnhofer
d85245f934 Fix regression with PEs in external DTD
Fix a regression introduced with commit a28f7d87. In some cases,
parameter entity references in external DTDs wouldn't be expanded.

Fixes #306.
2022-01-16 21:56:10 +01:00
Nick Wellnhofer
01411e7c5e Check for invalid redeclarations of predefined entities
Implement section "4.6 Predefined Entities" of the XML 1.0 spec and
check whether redeclarations of predefined entities match the original
definitions.

Note that some test cases declared

    <!ENTITY lt "<">

But the XML spec clearly states that this is illegal:

> If the entities lt or amp are declared, they MUST be declared as
> internal entities whose replacement text is a character reference to
> the respective character (less-than sign or ampersand) being escaped;
> the double escaping is REQUIRED for these entities so that references
> to them produce a well-formed result.

Also fixes #217 but the connection is only tangential. The integer
overflow discovered by fuzzing was more related to the fact that various
parts of the parser disagreed on whether to prefer predefined entities
over their redeclarations. The whole situation is a mess and even
depends on legacy parser options. But now that redeclarations are
validated, it shouldn't make a difference.

As noted in the added comment, this is also one of the cases where
overly defensive checks can hide interesting logic bugs from fuzzers.
2021-02-08 21:51:26 +01:00
Jared Yanovich
2a350ee9b4 Large batch of typo fixes
Closes #109.
2019-09-30 18:04:38 +02:00
Nick Wellnhofer
c51e38cb3a Make xmlParseConditionalSections non-recursive
Avoid call stack overflow in deeply nested conditional sections.

Found by OSS-Fuzz.
2019-09-30 15:47:30 +02:00
Nick Wellnhofer
6705f4d28e Remove executable bit from non-executable files 2019-09-16 15:48:59 +02:00
Nick Wellnhofer
932cc9896a Fix buffer size checks in xmlSnprintfElementContent
xmlSnprintfElementContent failed to correctly check the available
buffer space in two locations.

Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048).

Thanks to Marcel Böhme and Thuan Pham for the report.
2017-06-05 19:38:19 +02:00
Nick Wellnhofer
e26630548e Fix handling of parameter-entity references
There were two bugs where parameter-entity references could lead to an
unexpected change of the input buffer in xmlParseNameComplex and
xmlDictLookup being called with an invalid pointer.

Percent sign in DTD Names
=========================

The NEXTL macro used to call xmlParserHandlePEReference. When parsing
"complex" names inside the DTD, this could result in entity expansion
which created a new input buffer. The fix is to simply remove the call
to xmlParserHandlePEReference from the NEXTL macro. This is safe because
no users of the macro require expansion of parameter entities.

- xmlParseNameComplex
- xmlParseNCNameComplex
- xmlParseNmtoken

The percent sign is not allowed in names, which are grammatical tokens.

- xmlParseEntityValue

Parameter-entity references in entity values are expanded but this
happens in a separate step in this function.

- xmlParseSystemLiteral

Parameter-entity references are ignored in the system literal.

- xmlParseAttValueComplex
- xmlParseCharDataComplex
- xmlParseCommentComplex
- xmlParsePI
- xmlParseCDSect

Parameter-entity references are ignored outside the DTD.

- xmlLoadEntityContent

This function is only called from xmlStringLenDecodeEntities and
entities are replaced in a separate step immediately after the function
call.

This bug could also be triggered with an internal subset and double
entity expansion.

This fixes bug 766956 initially reported by Wei Lei and independently by
Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone
involved.

xmlParseNameComplex with XML_PARSE_OLD10
========================================

When parsing Names inside an expanded parameter entity with the
XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the
GROW macro if the input buffer was exhausted. At the end of the
parameter entity's replacement text, this function would then call
xmlPopInput which invalidated the input buffer.

There should be no need to invoke GROW in this situation because the
buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and,
at least for UTF-8, in xmlCurrentChar. This also matches the code path
executed when XML_PARSE_OLD10 is not set.

This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050).
Thanks to Marcel Böhme and Thuan Pham for the report.

Additional hardening
====================

A separate check was added in xmlParseNameComplex to validate the
buffer size.
2017-06-05 18:38:33 +02:00
Daniel Veillard
ef709ce2f7 Fix the spurious ID already defined error
For https://bugzilla.gnome.org/show_bug.cgi?id=737840
the fix for 724903 introduced a regression on external entities carrying
IDs, revert that patch in part and add a specific test to avoid readding it
2015-09-10 19:46:46 +08:00
Daniel Veillard
483272f3f0 Added a regression tests from bug 694228 data
Provided by Mark Rowe <mrowe@apple.com>
2013-03-27 13:37:14 +08:00
Daniel Veillard
a721612e54 446613 small validation bug mixed content with NS
* valid.c: fix a bug when valdating mixed content lists and some
  name use namespaces prefixes.
* result/valid/notes.xml* test/valid/dtds/notes.dtd * test/valid/notes.xml:
  add the test case to the regression suite
2009-08-21 18:22:58 +02:00
Daniel Veillard
8bf64aef50 fix a problem reported by Ashwin for system parameter entities referenced
* parser.c: fix a problem reported by Ashwin for system parameter
  entities referenced from entities in external subset, add a
  specific loading routine.
* test/valid/dtds/external.ent test/valid/dtds/external2.ent
  test/valid/t11.xml result/valid/t11.xml*: added the test to
  the regression suite
Daniel

svn path=/trunk/; revision=3713
2008-03-24 20:45:21 +00:00
Daniel Veillard
57c9db0725 poblem with encoding detection for UTF-16 reported by Ashwin and found by
* encoding.c: poblem with encoding detection for UTF-16 reported by
  Ashwin and found by Bill
* test/valid/dtds/utf16b.ent test/valid/dtds/utf16l.ent
  test/valid/UTF16Entity.xml result/valid/UTF16Entity.xml*: added
  the example to the regression tests
Daniel

svn path=/trunk/; revision=3700
2008-03-06 14:37:10 +00:00
Daniel Veillard
9668826368 fixed bug #170489 reported by Jirka Kosek added the test to the regression
* parser.c: fixed bug #170489 reported by Jirka Kosek
* test/valid/objednavka.xml test/valid/dtds/objednavka.dtd
  result/valid/objednavka*: added the test to the regression suite.
Daniel
2005-08-23 18:14:12 +00:00
Daniel Veillard
f6b71bd176 making DSO support an option code and documentation cleanups regenerated
* configure.in: making DSO support an option
* xmlmodule.c xmlreader.c include/libxml/xmlmodule.h: code
  and documentation cleanups
* elfgcchack.h testapi.c doc/*: regenerated the docs and
  checks for new module
* test/valid/REC-xml-19980210.xml: fix a small change introduced
  previously
Daniel
2005-01-04 17:50:14 +00:00
Daniel Veillard
91b955c1af fixed ID deallocation problem based on patch from Steve Shepard fixes bug
* valid.c: fixed ID deallocation problem based on patch from
  Steve Shepard fixes bug #160893
* xmlmemory.c: improving comment.
* testapi.c: new test for xmlDictExists() is generated.
Daniel
2004-12-10 10:26:42 +00:00
Daniel Veillard
fb5476f216 Fixed back modified tests erased on last commit, daniel 2004-11-07 12:20:06 +00:00
Daniel Veillard
c2c894fd7e fixed typos and avoid Catalogs verbosity Daniel
* gentest.py testapi.c: fixed typos and avoid Catalogs verbosity
Daniel
2004-11-07 12:17:35 +00:00
William M. Brack
4119d1c61d implemented bugfix from Massimo Morara for DTD dumping problem. added
* valid.c: implemented bugfix from Massimo Morara for DTD
  dumping problem.
* test/valid/t10.xml, result/valid/t10.*: added regression
  for above
* configure.in: small change for my profile settings
2004-06-24 02:24:44 +00:00
Daniel Veillard
e70c877c83 swapped the attribute defaulting and attribute checking parts of parsing a
* parser.c: swapped the attribute defaulting and attribute checking
  parts of parsing a new element start, fixes bug #127772
* result/valid/127772.* test/valid/127772.xml
  test/valid/dtds/127772.dtd: added the example in the regression tests
Daniel
2003-11-25 07:21:18 +00:00
Daniel Veillard
7b68df974b fixed bug #118712 about mixed content, and namespaced element names. added
* valid.c: fixed bug #118712 about mixed content, and namespaced
  element names.
* test/valid/mixed_ns.xml result/valid/mixed_ns*: added a check
  in the regression tests
Daniel
2003-08-03 22:58:54 +00:00
Daniel Veillard
f431eb8144 applied the patch provided by Brent Hendricks fixing #105992 and
* SAX.c test/valid/ns* test/result/ns*: applied the patch
  provided by Brent Hendricks fixing #105992 and integrated the
  examples in the testsuite.
Daniel
2003-04-22 08:37:26 +00:00
Daniel Veillard
ef8dd7be29 fixing bug #108976 get the ID/REFs to reference the ID in the document
* parser.c: fixing bug #108976 get the ID/REFs to reference
  the ID in the document content and not in the entity copy
* SAX.c include/libxml/parser.h: more checking of the ID/REF
  stuff, better solution for #107208
* xmlregexp.c: removed a direct printf, dohhh
* xmlreader.c: fixed a bug on streaming validation of empty
  elements in entities
* result/VC/ElementValid8 test/VCM/v20.xml result/valid/xhtml1.xhtml:
  cleanup of the validation tests
* test/valid/id* test/valid/dtds/destfoo.ent result/valid/id*:
  added more ID/IDREF tests to the suite
Daniel
2003-03-23 12:02:56 +00:00
Daniel Veillard
90d68fbb35 fixed bug #92518 validation error were not covering namespace
* SAX.c valid.c include/libxml/valid.h: fixed bug #92518 validation
  error were not covering namespace declarations.
* result/valid/dia.xml test/valid/dia.xml: the test wasn't valid,
  it was missing the attribute declaration for the namespace
* result/VC/NS3: the fix now report breakages in that test
Daniel
2002-09-26 16:10:21 +00:00
Daniel Veillard
f5582f156c applied a couple of patches from Peter Jacobi to start to get rid of
* parser.c: applied a couple of patches from Peter Jacobi to start
  to get rid of ctxt->token, with a possible significant speed
  improvement to be gained once done. Better compliance with PE
  references constructs in DTDs too.
* test/valid/t[0-9]* result/valid/t[0-9]*: added a set of tests
  from Peter too
Daniel
2002-06-11 10:08:16 +00:00
Daniel Veillard
be480fbbe3 trying to fix namespaces + validation problems for good, closing #63619 in
* valid.c include/libxml/tree.h: trying to fix namespaces +
  validation problems for good, closing #63619 in the process
* result/valid/dia.xml test/valid/dia.xml: the Dia test was
  wrong in this respect, fixed it.
Daniel
2001-11-08 23:36:42 +00:00
Daniel Veillard
5151c06f30 fixed an erroneous validation bug when PE refs occurs in external parsed
* parser.c: fixed an erroneous validation bug when PE refs
  occurs in external parsed entities referenced from the
  internals subset
* test/valid/index.xml test/valid/dtds/nitf-2-5.dtd
  test/valid/dtds/NewsMLv1.0.dtd result/valid/index.xml*:
  added the associated testcase, it's a nice one.
* HTMLparser.c: generate the DTD node as HTML still ...
* HTMLtree.c: fixed errors in Set/GetMetaEncoding
Daniel
2001-10-23 13:10:19 +00:00
Daniel Veillard
8534905f62 - valid.c: removed a state explosion exhibited by RSS
- test/valid/rss.xml result/valid/rss.xml*: added the testcase
  from bug #51872
Daniel
2001-04-20 13:48:21 +00:00
Daniel Veillard
3dd82e7c2a - TODO: updated - xmlmemory.[ch] : added xmlMemSetup() and xmlMemGet() to
- TODO: updated
- xmlmemory.[ch] : added xmlMemSetup() and xmlMemGet() to override
  libxml default allocation function with another set (like gmalloc/gfree).
- Makefile.am, uri.c, uri.h: added a set of functions to do exact (litteraly
  copied from the RFC 2396 productions) parsing and handling of URI.
Daniel
2000-03-20 11:48:04 +00:00
Daniel Veillard
686d6b6ab1 - added xmlRemoveProp
- changed the way Windows socket stuff get included
- removed an indetermination xmLDecl/PI(xml...)
- xmlNewNs wasn't checking for double definition
- fixed a problem with dist-hook duplicates
- fixed the loading of external entities APIs, now xmlLoadExternalEntity()
  is used everywhere
- now the xhtml spec validates with the xhtml DTD.
- error.c: fixed crashes in case of no input stream
- added the xhtml spec and dtds to the validation tests and results
Daniel
2000-01-03 11:08:02 +00:00
Daniel Veillard
b05deb7f5f Huge commit: 1.5.0, XML validation, Xpath, bugfixes, examples .... Daniel 1999-08-10 19:04:08 +00:00