7333 Commits

Author SHA1 Message Date
Nick Wellnhofer
93506d41cb parser: Make catalog PIs opt-in
This is an obscure feature that shouldn't be enabled by default.
2025-01-29 00:50:47 +01:00
Nick Wellnhofer
1082d813e8 parser: Prepare to make decompression opt-in
Add a new parser option XML_PARSE_UNZIP that enables decompression.
xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set
this option currently, but downstream users should start to set the
option if they really need it.
2025-01-29 00:49:57 +01:00
Nick Wellnhofer
a78843be5e xmllint: Support compressed input from stdin
Another regression related to reading from stdin.

Making a "-" filename read from stdin was deeply baked into the core
IO code but is inherently insecure. I really want to reenable this
dangerous feature as sparingly as possible.

This now enables compressed input when using the "Fd" API functions
which wasn't supported before. But XML_PARSE_NO_UNZIP will be
inverted later.

Allow compressed stdin in xmlReadFile to support xmlstarlet and older
versions of xsltproc. So far, these are the only known command-line
tools that rely on "-" meaning stdin.
2025-01-28 23:20:37 +01:00
Nick Wellnhofer
a8d8a70c51 uri: Fix handling of Windows drive letters
Allow drive letters in URI paths. Technically, these should be treated
as URI schemes, but this is not what users expect. This also makes sure
that paths with drive letters are resolved as filesystem paths and
unescaped, for example when used in libxslt's document() function.

Should fix #832.
2025-01-27 14:28:29 +01:00
Nick Wellnhofer
6904d4c225 fuzz: Fix OSS-Fuzz build of lint fuzzer 2025-01-25 13:55:23 +01:00
Benjamin Gilbert
cd7299a8e3 meson: Fix setup with ICU as sibling subproject
Meson wrapdb provides a wrap for ICU, so libxml2 and ICU could both be
built as subprojects of the same Meson parent project.  In this case, with
the icu option enabled, setup was failing with:

    subprojects/libxml2-2.13.5/meson.build:603:22: ERROR: Could not get an internal variable and no default provided for <InternalDependency dep228908115162702543524838879388991448872: True>

This is because we can't get a dependency variable from a subproject that
hasn't been built yet.  Fall back to assuming DEFS is empty, as it is on
my system.
2025-01-24 18:59:12 -08:00
Nick Wellnhofer
6ec616ba26 encoding: Don't allow POSIX indicator suffixes in encoding names
Suffixes like "//IGNORE" change the behavior of iconv.

Also add comment on how we currently rely on GNU libiconv behavior
which technically violates the POSIX spec.
2025-01-24 20:47:52 +01:00
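
A rough sketch of the kind of check 6ec616ba26 describes, assuming that rejecting any encoding name containing "//" is sufficient. The function name `encodingNameIsPlain` is hypothetical and the actual commit may validate names differently.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2.
 * Returns 1 if `name` carries no "//" indicator suffix such as
 * "//IGNORE" or "//TRANSLIT", 0 if it does. Treating every "//"
 * as a suffix marker is an assumption made for this sketch.
 */
int
encodingNameIsPlain(const char *name) {
    assert(name != NULL);

    return strstr(name, "//") == NULL;
}
```

With a check like this in front of iconv_open, a document declaring encoding="UTF-8//IGNORE" would be refused instead of silently changing how invalid input is converted.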
Nick Wellnhofer
9b1028c906 fuzz: Fix comments 2025-01-23 20:37:37 +01:00
Nick Wellnhofer
e95c4b07ae fuzz: Also test xmllint --repeat option 2025-01-23 20:30:40 +01:00
Nick Wellnhofer
dc6270d110 xmllint: Fix UAF with --push --repeat
Short-lived regression. Fixes #841.
2025-01-23 20:30:25 +01:00
Grzegorz Szymaszek
9d7bbf1952 tree: Fix variable name in xmlAddChild documentation 2025-01-23 17:49:54 +00:00
Kjell Ahlstedt
f043bf2522 meson: Fix build with MSVC
Check compiler options with cc.get_supported_arguments().

Fixes #842
2025-01-22 19:25:59 +01:00
Nick Wellnhofer
b524cd7af2 meson: Fix build as subproject
Use add_project_arguments instead of add_global_arguments.

Should fix #840.
2025-01-21 17:35:04 +01:00
Nick Wellnhofer
1c82bca6bd xmllint: Improve error reports from reader 2025-01-17 23:29:30 +01:00
Nick Wellnhofer
16286dea31 xmllint: Fix memory leak in parseAndPrintFile 2025-01-17 23:14:44 +01:00
Nick Wellnhofer
9cfc723cad xmllint: Always reuse parser context
Also move push parsing into parseXml which makes "--sax --push" work.
2025-01-17 21:42:35 +01:00
Nick Wellnhofer
5f1131dda5 xpath: Don't descend into OP_VALUE in debug dump
For some reason, its "ch1" value is invalid.
2025-01-17 20:10:46 +01:00
Nick Wellnhofer
00167cae33 xmllint: Report OOM errors to stderr
For the validators, some work still has to be done, but for core
features, xmllint should now report OOM errors reliably.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
67b738d9a5 fuzz: Check whether xmllint reports malloc failures correctly
This relies on xmllint's "maxmem" option.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
bfe6af2eed fuzz: Remove hacks to build lint fuzzer
Don't include source file directly.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
bf1d8b9cfb xmllint: Report malloc failures from parsing patterns 2025-01-17 20:06:45 +01:00
Nick Wellnhofer
255fd5f3f1 xmllint: Store error stream in global state 2025-01-17 20:06:45 +01:00
Nick Wellnhofer
e42ded421c xmllint: Stop using global variables
The only exception is "maxmem". The custom malloc functions don't
support an extra context.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
e41941109d schemas: Make ValidateStream take a const SAXHandler 2025-01-17 20:05:57 +01:00
Nick Wellnhofer
d39e5714b0 xmllint: Fix memory leak in parseFile
Short-lived regression.
2025-01-17 20:05:56 +01:00
Nick Wellnhofer
0f4d36e055 xmllint: Fix memory leak in error case 2025-01-17 13:13:54 +01:00
Nick Wellnhofer
fbaacfe223 encoding: Clean up UCS-4 encodings
Use "UCS-*" instead of "ISO-10646-UCS-*". While the XML spec recommends
"ISO-10646-UCS-2" and "ISO-10646-UCS-4", GNU iconv doesn't understand
these names.

Ignore UCS4_2143 and UCS4_3412 which were never supported.
2025-01-16 16:09:14 +01:00
Nick Wellnhofer
be579a266e reader: Fix return value of xmlTextReaderReadString again
Make sure to return NULL for node types except elements or text to match
the old behavior.

Note that CDATA sections are still treated like text nodes and will have
their content returned.

Fixes #838.
2025-01-15 15:24:58 +01:00
Nick Wellnhofer
86401cc3d2 xmllint: Make --shell ignore some other options
When the shell should be launched with the --shell option, don't
post-validate, stream or dump the document. Ignore the --repeat option.
2025-01-07 19:10:19 +01:00
Nick Wellnhofer
c0c69cb868 xmllint: Always reuse parser context
Simplifies "repeat" logic.
2025-01-07 18:55:35 +01:00
Nick Wellnhofer
a5be2cc303 xmllint: Support --xpath --debug
Dump compiled expression if --debug was supplied.
2025-01-06 19:14:28 +01:00
Nick Wellnhofer
f22707f42b xmllint: Use xmlXPathOrderDocElems for XPath queries 2025-01-06 19:14:21 +01:00
Nick Wellnhofer
ca81916023 include: Use intptr_t to cast between pointers and ints 2025-01-03 20:59:10 +01:00
Nick Wellnhofer
41c10c0cec io: Don't cast file descriptors to pointers
This doesn't work if open() returns 0 which is rare but can happen. Wrap
the fd in a context struct.

Fixes #835.
2025-01-03 20:15:52 +01:00
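
The idea behind 41c10c0cec can be sketched as follows. This is not the code from the commit: `FdContext`, `fdContextNew` and `fdContextGet` are hypothetical names. The sketch assumes the usual case where converting the integer 0 to a pointer yields a null pointer, so a callback context built by casting fd 0 is indistinguishable from "no context".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical wrapper so that fd 0 still yields a non-NULL context. */
typedef struct {
    int fd;
} FdContext;

/* Allocates a context holding `fd`. Returns NULL only on malloc failure. */
void *
fdContextNew(int fd) {
    FdContext *ctxt = malloc(sizeof(*ctxt));

    if (ctxt == NULL)
        return NULL;
    ctxt->fd = fd;
    return ctxt;
}

/* Reads the descriptor back inside an I/O callback. */
int
fdContextGet(const void *context) {
    assert(context != NULL);

    return ((const FdContext *) context)->fd;
}
```

The cost is one small allocation per buffer, which the close callback has to release with free(). In exchange, a NULL context unambiguously means allocation failure.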
Nick Wellnhofer
71c37a565d malloc-fail: Fix memory leak in xmlValidateElementContent 2024-12-30 11:42:40 +01:00
Nick Wellnhofer
ab62fc27e8 gitlab-ci: Add --with-valid to medium config
Building --with-valid --without-regexps enables some rarely tested code.

There's an additional test failure in runxmlconf without regexps.
2024-12-28 11:56:51 +01:00
Nick Wellnhofer
cd220b93d8 valid: Remove duplicate error messages when streaming 2024-12-28 11:55:24 +01:00
Nick Wellnhofer
bd2a16489f valid: Fix build --without-regexps 2024-12-28 11:55:24 +01:00
Nick Wellnhofer
41aed0890a automake: Only build testdso when testing 2024-12-26 21:25:23 +01:00
Nick Wellnhofer
0cf25b3de2 Regenerate docs and testapi.c 2024-12-26 21:06:09 +01:00
Nick Wellnhofer
2e3a91a766 doc: Fix documentation 2024-12-26 21:05:39 +01:00
Nick Wellnhofer
53c131f667 doc: Make apibuild.py work again 2024-12-26 20:29:58 +01:00
Nick Wellnhofer
260954c566 autotools: Set AC_CONFIG_AUX_DIR
This should make sure that autoreconf doesn't mess with parent
directories.

Should fix #833.
2024-12-26 18:17:45 +01:00
Nick Wellnhofer
b3871dd138 io: Fix memory leaks of encoding handler in error cases
xmlOutputBufferCreate* must always free the encoding handler.
2024-12-21 21:58:25 +01:00
Nick Wellnhofer
afeff9c52b xinclude: Allow build without XPath
This disables XPath queries and makes the tests fail, but might be
useful.
2024-12-21 21:58:25 +01:00
Nick Wellnhofer
c134e8b4dc include: Make INPUT_CHUNK macro private 2024-12-21 20:02:34 +01:00
Nick Wellnhofer
84a6c82ff8 include: Make most IS_* macros private
Macros like IS_DIGIT or IS_LETTER severely pollute the C namespace.
2024-12-21 20:01:30 +01:00
Nick Wellnhofer
0d4a17af49 valid: Fix and check return value of nodeVPush 2024-12-21 19:41:44 +01:00
Nick Wellnhofer
3f0bac4820 malloc-fail: Handle more malloc failures in schema code
These issues can only arise after a memory allocation failed.

- WXS_ADD_*: Add NULL check and raise error
- XML_SCHEMA_*: Make macros safe
- xmlSchemaParseUnion: Fix leak, raise error, commit after success to
  avoid memory corruption
- xmlSchemaVAddNodeQName: Restore nbItems after partial success,
  raise error
- xmlSchemaIDCAcquireTargetList: Raise error
- xmlSchemaXPathProcessHistory: Handle errors
- xmlSchemaIDCFillNodeTables: Fix leak
- xmlSchemaCheckCVCIDCKeyRef: Handle errors
- xmlSchemaVPushText: Reset flag to avoid memory corruption
- xmlSchemaNewValidCtxt: Handle errors
- xmlSchemaVDocWalk: Fix leak
- xmlSchemaInitBasicType: Handle error
- xmlSchemaCleanupTypesInternal: Fix null deref
- xmlSchemaWhiteSpaceReplace: Handle error
- xmlSchemaParseUInt: Handle error
- xmlSchemaValAtomicType: Fix leak, handle error
- xmlSchemaDateNormalize: Fix leak
2024-12-21 19:41:16 +01:00
Nick Wellnhofer
df7cb96c50 build: Set C standard with CMake and meson
This should add `/std:c11` to MSVC builds which makes sure that the
__STDC_VERSION__ macro is set.
2024-12-21 19:37:38 +01:00