Nick Wellnhofer
b349225952
include: Change some return types from int to enum
...
This also affects some new functions from 2.13.
2025-03-14 02:31:01 +01:00
Nick Wellnhofer
fd1b939168
include: Convert some macros to enums
2025-03-14 00:35:40 +01:00
Nick Wellnhofer
84c6524e26
encoding: Support input-only and output-only converters
...
Make it possible to open an encoding handler only for input or output.
This avoids the creation of unnecessary converters.
Should also fix #863.
2025-03-13 22:15:10 +01:00
Nick Wellnhofer
69b83bb68e
encoding: Detect truncated multi-byte sequences with ICU
...
Unlike iconv or the internal converters, ICU consumes truncated
multi-byte sequences at the end of an input buffer. We currently check for a
non-empty raw input buffer to detect truncated sequences, so this fails
with ICU.
It might be possible to inspect the pivot buffer pointers, but it seems
cleaner to implement a `flush` flag for some encoding and I/O functions.
After flushing, we can check for U_TRUNCATED_CHAR_FOUND with ICU, or
detect remaining input with other converters.
Also fix detection of truncated sequences for HTML, XML content and
DTDs with iconv.
2025-03-13 22:15:10 +01:00
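To illustrate the flush-flag idea from commit 69b83bb68e, here is a minimal sketch for the non-ICU case ("detect remaining input"). It is not libxml2's or ICU's code: it assumes UTF-8 input, and the names `trailingIncomplete` and `checkTruncated` are hypothetical. Without `flush`, an incomplete trailing sequence is simply left for the next chunk; with `flush`, it is reported as truncated.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: number of bytes at the end of buf that begin a
 * UTF-8 sequence without completing it. Returns 0 if the buffer ends on
 * a sequence boundary. The rest of the buffer is not validated. */
size_t
trailingIncomplete(const unsigned char *buf, size_t len) {
    size_t back;

    assert(buf != NULL || len == 0);

    /* A UTF-8 sequence is at most 4 bytes long, so inspect at most the
     * last 4 bytes while searching backwards for a lead byte. */
    for (back = 1; back <= len && back <= 4; back++) {
        unsigned char c = buf[len - back];
        size_t need;

        if ((c & 0xC0) == 0x80)
            continue;               /* continuation byte, keep looking */

        if (c < 0x80)
            need = 1;
        else if ((c & 0xE0) == 0xC0)
            need = 2;
        else if ((c & 0xF0) == 0xE0)
            need = 3;
        else if ((c & 0xF8) == 0xF0)
            need = 4;
        else
            return 0;               /* invalid lead byte, not a truncation */

        return (back < need) ? back : 0;
    }

    return 0;                       /* no lead byte found, not handled here */
}

/* Returns -1 if flush is set and the input ends inside a multi-byte
 * sequence. Otherwise returns the number of trailing bytes that must be
 * kept in the raw input buffer until more data arrives. */
int
checkTruncated(const unsigned char *buf, size_t len, int flush) {
    size_t rest = trailingIncomplete(buf, len);

    if (rest > 0 && flush)
        return -1;

    return (int) rest;
}
```

A caller would pass `flush = 0` for intermediate chunks and `flush = 1` only for the final one, so a sequence split across two chunks is not mistaken for an error.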
Nick Wellnhofer
03a8f1dd75
doc: Document SAX handlers a little more
2025-03-11 18:53:59 +01:00
Nick Wellnhofer
87c9e000e5
encoding: Rework custom encoding implementation API
2025-03-09 22:37:13 +01:00
Nick Wellnhofer
ba9148d8a5
parser: Undeprecate input->consumed
...
Should be deprecated after fixing #762.
2025-03-09 20:30:49 +01:00
Nick Wellnhofer
a0dbf030ee
parser: Undeprecate ctxt->loadsubset
...
Should be deprecated after fixing #873.
2025-03-09 20:24:06 +01:00
Nick Wellnhofer
d96911f100
doc: Documentation fixes
2025-03-08 23:03:26 +01:00
Nick Wellnhofer
5f0b1378d7
parser: Add more parser context accessors
...
Fixes #763.
2025-03-08 22:36:06 +01:00
Nick Wellnhofer
38f475072a
encoding: Make conversion callbacks more type-safe
2025-03-05 22:25:14 +01:00
Nick Wellnhofer
a846d96468
encoding: Remove compatibility struct members
2025-03-05 16:49:42 +01:00
Nick Wellnhofer
94d8a3e231
parser: Convert xmlParserMaxDepth to macro
2025-03-05 14:56:46 +01:00
Nick Wellnhofer
696572248f
globals: Remove unused globals
...
- xmlBufferAllocScheme
- xmlDefaultBufferSize
- xmlParserDebugEntities
2025-03-05 12:24:38 +01:00
Nick Wellnhofer
92d7b0cd90
xpath: Rename valuePush and valuePop
2025-03-05 12:24:38 +01:00
Nick Wellnhofer
03be993ce5
Use memcpy to avoid pointer cast warnings
2025-03-05 12:24:38 +01:00
Nick Wellnhofer
f502e9b2f6
include: Add more deprecation warnings
2025-03-04 17:38:10 +01:00
Nick Wellnhofer
85bd58ef56
globals: Remove functions related to global state handling
...
- xmlGetGlobalState
- xmlInitializeGlobalState
- xmlGetThreadId
- xmlIsMainThread
2025-03-04 17:38:10 +01:00
Nick Wellnhofer
03a8d5f93d
unicode: Make Unicode functions private
2025-03-04 17:31:11 +01:00
Nick Wellnhofer
3d37ff84c3
globals: Also use global state struct if threads are disabled
2025-03-04 16:54:41 +01:00
Nick Wellnhofer
a15ad9b268
parser: Remove compatibility symbols
2025-03-04 16:54:41 +01:00
Nick Wellnhofer
8e871162a6
parser: Remove oldXMLWDcompatibility
2025-03-04 16:54:41 +01:00
Nick Wellnhofer
cdc5cfed0b
legacy: Remove legacy symbols
2025-03-04 16:54:05 +01:00
Nick Wellnhofer
3250a01dc2
error: Convert initGenericErrorDefaultFunc to macro
2025-03-04 16:53:59 +01:00
Nick Wellnhofer
c42b32277d
parser: Convert inputPush and inputPop to macros
2025-03-04 16:53:28 +01:00
Nick Wellnhofer
361f7bff92
parser: Make nodePush, nodePop, namePush, namePop private
2025-03-04 16:47:14 +01:00
Nick Wellnhofer
0b27097a92
encoding: Rename unprefixed public functions
2025-03-04 16:46:53 +01:00
Nick Wellnhofer
e50d314a27
build: Add separate configuration option for RELAX NG
...
Support for RELAX NG used to be enabled together with XML Schema support
(--with-schemas). Now there's a separate option and a new feature macro
LIBXML_RELAXNG_ENABLED.
2025-03-01 15:18:20 +01:00
Nick Wellnhofer
7ae8e8ac7d
schemas: Make xmlSchemaDump depend on DEBUG_ENABLED
2025-02-22 21:06:34 +01:00
Nick Wellnhofer
6fc260760a
regexp: Hide debugging code behind DEBUG_REGEXP
...
xmlRegexpPrint is now a deprecated no-op.
2025-02-22 20:55:06 +01:00
Nick Wellnhofer
9c16a153d8
Revert "include: Make most IS_* macros private"
...
This reverts commit 84a6c82ff83d04963d6e1c5cd18ded68ea02d99f.
2025-02-13 20:20:17 +01:00
Nick Wellnhofer
93506d41cb
parser: Make catalog PIs opt-in
...
This is an obscure feature that shouldn't be enabled by default.
2025-01-29 00:50:47 +01:00
Nick Wellnhofer
1082d813e8
parser: Prepare to make decompression opt-in
...
Add a new parser option XML_PARSE_UNZIP that enables decompression.
xmlReadFile, xmlCtxtReadFile and xmlCreateURLParserCtxt always set
this option currently, but downstream users should start to set the
option if they really need it.
2025-01-29 00:49:57 +01:00
Nick Wellnhofer
a78843be5e
xmllint: Support compressed input from stdin
...
Another regression related to reading from stdin.
Making a "-" filename read from stdin was deeply baked into the core
IO code but is inherently insecure. I really want to reenable this
dangerous feature as sparingly as possible.
This now enables compressed input when using the "Fd" API functions,
which wasn't supported before. But XML_PARSE_NO_UNZIP will be
inverted later.
Allow compressed stdin in xmlReadFile to support xmlstarlet and older
versions of xsltproc. So far, these are the only known command-line
tools that rely on "-" meaning stdin.
2025-01-28 23:20:37 +01:00
Nick Wellnhofer
bfe6af2eed
fuzz: Remove hacks to build lint fuzzer
...
Don't include source file directly.
2025-01-17 20:06:45 +01:00
Nick Wellnhofer
e41941109d
schemas: Make ValidateStream take a const SAXHandler
2025-01-17 20:05:57 +01:00
Nick Wellnhofer
c134e8b4dc
include: Make INPUT_CHUNK macro private
2024-12-21 20:02:34 +01:00
Nick Wellnhofer
84a6c82ff8
include: Make most IS_* macros private
...
Macros like IS_DIGIT or IS_LETTER severely pollute the C namespace.
2024-12-21 20:01:30 +01:00
Nick Wellnhofer
2e18e5dc6d
memory: Grow dynamic arrays by 50%
...
Growing by a factor lower than the golden ratio increases the chances of
reusing memory freed from earlier allocations. Set growth rate to 1.5
which also reduces internal fragmentation.
2024-12-21 19:37:38 +01:00
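The golden-ratio argument in commit 2e18e5dc6d can be checked with a small model. The sketch below is an idealized illustration, not allocator behavior: it assumes all freed blocks are adjacent and can be merged, and the function name `firstReuseStep` is hypothetical. It reports the first reallocation at which the memory released so far is large enough to hold the new block.

```c
#include <assert.h>
#include <stddef.h>

/* Idealized model: every freed block is adjacent to the others and can
 * be coalesced. Returns the first reallocation step at which the total
 * size of previously freed blocks can hold the new block, or -1 if this
 * does not happen within maxSteps. */
int
firstReuseStep(size_t initial, int growByHalf, int maxSteps) {
    size_t freed = 0;       /* total size of blocks already released */
    size_t cur = initial;   /* block currently in use */
    int step;

    assert(initial > 1);

    for (step = 1; step <= maxSteps; step++) {
        /* Growth factor 1.5 or 2, depending on the flag. */
        size_t next = growByHalf ? cur + cur / 2 : cur * 2;

        if (freed >= next)
            return step;

        /* The old block is released only after copying to the new one,
         * so it cannot serve the current request. */
        freed += cur;
        cur = next;
    }

    return -1;
}
```

Starting from 16 items, growth by 50% reaches a reusable hole at step 5 (130 freed against 121 requested), whereas doubling never does, because the freed total always stays below the next request.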
Nick Wellnhofer
5320a4aa38
memory: Implement xmlGrowCapacity to safely grow arrays
...
xmlGrowCapacity makes sure that dynamic arrays don't grow beyond an
explicit maximum size. size_t considerations are also taken into account.
A macro XML_MAX_ITEMS is provided as default maximum with value
1 billion.
When fuzzing, the initial size is set to 1 to cause more reallocations.
This can require adjustments if callers really need larger arrays.
2024-12-21 19:37:37 +01:00
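A bounded growth helper of the kind described in commit 5320a4aa38 might look roughly like the sketch below. This is not the libxml2 implementation: the name `sketchGrowCapacity`, its parameters, the return convention and the 50% growth step are all assumptions made for illustration. Only the "1 billion" default comes from the commit message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default maximum, mirroring the "1 billion" in the message. */
#define SKETCH_MAX_ITEMS 1000000000

/* Hypothetical helper: returns the new capacity in items, or -1 if the
 * array must not grow any further. */
int
sketchGrowCapacity(int capacity, size_t elemSize, int initial, int max) {
    int extra;

    assert(elemSize > 0 && initial > 0 && max > 0);

    /* First allocation: start with the initial size, capped at max. */
    if (capacity <= 0)
        return (initial < max) ? initial : max;

    /* Refuse to grow past the explicit maximum, or to a size whose byte
     * count would not fit into size_t. The new capacity is at most
     * twice the old one, hence the division by 2. */
    if (capacity >= max || (size_t) capacity > SIZE_MAX / 2 / elemSize)
        return -1;

    /* Grow by 50%, but by at least one item. */
    extra = (capacity + 1) / 2;

    /* Clamp to max without overflowing int. */
    if (capacity > max - extra)
        return max;

    return capacity + extra;
}
```

A caller would multiply the returned capacity by `elemSize` for the `realloc` call and treat -1 as an out-of-memory or resource-limit error.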
Nick Wellnhofer
0dd910e82b
save: Fix handling of catastrophic errors
...
Don't overwrite catastrophic errors in xmlSaveErr.
Overwrite non-catastrophic errors in xmlOutputBufferClose.
2024-12-19 02:30:36 +01:00
Nick Wellnhofer
57087e5fc7
parser: Don't overwrite catastrophic errors
...
Stop reporting errors after a catastrophic error.
Also make sure that ctxt->errNo matches ctxt->lastError.code.
2024-11-26 00:47:48 +01:00
Nick Wellnhofer
0dc26910c1
parser: Deprecate more internal functions
2024-11-21 22:31:20 +01:00
Nick Wellnhofer
a227a71ac9
regexp: Deprecate internal functions
2024-11-20 17:03:11 +01:00
Nick Wellnhofer
0f4f89005d
parser: Rename inputPush to xmlCtxtPushInput
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
e2ad249c23
parser: Deprecate more internal symbols
...
- xmlParseExternalSubset
- xmlPushInput
- xmlPopInput
- xmlCopyCharMultiByte
- xmlCreateEntityParserCtxt
- xmlStringComment
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
4d1f35b0a9
valid: Deprecate more internal functions
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
5a51f08517
valid: Implement xmlCtxtValidateDocument
...
This allows using the error handler or resource loader of a parser
context.
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
7f8c436c75
parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd
...
This allows using the context's error handler, options and other
settings.
Fixes #808.
2024-11-15 16:30:52 +01:00
Nick Wellnhofer
0bc4608c50
html: Use hash table to check for duplicate attributes
2024-10-06 20:04:00 +02:00
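Commit 0bc4608c50 has no body, so the following is only a generic sketch of checking for duplicate attributes with a hash table, not the parser's actual code. The names `AttrSet`, `hashName` and `attrSetAdd`, the fixed table size and the FNV-1a hash are assumptions. Attribute names are assumed to be already normalized (for HTML, lowercased) and to outlive the set.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Power of two, assumed to be larger than the number of attributes. */
#define ATTR_TABLE_SIZE 64

typedef struct {
    const char *names[ATTR_TABLE_SIZE];   /* NULL marks an empty slot */
} AttrSet;

/* FNV-1a string hash. */
static unsigned
hashName(const char *name) {
    unsigned h = 2166136261u;

    while (*name != 0) {
        h ^= (unsigned char) *name++;
        h *= 16777619u;
    }

    return h;
}

/* Returns 1 if name is already in the set (a duplicate attribute),
 * 0 if it was inserted, and -1 if the table is full. */
int
attrSetAdd(AttrSet *set, const char *name) {
    unsigned idx;
    int probes;

    assert(set != NULL && name != NULL);

    idx = hashName(name) & (ATTR_TABLE_SIZE - 1);

    /* Open addressing with linear probing. */
    for (probes = 0; probes < ATTR_TABLE_SIZE; probes++) {
        if (set->names[idx] == NULL) {
            set->names[idx] = name;
            return 0;
        }
        if (strcmp(set->names[idx], name) == 0)
            return 1;
        idx = (idx + 1) & (ATTR_TABLE_SIZE - 1);
    }

    return -1;
}
```

Compared with comparing each new attribute against all earlier ones, this keeps the expected cost per attribute roughly constant, which matters for elements with very many attributes.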