Nick Wellnhofer
bd9eed4694
parser: Make unsupported encodings an error in declarations
...
This was changed in 45157261, but in encoding declarations, unsupported
encodings should raise a fatal error.
Fixes #794.
2024-09-02 19:29:39 +02:00
Nick Wellnhofer
1d009fe35d
parser: Report at least one fatal error
2024-08-05 15:14:21 +02:00
Nick Wellnhofer
bfed6e6ae8
parser: Fix error handling after reaching limit
...
Mark document as non-wellformed and stop parser even if error limit was
reached.
Regressed in abd74186.
2024-08-05 14:58:37 +02:00
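The behavior described in this commit message can be illustrated with a minimal sketch. It is not libxml2's implementation: `DemoCtxt`, `demoError` and `DEMO_MAX_REPORTED` are hypothetical names, and the limit value is arbitrary. The point is only that the error limit suppresses reporting, while the well-formedness flag and the stop flag are still updated.

```c
#include <assert.h>
#include <stdio.h>

#define DEMO_MAX_REPORTED 3  /* hypothetical limit, not libxml2's value */

typedef enum { DEMO_ERR_RECOVERABLE, DEMO_ERR_FATAL } DemoLevel;

/* Hypothetical, simplified parser state; not libxml2's xmlParserCtxt. */
typedef struct {
    int wellFormed;  /* 1 until a fatal error is seen */
    int stopped;     /* 1 once parsing must halt */
    int nbErrors;    /* all errors seen so far */
    int nbReported;  /* errors actually passed on to the user */
} DemoCtxt;

static void demoCtxtInit(DemoCtxt *ctxt) {
    assert(ctxt != NULL);
    ctxt->wellFormed = 1;
    ctxt->stopped = 0;
    ctxt->nbErrors = 0;
    ctxt->nbReported = 0;
}

static void demoError(DemoCtxt *ctxt, DemoLevel level, const char *msg) {
    assert(ctxt != NULL && msg != NULL);
    ctxt->nbErrors += 1;

    /* Only the reporting is subject to the limit. */
    if (ctxt->nbReported < DEMO_MAX_REPORTED) {
        fprintf(stderr, "error: %s\n", msg);
        ctxt->nbReported += 1;
    }

    /* State changes happen regardless of the limit. */
    if (level == DEMO_ERR_FATAL) {
        ctxt->wellFormed = 0;
        ctxt->stopped = 1;
    }
}
```

In this sketch, a fatal error that arrives after the limit was reached is no longer printed, but the context still ends up non-wellformed and stopped.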
Nick Wellnhofer
6a3c0b0d93
parser: Increase XML_MAX_DICTIONARY_LIMIT
...
This limit is somewhat arbitrary and can be reached when fuzzing
documents up to 1 MB.
Increase limit to 100 MB and disable limit if XML_PARSE_HUGE is set.
2024-07-22 12:53:00 +02:00
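As a rough illustration of the limit policy in this commit message, the sketch below models a size-limited dictionary. All names (`DemoDict`, `demoDictInit`, `demoDictReserve`) are hypothetical, the 100 MB value is assumed to mean 100,000,000 bytes, and using 0 to mean "no limit" is an assumption of this sketch rather than a statement about libxml2's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed default: 100 MB taken as 100,000,000 bytes. */
#define DEMO_DICT_LIMIT ((size_t) 100 * 1000 * 1000)

/* Hypothetical dictionary accounting; not libxml2's xmlDict. */
typedef struct {
    size_t used;   /* bytes currently stored */
    size_t limit;  /* 0 means "no limit" in this sketch */
} DemoDict;

/* huge != 0 stands in for a parser running with XML_PARSE_HUGE. */
static void demoDictInit(DemoDict *d, int huge) {
    assert(d != NULL);
    d->used = 0;
    d->limit = huge ? 0 : DEMO_DICT_LIMIT;
}

/* Account for len more bytes. Returns 0 on success, -1 if the
 * limit (or size_t itself) would be exceeded. */
static int demoDictReserve(DemoDict *d, size_t len) {
    assert(d != NULL);
    if (len > SIZE_MAX - d->used)
        return -1;
    if (d->limit != 0 && d->used + len > d->limit)
        return -1;
    d->used += len;
    return 0;
}
```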
Nick Wellnhofer
a6f54f055b
io: Fine-tune initial IO buffer size
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
34c9108f15
encoding: Add sizeOut argument to xmlCharEncInput
...
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
2024-07-16 17:42:10 +02:00
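The effect of an output size cap can be sketched with a small stand-alone converter. This is not the signature or implementation of xmlCharEncInput: `demoLatin1ToUtf8` is a hypothetical helper that converts ISO-8859-1 to UTF-8 and stops once `sizeOut` bytes of output would be exceeded. A caller that wants everything converted (the push-parsing case) would pass a large `sizeOut`; a caller that wants to save memory (the pull-parsing case) would pass a small one and call again later.

```c
#include <assert.h>
#include <stddef.h>

/* Convert ISO-8859-1 bytes from `in` to UTF-8 in `out`, writing at most
 * sizeOut bytes. On return, *inLen holds the number of input bytes
 * consumed. The return value is the number of output bytes written.
 * Hypothetical helper, not part of libxml2. */
static size_t demoLatin1ToUtf8(unsigned char *out, size_t sizeOut,
                               const unsigned char *in, size_t *inLen) {
    size_t i = 0;  /* input position */
    size_t o = 0;  /* output position */

    assert(out != NULL && in != NULL && inLen != NULL);

    while (i < *inLen) {
        unsigned char c = in[i];
        size_t need = (c < 0x80) ? 1 : 2;

        /* Stop before a character that no longer fits. */
        if (sizeOut - o < need)
            break;

        if (c < 0x80) {
            out[o++] = c;
        } else {
            out[o++] = (unsigned char) (0xC0 | (c >> 6));
            out[o++] = (unsigned char) (0x80 | (c & 0x3F));
        }
        i++;
    }

    *inLen = i;
    return o;
}
```

Because the function reports how much input it consumed, the caller can resume from that offset on the next call, which is what makes chunk-by-chunk conversion possible.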
Nick Wellnhofer
92f30711de
parser: Optimize buffer shrinking
...
Remove checks now that we can shrink memory buffers efficiently.
Shrink more aggressively.
2024-07-16 17:42:10 +02:00
Nick Wellnhofer
a221cd7849
buf: Rework xmlBuf code
...
Always use what the old implementation called the "IO" allocation
scheme, which allows the content pointer to move past the start of the
initial allocation. This is inexpensive and allows efficient shrinking.
Optimize xmlBufGrow, reusing shrunken memory as much as possible.
Simplify xmlBufAdd.
Make xmlBufBackToBuffer return an error on overflow.
Make "size" exclude the terminating NUL byte.
Always provide an initial size.
Reintroduce static buffers.
Remove xmlBufResize and several other functions.
2024-07-16 17:42:10 +02:00
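The allocation scheme described in this commit message can be sketched as follows. This is a simplified model under stated assumptions, not the actual xmlBuf code: `DemoBuf` and the `demoBuf*` functions are hypothetical, and the growth policy is deliberately naive. It shows shrinking by advancing a content pointer, growing by first reclaiming the shrunken space, and a "size" that excludes the terminating NUL byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical buffer; not libxml2's xmlBuf. */
typedef struct {
    char *mem;      /* start of the allocation */
    char *content;  /* start of live data, content >= mem */
    size_t use;     /* live bytes, excluding the terminating NUL */
    size_t size;    /* capacity from content, excluding the NUL */
} DemoBuf;

/* Always takes an initial size; allocates one extra byte for the NUL. */
static int demoBufInit(DemoBuf *b, size_t size) {
    assert(b != NULL);
    if (size == SIZE_MAX || (b->mem = malloc(size + 1)) == NULL)
        return -1;
    b->content = b->mem;
    b->use = 0;
    b->size = size;
    b->content[0] = 0;
    return 0;
}

/* Drop up to len bytes from the front in O(1): no copying, just
 * advance the content pointer. Returns the number of bytes dropped. */
static size_t demoBufShrink(DemoBuf *b, size_t len) {
    if (len > b->use)
        len = b->use;
    b->content += len;
    b->use -= len;
    b->size -= len;
    return len;
}

/* Make room for len more bytes, reusing shrunken memory if possible. */
static int demoBufGrow(DemoBuf *b, size_t len) {
    size_t start = (size_t) (b->content - b->mem);
    size_t newSize;
    char *tmp;

    if (len <= b->size - b->use)
        return 0;
    if (len <= start + (b->size - b->use)) {
        /* Slide live data (and its NUL) back to the allocation start. */
        memmove(b->mem, b->content, b->use + 1);
        b->content = b->mem;
        b->size += start;
        return 0;
    }
    if (len >= SIZE_MAX - b->use)
        return -1;  /* would overflow */
    newSize = b->use + len;
    if ((tmp = malloc(newSize + 1)) == NULL)
        return -1;
    memcpy(tmp, b->content, b->use + 1);
    free(b->mem);
    b->mem = b->content = tmp;
    b->size = newSize;
    return 0;
}

/* Append len bytes from s and keep the data NUL-terminated. */
static int demoBufAdd(DemoBuf *b, const char *s, size_t len) {
    if (demoBufGrow(b, len) < 0)
        return -1;
    memcpy(b->content + b->use, s, len);
    b->use += len;
    b->content[b->use] = 0;
    return 0;
}
```

In this model, shrinking never touches the data, and a later grow only pays for a `memmove` of the live bytes when the reclaimed space is sufficient; a fresh allocation happens only when it is not.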
Nick Wellnhofer
728869809e
error: Add helper functions to print errors and abort
2024-07-15 16:33:38 +02:00
Nick Wellnhofer
aa6aec19b0
parser: Fix xmlInputSetEncodingHandler again
...
Short-lived regression.
2024-07-11 12:42:13 +02:00
Nick Wellnhofer
8af55c8d20
parser: Rename new input API functions
...
These weren't made public yet.
2024-07-11 01:33:29 +02:00
Nick Wellnhofer
d74ca59491
parser: Rename internal xmlNewInput functions
2024-07-11 01:31:50 +02:00
Nick Wellnhofer
4f329dc524
parser: Implement xmlCtxtParseContent
...
This implements xmlCtxtParseContent, a better alternative to
xmlParseInNodeContext or xmlParseBalancedChunkMemory. It accepts a
parser context and a parser input, making it a lot more versatile.
xmlParseInNodeContext is now implemented in terms of
xmlCtxtParseContent. This makes sure that xmlParseInNodeContext never
modifies the target document, improving thread safety.
xmlParseInNodeContext is also more lenient now with regard to undeclared
entities.
Fixes #727.
2024-07-11 01:26:32 +02:00
Nick Wellnhofer
4fec0889e0
parser: Fix memory leak in xmlInputSetEncodingHandler
...
Short-lived regression.
2024-07-10 22:32:33 +02:00
Nick Wellnhofer
5935471732
parser: Fix malloc failure handling in xmlInputSetEncodingHandler
...
Don't set encoder if allocating buffer failed. This could lead to
xmlByteConsumed processing invalid UTF-8.
2024-07-09 14:11:28 +02:00
Nick Wellnhofer
ea31ac5bba
fuzz: Fix spaceMax
2024-07-07 04:19:09 +02:00
Nick Wellnhofer
29e3ab92f0
fuzz: Make reallocs more likely
2024-07-06 15:48:43 +02:00
Nick Wellnhofer
38195cf596
parser: Don't produce names with invalid UTF-8 in recovery mode
2024-07-06 15:33:06 +02:00
Nick Wellnhofer
ec0881099b
parser: Upgrade XML_IO_NETWORK_ATTEMPT to error
...
Fixes XML::LibXML test suite.
2024-07-04 15:47:20 +02:00
Nick Wellnhofer
fdfeecfe5e
parser: Reenable ctxt->directory
...
Unused internally, but used in downstream code.
Should fix #753.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
606f410891
parser: Allow to disable catalogs with parser options
...
Implement XML_PARSE_NO_SYS_CATALOG and XML_PARSE_NO_CATALOG_PI.
Fixes #735.
2024-07-02 22:06:53 +02:00
Nick Wellnhofer
197e09d5c5
parser: Fix xmlLoadResource
...
Short-lived regression.
2024-07-02 20:03:23 +02:00
Nick Wellnhofer
ede5d99af3
parser: Fix typo
2024-07-02 16:38:15 +02:00
Nick Wellnhofer
30ef77554b
parser: Don't use deprecated xmlCopyChar
2024-07-02 13:34:11 +02:00
Nick Wellnhofer
751ba00e00
parser: Don't use deprecated xmlSwitchInputEncoding
2024-07-02 13:34:04 +02:00
Nick Wellnhofer
9a4770ef84
doc: Improve documentation
2024-07-02 13:34:04 +02:00
Nick Wellnhofer
0b0dd98983
parser: Fix EBCDIC detection
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
221df37529
parser: Support custom charset conversion implementations
...
Implement xmlCtxtSetCharEncConvImpl. I agree that the name is terrible.
2024-07-01 18:05:40 +02:00
Nick Wellnhofer
e72eda101e
parser: Add NULL check in xmlNewIOInputStream
2024-06-29 01:22:02 +02:00
Nick Wellnhofer
bc793390d5
parser: Update documentation
2024-06-27 16:23:14 +02:00
Nick Wellnhofer
193f4653a5
parser: Implement xmlCtxtGetStatus
...
This allows access to ctxt->wellFormed, ctxt->nsWellFormed and
ctxt->valid. It also detects several fatal non-parser errors which
really should be another error level.
2024-06-27 15:17:40 +02:00
Nick Wellnhofer
cc0cc2d3b7
parser: Add more parser context accessors
2024-06-27 14:45:33 +02:00
Nick Wellnhofer
eca972e682
parser: Add getters for XML declaration to parser context
...
Access to struct members will be deprecated.
2024-06-27 14:44:49 +02:00
Nick Wellnhofer
3ff8a2c4b8
parser: Deprecate xmlIsLetter
2024-06-27 14:43:10 +02:00
Nick Wellnhofer
fa50be923b
parser: Move implementation of xmlCtxtGetLastError
2024-06-27 14:37:53 +02:00
Rosen Penev
217e9b7af2
clang-tidy: don't return in void functions
...
Found with readability-redundant-control-flow
Signed-off-by: Rosen Penev <rosenp@gmail.com>
2024-06-20 20:37:34 +00:00
Nick Wellnhofer
c5e9a5b2c9
parser: Use catalogs with resource loader
2024-06-17 15:49:25 +02:00
Nick Wellnhofer
6deebe036a
parser: Make xmlInputCreateUrl handle HTTP input
2024-06-17 15:47:43 +02:00
Nick Wellnhofer
d2fd9d37b0
parser: Fix swapped arguments
2024-06-17 15:47:43 +02:00
Nick Wellnhofer
2608baaf92
parser: Make failure to load main document a warning
...
Revert the change that made failures to load the main document an error.
This fixes the --path option of xmllint and xsltproc.
Should fix #733.
2024-06-14 20:06:07 +02:00
Nick Wellnhofer
dba1ed85a3
ftp: Remove FTP support
...
Remove the built-in FTP client. If you configure --with-legacy, old
symbols are retained for ABI compatibility.
2024-06-12 18:19:55 +02:00
Nick Wellnhofer
5238404325
parser: Pass resource type to resource loader
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
ab5e6debd1
parser: Introduce XML_INPUT_NETWORK input flag
...
This allows disabling network access when creating parser inputs with
xmlInputCreateUrl.
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
89fcae4dfd
parser: Don't report malloc failures when creating context
...
We don't want messages to stderr before an error handler could be set on
a parser context.
2024-06-12 16:36:12 +02:00
Nick Wellnhofer
64ad272525
parser: Introduce per-context resource loader
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
b9d2f3c911
parser: Introduce new input API
...
- xmlInputCreateUrl
- xmlInputCreateMemory
- xmlInputCreateString
- xmlInputCreateFd
- xmlInputCreateIO
- xmlInputSetEncoding
These functions don't take a parser context and work on xmlParserInputs,
replacing functions working on xmlParserInputBuffers.
xmlInputCreateUrl and xmlInputSetEncoding offer fine-grained error
handling.
Several XML_INPUT_* flags offer additional control.
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
410931e385
parser: Only set input ID for PE refs
...
Other input streams don't require IDs.
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
a3b2baeb67
parser: Simplify xmlNewInputFromFile
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
0b58838764
parser: Rework XML_PARSE_NONET handling
2024-06-12 16:22:52 +02:00
Nick Wellnhofer
ff3b091910
parser: Implement XML_PARSE_NO_UNZIP option
2024-06-12 16:14:15 +02:00