Nick Wellnhofer
0f4f89005d
parser: Rename inputPush to xmlCtxtPushInput
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
e2ad249c23
parser: Deprecate more internal symbols
...
- xmlParseExternalSubset
- xmlPushInput
- xmlPopInput
- xmlCopyCharMultiByte
- xmlCreateEntityParserCtxt
- xmlStringComment
2024-11-19 00:25:23 +01:00
Nick Wellnhofer
2fcdc5f7e7
globals: More comments on future directions
2024-11-19 00:08:39 +01:00
Nick Wellnhofer
4d1f35b0a9
valid: Deprecate more internal functions
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
de0c779116
fuzz: Switch to xmlCtxtValidateDocument
...
This allows checking malloc failure reports during post-validation.
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
5a51f08517
valid: Implement xmlCtxtValidateDocument
...
This allows using the error handler or resource loader of a parser
context.
2024-11-19 00:03:37 +01:00
Nick Wellnhofer
1e1731a43d
valid: Add NULL check in xmlCtxtValidateDtd
2024-11-17 13:20:06 +01:00
Nick Wellnhofer
631778f679
parser: Check for malloc failure in xmlCtxtParseDtd
2024-11-17 12:11:41 +01:00
Nick Wellnhofer
7f8c436c75
parser: Implement xmlCtxtParseDtd and xmlCtxtValidateDtd
...
This allows using the context's error handler, options and other
settings.
Fixes #808.
2024-11-15 16:30:52 +01:00
Nick Wellnhofer
764b8086d1
tests: Fix sanitizer version check on old Apple clang
...
See #669.
2024-11-13 20:22:32 +01:00
Nick Wellnhofer
b57e022d75
build: Check for icu-uc instead of icu-i18n
...
This should be the ICU component we actually need.
2024-11-13 19:10:45 +01:00
Ruslan Garipov
aaecdc92e2
parser: Assign value without if-statement
...
This removes an if-statement that effectively does nothing. For
example, the binary generated by GCC with -O2 optimization settings
does not contain that if-statement; the code just uses the
hprefix->name field directly.
No functional changes intended.
Signed-off-by: Ruslan Garipov <ruslanngaripov@gmail.com>
2024-11-12 16:42:36 +05:00
Nick Wellnhofer
1e4d8c55f0
xmlIO: Fix reading from non-regular files like pipes
...
Commit 7e14c05d removed unnecessary copying of uncompressed input
through zlib or xzlib. This broke input from non-regular files like
pipes which can't be reopened. Try to detect such files by checking
whether they're seekable and always pipe them through zlib or xzlib.
Also remove seemingly unnecessary calls to gzread and gzrewind to
support unseekable files.
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/124.
2024-11-06 16:49:53 +01:00
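The seekability check described in the xmlIO commit above can be illustrated with a minimal sketch. This is not the code from the commit: `fdIsSeekable` is a hypothetical helper name, and the sketch assumes a POSIX system where `lseek` fails on pipes, FIFOs and sockets.

```c
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not libxml2 API): returns 1 if the file
 * descriptor supports seeking, 0 otherwise.
 *
 * Seeking by 0 bytes from the current position leaves the offset
 * unchanged. On pipes, FIFOs and sockets the call fails (ESPIPE),
 * and on an invalid descriptor it fails with EBADF.
 */
static int
fdIsSeekable(int fd) {
    return lseek(fd, 0, SEEK_CUR) != (off_t) -1;
}
```

A caller could use such a check to decide whether an input can be reopened or rewound later, or whether it has to be consumed in a single pass.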
Nick Wellnhofer
459146140a
xpath: Fix parsing of non-ASCII names
...
Fix a long-standing issue where QNames starting with a non-ASCII
character would be rejected. This became more visible after "streaming"
XPath evaluation was disabled since the latter handled non-ASCII names
correctly.
Fixes #818.
2024-11-05 12:30:44 +01:00
Nick Wellnhofer
9201173c5a
xmlreader: Fix return value of xmlTextReaderReadString
...
Return NULL if the node has no children or the children were already
deleted to match the 2.12 behavior.
Fixes #817.
2024-11-05 11:41:28 +01:00
Nick Wellnhofer
869e3fd421
parser: Fix loading of parameter entities in external DTDs
...
Regressed with commit 12f0bb94.
Fixes #816.
2024-11-01 16:53:18 +01:00
Nick Wellnhofer
36117723d4
Update README
2024-10-31 17:52:42 +01:00
Nick Wellnhofer
467f444544
SAX2: Add NULL check for ctxt->myDoc
2024-10-30 14:13:38 +01:00
Nick Wellnhofer
efb57ddba3
parser: Fix downstream code that swaps DTDs
...
Downstream code like the nginx xslt module can change the document's DTD
pointers in a SAX callback. If an entity from a separate DTD is parsed
lazily, its content must not reference the current document.
Regressed with commit d025cfbb.
Fixes #815.
2024-10-30 14:13:38 +01:00
Nick Wellnhofer
0ec5687e06
parser: Rework xmlCtxtGrowAttrs
...
Remove unneeded argument.
Check for integer overflow. We probably hit the buffer size limit in
xmlParserGrow before, but better be safe.
2024-10-28 21:06:52 +01:00
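As an illustration of the kind of integer overflow check mentioned in the xmlCtxtGrowAttrs commit above, here is a minimal sketch of a doubling array growth routine. It is not the libxml2 implementation: `growArray`, its parameters and the initial capacity of 8 are assumptions made for this example.

```c
#include <limits.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not libxml2 API): doubles the capacity of a
 * heap array. Returns 0 on success, -1 on overflow or allocation
 * failure. On failure, *items and *capacity are left unchanged.
 */
static int
growArray(void **items, int *capacity, size_t itemSize) {
    int newCap;
    void *tmp;

    if (itemSize == 0)
        return -1;

    if (*capacity <= 0)
        newCap = 8;                     /* assumed initial capacity */
    else if (*capacity > INT_MAX / 2)
        return -1;                      /* doubling would overflow int */
    else
        newCap = *capacity * 2;

    /* The byte count must also fit into size_t. */
    if ((size_t) newCap > SIZE_MAX / itemSize)
        return -1;

    tmp = realloc(*items, (size_t) newCap * itemSize);
    if (tmp == NULL)
        return -1;

    *items = tmp;
    *capacity = newCap;
    return 0;
}
```

Both checks run before `realloc` is called, so an oversized request is rejected without touching the existing buffer.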
Nick Wellnhofer
ffb058f484
parser: Fix detection of duplicate attributes
...
We really need a second scan if more than one namespace clash was
detected.
2024-10-28 20:26:55 +01:00
Nick Wellnhofer
89b9f45711
entities: Allow control chars when serializing HTML
2024-10-25 18:02:58 +02:00
Nick Wellnhofer
b52a3044aa
parser: Use counted_by attribute if supported
...
We only have a single struct with a flexible array member.
2024-10-24 18:18:47 +02:00
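For readers unfamiliar with the attribute named in the counted_by commit above, the following sketch shows one common way to apply it conditionally to a flexible array member. The macro `EXAMPLE_COUNTED_BY`, the struct `ExampleVec` and the function `exampleVecNew` are hypothetical names for this example and are not taken from libxml2.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Older compilers don't provide __has_attribute at all. */
#ifndef __has_attribute
  #define __has_attribute(x) 0
#endif

/* Expand to the attribute only if the compiler supports it. */
#if __has_attribute(counted_by)
  #define EXAMPLE_COUNTED_BY(field) __attribute__((counted_by(field)))
#else
  #define EXAMPLE_COUNTED_BY(field)
#endif

/* Hypothetical struct with a flexible array member. */
typedef struct {
    size_t count;
    int values[] EXAMPLE_COUNTED_BY(count);
} ExampleVec;

/* Allocates a zero-filled vector with `count` elements, or NULL. */
static ExampleVec *
exampleVecNew(size_t count) {
    ExampleVec *vec;
    size_t i;

    /* Reject sizes that would overflow the allocation size. */
    if (count > (SIZE_MAX - sizeof(ExampleVec)) / sizeof(int))
        return NULL;

    vec = malloc(sizeof(ExampleVec) + count * sizeof(int));
    if (vec == NULL)
        return NULL;

    /* Set the count before touching the array it describes. */
    vec->count = count;
    for (i = 0; i < count; i++)
        vec->values[i] = 0;

    return vec;
}
```

Where the attribute is supported, the compiler and sanitizers can use the `count` field for bounds checking of `values`. Elsewhere the macro expands to nothing and the struct behaves as usual.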
Nick Wellnhofer
944e5fe8df
nanohttp: Fix another stdout file descriptor
2024-10-23 16:46:03 +02:00
Nick Wellnhofer
607ada90b8
nanohttp: Fix stdout file descriptor
...
Fixes #813.
2024-10-23 14:19:01 +02:00
Nick Wellnhofer
b7c0f9d2dd
string: Fix va_copy fallback
...
Fix va_copy fallback reworked in 5cffba83.
Should fix #812.
2024-10-19 14:53:25 +02:00
Nick Wellnhofer
a870088f94
xpath: Hide internal sort functions
2024-10-19 14:53:25 +02:00
Yegor Yefremov
513949293d
python/tests: fix typos
...
Typos were found with codespell.
2024-10-15 11:11:38 +02:00
Nick Wellnhofer
f9a6469a47
Update NEWS
2024-10-14 16:15:11 +02:00
Satadru Pramanik
c7b2786676
Avoid Python error "'licence' distribution option is deprecated; use 'license'"
2024-10-12 11:55:50 +00:00
Nick Wellnhofer
bf3619c328
fuzz: Don't unlink DTD when replacing nodes
...
OP_XML_REPLACE_NODE needs the same check as OP_XML_UNLINK_NODE.
2024-10-10 12:14:47 +02:00
Nick Wellnhofer
a4c16a140c
xmllint: Improve --memory and --testIO options
...
Support --memory and --testIO in SAX mode.
Keep memory-mapped file across repetitions.
Options `--sax --memory --noout --repeat` can now be used to benchmark
the core parser without building a DOM tree or repeatedly reading files
from disk.
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
3ac214f01e
xmllint: Support --html --sax
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
225ed70737
html: Accelerate htmlParseCharData
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
74dfc49b5f
parser: Clarify logic in xmlParseStartTag2
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
207999793f
html: Handle numeric character references directly
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0bc4608c50
html: Use hash table to check for duplicate attributes
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
24a6149fc4
html: Make sure that character data mode is reset
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
c32397d51f
html: Improve character class macros
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
e840655414
html: Rewrite parsing of most data
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
f77ec16db0
html: Optimize htmlParseCharData
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
440bd64c69
html: Optimize htmlParseHTMLName
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
c34d0ae9cc
html: Deprecate htmlIsBooleanAttr
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
6040785ac4
html: Deprecate AutoClose API
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
188cad68a4
html: Remove obsolete content model
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0144f662d7
html: Remove obsolete code
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
0ce7bfe559
html: Try to avoid passing XML options to HTML parser
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
76cc63942a
test: Fix XML_PARSE_HTML constant
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
575be6c1f1
html: Fix line numbers with CRs
2024-10-06 20:04:00 +02:00
Nick Wellnhofer
be874d7831
html: Ignore unexpected DOCTYPE declarations
2024-10-06 20:04:00 +02:00