libxml2

c/libxml2

mirror of https://gitlab.gnome.org/GNOME/libxml2 synced 2025-03-28 21:33:13 +00:00

Author	SHA1	Message	Date
Nick Wellnhofer	320f5084cd	parser: Improve handling of encoding and IO errors Make sure that xmlCharEncInput, xmlParserInputBufferPush and xmlParserInputBufferGrow set the correct error code in the xmlParserInputBuffer. Handle errors when calling these functions.	2023-04-30 21:31:54 +02:00
Nick Wellnhofer	1061537efd	malloc-fail: Fix buffer overread with HTML doctype declarations Found by OSS-Fuzz, see #344.	2023-03-26 22:42:13 +02:00
Nick Wellnhofer	7fbd454d9f	parser: Grow input buffer earlier when reading characters Make more bytes available after invoking CUR_CHAR or NEXT.	2023-03-21 21:35:53 +01:00
Nick Wellnhofer	04d1bedd8c	parser: Rework shrinking of input buffers Don't try to grow the input buffer in xmlParserShrink. This makes sure that no memory allocations are made and the function always succeeds. Remove unnecessary invocations of SHRINK. Invoke SHRINK at the end of DTD parsing loops. Shrink before growing.	2023-03-21 13:19:18 +01:00
Nick Wellnhofer	44ecefc8cc	malloc-fail: Fix buffer overread after htmlParseScript Found by OSS-Fuzz, see #344.	2023-03-20 15:53:42 +01:00
Nick Wellnhofer	067986fa67	parser: Fix regressions from previous commits - Fix memory leak in xmlParseNmtoken. - Fix buffer overread after htmlParseCharDataInternal.	2023-03-18 16:51:40 +01:00
Nick Wellnhofer	9ef2a9abf3	html: Rely on CUR_CHAR to grow the input buffer - Remove useless invocations of GROW. - Add some error checks. - Fix invocations of SHRINK.	2023-03-17 14:14:04 +01:00
Nick Wellnhofer	62f199ed7d	malloc-fail: Add error check in htmlParseHTMLAttribute This function must return NULL if an error occurs. Found by OSS-Fuzz, see #344.	2023-03-17 12:40:46 +01:00
Nick Wellnhofer	8090e58564	malloc-fail: Fix buffer overread in htmlParseScript Found by OSS-Fuzz, see #344.	2023-03-17 12:27:07 +01:00
Nick Wellnhofer	ca2bfecea9	malloc-fail: Fix buffer overread when reading from input Found by OSS-Fuzz, see #344.	2023-03-15 17:34:32 +01:00
Nick Wellnhofer	4b3452d171	html: Fix quadratic behavior in htmlParseTryOrFinish Fix check for end of script content. Found by OSS-Fuzz.	2023-03-15 17:02:46 +01:00
Nick Wellnhofer	14c62e0dd3	html: Use NEXTL in htmlParseHTMLAttribute This is more efficient than NEXT.	2023-03-15 17:02:46 +01:00
Nick Wellnhofer	2099441f32	parser: Stop calling xmlParserInputShrink Introduce xmlParserShrink which takes a parser context to simplify error handling.	2023-03-13 17:51:13 +01:00
Nick Wellnhofer	cabde70f8b	parser: Simplify calculation of available buffer space	2023-03-12 19:07:23 +01:00
Nick Wellnhofer	b75976e029	parser: Use size_t when subtracting input buffer pointers Avoid integer overflows.	2023-03-12 19:06:19 +01:00
Nick Wellnhofer	9a6ca81612	parser: Check for integer overflow when updating checkIndex Unfortunately, checkIndex is a long, not a size_t. Check for integer overflow before updating the value.	2023-03-12 19:03:11 +01:00
Nick Wellnhofer	bd63d730b8	html: Impose some length limits Impose length limits on names, attribute values, PIs and comments, similar to the XML parser.	2023-03-12 17:40:55 +01:00
Nick Wellnhofer	3eb6bf0386	parser: Stop calling xmlParserInputGrow Introduce xmlParserGrow which takes a parser context to simplify error handling.	2023-03-12 17:05:51 +01:00
Nick Wellnhofer	53d1cc98cf	malloc-fail: Fix error code in htmlParseChunk Found with libFuzzer, see #344.	2023-02-17 17:18:51 +01:00
Nick Wellnhofer	15b0ed0815	malloc-fail: Fix infinite loop in htmlParseDocTypeDecl Found with libFuzzer, see #344.	2023-02-17 17:18:47 +01:00
Nick Wellnhofer	041789d9ec	malloc-fail: Fix null deref in htmlnamePush Found with libFuzzer, see #344.	2023-02-17 17:18:43 +01:00
Nick Wellnhofer	0ec9c91064	malloc-fail: Fix infinite loop in htmlParseStartTag Found with libFuzzer, see #344.	2023-02-17 17:18:38 +01:00
Nick Wellnhofer	04c2955197	malloc-fail: Fix infinite loop in htmlParseContentInternal Found with libFuzzer, see #344.	2023-02-17 17:18:34 +01:00
Nick Wellnhofer	f3e62035d8	malloc-fail: Fix memory leak in htmlCreatePushParserCtxt Found with libFuzzer, see #344.	2023-02-17 17:18:29 +01:00
Nick Wellnhofer	fc256953d2	malloc-fail: Fix memory leak in htmlCreateMemoryParserCtxt Found with libFuzzer, see #344.	2023-02-17 17:18:25 +01:00
Nick Wellnhofer	643b4e90eb	malloc-fail: Fix infinite loop in htmlParseStartTag Found with libFuzzer, see #344.	2023-02-17 17:16:52 +01:00
Nick Wellnhofer	59b3366178	error: Limit number of parser errors Reporting errors is expensive and some abusive test cases can generate an error for each invalid input byte. This causes the parser to spend most of the time with error handling. Limit the number of errors and warnings to 100.	2022-12-27 14:41:19 +01:00
Alex Richardson	4b959ee168	Remove hacky heuristic from b2dc5675e94aa6b5557ba63f7d66b0f08dd17e4d Checking whether the context is close to the parent context by hardcoding 250 is not portable (I noticed tests were failing on Morello since the value is 288 there due to pointers being 128 bits). Instead we should ensure that the XML_VCTXT_USE_PCTXT flag is not set in cases where the user data is not actually a parser context (or ideally add a separate field, but that would be an ABI break). From what I can see in the source, the XML_VCTXT_USE_PCTXT flag is only set if the userData field points to a valid context, and if this is not the case the flag should be cleared when changing userData rather than relying on the offset between the two. Looking at the history, I think d7cb33cf44aa688f24215c9cd398c1a26f0d25ff fixed most of the need for this workaround, but it looks like there are a few more locations that need updating. This commit changes two more places to set/clear/copy the XML_VCTXT_USE_PCTXT flag, so this heuristic should not be needed anymore. I've also dropped two = NULL assignments in xmllint since they are not needed after a call to memset(). There was also an uninitialized vctxt.flags (and other fields) in `xmlShellValidate()`, which I've fixed by adding a memset() call.	2022-12-01 15:31:25 +00:00
Alex Richardson	c715ded086	Avoid creating an out-of-bounds pointer by rewriting a check Creating more than one-past-the-end pointers is undefined behaviour in C and while this code is unlikely to be miscompiled, I discovered that an out-of-bounds pointer is being created using UBSan on a CHERI-enabled system.	2022-12-01 15:30:12 +00:00
Nick Wellnhofer	c7a9b85cbb	html: Improve parsing of nested lists Allow ul/ol as immediate children of ul/ol. This is more in line with the HTML5 spec. Fixes #447.	2022-11-30 17:11:33 +01:00
Nick Wellnhofer	e414f82585	html: Fix htmlInitAutoClose documentation	2022-11-27 02:11:07 +01:00
Nick Wellnhofer	c93679381c	html: Fix check for end of comment in push parser Make sure to reset checkIndex. Handle case where "--" or "--!" is at the end of the buffer. Fix "avail" check in htmlParseTryOrFinish.	2022-11-20 21:27:59 +01:00
Nick Wellnhofer	68a6518c45	parser: Rewrite push parser boundary checks Remove inaccurate xmlParseCheckTransition check. Remove non-incremental xmlParseGetLasts check. Add functions that check for several boundary constructs more accurately, keeping track of progress in ctxt->checkIndex. Fixes #439.	2022-11-20 21:27:08 +01:00
Nick Wellnhofer	6843fc726f	Remove or annotate char casts	2022-09-01 04:31:30 +02:00
Nick Wellnhofer	2cac626976	Don't use sizeof(xmlChar) or sizeof(char)	2022-09-01 03:35:19 +02:00
Nick Wellnhofer	ad338ca737	Remove explicit integer casts Remove explicit integer casts as final operation - in assignments - when passing arguments - when returning values Remove casts - to the same type - from certain range-bound values The main motivation is that these explicit casts don't change the result of operations and only render UBSan's implicit-conversion checks useless. Removing these casts allows UBSan to detect cases where truncation or sign-changes occur unexpectedly. Document some explicit casts as truncating and add a few missing ones.	2022-09-01 02:33:57 +02:00
Nick Wellnhofer	65dc8a63ac	Make xmlNewSAXParserCtxt take a const SAX handler Also improve documentation.	2022-09-01 00:17:45 +02:00
Nick Wellnhofer	0f568c0b73	Consolidate private header files Private functions were previously declared - in header files in the root directory - in public headers guarded with IN_LIBXML - in libxml.h - redundantly in source files that used them. Consolidate all private header files in include/private.	2022-08-26 02:11:56 +02:00
Nick Wellnhofer	58fc89e8a9	Deprecate internal parser functions	2022-08-25 21:04:57 +02:00
Nick Wellnhofer	a308c0cdf7	Deprecate old HTML SAX API	2022-08-25 21:04:57 +02:00
Nick Wellnhofer	9a82b94a94	Introduce xmlNewSAXParserCtxt and htmlNewSAXParserCtxt Add API functions to create a parser context with a custom SAX handler without having to mess with ctxt->sax manually.	2022-08-24 14:07:55 +02:00
Nick Wellnhofer	0a04db19fc	Don't mess with parser options in htmlParseDocument Don't set ctxt->html. This member should already be initialized. Set ctxt->linenumbers in htmlCtxtUseOptions like the XML parser does.	2022-08-24 14:06:00 +02:00
Nick Wellnhofer	d45263a262	Remove useless call to htmlDefaultSAXHandlerInit This function is already called from xmlInitParser.	2022-08-24 14:04:35 +02:00
Nick Wellnhofer	4b184240be	Remove htmlDefaultSAXHandler from non-SAX1 build This matches long-standing behavior of the XML counterpart.	2022-08-22 14:24:25 +02:00
Nick Wellnhofer	80bd34c3c6	Don't initialize SAX handler in htmlReadMemory The SAX handler is already initialized when creating the parser context.	2022-08-22 14:06:37 +02:00
Nick Wellnhofer	37cedc0b15	Fix htmlReadMemory mixing up XML and HTML functions Also see fe6890e2.	2022-08-22 14:04:07 +02:00
Nick Wellnhofer	920753c4aa	Don't use default SAX handler to report unrelated errors	2022-08-22 13:48:59 +02:00
Nick Wellnhofer	38f04779f7	Fix HTML parser with threads and --without-legacy If the legacy functions are disabled, the default "V1" HTML SAX handler isn't initialized in threads other than the main thread. htmlInitParserCtxt would later use the empty V1 SAX handler, resulting in NULL documents. Change htmlInitParserCtxt to initialize the HTML SAX handler by calling xmlSAX2InitHtmlDefaultSAXHandler. This removes the ability to change the default handler but is more in line with the XML parser which initializes the SAX handler by calling xmlSAXVersion, ignoring the V1 default handler. Fixes #399.	2022-08-22 13:48:59 +02:00
Nick Wellnhofer	5b2d07a726	Use xmlStrlen in *CtxtReadDoc xmlStrlen handles buffers larger than INT_MAX more gracefully.	2022-08-20 17:00:50 +02:00
Nick Wellnhofer	4ad71c2d72	Fix xmlCtxtReadDoc with encoding xmlCtxtReadDoc used to create an input stream involving xmlNewStringInputStream. This would create a stream without an input buffer, causing problems with encodings (see #34). After commit aab584dc3, an error was returned even with UTF-8 encodings which happened to work before. Make xmlCtxtReadDoc call xmlCtxtReadMemory which doesn't suffer from these issues. Also fix htmlCtxtReadDoc. Fixes #397.	2022-08-20 16:34:08 +02:00
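
Commit c715ded086 above rewrites a bounds check so that no pointer beyond one-past-the-end is ever formed. The following is only a minimal sketch of that general technique, not the code from the commit: the helper name has_room and its parameters are hypothetical, and it assumes cur and end point into the same buffer with cur <= end.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a libxml2 function): report whether `len`
 * more bytes are available between `cur` and `end`.
 *
 * The tempting form `cur + len <= end` computes `cur + len` first,
 * which is undefined behaviour in C once the result lies more than
 * one element past the end of the buffer. Comparing lengths instead
 * only subtracts two pointers that are both within bounds.
 */
static int
has_room(const char *cur, const char *end, size_t len) {
    assert(cur <= end);                 /* precondition assumed by this sketch */
    return len <= (size_t) (end - cur); /* no out-of-bounds pointer is created */
}
```

The subtraction is safe because both operands already lie inside the buffer or exactly one past its end, and the cast to size_t is safe because the difference is non-negative under the stated precondition.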

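Commit 9a6ca81612 above notes that checkIndex is a long rather than a size_t, so an overflow check is needed before the value is updated. A minimal sketch of such a check follows; the helper update_check_index and its return convention are assumptions made for illustration and are not taken from libxml2.

```c
#include <limits.h>
#include <stddef.h>

/*
 * Hypothetical helper (not a libxml2 function): store a size_t offset
 * in a long field. Returns 0 on success, or -1 and leaves the field
 * unchanged when the offset cannot be represented as a long.
 */
static int
update_check_index(long *checkIndex, size_t offset) {
    if (offset > (size_t) LONG_MAX)
        return -1;              /* conversion would overflow */
    *checkIndex = (long) offset;
    return 0;
}
```

The check matters most on platforms where long is narrower than size_t, such as 64-bit Windows, where an unchecked conversion could silently produce a negative index.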

444 Commits