Commit 7e14c05d removed unnecessary copying of uncompressed input
through zlib or xzlib. This broke input from non-regular files like
pipes which can't be reopened. Try to detect such files by checking
whether they're seekable and always pipe them through zlib or xzlib.
Also remove seemingly unnecessary calls to gzread and gzrewind to
support unseekable files.
Fixes https://gitlab.gnome.org/GNOME/libxslt/-/issues/124.
When push parsing, we want to convert as much of the input as possible.
When pull parsing memory buffers, we want to convert data chunk by chunk
to save memory.
Always use what the old implementation called the "IO" allocation
scheme, allowing to move the content pointer past the initial
allocation. This is inexpensive and allows efficient shrinking.
Optimize xmlBufGrow, reusing shrunken memory as much as possible.
Simplify xmlBufAdd.
Make xmlBufBackToBuffer return an error on overflow.
Make "size" exclude the terminating NULL byte.
Always provide an initial size.
Reintroduce static buffers.
Remove xmlBufResize and several other functions.
Stop using xmlOutputBufferWriteEscape except when using deprecated
xmlSaveSetEscape. Rewrite xmlOutputBufferWriteEscape to use an extra
buffer and call xmlOutputBufferWrite.
Introduce xmlSerializeText to serialize both text and attribute content.
Don't read encoding from document when serializing and remove all hacks
that temporarily changed the document's encoding.
Unless we are on Windows, the following POSIX headers are required.
They're part of the earliest POSIX specs and it doesn't make sense to
check for them.
- fcntl.h
- unistd.h
- sys/stat.h
- sys/time.h
On Windows, io.h, fcntl.h and sys/stat.h are always available.
- xmlInputCreateUrl
- xmlInputCreateMemory
- xmlInputCreateString
- xmlInputCreateFd
- xmlInputCreateIO
- xmlInputSetEncoding
These functions don't take a parser context and work on xmlParserInputs,
replacing functions working on xmlParserInputBuffers.
xmlInputCreateUrl and xmlInputSetEncoding offer fine-grained error
handling.
Several XML_INPUT_* flags offer additional control.
Provide a new set of functions to create xmlParserInputs. These can be
used for the document entity or from external entity loaders.
- Don't require xmlParserInputBuffer.
- All functions take a base URI.
- All functions take an encoding as string.
- xmlNewInputURL also takes a public ID.
- xmlNewInputMemory takes a size_t.
- Optimization hints for memory buffers.
Improve documentation.
Only call xmlInitParser before allocating a new parser context.
Call xmlCtxtUseOptions as early as possible.
Many strings are passed to the library that could be either URIs or
filesystem paths. We now assume that strings are a URI if they contain
the substring "://". This means that they have a scheme and an
authority. Otherwise, URI resolution wouldn't make much sense.
Fix xmlBuildURI to work with filesystem paths. If the base URI doesn't
contain "://" it is treated as filename. The resolved URI is unescaped,
appended and the result is normalized. Rewrite xmlNormalizePath to
handle Windows quirks.
All special handling for Windows paths is removed in xmlCanonicPath.
If the path looks like an URI, only escape characters allowed in Legacy
Extended IRIs.
Make xmlPathToURI only call xmlCanonicPath. Theh additional round-trip
through URI parser and serializer seems useless.
Add a helper function xmlConvertUriToPath in xmlIO.c which checks for
file URIs and unescapes them.
Always process strings with xmlCanonicPath in xmlLoadExternalEntity.
This should be harmless now.
Should help with #334, #387, #611.
To implement this feature on such a low level is a disaster waiting to
happen. Remove these checks from the IO code and move them to xmllint.
Note that the serialization API will still treat "-" as stdout.