20 Commits

Author SHA1 Message Date
cd1406e158 Update docos.
Fix static library build commands - remove archive before building it fresh.
2024-01-24 11:03:58 -05:00
59deee020a Fix some Clang warnings. 2024-01-24 10:58:11 -05:00
476013706e Prep for 1.2.0 release, bump copyright. 2024-01-24 10:53:53 -05:00
a43a9d9e32 Fix whitespace. 2023-12-18 10:04:24 -05:00
abc69b3361 Save work. 2023-12-18 10:04:00 -05:00
83bfb135c6 Add some more debug printfs, relocate extra newline detection after stream
token.
2023-12-15 12:57:31 -05:00
2dfb560f8b Add more debug logging. 2023-12-14 17:05:10 -05:00
7330cc35ba Defer object/value decryption to after the object is loaded (Issue #42) 2023-12-14 16:02:26 -05:00
5d760e7315 Update some debug printfs. 2023-12-13 12:48:31 -05:00
2a85baaf81 Increase the maximum number of object streams in a file (Issue #58) - most files
only contain 1 or 2...

Change the implementation of add/find object to use a custom binary insertion
sort algorithm rather than doing a qsort after every addition.  This results in
a significant improvement in open speed - from 2371 seconds (about 39.5 minutes)
to 3.1 seconds for one large test file (an ESRI standard).
2023-12-13 12:26:25 -05:00
2b92044504 Support per-object file IDs (Issue #42) 2023-12-12 21:48:58 -05:00
f4aa951165 Fix _pdfioFileSeek with whence==SEEK_CUR
Fix seek offset after trailer.

Look at the last 1k of the file to find the startxref marker.
2023-12-12 12:24:49 -05:00
038fd8686b Fix trailer dictionary handling (Issue #58)
Fix generation number handling for object 0 (Issue #59)
2023-12-11 19:56:00 -05:00
7084105dc4 Merge pull request #57 from eli-schwartz/pdfio-pc-redundancy
pdfio.pc: use -lm as specified in configure
2023-12-10 19:35:37 -05:00
9f06f22281 pdfio.pc: use -lm as specified in configure
It is already configured in, in the correct place. Currently, it is
listed twice in Libs.private, if --enable-shared is used. And it is
redundant if the build is static instead, since it is recorded in Libs.
2023-12-10 16:32:52 -05:00
cb6b493df6 Update configure script. 2023-12-10 15:38:35 -05:00
2753a82eb9 Merge pull request #56 from eli-schwartz/misspelled
fix misspelled variable: PKCONFIG
2023-12-10 15:38:12 -05:00
ddb8ddff9c fix misspelled variable: PKCONFIG
This prevented using pkg-config for zlib lookup.
2023-12-10 01:39:23 -05:00
c992b2ba89 Update the token reading code to protect against obvious format abuses.
Update the xref loading code to protect against looping xref tables.
2023-12-07 17:50:52 -05:00
ed723a46dc Make sure buffer is terminated on error. 2023-12-06 11:21:33 -05:00
25 changed files with 612 additions and 208 deletions

View File

@ -2,16 +2,28 @@ Changes in PDFio
================
v1.2.0 (Month DD, YYYY)
-----------------------
v1.2.0 (January 24, 2024)
-------------------------
- Now use autoconf to configure the PDFio sources (Issue #54)
- Added `pdfioFileCreateNumberObj` and `pdfioFileCreateStringObj` functions
(Issue #14)
- Added `pdfioContentTextMeasure` function (Issue #17)
- Added `pdfioContentTextNewLineShow` and `pdfioContentTextNewLineShowf`
functions (Issue #24)
- Renamed `pdfioContentTextNextLine` to `pdfioContentTextNewLine`.
- Now use autoconf to configure the PDFio sources (Issue #54)
- Updated the maximum number of object streams in a single file from 4096 to
8192 (Issue #58)
- Updated the token reading code to protect against some obvious abuses of the
PDF format.
- Updated the xref reading code to protect against loops.
- Updated the object handling code to use a binary insertion algorithm -
provides a significant (~800x) improvement in open times.
- Fixed handling of encrypted PDFs with per-object file IDs (Issue #42)
- Fixed handling of of trailer dictionaries that started immediately after the
"trailer" keyword (Issue #58)
- Fixed handling of invalid, but common, PDF files with a generation number of
65536 in the xref table (Issue #59)
v1.1.4 (December 3, 2023)

View File

@ -172,6 +172,7 @@ valgrind: testpdfio
# pdfio library
libpdfio.a: $(LIBOBJS)
echo Archiving $@...
$(RM) $@
$(AR) $(ARFLAGS) $@ $(LIBOBJS)
$(RANLIB) $@
@ -226,7 +227,7 @@ ttf.o: ttf.h
# Make documentation using Codedoc <https://www.msweet.org/codedoc>
DOCFLAGS = \
--author "Michael R Sweet" \
--copyright "Copyright (c) 2021-2023 by Michael R Sweet" \
--copyright "Copyright (c) 2021-2024 by Michael R Sweet" \
--docversion $(PDFIO_VERSION)
.PHONY: doc

2
NOTICE
View File

@ -1,6 +1,6 @@
PDFio - PDF Read/Write Library
Copyright © 2021-2023 by Michael R Sweet.
Copyright © 2021-2024 by Michael R Sweet.
(Optional) Exceptions to the Apache 2.0 License:
================================================

View File

@ -89,7 +89,7 @@ generates a static library that will be installed under "/usr/local" with:
Legal Stuff
-----------
PDFio is Copyright © 2021-2023 by Michael R Sweet.
PDFio is Copyright © 2021-2024 by Michael R Sweet.
This software is licensed under the Apache License Version 2.0 with an
(optional) exception to allow linking against GPL2/LGPL2 software. See the

View File

@ -54,7 +54,7 @@ starting at 0. A feature release has a "PATCH" value of 0, for example:
1.1.0
2.0.0
Beta releases and release candidates are *not* prodution releases and use
Beta releases and release candidates are *not* production releases and use
semantic version numbers of the form:
MAJOR.MINORbNUMBER

2
configure vendored
View File

@ -4024,7 +4024,7 @@ then :
printf "%s\n" "#define STDC_HEADERS 1" >>confdefs.h
fi
if $PKCONFIG --exists zlib
if $PKGCONFIG --exists zlib
then :
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: yes" >&5

View File

@ -1,7 +1,7 @@
dnl
dnl Configuration script for PDFio
dnl
dnl Copyright © 2023 by Michael R Sweet
dnl Copyright © 2023-2024 by Michael R Sweet
dnl
dnl Licensed under Apache License v2.0. See the file "LICENSE" for more
dnl information.
@ -103,10 +103,10 @@ AC_SUBST([PKGCONFIG_REQUIRES])
dnl ZLIB
AC_MSG_CHECKING([for zlib via pkg-config])
AS_IF([$PKCONFIG --exists zlib], [
AS_IF([$PKGCONFIG --exists zlib], [
AC_MSG_RESULT([yes])
LIBS="$($PKGCONFIG --libs zlib) $LIBS"
CPPFLAGS="$($PKGCONFIG --cflags zlib) $CPPFLAGS"
LIBS="$($PKGCONFIG --libs zlib) $LIBS"
],[
AC_MSG_RESULT([no])
AC_CHECK_HEADER([zlib.h])
@ -116,6 +116,7 @@ AS_IF([$PKCONFIG --exists zlib], [
AC_MSG_ERROR([Sorry, this software requires zlib 1.1 or higher.])
])
PKGCONFIG_REQUIRES=""
PKGCONFIG_LIBS_PRIVATE="-lz $PKGCONFIG_LIBS_PRIVATE"
])

View File

@ -1,4 +1,4 @@
.TH pdfio 3 "pdf read/write library" "2023-12-05" "pdf read/write library"
.TH pdfio 3 "pdf read/write library" "2024-01-24" "pdf read/write library"
.SH NAME
pdfio \- pdf read/write library
.SH Introduction
@ -34,7 +34,7 @@ PDFio is
.I not
concerned with rendering or viewing a PDF file, although a PDF RIP or viewer could be written using it.
.PP
PDFio is Copyright \[co] 2021\-2023 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
PDFio is Copyright \[co] 2021\-2024 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
.SS Requirements
.PP
PDFio requires the following to build the software:
@ -3087,4 +3087,4 @@ typedef enum pdfio_valtype_e pdfio_valtype_t;
Michael R Sweet
.SH COPYRIGHT
.PP
Copyright (c) 2021-2023 by Michael R Sweet
Copyright (c) 2021-2024 by Michael R Sweet

View File

@ -6,7 +6,7 @@
<meta name="generator" content="codedoc v3.7">
<meta name="author" content="Michael R Sweet">
<meta name="language" content="en-US">
<meta name="copyright" content="Copyright © 2021-2023 by Michael R Sweet">
<meta name="copyright" content="Copyright © 2021-2024 by Michael R Sweet">
<meta name="version" content="1.2.0">
<style type="text/css"><!--
body {
@ -247,7 +247,7 @@ span.string {
<p><img class="title" src="pdfio-512.png"></p>
<h1 class="title">PDFio Programming Manual v1.2.0</h1>
<p>Michael R Sweet</p>
<p>Copyright © 2021-2023 by Michael R Sweet</p>
<p>Copyright © 2021-2024 by Michael R Sweet</p>
</div>
<div class="contents">
<h2 class="title">Contents</h2>
@ -501,7 +501,7 @@ span.string {
</li>
</ul>
<p>PDFio is <em>not</em> concerned with rendering or viewing a PDF file, although a PDF RIP or viewer could be written using it.</p>
<p>PDFio is Copyright © 2021-2023 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files &quot;LICENSE&quot; and &quot;NOTICE&quot; for more information.</p>
<p>PDFio is Copyright © 2021-2024 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files &quot;LICENSE&quot; and &quot;NOTICE&quot; for more information.</p>
<h3 class="title" id="requirements">Requirements</h3>
<p>PDFio requires the following to build the software:</p>
<ul>

View File

@ -15,7 +15,7 @@ goals of pdfio are:
PDFio is *not* concerned with rendering or viewing a PDF file, although a PDF
RIP or viewer could be written using it.
PDFio is Copyright © 2021-2023 by Michael R Sweet and is licensed under the
PDFio is Copyright © 2021-2024 by Michael R Sweet and is licensed under the
Apache License Version 2.0 with an (optional) exception to allow linking against
GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.

View File

@ -327,6 +327,30 @@ pdfioArrayCreate(pdfio_file_t *pdf) // I - PDF file
}
//
// '_pdfioArrayDecrypt()' - Decrypt values in an array.
//
bool // O - `true` on success, `false` on error
_pdfioArrayDecrypt(pdfio_file_t *pdf, // I - PDF file
pdfio_obj_t *obj, // I - Object
pdfio_array_t *a, // I - Array
size_t depth) // I - Depth
{
size_t i; // Looping var
_pdfio_value_t *v; // Current value
for (i = a->num_values, v = a->values; i > 0; i --, v ++)
{
if (!_pdfioValueDecrypt(pdf, obj, v, depth))
return (false);
}
return (true);
}
//
// '_pdfioArrayDebug()' - Print the contents of an array.
//

View File

@ -141,7 +141,7 @@ _pdfioFileGets(pdfio_file_t *pdf, // I - PDF file
*bufend = buffer + bufsize - 1; // Pointer to end of buffer
PDFIO_DEBUG("_pdfioFileGets(pdf=%p, buffer=%p, bufsize=%lu) bufpos=%ld, buffer=%p, bufptr=%p, bufend=%p\n", pdf, buffer, (unsigned long)bufsize, (long)pdf->bufpos, pdf->buffer, pdf->bufptr, pdf->bufend);
PDFIO_DEBUG("_pdfioFileGets(pdf=%p, buffer=%p, bufsize=%lu) bufpos=%ld, buffer=%p, bufptr=%p, bufend=%p, offset=%lu\n", pdf, buffer, (unsigned long)bufsize, (long)pdf->bufpos, pdf->buffer, pdf->bufptr, pdf->bufend, (unsigned long)(pdf->bufpos + (pdf->bufptr - pdf->buffer)));
while (!eol)
{
@ -356,12 +356,12 @@ _pdfioFileSeek(pdfio_file_t *pdf, // I - PDF file
off_t offset, // I - Offset
int whence) // I - Offset base
{
PDFIO_DEBUG("_pdfioFileSeek(pdf=%p, offset=%ld, whence=%d)\n", pdf, (long)offset, whence);
PDFIO_DEBUG("_pdfioFileSeek(pdf=%p, offset=%ld, whence=%d) pdf->bufpos=%lu\n", pdf, (long)offset, whence, (unsigned long)(pdf ? pdf->bufpos : 0));
// Adjust offset for relative seeks...
if (whence == SEEK_CUR)
{
offset += pdf->bufpos;
offset += pdf->bufpos + (pdf->bufptr - pdf->buffer);
whence = SEEK_SET;
}
@ -404,7 +404,7 @@ _pdfioFileSeek(pdfio_file_t *pdf, // I - PDF file
return (-1);
}
PDFIO_DEBUG("_pdfioFileSeek: Reset bufpos=%ld.\n", (long)pdf->bufpos);
PDFIO_DEBUG("_pdfioFileSeek: Reset bufpos=%ld, offset=%lu.\n", (long)pdf->bufpos, (unsigned long)offset);
PDFIO_DEBUG("_pdfioFileSeek: buffer=%p, bufptr=%p, bufend=%p\n", pdf->buffer, pdf->bufptr, pdf->bufend);
pdf->bufpos = offset;

View File

@ -1131,20 +1131,20 @@ pdfioContentTextMeasure(
if (ch < 128)
{
// ASCII
*tempptr++ = ch;
*tempptr++ = (char)ch;
}
else if (ch < 2048)
{
// 2-byte UTF-8
*tempptr++ = 0xc0 | ((ch >> 6) & 0x1f);
*tempptr++ = 0x80 | (ch & 0x3f);
*tempptr++ = (char)(0xc0 | ((ch >> 6) & 0x1f));
*tempptr++ = (char)(0x80 | (ch & 0x3f));
}
else
{
// 3-byte UTF-8
*tempptr++ = 0xe0 | ((ch >> 12) & 0x0f);
*tempptr++ = 0x80 | ((ch >> 6) & 0x3f);
*tempptr++ = 0x80 | (ch & 0x3f);
*tempptr++ = (char)(0xe0 | ((ch >> 12) & 0x0f));
*tempptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
*tempptr++ = (char)(0x80 | (ch & 0x3f));
}
}
@ -1152,7 +1152,7 @@ pdfioContentTextMeasure(
s = temp;
}
ttfGetExtents(ttf, size, s, &extents);
ttfGetExtents(ttf, (float)size, s, &extents);
return (extents.right - extents.left);
}
@ -1642,7 +1642,7 @@ pdfioFileCreateFontObjFromFile(
*bufptr++ = (unsigned char)(cmap[i] >> 8);
*bufptr++ = (unsigned char)(cmap[i] & 255);
glyphs[cmap[i]] = i;
glyphs[cmap[i]] = (unsigned short)i;
if (cmap[i] < min_glyph)
min_glyph = cmap[i];
if (cmap[i] > max_glyph)
@ -1727,9 +1727,9 @@ pdfioFileCreateFontObjFromFile(
if ((w_array = pdfioArrayCreate(pdf)) == NULL)
goto done;
for (start = 0, w0 = ttfGetWidth(font, 0), i = 1; i < 65536; start = i, w0 = w1, i ++)
for (start = 0, w0 = ttfGetWidth(font, 0), w1 = 0, i = 1; i < 65536; start = i, w0 = w1, i ++)
{
while (i < 65536 && (w1 = ttfGetWidth(font, i)) == w0)
while (i < 65536 && (w1 = ttfGetWidth(font, (int)i)) == w0)
i ++;
if ((i - start) > 1)
@ -1750,7 +1750,7 @@ pdfioFileCreateFontObjFromFile(
pdfioArrayAppendNumber(temp_array, w0);
for (w0 = w1, i ++; i < 65536; w0 = w1, i ++)
{
if ((w1 = ttfGetWidth(font, i)) == w0 && i < 65535)
if ((w1 = ttfGetWidth(font, (int)i)) == w0 && i < 65535)
break;
pdfioArrayAppendNumber(temp_array, w0);

View File

@ -399,16 +399,23 @@ _pdfioCryptoMakeRandom(uint8_t *buffer, // I - Buffer
//
_pdfio_crypto_cb_t // O - Decryption callback or `NULL` for none
_pdfioCryptoMakeReader(
_pdfioCryptoMakeReader(
pdfio_file_t *pdf, // I - PDF file
pdfio_obj_t *obj, // I - PDF object
_pdfio_crypto_ctx_t *ctx, // I - Pointer to crypto context
uint8_t *iv, // I - Buffer for initialization vector
size_t *ivlen) // IO - Size of initialization vector
{
uint8_t data[21]; /* Key data */
_pdfio_md5_t md5; /* MD5 state */
uint8_t digest[16]; /* MD5 digest value */
uint8_t data[21]; // Key data
_pdfio_md5_t md5; // MD5 state
uint8_t digest[16]; // MD5 digest value
#if PDFIO_OBJ_CRYPT
pdfio_array_t *id_array; // Object ID array
unsigned char *id_value; // Object ID value
size_t id_len; // Length of object ID
uint8_t temp_key[16]; // File key for object
#endif // PDFIO_OBJ_CRYPT
uint8_t *file_key; // Computed file key to use
PDFIO_DEBUG("_pdfioCryptoMakeReader(pdf=%p, obj=%p(%d), ctx=%p, iv=%p, ivlen=%p(%d))\n", pdf, obj, (int)obj->number, ctx, iv, ivlen, (int)*ivlen);
@ -420,6 +427,59 @@ _pdfio_crypto_cb_t // O - Decryption callback or `NULL` for none
return (NULL);
}
#if PDFIO_OBJ_CRYPT
if ((id_array = pdfioDictGetArray(pdfioObjGetDict(obj), "ID")) != NULL)
{
// Object has its own ID that will get used for encryption...
_pdfio_md5_t md5; // MD5 context
uint8_t file_digest[16]; // MD5 digest of file ID and pad
uint8_t user_pad[32], // Padded user password
own_user_key[32], // Calculated user key
pdf_user_key[32]; // Decrypted user key
PDFIO_DEBUG("_pdfioCryptoMakeReader: Per-object file ID.\n");
if ((id_value = pdfioArrayGetBinary(id_array, 0, &id_len)) == NULL)
{
*ivlen = 0;
return (NULL);
}
_pdfioCryptoMD5Init(&md5);
_pdfioCryptoMD5Append(&md5, pdf_passpad, 32);
_pdfioCryptoMD5Append(&md5, id_value, id_len);
_pdfioCryptoMD5Finish(&md5, file_digest);
make_owner_key(pdf->encryption, pdf->password, pdf->owner_key, user_pad);
make_file_key(pdf->encryption, pdf->permissions, id_value, id_len, user_pad, pdf->owner_key, temp_key);
make_user_key(id_value, id_len, own_user_key);
if (memcmp(own_user_key, pdf->user_key, sizeof(own_user_key)))
{
PDFIO_DEBUG("_pdfioCryptoMakeReader: Not user password, trying owner password.\n");
make_file_key(pdf->encryption, pdf->permissions, id_value, id_len, pdf->password, pdf->owner_key, temp_key);
make_user_key(id_value, id_len, own_user_key);
memcpy(pdf_user_key, pdf->user_key, sizeof(pdf_user_key));
decrypt_user_key(pdf->encryption, temp_key, pdf_user_key);
if (memcmp(pdf->password, pdf_user_key, 32) && memcmp(own_user_key, pdf_user_key, 16))
{
*ivlen = 0;
return (NULL);
}
}
file_key = temp_key;
}
else
#endif // PDFIO_OBJ_CRYPT
{
// Use the default file key...
file_key = pdf->file_key;
}
switch (pdf->encryption)
{
default :
@ -428,7 +488,7 @@ _pdfio_crypto_cb_t // O - Decryption callback or `NULL` for none
case PDFIO_ENCRYPTION_RC4_40 :
// Copy the key data for the MD5 hash.
memcpy(data, pdf->file_key, sizeof(pdf->file_key));
memcpy(data, file_key, 16);
data[16] = (uint8_t)obj->number;
data[17] = (uint8_t)(obj->number >> 8);
data[18] = (uint8_t)(obj->number >> 16);
@ -455,7 +515,7 @@ _pdfio_crypto_cb_t // O - Decryption callback or `NULL` for none
case PDFIO_ENCRYPTION_RC4_128 :
// Copy the key data for the MD5 hash.
memcpy(data, pdf->file_key, sizeof(pdf->file_key));
memcpy(data, file_key, 16);
data[16] = (uint8_t)obj->number;
data[17] = (uint8_t)(obj->number >> 8);
data[18] = (uint8_t)(obj->number >> 16);
@ -491,7 +551,7 @@ _pdfio_crypto_cb_t // O - Decryption callback or `NULL` for none
//
_pdfio_crypto_cb_t // O - Encryption callback or `NULL` for none
_pdfioCryptoMakeWriter(
_pdfioCryptoMakeWriter(
pdfio_file_t *pdf, // I - PDF file
pdfio_obj_t *obj, // I - PDF object
_pdfio_crypto_ctx_t *ctx, // I - Pointer to crypto context
@ -596,6 +656,8 @@ _pdfioCryptoUnlock(
revision = (int)pdfioDictGetNumber(encrypt_dict, "R");
length = (int)pdfioDictGetNumber(encrypt_dict, "Length");
PDFIO_DEBUG("_pdfioCryptoUnlock: handler=%p(%s), version=%d, revision=%d, length=%d\n", (void *)handler, handler ? handler : "(null)", version, revision, length);
if (!handler || strcmp(handler, "Standard"))
{
_pdfioFileError(pdf, "Unsupported security handler '%s'.", handler ? handler : "(null)");
@ -644,6 +706,8 @@ _pdfioCryptoUnlock(
}
else
{
PDFIO_DEBUG("_pdfioCryptoUnlock: CFM=\"%s\"\n", cfm);
if (length < 40 || length > 128)
length = 128; // Default to 128 bits
@ -769,17 +833,16 @@ _pdfioCryptoUnlock(
{
// Matches!
memcpy(pdf->file_key, file_key, sizeof(pdf->file_key));
memcpy(pdf->password, pad, sizeof(pdf->password));
return (true);
}
/*
* Not the owner password, try the user password...
*/
// Not the owner password, try the user password...
make_file_key(pdf->encryption, pdf->permissions, file_id, file_idlen, pad, pdf->owner_key, file_key);
PDFIO_DEBUG("_pdfioCryptoUnlock: Fuse=%02X%02X%02X%02X...%02X%02X%02X%02X\n", file_key[0], file_key[1], file_key[2], file_key[3], file_key[12], file_key[13], file_key[14], file_key[15]);
make_user_key(file_id, file_idlen, user_key);
make_user_key(file_id, file_idlen, own_user_key);
memcpy(pdf_user_key, pdf->user_key, sizeof(pdf_user_key));
decrypt_user_key(pdf->encryption, file_key, pdf_user_key);
@ -787,10 +850,12 @@ _pdfioCryptoUnlock(
PDFIO_DEBUG("_pdfioCryptoUnlock: Uuse=%02X%02X%02X%02X...%02X%02X%02X%02X\n", user_key[0], user_key[1], user_key[2], user_key[3], user_key[28], user_key[29], user_key[30], user_key[31]);
PDFIO_DEBUG("_pdfioCryptoUnlock: Updf=%02X%02X%02X%02X...%02X%02X%02X%02X\n", pdf_user_key[0], pdf_user_key[1], pdf_user_key[2], pdf_user_key[3], pdf_user_key[28], pdf_user_key[29], pdf_user_key[30], pdf_user_key[31]);
if (!memcmp(pad, pdf_user_key, 32) || !memcmp(user_key, pdf_user_key, 16))
if (!memcmp(pad, pdf_user_key, 32) || !memcmp(own_user_key, pdf_user_key, 16))
{
// Matches!
memcpy(pdf->file_key, file_key, sizeof(pdf->file_key));
memcpy(pdf->password, pad, sizeof(pdf->password));
return (true);
}
}

View File

@ -158,6 +158,30 @@ pdfioDictCreate(pdfio_file_t *pdf) // I - PDF file
}
//
// '_pdfioDictDecrypt()' - Decrypt the values in a dictionary.
//
bool // O - `true` on success, `false` on error
_pdfioDictDecrypt(pdfio_file_t *pdf, // I - PDF file
pdfio_obj_t *obj, // I - Object
pdfio_dict_t *dict, // I - Dictionary
size_t depth) // I - Depth
{
size_t i; // Looping var
_pdfio_pair_t *pair; // Current pair
for (i = dict->num_pairs, pair = dict->pairs; i > 0; i --, pair ++)
{
if (strcmp(pair->key, "ID") && !_pdfioValueDecrypt(pdf, obj, &pair->value, depth + 1))
return (false);
}
return (true);
}
//
// '_pdfioDictDebug()' - Dump a dictionary to stderr.
//
@ -518,7 +542,7 @@ _pdfioDictRead(pdfio_file_t *pdf, // I - PDF file
_pdfio_value_t value; // Dictionary value
PDFIO_DEBUG("_pdfioDictRead(pdf=%p)\n", pdf);
PDFIO_DEBUG("_pdfioDictRead(pdf=%p, obj=%p, tb=%p, depth=%lu)\n", pdf, obj, tb, (unsigned long)depth);
// Create a dictionary and start reading...
if ((dict = pdfioDictCreate(pdf)) == NULL)
@ -530,6 +554,7 @@ _pdfioDictRead(pdfio_file_t *pdf, // I - PDF file
if (!strcmp(key, ">>"))
{
// End of dictionary...
PDFIO_DEBUG("_pdfioDictRead: Returning dictionary value...\n");
return (dict);
}
else if (key[0] != '/')
@ -548,14 +573,14 @@ _pdfioDictRead(pdfio_file_t *pdf, // I - PDF file
if (!_pdfioValueRead(pdf, obj, tb, &value, depth))
{
_pdfioFileError(pdf, "Missing value for dictionary key.");
_pdfioFileError(pdf, "Missing value for dictionary key '%s'.", key + 1);
break;
}
if (!_pdfioDictSetValue(dict, pdfioStringCreate(pdf, key + 1), &value))
break;
// PDFIO_DEBUG("_pdfioDictRead: Set %s.\n", key);
PDFIO_DEBUG("_pdfioDictRead: Set %s.\n", key);
}
// Dictionary is invalid - pdfioFileClose will free the memory, return NULL

View File

@ -19,7 +19,6 @@
static pdfio_obj_t *add_obj(pdfio_file_t *pdf, size_t number, unsigned short generation, off_t offset);
static int compare_objmaps(_pdfio_objmap_t *a, _pdfio_objmap_t *b);
static int compare_objs(pdfio_obj_t **a, pdfio_obj_t **b);
static const char *get_info_string(pdfio_file_t *pdf, const char *key);
static bool load_obj_stream(pdfio_obj_t *obj);
static bool load_pages(pdfio_file_t *pdf, pdfio_obj_t *obj, size_t depth);
@ -916,21 +915,57 @@ pdfioFileFindObj(
pdfio_file_t *pdf, // I - PDF file
size_t number) // I - Object number (1 to N)
{
pdfio_obj_t key, // Search key
*keyptr, // Pointer to key
**match; // Pointer to match
size_t left, // Left object
right, // Right object
current; // Current object
if (pdf->num_objs > 0)
PDFIO_DEBUG("pdfioFileFindObj(pdf=%p, number=%lu) alloc_objs=%lu, num_objs=%lu, objs=%p\n", (void *)pdf, (unsigned long)number, (unsigned long)(pdf ? pdf->alloc_objs : 0), (unsigned long)(pdf ? pdf->num_objs : 0), (void *)(pdf ? pdf->objs : NULL));
// Range check input...
if (!pdf || pdf->num_objs == 0 || number < 1)
return (NULL);
// Do a binary search for the object...
if ((current = number - 1) >= pdf->num_objs)
current = pdf->num_objs / 2;
PDFIO_DEBUG("pdfioFileFindObj: objs[current=%lu]=%p\n", (unsigned long)current, (void *)pdf->objs[current]);
if (number == pdf->objs[current]->number)
{
key.number = number;
keyptr = &key;
match = (pdfio_obj_t **)bsearch(&keyptr, pdf->objs, pdf->num_objs, sizeof(pdfio_obj_t *), (int (*)(const void *, const void *))compare_objs);
return (match ? *match : NULL);
// Fast match...
return (pdf->objs[current]);
}
else if (number < pdf->objs[current]->number)
{
left = 0;
right = current;
}
else
{
left = current;
right = pdf->num_objs - 1;
}
return (NULL);
while ((right - left) > 1)
{
current = (left + right) / 2;
if (number == pdf->objs[current]->number)
return (pdf->objs[current]);
else if (number < pdf->objs[current]->number)
right = current;
else
left = current;
}
if (number == pdf->objs[left]->number)
return (pdf->objs[left]);
else if (number == pdf->objs[right]->number)
return (pdf->objs[right]);
else
return (NULL);
}
@ -1154,8 +1189,10 @@ pdfioFileOpen(
void *error_data) // I - Error callback data, if any
{
pdfio_file_t *pdf; // PDF file
char line[1024], // Line from file
*ptr; // Pointer into line
char line[1025], // Line from file
*ptr, // Pointer into line
*end; // End of line
ssize_t bytes; // Bytes read
off_t xref_offset; // Offset to xref table
@ -1210,21 +1247,29 @@ pdfioFileOpen(
// Copy the version number...
pdf->version = strdup(line + 5);
// Grab the last 32 characters of the file to find the start of the xref table...
if (_pdfioFileSeek(pdf, -32, SEEK_END) < 0)
// Grab the last 1k of the file to find the start of the xref table...
if (_pdfioFileSeek(pdf, -1024, SEEK_END) < 0)
{
_pdfioFileError(pdf, "Unable to read startxref data.");
goto error;
}
if (_pdfioFileRead(pdf, line, 32) < 32)
if ((bytes = _pdfioFileRead(pdf, line, sizeof(line) - 1)) < 1)
{
_pdfioFileError(pdf, "Unable to read startxref data.");
goto error;
}
line[32] = '\0';
if ((ptr = strstr(line, "startxref")) == NULL)
line[bytes] = '\0';
end = line + bytes - 9;
for (ptr = line; ptr < end; ptr ++)
{
if (!memcmp(ptr, "startxref", 9))
break;
}
if (ptr >= end)
{
_pdfioFileError(pdf, "Unable to find start of xref table.");
goto error;
@ -1375,6 +1420,9 @@ add_obj(pdfio_file_t *pdf, // I - PDF file
off_t offset) // I - Offset in file
{
pdfio_obj_t *obj; // Object
size_t left, // Left object
right, // Right object
current; // Current object (center)
// Allocate memory for the object...
@ -1400,18 +1448,63 @@ add_obj(pdfio_file_t *pdf, // I - PDF file
pdf->alloc_objs += 32;
}
pdf->objs[pdf->num_objs ++] = obj;
obj->pdf = pdf;
obj->number = number;
obj->generation = generation;
obj->offset = offset;
PDFIO_DEBUG("add_obj: obj=%p, ->pdf=%p, ->number=%lu\n", obj, pdf, (unsigned long)obj->number);
PDFIO_DEBUG("add_obj: obj=%p, ->pdf=%p, ->number=%lu, ->offset=%lu\n", obj, pdf, (unsigned long)obj->number, (unsigned long)offset);
// Re-sort object array as needed...
if (pdf->num_objs > 1 && pdf->objs[pdf->num_objs - 2]->number > number)
qsort(pdf->objs, pdf->num_objs, sizeof(pdfio_obj_t *), (int (*)(const void *, const void *))compare_objs);
// Insert object into array as needed...
if (pdf->num_objs == 0 || obj->number > pdf->objs[pdf->num_objs - 1]->number)
{
// Append object...
PDFIO_DEBUG("add_obj: Appending at %lu\n", (unsigned long)pdf->num_objs);
pdf->objs[pdf->num_objs] = obj;
pdf->last_obj = pdf->num_objs;
}
else
{
// Insert object...
if (obj->number < pdf->objs[pdf->last_obj]->number)
{
left = 0;
right = pdf->last_obj;
}
else
{
left = pdf->last_obj;
right = pdf->num_objs - 1;
}
while ((right - left) > 1)
{
current = (left + right) / 2;
if (obj->number < pdf->objs[current]->number)
right = current;
else
left = current;
}
if (obj->number < pdf->objs[left]->number)
current = left;
else if (obj->number < pdf->objs[right]->number)
current = right;
else
current = right;
PDFIO_DEBUG("add_obj: Inserting at %lu\n", (unsigned long)current);
if (current < pdf->num_objs)
memmove(pdf->objs + current + 1, pdf->objs + current, (pdf->num_objs - current) * sizeof(pdfio_obj_t *));
pdf->objs[current] = obj;
pdf->last_obj = current;
}
pdf->num_objs ++;
return (obj);
}
@ -1438,23 +1531,6 @@ compare_objmaps(_pdfio_objmap_t *a, // I - First object map
}
//
// 'compare_objs()' - Compare the object numbers of two objects.
//
static int // O - Result of comparison
compare_objs(pdfio_obj_t **a, // I - First object
pdfio_obj_t **b) // I - Second object
{
if ((*a)->number < (*b)->number)
return (-1);
else if ((*a)->number == (*b)->number)
return (0);
else
return (1);
}
//
// 'get_info_string()' - Get a string value from the Info dictionary.
//
@ -1727,7 +1803,7 @@ load_xref(
pdfio_stream_t *st; // Stream
unsigned char buffer[32]; // Read buffer
size_t num_sobjs = 0, // Number of object streams
sobjs[4096]; // Object streams to load
sobjs[8192]; // Object streams to load
pdfio_obj_t *current; // Current object
if ((number = strtoimax(line, &ptr, 10)) < 1)
@ -1736,7 +1812,7 @@ load_xref(
return (false);
}
if ((generation = (int)strtol(ptr, &ptr, 10)) < 0 || generation > 65535)
if ((generation = (int)strtol(ptr, &ptr, 10)) < 0 || (generation > 65535 && number != 0))
{
_pdfioFileError(pdf, "Bad xref table header '%s'.", line);
return (false);
@ -1976,12 +2052,32 @@ load_xref(
else if (!strncmp(line, "xref", 4) && (!line[4] || isspace(line[4] & 255)))
{
// Read the xref tables
off_t trailer_offset = _pdfioFileTell(pdf);
// Offset of current line
PDFIO_DEBUG("load_xref: Reading xref table starting at offset %lu\n", (unsigned long)trailer_offset);
while (_pdfioFileGets(pdf, line, sizeof(line)))
{
PDFIO_DEBUG("load_xref: '%s' at offset %lu\n", line, (unsigned long)trailer_offset);
if (!strncmp(line, "trailer", 7) && (!line[7] || isspace(line[7] & 255)))
{
if (line[7])
{
// Probably the start of the trailer dictionary, rewind the file so
// we can read it...
_pdfioFileSeek(pdf, trailer_offset + 7, SEEK_SET);
}
break;
else if (!line[0])
continue;
}
else
{
trailer_offset = _pdfioFileTell(pdf);
if (!line[0])
continue;
}
if (sscanf(line, "%jd%jd", &number, &num_objects) != 2)
{
@ -2012,7 +2108,7 @@ load_xref(
return (false);
}
if ((generation = (int)strtol(ptr, &ptr, 10)) < 0 || generation > 65535)
if ((generation = (int)strtol(ptr, &ptr, 10)) < 0 || (generation > 65535 && offset != 0))
{
_pdfioFileError(pdf, "Malformed xref table entry '%s'.", line);
return (false);
@ -2041,6 +2137,8 @@ load_xref(
if (!add_obj(pdf, (size_t)number, (unsigned short)generation, offset))
return (false);
}
trailer_offset = _pdfioFileTell(pdf);
}
if (strncmp(line, "trailer", 7))
@ -2091,8 +2189,19 @@ load_xref(
PDFIO_DEBUG_VALUE(&trailer);
PDFIO_DEBUG("\n");
if ((xref_offset = (off_t)pdfioDictGetNumber(trailer.value.dict, "Prev")) <= 0)
off_t new_offset = (off_t)pdfioDictGetNumber(trailer.value.dict, "Prev");
if (new_offset <= 0)
{
done = true;
}
else if (new_offset == xref_offset)
{
_pdfioFileError(pdf, "Recursive xref table.");
return (false);
}
xref_offset = new_offset;
}
// Once we have all of the xref tables loaded, get the important objects and

View File

@ -453,9 +453,7 @@ _pdfioObjLoad(pdfio_obj_t *obj) // I - Object
return (false);
}
PDFIO_DEBUG("_pdfioObjLoad: tb.bufptr=%p, tb.bufend=%p, tb.bufptr[0]=0x%02x, tb.bufptr[0]=0x%02x\n", tb.bufptr, tb.bufend, tb.bufptr[0], tb.bufptr[1]);
if (tb.bufptr && tb.bufptr < tb.bufend && (tb.bufptr[0] == 0x0d || tb.bufptr[0] == 0x0a))
tb.bufptr ++; // Skip trailing CR or LF after token
PDFIO_DEBUG("_pdfioObjLoad: tb.bufptr=%p, tb.bufend=%p, tb.bufptr[0]=0x%02x, tb.bufptr[1]=0x%02x\n", tb.bufptr, tb.bufend, tb.bufptr[0], tb.bufptr[1]);
_pdfioTokenFlush(&tb);
@ -466,6 +464,18 @@ _pdfioObjLoad(pdfio_obj_t *obj) // I - Object
PDFIO_DEBUG("_pdfioObjLoad: stream_offset=%lu.\n", (unsigned long)obj->stream_offset);
}
// Decrypt as needed...
if (obj->pdf->encryption)
{
PDFIO_DEBUG("_pdfioObjLoad: Decrypting value...\n");
if (!_pdfioValueDecrypt(obj->pdf, obj, &obj->value, 0))
{
PDFIO_DEBUG("_pdfioObjLoad: Failed to decrypt.\n");
return (false);
}
}
PDFIO_DEBUG("_pdfioObjLoad: ");
PDFIO_DEBUG_VALUE(&obj->value);
PDFIO_DEBUG("\n");

View File

@ -237,7 +237,8 @@ struct _pdfio_file_s // PDF file structure
pdfio_permission_t permissions; // Access permissions (encrypted PDF files)
uint8_t file_key[16], // File encryption key
owner_key[32], // Owner encryption key
user_key[32]; // User encryption key
user_key[32], // User encryption key
password[32]; // Padded password
size_t file_keylen, // Length of file encryption key
owner_keylen, // Length of owner encryption key
user_keylen; // Length of user encryption key
@ -265,7 +266,8 @@ struct _pdfio_file_s // PDF file structure
alloc_dicts; // Allocated dictionaries
pdfio_dict_t **dicts; // Dictionaries
size_t num_objs, // Number of objects
alloc_objs; // Allocated objects
alloc_objs, // Allocated objects
last_obj; // Last object added
pdfio_obj_t **objs, // Objects
*current_obj; // Current object being written/read
size_t num_objmaps, // Number of object maps
@ -320,6 +322,7 @@ struct _pdfio_stream_s // Stream
// Functions...
//
extern bool _pdfioArrayDecrypt(pdfio_file_t *pdf, pdfio_obj_t *obj, pdfio_array_t *a, size_t depth) _PDFIO_INTERNAL;
extern void _pdfioArrayDebug(pdfio_array_t *a, FILE *fp) _PDFIO_INTERNAL;
extern void _pdfioArrayDelete(pdfio_array_t *a) _PDFIO_INTERNAL;
extern _pdfio_value_t *_pdfioArrayGetValue(pdfio_array_t *a, size_t n) _PDFIO_INTERNAL;
@ -344,6 +347,7 @@ extern void _pdfioCryptoSHA256Finish(_pdfio_sha256_t *ctx, uint8_t *Message_Dig
extern bool _pdfioCryptoUnlock(pdfio_file_t *pdf, pdfio_password_cb_t password_cb, void *password_data) _PDFIO_INTERNAL;
extern void _pdfioDictClear(pdfio_dict_t *dict, const char *key) _PDFIO_INTERNAL;
extern bool _pdfioDictDecrypt(pdfio_file_t *pdf, pdfio_obj_t *obj, pdfio_dict_t *dict, size_t depth) _PDFIO_INTERNAL;
extern void _pdfioDictDebug(pdfio_dict_t *dict, FILE *fp) _PDFIO_INTERNAL;
extern void _pdfioDictDelete(pdfio_dict_t *dict) _PDFIO_INTERNAL;
extern _pdfio_value_t *_pdfioDictGetValue(pdfio_dict_t *dict, const char *key) _PDFIO_INTERNAL;
@ -387,6 +391,7 @@ extern void _pdfioTokenPush(_pdfio_token_t *tb, const char *token) _PDFIO_INTER
extern bool _pdfioTokenRead(_pdfio_token_t *tb, char *buffer, size_t bufsize);
extern _pdfio_value_t *_pdfioValueCopy(pdfio_file_t *pdfdst, _pdfio_value_t *vdst, pdfio_file_t *pdfsrc, _pdfio_value_t *vsrc) _PDFIO_INTERNAL;
extern bool _pdfioValueDecrypt(pdfio_file_t *pdf, pdfio_obj_t *obj, _pdfio_value_t *v, size_t depth) _PDFIO_INTERNAL;
extern void _pdfioValueDebug(_pdfio_value_t *v, FILE *fp) _PDFIO_INTERNAL;
extern void _pdfioValueDelete(_pdfio_value_t *v) _PDFIO_INTERNAL;
extern _pdfio_value_t *_pdfioValueRead(pdfio_file_t *pdf, pdfio_obj_t *obj, _pdfio_token_t *ts, _pdfio_value_t *v, size_t depth) _PDFIO_INTERNAL;

View File

@ -1066,7 +1066,7 @@ stream_read(pdfio_stream_t *st, // I - Stream
if ((status = inflate(&(st->flate), Z_NO_FLUSH)) < Z_OK)
{
_pdfioFileError(st->pdf, "Unable to decompress stream data: %s", zstrerror(status));
_pdfioFileError(st->pdf, "Unable to decompress stream data for object %ld: %s", (long)st->obj->number, zstrerror(status));
return (-1);
}
else if (avail_in == st->flate.avail_in && avail_out == st->flate.avail_out)
@ -1127,7 +1127,7 @@ stream_read(pdfio_stream_t *st, // I - Stream
if ((status = inflate(&(st->flate), Z_NO_FLUSH)) < Z_OK)
{
_pdfioFileError(st->pdf, "Unable to decompress stream data: %s", zstrerror(status));
_pdfioFileError(st->pdf, "Unable to decompress stream data for object %ld: %s", (long)st->obj->number, zstrerror(status));
return (-1);
}
else if (status == Z_STREAM_END || (avail_in == st->flate.avail_in && avail_out == st->flate.avail_out))
@ -1197,7 +1197,7 @@ stream_read(pdfio_stream_t *st, // I - Stream
if ((status = inflate(&(st->flate), Z_NO_FLUSH)) < Z_OK)
{
_pdfioFileError(st->pdf, "Unable to decompress stream data: %s", zstrerror(status));
_pdfioFileError(st->pdf, "Unable to decompress stream data for object %ld: %s", (long)st->obj->number, zstrerror(status));
return (-1);
}
else if (status == Z_STREAM_END || (avail_in == st->flate.avail_in && avail_out == st->flate.avail_out))

View File

@ -208,9 +208,10 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
*bufend, // End of buffer
state = '\0'; // Current state
bool saw_nul = false; // Did we see a nul character?
size_t count = 0; // Number of whitespace/comment bytes
//
// "state" is:
//
// - '\0' for idle
@ -229,21 +230,45 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
// Skip leading whitespace...
while ((ch = get_char(tb)) != EOF)
{
count ++;
if (ch == '%')
{
// Skip comment
PDFIO_DEBUG("_pdfioTokenRead: Skipping comment...\n");
while ((ch = get_char(tb)) != EOF)
{
count ++;
if (ch == '\n' || ch == '\r')
{
break;
}
else if (count > 2048)
{
_pdfioFileError(tb->pdf, "Comment too long.");
*bufptr = '\0';
return (false);
}
}
}
else if (!isspace(ch))
{
break;
}
else if (count > 2048)
{
_pdfioFileError(tb->pdf, "Too much whitespace.");
*bufptr = '\0';
return (false);
}
}
if (ch == EOF)
{
*bufptr = '\0';
return (false);
}
// Check for delimiters...
if (strchr(PDFIO_DELIM_CHARS, ch) != NULL)
@ -263,6 +288,8 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
*bufptr++ = (char)ch;
}
PDFIO_DEBUG("_pdfioTokenRead: state='%c'\n", state);
switch (state)
{
case '(' : // Literal string
@ -354,6 +381,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Out of space
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
}
@ -361,6 +389,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
if (ch != ')')
{
_pdfioFileError(tb->pdf, "Unterminated string literal.");
*bufptr = '\0';
return (false);
}
@ -380,6 +409,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Out of space...
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
@ -413,9 +443,17 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Out of space...
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
}
if (ch == '\r')
{
// Look for a trailing LF
if ((ch = get_char(tb)) != EOF && ch != '\n')
tb->bufptr --;
}
break;
case 'N' : // number
@ -424,6 +462,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
if (!isdigit(ch) && ch != '.')
{
// End of number...
PDFIO_DEBUG("_pdfioTokenRead: End of number with ch=0x%02x\n", ch);
tb->bufptr --;
break;
}
@ -436,6 +475,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Out of space...
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
}
@ -462,12 +502,17 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
if (!isxdigit(tch & 255))
{
_pdfioFileError(tb->pdf, "Bad # escape in name.");
*bufptr = '\0';
return (false);
}
else if (isdigit(tch))
{
ch = ((ch & 255) << 4) | (tch - '0');
}
else
{
ch = ((ch & 255) << 4) | (tolower(tch) - 'a' + 10);
}
}
}
@ -479,9 +524,17 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Out of space
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
}
if (bufptr == (buffer + 1))
{
_pdfioFileError(tb->pdf, "Empty name.");
*bufptr = '\0';
return (false);
}
break;
case '<' : // Potential hex string
@ -501,9 +554,12 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
else if (!isspace(ch & 255) && !isxdigit(ch & 255))
{
_pdfioFileError(tb->pdf, "Syntax error: '<%c'", ch);
*bufptr = '\0';
return (false);
}
count = 0;
do
{
if (isxdigit(ch))
@ -512,25 +568,39 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
{
// Hex digit
*bufptr++ = (char)ch;
count = 0;
}
else
{
// Too large
_pdfioFileError(tb->pdf, "Token too large.");
*bufptr = '\0';
return (false);
}
}
else if (!isspace(ch))
{
_pdfioFileError(tb->pdf, "Invalid hex string character '%c'.", ch);
*bufptr = '\0';
return (false);
}
else
{
count ++;
if (count > 2048)
{
_pdfioFileError(tb->pdf, "Too much whitespace.");
*bufptr = '\0';
return (false);
}
}
}
while ((ch = get_char(tb)) != EOF && ch != '>');
if (ch == EOF)
{
_pdfioFileError(tb->pdf, "Unterminated hex string.");
*bufptr = '\0';
return (false);
}
break;
@ -543,6 +613,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
else
{
_pdfioFileError(tb->pdf, "Syntax error: '>%c'.", ch);
*bufptr = '\0';
return (false);
}
break;
@ -550,7 +621,7 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
*bufptr = '\0';
// PDFIO_DEBUG("_pdfioTokenRead: Read '%s'.\n", buffer);
PDFIO_DEBUG("_pdfioTokenRead: Read '%s'.\n", buffer);
return (bufptr > buffer);
}
@ -587,7 +658,6 @@ get_char(_pdfio_token_t *tb) // I - Token buffer
tb->bufptr = tb->buffer;
tb->bufend = tb->buffer + bytes;
#if 0
#ifdef DEBUG
unsigned char *ptr; // Pointer into buffer
@ -601,7 +671,6 @@ get_char(_pdfio_token_t *tb) // I - Token buffer
}
PDFIO_DEBUG("'\n");
#endif // DEBUG
#endif // 0
}
// Return the next character...

View File

@ -10,6 +10,13 @@
#include "pdfio-private.h"
//
// Local functions...
//
static time_t get_date_time(const char *s);
//
// '_pdfioValueCopy()' - Copy a value to a PDF file.
//
@ -105,6 +112,101 @@ _pdfioValueCopy(pdfio_file_t *pdfdst, // I - Destination PDF file
}
//
// '_pdfioValueDecrypt()' - Decrypt a value.
//
bool // O - `true` on success, `false` on error
_pdfioValueDecrypt(pdfio_file_t *pdf, // I - PDF file
pdfio_obj_t *obj, // I - Object
_pdfio_value_t *v, // I - Value
size_t depth)// I - Depth
{
_pdfio_crypto_ctx_t ctx; // Decryption context
_pdfio_crypto_cb_t cb; // Decryption callback
size_t ivlen; // Number of initialization vector bytes
uint8_t temp[32768]; // Temporary buffer for decryption
size_t templen; // Number of actual data bytes
time_t timeval; // Date/time value
if (depth > PDFIO_MAX_DEPTH)
{
_pdfioFileError(pdf, "Value too deep.");
return (false);
}
switch (v->type)
{
default :
// Do nothing
break;
case PDFIO_VALTYPE_ARRAY :
return (_pdfioArrayDecrypt(pdf, obj, v->value.array, depth + 1));
break;
case PDFIO_VALTYPE_DICT :
return (_pdfioDictDecrypt(pdf, obj, v->value.dict, depth + 1));
break;
case PDFIO_VALTYPE_BINARY :
// Decrypt the binary string...
if (v->value.binary.datalen > (sizeof(temp) - 32))
{
_pdfioFileError(pdf, "Unable to read encrypted binary string - too long.");
return (false);
}
ivlen = v->value.binary.datalen;
if ((cb = _pdfioCryptoMakeReader(pdf, obj, &ctx, v->value.binary.data, &ivlen)) == NULL)
return (false);
templen = (cb)(&ctx, temp, v->value.binary.data + ivlen, v->value.binary.datalen - ivlen);
// Copy the decrypted string back to the value and adjust the length...
memcpy(v->value.binary.data, temp, templen);
if (pdf->encryption >= PDFIO_ENCRYPTION_AES_128)
v->value.binary.datalen = templen - temp[templen - 1];
else
v->value.binary.datalen = templen;
break;
case PDFIO_VALTYPE_STRING :
// Decrypt regular string...
templen = strlen(v->value.string);
if (templen > (sizeof(temp) - 33))
{
_pdfioFileError(pdf, "Unable to read encrypted string - too long.");
return (false);
}
ivlen = templen;
if ((cb = _pdfioCryptoMakeReader(pdf, obj, &ctx, (uint8_t *)v->value.string, &ivlen)) == NULL)
return (false);
templen = (cb)(&ctx, temp, (uint8_t *)v->value.string + ivlen, templen - ivlen);
temp[templen] = '\0';
if ((timeval = get_date_time((char *)temp)) != 0)
{
// Change the type to date...
v->type = PDFIO_VALTYPE_DATE;
v->value.date = timeval;
}
else
{
// Copy the decrypted string back to the value...
v->value.string = pdfioStringCreate(pdf, (char *)temp);
}
break;
}
return (true);
}
//
// '_pdfioValueDebug()' - Print the contents of a value.
//
@ -196,6 +298,7 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
size_t depth) // I - Depth of value
{
char token[32768]; // Token buffer
time_t timeval; // Date/time value
#ifdef DEBUG
static const char * const valtypes[] =
{
@ -245,73 +348,10 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
if ((v->value.dict = _pdfioDictRead(pdf, obj, tb, depth + 1)) == NULL)
return (NULL);
}
else if (!strncmp(token, "(D:", 3))
else if (!strncmp(token, "(D:", 3) && (timeval = get_date_time(token + 1)) != 0)
{
// Possible date value of the form:
//
// (D:YYYYMMDDhhmmssZ)
// (D:YYYYMMDDhhmmss+HH'mm)
// (D:YYYYMMDDhhmmss-HH'mm)
//
int i; // Looping var
struct tm dateval; // Date value
int offset; // Date offset
for (i = 3; i < 17; i ++)
{
if (!isdigit(token[i] & 255))
break;
}
if (i >= 17)
{
if (token[i] == 'Z')
{
i ++;
}
else if (token[i] == '-' || token[i] == '+')
{
if (isdigit(token[i + 1] & 255) && isdigit(token[i + 2] & 255) && token[i + 3] == '\'' && isdigit(token[i + 4] & 255) && isdigit(token[i + 5] & 255))
{
i += 6;
if (token[i] == '\'')
i ++;
}
}
}
if (token[i])
{
// Just a string...
v->type = PDFIO_VALTYPE_STRING;
v->value.string = pdfioStringCreate(pdf, token + 1);
}
else
{
// Date value...
memset(&dateval, 0, sizeof(dateval));
dateval.tm_year = (token[3] - '0') * 1000 + (token[4] - '0') * 100 + (token[5] - '0') * 10 + token[6] - '0' - 1900;
dateval.tm_mon = (token[7] - '0') * 10 + token[8] - '0' - 1;
dateval.tm_mday = (token[9] - '0') * 10 + token[10] - '0';
dateval.tm_hour = (token[11] - '0') * 10 + token[12] - '0';
dateval.tm_min = (token[13] - '0') * 10 + token[14] - '0';
dateval.tm_sec = (token[15] - '0') * 10 + token[16] - '0';
if (token[17] == 'Z')
{
offset = 0;
}
else
{
offset = (token[18] - '0') * 600 + (token[19] - '0') * 60 + (token[20] - '0') * 10 + token[21] - '0';
if (token[17] == '-')
offset = -offset;
}
v->type = PDFIO_VALTYPE_DATE;
v->value.date = mktime(&dateval) + offset;
}
v->type = PDFIO_VALTYPE_DATE;
v->value.date = timeval;
}
else if (token[0] == '(')
{
@ -363,36 +403,6 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
*dataptr++ = (unsigned char)d;
}
if (obj && pdf->encryption)
{
// Decrypt the string...
_pdfio_crypto_ctx_t ctx; // Decryption context
_pdfio_crypto_cb_t cb; // Decryption callback
size_t ivlen; // Number of initialization vector bytes
uint8_t temp[32768]; // Temporary buffer for decryption
size_t templen; // Number of actual data bytes
if (v->value.binary.datalen > (sizeof(temp) - 32))
{
_pdfioFileError(pdf, "Unable to read encrypted binary string - too long.");
return (false);
}
ivlen = v->value.binary.datalen;
if ((cb = _pdfioCryptoMakeReader(pdf, obj, &ctx, v->value.binary.data, &ivlen)) == NULL)
return (false);
templen = (cb)(&ctx, temp, v->value.binary.data + ivlen, v->value.binary.datalen - ivlen);
// Copy the decrypted string back to the value and adjust the length...
memcpy(v->value.binary.data, temp, templen);
if (pdf->encryption >= PDFIO_ENCRYPTION_AES_128)
v->value.binary.datalen = templen - temp[templen - 1];
else
v->value.binary.datalen = templen;
}
}
else if (strchr("0123456789-+.", token[0]) != NULL)
{
@ -728,3 +738,76 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
return (false);
}
//
// 'get_date_time()' - Convert PDF date/time value to time_t.
//
static time_t // O - Time in seconds
get_date_time(const char *s) // I - PDF date/time value
{
int i; // Looping var
struct tm dateval; // Date value
int offset; // Date offset
// Possible date value of the form:
//
// (D:YYYYMMDDhhmmssZ)
// (D:YYYYMMDDhhmmss+HH'mm)
// (D:YYYYMMDDhhmmss-HH'mm)
//
for (i = 2; i < 16; i ++)
{
if (!isdigit(s[i] & 255) || !s[i])
break;
}
if (i >= 16)
{
if (s[i] == 'Z')
{
i ++;
}
else if (s[i] == '-' || s[i] == '+')
{
if (isdigit(s[i + 1] & 255) && isdigit(s[i + 2] & 255) && s[i + 3] == '\'' && isdigit(s[i + 4] & 255) && isdigit(s[i + 5] & 255))
{
i += 6;
if (s[i] == '\'')
i ++;
}
}
}
if (s[i])
{
// Just a string...
return (0);
}
// Date value...
memset(&dateval, 0, sizeof(dateval));
dateval.tm_year = (s[2] - '0') * 1000 + (s[3] - '0') * 100 + (s[4] - '0') * 10 + s[5] - '0' - 1900;
dateval.tm_mon = (s[6] - '0') * 10 + s[7] - '0' - 1;
dateval.tm_mday = (s[8] - '0') * 10 + s[9] - '0';
dateval.tm_hour = (s[10] - '0') * 10 + s[11] - '0';
dateval.tm_min = (s[12] - '0') * 10 + s[13] - '0';
dateval.tm_sec = (s[14] - '0') * 10 + s[15] - '0';
if (s[16] == 'Z')
{
offset = 0;
}
else
{
offset = (s[17] - '0') * 600 + (s[18] - '0') * 60 + (s[19] - '0') * 10 + s[20] - '0';
if (s[16] == '-')
offset = -offset;
}
return (mktime(&dateval) + offset);
}

View File

@ -9,5 +9,5 @@ Version: @PDFIO_VERSION@
URL: https://www.msweet.org/pdfio
Requires: @PKGCONFIG_REQUIRES@
Libs: @PKGCONFIG_LIBS@
Libs.private: @PKGCONFIG_LIBS_PRIVATE@ -lm
Libs.private: @PKGCONFIG_LIBS_PRIVATE@
Cflags: @PKGCONFIG_CFLAGS@

View File

@ -13,7 +13,7 @@
<requireLicenseAcceptance>false</requireLicenseAcceptance>
<description>PDFio Library for VS2019+</description>
<summary>PDFio is a simple C library for reading and writing PDF files. PDFio is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GNU GPL2-only software.</summary>
<copyright>Copyright © 2019-2023 by Michael R Sweet</copyright>
<copyright>Copyright © 2019-2024 by Michael R Sweet</copyright>
<tags>pdf file native</tags>
<dependencies>
<dependency id="pdfio_native.redist" version="1.2.0" />

View File

@ -13,7 +13,7 @@
<requireLicenseAcceptance>false</requireLicenseAcceptance>
<description>PDFio Library for VS2019+</description>
<summary>PDFio is a simple C library for reading and writing PDF files. This package provides the redistributable content for the PDFio library. PDFio is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GNU GPL2-only software.</summary>
<copyright>Copyright © 2019-2023 by Michael R Sweet</copyright>
<copyright>Copyright © 2019-2024 by Michael R Sweet</copyright>
<tags>pdf file native</tags>
<dependencies>
<dependency id="zlib_native.redist" version="1.2.11" />

2
ttf.c
View File

@ -1307,7 +1307,7 @@ read_cmap(ttf_t *font) // I - Font
{
// Use an "obscure indexing trick" (words from the spec, not
// mine) to look up the glyph index...
temp = segment->idRangeOffset / 2 - segCount + (ch - segment->startCode) + (segment - segments);
temp = (int)(segment->idRangeOffset / 2 - segCount + (ch - segment->startCode) + (segment - segments));
TTF_DEBUG("read_cmap: ch=%d, temp=%d\n", ch, temp);
if (temp < 0 || temp >= numGlyphIdArray)