Compare commits
83 Commits
SHA1 | Author | Date | |
---|---|---|---|
4bbb8b0b38 | |||
1657e89ddb | |||
afa6d4c4de | |||
31a086e165 | |||
01cc243bcf | |||
25f5e28e56 | |||
e6588d3960 | |||
8f706b9fe7 | |||
f9c07a0346 | |||
a22957baa1 | |||
d7f3c64f63 | |||
29eea131b9 | |||
2dcef0936e | |||
20dd2a6d28 | |||
4219b8fd77 | |||
064e7fa473 | |||
ea9b7843fc | |||
755efe08da | |||
0391df5bbd | |||
49efd97cab | |||
d7eb1fc540 | |||
7afefda326 | |||
cbea3ecc2a | |||
130cef8702 | |||
0bd9edc845 | |||
fe755eac3d | |||
8cca645835 | |||
b8ea9ea064 | |||
2874022aa4 | |||
3befcf2fd5 | |||
3b2f7e21d9 | |||
7e01069c5a | |||
88839ccb56 | |||
ebd5aab39b | |||
71d33c03ff | |||
cfe91b4ea2 | |||
458f366d78 | |||
4165cd23ba | |||
7e56d26ff8 | |||
712b213ec6 | |||
b7b6655db0 | |||
e9debcd169 | |||
2f925ccd3c | |||
89c2a75376 | |||
1237599dea | |||
6e2e4bbcc6 | |||
d535067c91 | |||
e996898b57 | |||
aa6a20c042 | |||
f09105dd3f | |||
5be5552b2b | |||
492a4f51b2 | |||
44827bac1a | |||
3fad0d6f15 | |||
aeee24b856 | |||
8d72f22efe | |||
77117ac789 | |||
fceb5a807d | |||
4f123c2a01 | |||
c4c8fa6036 | |||
9a5c5ec65d | |||
3f4308b68d | |||
9e930a7c5d | |||
afa010cea2 | |||
c26b200a83 | |||
eff02198ab | |||
5f98c7838c | |||
4f880bc0c1 | |||
d032483ed4 | |||
b2fc82f3a8 | |||
b81d01f319 | |||
1b35321615 | |||
990342f2a5 | |||
7f5fc456bc | |||
7c527cc908 | |||
41d17fc4e3 | |||
4e89137689 | |||
e686669b9d | |||
1e5cc6ffd5 | |||
4f1b373232 | |||
6f4bfe107f | |||
5b5de3aff6 | |||
8b2b013b36 |
2
.github/workflows/build.yml
vendored
@@ -17,7 +17,7 @@ jobs:
 - name: Update Build Environment
   run: sudo apt-get update --fix-missing -y
 - name: Install Prerequisites
-  run: sudo apt-get install -y cppcheck zlib1g-dev
+  run: sudo apt-get install -y cppcheck zlib1g-dev libpng-dev
 - name: Configure PDFio
   run: ./configure --enable-debug --enable-sanitizer --enable-maintainer
 - name: Build PDFio
2
.github/workflows/codeql.yml
vendored
@@ -32,7 +32,7 @@ jobs:
   run: sudo apt-get update --fix-missing -y

 - name: Install Prerequisites
-  run: sudo apt-get install -y zlib1g-dev
+  run: sudo apt-get install -y zlib1g-dev libpng-dev

 - name: Initialize CodeQL
   uses: github/codeql-action/init@v2
2
.github/workflows/coverity.yml
vendored
@@ -12,7 +12,7 @@ jobs:
 - name: Update Build Environment
   run: sudo apt-get update --fix-missing -y
 - name: Install Prerequisites
-  run: sudo apt-get install -y zlib1g-dev
+  run: sudo apt-get install -y zlib1g-dev libpng-dev
 - name: Download Coverity Build Tool
   run: |
     wget -q https://scan.coverity.com/download/linux64 --post-data token="$TOKEN&project=$GITHUB_REPOSITORY" -O cov-analysis-linux64.tar.gz
1
.gitignore
vendored
@ -16,6 +16,7 @@
|
||||
/examples/md2pdf
|
||||
/examples/pdf2text
|
||||
/examples/pdfioinfo
|
||||
/examples/pdfiomerge
|
||||
/Makefile
|
||||
/packages
|
||||
/pdfio.pc
|
||||
|
54
CHANGES.md
@ -1,6 +1,60 @@
|
||||
Changes in PDFio
|
||||
================
|
||||
|
||||
|
||||
v1.5.3 - 2025-05-03
|
||||
-------------------
|
||||
|
||||
- Fixed decryption of PDF files "protected" by 40-bit RC4 (Issue #42)
|
||||
- Fixed decryption of UTF-16 strings (Issue #42)
|
||||
- Fixed decryption of PDF files with large permission values.
|
||||
- Fixed support for EncryptMetadata key in the encryption dictionary.
|
||||
- Fixed `pdfioObjCopy` and `pdfioPageCopy` to properly identify the source PDF
|
||||
file being used (Issue #125)
|
||||
|
||||
|
||||
v1.5.2 - 2025-04-12
|
||||
-------------------
|
||||
|
||||
- Updated maximum allowed PDF string size to 64k (Issue #117)
|
||||
- Updated dictionary reading code to discard duplicate key/value pairs with a
|
||||
warning message (Issue #118)
|
||||
- Fixed form detection in `pdfioinfo` example code (Issue #114)
|
||||
- Fixed parsing of certain date/time values (Issue #115)
|
||||
- Fixed support for empty name values (Issue #116)
|
||||
- Fixed range checking in `pdfioImageGetBytesPerLine` (Issue #121)
|
||||
|
||||
|
||||
v1.5.1 - 2025-03-28
|
||||
-------------------
|
||||
|
||||
- Fixed output of special characters in name values (Issue #106)
|
||||
- Fixed output of special characters in string values (Issue #107)
|
||||
- Fixed output of large integers in dictionaries (Issue #108)
|
||||
- Fixed handling of 0-length streams (Issue #111)
|
||||
- Fixed detection of UTF-16 Big-Endian strings (Issue #112)
|
||||
|
||||
|
||||
v1.5.0 - 2025-03-06
|
||||
-------------------
|
||||
|
||||
- Added support for embedded color profiles in JPEG images (Issue #7)
|
||||
- Added `pdfioFileCreateICCObjFromData` API.
|
||||
- Added support for writing cross-reference streams for PDF 1.5 and newer files
|
||||
(Issue #10)
|
||||
- Added `pdfioFileGetModificationDate()` API (Issue #88)
|
||||
- Added support for using libpng to embed PNG images in PDF output (Issue #90)
|
||||
- Added support for writing the PCLm subset of PDF (Issue #99)
|
||||
- Now support opening damaged PDF files (Issue #45)
|
||||
- Updated documentation (Issue #95)
|
||||
- Updated the pdf2text example to support font encodings.
|
||||
- Fixed potential heap/integer overflow issues in the TrueType cmap code.
|
||||
- Fixed an output issue for extremely small `double` values with the
|
||||
`pdfioContent` APIs.
|
||||
- Fixed a missing Widths array issue for embedded TrueType fonts.
|
||||
- Fixed some Unicode font embedding issues.
|
||||
|
||||
|
||||
v1.4.1 - 2025-01-24
|
||||
-------------------
|
||||
|
||||
|
@@ -15,7 +15,7 @@
 .SILENT:


-# Version number...
+# Version numbers...
 PDFIO_VERSION = @PDFIO_VERSION@
 PDFIO_VERSION_MAJOR = @PDFIO_VERSION_MAJOR@
 PDFIO_VERSION_MINOR = @PDFIO_VERSION_MINOR@
256
configure
vendored
@@ -1,6 +1,6 @@
 #! /bin/sh
 # Guess values for system-dependent variables and create Makefiles.
-# Generated by GNU Autoconf 2.71 for pdfio 1.4.1.
+# Generated by GNU Autoconf 2.71 for pdfio 1.5.3.
 #
 # Report bugs to <https://github.com/michaelrsweet/pdfio/issues>.
 #
@ -610,8 +610,8 @@ MAKEFLAGS=
|
||||
# Identity of this package.
|
||||
PACKAGE_NAME='pdfio'
|
||||
PACKAGE_TARNAME='pdfio'
|
||||
PACKAGE_VERSION='1.4.1'
|
||||
PACKAGE_STRING='pdfio 1.4.1'
|
||||
PACKAGE_VERSION='1.5.3'
|
||||
PACKAGE_STRING='pdfio 1.5.3'
|
||||
PACKAGE_BUGREPORT='https://github.com/michaelrsweet/pdfio/issues'
|
||||
PACKAGE_URL='https://www.msweet.org/pdfio'
|
||||
|
||||
@ -653,6 +653,7 @@ WARNINGS
|
||||
CSFLAGS
|
||||
LIBPDFIO_STATIC
|
||||
LIBPDFIO
|
||||
PKGCONFIG_LIBPNG
|
||||
PKGCONFIG_REQUIRES
|
||||
PKGCONFIG_LIBS_PRIVATE
|
||||
PKGCONFIG_LIBS
|
||||
@ -729,6 +730,7 @@ SHELL'
|
||||
ac_subst_files=''
|
||||
ac_user_opts='
|
||||
enable_option_checking
|
||||
enable_libpng
|
||||
enable_static
|
||||
enable_shared
|
||||
enable_debug
|
||||
@ -1293,7 +1295,7 @@ if test "$ac_init_help" = "long"; then
|
||||
# Omit some internal or obsolete options to make the list less imposing.
|
||||
# This message is too long to be a string in the A/UX 3.1 sh.
|
||||
cat <<_ACEOF
|
||||
\`configure' configures pdfio 1.4.1 to adapt to many kinds of systems.
|
||||
\`configure' configures pdfio 1.5.3 to adapt to many kinds of systems.
|
||||
|
||||
Usage: $0 [OPTION]... [VAR=VALUE]...
|
||||
|
||||
@ -1359,7 +1361,7 @@ fi
|
||||
|
||||
if test -n "$ac_init_help"; then
|
||||
case $ac_init_help in
|
||||
short | recursive ) echo "Configuration of pdfio 1.4.1:";;
|
||||
short | recursive ) echo "Configuration of pdfio 1.5.3:";;
|
||||
esac
|
||||
cat <<\_ACEOF
|
||||
|
||||
@ -1367,6 +1369,8 @@ Optional Features:
|
||||
--disable-option-checking ignore unrecognized --enable/--with options
|
||||
--disable-FEATURE do not include FEATURE (same as --enable-FEATURE=no)
|
||||
--enable-FEATURE[=ARG] include FEATURE [ARG=yes]
|
||||
--enable-libpng use libpng for pdfioFileCreateImageObjFromFile,
|
||||
default=auto
|
||||
--disable-static do not install static library
|
||||
--enable-shared install shared library
|
||||
--enable-debug turn on debugging, default=no
|
||||
@ -1456,7 +1460,7 @@ fi
|
||||
test -n "$ac_init_help" && exit $ac_status
|
||||
if $ac_init_version; then
|
||||
cat <<\_ACEOF
|
||||
pdfio configure 1.4.1
|
||||
pdfio configure 1.5.3
|
||||
generated by GNU Autoconf 2.71
|
||||
|
||||
Copyright (C) 2021 Free Software Foundation, Inc.
|
||||
@ -1509,39 +1513,6 @@ fi
|
||||
|
||||
} # ac_fn_c_try_compile
|
||||
|
||||
# ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES
|
||||
# -------------------------------------------------------
|
||||
# Tests whether HEADER exists and can be compiled using the include files in
|
||||
# INCLUDES, setting the cache variable VAR accordingly.
|
||||
ac_fn_c_check_header_compile ()
|
||||
{
|
||||
as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
|
||||
printf %s "checking for $2... " >&6; }
|
||||
if eval test \${$3+y}
|
||||
then :
|
||||
printf %s "(cached) " >&6
|
||||
else $as_nop
|
||||
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||
/* end confdefs.h. */
|
||||
$4
|
||||
#include <$2>
|
||||
_ACEOF
|
||||
if ac_fn_c_try_compile "$LINENO"
|
||||
then :
|
||||
eval "$3=yes"
|
||||
else $as_nop
|
||||
eval "$3=no"
|
||||
fi
|
||||
rm -f core conftest.err conftest.$ac_objext conftest.beam conftest.$ac_ext
|
||||
fi
|
||||
eval ac_res=\$$3
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
|
||||
printf "%s\n" "$ac_res" >&6; }
|
||||
eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
|
||||
|
||||
} # ac_fn_c_check_header_compile
|
||||
|
||||
# ac_fn_c_try_link LINENO
|
||||
# -----------------------
|
||||
# Try to link conftest.$ac_ext, and return whether this succeeded.
|
||||
@ -1588,6 +1559,101 @@ fi
|
||||
as_fn_set_status $ac_retval
|
||||
|
||||
} # ac_fn_c_try_link
|
||||
|
||||
# ac_fn_c_check_func LINENO FUNC VAR
|
||||
# ----------------------------------
|
||||
# Tests whether FUNC exists, setting the cache variable VAR accordingly
|
||||
ac_fn_c_check_func ()
|
||||
{
|
||||
as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
|
||||
printf %s "checking for $2... " >&6; }
|
||||
if eval test \${$3+y}
|
||||
then :
|
||||
printf %s "(cached) " >&6
|
||||
else $as_nop
|
||||
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||
/* end confdefs.h. */
|
||||
/* Define $2 to an innocuous variant, in case <limits.h> declares $2.
|
||||
For example, HP-UX 11i <limits.h> declares gettimeofday. */
|
||||
#define $2 innocuous_$2
|
||||
|
||||
/* System header to define __stub macros and hopefully few prototypes,
|
||||
which can conflict with char $2 (); below. */
|
||||
|
||||
#include <limits.h>
|
||||
#undef $2
|
||||
|
||||
/* Override any GCC internal prototype to avoid an error.
|
||||
Use char because int might match the return type of a GCC
|
||||
builtin and then its argument prototype would still apply. */
|
||||
#ifdef __cplusplus
|
||||
extern "C"
|
||||
#endif
|
||||
char $2 ();
|
||||
/* The GNU C library defines this for functions which it implements
|
||||
to always fail with ENOSYS. Some functions are actually named
|
||||
something starting with __ and the normal name is an alias. */
|
||||
#if defined __stub_$2 || defined __stub___$2
|
||||
choke me
|
||||
#endif
|
||||
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
return $2 ();
|
||||
;
|
||||
return 0;
|
||||
}
|
||||
_ACEOF
|
||||
if ac_fn_c_try_link "$LINENO"
|
||||
then :
|
||||
eval "$3=yes"
|
||||
else $as_nop
|
||||
eval "$3=no"
|
||||
fi
|
||||
rm -f core conftest.err conftest.$ac_objext conftest.beam \
|
||||
conftest$ac_exeext conftest.$ac_ext
|
||||
fi
|
||||
eval ac_res=\$$3
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
|
||||
printf "%s\n" "$ac_res" >&6; }
|
||||
eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
|
||||
|
||||
} # ac_fn_c_check_func
|
||||
|
||||
# ac_fn_c_check_header_compile LINENO HEADER VAR INCLUDES
|
||||
# -------------------------------------------------------
|
||||
# Tests whether HEADER exists and can be compiled using the include files in
|
||||
# INCLUDES, setting the cache variable VAR accordingly.
|
||||
ac_fn_c_check_header_compile ()
|
||||
{
|
||||
as_lineno=${as_lineno-"$1"} as_lineno_stack=as_lineno_stack=$as_lineno_stack
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for $2" >&5
|
||||
printf %s "checking for $2... " >&6; }
|
||||
if eval test \${$3+y}
|
||||
then :
|
||||
printf %s "(cached) " >&6
|
||||
else $as_nop
|
||||
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||
/* end confdefs.h. */
|
||||
$4
|
||||
#include <$2>
|
||||
_ACEOF
|
||||
if ac_fn_c_try_compile "$LINENO"
|
||||
then :
|
||||
eval "$3=yes"
|
||||
else $as_nop
|
||||
eval "$3=no"
|
||||
fi
|
||||
rm -f core conftest.err conftest.$ac_objext conftest.beam conftest.$ac_ext
|
||||
fi
|
||||
eval ac_res=\$$3
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: $ac_res" >&5
|
||||
printf "%s\n" "$ac_res" >&6; }
|
||||
eval $as_lineno_stack; ${as_lineno_stack:+:} unset as_lineno
|
||||
|
||||
} # ac_fn_c_check_header_compile
|
||||
ac_configure_args_raw=
|
||||
for ac_arg
|
||||
do
|
||||
@ -1612,7 +1678,7 @@ cat >config.log <<_ACEOF
|
||||
This file contains any messages produced by compilers while
|
||||
running configure, to aid debugging if configure makes a mistake.
|
||||
|
||||
It was created by pdfio $as_me 1.4.1, which was
|
||||
It was created by pdfio $as_me 1.5.3, which was
|
||||
generated by GNU Autoconf 2.71. Invocation command line was
|
||||
|
||||
$ $0$ac_configure_args_raw
|
||||
@ -2368,9 +2434,9 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
|
||||
|
||||
|
||||
|
||||
PDFIO_VERSION="1.4.1"
|
||||
PDFIO_VERSION_MAJOR="`echo 1.4.1 | awk -F. '{print $1}'`"
|
||||
PDFIO_VERSION_MINOR="`echo 1.4.1 | awk -F. '{printf("%d\n",$2);}'`"
|
||||
PDFIO_VERSION="1.5.3"
|
||||
PDFIO_VERSION_MAJOR="`echo 1.5.3 | awk -F. '{print $1}'`"
|
||||
PDFIO_VERSION_MINOR="`echo 1.5.3 | awk -F. '{printf("%d\n",$2);}'`"
|
||||
|
||||
|
||||
|
||||
@ -3873,6 +3939,56 @@ INSTALL="$(pwd)/install-sh"
|
||||
printf "%s\n" "using $INSTALL" >&6; }
|
||||
|
||||
|
||||
|
||||
ac_fn_c_check_func "$LINENO" "timegm" "ac_cv_func_timegm"
|
||||
if test "x$ac_cv_func_timegm" = xyes
|
||||
then :
|
||||
|
||||
|
||||
printf "%s\n" "#define HAVE_TIMEGM 1" >>confdefs.h
|
||||
|
||||
CPPFLAGS="-DHAVE_TIMEGM=1 $CPPFLAGS"
|
||||
|
||||
fi
|
||||
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for tm_gmtoff member in tm structure" >&5
|
||||
printf %s "checking for tm_gmtoff member in tm structure... " >&6; }
|
||||
cat confdefs.h - <<_ACEOF >conftest.$ac_ext
|
||||
/* end confdefs.h. */
|
||||
|
||||
#include <time.h>
|
||||
int
|
||||
main (void)
|
||||
{
|
||||
|
||||
struct tm t;
|
||||
int o = t.tm_gmtoff;
|
||||
|
||||
;
|
||||
return 0;
|
||||
}
|
||||
|
||||
_ACEOF
|
||||
if ac_fn_c_try_compile "$LINENO"
|
||||
then :
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: yes" >&5
|
||||
printf "%s\n" "yes" >&6; }
|
||||
|
||||
printf "%s\n" "#define HAVE_TM_GMTOFF 1" >>confdefs.h
|
||||
|
||||
CPPFLAGS="-DHAVE_TM_GMTOFF=1 $CPPFLAGS"
|
||||
|
||||
else $as_nop
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: no" >&5
|
||||
printf "%s\n" "no" >&6; }
|
||||
|
||||
fi
|
||||
rm -f core conftest.err conftest.$ac_objext conftest.beam conftest.$ac_ext
|
||||
|
||||
|
||||
if test -n "$ac_tool_prefix"; then
|
||||
# Extract the first word of "${ac_tool_prefix}pkg-config", so it can be a program name with args.
|
||||
set dummy ${ac_tool_prefix}pkg-config; ac_word=$2
|
||||
@ -3994,7 +4110,6 @@ PKGCONFIG_REQUIRES="zlib"
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for zlib via pkg-config" >&5
|
||||
printf %s "checking for zlib via pkg-config... " >&6; }
|
||||
|
||||
ac_header= ac_cache=
|
||||
for ac_item in $ac_header_c_list
|
||||
do
|
||||
@ -4099,6 +4214,55 @@ fi
|
||||
fi
|
||||
|
||||
|
||||
# Check whether --enable-libpng was given.
|
||||
if test ${enable_libpng+y}
|
||||
then :
|
||||
enableval=$enable_libpng;
|
||||
fi
|
||||
|
||||
|
||||
PKGCONFIG_LIBPNG=""
|
||||
|
||||
|
||||
if test "x$PKGCONFIG" != x -a x$enable_libpng != xno
|
||||
then :
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: checking for libpng-1.6.x" >&5
|
||||
printf %s "checking for libpng-1.6.x... " >&6; }
|
||||
if $PKGCONFIG --exists libpng16
|
||||
then :
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: yes" >&5
|
||||
printf "%s\n" "yes" >&6; };
|
||||
|
||||
printf "%s\n" "#define HAVE_LIBPNG 1" >>confdefs.h
|
||||
|
||||
CPPFLAGS="$($PKGCONFIG --cflags libpng16) -DHAVE_LIBPNG=1 $CPPFLAGS"
|
||||
LIBS="$($PKGCONFIG --libs libpng16) -lz $LIBS"
|
||||
PKGCONFIG_LIBS_PRIVATE="$($PKGCONFIG --libs libpng16) $PKGCONFIG_LIBS_PRIVATE"
|
||||
PKGCONFIG_REQUIRES="libpng >= 1.6,$PKGCONFIG_REQUIRES"
|
||||
|
||||
else $as_nop
|
||||
|
||||
{ printf "%s\n" "$as_me:${as_lineno-$LINENO}: result: no" >&5
|
||||
printf "%s\n" "no" >&6; };
|
||||
if test x$enable_libpng = xyes
|
||||
then :
|
||||
|
||||
as_fn_error $? "libpng-dev 1.6 or later required for --enable-libpng." "$LINENO" 5
|
||||
|
||||
fi
|
||||
|
||||
fi
|
||||
|
||||
elif test x$enable_libpng = xyes
|
||||
then :
|
||||
|
||||
as_fn_error $? "libpng-dev 1.6 or later required for --enable-libpng." "$LINENO" 5
|
||||
|
||||
fi
|
||||
|
||||
|
||||
# Check whether --enable-static was given.
|
||||
if test ${enable_static+y}
|
||||
then :
|
||||
@ -4935,7 +5099,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
|
||||
# report actual input values of CONFIG_FILES etc. instead of their
|
||||
# values after options handling.
|
||||
ac_log="
|
||||
This file was extended by pdfio $as_me 1.4.1, which was
|
||||
This file was extended by pdfio $as_me 1.5.3, which was
|
||||
generated by GNU Autoconf 2.71. Invocation command line was
|
||||
|
||||
CONFIG_FILES = $CONFIG_FILES
|
||||
@ -4991,7 +5155,7 @@ ac_cs_config_escaped=`printf "%s\n" "$ac_cs_config" | sed "s/^ //; s/'/'\\\\\\\\
|
||||
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
|
||||
ac_cs_config='$ac_cs_config_escaped'
|
||||
ac_cs_version="\\
|
||||
pdfio config.status 1.4.1
|
||||
pdfio config.status 1.5.3
|
||||
configured by $0, generated by GNU Autoconf 2.71,
|
||||
with options \\"\$ac_cs_config\\"
|
||||
|
||||
|
51
configure.ac
@@ -1,7 +1,7 @@
 dnl
 dnl Configuration script for PDFio
 dnl
-dnl Copyright © 2023-2024 by Michael R Sweet
+dnl Copyright © 2023-2025 by Michael R Sweet
 dnl
 dnl Licensed under Apache License v2.0. See the file "LICENSE" for more
 dnl information.
@@ -21,7 +21,7 @@ AC_PREREQ([2.70])


 dnl Package name and version...
-AC_INIT([pdfio], [1.4.1], [https://github.com/michaelrsweet/pdfio/issues], [pdfio], [https://www.msweet.org/pdfio])
+AC_INIT([pdfio], [1.5.3], [https://github.com/michaelrsweet/pdfio/issues], [pdfio], [https://www.msweet.org/pdfio])

 PDFIO_VERSION="AC_PACKAGE_VERSION"
 PDFIO_VERSION_MAJOR="`echo AC_PACKAGE_VERSION | awk -F. '{print $1}'`"
@ -88,6 +88,27 @@ AC_SUBST([INSTALL])
|
||||
AC_MSG_RESULT([using $INSTALL])
|
||||
|
||||
|
||||
dnl Check for date/time functionality...
|
||||
AC_CHECK_FUNC([timegm], [
|
||||
AC_DEFINE([HAVE_TIMEGM], [1], [Do we have the timegm function?])
|
||||
CPPFLAGS="-DHAVE_TIMEGM=1 $CPPFLAGS"
|
||||
])
|
||||
|
||||
AC_MSG_CHECKING([for tm_gmtoff member in tm structure])
|
||||
AC_COMPILE_IFELSE([
|
||||
AC_LANG_PROGRAM([[#include <time.h>]], [[
|
||||
struct tm t;
|
||||
int o = t.tm_gmtoff;
|
||||
]])
|
||||
], [
|
||||
AC_MSG_RESULT([yes])
|
||||
AC_DEFINE([HAVE_TM_GMTOFF], [1], [Have tm_gmtoff member in struct tm?])
|
||||
CPPFLAGS="-DHAVE_TM_GMTOFF=1 $CPPFLAGS"
|
||||
], [
|
||||
AC_MSG_RESULT([no])
|
||||
])
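The two checks above only define `HAVE_TIMEGM` and `HAVE_TM_GMTOFF`; how the library uses them is not shown in this diff. As a rough sketch of one way C code might consume `HAVE_TIMEGM`, the block below converts a UTC `struct tm` to `time_t` and falls back to plain calendar arithmetic when `timegm` is unavailable. The helper names `days_from_civil` and `utc_tm_to_time` are hypothetical and not part of PDFio, and the fallback assumes `time_t` counts seconds since 1970-01-01 UTC, as on POSIX systems.

```c
#include <assert.h>
#include <time.h>

// Hypothetical helper (not a PDFio API): number of days from 1970-01-01 to
// the given proleptic Gregorian date. Months are 1-12 and days are 1-31.
static long long
days_from_civil(long long year, int month, int day)
{
  long long era, yoe, doy, doe;

  assert(month >= 1 && month <= 12);

  // Treat January and February as the last months of the previous year so
  // that the leap day falls at the end of the counting year.
  if (month <= 2)
    year --;

  era = (year >= 0 ? year : year - 399) / 400;  // 400-year cycle number
  yoe = year - era * 400;                       // Year of cycle, 0-399
  doy = (153 * (month > 2 ? month - 3 : month + 9) + 2) / 5 + day - 1;
  doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;  // Day of cycle

  return (era * 146097 + doe - 719468);
}

// Hypothetical helper: convert broken-down UTC time to seconds since 1970.
static time_t
utc_tm_to_time(struct tm *utc)
{
  assert(utc != NULL);

#ifdef HAVE_TIMEGM
  // timegm is a common BSD/GNU extension, not ISO C, hence the configure test
  return (timegm(utc));
#else
  long long days = days_from_civil(utc->tm_year + 1900LL, utc->tm_mon + 1,
                                   utc->tm_mday);

  return ((time_t)(days * 86400 + utc->tm_hour * 3600 + utc->tm_min * 60 +
                   utc->tm_sec));
#endif
}
```

Without `HAVE_TIMEGM` defined, the fallback path is compiled, so the sketch builds on platforms that lack the function.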
|
||||
|
||||
|
||||
dnl Check for pkg-config, which is used for some other tests later on...
|
||||
AC_PATH_TOOL([PKGCONFIG], [pkg-config])
|
||||
|
||||
@ -121,6 +142,32 @@ AS_IF([$PKGCONFIG --exists zlib], [
|
||||
])
|
||||
|
||||
|
||||
dnl libpng...
|
||||
AC_ARG_ENABLE([libpng], AS_HELP_STRING([--enable-libpng], [use libpng for pdfioFileCreateImageObjFromFile, default=auto]))
|
||||
|
||||
PKGCONFIG_LIBPNG=""
|
||||
AC_SUBST([PKGCONFIG_LIBPNG])
|
||||
|
||||
AS_IF([test "x$PKGCONFIG" != x -a x$enable_libpng != xno], [
|
||||
AC_MSG_CHECKING([for libpng-1.6.x])
|
||||
AS_IF([$PKGCONFIG --exists libpng16], [
|
||||
AC_MSG_RESULT([yes]);
|
||||
AC_DEFINE([HAVE_LIBPNG], 1, [Have PNG library?])
|
||||
CPPFLAGS="$($PKGCONFIG --cflags libpng16) -DHAVE_LIBPNG=1 $CPPFLAGS"
|
||||
LIBS="$($PKGCONFIG --libs libpng16) -lz $LIBS"
|
||||
PKGCONFIG_LIBS_PRIVATE="$($PKGCONFIG --libs libpng16) $PKGCONFIG_LIBS_PRIVATE"
|
||||
PKGCONFIG_REQUIRES="libpng >= 1.6,$PKGCONFIG_REQUIRES"
|
||||
], [
|
||||
AC_MSG_RESULT([no]);
|
||||
AS_IF([test x$enable_libpng = xyes], [
|
||||
AC_MSG_ERROR([libpng-dev 1.6 or later required for --enable-libpng.])
|
||||
])
|
||||
])
|
||||
], [test x$enable_libpng = xyes], [
|
||||
AC_MSG_ERROR([libpng-dev 1.6 or later required for --enable-libpng.])
|
||||
])
|
||||
|
||||
|
||||
dnl Library target...
|
||||
AC_ARG_ENABLE([static], AS_HELP_STRING([--disable-static], [do not install static library]))
|
||||
AC_ARG_ENABLE([shared], AS_HELP_STRING([--enable-shared], [install shared library]))
|
||||
|
457
doc/pdfio.3
@@ -1,4 +1,4 @@
-.TH pdfio 3 "pdf read/write library" "2025-01-24" "pdf read/write library"
+.TH pdfio 3 "pdf read/write library" "2025-05-03" "pdf read/write library"
 .SH NAME
 pdfio \- pdf read/write library
 .SH Introduction
@ -34,7 +34,7 @@ PDFio is
|
||||
.I not
|
||||
concerned with rendering or viewing a PDF file, although a PDF RIP or viewer could be written using it.
|
||||
.PP
|
||||
PDFio is Copyright \[co] 2021\-2024 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
|
||||
PDFio is Copyright \[co] 2021\-2025 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
|
||||
.SS Requirements
|
||||
.PP
|
||||
PDFio requires the following to build the software:
|
||||
@ -52,9 +52,11 @@ A POSIX\-compliant sh program
|
||||
|
||||
.IP \(bu 5
|
||||
.PP
|
||||
ZLIB (https://www.zlib.net) 1.0 or higher
|
||||
ZLIB (https://www.zlib.net/) 1.0 or higher
|
||||
|
||||
|
||||
.PP
|
||||
PDFio will also use libpng 1.6 or higher (https://www.libpng.org/) to provide enhanced PNG image support.
|
||||
.PP
|
||||
IDE files for Xcode (macOS/iOS) and Visual Studio (Windows) are also provided.
|
||||
.SS Installing PDFio
|
||||
@ -323,7 +325,7 @@ where the five arguments to the function are the filename ("myinputfile.pdf"), a
|
||||
}
|
||||
.fi
|
||||
.PP
|
||||
The error callback is called for both errors and warnings and accepts the pdfio_file_t pointer, a message string, and the callback pointer value, for example:
|
||||
The error callback is called for both errors and warnings and accepts the pdfio_file_t pointer, a message string, and the callback pointer value. It returns true to continue processing the file or false to stop, for example:
|
||||
.nf
|
||||
|
||||
bool
|
||||
@ -333,12 +335,15 @@ The error callback is called for both errors and warnings and accepts the pdfio_
|
||||
|
||||
fprintf(stderr, "%s: %s\\n", pdfioFileGetName(pdf), message);
|
||||
|
||||
// Return false to treat warnings as errors
|
||||
return (false);
|
||||
// Return true for warning messages (continue) and false for errors (stop)
|
||||
return (!strncmp(message, "WARNING:", 8));
|
||||
}
|
||||
.fi
|
||||
.PP
|
||||
The default error callback (NULL) does the equivalent of the above.
|
||||
.PP
|
||||
Note: Many errors are unrecoverable, so PDFio ignores the return value from the error callback and always stops processing the PDF file. Warning messages start with the prefix "WARNING:" while errors have no prefix.
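The prefix convention in the note can be checked in isolation. The following is a minimal sketch, assuming only what the text states (warnings start with "WARNING:", errors have no prefix). The helper name `message_is_warning` is hypothetical and not part of PDFio; a real error callback would also receive the `pdfio_file_t` pointer and the callback data.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical helper (not a PDFio API): returns true when the message
// starts with the "WARNING:" prefix and false for errors, which carry no
// prefix. The comparison is case-sensitive and covers exactly 8 characters.
static bool
message_is_warning(const char *message)
{
  assert(message != NULL);

  return (strncmp(message, "WARNING:", 8) == 0);
}
```

A callback built on this helper could return its result directly, which matches the `strncmp` expression used in the example above.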
|
||||
|
||||
.PP
|
||||
Each PDF file contains one or more pages. The pdfioFileGetNumPages function returns the number of pages in the file while the pdfioFileGetPage function gets the specified page in the PDF file:
|
||||
.nf
|
||||
@ -1045,11 +1050,26 @@ The pdfioinfo.c example program opens a PDF file and prints the title, author, c
|
||||
{
|
||||
const char *filename; // PDF filename
|
||||
pdfio_file_t *pdf; // PDF file
|
||||
const char *author; // Author name
|
||||
time_t creation_date; // Creation date
|
||||
struct tm *creation_tm; // Creation date/time information
|
||||
char creation_text[256]; // Creation date/time as a string
|
||||
const char *title; // Title
|
||||
pdfio_dict_t *catalog; // Catalog dictionary
|
||||
const char *author, // Author name
|
||||
*creator, // Creator name
|
||||
*producer, // Producer name
|
||||
*title; // Title
|
||||
time_t creation_date, // Creation date
|
||||
modification_date; // Modification date
|
||||
struct tm *creation_tm, // Creation date/time information
|
||||
*modification_tm; // Modification date/time information
|
||||
char creation_text[256], // Creation date/time as a string
|
||||
modification_text[256], // Modification date/time human fmt string
|
||||
range_text[255]; // Page range text
|
||||
size_t num_pages; // PDF number of pages
|
||||
bool has_acroform; // Does the file have an AcroForm?
|
||||
pdfio_obj_t *page; // Object
|
||||
pdfio_dict_t *page_dict; // Object dictionary
|
||||
size_t cur, // Current page index
|
||||
prev; // Previous page index
|
||||
pdfio_rect_t cur_box, // Current MediaBox
|
||||
prev_box; // Previous MediaBox
|
||||
|
||||
|
||||
// Get the filename from the command\-line...
|
||||
@ -1062,14 +1082,20 @@ The pdfioinfo.c example program opens a PDF file and prints the title, author, c
|
||||
filename = argv[1];
|
||||
|
||||
// Open the PDF file with the default callbacks...
|
||||
pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_cbdata*/NULL,
|
||||
/*error_cb*/NULL, /*error_cbdata*/NULL);
|
||||
pdf = pdfioFileOpen(filename, /*password_cb*/NULL,
|
||||
/*password_cbdata*/NULL, /*error_cb*/NULL,
|
||||
/*error_cbdata*/NULL);
|
||||
if (pdf == NULL)
|
||||
return (1);
|
||||
|
||||
// Get the title and author...
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
// Get the title, author, etc...
|
||||
catalog = pdfioFileGetCatalog(pdf);
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
creator = pdfioFileGetCreator(pdf);
|
||||
has_acroform = pdfioDictGetType(catalog, "AcroForm") != PDFIO_VALTYPE_NONE;
|
||||
num_pages = pdfioFileGetNumPages(pdf);
|
||||
producer = pdfioFileGetProducer(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
|
||||
// Get the creation date and convert to a string...
|
||||
if ((creation_date = pdfioFileGetCreationDate(pdf)) > 0)
|
||||
@ -1082,12 +1108,76 @@ The pdfioinfo.c example program opens a PDF file and prints the title, author, c
|
||||
snprintf(creation_text, sizeof(creation_text), "\-\- not set \-\-");
|
||||
}
|
||||
|
||||
// Get the modification date and convert to a string...
|
||||
if ((modification_date = pdfioFileGetModificationDate(pdf)) > 0)
|
||||
{
|
||||
modification_tm = localtime(&modification_date);
|
||||
strftime(modification_text, sizeof(modification_text), "%c", modification_tm);
|
||||
}
|
||||
else
|
||||
{
|
||||
snprintf(modification_text, sizeof(modification_text), "\-\- not set \-\-");
|
||||
}
|
||||
|
||||
// Print file information to stdout...
|
||||
printf("%s:\\n", filename);
|
||||
printf(" Title: %s\\n", title ? title : "\-\- not set \-\-");
|
||||
printf(" Author: %s\\n", author ? author : "\-\- not set \-\-");
|
||||
printf(" Created On: %s\\n", creation_text);
|
||||
printf(" Number Pages: %u\\n", (unsigned)pdfioFileGetNumPages(pdf));
|
||||
printf(" Title: %s\\n", title ? title : "\-\- not set \-\-");
|
||||
printf(" Author: %s\\n", author ? author : "\-\- not set \-\-");
|
||||
printf(" Creator: %s\\n", creator ? creator : "\-\- not set \-\-");
|
||||
printf(" Producer: %s\\n", producer ? producer : "\-\- not set \-\-");
|
||||
printf(" Created On: %s\\n", creation_text);
|
||||
printf(" Modified On: %s\\n", modification_text);
|
||||
printf(" Version: %s\\n", pdfioFileGetVersion(pdf));
|
||||
printf(" AcroForm: %s\\n", has_acroform ? "Yes" : "No");
|
||||
printf(" Number of Pages: %u\\n", (unsigned)num_pages);
|
||||
|
||||
// Report the MediaBox for all of the pages
|
||||
prev_box.x1 = prev_box.x2 = prev_box.y1 = prev_box.y2 = 0.0;
|
||||
|
||||
for (cur = 0, prev = 0; cur < num_pages; cur ++)
|
||||
{
|
||||
// Find the MediaBox for this page in the page tree...
|
||||
for (page = pdfioFileGetPage(pdf, cur);
|
||||
page != NULL;
|
||||
page = pdfioDictGetObj(page_dict, "Parent"))
|
||||
{
|
||||
cur_box.x1 = cur_box.x2 = cur_box.y1 = cur_box.y2 = 0.0;
|
||||
page_dict = pdfioObjGetDict(page);
|
||||
|
||||
if (pdfioDictGetRect(page_dict, "MediaBox", &cur_box))
|
||||
break;
|
||||
}
|
||||
|
||||
// If this MediaBox is different from the previous one, show the range of
|
||||
// pages that have that size...
|
||||
if (cur == 0 ||
|
||||
fabs(cur_box.x1 \- prev_box.x1) > 0.01 ||
|
||||
fabs(cur_box.y1 \- prev_box.y1) > 0.01 ||
|
||||
fabs(cur_box.x2 \- prev_box.x2) > 0.01 ||
|
||||
fabs(cur_box.y2 \- prev_box.y2) > 0.01)
|
||||
{
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u\-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Start a new series of pages with the new size...
|
||||
prev = cur;
|
||||
prev_box = cur_box;
|
||||
}
|
||||
}
|
||||
|
||||
// Show the last range as needed...
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u\-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Close the PDF file...
|
||||
pdfioFileClose(pdf);
|
||||
@ -1097,28 +1187,83 @@ The pdfioinfo.c example program opens a PDF file and prints the title, author, c
|
||||
.fi
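The page-size grouping in the listing above can be tried without a PDF file. The sketch below applies the same 0.01 tolerance to an in-memory array of boxes and records each run of equally sized pages. The types `box_t` and `range_t` and the function names are hypothetical stand-ins (`box_t` mirrors the four fields of `pdfio_rect_t` used above), so this illustrates the technique rather than reproducing the example's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { double x1, y1, x2, y2; } box_t;  // Stand-in for pdfio_rect_t
typedef struct { size_t first, last; } range_t;   // 1-based, inclusive pages

// Absolute difference without needing the math library
static double
abs_diff(double a, double b)
{
  return (a > b ? a - b : b - a);
}

// True when any coordinate differs by more than the 0.01 tolerance
static bool
boxes_differ(const box_t *a, const box_t *b)
{
  return (abs_diff(a->x1, b->x1) > 0.01 || abs_diff(a->y1, b->y1) > 0.01 ||
          abs_diff(a->x2, b->x2) > 0.01 || abs_diff(a->y2, b->y2) > 0.01);
}

// Hypothetical helper: store up to max_ranges runs of same-sized pages and
// return how many were stored.
static size_t
group_pages_by_box(const box_t *boxes, size_t num_pages, range_t *ranges,
                   size_t max_ranges)
{
  size_t cur, prev = 0, count = 0;

  assert(boxes != NULL && ranges != NULL);

  for (cur = 1; cur <= num_pages; cur ++)
  {
    // Close the current run at the end of the list or when the size changes;
    // the first test short-circuits so boxes[num_pages] is never read.
    if (cur == num_pages || boxes_differ(boxes + cur, boxes + prev))
    {
      if (count < max_ranges)
      {
        ranges[count].first = prev + 1;
        ranges[count].last  = cur;
        count ++;
      }

      prev = cur;
    }
  }

  return (count);
}
```

For two US Letter pages followed by one A4 page, the helper reports the runs 1-2 and 3-3.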
|
||||
.SS Extract Text from PDF File
|
||||
.PP
|
||||
The pdf2text.c example code extracts non\-Unicode text from a PDF file by scanning each page for strings and text drawing commands. Since it doesn't look at the font encoding or support Unicode text, it is really only useful to extract plain ASCII text from a PDF file. And since it writes text in the order it appears in the page stream, it may not come out in the same order as appears on the page.
|
||||
The pdf2text.c example code extracts text from a PDF file and writes it to the standard output. Unlike some other PDF tools, it outputs the text in the order it is seen in each page stream so the output might appear "jumbled" if the PDF producer doesn't output text in reading order. The code is able to handle different font encodings and produces UTF\-8 output.
|
||||
.PP
|
||||
The pdfioStreamGetToken function is used to read individual tokens from the page streams. Tokens starting with the open parenthesis are text strings, while PDF operators are left as\-is. We use some simple logic to make sure that we include spaces between text strings and add newlines for the text operators that start a new line in a text block:
|
||||
The pdfioStreamGetToken function is used to read individual tokens from the page streams:
|
||||
.nf
|
||||
|
||||
pdfio_stream_t *st; // Page stream
|
||||
char buffer[1024], // Token buffer
|
||||
*bufptr, // Pointer into buffer
|
||||
name[256]; // Current (font) name
|
||||
bool first = true; // First string on line?
|
||||
char buffer[1024]; // Token buffer
|
||||
int encoding[256]; // Font encoding to Unicode
|
||||
bool in_array = false; // Are we in an array?
|
||||
|
||||
// Read PDF tokens from the page stream...
|
||||
while (pdfioStreamGetToken(st, buffer, sizeof(buffer)))
|
||||
{
|
||||
if (buffer[0] == '(')
|
||||
.fi
|
||||
.PP
|
||||
Justified text can be found inside arrays ("[ ... ]"), so we look for the array delimiter tokens and any (spacing) numbers inside an array. Experimentation has shown that numbers with an absolute value greater than 100 can be treated as whitespace:
|
||||
.nf
|
||||
|
||||
if (!strcmp(buffer, "["))
|
||||
{
|
||||
// Start of an array for justified text...
|
||||
in_array = true;
|
||||
}
|
||||
else if (!strcmp(buffer, "]"))
|
||||
{
|
||||
// End of an array for justified text...
|
||||
in_array = false;
|
||||
}
|
||||
else if (!first && in_array && (isdigit(buffer[0]) || buffer[0] == '\-') && fabs(atof(buffer)) > 100)
|
||||
{
|
||||
// Whitespace in a justified text block...
|
||||
putchar(' ');
|
||||
}
|
||||
.fi
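The spacing test in the excerpt above can be isolated into a small helper, which may make the threshold easier to adjust. This is only a sketch: the `is_word_gap` name is hypothetical, and the threshold of 100 is the experimental value quoted in the text, not a value defined by the PDF format:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Return true when a token from inside a "[ ... ]" array looks like a
// spacing adjustment large enough to treat as a word break.
static bool
is_word_gap(const char *token)
{
  double value; // Numeric value of the token

  // Only tokens starting with a digit or minus sign are considered,
  // matching the check in the excerpt above
  if (!isdigit((unsigned char)token[0]) && token[0] != '-')
    return (false);

  value = strtod(token, NULL);

  // Compare the magnitude against the experimental threshold
  return (value > 100.0 || value < -100.0);
}
```

The cast to `unsigned char` keeps the `isdigit` call well-defined for bytes above 127.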
|
||||
.PP
|
||||
Tokens starting with \'(' or \'<' are text fragments. 8\-bit text starting with \'(' needs to be mapped to Unicode using the current font encoding while hex strings starting with \'<' are UTF\-16 (Unicode) that need to be converted to UTF\-8:
|
||||
.nf
|
||||
|
||||
else if (buffer[0] == '(')
|
||||
{
|
||||
// Text string using an 8\-bit encoding
|
||||
if (first)
|
||||
first = false;
|
||||
else if (buffer[1] != ' ')
|
||||
putchar(' ');
|
||||
first = false;
|
||||
|
||||
fputs(buffer + 1, stdout);
|
||||
for (bufptr = buffer + 1; *bufptr; bufptr ++)
|
||||
put_utf8(encoding[*bufptr & 255]);
|
||||
}
|
||||
else if (buffer[0] == '<')
|
||||
{
|
||||
// Unicode text string
|
||||
first = false;
|
||||
|
||||
puts_utf16(buffer + 1);
|
||||
}
|
||||
.fi
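The excerpt above calls put_utf8 and puts_utf16 without showing them. As a rough illustration of the UTF\-8 step only, the sketch below encodes one Unicode code point into a caller\-supplied buffer. The `encode_utf8` name and its buffer\-based interface are assumptions made for this example and may differ from the helper in pdf2text.c:

```c
#include <stddef.h>

// Encode one Unicode code point as UTF-8. Returns the number of bytes
// written (0 for values outside U+0000..U+10FFFF). "out" must have room
// for 4 bytes plus a terminating nul. Surrogate values are not rejected.
static size_t
encode_utf8(int ch, char out[5])
{
  size_t n = 0; // Number of bytes written

  if (ch >= 0 && ch < 0x80)
  {
    // 1 byte: plain ASCII
    out[n ++] = (char)ch;
  }
  else if (ch >= 0x80 && ch < 0x800)
  {
    // 2 bytes
    out[n ++] = (char)(0xc0 | (ch >> 6));
    out[n ++] = (char)(0x80 | (ch & 0x3f));
  }
  else if (ch >= 0x800 && ch < 0x10000)
  {
    // 3 bytes
    out[n ++] = (char)(0xe0 | (ch >> 12));
    out[n ++] = (char)(0x80 | ((ch >> 6) & 0x3f));
    out[n ++] = (char)(0x80 | (ch & 0x3f));
  }
  else if (ch >= 0x10000 && ch <= 0x10ffff)
  {
    // 4 bytes
    out[n ++] = (char)(0xf0 | (ch >> 18));
    out[n ++] = (char)(0x80 | ((ch >> 12) & 0x3f));
    out[n ++] = (char)(0x80 | ((ch >> 6) & 0x3f));
    out[n ++] = (char)(0x80 | (ch & 0x3f));
  }

  out[n] = '\0';

  return (n);
}
```

A caller could pass each `encoding[...]` value through such a function and write the resulting bytes to the standard output.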
|
||||
.PP
|
||||
Simple (8\-bit) fonts include an encoding table that maps the 8\-bit characters to one of 1051 Unicode glyph names. Since each font can use a different encoding, we look for font names starting with \'/' and the "Tf" (set text font) operator token and load that font's encoding using the load_encoding function:
|
||||
.nf
|
||||
|
||||
else if (buffer[0] == '/')
|
||||
{
|
||||
// Save name...
|
||||
strncpy(name, buffer + 1, sizeof(name) \- 1);
|
||||
name[sizeof(name) \- 1] = '\\0';
|
||||
}
|
||||
else if (!strcmp(buffer, "Tf") && name[0])
|
||||
{
|
||||
// Set font...
|
||||
load_encoding(obj, name, encoding);
|
||||
}
|
||||
.fi
|
||||
.PP
|
||||
Finally, some text operators start a new line in a text block, so when we see their tokens we output a newline:
|
||||
.nf
|
||||
|
||||
else if (!strcmp(buffer, "Td") || !strcmp(buffer, "TD") || !strcmp(buffer, "T*") ||
|
||||
!strcmp(buffer, "\\'") || !strcmp(buffer, "\\""))
|
||||
{
|
||||
@ -1127,9 +1272,150 @@ The pdfioStreamGetToken function is used to read individual tokens from the page
|
||||
first = true;
|
||||
}
|
||||
}
|
||||
.fi
|
||||
.PP
|
||||
The load_encoding Function
|
||||
.PP
|
||||
The load_encoding function looks up the named font in the page's "Resources" dictionary. A PDF simple font can contain an "Encoding" dictionary with a base encoding ("WinAnsiEncoding", "MacRomanEncoding", or "MacExpertEncoding") and a "Differences" array that lists character indexes and glyph names for an 8\-bit font.
|
||||
.PP
|
||||
We start by initializing the encoding array to the default WinANSI encoding and looking up the font object for the named font:
|
||||
.nf
|
||||
|
||||
static void
|
||||
load_encoding(
|
||||
pdfio_obj_t *page_obj, // I \- Page object
|
||||
const char *name, // I \- Font name
|
||||
int encoding[256]) // O \- Encoding table
|
||||
{
|
||||
size_t i, j; // Looping vars
|
||||
pdfio_dict_t *page_dict, // Page dictionary
|
||||
*resources_dict, // Resources dictionary
|
||||
*font_dict; // Font dictionary
|
||||
pdfio_obj_t *font_obj, // Font object
|
||||
*encoding_obj; // Encoding object
|
||||
static int win_ansi[32] = // WinANSI characters from 128 to 159
|
||||
{
|
||||
...
|
||||
};
|
||||
static int mac_roman[128] = // MacRoman characters from 128 to 255
|
||||
{
|
||||
...
|
||||
};
|
||||
|
||||
if (!first)
|
||||
putchar('\\n');
|
||||
|
||||
// Initialize the encoding to be the "standard" WinAnsi...
|
||||
for (i = 0; i < 128; i ++)
|
||||
encoding[i] = i;
|
||||
for (i = 160; i < 256; i ++)
|
||||
encoding[i] = i;
|
||||
memcpy(encoding + 128, win_ansi, sizeof(win_ansi));
|
||||
|
||||
// Find the named font...
|
||||
if ((page_dict = pdfioObjGetDict(page_obj)) == NULL)
|
||||
return;
|
||||
|
||||
if ((resources_dict = pdfioDictGetDict(page_dict, "Resources")) == NULL)
|
||||
return;
|
||||
|
||||
if ((font_dict = pdfioDictGetDict(resources_dict, "Font")) == NULL)
|
||||
{
|
||||
// Font resources not a dictionary, see if it is an object...
|
||||
if ((font_obj = pdfioDictGetObj(resources_dict, "Font")) != NULL)
|
||||
font_dict = pdfioObjGetDict(font_obj);
|
||||
|
||||
if (!font_dict)
|
||||
return;
|
||||
}
|
||||
|
||||
if ((font_obj = pdfioDictGetObj(font_dict, name)) == NULL)
|
||||
return;
|
||||
.fi
|
||||
.PP
|
||||
Once we have found the font we see if it has an "Encoding" dictionary:
|
||||
.nf
|
||||
|
||||
pdfio_dict_t *encoding_dict; // Encoding dictionary
|
||||
|
||||
if ((encoding_obj = pdfioDictGetObj(pdfioObjGetDict(font_obj), "Encoding")) == NULL)
|
||||
return;
|
||||
|
||||
if ((encoding_dict = pdfioObjGetDict(encoding_obj)) == NULL)
|
||||
return;
|
||||
.fi
|
||||
.PP
|
||||
Once we have the encoding dictionary we can get the "BaseEncoding" and "Differences" values:
|
||||
.nf
|
||||
|
||||
const char *base_encoding; // BaseEncoding name
|
||||
pdfio_array_t *differences; // Differences array
|
||||
|
||||
// OK, have the encoding object, build the encoding using it...
|
||||
base_encoding = pdfioDictGetName(encoding_dict, "BaseEncoding");
|
||||
differences = pdfioDictGetArray(encoding_dict, "Differences");
|
||||
.fi
|
||||
.PP
|
||||
If the base encoding is "MacRomanEncoding", we need to reset the upper 128 characters in the encoding array to match it:
|
||||
.nf
|
||||
|
||||
if (base_encoding && !strcmp(base_encoding, "MacRomanEncoding"))
|
||||
{
|
||||
// Map upper 128
|
||||
memcpy(encoding + 128, mac_roman, sizeof(mac_roman));
|
||||
}
|
||||
.fi
|
||||
.PP
|
||||
Then we loop through the differences array, keeping track of the current index within the encoding array. A number indicates a new index while a name is the Unicode glyph for the current index:
|
||||
.nf
|
||||
|
||||
typedef struct name_map_s
|
||||
{
|
||||
const char *name; // Character name
|
||||
int unicode; // Unicode value
|
||||
} name_map_t;
|
||||
|
||||
static name_map_t unicode_map[1051]; // List of glyph names
|
||||
|
||||
if (differences)
|
||||
{
|
||||
// Apply differences
|
||||
size_t count = pdfioArrayGetSize(differences);
|
||||
// Number of differences
|
||||
const char *name; // Character name
|
||||
size_t idx = 0; // Index in encoding array
|
||||
|
||||
for (i = 0; i < count; i ++)
|
||||
{
|
||||
switch (pdfioArrayGetType(differences, i))
|
||||
{
|
||||
case PDFIO_VALTYPE_NUMBER :
|
||||
// Get the index of the next character...
|
||||
idx = (size_t)pdfioArrayGetNumber(differences, i);
|
||||
break;
|
||||
|
||||
case PDFIO_VALTYPE_NAME :
|
||||
// Lookup name and apply to encoding...
|
||||
if (idx > 255)
|
||||
break;
|
||||
|
||||
name = pdfioArrayGetName(differences, i);
|
||||
for (j = 0; j < (sizeof(unicode_map) / sizeof(unicode_map[0])); j ++)
|
||||
{
|
||||
if (!strcmp(name, unicode_map[j].name))
|
||||
{
|
||||
encoding[idx] = unicode_map[j].unicode;
|
||||
break;
|
||||
}
|
||||
}
|
||||
idx ++;
|
||||
break;
|
||||
|
||||
default :
|
||||
// Do nothing for other values
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
.fi
|
||||
.SS Create a PDF File With Text and an Image
|
||||
.PP
|
||||
@ -2194,7 +2480,7 @@ PDFIO_ENCRYPTION_RC4_128
|
||||
.TP 5
|
||||
PDFIO_ENCRYPTION_RC4_40
|
||||
.br
|
||||
40-bit RC4 encryption (PDF 1.3)
|
||||
40-bit RC4 encryption (PDF 1.3, reading only)
|
||||
.SS pdfio_filter_e
|
||||
Compression/decompression filters for streams
|
||||
.TP 5
|
||||
@ -2664,6 +2950,8 @@ bool pdfioArrayRemove (
|
||||
size_t n
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioContentClip
|
||||
Clip output to the current path.
|
||||
.PP
|
||||
@ -2800,6 +3088,8 @@ bool pdfioContentPathEnd (
|
||||
pdfio_stream_t *st
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioContentPathLineTo
|
||||
Add a straight line to the current path.
|
||||
.PP
|
||||
@ -3140,6 +3430,8 @@ double pdfioContentTextMeasure (
|
||||
This function measures the given text string "s" and returns its width based
|
||||
on "size". The text string must always use the UTF-8 (Unicode) encoding but
|
||||
any control characters (such as newlines) are ignored.
|
||||
|
||||
|
||||
.SS pdfioContentTextMoveLine
|
||||
Move to the next line and offset.
|
||||
.PP
|
||||
@ -3168,6 +3460,8 @@ bool pdfioContentTextNewLine (
|
||||
pdfio_stream_t *st
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioContentTextNewLineShow
|
||||
Move to the next line and show text.
|
||||
.PP
|
||||
@ -3185,6 +3479,8 @@ This function moves to the next line and then shows some text with optional
|
||||
word and character spacing in a PDF content stream. The "unicode" argument
|
||||
specifies that the current font maps to full Unicode. The "s" argument
|
||||
specifies a UTF-8 encoded string.
|
||||
|
||||
|
||||
.SS pdfioContentTextNewLineShowf
|
||||
Show formatted text.
|
||||
.PP
|
||||
@ -3203,6 +3499,8 @@ This function moves to the next line and shows some formatted text with
|
||||
optional word and character spacing in a PDF content stream. The "unicode"
|
||||
argument specifies that the current font maps to full Unicode. The "format"
|
||||
argument specifies a UTF-8 encoded \fBprintf\fR-style format string.
|
||||
|
||||
|
||||
.SS pdfioContentTextShow
|
||||
Show text.
|
||||
.PP
|
||||
@ -3253,6 +3551,8 @@ bool pdfioDictClear (
|
||||
const char *key
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioDictCopy
|
||||
Copy a dictionary to a PDF file.
|
||||
.PP
|
||||
@ -3325,6 +3625,8 @@ const char * pdfioDictGetKey (
|
||||
size_t n
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioDictGetName
|
||||
Get a key name value from a dictionary.
|
||||
.PP
|
||||
@ -3342,6 +3644,8 @@ size_t pdfioDictGetNumPairs (
|
||||
pdfio_dict_t *dict
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioDictGetNumber
|
||||
Get a key number value from a dictionary.
|
||||
.PP
|
||||
@ -3414,6 +3718,8 @@ function "cb":
|
||||
|
||||
The iteration continues as long as the callback returns \fBtrue\fR and stops
|
||||
once all keys have been iterated.
|
||||
|
||||
|
||||
.SS pdfioDictSetArray
|
||||
Set a key array in a dictionary.
|
||||
.PP
|
||||
@ -3561,15 +3867,17 @@ This function creates a new PDF file. The "filename" argument specifies the
|
||||
name of the PDF file to create.
|
||||
.PP
|
||||
The "version" argument specifies the PDF version number for the file or
|
||||
\fBNULL\fR for the default ("2.0").
|
||||
\fBNULL\fR for the default ("2.0"). The value "PCLm-1.0" can be specified to
|
||||
produce the PCLm subset of PDF.
|
||||
.PP
|
||||
The "media_box" and "crop_box" arguments specify the default MediaBox and
|
||||
CropBox for pages in the PDF file - if \fBNULL\fR then a default "Universal" size
|
||||
of 8.27x11in (the intersection of US Letter and ISO A4) is used.
|
||||
.PP
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if \fBNULL\fR the default error handler is used that
|
||||
writes error messages to \fBstderr\fR.
|
||||
and its data pointer - if \fBNULL\fR then the default error handler is used that
|
||||
writes error messages to \fBstderr\fR. The error handler callback should return
|
||||
\fBtrue\fR to continue writing the PDF file or \fBfalse\fR to stop.
|
||||
.SS pdfioFileCreateArrayObj
|
||||
Create a new object in a PDF file containing an array.
|
||||
.PP
|
||||
@ -3643,8 +3951,19 @@ This function embeds a TrueType/OpenType font into a PDF file. The
|
||||
characters (potentially full Unicode, but more typically a subset)
|
||||
or to only support the Windows CP1252 (ISO-8859-1 with additional
|
||||
characters such as the Euro symbol) subset of Unicode.
|
||||
.SS pdfioFileCreateICCObjFromData
|
||||
Add ICC profile data to a PDF file.
|
||||
.PP
|
||||
.nf
|
||||
pdfio_obj_t * pdfioFileCreateICCObjFromData (
|
||||
pdfio_file_t *pdf,
|
||||
const unsigned char *data,
|
||||
size_t datalen,
|
||||
size_t num_colors
|
||||
);
|
||||
.fi
|
||||
.SS pdfioFileCreateICCObjFromFile
|
||||
Add an ICC profile object to a PDF file.
|
||||
Add an ICC profile file to a PDF file.
|
||||
.PP
|
||||
.nf
|
||||
pdfio_obj_t * pdfioFileCreateICCObjFromFile (
|
||||
@ -3716,6 +4035,8 @@ pdfio_obj_t * pdfioFileCreateNameObj (
|
||||
.PP
|
||||
This function creates a new object with a name value in a PDF file.
|
||||
You must call \fIpdfioObjClose\fR to write the object to the file.
|
||||
|
||||
|
||||
.SS pdfioFileCreateNumberObj
|
||||
Create a new object in a PDF file containing a number.
|
||||
.PP
|
||||
@ -3728,6 +4049,8 @@ pdfio_obj_t * pdfioFileCreateNumberObj (
|
||||
.PP
|
||||
This function creates a new object with a number value in a PDF file.
|
||||
You must call \fIpdfioObjClose\fR to write the object to the file.
|
||||
|
||||
|
||||
.SS pdfioFileCreateObj
|
||||
Create a new object in a PDF file.
|
||||
.PP
|
||||
@ -3767,15 +4090,18 @@ written:
|
||||
.fi
|
||||
|
||||
The "version" argument specifies the PDF version number for the file or
|
||||
\fBNULL\fR for the default ("2.0").
|
||||
\fBNULL\fR for the default ("2.0"). Unlike \fIpdfioFileCreate\fR and
|
||||
\fIpdfioFileCreateTemporary\fR, it is generally not safe to pass the
|
||||
"PCLm-1.0" version string.
|
||||
.PP
|
||||
The "media_box" and "crop_box" arguments specify the default MediaBox and
|
||||
CropBox for pages in the PDF file - if \fBNULL\fR then a default "Universal" size
|
||||
of 8.27x11in (the intersection of US Letter and ISO A4) is used.
|
||||
.PP
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if \fBNULL\fR the default error handler is used that
|
||||
writes error messages to \fBstderr\fR.
|
||||
and its data pointer - if \fBNULL\fR then the default error handler is used that
|
||||
writes error messages to \fBstderr\fR. The error handler callback should return
|
||||
\fBtrue\fR to continue writing the PDF file or \fBfalse\fR to stop.
|
||||
.PP
|
||||
.IP 5
|
||||
\fINote\fR: Files created using this API are slightly larger than those
|
||||
@ -3804,6 +4130,8 @@ pdfio_obj_t * pdfioFileCreateStringObj (
|
||||
.PP
|
||||
This function creates a new object with a string value in a PDF file.
|
||||
You must call \fIpdfioObjClose\fR to write the object to the file.
|
||||
|
||||
|
||||
.SS pdfioFileCreateTemporary
|
||||
|
||||
.PP
|
||||
@ -3880,6 +4208,14 @@ const char * pdfioFileGetKeywords (
|
||||
pdfio_file_t *pdf
|
||||
);
|
||||
.fi
|
||||
.SS pdfioFileGetModificationDate
|
||||
Get the most recent modification date for a PDF file.
|
||||
.PP
|
||||
.nf
|
||||
time_t pdfioFileGetModificationDate (
|
||||
pdfio_file_t *pdf
|
||||
);
|
||||
.fi
|
||||
.SS pdfioFileGetName
|
||||
Get a PDF's filename.
|
||||
.PP
|
||||
@ -3989,8 +4325,18 @@ cancel the open. If \fBNULL\fR is specified for the callback function and the
|
||||
PDF file requires a password, the open will always fail.
|
||||
.PP
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if \fBNULL\fR the default error handler is used that
|
||||
writes error messages to \fBstderr\fR.
|
||||
and its data pointer - if \fBNULL\fR then the default error handler is used that
|
||||
writes error messages to \fBstderr\fR. The error handler callback should return
|
||||
\fBtrue\fR to continue reading the PDF file or \fBfalse\fR to stop.
|
||||
.PP
|
||||
.IP 5
|
||||
Note: Error messages starting with "WARNING:" are actually warning
|
||||
messages - the callback should normally return \fBtrue\fR to allow PDFio to
|
||||
try to resolve the issue. In addition, some errors are unrecoverable and
|
||||
ignore the return value of the error callback.
|
||||
.SS pdfioFileSetAuthor
|
||||
Set the author for a PDF file.
|
||||
.PP
|
||||
@ -4027,6 +4373,15 @@ void pdfioFileSetKeywords (
|
||||
const char *value
|
||||
);
|
||||
.fi
|
||||
.SS pdfioFileSetModificationDate
|
||||
Set the modification date for a PDF file.
|
||||
.PP
|
||||
.nf
|
||||
void pdfioFileSetModificationDate (
|
||||
pdfio_file_t *pdf,
|
||||
time_t value
|
||||
);
|
||||
.fi
|
||||
.SS pdfioFileSetPermissions
|
||||
Set the PDF permissions, encryption mode, and passwords.
|
||||
.PP
|
||||
@ -4160,6 +4515,8 @@ const char * pdfioObjGetName (
|
||||
pdfio_obj_t *obj
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
|
||||
.SS pdfioObjGetNumber
|
||||
Get the object's number.
|
||||
.PP
|
||||
@ -4334,12 +4691,13 @@ bool pdfioStreamGetToken (
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
This function reads a single PDF token from a stream. Operator tokens,
|
||||
boolean values, and numbers are returned as-is in the provided string buffer.
|
||||
String values start with the opening parenthesis ('(') but have all escaping
|
||||
resolved and the terminating parenthesis removed. Hexadecimal string values
|
||||
start with the opening angle bracket ('<') and have all whitespace and the
|
||||
terminating angle bracket removed.
|
||||
This function reads a single PDF token from a stream, skipping all whitespace
|
||||
and comments. Operator tokens, boolean values, and numbers are returned
|
||||
as-is in the provided string buffer. String values start with the opening
|
||||
parenthesis ('(') but have all escaping resolved and the terminating
|
||||
parenthesis removed. Hexadecimal string values start with the opening angle
|
||||
bracket ('<') and have all whitespace and the terminating angle bracket
|
||||
removed.
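Because a hexadecimal string token keeps its leading '<' and contains only hex digits, a caller still has to turn the digits into bytes. The sketch below shows one way to do that. The `hex_value` and `decode_hex_token` names are hypothetical and are not part of the PDFio API. It follows the PDF convention that a missing final digit is treated as 0:

```c
#include <stddef.h>

// Return the value of a hex digit or -1 if "c" is not a hex digit
static int
hex_value(char c)
{
  if (c >= '0' && c <= '9')
    return (c - '0');
  else if (c >= 'a' && c <= 'f')
    return (c - 'a' + 10);
  else if (c >= 'A' && c <= 'F')
    return (c - 'A' + 10);
  else
    return (-1);
}

// Decode a token such as "<48656C" into bytes, returning the byte count.
// Decoding stops at the first non-hex character or when "out" is full.
static size_t
decode_hex_token(const char *token, unsigned char *out, size_t outsize)
{
  size_t n = 0; // Number of bytes decoded
  int    hi,    // High nibble
         lo;    // Low nibble

  // Only tokens that start with '<' are hex strings
  if (*token != '<')
    return (0);

  token ++;

  while (n < outsize && (hi = hex_value(*token)) >= 0)
  {
    token ++;

    // An odd number of digits means the last digit is followed by 0
    if ((lo = hex_value(*token)) >= 0)
      token ++;
    else
      lo = 0;

    out[n ++] = (unsigned char)((hi << 4) | lo);
  }

  return (n);
}
```

For Unicode text, the decoded bytes would then be combined in pairs into UTF\-16 code units before conversion to UTF\-8.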
|
||||
.SS pdfioStreamPeek
|
||||
Peek at data in a stream.
|
||||
.PP
|
||||
@ -4360,6 +4718,11 @@ bool pdfioStreamPrintf (
|
||||
...
|
||||
);
|
||||
.fi
|
||||
.PP
|
||||
This function writes a formatted string to a stream. In addition to the
|
||||
standard \fBprintf\fR format characters, you can use "%H" to format a HTML/XML
|
||||
string value, "%N" to format a PDF name value ("/Name"), and "%S" to format
|
||||
a PDF string ("(String)") value.
|
||||
.SS pdfioStreamPutChar
|
||||
Write a single character to a stream.
|
||||
.PP
|
||||
|
491
doc/pdfio.html
@ -1,13 +1,13 @@
|
||||
<!DOCTYPE html>
|
||||
<html lang="en-US">
|
||||
<head>
|
||||
<title>PDFio Programming Manual v1.4.1</title>
|
||||
<title>PDFio Programming Manual v1.5.3</title>
|
||||
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
|
||||
<meta name="generator" content="codedoc v3.8">
|
||||
<meta name="author" content="Michael R Sweet">
|
||||
<meta name="language" content="en-US">
|
||||
<meta name="copyright" content="Copyright © 2021-2025 by Michael R Sweet">
|
||||
<meta name="version" content="1.4.1">
|
||||
<meta name="version" content="1.5.3">
|
||||
<style type="text/css"><!--
|
||||
body {
|
||||
background: white;
|
||||
@ -251,7 +251,7 @@ span.string {
|
||||
<body>
|
||||
<div class="header">
|
||||
<p><img class="title" src="pdfio-512.png"></p>
|
||||
<h1 class="title">PDFio Programming Manual v1.4.1</h1>
|
||||
<h1 class="title">PDFio Programming Manual v1.5.3</h1>
|
||||
<p>Michael R Sweet</p>
|
||||
<p>Copyright © 2021-2025 by Michael R Sweet</p>
|
||||
</div>
|
||||
@ -400,6 +400,7 @@ span.string {
|
||||
<li><a href="#pdfioFileCreateArrayObj">pdfioFileCreateArrayObj</a></li>
|
||||
<li><a href="#pdfioFileCreateFontObjFromBase">pdfioFileCreateFontObjFromBase</a></li>
|
||||
<li><a href="#pdfioFileCreateFontObjFromFile">pdfioFileCreateFontObjFromFile</a></li>
|
||||
<li><a href="#pdfioFileCreateICCObjFromData">pdfioFileCreateICCObjFromData</a></li>
|
||||
<li><a href="#pdfioFileCreateICCObjFromFile">pdfioFileCreateICCObjFromFile</a></li>
|
||||
<li><a href="#pdfioFileCreateImageObjFromData">pdfioFileCreateImageObjFromData</a></li>
|
||||
<li><a href="#pdfioFileCreateImageObjFromFile">pdfioFileCreateImageObjFromFile</a></li>
|
||||
@ -417,6 +418,7 @@ span.string {
|
||||
<li><a href="#pdfioFileGetCreator">pdfioFileGetCreator</a></li>
|
||||
<li><a href="#pdfioFileGetID">pdfioFileGetID</a></li>
|
||||
<li><a href="#pdfioFileGetKeywords">pdfioFileGetKeywords</a></li>
|
||||
<li><a href="#pdfioFileGetModificationDate">pdfioFileGetModificationDate</a></li>
|
||||
<li><a href="#pdfioFileGetName">pdfioFileGetName</a></li>
|
||||
<li><a href="#pdfioFileGetNumObjs">pdfioFileGetNumObjs</a></li>
|
||||
<li><a href="#pdfioFileGetNumPages">pdfioFileGetNumPages</a></li>
|
||||
@ -432,6 +434,7 @@ span.string {
|
||||
<li><a href="#pdfioFileSetCreationDate">pdfioFileSetCreationDate</a></li>
|
||||
<li><a href="#pdfioFileSetCreator">pdfioFileSetCreator</a></li>
|
||||
<li><a href="#pdfioFileSetKeywords">pdfioFileSetKeywords</a></li>
|
||||
<li><a href="#pdfioFileSetModificationDate">pdfioFileSetModificationDate</a></li>
|
||||
<li><a href="#pdfioFileSetPermissions">pdfioFileSetPermissions</a></li>
|
||||
<li><a href="#pdfioFileSetSubject">pdfioFileSetSubject</a></li>
|
||||
<li><a href="#pdfioFileSetTitle">pdfioFileSetTitle</a></li>
|
||||
@ -522,7 +525,7 @@ span.string {
|
||||
</li>
|
||||
</ul>
|
||||
<p>PDFio is <em>not</em> concerned with rendering or viewing a PDF file, although a PDF RIP or viewer could be written using it.</p>
|
||||
<p>PDFio is Copyright © 2021-2024 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.</p>
|
||||
<p>PDFio is Copyright © 2021-2025 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.</p>
|
||||
<h3 class="title" id="requirements">Requirements</h3>
|
||||
<p>PDFio requires the following to build the software:</p>
|
||||
<ul>
|
||||
@ -532,9 +535,10 @@ span.string {
|
||||
</li>
|
||||
<li><p>A POSIX-compliant <code>sh</code> program</p>
|
||||
</li>
|
||||
<li><p>ZLIB (<a href="https://www.zlib.net">https://www.zlib.net</a>) 1.0 or higher</p>
|
||||
<li><p>ZLIB (<a href="https://www.zlib.net/">https://www.zlib.net/</a>) 1.0 or higher</p>
|
||||
</li>
|
||||
</ul>
|
||||
<p>PDFio will also use libpng 1.6 or higher (<a href="https://www.libpng.org/">https://www.libpng.org/</a>) to provide enhanced PNG image support.</p>
|
||||
<p>IDE files for Xcode (macOS/iOS) and Visual Studio (Windows) are also provided.</p>
|
||||
<h3 class="title" id="installing-pdfio">Installing PDFio</h3>
|
||||
<p>PDFio comes with a configure script that creates a portable makefile that will work on any POSIX-compliant system with ZLIB installed. To make it, run:</p>
|
||||
@ -724,7 +728,7 @@ password_cb(<span class="reserved">void</span> *data, <span class="reserved">con
|
||||
<span class="reserved">return</span> (<span class="string">"Password42"</span>);
|
||||
}
|
||||
</code></pre>
|
||||
<p>The error callback is called for both errors and warnings and accepts the <code>pdfio_file_t</code> pointer, a message string, and the callback pointer value, for example:</p>
|
||||
<p>The error callback is called for both errors and warnings and accepts the <code>pdfio_file_t</code> pointer, a message string, and the callback pointer value. It returns <code>true</code> to continue processing the file or <code>false</code> to stop, for example:</p>
|
||||
<pre><code class="language-c"><span class="reserved">bool</span>
|
||||
error_cb(pdfio_file_t *pdf, <span class="reserved">const</span> <span class="reserved">char</span> *message, <span class="reserved">void</span> *data)
|
||||
{
|
||||
@ -732,11 +736,14 @@ error_cb(pdfio_file_t *pdf, <span class="reserved">const</span> <span class="res
|
||||
|
||||
fprintf(stderr, <span class="string">"%s: %s\n"</span>, pdfioFileGetName(pdf), message);
|
||||
|
||||
<span class="comment">// Return false to treat warnings as errors</span>
|
||||
<span class="reserved">return</span> (<span class="reserved">false</span>);
|
||||
<span class="comment">// Return true for warning messages (continue) and false for errors (stop)</span>
|
||||
<span class="reserved">return</span> (!strncmp(message, <span class="string">"WARNING:"</span>, <span class="number">8</span>));
|
||||
}
|
||||
</code></pre>
|
||||
<p>The default error callback (<code>NULL</code>) does the equivalent of the above.</p>
|
||||
<blockquote>
|
||||
<p>Note: Many errors are unrecoverable; for those errors PDFio ignores the return value from the error callback and always stops processing the PDF file. Warning messages start with the prefix "WARNING:" while errors have no prefix.</p>
|
||||
</blockquote>
|
||||
<p>Each PDF file contains one or more pages. The <a href="#pdfioFileGetNumPages"><code>pdfioFileGetNumPages</code></a> function returns the number of pages in the file while the <a href="#pdfioFileGetPage"><code>pdfioFileGetPage</code></a> function gets the specified page in the PDF file:</p>
|
||||
<pre><code class="language-c">pdfio_file_t *pdf; <span class="comment">// PDF file</span>
|
||||
size_t i; <span class="comment">// Looping var</span>
|
||||
@ -1161,11 +1168,26 @@ main(<span class="reserved">int</span> argc, <span clas
|
||||
{
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *filename; <span class="comment">// PDF filename</span>
|
||||
pdfio_file_t *pdf; <span class="comment">// PDF file</span>
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *author; <span class="comment">// Author name</span>
|
||||
time_t creation_date; <span class="comment">// Creation date</span>
|
||||
<span class="reserved">struct</span> tm *creation_tm; <span class="comment">// Creation date/time information</span>
|
||||
<span class="reserved">char</span> creation_text[<span class="number">256</span>]; <span class="comment">// Creation date/time as a string</span>
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *title; <span class="comment">// Title</span>
|
||||
pdfio_dict_t *catalog; <span class="comment">// Catalog dictionary</span>
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *author, <span class="comment">// Author name</span>
|
||||
*creator, <span class="comment">// Creator name</span>
|
||||
*producer, <span class="comment">// Producer name</span>
|
||||
*title; <span class="comment">// Title</span>
|
||||
time_t creation_date, <span class="comment">// Creation date</span>
|
||||
modification_date; <span class="comment">// Modification date</span>
|
||||
<span class="reserved">struct</span> tm *creation_tm, <span class="comment">// Creation date/time information</span>
|
||||
*modification_tm; <span class="comment">// Modification date/time information</span>
|
||||
<span class="reserved">char</span> creation_text[<span class="number">256</span>], <span class="comment">// Creation date/time as a string</span>
|
||||
modification_text[<span class="number">256</span>], <span class="comment">// Modification date/time human fmt string</span>
|
||||
range_text[<span class="number">255</span>]; <span class="comment">// Page range text</span>
|
||||
size_t num_pages; <span class="comment">// PDF number of pages</span>
|
||||
<span class="reserved">bool</span> has_acroform; <span class="comment">// Does the file have an AcroForm?</span>
|
||||
pdfio_obj_t *page; <span class="comment">// Object</span>
|
||||
pdfio_dict_t *page_dict; <span class="comment">// Object dictionary</span>
|
||||
size_t cur, <span class="comment">// Current page index</span>
|
||||
prev; <span class="comment">// Previous page index</span>
|
||||
pdfio_rect_t cur_box, <span class="comment">// Current MediaBox</span>
|
||||
prev_box; <span class="comment">// Previous MediaBox</span>
|
||||
|
||||
|
||||
<span class="comment">// Get the filename from the command-line...</span>
|
||||
@ -1178,14 +1200,20 @@ main(<span class="reserved">int</span> argc, <span clas
|
||||
filename = argv[<span class="number">1</span>];
|
||||
|
||||
<span class="comment">// Open the PDF file with the default callbacks...</span>
|
||||
pdf = pdfioFileOpen(filename, <span class="comment">/*password_cb*/</span>NULL, <span class="comment">/*password_cbdata*/</span>NULL,
|
||||
<span class="comment">/*error_cb*/</span>NULL, <span class="comment">/*error_cbdata*/</span>NULL);
|
||||
pdf = pdfioFileOpen(filename, <span class="comment">/*password_cb*/</span>NULL,
|
||||
<span class="comment">/*password_cbdata*/</span>NULL, <span class="comment">/*error_cb*/</span>NULL,
|
||||
<span class="comment">/*error_cbdata*/</span>NULL);
|
||||
<span class="reserved">if</span> (pdf == NULL)
|
||||
<span class="reserved">return</span> (<span class="number">1</span>);
|
||||
|
||||
<span class="comment">// Get the title and author...</span>
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
<span class="comment">// Get the title, author, etc...</span>
|
||||
catalog = pdfioFileGetCatalog(pdf);
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
creator = pdfioFileGetCreator(pdf);
|
||||
has_acroform = pdfioDictGetType(catalog, <span class="string">"AcroForm"</span>) != PDFIO_VALTYPE_NONE;
|
||||
num_pages = pdfioFileGetNumPages(pdf);
|
||||
producer = pdfioFileGetProducer(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
|
||||
<span class="comment">// Get the creation date and convert to a string...</span>
|
||||
<span class="reserved">if</span> ((creation_date = pdfioFileGetCreationDate(pdf)) > <span class="number">0</span>)
|
||||
@ -1198,12 +1226,76 @@ main(<span class="reserved">int</span> argc, <span clas
|
||||
snprintf(creation_text, <span class="reserved">sizeof</span>(creation_text), <span class="string">"-- not set --"</span>);
|
||||
}
|
||||
|
||||
<span class="comment">// Get the modification date and convert to a string...</span>
|
||||
<span class="reserved">if</span> ((modification_date = pdfioFileGetModificationDate(pdf)) > <span class="number">0</span>)
|
||||
{
|
||||
modification_tm = localtime(&modification_date);
|
||||
strftime(modification_text, <span class="reserved">sizeof</span>(modification_text), <span class="string">"%c"</span>, modification_tm);
|
||||
}
|
||||
<span class="reserved">else</span>
|
||||
{
|
||||
snprintf(modification_text, <span class="reserved">sizeof</span>(modification_text), <span class="string">"-- not set --"</span>);
|
||||
}
|
||||
|
||||
<span class="comment">// Print file information to stdout...</span>
|
||||
printf(<span class="string">"%s:\n"</span>, filename);
|
||||
printf(<span class="string">" Title: %s\n"</span>, title ? title : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Author: %s\n"</span>, author ? author : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Created On: %s\n"</span>, creation_text);
|
||||
printf(<span class="string">" Number Pages: %u\n"</span>, (<span class="reserved">unsigned</span>)pdfioFileGetNumPages(pdf));
|
||||
printf(<span class="string">" Title: %s\n"</span>, title ? title : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Author: %s\n"</span>, author ? author : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Creator: %s\n"</span>, creator ? creator : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Producer: %s\n"</span>, producer ? producer : <span class="string">"-- not set --"</span>);
|
||||
printf(<span class="string">" Created On: %s\n"</span>, creation_text);
|
||||
printf(<span class="string">" Modified On: %s\n"</span>, modification_text);
|
||||
printf(<span class="string">" Version: %s\n"</span>, pdfioFileGetVersion(pdf));
|
||||
printf(<span class="string">" AcroForm: %s\n"</span>, has_acroform ? <span class="string">"Yes"</span> : <span class="string">"No"</span>);
|
||||
printf(<span class="string">" Number of Pages: %u\n"</span>, (<span class="reserved">unsigned</span>)num_pages);
|
||||
|
||||
<span class="comment">// Report the MediaBox for all of the pages</span>
|
||||
prev_box.x1 = prev_box.x2 = prev_box.y1 = prev_box.y2 = <span class="number">0.0</span>;
|
||||
|
||||
<span class="reserved">for</span> (cur = <span class="number">0</span>, prev = <span class="number">0</span>; cur < num_pages; cur ++)
|
||||
{
|
||||
<span class="comment">// Find the MediaBox for this page in the page tree...</span>
|
||||
<span class="reserved">for</span> (page = pdfioFileGetPage(pdf, cur);
|
||||
page != NULL;
|
||||
page = pdfioDictGetObj(page_dict, <span class="string">"Parent"</span>))
|
||||
{
|
||||
cur_box.x1 = cur_box.x2 = cur_box.y1 = cur_box.y2 = <span class="number">0.0</span>;
|
||||
page_dict = pdfioObjGetDict(page);
|
||||
|
||||
<span class="reserved">if</span> (pdfioDictGetRect(page_dict, <span class="string">"MediaBox"</span>, &cur_box))
|
||||
<span class="reserved">break</span>;
|
||||
}
|
||||
|
||||
<span class="comment">// If this MediaBox is different from the previous one, show the range of</span>
|
||||
<span class="comment">// pages that have that size...</span>
|
||||
<span class="reserved">if</span> (cur == <span class="number">0</span> ||
|
||||
fabs(cur_box.x1 - prev_box.x1) > <span class="number">0.01</span> ||
|
||||
fabs(cur_box.y1 - prev_box.y1) > <span class="number">0.01</span> ||
|
||||
fabs(cur_box.x2 - prev_box.x2) > <span class="number">0.01</span> ||
|
||||
fabs(cur_box.y2 - prev_box.y2) > <span class="number">0.01</span>)
|
||||
{
|
||||
<span class="reserved">if</span> (cur > prev)
|
||||
{
|
||||
snprintf(range_text, <span class="reserved">sizeof</span>(range_text), <span class="string">"Pages %u-%u"</span>,
|
||||
(<span class="reserved">unsigned</span>)(prev + <span class="number">1</span>), (<span class="reserved">unsigned</span>)cur);
|
||||
printf(<span class="string">"%16s: [%g %g %g %g]\n"</span>, range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
<span class="comment">// Start a new series of pages with the new size...</span>
|
||||
prev = cur;
|
||||
prev_box = cur_box;
|
||||
}
|
||||
}
|
||||
|
||||
<span class="comment">// Show the last range as needed...</span>
|
||||
<span class="reserved">if</span> (cur > prev)
|
||||
{
|
||||
snprintf(range_text, <span class="reserved">sizeof</span>(range_text), <span class="string">"Pages %u-%u"</span>,
|
||||
(<span class="reserved">unsigned</span>)(prev + <span class="number">1</span>), (<span class="reserved">unsigned</span>)cur);
|
||||
printf(<span class="string">"%16s: [%g %g %g %g]\n"</span>, range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
<span class="comment">// Close the PDF file...</span>
|
||||
pdfioFileClose(pdf);
|
||||
@ -1212,26 +1304,69 @@ main(<span class="reserved">int</span> argc, <span clas
|
||||
}
|
||||
</code></pre>
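<p>The page-range grouping used above can be tried on its own without a PDF file. The following is a minimal sketch, not the example program's code: <code>box_t</code>, <code>boxes_differ</code>, and <code>report_page_ranges</code> are hypothetical names, and the boxes are assumed to have already been collected into an array:</p>

```c
#include <stddef.h>
#include <stdio.h>

// Hypothetical stand-in for the rectangle type so the sketch builds without PDFio
typedef struct box_s
{
  double x1, y1, x2, y2;
} box_t;

// Absolute difference of two coordinates
static double
abs_diff(double a, double b)
{
  return (a > b ? a - b : b - a);
}

// Return non-zero when two boxes differ by more than 0.01 units on any edge
static int
boxes_differ(const box_t *a, const box_t *b)
{
  return (abs_diff(a->x1, b->x1) > 0.01 || abs_diff(a->y1, b->y1) > 0.01 ||
          abs_diff(a->x2, b->x2) > 0.01 || abs_diff(a->y2, b->y2) > 0.01);
}

// Print one line per run of same-sized pages and return the number of runs
static size_t
report_page_ranges(const box_t *boxes, size_t num_pages)
{
  size_t cur, prev = 0, runs = 0;

  for (cur = 1; cur <= num_pages; cur ++)
  {
    // A run ends at the last page or when the next page has a different size
    if (cur == num_pages || boxes_differ(boxes + cur, boxes + prev))
    {
      printf("Pages %u-%u: [%g %g %g %g]\n", (unsigned)(prev + 1), (unsigned)cur,
             boxes[prev].x1, boxes[prev].y1, boxes[prev].x2, boxes[prev].y2);
      runs ++;
      prev = cur;
    }
  }

  return (runs);
}
```

<p>Comparing with a small tolerance rather than exact equality keeps pages whose sizes differ only by rounding in the same range.</p>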
|
||||
<h3 class="title" id="extract-text-from-pdf-file">Extract Text from PDF File</h3>
|
||||
<p>The <code>pdf2text.c</code> example code extracts non-Unicode text from a PDF file by scanning each page for strings and text drawing commands. Since it doesn't look at the font encoding or support Unicode text, it is really only useful to extract plain ASCII text from a PDF file. And since it writes text in the order it appears in the page stream, it may not come out in the same order as appears on the page.</p>
|
||||
<p>The <a href="#pdfioStreamGetToken"><code>pdfioStreamGetToken</code></a> function is used to read individual tokens from the page streams. Tokens starting with the open parenthesis are text strings, while PDF operators are left as-is. We use some simple logic to make sure that we include spaces between text strings and add newlines for the text operators that start a new line in a text block:</p>
|
||||
<p>The <code>pdf2text.c</code> example code extracts text from a PDF file and writes it to the standard output. Unlike some other PDF tools, it outputs the text in the order it is seen in each page stream so the output might appear "jumbled" if the PDF producer doesn't output text in reading order. The code is able to handle different font encodings and produces UTF-8 output.</p>
|
||||
<p>The <a href="#pdfioStreamGetToken"><code>pdfioStreamGetToken</code></a> function is used to read individual tokens from the page streams:</p>
|
||||
<pre><code class="language-c">pdfio_stream_t *st; <span class="comment">// Page stream</span>
|
||||
<span class="reserved">char</span> buffer[<span class="number">1024</span>], <span class="comment">// Token buffer</span>
|
||||
*bufptr, <span class="comment">// Pointer into buffer</span>
|
||||
name[<span class="number">256</span>]; <span class="comment">// Current (font) name</span>
|
||||
<span class="reserved">bool</span> first = <span class="reserved">true</span>; <span class="comment">// First string on line?</span>
|
||||
<span class="reserved">char</span> buffer[<span class="number">1024</span>]; <span class="comment">// Token buffer</span>
|
||||
<span class="reserved">int</span> encoding[<span class="number">256</span>]; <span class="comment">// Font encoding to Unicode</span>
|
||||
<span class="reserved">bool</span> in_array = <span class="reserved">false</span>; <span class="comment">// Are we in an array?</span>
|
||||
|
||||
<span class="comment">// Read PDF tokens from the page stream...</span>
|
||||
<span class="reserved">while</span> (pdfioStreamGetToken(st, buffer, <span class="reserved">sizeof</span>(buffer)))
|
||||
{
|
||||
<span class="reserved">if</span> (buffer[<span class="number">0</span>] == <span class="string">'('</span>)
|
||||
</code></pre>
|
||||
<p>Justified text can be found inside arrays ("[ ... ]"), so we look for the array delimiter tokens and any (spacing) numbers inside an array. Experimentation has shown that numbers greater than 100 can be treated as whitespace:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">if</span> (!strcmp(buffer, <span class="string">"["</span>))
|
||||
{
|
||||
<span class="comment">// Start of an array for justified text...</span>
|
||||
in_array = <span class="reserved">true</span>;
|
||||
}
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (!strcmp(buffer, <span class="string">"]"</span>))
|
||||
{
|
||||
<span class="comment">// End of an array for justified text...</span>
|
||||
in_array = <span class="reserved">false</span>;
|
||||
}
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (!first && in_array && (isdigit(buffer[<span class="number">0</span>]) || buffer[<span class="number">0</span>] == <span class="string">'-'</span>) && fabs(atof(buffer)) > <span class="number">100</span>)
|
||||
{
|
||||
<span class="comment">// Whitespace in a justified text block...</span>
|
||||
putchar(<span class="string">' '</span>);
|
||||
}
|
||||
</code></pre>
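<p>The spacing test can also be isolated into a helper and checked separately. This is only a sketch: <code>is_word_gap</code> is a hypothetical name, and it assumes the caller has already determined that the token is inside an array:</p>

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Return true if a token inside a "[ ... ]" array is a spacing adjustment
// large enough to be treated as whitespace (magnitude greater than 100,
// the threshold described in the text above).
static bool
is_word_gap(const char *token)
{
  double value;  // Numeric value of the token

  // Only numeric tokens qualify; strings and operators never do
  if (!isdigit((unsigned char)token[0]) && token[0] != '-')
    return (false);

  value = strtod(token, NULL);

  return (value > 100.0 || value < -100.0);
}
```

<p>For example, <code>is_word_gap("-250")</code> is true while <code>is_word_gap("-20")</code> is false, since small adjustments are usually kerning between letters.</p>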
|
||||
<p>Tokens starting with '(' or '<' are text fragments. 8-bit text starting with '(' needs to be mapped to Unicode using the current font encoding while hex strings starting with '<' are UTF-16 (Unicode) that need to be converted to UTF-8:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">else</span> <span class="reserved">if</span> (buffer[<span class="number">0</span>] == <span class="string">'('</span>)
|
||||
{
|
||||
<span class="comment">// Text string using an 8-bit encoding</span>
|
||||
<span class="reserved">if</span> (first)
|
||||
first = <span class="reserved">false</span>;
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (buffer[<span class="number">1</span>] != <span class="string">' '</span>)
|
||||
putchar(<span class="string">' '</span>);
|
||||
first = <span class="reserved">false</span>;
|
||||
|
||||
fputs(buffer + <span class="number">1</span>, stdout);
|
||||
<span class="reserved">for</span> (bufptr = buffer + <span class="number">1</span>; *bufptr; bufptr ++)
|
||||
put_utf8(encoding[*bufptr & <span class="number">255</span>]);
|
||||
}
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (!strcmp(buffer, <span class="string">"Td"</span>) || !strcmp(buffer, <span class="string">"TD"</span>) || !strcmp(buffer, <span class="string">"T*"</span>) ||
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (buffer[<span class="number">0</span>] == <span class="string">'<'</span>)
|
||||
{
|
||||
<span class="comment">// Unicode text string</span>
|
||||
first = <span class="reserved">false</span>;
|
||||
|
||||
puts_utf16(buffer + <span class="number">1</span>);
|
||||
}
|
||||
</code></pre>
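<p>The <code>put_utf8</code> helper is not shown here. As an illustration of the conversion it needs to perform, the following sketch encodes a single Unicode code point as UTF-8 into a caller-supplied buffer. The name <code>encode_utf8</code> and its interface are assumptions for this sketch and may not match what <code>pdf2text.c</code> does:</p>

```c
#include <stddef.h>

// Encode Unicode code point "ch" as UTF-8 into "out" (at least 4 bytes) and
// return the number of bytes written, or 0 if "ch" is not a valid scalar value.
static size_t
encode_utf8(int ch, unsigned char *out)
{
  // Reject values outside Unicode and the UTF-16 surrogate range
  if (ch < 0 || ch > 0x10ffff || (ch >= 0xd800 && ch <= 0xdfff))
    return (0);

  if (ch < 0x80)
  {
    // 1 byte: plain ASCII
    out[0] = (unsigned char)ch;
    return (1);
  }
  else if (ch < 0x800)
  {
    // 2 bytes: 110xxxxx 10xxxxxx
    out[0] = (unsigned char)(0xc0 | (ch >> 6));
    out[1] = (unsigned char)(0x80 | (ch & 0x3f));
    return (2);
  }
  else if (ch < 0x10000)
  {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out[0] = (unsigned char)(0xe0 | (ch >> 12));
    out[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
    out[2] = (unsigned char)(0x80 | (ch & 0x3f));
    return (3);
  }

  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  out[0] = (unsigned char)(0xf0 | (ch >> 18));
  out[1] = (unsigned char)(0x80 | ((ch >> 12) & 0x3f));
  out[2] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
  out[3] = (unsigned char)(0x80 | (ch & 0x3f));
  return (4);
}
```

<p>A caller that writes to the standard output could pass the resulting bytes to <code>fwrite</code>.</p>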
|
||||
<p>Simple (8-bit) fonts include an encoding table that maps the 8-bit characters to one of 1051 Unicode glyph names. Since each font can use a different encoding, we look for font names starting with '/' and the "Tf" (set text font) operator token and load that font's encoding using the <a href="#the-loadencoding-function">load_encoding</a> function:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">else</span> <span class="reserved">if</span> (buffer[<span class="number">0</span>] == <span class="string">'/'</span>)
|
||||
{
|
||||
<span class="comment">// Save name...</span>
|
||||
strncpy(name, buffer + <span class="number">1</span>, <span class="reserved">sizeof</span>(name) - <span class="number">1</span>);
|
||||
name[<span class="reserved">sizeof</span>(name) - <span class="number">1</span>] = <span class="string">'\0'</span>;
|
||||
}
|
||||
<span class="reserved">else</span> <span class="reserved">if</span> (!strcmp(buffer, <span class="string">"Tf"</span>) && name[<span class="number">0</span>])
|
||||
{
|
||||
<span class="comment">// Set font...</span>
|
||||
load_encoding(obj, name, encoding);
|
||||
}
|
||||
</code></pre>
|
||||
<p>Finally, some text operators start a new line in a text block, so when we see their tokens we output a newline:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">else</span> <span class="reserved">if</span> (!strcmp(buffer, <span class="string">"Td"</span>) || !strcmp(buffer, <span class="string">"TD"</span>) || !strcmp(buffer, <span class="string">"T*"</span>) ||
|
||||
!strcmp(buffer, <span class="string">"\'"</span>) || !strcmp(buffer, <span class="string">"\""</span>))
|
||||
{
|
||||
<span class="comment">// Text operators that advance to the next line in the block</span>
|
||||
@ -1239,9 +1374,133 @@ main(<span class="reserved">int</span> argc, <span clas
|
||||
first = <span class="reserved">true</span>;
|
||||
}
|
||||
}
|
||||
</code></pre>
|
||||
<h4 id="the-loadencoding-function">The <code>load_encoding</code> Function</h4>
|
||||
<p>The <code>load_encoding</code> function looks up the named font in the page's "Resources" dictionary. A PDF simple font can include an "Encoding" dictionary with a base encoding ("WinAnsiEncoding", "MacRomanEncoding", or "MacExpertEncoding") and a differences array that lists character indexes and glyph names for an 8-bit font.</p>
|
||||
<p>We start by initializing the encoding array to the default WinANSI encoding and looking up the font object for the named font:</p>
|
||||
<pre><code class="language-c"><span class="reserved">static</span> <span class="reserved">void</span>
|
||||
load_encoding(
|
||||
pdfio_obj_t *page_obj, <span class="comment">// I - Page object</span>
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *name, <span class="comment">// I - Font name</span>
|
||||
<span class="reserved">int</span> encoding[<span class="number">256</span>]) <span class="comment">// O - Encoding table</span>
|
||||
{
|
||||
size_t i, j; <span class="comment">// Looping vars</span>
|
||||
pdfio_dict_t *page_dict, <span class="comment">// Page dictionary</span>
|
||||
*resources_dict, <span class="comment">// Resources dictionary</span>
|
||||
*font_dict; <span class="comment">// Font dictionary</span>
|
||||
pdfio_obj_t *font_obj, <span class="comment">// Font object</span>
|
||||
*encoding_obj; <span class="comment">// Encoding object</span>
|
||||
<span class="reserved">static</span> <span class="reserved">int</span> win_ansi[<span class="number">32</span>] = <span class="comment">// WinANSI characters from 128 to 159</span>
|
||||
{
|
||||
...
|
||||
};
|
||||
<span class="reserved">static</span> <span class="reserved">int</span> mac_roman[<span class="number">128</span>] = <span class="comment">// MacRoman characters from 128 to 255</span>
|
||||
{
|
||||
...
|
||||
};
|
||||
|
||||
<span class="reserved">if</span> (!first)
|
||||
putchar(<span class="string">'\n'</span>);
|
||||
|
||||
<span class="comment">// Initialize the encoding to be the "standard" WinAnsi...</span>
|
||||
<span class="reserved">for</span> (i = <span class="number">0</span>; i < <span class="number">128</span>; i ++)
|
||||
encoding[i] = i;
|
||||
<span class="reserved">for</span> (i = <span class="number">160</span>; i < <span class="number">256</span>; i ++)
|
||||
encoding[i] = i;
|
||||
memcpy(encoding + <span class="number">128</span>, win_ansi, <span class="reserved">sizeof</span>(win_ansi));
|
||||
|
||||
<span class="comment">// Find the named font...</span>
|
||||
<span class="reserved">if</span> ((page_dict = pdfioObjGetDict(page_obj)) == NULL)
|
||||
<span class="reserved">return</span>;
|
||||
|
||||
<span class="reserved">if</span> ((resources_dict = pdfioDictGetDict(page_dict, <span class="string">"Resources"</span>)) == NULL)
|
||||
<span class="reserved">return</span>;
|
||||
|
||||
<span class="reserved">if</span> ((font_dict = pdfioDictGetDict(resources_dict, <span class="string">"Font"</span>)) == NULL)
|
||||
{
|
||||
<span class="comment">// Font resources not a dictionary, see if it is an object...</span>
|
||||
<span class="reserved">if</span> ((font_obj = pdfioDictGetObj(resources_dict, <span class="string">"Font"</span>)) != NULL)
|
||||
font_dict = pdfioObjGetDict(font_obj);
|
||||
|
||||
<span class="reserved">if</span> (!font_dict)
|
||||
<span class="reserved">return</span>;
|
||||
}
|
||||
|
||||
<span class="reserved">if</span> ((font_obj = pdfioDictGetObj(font_dict, name)) == NULL)
|
||||
<span class="reserved">return</span>;
|
||||
</code></pre>
|
||||
<p>Once we have found the font we see if it has an "Encoding" dictionary:</p>
|
||||
<pre><code class="language-c"> pdfio_dict_t *encoding_dict; <span class="comment">// Encoding dictionary</span>
|
||||
|
||||
<span class="reserved">if</span> ((encoding_obj = pdfioDictGetObj(pdfioObjGetDict(font_obj), <span class="string">"Encoding"</span>)) == NULL)
|
||||
<span class="reserved">return</span>;
|
||||
|
||||
<span class="reserved">if</span> ((encoding_dict = pdfioObjGetDict(encoding_obj)) == NULL)
|
||||
<span class="reserved">return</span>;
|
||||
</code></pre>
|
||||
<p>Once we have the encoding dictionary we can get the "BaseEncoding" and "Differences" values:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">const</span> <span class="reserved">char</span> *base_encoding; <span class="comment">// BaseEncoding name</span>
|
||||
pdfio_array_t *differences; <span class="comment">// Differences array</span>
|
||||
|
||||
<span class="comment">// OK, have the encoding object, build the encoding using it...</span>
|
||||
base_encoding = pdfioDictGetName(encoding_dict, <span class="string">"BaseEncoding"</span>);
|
||||
differences = pdfioDictGetArray(encoding_dict, <span class="string">"Differences"</span>);
|
||||
</code></pre>
|
||||
<p>If the base encoding is "MacRomanEncoding", we need to reset the upper 128 characters in the encoding array to match it:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">if</span> (base_encoding && !strcmp(base_encoding, <span class="string">"MacRomanEncoding"</span>))
|
||||
{
|
||||
<span class="comment">// Map upper 128</span>
|
||||
memcpy(encoding + <span class="number">128</span>, mac_roman, <span class="reserved">sizeof</span>(mac_roman));
|
||||
}
|
||||
</code></pre>
|
||||
<p>Then we loop through the differences array, keeping track of the current index within the encoding array. A number indicates a new index while a name is the Unicode glyph for the current index:</p>
|
||||
<pre><code class="language-c"> <span class="reserved">typedef</span> <span class="reserved">struct</span> name_map_s
|
||||
{
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *name; <span class="comment">// Character name</span>
|
||||
<span class="reserved">int</span> unicode; <span class="comment">// Unicode value</span>
|
||||
} name_map_t;
|
||||
|
||||
<span class="reserved">static</span> name_map_t unicode_map[<span class="number">1051</span>]; <span class="comment">// List of glyph names</span>
|
||||
|
||||
<span class="reserved">if</span> (differences)
|
||||
{
|
||||
<span class="comment">// Apply differences</span>
|
||||
size_t count = pdfioArrayGetSize(differences);
|
||||
<span class="comment">// Number of differences</span>
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *name; <span class="comment">// Character name</span>
|
||||
size_t idx = <span class="number">0</span>; <span class="comment">// Index in encoding array</span>
|
||||
|
||||
<span class="reserved">for</span> (i = <span class="number">0</span>; i < count; i ++)
|
||||
{
|
||||
<span class="reserved">switch</span> (pdfioArrayGetType(differences, i))
|
||||
{
|
||||
<span class="reserved">case</span> PDFIO_VALTYPE_NUMBER :
|
||||
<span class="comment">// Get the index of the next character...</span>
|
||||
idx = (size_t)pdfioArrayGetNumber(differences, i);
|
||||
<span class="reserved">break</span>;
|
||||
|
||||
<span class="reserved">case</span> PDFIO_VALTYPE_NAME :
|
||||
<span class="comment">// Lookup name and apply to encoding...</span>
|
||||
<span class="reserved">if</span> (idx > <span class="number">255</span>)
|
||||
<span class="reserved">break</span>;
|
||||
|
||||
name = pdfioArrayGetName(differences, i);
|
||||
<span class="reserved">for</span> (j = <span class="number">0</span>; j < (<span class="reserved">sizeof</span>(unicode_map) / <span class="reserved">sizeof</span>(unicode_map[<span class="number">0</span>])); j ++)
|
||||
{
|
||||
<span class="reserved">if</span> (!strcmp(name, unicode_map[j].name))
|
||||
{
|
||||
encoding[idx] = unicode_map[j].unicode;
|
||||
<span class="reserved">break</span>;
|
||||
}
|
||||
}
|
||||
idx ++;
|
||||
<span class="reserved">break</span>;
|
||||
|
||||
<span class="reserved">default</span> :
|
||||
<span class="comment">// Do nothing for other values</span>
|
||||
<span class="reserved">break</span>;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
</code></pre>
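<p>To see how a differences array changes the encoding table, the same walk can be run against a tiny in-memory array. Everything in this sketch is illustrative: <code>glyph_t</code>, <code>diff_t</code>, and <code>apply_differences</code> are hypothetical names, and the three-entry glyph table stands in for the full 1051-entry list:</p>

```c
#include <stddef.h>
#include <string.h>

// Hypothetical, heavily shortened glyph table
typedef struct glyph_s
{
  const char *name;     // Glyph name
  int        unicode;   // Unicode value
} glyph_t;

static const glyph_t glyphs[] =
{
  { "Agrave", 0x00c0 },
  { "Euro",   0x20ac },
  { "bullet", 0x2022 }
};

// Hypothetical stand-in for one entry of a "Differences" array
typedef struct diff_s
{
  const char *name;     // Glyph name, or NULL when this entry is a number
  int        number;    // New starting index when "name" is NULL
} diff_t;

// Apply the differences to a 256-entry encoding table
static void
apply_differences(const diff_t *diffs, size_t count, int encoding[256])
{
  size_t i, j;          // Looping vars
  long   idx = 0;       // Current index in the encoding table

  for (i = 0; i < count; i ++)
  {
    if (!diffs[i].name)
    {
      // A number sets the index for the names that follow
      idx = diffs[i].number;
      continue;
    }

    // Ignore names that would land outside the table
    if (idx < 0 || idx > 255)
      continue;

    // A name maps the current index to a Unicode value, if the glyph is known
    for (j = 0; j < sizeof(glyphs) / sizeof(glyphs[0]); j ++)
    {
      if (!strcmp(diffs[i].name, glyphs[j].name))
      {
        encoding[idx] = glyphs[j].unicode;
        break;
      }
    }

    idx ++;
  }
}
```

<p>With the sequence 128, "Euro", "bullet", index 128 maps to U+20AC and index 129 to U+2022; an unknown glyph name leaves its entry unchanged but still advances the index.</p>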
|
||||
<h3 class="title" id="create-a-pdf-file-with-text-and-an-image">Create a PDF File With Text and an Image</h3>
|
||||
<p>The <code>image2pdf.c</code> example code creates a PDF file containing a JPEG or PNG image file and optional caption on a single page. The <code>create_pdf_image_file</code> function creates the PDF file, embeds a base font and the named JPEG or PNG image file, and then creates a page with the image centered on the page with any text centered below:</p>
|
||||
@ -2490,7 +2749,7 @@ size_t pdfioArrayGetSize(<a href="#pdfio_array_t">pdfio_array_t</a> *a);</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Value type</p>
|
||||
<h3 class="function"><a id="pdfioArrayRemove">pdfioArrayRemove</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioArrayRemove">pdfioArrayRemove</a></h3>
|
||||
<p class="description">Remove an array entry.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioArrayRemove(<a href="#pdfio_array_t">pdfio_array_t</a> *a, size_t n);</p>
|
||||
@ -2695,7 +2954,7 @@ using the <a href="#pdfioPageDictAddImage"><code>pdfioPageDictAddImage</code></a
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on failure</p>
|
||||
<h3 class="function"><a id="pdfioContentPathEnd">pdfioContentPathEnd</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.1 </span><a id="pdfioContentPathEnd">pdfioContentPathEnd</a></h3>
|
||||
<p class="description">Clear the current path.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioContentPathEnd(<a href="#pdfio_stream_t">pdfio_stream_t</a> *st);</p>
|
||||
@ -3185,7 +3444,7 @@ are 0, a solid line is drawn.</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on failure</p>
|
||||
<h3 class="function"><a id="pdfioContentTextMeasure">pdfioContentTextMeasure</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioContentTextMeasure">pdfioContentTextMeasure</a></h3>
|
||||
<p class="description">Measure a text string and return its width.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">double</span> pdfioContentTextMeasure(<a href="#pdfio_obj_t">pdfio_obj_t</a> *font, <span class="reserved">const</span> <span class="reserved">char</span> *s, <span class="reserved">double</span> size);</p>
|
||||
@ -3203,7 +3462,9 @@ are 0, a solid line is drawn.</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function measures the given text string "s" and returns its width based
|
||||
on "size". The text string must always use the UTF-8 (Unicode) encoding but
|
||||
any control characters (such as newlines) are ignored.</p>
|
||||
any control characters (such as newlines) are ignored.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioContentTextMoveLine">pdfioContentTextMoveLine</a></h3>
|
||||
<p class="description">Move to the next line and offset.</p>
|
||||
<p class="code">
|
||||
@ -3234,7 +3495,7 @@ any control characters (such as newlines) are ignored.</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on failure</p>
|
||||
<h3 class="function"><a id="pdfioContentTextNewLine">pdfioContentTextNewLine</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioContentTextNewLine">pdfioContentTextNewLine</a></h3>
|
||||
<p class="description">Move to the next line.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioContentTextNewLine(<a href="#pdfio_stream_t">pdfio_stream_t</a> *st);</p>
|
||||
@ -3245,7 +3506,7 @@ any control characters (such as newlines) are ignored.</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on failure</p>
|
||||
<h3 class="function"><a id="pdfioContentTextNewLineShow">pdfioContentTextNewLineShow</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioContentTextNewLineShow">pdfioContentTextNewLineShow</a></h3>
|
||||
<p class="description">Move to the next line and show text.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioContentTextNewLineShow(<a href="#pdfio_stream_t">pdfio_stream_t</a> *st, <span class="reserved">double</span> ws, <span class="reserved">double</span> cs, <span class="reserved">bool</span> unicode, <span class="reserved">const</span> <span class="reserved">char</span> *s);</p>
|
||||
@ -3268,8 +3529,10 @@ any control characters (such as newlines) are ignored.</p>
|
||||
<p class="discussion">This function moves to the next line and then shows some text with optional
|
||||
word and character spacing in a PDF content stream. The "unicode" argument
|
||||
specifies that the current font maps to full Unicode. The "s" argument
|
||||
specifies a UTF-8 encoded string.</p>
|
||||
<h3 class="function"><a id="pdfioContentTextNewLineShowf">pdfioContentTextNewLineShowf</a></h3>
|
||||
specifies a UTF-8 encoded string.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioContentTextNewLineShowf">pdfioContentTextNewLineShowf</a></h3>
|
||||
<p class="description">Show formatted text.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioContentTextNewLineShowf(<a href="#pdfio_stream_t">pdfio_stream_t</a> *st, <span class="reserved">double</span> ws, <span class="reserved">double</span> cs, <span class="reserved">bool</span> unicode, <span class="reserved">const</span> <span class="reserved">char</span> *format, ...);</p>
|
||||
@ -3294,7 +3557,9 @@ specifies a UTF-8 encoded string.</p>
|
||||
<p class="discussion">This function moves to the next line and shows some formatted text with
|
||||
optional word and character spacing in a PDF content stream. The "unicode"
|
||||
argument specifies that the current font maps to full Unicode. The "format"
|
||||
argument specifies a UTF-8 encoded <code>printf</code>-style format string.</p>
|
||||
argument specifies a UTF-8 encoded <code>printf</code>-style format string.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioContentTextShow">pdfioContentTextShow</a></h3>
|
||||
<p class="description">Show text.</p>
|
||||
<p class="code">
|
||||
@ -3357,7 +3622,7 @@ argument specifies an array of UTF-8 encoded strings.</p>
|
||||
<p class="discussion">This function shows some formatted text in a PDF content stream. The
|
||||
"unicode" argument specifies that the current font maps to full Unicode.
|
||||
The "format" argument specifies a UTF-8 encoded <code>printf</code>-style format string.</p>
|
||||
<h3 class="function"><a id="pdfioDictClear">pdfioDictClear</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioDictClear">pdfioDictClear</a></h3>
|
||||
<p class="description">Remove a key/value pair from a dictionary.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">bool</span> pdfioDictClear(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span class="reserved">const</span> <span class="reserved">char</span> *key);</p>
|
||||
@ -3461,7 +3726,7 @@ time_t pdfioDictGetDate(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span cl
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Value</p>
|
||||
<h3 class="function"><a id="pdfioDictGetKey">pdfioDictGetKey</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioDictGetKey">pdfioDictGetKey</a></h3>
|
||||
<p class="description">Get the key for the specified pair.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *pdfioDictGetKey(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, size_t n);</p>
|
||||
@ -3487,7 +3752,7 @@ time_t pdfioDictGetDate(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span cl
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Value</p>
|
||||
<h3 class="function"><a id="pdfioDictGetNumPairs">pdfioDictGetNumPairs</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioDictGetNumPairs">pdfioDictGetNumPairs</a></h3>
|
||||
<p class="description">Get the number of key/value pairs in a dictionary.</p>
|
||||
<p class="code">
|
||||
size_t pdfioDictGetNumPairs(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict);</p>
|
||||
@ -3565,7 +3830,7 @@ size_t pdfioDictGetNumPairs(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict);</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Value type</p>
|
||||
<h3 class="function"><a id="pdfioDictIterateKeys">pdfioDictIterateKeys</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.1 </span><a id="pdfioDictIterateKeys">pdfioDictIterateKeys</a></h3>
|
||||
<p class="description">Iterate the keys in a dictionary.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">void</span> pdfioDictIterateKeys(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <a href="#pdfio_dict_cb_t">pdfio_dict_cb_t</a> cb, <span class="reserved">void</span> *cb_data);</p>
|
||||
@ -3592,7 +3857,9 @@ my_dict_cb(pdfio_dict_t *dict, const char *key, void *cb_data)
|
||||
</pre>
|
||||
|
||||
The iteration continues as long as the callback returns <code>true</code> or all keys
|
||||
have been iterated.</p>
|
||||
have been iterated.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioDictSetArray">pdfioDictSetArray</a></h3>
|
||||
<p class="description">Set a key array in a dictionary.</p>
|
||||
<p class="code">
|
||||
@ -3812,15 +4079,17 @@ have been iterated.</p>
|
||||
name of the PDF file to create.<br>
|
||||
<br>
|
||||
The "version" argument specifies the PDF version number for the file or
|
||||
<code>NULL</code> for the default ("2.0").<br>
|
||||
<code>NULL</code> for the default ("2.0"). The value "PCLm-1.0" can be specified to
|
||||
produce the PCLm subset of PDF.<br>
|
||||
<br>
|
||||
The "media_box" and "crop_box" arguments specify the default MediaBox and
|
||||
CropBox for pages in the PDF file - if <code>NULL</code> then a default "Universal" size
|
||||
of 8.27x11in (the intersection of US Letter and ISO A4) is used.<br>
|
||||
<br>
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if <code>NULL</code> the default error handler is used that
|
||||
writes error messages to <code>stderr</code>.</p>
|
||||
and its data pointer - if <code>NULL</code> then the default error handler is used that
|
||||
writes error messages to <code>stderr</code>. The error handler callback should return
|
||||
<code>true</code> to continue writing the PDF file or <code>false</code> to stop.</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateArrayObj">pdfioFileCreateArrayObj</a></h3>
|
||||
<p class="description">Create a new object in a PDF file containing an array.</p>
|
||||
<p class="code">
|
||||
@ -3907,8 +4176,25 @@ Unicode.</p>
|
||||
characters (potentially full Unicode, but more typically a subset)
|
||||
or to only support the Windows CP1252 (ISO-8859-1 with additional
|
||||
characters such as the Euro symbol) subset of Unicode.</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateICCObjFromData">pdfioFileCreateICCObjFromData</a></h3>
|
||||
<p class="description">Add ICC profile data to a PDF file.</p>
|
||||
<p class="code">
|
||||
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateICCObjFromData(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">const</span> <span class="reserved">unsigned</span> <span class="reserved">char</span> *data, size_t datalen, size_t num_colors);</p>
|
||||
<h4 class="parameters">Parameters</h4>
|
||||
<table class="list"><tbody>
|
||||
<tr><th>pdf</th>
|
||||
<td class="description">PDF file</td></tr>
|
||||
<tr><th>data</th>
|
||||
<td class="description">ICC profile buffer</td></tr>
|
||||
<tr><th>datalen</th>
|
||||
<td class="description">Length of ICC profile</td></tr>
|
||||
<tr><th>num_colors</th>
|
||||
<td class="description">Number of color components (1, 3, or 4)</td></tr>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Object</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateICCObjFromFile">pdfioFileCreateICCObjFromFile</a></h3>
|
||||
<p class="description">Add an ICC profile object to a PDF file.</p>
|
||||
<p class="description">Add an ICC profile file to a PDF file.</p>
|
||||
<p class="code">
|
||||
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateICCObjFromFile(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">const</span> <span class="reserved">char</span> *filename, size_t num_colors);</p>
|
||||
<h4 class="parameters">Parameters</h4>
|
||||
@ -3986,7 +4272,7 @@ image on the page.<br>
|
||||
Note: Currently PNG support is limited to grayscale, RGB, or indexed files
|
||||
without interlacing or alpha. Transparency (masking) based on color/index
|
||||
is supported.</blockquote>
|
||||
<h3 class="function"><a id="pdfioFileCreateNameObj">pdfioFileCreateNameObj</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioFileCreateNameObj">pdfioFileCreateNameObj</a></h3>
|
||||
<p class="description">Create a new object in a PDF file containing a name.</p>
|
||||
<p class="code">
|
||||
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateNameObj(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">const</span> <span class="reserved">char</span> *name);</p>
|
||||
@ -4001,8 +4287,10 @@ is supported.</blockquote>
|
||||
<p class="description">New object</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function creates a new object with a name value in a PDF file.
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateNumberObj">pdfioFileCreateNumberObj</a></h3>
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioFileCreateNumberObj">pdfioFileCreateNumberObj</a></h3>
|
||||
<p class="description">Create a new object in a PDF file containing a number.</p>
|
||||
<p class="code">
|
||||
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateNumberObj(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">double</span> number);</p>
|
||||
@ -4017,7 +4305,9 @@ You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write t
|
||||
<p class="description">New object</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function creates a new object with a number value in a PDF file.
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.</p>
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateObj">pdfioFileCreateObj</a></h3>
|
||||
<p class="description">Create a new object in a PDF file.</p>
|
||||
<p class="code">
|
||||
@ -4069,15 +4359,18 @@ output_cb(void *output_cbdata, const void *buffer, size_t bytes)
|
||||
</pre>
|
||||
|
||||
The "version" argument specifies the PDF version number for the file or
|
||||
<code>NULL</code> for the default ("2.0").<br>
|
||||
<code>NULL</code> for the default ("2.0"). Unlike <a href="#pdfioFileCreate"><code>pdfioFileCreate</code></a> and
|
||||
<a href="#pdfioFileCreateTemporary"><code>pdfioFileCreateTemporary</code></a>, it is generally not safe to pass the
|
||||
"PCLm-1.0" version string.<br>
|
||||
<br>
|
||||
The "media_box" and "crop_box" arguments specify the default MediaBox and
|
||||
CropBox for pages in the PDF file - if <code>NULL</code> then a default "Universal" size
|
||||
of 8.27x11in (the intersection of US Letter and ISO A4) is used.<br>
|
||||
<br>
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if <code>NULL</code> the default error handler is used that
|
||||
writes error messages to <code>stderr</code>.<br>
|
||||
and its data pointer - if <code>NULL</code> then the default error handler is used that
|
||||
writes error messages to <code>stderr</code>. The error handler callback should return
|
||||
<code>true</code> to continue writing the PDF file or <code>false</code> to stop.<br>
|
||||
<br>
|
||||
</p><blockquote>
|
||||
<em>Note</em>: Files created using this API are slightly larger than those
|
||||
@ -4096,7 +4389,7 @@ stored as indirect object references.</blockquote>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Contents stream</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateStringObj">pdfioFileCreateStringObj</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.2 </span><a id="pdfioFileCreateStringObj">pdfioFileCreateStringObj</a></h3>
|
||||
<p class="description">Create a new object in a PDF file containing a string.</p>
|
||||
<p class="code">
|
||||
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateStringObj(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">const</span> <span class="reserved">char</span> *string);</p>
|
||||
@ -4111,7 +4404,9 @@ stored as indirect object references.</blockquote>
|
||||
<p class="description">New object</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function creates a new object with a string value in a PDF file.
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.</p>
|
||||
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioFileCreateTemporary">pdfioFileCreateTemporary</a></h3>
|
||||
<p class="description"></p>
|
||||
<p class="code">
|
||||
@ -4137,8 +4432,19 @@ You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write t
|
||||
<p class="description">Create a temporary PDF file.</p>
|
||||
<p class="discussion">This function creates a PDF file with a unique filename in the current
|
||||
temporary directory. The temporary filename is stored in the string "buffer" and
|
||||
will have a ".pdf" extension. Otherwise, this function works the same as
|
||||
the <a href="#pdfioFileCreate"><code>pdfioFileCreate</code></a> function.
|
||||
will have a ".pdf" extension.<br>
|
||||
<br>
|
||||
The "version" argument specifies the PDF version number for the file or
|
||||
<code>NULL</code> for the default ("2.0"). The value "PCLm-1.0" can be specified to
|
||||
produce the PCLm subset of PDF.<br>
|
||||
<br>
|
||||
The "media_box" and "crop_box" arguments specify the default MediaBox and
|
||||
CropBox for pages in the PDF file - if <code>NULL</code> then a default "Universal" size
|
||||
of 8.27x11in (the intersection of US Letter and ISO A4) is used.<br>
|
||||
<br>
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if <code>NULL</code> the default error handler is used that
|
||||
writes error messages to <code>stderr</code>.
|
||||
|
||||
</p>
|
||||
<h3 class="function"><a id="pdfioFileFindObj">pdfioFileFindObj</a></h3>
|
||||
@ -4223,6 +4529,17 @@ time_t pdfioFileGetCreationDate(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf);<
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Keywords string or <code>NULL</code> for none</p>
|
||||
<h3 class="function"><a id="pdfioFileGetModificationDate">pdfioFileGetModificationDate</a></h3>
|
||||
<p class="description">Get the most recent modification date for a PDF file.</p>
|
||||
<p class="code">
|
||||
time_t pdfioFileGetModificationDate(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf);</p>
|
||||
<h4 class="parameters">Parameters</h4>
|
||||
<table class="list"><tbody>
|
||||
<tr><th>pdf</th>
|
||||
<td class="description">PDF file</td></tr>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Modification date or <code>0</code> for none</p>
|
||||
<h3 class="function"><a id="pdfioFileGetName">pdfioFileGetName</a></h3>
|
||||
<p class="description">Get a PDF's filename.</p>
|
||||
<p class="code">
|
||||
@ -4372,8 +4689,15 @@ cancel the open. If <code>NULL</code> is specified for the callback function an
|
||||
PDF file requires a password, the open will always fail.<br>
|
||||
<br>
|
||||
The "error_cb" and "error_cbdata" arguments specify an error handler callback
|
||||
and its data pointer - if <code>NULL</code> the default error handler is used that
|
||||
writes error messages to <code>stderr</code>.</p>
|
||||
and its data pointer - if <code>NULL</code> then the default error handler is used that
|
||||
writes error messages to <code>stderr</code>. The error handler callback should return
|
||||
<code>true</code> to continue reading the PDF file or <code>false</code> to stop.<br>
|
||||
<br>
|
||||
</p><blockquote>
|
||||
Note: Error messages starting with "WARNING:" are actually warning
|
||||
messages - the callback should normally return <code>true</code> to allow PDFio to
|
||||
try to resolve the issue. In addition, some errors are unrecoverable and
|
||||
ignore the return value of the error callback.</blockquote>
|
||||
<h3 class="function"><a id="pdfioFileSetAuthor">pdfioFileSetAuthor</a></h3>
|
||||
<p class="description">Set the author for a PDF file.</p>
|
||||
<p class="code">
|
||||
@ -4418,6 +4742,17 @@ writes error messages to <code>stderr</code>.</p>
|
||||
<tr><th>value</th>
|
||||
<td class="description">Value</td></tr>
|
||||
</tbody></table>
|
||||
<h3 class="function"><a id="pdfioFileSetModificationDate">pdfioFileSetModificationDate</a></h3>
|
||||
<p class="description">Set the modification date for a PDF file.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">void</span> pdfioFileSetModificationDate(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, time_t value);</p>
|
||||
<h4 class="parameters">Parameters</h4>
|
||||
<table class="list"><tbody>
|
||||
<tr><th>pdf</th>
|
||||
<td class="description">PDF file</td></tr>
|
||||
<tr><th>value</th>
|
||||
<td class="description">Value</td></tr>
|
||||
</tbody></table>
|
||||
<h3 class="function"><a id="pdfioFileSetPermissions">pdfioFileSetPermissions</a></h3>
|
||||
<p class="description">Set the PDF permissions, encryption mode, and passwords.</p>
|
||||
<p class="code">
|
||||
@ -4478,7 +4813,7 @@ size_t pdfioImageGetBytesPerLine(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</
|
||||
<td class="description">Image object</td></tr>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Number of bytes per line</p>
|
||||
<p class="description">Number of bytes per line or <code>0</code> on error</p>
|
||||
<h3 class="function"><a id="pdfioImageGetHeight">pdfioImageGetHeight</a></h3>
|
||||
<p class="description">Get the height of an image object.</p>
|
||||
<p class="code">
|
||||
@ -4583,7 +4918,7 @@ size_t pdfioObjGetLength(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description">Length in bytes or <code>0</code> for none</p>
|
||||
<h3 class="function"><a id="pdfioObjGetName">pdfioObjGetName</a></h3>
|
||||
<h3 class="function"><span class="info"> PDFio v1.4 </span><a id="pdfioObjGetName">pdfioObjGetName</a></h3>
|
||||
<p class="description">Get the name value associated with an object.</p>
|
||||
<p class="code">
|
||||
<span class="reserved">const</span> <span class="reserved">char</span> *pdfioObjGetName(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
|
||||
@ -4818,14 +5153,15 @@ size_t pdfioPageGetNumStreams(<a href="#pdfio_obj_t">pdfio_obj_t</a> *page);</p>
|
||||
<td class="description">Size of string buffer</td></tr>
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on EOF</p>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on end-of-stream or error</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function reads a single PDF token from a stream. Operator tokens,
|
||||
boolean values, and numbers are returned as-is in the provided string buffer.
|
||||
String values start with the opening parenthesis ('(') but have all escaping
|
||||
resolved and the terminating parenthesis removed. Hexadecimal string values
|
||||
start with the opening angle bracket ('<') and have all whitespace and the
|
||||
terminating angle bracket removed.</p>
|
||||
<p class="discussion">This function reads a single PDF token from a stream, skipping all whitespace
|
||||
and comments. Operator tokens, boolean values, and numbers are returned
|
||||
as-is in the provided string buffer. String values start with the opening
|
||||
parenthesis ('(') but have all escaping resolved and the terminating
|
||||
parenthesis removed. Hexadecimal string values start with the opening angle
|
||||
bracket ('<') and have all whitespace and the terminating angle bracket
|
||||
removed.</p>
|
||||
<h3 class="function"><a id="pdfioStreamPeek">pdfioStreamPeek</a></h3>
|
||||
<p class="description">Peek at data in a stream.</p>
|
||||
<p class="code">
|
||||
@ -4856,6 +5192,11 @@ ssize_t pdfioStreamPeek(<a href="#pdfio_stream_t">pdfio_stream_t</a> *st, <span
|
||||
</tbody></table>
|
||||
<h4 class="returnvalue">Return Value</h4>
|
||||
<p class="description"><code>true</code> on success, <code>false</code> on failure</p>
|
||||
<h4 class="discussion">Discussion</h4>
|
||||
<p class="discussion">This function writes a formatted string to a stream. In addition to the
|
||||
standard <code>printf</code> format characters, you can use "%H" to format an HTML/XML
|
||||
string value, "%N" to format a PDF name value ("/Name"), and "%S" to format
|
||||
a PDF string ("(String)") value.</p>
|
||||
<h3 class="function"><a id="pdfioStreamPutChar">pdfioStreamPutChar</a></h3>
|
||||
<p class="description">Write a single character to a stream.</p>
|
||||
<p class="code">
|
||||
@ -5089,7 +5430,7 @@ typedef enum <a href="#pdfio_valtype_e">pdfio_valtype_e</a> pdfio_valtype_t;
|
||||
<tr><th>PDFIO_ENCRYPTION_AES_128 </th><td class="description">128-bit AES encryption (PDF 1.6)</td></tr>
|
||||
<tr><th>PDFIO_ENCRYPTION_NONE </th><td class="description">No encryption</td></tr>
|
||||
<tr><th>PDFIO_ENCRYPTION_RC4_128 </th><td class="description">128-bit RC4 encryption (PDF 1.4)</td></tr>
|
||||
<tr><th>PDFIO_ENCRYPTION_RC4_40 </th><td class="description">40-bit RC4 encryption (PDF 1.3)</td></tr>
|
||||
<tr><th>PDFIO_ENCRYPTION_RC4_40 </th><td class="description">40-bit RC4 encryption (PDF 1.3, reading only)</td></tr>
|
||||
</tbody></table>
|
||||
<h3 class="enumeration"><a id="pdfio_filter_e">pdfio_filter_e</a></h3>
|
||||
<p class="description">Compression/decompression filters for streams</p>
|
||||
|
382
doc/pdfio.md
@ -15,7 +15,7 @@ goals of PDFio are:
|
||||
PDFio is *not* concerned with rendering or viewing a PDF file, although a PDF
|
||||
RIP or viewer could be written using it.
|
||||
|
||||
PDFio is Copyright © 2021-2024 by Michael R Sweet and is licensed under the
|
||||
PDFio is Copyright © 2021-2025 by Michael R Sweet and is licensed under the
|
||||
Apache License Version 2.0 with an (optional) exception to allow linking against
|
||||
GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
|
||||
|
||||
@ -28,7 +28,10 @@ PDFio requires the following to build the software:
|
||||
- A C99 compiler such as Clang, GCC, or MS Visual C
|
||||
- A POSIX-compliant `make` program
|
||||
- A POSIX-compliant `sh` program
|
||||
- ZLIB (<https://www.zlib.net>) 1.0 or higher
|
||||
- ZLIB (<https://www.zlib.net/>) 1.0 or higher
|
||||
|
||||
PDFio will also use libpng 1.6 or higher (<https://www.libpng.org/>) to provide
|
||||
enhanced PNG image support.
|
||||
|
||||
IDE files for Xcode (macOS/iOS) and Visual Studio (Windows) are also provided.
|
||||
|
||||
@ -340,8 +343,8 @@ password_cb(void *data, const char *filename)
|
||||
```
|
||||
|
||||
The error callback is called for both errors and warnings and accepts the
|
||||
`pdfio_file_t` pointer, a message string, and the callback pointer value, for
|
||||
example:
|
||||
`pdfio_file_t` pointer, a message string, and the callback pointer value. It
|
||||
returns `true` to continue processing the file or `false` to stop, for example:
|
||||
|
||||
```c
|
||||
bool
|
||||
@ -351,13 +354,17 @@ error_cb(pdfio_file_t *pdf, const char *message, void *data)
|
||||
|
||||
fprintf(stderr, "%s: %s\n", pdfioFileGetName(pdf), message);
|
||||
|
||||
// Return false to treat warnings as errors
|
||||
return (false);
|
||||
// Return true for warning messages (continue) and false for errors (stop)
|
||||
return (!strncmp(message, "WARNING:", 8));
|
||||
}
|
||||
```
|
||||
|
||||
The default error callback (`NULL`) does the equivalent of the above.
|
||||
|
||||
> Note: Many errors are unrecoverable; for those, PDFio ignores the return value from
|
||||
> the error callback and always stops processing the PDF file. Warning messages
|
||||
> start with the prefix "WARNING:" while errors have no prefix.
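
The prefix check used by the callback above can be isolated as a small helper. This is a minimal sketch, assuming only the "WARNING:" convention described in the note; `is_warning_message` is a hypothetical name and is not part of the PDFio API:

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): returns true when a message
// passed to an error callback starts with the "WARNING:" prefix.
static bool
is_warning_message(const char *message)
{
  return (message != NULL && strncmp(message, "WARNING:", 8) == 0);
}
```

An error callback could return the result of this helper directly to continue on warnings and stop on errors.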
|
||||
|
||||
Each PDF file contains one or more pages. The [`pdfioFileGetNumPages`](@@)
|
||||
function returns the number of pages in the file while the
|
||||
[`pdfioFileGetPage`](@@) function gets the specified page in the PDF file:
|
||||
@ -886,11 +893,26 @@ main(int argc, // I - Number of command-line arguments
|
||||
{
|
||||
const char *filename; // PDF filename
|
||||
pdfio_file_t *pdf; // PDF file
|
||||
const char *author; // Author name
|
||||
time_t creation_date; // Creation date
|
||||
struct tm *creation_tm; // Creation date/time information
|
||||
char creation_text[256]; // Creation date/time as a string
|
||||
const char *title; // Title
|
||||
pdfio_dict_t *catalog; // Catalog dictionary
|
||||
const char *author, // Author name
|
||||
*creator, // Creator name
|
||||
*producer, // Producer name
|
||||
*title; // Title
|
||||
time_t creation_date, // Creation date
|
||||
modification_date; // Modification date
|
||||
struct tm *creation_tm, // Creation date/time information
|
||||
*modification_tm; // Modification date/time information
|
||||
char creation_text[256], // Creation date/time as a string
|
||||
modification_text[256], // Modification date/time human fmt string
|
||||
range_text[255]; // Page range text
|
||||
size_t num_pages; // PDF number of pages
|
||||
bool has_acroform; // Does the file have an AcroForm?
|
||||
pdfio_obj_t *page; // Object
|
||||
pdfio_dict_t *page_dict; // Object dictionary
|
||||
size_t cur, // Current page index
|
||||
prev; // Previous page index
|
||||
pdfio_rect_t cur_box, // Current MediaBox
|
||||
prev_box; // Previous MediaBox
|
||||
|
||||
|
||||
// Get the filename from the command-line...
|
||||
@ -903,14 +925,20 @@ main(int argc, // I - Number of command-line arguments
|
||||
filename = argv[1];
|
||||
|
||||
// Open the PDF file with the default callbacks...
|
||||
pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_cbdata*/NULL,
|
||||
/*error_cb*/NULL, /*error_cbdata*/NULL);
|
||||
pdf = pdfioFileOpen(filename, /*password_cb*/NULL,
|
||||
/*password_cbdata*/NULL, /*error_cb*/NULL,
|
||||
/*error_cbdata*/NULL);
|
||||
if (pdf == NULL)
|
||||
return (1);
|
||||
|
||||
// Get the title and author...
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
// Get the title, author, etc...
|
||||
catalog = pdfioFileGetCatalog(pdf);
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
creator = pdfioFileGetCreator(pdf);
|
||||
has_acroform = pdfioDictGetType(catalog, "AcroForm") != PDFIO_VALTYPE_NONE;
|
||||
num_pages = pdfioFileGetNumPages(pdf);
|
||||
producer = pdfioFileGetProducer(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
|
||||
// Get the creation date and convert to a string...
|
||||
if ((creation_date = pdfioFileGetCreationDate(pdf)) > 0)
|
||||
@ -923,12 +951,76 @@ main(int argc, // I - Number of command-line arguments
|
||||
snprintf(creation_text, sizeof(creation_text), "-- not set --");
|
||||
}
|
||||
|
||||
// Get the modification date and convert to a string...
|
||||
if ((modification_date = pdfioFileGetModificationDate(pdf)) > 0)
|
||||
{
|
||||
modification_tm = localtime(&modification_date);
|
||||
strftime(modification_text, sizeof(modification_text), "%c", modification_tm);
|
||||
}
|
||||
else
|
||||
{
|
||||
snprintf(modification_text, sizeof(modification_text), "-- not set --");
|
||||
}
|
||||
|
||||
// Print file information to stdout...
|
||||
printf("%s:\n", filename);
|
||||
printf(" Title: %s\n", title ? title : "-- not set --");
|
||||
printf(" Author: %s\n", author ? author : "-- not set --");
|
||||
printf(" Created On: %s\n", creation_text);
|
||||
printf(" Number Pages: %u\n", (unsigned)pdfioFileGetNumPages(pdf));
|
||||
printf(" Title: %s\n", title ? title : "-- not set --");
|
||||
printf(" Author: %s\n", author ? author : "-- not set --");
|
||||
printf(" Creator: %s\n", creator ? creator : "-- not set --");
|
||||
printf(" Producer: %s\n", producer ? producer : "-- not set --");
|
||||
printf(" Created On: %s\n", creation_text);
|
||||
printf(" Modified On: %s\n", modification_text);
|
||||
printf(" Version: %s\n", pdfioFileGetVersion(pdf));
|
||||
printf(" AcroForm: %s\n", has_acroform ? "Yes" : "No");
|
||||
printf(" Number of Pages: %u\n", (unsigned)num_pages);
|
||||
|
||||
// Report the MediaBox for all of the pages
|
||||
prev_box.x1 = prev_box.x2 = prev_box.y1 = prev_box.y2 = 0.0;
|
||||
|
||||
for (cur = 0, prev = 0; cur < num_pages; cur ++)
|
||||
{
|
||||
// Find the MediaBox for this page in the page tree...
|
||||
for (page = pdfioFileGetPage(pdf, cur);
|
||||
page != NULL;
|
||||
page = pdfioDictGetObj(page_dict, "Parent"))
|
||||
{
|
||||
cur_box.x1 = cur_box.x2 = cur_box.y1 = cur_box.y2 = 0.0;
|
||||
page_dict = pdfioObjGetDict(page);
|
||||
|
||||
if (pdfioDictGetRect(page_dict, "MediaBox", &cur_box))
|
||||
break;
|
||||
}
|
||||
|
||||
// If this MediaBox is different from the previous one, show the range of
|
||||
// pages that have that size...
|
||||
if (cur == 0 ||
|
||||
fabs(cur_box.x1 - prev_box.x1) > 0.01 ||
|
||||
fabs(cur_box.y1 - prev_box.y1) > 0.01 ||
|
||||
fabs(cur_box.x2 - prev_box.x2) > 0.01 ||
|
||||
fabs(cur_box.y2 - prev_box.y2) > 0.01)
|
||||
{
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Start a new series of pages with the new size...
|
||||
prev = cur;
|
||||
prev_box = cur_box;
|
||||
}
|
||||
}
|
||||
|
||||
// Show the last range as needed...
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Close the PDF file...
|
||||
pdfioFileClose(pdf);
|
||||
@ -941,37 +1033,98 @@ main(int argc, // I - Number of command-line arguments
|
||||
Extract Text from PDF File
|
||||
--------------------------
|
||||
|
||||
The `pdf2text.c` example code extracts non-Unicode text from a PDF file by
|
||||
scanning each page for strings and text drawing commands. Since it doesn't
|
||||
look at the font encoding or support Unicode text, it is really only useful to
|
||||
extract plain ASCII text from a PDF file. And since it writes text in the order
|
||||
it appears in the page stream, it may not come out in the same order as appears
|
||||
on the page.
|
||||
The `pdf2text.c` example code extracts text from a PDF file and writes it to the
|
||||
standard output. Unlike some other PDF tools, it outputs the text in the order
|
||||
it is seen in each page stream so the output might appear "jumbled" if the PDF
|
||||
producer doesn't output text in reading order. The code is able to handle
|
||||
different font encodings and produces UTF-8 output.
|
||||
|
||||
The [`pdfioStreamGetToken`](@@) function is used to read individual tokens from
|
||||
the page streams. Tokens starting with the open parenthesis are text strings,
|
||||
while PDF operators are left as-is. We use some simple logic to make sure that
|
||||
we include spaces between text strings and add newlines for the text operators
|
||||
that start a new line in a text block:
|
||||
the page streams:
|
||||
|
||||
```c
|
||||
pdfio_stream_t *st; // Page stream
|
||||
char buffer[1024], // Token buffer
|
||||
*bufptr, // Pointer into buffer
|
||||
name[256]; // Current (font) name
|
||||
bool first = true; // First string on line?
|
||||
char buffer[1024]; // Token buffer
|
||||
int encoding[256]; // Font encoding to Unicode
|
||||
bool in_array = false; // Are we in an array?
|
||||
|
||||
// Read PDF tokens from the page stream...
|
||||
while (pdfioStreamGetToken(st, buffer, sizeof(buffer)))
|
||||
{
|
||||
if (buffer[0] == '(')
|
||||
```
|
||||
|
||||
Justified text can be found inside arrays ("[ ... ]"), so we look for the array
|
||||
delimiter tokens and any (spacing) numbers inside an array. Experimentation has
|
||||
shown that numbers whose absolute value is greater than 100 can be treated as whitespace:
|
||||
|
||||
```c
|
||||
if (!strcmp(buffer, "["))
|
||||
{
|
||||
// Start of an array for justified text...
|
||||
in_array = true;
|
||||
}
|
||||
else if (!strcmp(buffer, "]"))
|
||||
{
|
||||
// End of an array for justified text...
|
||||
in_array = false;
|
||||
}
|
||||
else if (!first && in_array && (isdigit((unsigned char)buffer[0]) || buffer[0] == '-') && fabs(atof(buffer)) > 100)
|
||||
{
|
||||
// Whitespace in a justified text block...
|
||||
putchar(' ');
|
||||
}
|
||||
```
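
The spacing test in the last branch can be read on its own. The following is a minimal sketch, assuming the token has already been read into a NUL-terminated string; `is_spacing_number` and its `threshold` parameter are hypothetical names introduced for illustration:

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical helper: true when "token" starts like a number (digit or
// minus sign) and its absolute value is greater than "threshold".
static bool
is_spacing_number(const char *token, double threshold)
{
  double value;                         // Parsed number

  if (!isdigit((unsigned char)token[0]) && token[0] != '-')
    return (false);

  value = atof(token);
  if (value < 0.0)
    value = -value;

  return (value > threshold);
}
```

The example code additionally requires that the token is inside an array and is not the first thing on the line before it emits a space.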
|
||||
|
||||
Tokens starting with '(' or '<' are text fragments. 8-bit text starting with
|
||||
'(' needs to be mapped to Unicode using the current font encoding while hex
|
||||
strings starting with '<' are UTF-16 (Unicode) that need to be converted to
|
||||
UTF-8:
|
||||
|
||||
```c
|
||||
else if (buffer[0] == '(')
|
||||
{
|
||||
// Text string using an 8-bit encoding
|
||||
if (first)
|
||||
first = false;
|
||||
else if (buffer[1] != ' ')
|
||||
putchar(' ');
|
||||
first = false;
|
||||
|
||||
fputs(buffer + 1, stdout);
|
||||
for (bufptr = buffer + 1; *bufptr; bufptr ++)
|
||||
put_utf8(encoding[*bufptr & 255]);
|
||||
}
|
||||
else if (buffer[0] == '<')
|
||||
{
|
||||
// Unicode text string
|
||||
first = false;
|
||||
|
||||
puts_utf16(buffer + 1);
|
||||
}
|
||||
```
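
The `put_utf8` helper called above is not shown in this excerpt. As an illustration of the conversion it has to perform, here is a minimal sketch of a UTF-8 encoder that writes into a buffer instead of the standard output; `utf8_encode` is a hypothetical name and this is not the code from `pdf2text.c`:

```c
#include <stddef.h>

// Hypothetical helper: encode one Unicode code point as UTF-8 into "out"
// and return the number of bytes written, or 0 for an invalid code point.
static size_t
utf8_encode(int ch, unsigned char out[4])
{
  if (ch < 0 || ch > 0x10ffff || (ch >= 0xd800 && ch <= 0xdfff))
    return (0);                         // Out of range or a surrogate

  if (ch < 0x80)
  {
    // 1 byte: 0xxxxxxx
    out[0] = (unsigned char)ch;
    return (1);
  }
  else if (ch < 0x800)
  {
    // 2 bytes: 110xxxxx 10xxxxxx
    out[0] = (unsigned char)(0xc0 | (ch >> 6));
    out[1] = (unsigned char)(0x80 | (ch & 0x3f));
    return (2);
  }
  else if (ch < 0x10000)
  {
    // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
    out[0] = (unsigned char)(0xe0 | (ch >> 12));
    out[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
    out[2] = (unsigned char)(0x80 | (ch & 0x3f));
    return (3);
  }

  // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
  out[0] = (unsigned char)(0xf0 | (ch >> 18));
  out[1] = (unsigned char)(0x80 | ((ch >> 12) & 0x3f));
  out[2] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
  out[3] = (unsigned char)(0x80 | (ch & 0x3f));
  return (4);
}
```

Returning a byte count keeps the sketch independent of where the output goes; a caller that prints could pass the bytes to `fwrite`.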
|
||||
|
||||
Simple (8-bit) fonts include an encoding table that maps the 8-bit characters to
|
||||
one of 1051 Unicode glyph names. Since each font can use a different encoding,
|
||||
we look for font names starting with '/' and the "Tf" (set text font) operator
|
||||
token and load that font's encoding using the
|
||||
[load_encoding](#the-loadencoding-function) function:
|
||||
|
||||
```c
|
||||
else if (buffer[0] == '/')
|
||||
{
|
||||
// Save name...
|
||||
strncpy(name, buffer + 1, sizeof(name) - 1);
|
||||
name[sizeof(name) - 1] = '\0';
|
||||
}
|
||||
else if (!strcmp(buffer, "Tf") && name[0])
|
||||
{
|
||||
// Set font...
|
||||
load_encoding(obj, name, encoding);
|
||||
}
|
||||
```
|
||||
|
||||
Finally, some text operators start a new line in a text block, so when we see
|
||||
their tokens we output a newline:
|
||||
|
||||
```c
|
||||
else if (!strcmp(buffer, "Td") || !strcmp(buffer, "TD") || !strcmp(buffer, "T*") ||
|
||||
!strcmp(buffer, "\'") || !strcmp(buffer, "\""))
|
||||
{
|
||||
@ -980,9 +1133,160 @@ while (pdfioStreamGetToken(st, buffer, sizeof(buffer)))
|
||||
first = true;
|
||||
}
|
||||
}
|
||||
```
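
The chain of `strcmp` calls above can also be written as a table lookup. This is a minimal sketch using the same five operator tokens; `is_newline_operator` is a hypothetical helper name, not something taken from `pdf2text.c`:

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: true when "token" is one of the text operators that
// the example treats as starting a new line (Td, TD, T*, ', and ").
static bool
is_newline_operator(const char *token)
{
  static const char * const ops[] = { "Td", "TD", "T*", "'", "\"" };
  size_t i;                             // Looping var

  for (i = 0; i < (sizeof(ops) / sizeof(ops[0])); i ++)
  {
    if (!strcmp(token, ops[i]))
      return (true);
  }

  return (false);
}
```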
|
||||
|
||||
if (!first)
|
||||
putchar('\n');
|
||||
|
||||
### The `load_encoding` Function
|
||||
|
||||
The `load_encoding` function looks up the named font in the page's "Resources"
|
||||
dictionary. A PDF simple font can contain an "Encoding" dictionary with a base
|
||||
encoding ("WinANSI", "MacRoman", or "MacExpert") and a differences array that
|
||||
lists character indexes and glyph names for an 8-bit font.
|
||||
|
||||
We start by initializing the encoding array to the default WinANSI encoding and
|
||||
looking up the font object for the named font:
|
||||
|
||||
```c
|
||||
static void
|
||||
load_encoding(
|
||||
pdfio_obj_t *page_obj, // I - Page object
|
||||
const char *name, // I - Font name
|
||||
int encoding[256]) // O - Encoding table
|
||||
{
|
||||
size_t i, j; // Looping vars
|
||||
pdfio_dict_t *page_dict, // Page dictionary
|
||||
*resources_dict, // Resources dictionary
|
||||
*font_dict; // Font dictionary
|
||||
pdfio_obj_t *font_obj, // Font object
|
||||
*encoding_obj; // Encoding object
|
||||
static int win_ansi[32] = // WinANSI characters from 128 to 159
|
||||
{
|
||||
...
|
||||
};
|
||||
static int mac_roman[128] = // MacRoman characters from 128 to 255
|
||||
{
|
||||
...
|
||||
};
|
||||
|
||||
|
||||
// Initialize the encoding to be the "standard" WinAnsi...
|
||||
for (i = 0; i < 128; i ++)
|
||||
encoding[i] = i;
|
||||
for (i = 160; i < 256; i ++)
|
||||
encoding[i] = i;
|
||||
memcpy(encoding + 128, win_ansi, sizeof(win_ansi));
|
||||
|
||||
// Find the named font...
|
||||
if ((page_dict = pdfioObjGetDict(page_obj)) == NULL)
|
||||
return;
|
||||
|
||||
if ((resources_dict = pdfioDictGetDict(page_dict, "Resources")) == NULL)
|
||||
return;
|
||||
|
||||
if ((font_dict = pdfioDictGetDict(resources_dict, "Font")) == NULL)
|
||||
{
|
||||
// Font resources not a dictionary, see if it is an object...
|
||||
if ((font_obj = pdfioDictGetObj(resources_dict, "Font")) != NULL)
|
||||
font_dict = pdfioObjGetDict(font_obj);
|
||||
|
||||
if (!font_dict)
|
||||
return;
|
||||
}
|
||||
|
||||
if ((font_obj = pdfioDictGetObj(font_dict, name)) == NULL)
|
||||
return;
|
||||
```
|
||||
|
||||
Once we have found the font we see if it has an "Encoding" dictionary:
|
||||
|
||||
```c
|
||||
pdfio_dict_t *encoding_dict; // Encoding dictionary
|
||||
|
||||
if ((encoding_obj = pdfioDictGetObj(pdfioObjGetDict(font_obj), "Encoding")) == NULL)
|
||||
return;
|
||||
|
||||
if ((encoding_dict = pdfioObjGetDict(encoding_obj)) == NULL)
|
||||
return;
|
||||
```
|
||||
|
||||
Once we have the encoding dictionary we can get the "BaseEncoding" and
|
||||
"Differences" values:
|
||||
|
||||
```c
|
||||
const char *base_encoding; // BaseEncoding name
|
||||
pdfio_array_t *differences; // Differences array
|
||||
|
||||
// OK, have the encoding object, build the encoding using it...
|
||||
base_encoding = pdfioDictGetName(encoding_dict, "BaseEncoding");
|
||||
differences = pdfioDictGetArray(encoding_dict, "Differences");
|
||||
```
|
||||
|
||||
If the base encoding is "MacRomanEncoding", we need to reset the upper 128
|
||||
characters in the encoding array to match it:
|
||||
|
||||
```c
|
||||
if (base_encoding && !strcmp(base_encoding, "MacRomanEncoding"))
|
||||
{
|
||||
// Map upper 128
|
||||
memcpy(encoding + 128, mac_roman, sizeof(mac_roman));
|
||||
}
|
||||
|
||||
```
|
||||
|
||||
Then we loop through the differences array, keeping track of the current index
|
||||
within the encoding array. A number indicates a new index while a name is the
|
||||
Unicode glyph for the current index:
|
||||
|
||||
```c
|
||||
typedef struct name_map_s
|
||||
{
|
||||
const char *name; // Character name
|
||||
int unicode; // Unicode value
|
||||
} name_map_t;
|
||||
|
||||
static name_map_t unicode_map[1051]; // List of glyph names
|
||||
|
||||
if (differences)
|
||||
{
|
||||
// Apply differences
|
||||
size_t count = pdfioArrayGetSize(differences);
|
||||
// Number of differences
|
||||
const char *name; // Character name
|
||||
size_t idx = 0; // Index in encoding array
|
||||
|
||||
for (i = 0; i < count; i ++)
|
||||
{
|
||||
switch (pdfioArrayGetType(differences, i))
|
||||
{
|
||||
case PDFIO_VALTYPE_NUMBER :
|
||||
// Get the index of the next character...
|
||||
idx = (size_t)pdfioArrayGetNumber(differences, i);
|
||||
break;
|
||||
|
||||
case PDFIO_VALTYPE_NAME :
|
||||
// Lookup name and apply to encoding...
|
||||
if (idx > 255)
|
||||
break;
|
||||
|
||||
name = pdfioArrayGetName(differences, i);
|
||||
for (j = 0; j < (sizeof(unicode_map) / sizeof(unicode_map[0])); j ++)
|
||||
{
|
||||
if (!strcmp(name, unicode_map[j].name))
|
||||
{
|
||||
encoding[idx] = unicode_map[j].unicode;
|
||||
break;
|
||||
}
|
||||
}
|
||||
idx ++;
|
||||
break;
|
||||
|
||||
default :
|
||||
// Do nothing for other values
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
}
|
||||
```
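
The index tracking in the loop above can also be tried out without a PDF file. The sketch below models a "Differences" array such as `[128 /Euro /bullet 200 /nosuchglyph /quotedblleft]` with a plain C array. The `diff_entry_t` type, the `apply_differences` function, and the three-entry glyph table are hypothetical and are not part of PDFio:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical model of one "Differences" entry: a number or a name.
typedef struct diff_entry_s
{
  const char *name;                     // Glyph name, NULL for a number
  int        number;                    // New index when name is NULL
} diff_entry_t;

typedef struct glyph_s
{
  const char *name;                     // Glyph name
  int        unicode;                   // Unicode value
} glyph_t;

// Tiny stand-in for the full glyph name table...
static const glyph_t glyphs[] =
{
  { "Euro",         0x20ac },
  { "bullet",       0x2022 },
  { "quotedblleft", 0x201c }
};

static void
apply_differences(int                encoding[256], // IO - Encoding array
                  const diff_entry_t *diffs,        // I - Differences
                  size_t             count)         // I - Number of entries
{
  size_t i, j;                          // Looping vars
  size_t idx = 0;                       // Index in encoding array

  assert(encoding != NULL && (diffs != NULL || count == 0));

  for (i = 0; i < count; i ++)
  {
    if (!diffs[i].name)
    {
      // A number starts a new run; negative values are pushed out of range
      idx = diffs[i].number < 0 ? 256 : (size_t)diffs[i].number;
      continue;
    }

    if (idx > 255)
      continue;                         // Ignore names outside the array

    // Look up the name; unknown names leave the entry unchanged
    for (j = 0; j < (sizeof(glyphs) / sizeof(glyphs[0])); j ++)
    {
      if (!strcmp(diffs[i].name, glyphs[j].name))
      {
        encoding[idx] = glyphs[j].unicode;
        break;
      }
    }

    idx ++;                             // Each name advances the index
  }
}
```

Note that an unknown glyph name still consumes an index, so the name that follows it lands on the next code point.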
|
||||
|
||||
|
||||
|
@ -14,8 +14,8 @@
|
||||
# Common options
|
||||
CFLAGS = -g $(CPPFLAGS)
|
||||
#CFLAGS = -g -fsanitize=address $(CPPFLAGS)
|
||||
CPPFLAGS = -I.. -I/usr/local/include
|
||||
LIBS = -L.. -L/usr/local/lib -lpdfio -lz -lm
|
||||
CPPFLAGS = -I.. $(shell PKG_CONFIG_PATH="..:$(PKG_CONFIG_PATH)" pkg-config pdfio --cflags)
|
||||
LIBS = -L.. $(shell PKG_CONFIG_PATH="..:$(PKG_CONFIG_PATH)" pkg-config pdfio --libs)
|
||||
|
||||
|
||||
# Targets
|
||||
@ -24,7 +24,8 @@ TARGETS = \
|
||||
image2pdf \
|
||||
md2pdf \
|
||||
pdf2text \
|
||||
pdfioinfo
|
||||
pdfioinfo \
|
||||
pdfiomerge
|
||||
|
||||
|
||||
# Make everything
|
||||
@ -61,5 +62,10 @@ pdfioinfo: pdfioinfo.c
|
||||
$(CC) $(CFLAGS) -o $@ pdfioinfo.c $(LIBS)
|
||||
|
||||
|
||||
# pdfiomerge
|
||||
pdfiomerge: pdfiomerge.c
|
||||
$(CC) $(CFLAGS) -o $@ pdfiomerge.c $(LIBS)
|
||||
|
||||
|
||||
# Common dependencies...
|
||||
$(TARGETS): Makefile ../pdfio.h ../pdfio-content.h
|
||||
|
@ -1,7 +1,7 @@
|
||||
//
|
||||
// Image example for PDFio.
|
||||
//
|
||||
// Copyright © 2023-2024 by Michael R Sweet.
|
||||
// Copyright © 2023-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -22,8 +22,8 @@
|
||||
|
||||
bool // O - True on success, false on failure
|
||||
create_pdf_image_file(
|
||||
const char *pdfname, // I - PDF filename
|
||||
const char *imagename, // I - Image filename
|
||||
const char *pdfname, // I - PDF filename
|
||||
const char *caption) // I - Caption text, or NULL to use the image filename
|
||||
{
|
||||
pdfio_file_t *pdf; // PDF file
|
||||
@ -36,6 +36,15 @@ create_pdf_image_file(
|
||||
double tx, ty; // Position on page
|
||||
|
||||
|
||||
// Default the caption...
|
||||
if (!caption)
|
||||
{
|
||||
if ((caption = strrchr(imagename, '/')) != NULL)
|
||||
caption ++;
|
||||
else
|
||||
caption = imagename;
|
||||
}
|
||||
|
||||
// Create the PDF file...
|
||||
pdf = pdfioFileCreate(pdfname, /*version*/NULL, /*media_box*/NULL,
|
||||
/*crop_box*/NULL, /*error_cb*/NULL,
|
||||
|
1346
examples/pdf2text.c
@ -13,6 +13,7 @@
|
||||
|
||||
#include <pdfio.h>
|
||||
#include <time.h>
|
||||
#include <math.h>
|
||||
|
||||
|
||||
//
|
||||
@ -25,11 +26,26 @@ main(int argc, // I - Number of command-line arguments
|
||||
{
|
||||
const char *filename; // PDF filename
|
||||
pdfio_file_t *pdf; // PDF file
|
||||
const char *author; // Author name
|
||||
time_t creation_date; // Creation date
|
||||
struct tm *creation_tm; // Creation date/time information
|
||||
char creation_text[256]; // Creation date/time as a string
|
||||
const char *title; // Title
|
||||
pdfio_dict_t *catalog; // Catalog dictionary
|
||||
const char *author, // Author name
|
||||
*creator, // Creator name
|
||||
*producer, // Producer name
|
||||
*title; // Title
|
||||
time_t creation_date, // Creation date
|
||||
modification_date; // Modification date
|
||||
struct tm *creation_tm, // Creation date/time information
|
||||
*modification_tm; // Modification date/time information
|
||||
char creation_text[256], // Creation date/time as a string
|
||||
modification_text[256], // Modification date/time as a string
|
||||
range_text[255]; // Page range text
|
||||
size_t num_pages; // PDF number of pages
|
||||
bool has_acroform; // Does the file have an AcroForm?
|
||||
pdfio_obj_t *page; // Object
|
||||
pdfio_dict_t *page_dict; // Object dictionary
|
||||
size_t cur, // Current page index
|
||||
prev; // Previous page index
|
||||
pdfio_rect_t cur_box, // Current MediaBox
|
||||
prev_box; // Previous MediaBox
|
||||
|
||||
|
||||
// Get the filename from the command-line...
|
||||
@ -48,9 +64,14 @@ main(int argc, // I - Number of command-line arguments
|
||||
if (pdf == NULL)
|
||||
return (1);
|
||||
|
||||
// Get the title and author...
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
// Get the title, author, etc...
|
||||
catalog = pdfioFileGetCatalog(pdf);
|
||||
author = pdfioFileGetAuthor(pdf);
|
||||
creator = pdfioFileGetCreator(pdf);
|
||||
has_acroform = pdfioDictGetType(catalog, "AcroForm") != PDFIO_VALTYPE_NONE;
|
||||
num_pages = pdfioFileGetNumPages(pdf);
|
||||
producer = pdfioFileGetProducer(pdf);
|
||||
title = pdfioFileGetTitle(pdf);
|
||||
|
||||
// Get the creation date and convert to a string...
|
||||
if ((creation_date = pdfioFileGetCreationDate(pdf)) > 0)
|
||||
@ -63,12 +84,76 @@ main(int argc, // I - Number of command-line arguments
|
||||
snprintf(creation_text, sizeof(creation_text), "-- not set --");
|
||||
}
|
||||
|
||||
// Get the modification date and convert to a string...
|
||||
if ((modification_date = pdfioFileGetModificationDate(pdf)) > 0)
|
||||
{
|
||||
modification_tm = localtime(&modification_date);
|
||||
strftime(modification_text, sizeof(modification_text), "%c", modification_tm);
|
||||
}
|
||||
else
|
||||
{
|
||||
snprintf(modification_text, sizeof(modification_text), "-- not set --");
|
||||
}
|
||||
|
||||
// Print file information to stdout...
|
||||
printf("%s:\n", filename);
|
||||
printf(" Title: %s\n", title ? title : "-- not set --");
|
||||
printf(" Author: %s\n", author ? author : "-- not set --");
|
||||
printf(" Created On: %s\n", creation_text);
|
||||
printf(" Number Pages: %u\n", (unsigned)pdfioFileGetNumPages(pdf));
|
||||
printf(" Title: %s\n", title ? title : "-- not set --");
|
||||
printf(" Author: %s\n", author ? author : "-- not set --");
|
||||
printf(" Creator: %s\n", creator ? creator : "-- not set --");
|
||||
printf(" Producer: %s\n", producer ? producer : "-- not set --");
|
||||
printf(" Created On: %s\n", creation_text);
|
||||
printf(" Modified On: %s\n", modification_text);
|
||||
printf(" Version: %s\n", pdfioFileGetVersion(pdf));
|
||||
printf(" AcroForm: %s\n", has_acroform ? "Yes" : "No");
|
||||
printf(" Number of Pages: %u\n", (unsigned)num_pages);
|
||||
|
||||
// Report the MediaBox for all of the pages
|
||||
prev_box.x1 = prev_box.x2 = prev_box.y1 = prev_box.y2 = 0.0;
|
||||
|
||||
for (cur = 0, prev = 0; cur < num_pages; cur ++)
|
||||
{
|
||||
// Find the MediaBox for this page in the page tree...
|
||||
for (page = pdfioFileGetPage(pdf, cur);
|
||||
page != NULL;
|
||||
page = pdfioDictGetObj(page_dict, "Parent"))
|
||||
{
|
||||
cur_box.x1 = cur_box.x2 = cur_box.y1 = cur_box.y2 = 0.0;
|
||||
page_dict = pdfioObjGetDict(page);
|
||||
|
||||
if (pdfioDictGetRect(page_dict, "MediaBox", &cur_box))
|
||||
break;
|
||||
}
|
||||
|
||||
// If this MediaBox is different from the previous one, show the range of
|
||||
// pages that have that size...
|
||||
if (cur == 0 ||
|
||||
fabs(cur_box.x1 - prev_box.x1) > 0.01 ||
|
||||
fabs(cur_box.y1 - prev_box.y1) > 0.01 ||
|
||||
fabs(cur_box.x2 - prev_box.x2) > 0.01 ||
|
||||
fabs(cur_box.y2 - prev_box.y2) > 0.01)
|
||||
{
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Start a new series of pages with the new size...
|
||||
prev = cur;
|
||||
prev_box = cur_box;
|
||||
}
|
||||
}
|
||||
|
||||
// Show the last range as needed...
|
||||
if (cur > prev)
|
||||
{
|
||||
snprintf(range_text, sizeof(range_text), "Pages %u-%u",
|
||||
(unsigned)(prev + 1), (unsigned)cur);
|
||||
printf("%16s: [%g %g %g %g]\n", range_text,
|
||||
prev_box.x1, prev_box.y1, prev_box.x2, prev_box.y2);
|
||||
}
|
||||
|
||||
// Close the PDF file...
|
||||
pdfioFileClose(pdf);
|
||||
|
146
examples/pdfiomerge.c
Normal file
@ -0,0 +1,146 @@
|
||||
//
|
||||
// PDF merge program for PDFio.
|
||||
//
|
||||
// Copyright © 2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
//
|
||||
// Usage:
|
||||
//
|
||||
// ./pdfiomerge [-o OUTPUT.pdf] INPUT.pdf [... INPUT.pdf]
|
||||
// ./pdfiomerge INPUT.pdf [... INPUT.pdf] >OUTPUT.pdf
|
||||
//
|
||||
|
||||
#include <pdfio.h>
|
||||
#include <string.h>
|
||||
|
||||
|
||||
//
|
||||
// Local functions...
|
||||
//
|
||||
|
||||
static ssize_t output_cb(void *output_cbdata, const void *buffer, size_t bytes);
|
||||
static int usage(FILE *out);
|
||||
|
||||
|
||||
//
|
||||
// 'main()' - Main entry.
|
||||
//
|
||||
|
||||
int // O - Exit status
|
||||
main(int argc, // I - Number of command-line arguments
|
||||
char *argv[]) // I - Command-line arguments
|
||||
{
|
||||
int i; // Looping var
|
||||
const char *opt; // Current option
|
||||
pdfio_file_t *inpdf, // Input PDF file
|
||||
*outpdf = NULL; // Output PDF file
|
||||
|
||||
|
||||
// Parse command-line...
|
||||
for (i = 1; i < argc; i ++)
|
||||
{
|
||||
if (!strcmp(argv[i], "--help"))
|
||||
{
|
||||
return (usage(stdout));
|
||||
}
|
||||
else if (!strncmp(argv[i], "--", 2))
|
||||
{
|
||||
fprintf(stderr, "pdfiomerge: Unknown option '%s'.\n", argv[i]);
|
||||
return (usage(stderr));
|
||||
}
|
||||
else if (argv[i][0] == '-')
|
||||
{
|
||||
for (opt = argv[i] + 1; *opt; opt ++)
|
||||
{
|
||||
switch (*opt)
|
||||
{
|
||||
case 'o' : // -o OUTPUT.pdf
|
||||
if (outpdf)
|
||||
{
|
||||
fputs("pdfiomerge: Only one output file can be specified.\n", stderr);
|
||||
return (usage(stderr));
|
||||
}
|
||||
|
||||
i ++;
|
||||
if (i >= argc)
|
||||
{
|
||||
fputs("pdfiomerge: Missing output filename after '-o'.\n", stderr);
|
||||
return (usage(stderr));
|
||||
}
|
||||
|
||||
if ((outpdf = pdfioFileCreate(argv[i], /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_data*/NULL)) == NULL)
|
||||
return (1);
|
||||
break;
|
||||
|
||||
default :
|
||||
fprintf(stderr, "pdfiomerge: Unknown option '-%c'.\n", *opt);
|
||||
return (usage(stderr));
|
||||
}
|
||||
}
|
||||
}
|
||||
else if ((inpdf = pdfioFileOpen(argv[i], /*password_cb*/NULL, /*password_data*/NULL, /*error_cb*/NULL, /*error_data*/NULL)) == NULL)
|
||||
{
|
||||
return (1);
|
||||
}
|
||||
else
|
||||
{
|
||||
// Copy PDF file...
|
||||
size_t p, // Current page
|
||||
nump; // Number of pages
|
||||
|
||||
if (!outpdf)
|
||||
{
|
||||
if ((outpdf = pdfioFileCreateOutput(output_cb, /*output_cbdata*/NULL, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_data*/NULL)) == NULL)
|
||||
return (1);
|
||||
}
|
||||
|
||||
for (p = 0, nump = pdfioFileGetNumPages(inpdf); p < nump; p ++)
|
||||
{
|
||||
if (!pdfioPageCopy(outpdf, pdfioFileGetPage(inpdf, p)))
|
||||
return (1);
|
||||
}
|
||||
|
||||
pdfioFileClose(inpdf);
|
||||
}
|
||||
}
|
||||
|
||||
if (!outpdf)
|
||||
return (usage(stderr));
|
||||
|
||||
pdfioFileClose(outpdf);
|
||||
|
||||
return (0);
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'output_cb()' - Write PDF data to the standard output...
|
||||
//
|
||||
|
||||
static ssize_t // O - Number of bytes written
|
||||
output_cb(void *output_cbdata, // I - Callback data (not used)
|
||||
const void *buffer, // I - Buffer to write
|
||||
size_t bytes) // I - Number of bytes to write
|
||||
{
|
||||
(void)output_cbdata;
|
||||
|
||||
return ((ssize_t)fwrite(buffer, 1, bytes, stdout));
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'usage()' - Show program usage.
|
||||
//
|
||||
|
||||
static int // O - Exit status
|
||||
usage(FILE *out) // I - stdout or stderr
|
||||
{
|
||||
fputs("Usage: pdfiomerge [OPTIONS] INPUT.pdf [... INPUT.pdf] >OUTPUT.pdf\n", out);
|
||||
fputs("Options:\n", out);
|
||||
fputs(" --help Show help.\n", out);
|
||||
fputs(" -o OUTPUT.pdf Send output to filename instead of stdout.\n", out);
|
||||
|
||||
return (out == stdout ? 0 : 1);
|
||||
}
|
36
makesrcdist
@ -21,40 +21,60 @@ if test $# != 1; then
|
||||
exit 1
|
||||
fi
|
||||
|
||||
status=0
|
||||
version=$1
|
||||
version_major=$(echo $1 | awk -F. '{print $1}')
|
||||
version_minor=$(echo $1 | awk -F. '{print $2}')
|
||||
|
||||
# Check that version number has been updated everywhere...
|
||||
if test $(grep AC_INIT configure.ac | awk '{print $2}') != "[$version],"; then
|
||||
echo "Still need to update AC_INIT version in 'configure.ac'."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(head -4 CHANGES.md | tail -1 | awk '{print $1}') != "v$version"; then
|
||||
if test $(head -5 CHANGES.md | tail -1 | awk '{print $1}') != "v$version"; then
|
||||
echo "Still need to update CHANGES.md version number."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
if test $(head -4 CHANGES.md | tail -1 | awk '{print $3}') = "YYYY-MM-DD"; then
|
||||
if test $(head -5 CHANGES.md | tail -1 | awk '{print $3}') = "YYYY-MM-DD"; then
|
||||
echo "Still need to update CHANGES.md release date."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(grep PDFIO_VERSION= configure | awk -F \" '{print $2}') != "$version"; then
|
||||
echo "Still need to run 'autoconf -f'."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(grep '<version>' pdfio_native.nuspec | sed -E -e '1,$s/^.*<version>([0-9.]+).*$/\1/') != "$version"; then
|
||||
echo "Still need to update version in 'pdfio_native.nuspec'."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(grep '<version>' pdfio_native.redist.nuspec | sed -E -e '1,$s/^.*<version>([0-9.]+).*$/\1/') != "$version"; then
|
||||
echo "Still need to update version in 'pdfio_native.redist.nuspec'."
|
||||
exit 1
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(grep PDFIO_VERSION pdfio.h | awk -F \" '{print $2}') != "$version"; then
|
||||
echo "Still need to update PDFIO_VERSION in 'pdfio.h'."
|
||||
status=1
|
||||
fi
|
||||
if test $(grep PDFIO_VERSION_MAJOR pdfio.h | awk '{print $4}') != "$version_major"; then
|
||||
echo "Still need to update PDFIO_VERSION_MAJOR in 'pdfio.h'."
|
||||
status=1
|
||||
fi
|
||||
if test $(grep PDFIO_VERSION_MINOR pdfio.h | awk '{print $4}') != "$version_minor"; then
|
||||
echo "Still need to update PDFIO_VERSION_MINOR in 'pdfio.h'."
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $(grep VERSION pdfio1.def | awk '{print $2}') != "$version_major.$version_minor"; then
|
||||
echo "Still need to update VERSION in 'pdfio1.def'."
|
||||
status=1
|
||||
fi
|
||||
|
||||
if test $status = 1; then
|
||||
exit 1
|
||||
fi
|
||||
|
||||
|
@ -1,5 +1,7 @@
|
||||
<?xml version="1.0" encoding="utf-8"?>
|
||||
<packages>
|
||||
<package id="libpng_native" version="1.6.30" targetFramework="native" />
|
||||
<package id="libpng_native.redist" version="1.6.30" targetFramework="native" />
|
||||
<package id="zlib_native" version="1.2.11" targetFramework="native" />
|
||||
<package id="zlib_native.redist" version="1.2.11" targetFramework="native" />
|
||||
</packages>
|
181
pdfio-aes.c
@ -1,7 +1,7 @@
|
||||
//
|
||||
// AES functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -76,18 +76,18 @@ static const uint8_t Rcon[11] = // Round constants
|
||||
// Local functions...
|
||||
//
|
||||
|
||||
static void AddRoundKey(size_t round, state_t *state, const uint8_t *RoundKey);
|
||||
static void SubBytes(state_t *state);
|
||||
static void ShiftRows(state_t *state);
|
||||
static void add_round_key(size_t round, state_t *state, const uint8_t *round_key);
|
||||
static void sub_bytes(state_t *state);
|
||||
static void shift_rows(state_t *state);
|
||||
static uint8_t xtime(uint8_t x);
|
||||
static void MixColumns(state_t *state);
|
||||
static uint8_t Multiply(uint8_t x, uint8_t y);
|
||||
static void InvMixColumns(state_t *state);
|
||||
static void InvSubBytes(state_t *state);
|
||||
static void InvShiftRows(state_t *state);
|
||||
static void Cipher(state_t *state, const _pdfio_aes_t *ctx);
|
||||
static void InvCipher(state_t *state, const _pdfio_aes_t *ctx);
|
||||
static void XorWithIv(uint8_t *buf, const uint8_t *Iv);
|
||||
static void mix_columns(state_t *state);
|
||||
static uint8_t multiply(uint8_t x, uint8_t y);
|
||||
static void inv_mix_columns(state_t *state);
|
||||
static void inv_sub_bytes(state_t *state);
|
||||
static void inv_shift_rows(state_t *state);
|
||||
static void cipher(state_t *state, const _pdfio_aes_t *ctx);
|
||||
static void inv_cipher(state_t *state, const _pdfio_aes_t *ctx);
|
||||
static void xor_with_iv(uint8_t *buf, const uint8_t *Iv);
|
||||
|
||||
|
||||
//
|
||||
@ -106,7 +106,6 @@ _pdfioCryptoAESInit(
|
||||
*rkptr, // Current round_key values
|
||||
*rkend, // End of round_key values
|
||||
tempa[4]; // Used for the column/row operations
|
||||
// size_t roundlen = keylen + 24; // Length of round_key
|
||||
size_t nwords = keylen / 4; // Number of 32-bit words in key
|
||||
|
||||
|
||||
@ -188,8 +187,8 @@ _pdfioCryptoAESDecrypt(
|
||||
while (len > 15)
|
||||
{
|
||||
memcpy(next_iv, outbuffer, 16);
|
||||
InvCipher((state_t *)outbuffer, ctx);
|
||||
XorWithIv(outbuffer, ctx->iv);
|
||||
inv_cipher((state_t *)outbuffer, ctx);
|
||||
xor_with_iv(outbuffer, ctx->iv);
|
||||
memcpy(ctx->iv, next_iv, 16);
|
||||
outbuffer += 16;
|
||||
len -= 16;
|
||||
@ -231,8 +230,8 @@ _pdfioCryptoAESEncrypt(
|
||||
|
||||
while (len > 15)
|
||||
{
|
||||
XorWithIv(outbuffer, iv);
|
||||
Cipher((state_t*)outbuffer, ctx);
|
||||
xor_with_iv(outbuffer, iv);
|
||||
cipher((state_t*)outbuffer, ctx);
|
||||
iv = outbuffer;
|
||||
outbuffer += 16;
|
||||
len -= 16;
|
||||
@ -242,10 +241,10 @@ _pdfioCryptoAESEncrypt(
|
||||
if (len > 0)
|
||||
{
|
||||
// Pad the final buffer with (16 - len)...
|
||||
memset(outbuffer + len, 16 - len, 16 - len);
|
||||
memset(outbuffer + len, (int)(16 - len), 16 - len);
|
||||
|
||||
XorWithIv(outbuffer, iv);
|
||||
Cipher((state_t*)outbuffer, ctx);
|
||||
xor_with_iv(outbuffer, iv);
|
||||
cipher((state_t*)outbuffer, ctx);
|
||||
iv = outbuffer;
|
||||
outbytes += 16;
|
||||
}
|
||||
@ -257,24 +256,32 @@ _pdfioCryptoAESEncrypt(
|
||||
}
|
||||
|
||||
|
||||
// This function adds the round key to state.
|
||||
//
|
||||
// 'add_round_key()' - Add the round key to state.
|
||||
//
|
||||
// The round key is added to the state by an XOR function.
|
||||
//
|
||||
|
||||
static void
|
||||
AddRoundKey(size_t round, state_t *state, const uint8_t *RoundKey)
|
||||
add_round_key(size_t round, // I - Which round
|
||||
state_t *state, // I - Current state
|
||||
const uint8_t *round_key) // I - Key
|
||||
{
|
||||
unsigned i; // Looping var
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
|
||||
|
||||
for (RoundKey += round * 16, i = 16; i > 0; i --, sptr ++, RoundKey ++)
|
||||
*sptr ^= *RoundKey;
|
||||
for (round_key += round * 16, i = 16; i > 0; i --, sptr ++, round_key ++)
|
||||
*sptr ^= *round_key;
|
||||
}
|
||||
|
||||
|
||||
// The SubBytes Function Substitutes the values in the
|
||||
// state matrix with values in an S-box.
|
||||
//
|
||||
// 'sub_bytes()' - Substitute the values in the state matrix with values in an S-box.
|
||||
//
|
||||
|
||||
static void
|
||||
SubBytes(state_t *state)
|
||||
sub_bytes(state_t *state) // I - Current state
|
||||
{
|
||||
unsigned i; // Looping var
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
@ -284,11 +291,16 @@ SubBytes(state_t *state)
|
||||
*sptr = sbox[*sptr];
|
||||
}
|
||||
|
||||
// The ShiftRows() function shifts the rows in the state to the left.
|
||||
|
||||
//
|
||||
// 'shift_rows()' - Shift the rows in the state to the left.
|
||||
//
|
||||
// Each row is shifted with different offset.
|
||||
// Offset = Row number. So the first row is not shifted.
|
||||
//
|
||||
|
||||
static void
|
||||
ShiftRows(state_t *state)
|
||||
shift_rows(state_t *state) // I - Current state
|
||||
{
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
uint8_t temp; // Temporary value
|
||||
@ -319,21 +331,29 @@ ShiftRows(state_t *state)
|
||||
}
|
||||
|
||||
|
||||
static uint8_t
|
||||
xtime(uint8_t x)
|
||||
//
|
||||
// 'xtime()' - Compute the AES xtime function.
|
||||
//
|
||||
|
||||
static uint8_t // O - xtime(x)
|
||||
xtime(uint8_t x) // I - Column value
|
||||
{
|
||||
return ((uint8_t)((x << 1) ^ ((x >> 7) * 0x1b)));
|
||||
}
|
||||
|
||||
|
||||
// MixColumns function mixes the columns of the state matrix
|
||||
//
|
||||
// 'mix_columns()' - Mix the columns of the state matrix.
|
||||
//
|
||||
|
||||
static void
|
||||
MixColumns(state_t *state)
|
||||
mix_columns(state_t *state) // I - Current state
|
||||
{
|
||||
unsigned i; // Looping var
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
uint8_t Tmp, Tm, t; // Temporary values
|
||||
|
||||
|
||||
for (i = 4; i > 0; i --, sptr += 4)
|
||||
{
|
||||
t = sptr[0];
|
||||
@ -357,11 +377,15 @@ MixColumns(state_t *state)
|
||||
}
|
||||
|
||||
|
||||
// Multiply is used to multiply numbers in the field GF(2^8)
|
||||
//
|
||||
// 'multiply()' - Multiply numbers in the field GF(2^8)
|
||||
//
|
||||
// Note: The last call to xtime() is unneeded, but often ends up generating a smaller binary
|
||||
// The compiler seems to be able to vectorize the operation better this way.
|
||||
// See https://github.com/kokke/tiny-AES-c/pull/34
|
||||
static uint8_t Multiply(uint8_t x, uint8_t y)
|
||||
//
|
||||
|
||||
static uint8_t multiply(uint8_t x, uint8_t y)
|
||||
{
|
||||
return (((y & 1) * x) ^
|
||||
((y>>1 & 1) * xtime(x)) ^
|
||||
@ -371,11 +395,15 @@ static uint8_t Multiply(uint8_t x, uint8_t y)
|
||||
}
|
||||
|
||||
|
||||
// MixColumns function mixes the columns of the state matrix.
|
||||
//
|
||||
// 'inv_mix_columns()' - Reverse the mixing of the state matrix columns.
|
||||
//
|
||||
// The method used to multiply may be difficult to understand for the inexperienced.
|
||||
// Please use the references to gain more information.
|
||||
//
|
||||
|
||||
static void
|
||||
InvMixColumns(state_t *state)
|
||||
inv_mix_columns(state_t *state) // I - Current state
|
||||
{
|
||||
unsigned i; // Looping var
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
@ -389,18 +417,20 @@ InvMixColumns(state_t *state)
|
||||
c = sptr[2];
|
||||
d = sptr[3];
|
||||
|
||||
*sptr++ = Multiply(a, 0x0e) ^ Multiply(b, 0x0b) ^ Multiply(c, 0x0d) ^ Multiply(d, 0x09);
|
||||
*sptr++ = Multiply(a, 0x09) ^ Multiply(b, 0x0e) ^ Multiply(c, 0x0b) ^ Multiply(d, 0x0d);
|
||||
*sptr++ = Multiply(a, 0x0d) ^ Multiply(b, 0x09) ^ Multiply(c, 0x0e) ^ Multiply(d, 0x0b);
|
||||
*sptr++ = Multiply(a, 0x0b) ^ Multiply(b, 0x0d) ^ Multiply(c, 0x09) ^ Multiply(d, 0x0e);
|
||||
*sptr++ = multiply(a, 0x0e) ^ multiply(b, 0x0b) ^ multiply(c, 0x0d) ^ multiply(d, 0x09);
|
||||
*sptr++ = multiply(a, 0x09) ^ multiply(b, 0x0e) ^ multiply(c, 0x0b) ^ multiply(d, 0x0d);
|
||||
*sptr++ = multiply(a, 0x0d) ^ multiply(b, 0x09) ^ multiply(c, 0x0e) ^ multiply(d, 0x0b);
|
||||
*sptr++ = multiply(a, 0x0b) ^ multiply(b, 0x0d) ^ multiply(c, 0x09) ^ multiply(d, 0x0e);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
// The SubBytes Function Substitutes the values in the
|
||||
// state matrix with values in an S-box.
|
||||
//
|
||||
// 'inv_sub_bytes()' - Substitute the values in the state matrix with values in the inverse S-box.
|
||||
//
|
||||
|
||||
static void
|
||||
InvSubBytes(state_t *state)
|
||||
inv_sub_bytes(state_t *state) // I - Current state
|
||||
{
|
||||
unsigned i; // Looping var
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
@ -411,8 +441,12 @@ InvSubBytes(state_t *state)
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'inv_shift_rows()' - Shift the rows in the state to the right.
|
||||
//
|
||||
|
||||
static void
|
||||
InvShiftRows(state_t *state)
|
||||
inv_shift_rows(state_t *state) // I - Current state
|
||||
{
|
||||
uint8_t *sptr = (*state)[0]; // Pointer into state
|
||||
uint8_t temp; // Temporary value
|
||||
@ -443,40 +477,52 @@ InvShiftRows(state_t *state)
|
||||
}
|
||||
|
||||
|
||||
// Cipher is the main function that encrypts the PlainText.
|
||||
//
|
||||
// 'cipher()' - Encrypt the PlainText.
|
||||
//
|
||||
|
||||
static void
|
||||
Cipher(state_t *state, const _pdfio_aes_t *ctx)
|
||||
cipher(state_t *state, // I - Current state
|
||||
const _pdfio_aes_t *ctx) // I - AES context
|
||||
{
|
||||
size_t round = 0;
|
||||
size_t round = 0; // Current round
|
||||
|
||||
|
||||
// Add the First round key to the state before starting the rounds.
|
||||
AddRoundKey(0, state, ctx->round_key);
|
||||
add_round_key(0, state, ctx->round_key);
|
||||
|
||||
// There will be Nr rounds.
|
||||
// The first Nr-1 rounds are identical.
|
||||
// These Nr rounds are executed in the loop below.
|
||||
// Last one without MixColumns()
|
||||
// Last one without mix_columns()
|
||||
for (round = 1; round < ctx->round_size; round ++)
|
||||
{
|
||||
SubBytes(state);
|
||||
ShiftRows(state);
|
||||
MixColumns(state);
|
||||
AddRoundKey(round, state, ctx->round_key);
|
||||
sub_bytes(state);
|
||||
shift_rows(state);
|
||||
mix_columns(state);
|
||||
add_round_key(round, state, ctx->round_key);
|
||||
}
|
||||
|
||||
// Add round key to last round
|
||||
SubBytes(state);
|
||||
ShiftRows(state);
|
||||
AddRoundKey(ctx->round_size, state, ctx->round_key);
|
||||
sub_bytes(state);
|
||||
shift_rows(state);
|
||||
add_round_key(ctx->round_size, state, ctx->round_key);
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'inv_cipher()' - Decrypt the CipherText.
|
||||
//
|
||||
|
||||
static void
|
||||
InvCipher(state_t *state, const _pdfio_aes_t *ctx)
|
||||
inv_cipher(state_t *state, // I - Current state
|
||||
const _pdfio_aes_t *ctx) // I - AES context
|
||||
{
|
||||
size_t round;
|
||||
size_t round; // Current round
|
||||
|
||||
|
||||
// Add the First round key to the state before starting the rounds.
|
||||
AddRoundKey(ctx->round_size, state, ctx->round_key);
|
||||
add_round_key(ctx->round_size, state, ctx->round_key);
|
||||
|
||||
// There will be Nr rounds.
|
||||
// The first Nr-1 rounds are identical.
|
||||
@ -484,20 +530,25 @@ InvCipher(state_t *state, const _pdfio_aes_t *ctx)
|
||||
// Last one without inv_mix_columns()
|
||||
for (round = ctx->round_size - 1; ; round --)
|
||||
{
|
||||
InvShiftRows(state);
|
||||
InvSubBytes(state);
|
||||
AddRoundKey(round, state, ctx->round_key);
|
||||
inv_shift_rows(state);
|
||||
inv_sub_bytes(state);
|
||||
add_round_key(round, state, ctx->round_key);
|
||||
|
||||
if (round == 0)
|
||||
break;
|
||||
|
||||
InvMixColumns(state);
|
||||
inv_mix_columns(state);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'xor_with_iv()' - XOR a block with the initialization vector.
|
||||
//
|
||||
|
||||
static void
|
||||
XorWithIv(uint8_t *buf, const uint8_t *Iv)
|
||||
xor_with_iv(uint8_t *buf, // I - Block
|
||||
const uint8_t *Iv) // I - Initialization vector
|
||||
{
|
||||
// 16-byte block...
|
||||
*buf++ ^= *Iv++;
|
||||
|
@ -640,6 +640,8 @@ _pdfioArrayRead(pdfio_file_t *pdf, // I - PDF file
|
||||
//
|
||||
// 'pdfioArrayRemove()' - Remove an array entry.
|
||||
//
|
||||
// @since PDFio v1.4@
|
||||
//
|
||||
|
||||
bool // O - `true` on success, `false` otherwise
|
||||
pdfioArrayRemove(pdfio_array_t *a, // I - Array
|
||||
|
@ -47,7 +47,7 @@ _pdfioFileConsume(pdfio_file_t *pdf, // I - PDF file
|
||||
// `false` to halt.
|
||||
//
|
||||
|
||||
bool // O - `false` to stop
|
||||
bool // O - `false` to stop, `true` to continue
|
||||
_pdfioFileDefaultError(
|
||||
pdfio_file_t *pdf, // I - PDF file
|
||||
const char *message, // I - Error message
|
||||
@ -57,7 +57,7 @@ _pdfioFileDefaultError(
|
||||
|
||||
fprintf(stderr, "%s: %s\n", pdf->filename, message);
|
||||
|
||||
return (false);
|
||||
return (!strncmp(message, "WARNING:", 8));
|
||||
}
|
||||
|
||||
|
||||
@ -98,7 +98,7 @@ _pdfioFileFlush(pdfio_file_t *pdf) // I - PDF file
|
||||
if (!write_buffer(pdf, pdf->buffer, (size_t)(pdf->bufptr - pdf->buffer)))
|
||||
return (false);
|
||||
|
||||
pdf->bufpos += pdf->bufptr - pdf->buffer;
|
||||
pdf->bufpos += (off_t)(pdf->bufptr - pdf->buffer);
|
||||
}
|
||||
|
||||
pdf->bufptr = pdf->buffer;
|
||||
@ -134,19 +134,20 @@ _pdfioFileGetChar(pdfio_file_t *pdf) // I - PDF file
|
||||
bool // O - `true` on success, `false` on error
|
||||
_pdfioFileGets(pdfio_file_t *pdf, // I - PDF file
|
||||
char *buffer, // I - Line buffer
|
||||
size_t bufsize) // I - Size of line buffer
|
||||
size_t bufsize, // I - Size of line buffer
|
||||
bool discard) // I - OK to discard excess line chars?
|
||||
{
|
||||
bool eol = false; // End of line?
|
||||
char *bufptr = buffer, // Pointer into buffer
|
||||
*bufend = buffer + bufsize - 1; // Pointer to end of buffer
|
||||
|
||||
|
||||
PDFIO_DEBUG("_pdfioFileGets(pdf=%p, buffer=%p, bufsize=%lu) bufpos=%ld, buffer=%p, bufptr=%p, bufend=%p, offset=%lu\n", pdf, buffer, (unsigned long)bufsize, (long)pdf->bufpos, pdf->buffer, pdf->bufptr, pdf->bufend, (unsigned long)(pdf->bufpos + (pdf->bufptr - pdf->buffer)));
|
||||
PDFIO_DEBUG("_pdfioFileGets(pdf=%p, buffer=%p, bufsize=%lu, discard=%s) bufpos=%ld, buffer=%p, bufptr=%p, bufend=%p, offset=%lu\n", pdf, buffer, (unsigned long)bufsize, discard ? "true" : "false", (long)pdf->bufpos, pdf->buffer, pdf->bufptr, pdf->bufend, (unsigned long)(pdf->bufpos + (pdf->bufptr - pdf->buffer)));
|
||||
|
||||
while (!eol)
|
||||
{
|
||||
// If there are characters ready in the buffer, use them...
|
||||
while (!eol && pdf->bufptr < pdf->bufend && bufptr < bufend)
|
||||
while (!eol && pdf->bufptr < pdf->bufend)
|
||||
{
|
||||
char ch = *(pdf->bufptr++); // Next character in buffer
|
||||
|
||||
@ -168,8 +169,10 @@ _pdfioFileGets(pdfio_file_t *pdf, // I - PDF file
|
||||
pdf->bufptr ++;
|
||||
}
|
||||
}
|
||||
else
|
||||
else if (bufptr < bufend)
|
||||
*bufptr++ = ch;
|
||||
else if (!discard)
|
||||
break;
|
||||
}
|
||||
|
||||
// Fill the read buffer as needed...
|
||||
@ -216,7 +219,7 @@ _pdfioFilePeek(pdfio_file_t *pdf, // I - PDF file
|
||||
PDFIO_DEBUG("_pdfioFilePeek: Sliding buffer, total=%ld\n", (long)total);
|
||||
|
||||
memmove(pdf->buffer, pdf->bufptr, total);
|
||||
pdf->bufpos += pdf->bufptr - pdf->buffer;
|
||||
pdf->bufpos += (off_t)(pdf->bufptr - pdf->buffer);
|
||||
pdf->bufptr = pdf->buffer;
|
||||
pdf->bufend = pdf->buffer + total;
|
||||
|
||||
@ -317,14 +320,14 @@ _pdfioFileRead(pdfio_file_t *pdf, // I - PDF file
|
||||
// Advance current position in file as needed...
|
||||
if (pdf->bufend)
|
||||
{
|
||||
pdf->bufpos += pdf->bufend - pdf->buffer;
|
||||
pdf->bufpos += (off_t)(pdf->bufend - pdf->buffer);
|
||||
pdf->bufptr = pdf->bufend = NULL;
|
||||
}
|
||||
|
||||
// Read directly from the file...
|
||||
if ((rbytes = read_buffer(pdf, bufptr, bytes)) > 0)
|
||||
{
|
||||
pdf->bufpos += rbytes;
|
||||
pdf->bufpos += (off_t)rbytes;
|
||||
continue;
|
||||
}
|
||||
else if (rbytes < 0 && (errno == EINTR || errno == EAGAIN))
|
@@ -361,14 +364,14 @@ _pdfioFileSeek(pdfio_file_t *pdf, // I - PDF file
 // Adjust offset for relative seeks...
 if (whence == SEEK_CUR)
 {
-offset += pdf->bufpos + (pdf->bufptr - pdf->buffer);
+offset += pdf->bufpos + (off_t)(pdf->bufptr - pdf->buffer);
 whence = SEEK_SET;
 }

 if (pdf->mode == _PDFIO_MODE_READ)
 {
 // Reading, see if we already have the data we need...
-if (whence != SEEK_END && offset >= pdf->bufpos && pdf->bufend && offset < (pdf->bufpos + pdf->bufend - pdf->buffer))
+if (whence != SEEK_END && offset >= pdf->bufpos && pdf->bufend && offset < (off_t)(pdf->bufpos + pdf->bufend - pdf->buffer))
 {
 // Yes, seek within existing buffer...
 pdf->bufptr = pdf->buffer + (offset - pdf->bufpos);
@@ -424,7 +427,7 @@ off_t // O - Offset from beginning of file
 _pdfioFileTell(pdfio_file_t *pdf) // I - PDF file
 {
 if (pdf->bufptr)
-return (pdf->bufpos + (pdf->bufptr - pdf->buffer));
+return (pdf->bufpos + (off_t)(pdf->bufptr - pdf->buffer));
 else
 return (pdf->bufpos);
 }
@@ -452,7 +455,7 @@ _pdfioFileWrite(pdfio_file_t *pdf, // I - PDF file
 if (!write_buffer(pdf, buffer, bytes))
 return (false);

-pdf->bufpos += bytes;
+pdf->bufpos += (off_t)bytes;

 return (true);
 }
@@ -478,7 +481,7 @@ fill_buffer(pdfio_file_t *pdf) // I - PDF file

 // Advance current position in file as needed...
 if (pdf->bufend)
-pdf->bufpos += pdf->bufend - pdf->buffer;
+pdf->bufpos += (off_t)(pdf->bufend - pdf->buffer);

 // Try reading from the file...
 if ((bytes = read_buffer(pdf, pdf->buffer, sizeof(pdf->buffer))) <= 0)
1483
pdfio-content.c
@ -120,15 +120,16 @@ extern bool pdfioContentTextMoveLine(pdfio_stream_t *st, double tx, double ty)
|
||||
extern bool pdfioContentTextMoveTo(pdfio_stream_t *st, double tx, double ty) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextNewLine(pdfio_stream_t *st) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextNewLineShow(pdfio_stream_t *st, double ws, double cs, bool unicode, const char *s) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextNewLineShowf(pdfio_stream_t *st, double ws, double cs, bool unicode, const char *format, ...) _PDFIO_PUBLIC _PDFIO_FORMAT(5,6);
|
||||
extern bool pdfioContentTextNewLineShowf(pdfio_stream_t *st, double ws, double cs, bool unicode, const char *format, ...) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextNextLine(pdfio_stream_t *st) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextShow(pdfio_stream_t *st, bool unicode, const char *s) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextShowf(pdfio_stream_t *st, bool unicode, const char *format, ...) _PDFIO_PUBLIC _PDFIO_FORMAT(3,4);
|
||||
extern bool pdfioContentTextShowf(pdfio_stream_t *st, bool unicode, const char *format, ...) _PDFIO_PUBLIC;
|
||||
extern bool pdfioContentTextShowJustified(pdfio_stream_t *st, bool unicode, size_t num_fragments, const double *offsets, const char * const *fragments) _PDFIO_PUBLIC;
|
||||
|
||||
// Resource helpers...
|
||||
extern pdfio_obj_t *pdfioFileCreateFontObjFromBase(pdfio_file_t *pdf, const char *name) _PDFIO_PUBLIC;
|
||||
extern pdfio_obj_t *pdfioFileCreateFontObjFromFile(pdfio_file_t *pdf, const char *filename, bool unicode) _PDFIO_PUBLIC;
|
||||
extern pdfio_obj_t *pdfioFileCreateICCObjFromData(pdfio_file_t *pdf, const unsigned char *data, size_t datalen, size_t num_colors) _PDFIO_PUBLIC;
|
||||
extern pdfio_obj_t *pdfioFileCreateICCObjFromFile(pdfio_file_t *pdf, const char *filename, size_t num_colors) _PDFIO_PUBLIC;
|
||||
extern pdfio_obj_t *pdfioFileCreateImageObjFromData(pdfio_file_t *pdf, const unsigned char *data, size_t width, size_t height, size_t num_colors, pdfio_array_t *color_data, bool alpha, bool interpolate) _PDFIO_PUBLIC;
|
||||
extern pdfio_obj_t *pdfioFileCreateImageObjFromFile(pdfio_file_t *pdf, const char *filename, bool interpolate) _PDFIO_PUBLIC;
|
||||
|
127
pdfio-crypto.c
@@ -1,7 +1,7 @@
 //
 // Cryptographic support functions for PDFio.
 //
-// Copyright © 2021-2023 by Michael R Sweet.
+// Copyright © 2021-2025 by Michael R Sweet.
 //
 // Licensed under Apache License v2.0. See the file "LICENSE" for more
 // information.
@@ -98,7 +98,7 @@ static uint8_t pdf_passpad[32] = // Padding for passwords

 static void decrypt_user_key(pdfio_encryption_t encryption, const uint8_t *file_key, uint8_t user_key[32]);
 static void encrypt_user_key(pdfio_encryption_t encryption, const uint8_t *file_key, uint8_t user_key[32]);
-static void make_file_key(pdfio_encryption_t encryption, pdfio_permission_t permissions, const unsigned char *file_id, size_t file_idlen, const uint8_t *user_pad, const uint8_t *owner_key, uint8_t file_key[16]);
+static void make_file_key(pdfio_encryption_t encryption, pdfio_permission_t permissions, const unsigned char *file_id, size_t file_idlen, const uint8_t *user_pad, const uint8_t *owner_key, bool encrypt_metadata, uint8_t file_key[16]);
 static void make_owner_key(pdfio_encryption_t encryption, const uint8_t *owner_pad, const uint8_t *user_pad, uint8_t owner_key[32]);
 static void make_user_key(const unsigned char *file_id, size_t file_idlen, uint8_t user_key[32]);
 static void pad_password(const char *password, uint8_t pad[32]);
@@ -158,7 +158,7 @@ _pdfioCryptoLock(
 // Generate the encryption key
 file_id = pdfioArrayGetBinary(pdf->id_array, 0, &file_idlen);

-make_file_key(encryption, permissions, file_id, file_idlen, user_pad, pdf->owner_key, pdf->file_key);
+make_file_key(encryption, permissions, file_id, file_idlen, user_pad, pdf->owner_key, pdf->encrypt_metadata, pdf->file_key);
 pdf->file_keylen = 16;

 // Generate the user key...
@@ -409,13 +409,6 @@ _pdfioCryptoMakeReader(
 uint8_t data[21]; // Key data
 _pdfio_md5_t md5; // MD5 state
 uint8_t digest[16]; // MD5 digest value
-#if PDFIO_OBJ_CRYPT
-pdfio_array_t *id_array; // Object ID array
-unsigned char *id_value; // Object ID value
-size_t id_len; // Length of object ID
-uint8_t temp_key[16]; // File key for object
-#endif // PDFIO_OBJ_CRYPT
-uint8_t *file_key; // Computed file key to use


 PDFIO_DEBUG("_pdfioCryptoMakeReader(pdf=%p, obj=%p(%d), ctx=%p, iv=%p, ivlen=%p(%d))\n", pdf, obj, (int)obj->number, ctx, iv, ivlen, (int)*ivlen);
@@ -427,81 +420,29 @@ _pdfioCryptoMakeReader(
 return (NULL);
 }

-#if PDFIO_OBJ_CRYPT
-if ((id_array = pdfioDictGetArray(pdfioObjGetDict(obj), "ID")) != NULL)
-{
-// Object has its own ID that will get used for encryption...
-_pdfio_md5_t md5; // MD5 context
-uint8_t file_digest[16]; // MD5 digest of file ID and pad
-uint8_t user_pad[32], // Padded user password
-own_user_key[32], // Calculated user key
-pdf_user_key[32]; // Decrypted user key
-
-PDFIO_DEBUG("_pdfioCryptoMakeReader: Per-object file ID.\n");
-
-if ((id_value = pdfioArrayGetBinary(id_array, 0, &id_len)) == NULL)
-{
-*ivlen = 0;
-return (NULL);
-}
-
-_pdfioCryptoMD5Init(&md5);
-_pdfioCryptoMD5Append(&md5, pdf_passpad, 32);
-_pdfioCryptoMD5Append(&md5, id_value, id_len);
-_pdfioCryptoMD5Finish(&md5, file_digest);
-
-make_owner_key(pdf->encryption, pdf->password, pdf->owner_key, user_pad);
-make_file_key(pdf->encryption, pdf->permissions, id_value, id_len, user_pad, pdf->owner_key, temp_key);
-make_user_key(id_value, id_len, own_user_key);
-
-if (memcmp(own_user_key, pdf->user_key, sizeof(own_user_key)))
-{
-PDFIO_DEBUG("_pdfioCryptoMakeReader: Not user password, trying owner password.\n");
-
-make_file_key(pdf->encryption, pdf->permissions, id_value, id_len, pdf->password, pdf->owner_key, temp_key);
-make_user_key(id_value, id_len, own_user_key);
-
-memcpy(pdf_user_key, pdf->user_key, sizeof(pdf_user_key));
-decrypt_user_key(pdf->encryption, temp_key, pdf_user_key);
-
-if (memcmp(pdf->password, pdf_user_key, 32) && memcmp(own_user_key, pdf_user_key, 16))
-{
-*ivlen = 0;
-return (NULL);
-}
-}
-
-file_key = temp_key;
-}
-else
-#endif // PDFIO_OBJ_CRYPT
-{
-// Use the default file key...
-file_key = pdf->file_key;
-}
-
 switch (pdf->encryption)
 {
 default :
 _pdfioFileError(pdf, "Unsupported encryption algorithm.");
 *ivlen = 0;
 return (NULL);

 case PDFIO_ENCRYPTION_RC4_40 :
 // Copy the key data for the MD5 hash.
-memcpy(data, file_key, 16);
-data[16] = (uint8_t)obj->number;
-data[17] = (uint8_t)(obj->number >> 8);
-data[18] = (uint8_t)(obj->number >> 16);
-data[19] = (uint8_t)obj->generation;
-data[20] = (uint8_t)(obj->generation >> 8);
+memcpy(data, pdf->file_key, 5);
+data[5] = (uint8_t)obj->number;
+data[6] = (uint8_t)(obj->number >> 8);
+data[7] = (uint8_t)(obj->number >> 16);
+data[8] = (uint8_t)obj->generation;
+data[9] = (uint8_t)(obj->generation >> 8);

 // Hash it...
 _pdfioCryptoMD5Init(&md5);
-_pdfioCryptoMD5Append(&md5, data, sizeof(data));
+_pdfioCryptoMD5Append(&md5, data, 10);
 _pdfioCryptoMD5Finish(&md5, digest);

-// Initialize the RC4 context using 40 bits of the digest...
-_pdfioCryptoRC4Init(&ctx->rc4, digest, 5);
+// Initialize the RC4 context using 80 bits of the digest...
+_pdfioCryptoRC4Init(&ctx->rc4, digest, 10);
 *ivlen = 0;
 return ((_pdfio_crypto_cb_t)_pdfioCryptoRC4Crypt);

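Note on the RC4-40 case above: the hashed input shrinks from 21 bytes to 10 (5 file-key bytes, the 3 low-order bytes of the object number, then 2 bytes of the generation number, each least-significant byte first), and RC4 is keyed with 10 digest bytes instead of 5. The sketch below only assembles that MD5 input for a key of `file_keylen` bytes; hashing and RC4 are left out. `make_object_key_data` is a hypothetical name for illustration, not a PDFio function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): build the per-object MD5 input
// "file key || object number (3 bytes, LSB first) || generation (2 bytes,
// LSB first)".  Returns the number of bytes written to `data`.
static size_t
make_object_key_data(const uint8_t *file_key, size_t file_keylen, unsigned number, unsigned generation, uint8_t data[21])
{
  assert(file_keylen <= 16);            // 16 + 5 bytes fit in data[21]

  memcpy(data, file_key, file_keylen);
  data[file_keylen + 0] = (uint8_t)number;
  data[file_keylen + 1] = (uint8_t)(number >> 8);
  data[file_keylen + 2] = (uint8_t)(number >> 16);
  data[file_keylen + 3] = (uint8_t)generation;
  data[file_keylen + 4] = (uint8_t)(generation >> 8);

  return (file_keylen + 5);
}
```

For a 5-byte key this yields the 10 bytes hashed in the new RC4-40 code, and for a 16-byte key the 21 bytes hashed in the RC4-128 case.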
@@ -515,7 +456,7 @@ _pdfioCryptoMakeReader(

 case PDFIO_ENCRYPTION_RC4_128 :
 // Copy the key data for the MD5 hash.
-memcpy(data, file_key, 16);
+memcpy(data, pdf->file_key, 16);
 data[16] = (uint8_t)obj->number;
 data[17] = (uint8_t)(obj->number >> 8);
 data[18] = (uint8_t)(obj->number >> 16);
@@ -641,6 +582,8 @@ _pdfioCryptoUnlock(
 file_idlen; // Length of file ID
 _pdfio_md5_t md5; // MD5 context
 uint8_t file_digest[16]; // MD5 digest of file ID and pad
+double p; // Permissions value as a double
+_pdfio_value_t *value; // Encrypt dictionary value, if any


 // See if we support the type of encryption specified by the Encrypt object
@@ -656,7 +599,12 @@ _pdfioCryptoUnlock(
 revision = (int)pdfioDictGetNumber(encrypt_dict, "R");
 length = (int)pdfioDictGetNumber(encrypt_dict, "Length");

-PDFIO_DEBUG("_pdfioCryptoUnlock: handler=%p(%s), version=%d, revision=%d, length=%d\n", (void *)handler, handler ? handler : "(null)", version, revision, length);
+if ((value = _pdfioDictGetValue(encrypt_dict, "EncryptMetadata")) != NULL && value->type == PDFIO_VALTYPE_BOOLEAN)
+pdf->encrypt_metadata = value->value.boolean;
+else
+pdf->encrypt_metadata = true;
+
+PDFIO_DEBUG("_pdfioCryptoUnlock: handler=%p(%s), version=%d, revision=%d, length=%d, encrypt_metadata=%s\n", (void *)handler, handler ? handler : "(null)", version, revision, length, pdf->encrypt_metadata ? "true" : "false");

 if (!handler || strcmp(handler, "Standard"))
 {
@@ -748,8 +696,13 @@ _pdfioCryptoUnlock(

 // Grab the remaining values we need to unlock the PDF...
 pdf->file_keylen = (size_t)(length / 8);
-pdf->permissions = (pdfio_permission_t)pdfioDictGetNumber(encrypt_dict, "P");
+
+p = pdfioDictGetNumber(encrypt_dict, "P");
+PDFIO_DEBUG("_pdfioCryptoUnlock: P=%.0f\n", p);
+if (p < 0x7fffffff) // Handle integers > 2^31-1
+pdf->permissions = (pdfio_permission_t)p;
+else
+pdf->permissions = (pdfio_permission_t)(p - 4294967296.0);
+PDFIO_DEBUG("_pdfioCryptoUnlock: permissions=%d\n", pdf->permissions);

 owner_key = pdfioDictGetBinary(encrypt_dict, "O", &owner_keylen);
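Note on the permissions hunk above: the `P` value is read as a `double`, and values too large for a signed 32-bit integer are shifted down by 2^32 so that the same bit pattern is stored. A minimal sketch of that conversion follows. `permissions_from_number` is a hypothetical name, `int32_t` stands in for `pdfio_permission_t`, and both the boundary test (`2147483647` is kept as a positive value) and the "return 0 when out of range" policy are choices of this sketch, not behaviour taken from the diff.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helper (not part of PDFio): convert a PDF /P number to a
// signed 32-bit permissions value.
static int32_t
permissions_from_number(double p)
{
  // Values in (2^31-1, 2^32-1] are unsigned renderings of negative numbers
  if (p > 2147483647.0 && p <= 4294967295.0)
    p -= 4294967296.0;

  // Assumed policy: anything still out of range maps to 0 instead of
  // invoking an undefined double-to-integer conversion
  if (p < -2147483648.0 || p > 2147483647.0)
    return (0);

  return ((int32_t)p);
}
```

Both `-4` and `4294967292` map to the same permissions value `-4`, which matters because those four bytes feed into the file key hash.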
@@ -821,7 +774,7 @@ _pdfioCryptoUnlock(
 make_owner_key(pdf->encryption, pad, pdf->owner_key, user_pad);
 PDFIO_DEBUG("_pdfioCryptoUnlock: Upad=%02X%02X%02X%02X...%02X%02X%02X%02X\n", user_pad[0], user_pad[1], user_pad[2], user_pad[3], user_pad[28], user_pad[29], user_pad[30], user_pad[31]);

-make_file_key(pdf->encryption, pdf->permissions, file_id, file_idlen, user_pad, pdf->owner_key, file_key);
+make_file_key(pdf->encryption, pdf->permissions, file_id, file_idlen, user_pad, pdf->owner_key, pdf->encrypt_metadata, file_key);
 PDFIO_DEBUG("_pdfioCryptoUnlock: Fown=%02X%02X%02X%02X...%02X%02X%02X%02X\n", file_key[0], file_key[1], file_key[2], file_key[3], file_key[12], file_key[13], file_key[14], file_key[15]);

 make_user_key(file_id, file_idlen, own_user_key);
@@ -839,7 +792,7 @@ _pdfioCryptoUnlock(
 }

 // Not the owner password, try the user password...
-make_file_key(pdf->encryption, pdf->permissions, file_id, file_idlen, pad, pdf->owner_key, file_key);
+make_file_key(pdf->encryption, pdf->permissions, file_id, file_idlen, pad, pdf->owner_key, pdf->encrypt_metadata, file_key);
 PDFIO_DEBUG("_pdfioCryptoUnlock: Fuse=%02X%02X%02X%02X...%02X%02X%02X%02X\n", file_key[0], file_key[1], file_key[2], file_key[3], file_key[12], file_key[13], file_key[14], file_key[15]);

 make_user_key(file_id, file_idlen, own_user_key);
@@ -971,6 +924,8 @@ make_file_key(
 size_t file_idlen, // I - Length of file ID
 const uint8_t *user_pad, // I - Padded user password
 const uint8_t *owner_key, // I - Owner key
+bool encrypt_metadata,
+// I - Encrypt metadata?
 uint8_t file_key[16]) // O - Encryption key
 {
 size_t i; // Looping var
@@ -984,13 +939,25 @@ make_file_key(
 perm_bytes[2] = (uint8_t)(permissions >> 16);
 perm_bytes[3] = (uint8_t)(permissions >> 24);

+PDFIO_DEBUG("make_file_key: user_pad[32]=<%02X%02X%02X%02X...%02X%02X%02X%02X>\n", user_pad[0], user_pad[1], user_pad[2], user_pad[3], user_pad[28], user_pad[29], user_pad[30], user_pad[31]);
+PDFIO_DEBUG("make_file_key: owner_key[32]=<%02X%02X%02X%02X...%02X%02X%02X%02X>\n", owner_key[0], owner_key[1], owner_key[2], owner_key[3], owner_key[28], owner_key[29], owner_key[30], owner_key[31]);
+PDFIO_DEBUG("make_file_key: permissions(%d)=<%02X%02X%02X%02X>\n", permissions, perm_bytes[0], perm_bytes[1], perm_bytes[2], perm_bytes[3]);
+
 _pdfioCryptoMD5Init(&md5);
 _pdfioCryptoMD5Append(&md5, user_pad, 32);
 _pdfioCryptoMD5Append(&md5, owner_key, 32);
 _pdfioCryptoMD5Append(&md5, perm_bytes, 4);
 _pdfioCryptoMD5Append(&md5, file_id, file_idlen);
+if (!encrypt_metadata)
+{
+uint8_t meta_bytes[4] = { 0xff, 0xff, 0xff, 0xff };
+// Metadata bytes
+_pdfioCryptoMD5Append(&md5, meta_bytes, 4);
+}
 _pdfioCryptoMD5Finish(&md5, digest);

+PDFIO_DEBUG("make_file_key: first md5=<%02X%02X%02X%02X...%02X%02X%02X%02X>\n", digest[0], digest[1], digest[2], digest[3], digest[12], digest[13], digest[14], digest[15]);
+
 if (encryption != PDFIO_ENCRYPTION_RC4_40)
 {
 // MD5 the result 50 times..
@@ -1002,6 +969,8 @@ make_file_key(
 }
 }

+PDFIO_DEBUG("make_file_key: file_key[16]=<%02X%02X%02X%02X...%02X%02X%02X%02X>\n", digest[0], digest[1], digest[2], digest[3], digest[12], digest[13], digest[14], digest[15]);
+
 memcpy(file_key, digest, 16);
 }

@@ -1052,9 +1021,11 @@ make_owner_key(
 // Encrypt 20 times...
 uint8_t encrypt_key[16]; // RC4 encryption key

-for (i = 0; i < 20; i ++)
+for (i = 20; i > 0;)
 {
 // XOR each byte in the digest with the loop counter to make a key...
+i --;
+
 for (j = 0; j < sizeof(encrypt_key); j ++)
 encrypt_key[j] = (uint8_t)(digest[j] ^ i);

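Note on the `make_owner_key` hunk above: the 20 passes now run with the counter going from 19 down to 0 instead of 0 up to 19. Writing the loop as `for (i = 20; i > 0;)` with `i --` as the first statement is a common way to count down with an unsigned counter, since a condition such as `i >= 0` would never become false. The sketch below records the XOR-derived key for each pass so the order is visible. `derive_round_keys` is a hypothetical name, and the sketch assumes the counter is a `size_t` as in `make_file_key`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical helper (not part of PDFio): derive the 20 per-pass keys in
// the order the new loop visits them (counter 19 first, counter 0 last).
// Returns the number of keys written.
static size_t
derive_round_keys(const uint8_t digest[16], uint8_t keys[20][16])
{
  size_t i, j;                          // Looping vars
  size_t n = 0;                         // Number of keys produced

  for (i = 20; i > 0;)
  {
    i --;                               // Decrement first: i runs 19..0

    // XOR each byte in the digest with the loop counter to make a key...
    for (j = 0; j < 16; j ++)
      keys[n][j] = (uint8_t)(digest[j] ^ i);

    n ++;
  }

  return (n);
}
```

With an all-zero digest the first key consists of the byte 19 and the last of the byte 0.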
153
pdfio-dict.c
@@ -1,7 +1,7 @@
 //
 // PDF dictionary functions for PDFio.
 //
-// Copyright © 2021-2024 by Michael R Sweet.
+// Copyright © 2021-2025 by Michael R Sweet.
 //
 // Licensed under Apache License v2.0. See the file "LICENSE" for more
 // information.
@@ -20,6 +20,8 @@ static int compare_pairs(_pdfio_pair_t *a, _pdfio_pair_t *b);
 //
 // 'pdfioDictClear()' - Remove a key/value pair from a dictionary.
 //
+// @since PDFio v1.4@
+//

 bool // O - `true` if cleared, `false` otherwise
 pdfioDictClear(pdfio_dict_t *dict, // I - Dictionary
@@ -345,6 +347,8 @@ pdfioDictGetDict(pdfio_dict_t *dict, // I - Dictionary
 //
 // 'pdfioDictGetKey()' - Get the key for the specified pair.
 //
+// @since PDFio v1.4@
+//

 const char * // O - Key for specified pair
 pdfioDictGetKey(pdfio_dict_t *dict, // I - Dictionary
@@ -375,6 +379,8 @@ pdfioDictGetName(pdfio_dict_t *dict, // I - Dictionary
 //
 // 'pdfioDictGetNumPairs()' - Get the number of key/value pairs in a dictionary.
 //
+// @since PDFio v1.4@
+//

 size_t // O - Number of pairs
 pdfioDictGetNumPairs(pdfio_dict_t *dict)// I - Dictionary
@@ -465,127 +471,12 @@ pdfioDictGetString(pdfio_dict_t *dict, // I - Dictionary
 else if (value && value->type == PDFIO_VALTYPE_BINARY && value->value.binary.datalen < 4096)
 {
 // Convert binary string to regular string...
-char temp[4096], // Temporary string
-*tempptr; // Pointer into temporary string
-unsigned char *dataptr; // Pointer into the data string
+char temp[4096]; // Temporary UTF-8 string

-if (!(value->value.binary.datalen & 1) && !memcmp(value->value.binary.data, "\377\376", 2))
+if (!(value->value.binary.datalen & 1) && (!memcmp(value->value.binary.data, "\376\377", 2) || !memcmp(value->value.binary.data, "\377\376", 2)))
 {
-// Copy UTF-16 BE
-int ch; // Unicode character
-size_t remaining; // Remaining bytes
-
-for (dataptr = value->value.binary.data + 2, remaining = value->value.binary.datalen - 2, tempptr = temp; remaining > 1 && tempptr < (temp + sizeof(temp) - 5); dataptr += 2, remaining -= 2)
-{
-ch = (dataptr[0] << 8) | dataptr[1];
-
-if (ch >= 0xd800 && ch <= 0xdbff && remaining > 3)
-{
-// Multi-word UTF-16 char...
-int lch; // Lower bits
-
-lch = (dataptr[2] << 8) | dataptr[3];
-
-if (lch < 0xdc00 || lch >= 0xdfff)
-break;
-
-ch = (((ch & 0x3ff) << 10) | (lch & 0x3ff)) + 0x10000;
-dataptr += 2;
-remaining -= 2;
-}
-else if (ch >= 0xfffe)
-{
-continue;
-}
-
-if (ch < 128)
-{
-// ASCII
-*tempptr++ = (char)ch;
-}
-else if (ch < 4096)
-{
-// 2-byte UTF-8
-*tempptr++ = (char)(0xc0 | (ch >> 6));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-else if (ch < 65536)
-{
-// 3-byte UTF-8
-*tempptr++ = (char)(0xe0 | (ch >> 12));
-*tempptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-else
-{
-// 4-byte UTF-8
-*tempptr++ = (char)(0xe0 | (ch >> 18));
-*tempptr++ = (char)(0x80 | ((ch >> 12) & 0x3f));
-*tempptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-}
-
-*tempptr = '\0';
-}
-else if (!(value->value.binary.datalen & 1) && !memcmp(value->value.binary.data, "\376\377", 2))
-{
-// Copy UTF-16 LE
-int ch; // Unicode character
-size_t remaining; // Remaining bytes
-
-for (dataptr = value->value.binary.data + 2, remaining = value->value.binary.datalen - 2, tempptr = temp; remaining > 1 && tempptr < (temp + sizeof(temp) - 5); dataptr += 2, remaining -= 2)
-{
-ch = (dataptr[1] << 8) | dataptr[0];
-
-if (ch >= 0xd800 && ch <= 0xdbff && remaining > 3)
-{
-// Multi-word UTF-16 char...
-int lch; // Lower bits
-
-lch = (dataptr[3] << 8) | dataptr[2];
-
-if (lch < 0xdc00 || lch >= 0xdfff)
-break;
-
-ch = (((ch & 0x3ff) << 10) | (lch & 0x3ff)) + 0x10000;
-dataptr += 2;
-remaining -= 2;
-}
-else if (ch >= 0xfffe)
-{
-continue;
-}
-
-if (ch < 128)
-{
-// ASCII
-*tempptr++ = (char)ch;
-}
-else if (ch < 4096)
-{
-// 2-byte UTF-8
-*tempptr++ = (char)(0xc0 | (ch >> 6));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-else if (ch < 65536)
-{
-// 3-byte UTF-8
-*tempptr++ = (char)(0xe0 | (ch >> 12));
-*tempptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-else
-{
-// 4-byte UTF-8
-*tempptr++ = (char)(0xe0 | (ch >> 18));
-*tempptr++ = (char)(0x80 | ((ch >> 12) & 0x3f));
-*tempptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
-*tempptr++ = (char)(0x80 | (ch & 0x3f));
-}
-}
-
-*tempptr = '\0';
+// Copy UTF-16...
+_pdfio_utf16cpy(temp, value->value.binary.data, value->value.binary.datalen, sizeof(temp));
 }
 else
 {
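Note on the `pdfioDictGetString` hunk above: the two inline UTF-16 loops are replaced by a single call to `_pdfio_utf16cpy`, whose body is not part of this section. The sketch below is a stand-alone illustration of the same kind of conversion, not that function. `utf16_to_utf8` is a hypothetical name; it assumes the byte order mark `FE FF` means big-endian and `FF FE` means little-endian, defaults to big-endian without a mark, stops at malformed surrogates, and uses the standard UTF-8 boundaries (0x80, 0x800, 0x10000) with a 0xF0 lead byte for four-byte sequences.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): convert UTF-16 bytes to a
// nul-terminated UTF-8 string, truncating if `dst` is too small.
static void
utf16_to_utf8(char *dst, const unsigned char *src, size_t srclen, size_t dstsize)
{
  char *dstptr = dst;                   // Pointer into destination
  char *dstend;                         // Leave room for 4 bytes plus nul
  bool le = false;                      // Little-endian input?
  int  ch, lch;                         // Code point and low surrogate

  assert(dstsize >= 5);
  dstend = dst + dstsize - 5;

  // Consume the byte order mark, if any
  if (srclen >= 2 && src[0] == 0xff && src[1] == 0xfe)
  {
    le = true;
    src += 2; srclen -= 2;
  }
  else if (srclen >= 2 && src[0] == 0xfe && src[1] == 0xff)
  {
    src += 2; srclen -= 2;
  }

  while (srclen > 1 && dstptr < dstend)
  {
    ch = le ? ((src[1] << 8) | src[0]) : ((src[0] << 8) | src[1]);
    src += 2; srclen -= 2;

    if (ch >= 0xd800 && ch <= 0xdbff)
    {
      // High surrogate: needs a following low surrogate
      if (srclen < 2)
        break;
      lch = le ? ((src[1] << 8) | src[0]) : ((src[0] << 8) | src[1]);
      if (lch < 0xdc00 || lch > 0xdfff)
        break;
      ch = (((ch & 0x3ff) << 10) | (lch & 0x3ff)) + 0x10000;
      src += 2; srclen -= 2;
    }
    else if (ch >= 0xdc00 && ch <= 0xdfff)
      break;                            // Unpaired low surrogate

    if (ch < 0x80)
      *dstptr++ = (char)ch;
    else if (ch < 0x800)
    {
      *dstptr++ = (char)(0xc0 | (ch >> 6));
      *dstptr++ = (char)(0x80 | (ch & 0x3f));
    }
    else if (ch < 0x10000)
    {
      *dstptr++ = (char)(0xe0 | (ch >> 12));
      *dstptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
      *dstptr++ = (char)(0x80 | (ch & 0x3f));
    }
    else
    {
      *dstptr++ = (char)(0xf0 | (ch >> 18));
      *dstptr++ = (char)(0x80 | ((ch >> 12) & 0x3f));
      *dstptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
      *dstptr++ = (char)(0x80 | (ch & 0x3f));
    }
  }

  *dstptr = '\0';
}
```

Handling both byte orders in one routine, selected by the mark, is what lets the caller collapse the two branches into the single combined test shown in the hunk.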
@@ -677,6 +568,8 @@ _pdfioDictGetValue(pdfio_dict_t *dict, // I - Dictionary
 // The iteration continues as long as the callback returns `true` or all keys
 // have been iterated.
 //
+// @since PDFio v1.1@
+//

 void
 pdfioDictIterateKeys(
@@ -737,11 +630,6 @@ _pdfioDictRead(pdfio_file_t *pdf, // I - PDF file
 _pdfioFileError(pdf, "Invalid dictionary contents.");
 break;
 }
-else if (_pdfioDictGetValue(dict, key + 1))
-{
-_pdfioFileError(pdf, "Duplicate dictionary key '%s'.", key + 1);
-return (NULL);
-}

 // Then get the next value...
 PDFIO_DEBUG("_pdfioDictRead: Reading value for '%s'.\n", key + 1);
@@ -751,8 +639,17 @@ _pdfioDictRead(pdfio_file_t *pdf, // I - PDF file
 _pdfioFileError(pdf, "Missing value for dictionary key '%s'.", key + 1);
 break;
 }
-
-if (!_pdfioDictSetValue(dict, pdfioStringCreate(pdf, key + 1), &value))
+else if (_pdfioDictGetValue(dict, key + 1))
+{
+// Issue 118: Discard duplicate key/value pairs, in the future this will
+// be a warning message...
+_pdfioValueDelete(&value);
+if (_pdfioFileError(pdf, "WARNING: Discarding value for duplicate dictionary key '%s'.", key + 1))
+continue;
+else
+break;
+}
+else if (!_pdfioDictSetValue(dict, pdfioStringCreate(pdf, key + 1), &value))
 break;

 PDFIO_DEBUG("_pdfioDictRead: Set %s.\n", key);
@@ -1168,7 +1065,7 @@ _pdfioDictWrite(pdfio_dict_t *dict, // I - Dictionary
 // Write all of the key/value pairs...
 for (i = dict->num_pairs, pair = dict->pairs; i > 0; i --, pair ++)
 {
-if (!_pdfioFilePrintf(pdf, "/%s", pair->key))
+if (!_pdfioFilePrintf(pdf, "%N", pair->key))
 return (false);

 if (length && !strcmp(pair->key, "Length") && pair->value.type == PDFIO_VALTYPE_NUMBER && pair->value.value.number <= 0.0)
781
pdfio-file.c
432
pdfio-md5.c
@@ -1,7 +1,7 @@
 //
 // MD5 functions for PDFio.
 //
-// Copyright © 2021 by Michael R Sweet.
+// Copyright © 2021-2025 by Michael R Sweet.
 // Copyright © 1999 Aladdin Enterprises. All rights reserved.
 //
 // This software is provided 'as-is', without any express or implied
@@ -108,231 +108,285 @@
 #define T63 0x2ad7d2bb
 #define T64 0xeb86d391


+//
+// Use the unoptimized (big-endian) implementation if we don't know the
+// endian-ness of the platform.
+//
+
+#ifdef __BYTE_ORDER__
+# if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+# define ARCH_IS_BIG_ENDIAN 0 // Use little endian optimized version
+# else
+# define ARCH_IS_BIG_ENDIAN 1 // Use generic version
+# endif // __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
+#elif !defined(ARCH_IS_BIG_ENDIAN)
+# define ARCH_IS_BIG_ENDIAN 1 // Use generic version
+#endif // !ARCH_IS_BIG_ENDIAN
+
+
+//
+// 'md5_process()' - Hash a block of data.
+//
+
 static void
-md5_process(_pdfio_md5_t *pms, const uint8_t *data /*[64]*/)
+md5_process(_pdfio_md5_t *pms, // I - MD5 state
+const uint8_t *data/*[64]*/)// I - Data
 {
-uint32_t
-a = pms->abcd[0], b = pms->abcd[1],
-c = pms->abcd[2], d = pms->abcd[3];
-uint32_t t;
+uint32_t a = pms->abcd[0], // First word of state
+b = pms->abcd[1], // Second word of state
+c = pms->abcd[2], // Third word of state
+d = pms->abcd[3]; // Fourth word of state
+uint32_t t; // Temporary state
+

-#ifndef ARCH_IS_BIG_ENDIAN
-# define ARCH_IS_BIG_ENDIAN 1 /* slower, default implementation */
-#endif
 #if ARCH_IS_BIG_ENDIAN
+// On big-endian machines, we must arrange the bytes in the right
+// order. (This also works on machines of unknown byte order.)
+uint32_t X[16]; // Little-endian representation
+const uint8_t *xp; // Pointer into data
+int i; // Looping var

-/*
-* On big-endian machines, we must arrange the bytes in the right
-* order. (This also works on machines of unknown byte order.)
-*/
-uint32_t X[16];
-const uint8_t *xp = data;
-int i;
-
-for (i = 0; i < 16; ++i, xp += 4)
-X[i] = xp[0] + (unsigned)(xp[1] << 8) + (unsigned)(xp[2] << 16) + (unsigned)(xp[3] << 24);
+for (i = 0, xp = data; i < 16; i ++, xp += 4)
+X[i] = xp[0] + (unsigned)(xp[1] << 8) + (unsigned)(xp[2] << 16) + (unsigned)(xp[3] << 24);

 #else /* !ARCH_IS_BIG_ENDIAN */
+// On little-endian machines, we can process properly aligned data without copying it.
+uint32_t xbuf[16]; // Aligned buffer
+const uint32_t *X; // Pointer to little-endian representation

-/*
-* On little-endian machines, we can process properly aligned data
-* without copying it.
-*/
-uint32_t xbuf[16];
-const uint32_t *X;
-
-if (!((data - (const uint8_t *)0) & 3)) {
-/* data are properly aligned */
-X = (const uint32_t *)data;
-} else {
-/* not aligned */
-memcpy(xbuf, data, 64);
-X = xbuf;
-}
-#endif
+if (!((data - (const uint8_t *)0) & 3))
+{
+// data is properly aligned, use it directly...
+X = (const uint32_t *)data;
+}
+else
+{
+// data is not aligned, copy to the aligned buffer...
+memcpy(xbuf, data, 64);
+X = xbuf;
+}
+#endif // ARCH_IS_BIG_ENDIAN

 #define ROTATE_LEFT(x, n) (((x) << (n)) | ((x) >> (32 - (n))))

-/* Round 1. */
-/* Let [abcd k s i] denote the operation
-a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s). */
-#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
-#define SET(a, b, c, d, k, s, Ti)\
-t = a + F(b,c,d) + X[k] + Ti;\
-a = ROTATE_LEFT(t, s) + b
-/* Do the following 16 operations. */
-SET(a, b, c, d, 0, 7, T1);
-SET(d, a, b, c, 1, 12, T2);
-SET(c, d, a, b, 2, 17, T3);
-SET(b, c, d, a, 3, 22, T4);
-SET(a, b, c, d, 4, 7, T5);
-SET(d, a, b, c, 5, 12, T6);
-SET(c, d, a, b, 6, 17, T7);
-SET(b, c, d, a, 7, 22, T8);
-SET(a, b, c, d, 8, 7, T9);
-SET(d, a, b, c, 9, 12, T10);
-SET(c, d, a, b, 10, 17, T11);
-SET(b, c, d, a, 11, 22, T12);
-SET(a, b, c, d, 12, 7, T13);
-SET(d, a, b, c, 13, 12, T14);
-SET(c, d, a, b, 14, 17, T15);
-SET(b, c, d, a, 15, 22, T16);
+// Round 1.
+// Let [abcd k s i] denote the operation
+// a = b + ((a + F(b,c,d) + X[k] + T[i]) <<< s).
+#define F(x, y, z) (((x) & (y)) | (~(x) & (z)))
+#define SET(a, b, c, d, k, s, Ti) t = a + F(b,c,d) + X[k] + Ti; a = ROTATE_LEFT(t, s) + b
+
+// Do the following 16 operations.
+SET(a, b, c, d, 0, 7, T1);
+SET(d, a, b, c, 1, 12, T2);
+SET(c, d, a, b, 2, 17, T3);
+SET(b, c, d, a, 3, 22, T4);
+SET(a, b, c, d, 4, 7, T5);
+SET(d, a, b, c, 5, 12, T6);
+SET(c, d, a, b, 6, 17, T7);
+SET(b, c, d, a, 7, 22, T8);
+SET(a, b, c, d, 8, 7, T9);
+SET(d, a, b, c, 9, 12, T10);
+SET(c, d, a, b, 10, 17, T11);
+SET(b, c, d, a, 11, 22, T12);
+SET(a, b, c, d, 12, 7, T13);
+SET(d, a, b, c, 13, 12, T14);
+SET(c, d, a, b, 14, 17, T15);
+SET(b, c, d, a, 15, 22, T16);
+
 #undef SET

-/* Round 2. */
-/* Let [abcd k s i] denote the operation
-a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s). */
-#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
-#define SET(a, b, c, d, k, s, Ti)\
-t = a + G(b,c,d) + X[k] + Ti;\
-a = ROTATE_LEFT(t, s) + b
-/* Do the following 16 operations. */
-SET(a, b, c, d, 1, 5, T17);
-SET(d, a, b, c, 6, 9, T18);
-SET(c, d, a, b, 11, 14, T19);
-SET(b, c, d, a, 0, 20, T20);
-SET(a, b, c, d, 5, 5, T21);
-SET(d, a, b, c, 10, 9, T22);
-SET(c, d, a, b, 15, 14, T23);
-SET(b, c, d, a, 4, 20, T24);
-SET(a, b, c, d, 9, 5, T25);
-SET(d, a, b, c, 14, 9, T26);
-SET(c, d, a, b, 3, 14, T27);
-SET(b, c, d, a, 8, 20, T28);
-SET(a, b, c, d, 13, 5, T29);
-SET(d, a, b, c, 2, 9, T30);
-SET(c, d, a, b, 7, 14, T31);
-SET(b, c, d, a, 12, 20, T32);
+// Round 2.
+// Let [abcd k s i] denote the operation
+// a = b + ((a + G(b,c,d) + X[k] + T[i]) <<< s).
+#define G(x, y, z) (((x) & (z)) | ((y) & ~(z)))
+#define SET(a, b, c, d, k, s, Ti) t = a + G(b,c,d) + X[k] + Ti; a = ROTATE_LEFT(t, s) + b
+
+// Do the following 16 operations.
+SET(a, b, c, d, 1, 5, T17);
+SET(d, a, b, c, 6, 9, T18);
+SET(c, d, a, b, 11, 14, T19);
+SET(b, c, d, a, 0, 20, T20);
+SET(a, b, c, d, 5, 5, T21);
+SET(d, a, b, c, 10, 9, T22);
+SET(c, d, a, b, 15, 14, T23);
+SET(b, c, d, a, 4, 20, T24);
+SET(a, b, c, d, 9, 5, T25);
+SET(d, a, b, c, 14, 9, T26);
+SET(c, d, a, b, 3, 14, T27);
+SET(b, c, d, a, 8, 20, T28);
+SET(a, b, c, d, 13, 5, T29);
+SET(d, a, b, c, 2, 9, T30);
+SET(c, d, a, b, 7, 14, T31);
+SET(b, c, d, a, 12, 20, T32);
+
 #undef SET

||||
/* Round 3. */
|
||||
/* Let [abcd k s t] denote the operation
|
||||
a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s). */
|
||||
#define H(x, y, z) ((x) ^ (y) ^ (z))
|
||||
#define SET(a, b, c, d, k, s, Ti)\
|
||||
t = a + H(b,c,d) + X[k] + Ti;\
|
||||
a = ROTATE_LEFT(t, s) + b
|
||||
/* Do the following 16 operations. */
|
||||
SET(a, b, c, d, 5, 4, T33);
|
||||
SET(d, a, b, c, 8, 11, T34);
|
||||
SET(c, d, a, b, 11, 16, T35);
|
||||
SET(b, c, d, a, 14, 23, T36);
|
||||
SET(a, b, c, d, 1, 4, T37);
|
||||
SET(d, a, b, c, 4, 11, T38);
|
||||
SET(c, d, a, b, 7, 16, T39);
|
||||
SET(b, c, d, a, 10, 23, T40);
|
||||
SET(a, b, c, d, 13, 4, T41);
|
||||
SET(d, a, b, c, 0, 11, T42);
|
||||
SET(c, d, a, b, 3, 16, T43);
|
||||
SET(b, c, d, a, 6, 23, T44);
|
||||
SET(a, b, c, d, 9, 4, T45);
|
||||
SET(d, a, b, c, 12, 11, T46);
|
||||
SET(c, d, a, b, 15, 16, T47);
|
||||
SET(b, c, d, a, 2, 23, T48);
|
||||
// Round 3.
|
||||
// Let [abcd k s i] denote the operation
|
||||
// a = b + ((a + H(b,c,d) + X[k] + T[i]) <<< s).
|
||||
#define H(x, y, z) ((x) ^ (y) ^ (z))
|
||||
#define SET(a, b, c, d, k, s, Ti) t = a + H(b,c,d) + X[k] + Ti; a = ROTATE_LEFT(t, s) + b
|
||||
|
||||
// Do the following 16 operations.
|
||||
SET(a, b, c, d, 5, 4, T33);
|
||||
SET(d, a, b, c, 8, 11, T34);
|
||||
SET(c, d, a, b, 11, 16, T35);
|
||||
SET(b, c, d, a, 14, 23, T36);
|
||||
SET(a, b, c, d, 1, 4, T37);
|
||||
SET(d, a, b, c, 4, 11, T38);
|
||||
SET(c, d, a, b, 7, 16, T39);
|
||||
SET(b, c, d, a, 10, 23, T40);
|
||||
SET(a, b, c, d, 13, 4, T41);
|
||||
SET(d, a, b, c, 0, 11, T42);
|
||||
SET(c, d, a, b, 3, 16, T43);
|
||||
SET(b, c, d, a, 6, 23, T44);
|
||||
SET(a, b, c, d, 9, 4, T45);
|
||||
SET(d, a, b, c, 12, 11, T46);
|
||||
SET(c, d, a, b, 15, 16, T47);
|
||||
SET(b, c, d, a, 2, 23, T48);
|
||||
|
||||
#undef SET
|
||||
|
||||
/* Round 4. */
|
||||
/* Let [abcd k s t] denote the operation
|
||||
a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s). */
|
||||
#define I(x, y, z) ((y) ^ ((x) | ~(z)))
|
||||
#define SET(a, b, c, d, k, s, Ti)\
|
||||
t = a + I(b,c,d) + X[k] + Ti;\
|
||||
a = ROTATE_LEFT(t, s) + b
|
||||
/* Do the following 16 operations. */
|
||||
SET(a, b, c, d, 0, 6, T49);
|
||||
SET(d, a, b, c, 7, 10, T50);
|
||||
SET(c, d, a, b, 14, 15, T51);
|
||||
SET(b, c, d, a, 5, 21, T52);
|
||||
SET(a, b, c, d, 12, 6, T53);
|
||||
SET(d, a, b, c, 3, 10, T54);
|
||||
SET(c, d, a, b, 10, 15, T55);
|
||||
SET(b, c, d, a, 1, 21, T56);
|
||||
SET(a, b, c, d, 8, 6, T57);
|
||||
SET(d, a, b, c, 15, 10, T58);
|
||||
SET(c, d, a, b, 6, 15, T59);
|
||||
SET(b, c, d, a, 13, 21, T60);
|
||||
SET(a, b, c, d, 4, 6, T61);
|
||||
SET(d, a, b, c, 11, 10, T62);
|
||||
SET(c, d, a, b, 2, 15, T63);
|
||||
SET(b, c, d, a, 9, 21, T64);
|
||||
// Round 4.
|
||||
// Let [abcd k s i] denote the operation
|
||||
// a = b + ((a + I(b,c,d) + X[k] + T[i]) <<< s).
|
||||
#define I(x, y, z) ((y) ^ ((x) | ~(z)))
|
||||
#define SET(a, b, c, d, k, s, Ti) t = a + I(b,c,d) + X[k] + Ti; a = ROTATE_LEFT(t, s) + b
|
||||
|
||||
// Do the following 16 operations.
|
||||
SET(a, b, c, d, 0, 6, T49);
|
||||
SET(d, a, b, c, 7, 10, T50);
|
||||
SET(c, d, a, b, 14, 15, T51);
|
||||
SET(b, c, d, a, 5, 21, T52);
|
||||
SET(a, b, c, d, 12, 6, T53);
|
||||
SET(d, a, b, c, 3, 10, T54);
|
||||
SET(c, d, a, b, 10, 15, T55);
|
||||
SET(b, c, d, a, 1, 21, T56);
|
||||
SET(a, b, c, d, 8, 6, T57);
|
||||
SET(d, a, b, c, 15, 10, T58);
|
||||
SET(c, d, a, b, 6, 15, T59);
|
||||
SET(b, c, d, a, 13, 21, T60);
|
||||
SET(a, b, c, d, 4, 6, T61);
|
||||
SET(d, a, b, c, 11, 10, T62);
|
||||
SET(c, d, a, b, 2, 15, T63);
|
||||
SET(b, c, d, a, 9, 21, T64);
|
||||
|
||||
#undef SET
|
||||
|
||||
/* Then perform the following additions. (That is increment each
|
||||
of the four registers by the value it had before this block
|
||||
was started.) */
|
||||
pms->abcd[0] += a;
|
||||
pms->abcd[1] += b;
|
||||
pms->abcd[2] += c;
|
||||
pms->abcd[3] += d;
|
||||
// Then perform the following additions. (That is increment each of the four
|
||||
// registers by the value it had before this block was started.)
|
||||
pms->abcd[0] += a;
|
||||
pms->abcd[1] += b;
|
||||
pms->abcd[2] += c;
|
||||
pms->abcd[3] += d;
|
||||
}
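Review note: the round macros above all follow the pattern `a = b + ((a + f(b,c,d) + X[k] + T[i]) <<< s)`. The following is a minimal standalone sketch of that pattern, not the PDFio code itself. The names `rotate_left`, `md5_g`, `md5_h`, `md5_i`, and `md5_step_g` are hypothetical helpers introduced only for illustration; the bodies mirror the `G`, `H`, `I`, and `SET` macros shown in the diff.

```c
#include <assert.h>
#include <stdint.h>

// Rotate a 32-bit value left by s bits (assumes 0 < s < 32).
static uint32_t rotate_left(uint32_t x, unsigned s)
{
  assert(s > 0 && s < 32);
  return ((x << s) | (x >> (32 - s)));
}

// Auxiliary functions for rounds 2-4, mirroring the G, H, and I macros.
static uint32_t md5_g(uint32_t x, uint32_t y, uint32_t z)
{
  return ((x & z) | (y & ~z));
}

static uint32_t md5_h(uint32_t x, uint32_t y, uint32_t z)
{
  return (x ^ y ^ z);
}

static uint32_t md5_i(uint32_t x, uint32_t y, uint32_t z)
{
  return (y ^ (x | ~z));
}

// One round 2 step: returns b + ((a + G(b,c,d) + xk + ti) <<< s).
// xk stands for X[k] and ti for T[i] in the comments above.
static uint32_t md5_step_g(uint32_t a, uint32_t b, uint32_t c, uint32_t d,
                           uint32_t xk, unsigned s, uint32_t ti)
{
  uint32_t t = a + md5_g(b, c, d) + xk + ti;

  return (rotate_left(t, s) + b);
}
```

Writing the step as a function rather than a macro avoids the hidden temporary `t`, at the cost of passing the register back through the return value.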
|
||||
|
||||
|
||||
//
|
||||
// '_pdfioCryptoMD5Init()' - Initialize an MD5 hash.
|
||||
//
|
||||
|
||||
void
|
||||
_pdfioCryptoMD5Init(_pdfio_md5_t *pms)
|
||||
_pdfioCryptoMD5Init(_pdfio_md5_t *pms) // I - MD5 state
|
||||
{
|
||||
pms->count[0] = pms->count[1] = 0;
|
||||
pms->abcd[0] = 0x67452301;
|
||||
pms->abcd[1] = 0xefcdab89;
|
||||
pms->abcd[2] = 0x98badcfe;
|
||||
pms->abcd[3] = 0x10325476;
|
||||
pms->abcd[0] = 0x67452301;
|
||||
pms->abcd[1] = 0xefcdab89;
|
||||
pms->abcd[2] = 0x98badcfe;
|
||||
pms->abcd[3] = 0x10325476;
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// '_pdfioCryptoMD5Append()' - Append bytes to the MD5 hash.
|
||||
//
|
||||
|
||||
void
|
||||
_pdfioCryptoMD5Append(_pdfio_md5_t *pms, const uint8_t *data, size_t nbytes)
|
||||
_pdfioCryptoMD5Append(
|
||||
_pdfio_md5_t *pms, // I - MD5 state
|
||||
const uint8_t *data, // I - Data to add
|
||||
size_t nbytes) // I - Number of bytes
|
||||
{
|
||||
const uint8_t *p = data;
|
||||
size_t left = nbytes;
|
||||
size_t offset = (pms->count[0] >> 3) & 63;
|
||||
uint32_t nbits = (uint32_t)(nbytes << 3);
|
||||
const uint8_t *p = data; // Pointer into data
|
||||
size_t left = nbytes; // Remaining bytes
|
||||
size_t offset = (pms->count[0] >> 3) & 63;
|
||||
// Offset into state
|
||||
uint32_t nbits = (uint32_t)(nbytes << 3);
|
||||
// Number of bits to add
|
||||
|
||||
if (nbytes == 0)
|
||||
return;
|
||||
|
||||
/* Update the message length. */
|
||||
pms->count[1] += (unsigned)(nbytes >> 29);
|
||||
pms->count[0] += nbits;
|
||||
if (pms->count[0] < nbits)
|
||||
pms->count[1]++;
|
||||
if (nbytes == 0)
|
||||
return;
|
||||
|
||||
/* Process an initial partial block. */
|
||||
if (offset) {
|
||||
size_t copy = (offset + nbytes > 64 ? 64 - offset : nbytes);
|
||||
// Update the message length.
|
||||
pms->count[1] += (unsigned)(nbytes >> 29);
|
||||
pms->count[0] += nbits;
|
||||
if (pms->count[0] < nbits)
|
||||
pms->count[1] ++;
|
||||
|
||||
memcpy(pms->buf + offset, p, copy);
|
||||
if (offset + copy < 64)
|
||||
return;
|
||||
p += copy;
|
||||
left -= copy;
|
||||
md5_process(pms, pms->buf);
|
||||
}
|
||||
// Process an initial partial block.
|
||||
if (offset)
|
||||
{
|
||||
size_t copy = ((offset + nbytes) > 64 ? 64 - offset : nbytes);
|
||||
// Number of bytes to copy
|
||||
|
||||
/* Process full blocks. */
|
||||
for (; left >= 64; p += 64, left -= 64)
|
||||
md5_process(pms, p);
|
||||
memcpy(pms->buf + offset, p, copy);
|
||||
|
||||
/* Process a final partial block. */
|
||||
if (left)
|
||||
memcpy(pms->buf, p, left);
|
||||
if ((offset + copy) < 64)
|
||||
return;
|
||||
|
||||
p += copy;
|
||||
left -= copy;
|
||||
|
||||
md5_process(pms, pms->buf);
|
||||
}
|
||||
|
||||
// Process full blocks.
|
||||
for (; left >= 64; p += 64, left -= 64)
|
||||
md5_process(pms, p);
|
||||
|
||||
// Copy a final partial block.
|
||||
if (left)
|
||||
memcpy(pms->buf, p, left);
|
||||
}
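Review note: the message length update in `_pdfioCryptoMD5Append()` keeps a 64-bit bit count in two 32-bit words and carries by hand. Below is a small sketch of just that bookkeeping, assuming a hypothetical `bitcount_t` type and helper names that are not part of PDFio; the arithmetic follows the lines in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Hypothetical stand-in for the "count" member of the MD5 state:
// count[0] holds the low 32 bits of the bit count, count[1] the high 32 bits.
typedef struct
{
  uint32_t count[2];
} bitcount_t;

// Add nbytes bytes to the running bit count, carrying into the high word.
static void bitcount_add(bitcount_t *bc, size_t nbytes)
{
  uint32_t nbits = (uint32_t)(nbytes << 3);  // Low 32 bits of the bit count

  assert(bc != NULL);

  bc->count[1] += (uint32_t)(nbytes >> 29);  // Bits that overflow 32 bits
  bc->count[0] += nbits;
  if (bc->count[0] < nbits)                  // Unsigned wraparound means carry
    bc->count[1] ++;
}

// Byte offset into the current 64-byte block.
static size_t bitcount_offset(const bitcount_t *bc)
{
  return ((bc->count[0] >> 3) & 63);
}
```

The `count[0] < nbits` comparison works because unsigned addition wraps: the sum is smaller than an operand only when a carry out of bit 31 occurred.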
|
||||
|
||||
|
||||
//
|
||||
// '_pdfioCryptoMD5Finish()' - Finalize the MD5 hash.
|
||||
//
|
||||
|
||||
void
|
||||
_pdfioCryptoMD5Finish(_pdfio_md5_t *pms, uint8_t digest[16])
|
||||
_pdfioCryptoMD5Finish(
|
||||
_pdfio_md5_t *pms, // I - MD5 state
|
||||
uint8_t digest[16]) // O - Digest value
|
||||
{
|
||||
static const uint8_t pad[64] = {
|
||||
0x80, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
|
||||
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
|
||||
};
|
||||
uint8_t data[8];
|
||||
int i;
|
||||
int i; // Looping var
|
||||
uint8_t data[8]; // Digest length data
|
||||
static const uint8_t pad[64] = // Padding bytes
|
||||
{
|
||||
0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,
|
||||
0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
|
||||
};
|
||||
|
||||
/* Save the length before padding. */
|
||||
for (i = 0; i < 8; ++i)
|
||||
data[i] = (uint8_t)(pms->count[i >> 2] >> ((i & 3) << 3));
|
||||
/* Pad to 56 bytes mod 64. */
|
||||
_pdfioCryptoMD5Append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1);
|
||||
/* Append the length. */
|
||||
_pdfioCryptoMD5Append(pms, data, 8);
|
||||
for (i = 0; i < 16; ++i)
|
||||
digest[i] = (uint8_t)(pms->abcd[i >> 2] >> ((i & 3) << 3));
|
||||
|
||||
// Save the length before padding.
|
||||
for (i = 0; i < 8; ++i)
|
||||
data[i] = (uint8_t)(pms->count[i >> 2] >> ((i & 3) << 3));
|
||||
|
||||
// Pad to 56 bytes mod 64.
|
||||
_pdfioCryptoMD5Append(pms, pad, ((55 - (pms->count[0] >> 3)) & 63) + 1);
|
||||
|
||||
// Append the length.
|
||||
_pdfioCryptoMD5Append(pms, data, 8);
|
||||
|
||||
// Copy the digest from the state...
|
||||
for (i = 0; i < 16; ++i)
|
||||
digest[i] = (uint8_t)(pms->abcd[i >> 2] >> ((i & 3) << 3));
|
||||
}
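Review note: `_pdfioCryptoMD5Finish()` relies on two compact expressions, the pad length `((55 - (count[0] >> 3)) & 63) + 1` and the little-endian byte extraction `words[i >> 2] >> ((i & 3) << 3)`. The sketch below isolates both so they can be checked by hand. `md5_pad_len` and `md5_encode_le` are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Number of pad bytes (1 to 64) needed so the message length becomes
// 56 bytes mod 64, given the low word of the bit count.
static size_t md5_pad_len(uint32_t bitcount_lo)
{
  size_t len = ((55 - (bitcount_lo >> 3)) & 63) + 1;

  assert(len >= 1 && len <= 64);
  return (len);
}

// Copy nbytes bytes out of an array of 32-bit words, least significant
// byte of each word first.
static void md5_encode_le(const uint32_t *words, size_t nbytes, uint8_t *out)
{
  size_t i;                             // Looping var

  for (i = 0; i < nbytes; i ++)
    out[i] = (uint8_t)(words[i >> 2] >> ((i & 3) << 3));
}
```

For example, a 55-byte message needs a single pad byte, while a 56-byte message needs a full 64 bytes because the 8-byte length field no longer fits in the current block.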
|
||||
|
@ -1,7 +1,7 @@
|
||||
//
|
||||
// PDF object functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2024 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -10,13 +10,6 @@
|
||||
#include "pdfio-private.h"
|
||||
|
||||
|
||||
//
|
||||
// Local functions...
|
||||
//
|
||||
|
||||
static bool write_obj_header(pdfio_obj_t *obj);
|
||||
|
||||
|
||||
//
|
||||
// 'pdfioObjClose()' - Close an object, writing any data as needed to the PDF
|
||||
// file.
|
||||
@ -42,7 +35,7 @@ pdfioObjClose(pdfio_obj_t *obj) // I - Object
|
||||
if (!obj->offset)
|
||||
{
|
||||
// Write the object value
|
||||
if (!write_obj_header(obj))
|
||||
if (!_pdfioObjWriteHeader(obj))
|
||||
return (false);
|
||||
|
||||
// Write the "endobj" line...
|
||||
@ -86,6 +79,10 @@ pdfioObjCopy(pdfio_file_t *pdf, // I - PDF file
|
||||
if (srcobj->value.type == PDFIO_VALTYPE_NONE)
|
||||
_pdfioObjLoad(srcobj);
|
||||
|
||||
// See if we have already mapped this object...
|
||||
if ((dstobj = _pdfioFileFindMappedObj(pdf, srcobj->pdf, srcobj->number)) != NULL)
|
||||
return (dstobj); // Yes, return that one...
|
||||
|
||||
// Create the new object...
|
||||
if ((dstobj = _pdfioFileCreateObj(pdf, srcobj->pdf, NULL)) == NULL)
|
||||
return (NULL);
|
||||
@ -148,6 +145,7 @@ pdfioObjCreateStream(
|
||||
pdfio_obj_t *obj, // I - Object
|
||||
pdfio_filter_t filter) // I - Type of compression to apply
|
||||
{
|
||||
pdfio_stream_t *st; // Stream
|
||||
pdfio_obj_t *length_obj = NULL; // Length object, if any
|
||||
|
||||
|
||||
@ -195,17 +193,19 @@ pdfioObjCreateStream(
|
||||
}
|
||||
}
|
||||
|
||||
if (!write_obj_header(obj))
|
||||
if (!_pdfioObjWriteHeader(obj))
|
||||
return (NULL);
|
||||
|
||||
if (!_pdfioFilePuts(obj->pdf, "stream\n"))
|
||||
return (NULL);
|
||||
|
||||
obj->stream_offset = _pdfioFileTell(obj->pdf);
|
||||
obj->pdf->current_obj = obj;
|
||||
obj->stream_offset = _pdfioFileTell(obj->pdf);
|
||||
|
||||
// Return the new stream...
|
||||
return (_pdfioStreamCreate(obj, length_obj, filter));
|
||||
if ((st = _pdfioStreamCreate(obj, length_obj, 0, filter)) != NULL)
|
||||
obj->pdf->current_obj = obj;
|
||||
|
||||
return (st);
|
||||
}
|
||||
|
||||
|
||||
@ -314,7 +314,8 @@ pdfioObjGetLength(pdfio_obj_t *obj) // I - Object
|
||||
|
||||
if ((lenobj = pdfioDictGetObj(obj->value.value.dict, "Length")) == NULL)
|
||||
{
|
||||
_pdfioFileError(obj->pdf, "Unable to get length of stream.");
|
||||
if (!_pdfioDictGetValue(obj->value.value.dict, "Length"))
|
||||
_pdfioFileError(obj->pdf, "Unable to get length of stream.");
|
||||
return (0);
|
||||
}
|
||||
|
||||
@ -336,6 +337,8 @@ pdfioObjGetLength(pdfio_obj_t *obj) // I - Object
|
||||
//
|
||||
// 'pdfioObjGetName()' - Get the name value associated with an object.
|
||||
//
|
||||
// @since PDFio v1.4@
|
||||
//
|
||||
|
||||
const char * // O - Dictionary or `NULL` on error
|
||||
pdfioObjGetName(pdfio_obj_t *obj) // I - Object
|
||||
@ -511,7 +514,7 @@ _pdfioObjLoad(pdfio_obj_t *obj) // I - Object
|
||||
}
|
||||
|
||||
// Decrypt as needed...
|
||||
if (obj->pdf->encryption)
|
||||
if (obj->pdf->encryption && obj->pdf->encrypt_metadata)
|
||||
{
|
||||
PDFIO_DEBUG("_pdfioObjLoad: Decrypting value...\n");
|
||||
|
||||
@ -538,6 +541,9 @@ pdfio_stream_t * // O - Stream or `NULL` on error
|
||||
pdfioObjOpenStream(pdfio_obj_t *obj, // I - Object
|
||||
bool decode) // I - Decode/decompress data?
|
||||
{
|
||||
pdfio_stream_t *st; // Stream
|
||||
|
||||
|
||||
// Range check input...
|
||||
if (!obj)
|
||||
return (NULL);
|
||||
@ -560,9 +566,10 @@ pdfioObjOpenStream(pdfio_obj_t *obj, // I - Object
|
||||
return (NULL);
|
||||
|
||||
// Open the stream...
|
||||
obj->pdf->current_obj = obj;
|
||||
if ((st = _pdfioStreamOpen(obj, decode)) != NULL)
|
||||
obj->pdf->current_obj = obj;
|
||||
|
||||
return (_pdfioStreamOpen(obj, decode));
|
||||
return (st);
|
||||
}
|
||||
|
||||
|
||||
@ -582,11 +589,11 @@ _pdfioObjSetExtension(
|
||||
|
||||
|
||||
//
|
||||
// 'write_obj_header()' - Write the object header...
|
||||
// '_pdfioObjWriteHeader()' - Write the object header...
|
||||
//
|
||||
|
||||
static bool // O - `true` on success, `false` on failure
|
||||
write_obj_header(pdfio_obj_t *obj) // I - Object
|
||||
bool // O - `true` on success, `false` on failure
|
||||
_pdfioObjWriteHeader(pdfio_obj_t *obj) // I - Object
|
||||
{
|
||||
obj->offset = _pdfioFileTell(obj->pdf);
|
||||
|
||||
|
@ -1,7 +1,7 @@
|
||||
//
|
||||
// Private header file for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2024 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -10,7 +10,7 @@
|
||||
#ifndef PDFIO_PRIVATE_H
|
||||
# define PDFIO_PRIVATE_H
|
||||
# ifdef _WIN32
|
||||
# define _CRT_SECURE_NO_WARNINGS // Disable bogus VS warnings/errors...
|
||||
# define _CRT_SECURE_NO_WARNINGS 1 // Disable bogus VS warnings/errors...
|
||||
# endif // _WIN32
|
||||
# include "pdfio.h"
|
||||
# include <stdarg.h>
|
||||
@ -28,16 +28,16 @@
|
||||
# define access _access // Map standard POSIX/C99 names
|
||||
# define close _close
|
||||
# define fileno _fileno
|
||||
# define lseek _lseek
|
||||
# define lseek(f,o,w) (off_t)_lseek((f),(long)(o),(w))
|
||||
# define mkdir(d,p) _mkdir(d)
|
||||
# define open _open
|
||||
# define read _read
|
||||
# define read(f,b,s) _read((f),(b),(unsigned)(s))
|
||||
# define rmdir _rmdir
|
||||
# define snprintf _snprintf
|
||||
# define strdup _strdup
|
||||
# define unlink _unlink
|
||||
# define vsnprintf _vsnprintf
|
||||
# define write _write
|
||||
# define write(f,b,s) _write((f),(b),(unsigned)(s))
|
||||
# ifndef F_OK
|
||||
# define F_OK 00 // POSIX parameters/flags
|
||||
# define W_OK 02
|
||||
@ -94,6 +94,7 @@
|
||||
//
|
||||
|
||||
# define PDFIO_MAX_DEPTH 32 // Maximum nesting depth for values
|
||||
# define PDFIO_MAX_STRING 65536 // Maximum length of string
|
||||
|
||||
typedef void (*_pdfio_extfree_t)(void *);
|
||||
// Extension data free function
|
||||
@ -107,7 +108,7 @@ typedef enum _pdfio_mode_e // Read/write mode
|
||||
typedef enum _pdfio_predictor_e // PNG predictor constants
|
||||
{
|
||||
_PDFIO_PREDICTOR_NONE = 1, // No predictor (default)
|
||||
_PDFIO_PREDICTOR_TIFF2 = 2, // TIFF2 predictor (???)
|
||||
_PDFIO_PREDICTOR_TIFF2 = 2, // TIFF predictor 2 (difference from left neighbor)
|
||||
_PDFIO_PREDICTOR_PNG_NONE = 10, // PNG None predictor (same as `_PDFIO_PREDICTOR_NONE`)
|
||||
_PDFIO_PREDICTOR_PNG_SUB = 11, // PNG Sub predictor
|
||||
_PDFIO_PREDICTOR_PNG_UP = 12, // PNG Up predictor
|
||||
@ -220,13 +221,22 @@ struct _pdfio_dict_s // Dictionary
|
||||
typedef struct _pdfio_objmap_s // PDF object map
|
||||
{
|
||||
pdfio_obj_t *obj; // Object for this file
|
||||
pdfio_file_t *src_pdf; // Source PDF file
|
||||
unsigned char src_id[32]; // Source PDF file identifier
|
||||
size_t src_number; // Source object number
|
||||
} _pdfio_objmap_t;
|
||||
|
||||
typedef struct _pdfio_strbuf_s // PDF string buffer
|
||||
{
|
||||
struct _pdfio_strbuf_s *next; // Next string buffer
|
||||
bool bufused; // Is this string buffer being used?
|
||||
char buffer[PDFIO_MAX_STRING + 32];
|
||||
// String buffer
|
||||
} _pdfio_strbuf_t;
|
||||
|
||||
struct _pdfio_file_s // PDF file structure
|
||||
{
|
||||
char *filename; // Filename
|
||||
unsigned char file_id[32]; // File identifier bytes
|
||||
struct lconv *loc; // Locale data
|
||||
char *version; // Version number
|
||||
pdfio_rect_t media_box, // Default MediaBox value
|
||||
@ -261,6 +271,7 @@ struct _pdfio_file_s // PDF file structure
|
||||
pdfio_obj_t *cp1252_obj, // CP1252 font encoding object
|
||||
*unicode_obj; // Unicode font encoding object
|
||||
pdfio_array_t *id_array; // ID array
|
||||
bool encrypt_metadata; // Encrypt metadata?
|
||||
|
||||
// Allocated data elements
|
||||
size_t num_arrays, // Number of arrays
|
||||
@ -283,6 +294,7 @@ struct _pdfio_file_s // PDF file structure
|
||||
size_t num_strings, // Number of strings
|
||||
alloc_strings; // Allocated strings
|
||||
char **strings; // Nul-terminated strings
|
||||
_pdfio_strbuf_t *strbuffers; // String buffers
|
||||
};
|
||||
|
||||
struct _pdfio_obj_s // Object
|
||||
@ -313,8 +325,9 @@ struct _pdfio_stream_s // Stream
|
||||
z_stream flate; // Flate filter state
|
||||
_pdfio_predictor_t predictor; // Predictor function, if any
|
||||
size_t pbpixel, // Size of a pixel in bytes
|
||||
pbsize; // Predictor buffer size, if any
|
||||
unsigned char cbuffer[4096], // Compressed data buffer
|
||||
pbsize, // Predictor buffer size, if any
|
||||
cbsize; // Compressed data buffer size
|
||||
unsigned char *cbuffer, // Compressed data buffer
|
||||
*prbuffer, // Raw buffer (previous line), as needed
|
||||
*psbuffer; // PNG filter buffer, as needed
|
||||
_pdfio_crypto_cb_t crypto_cb; // Encryption/decryption callback, if any
|
||||
@ -326,7 +339,9 @@ struct _pdfio_stream_s // Stream
|
||||
// Functions...
|
||||
//
|
||||
|
||||
extern size_t _pdfio_strlcpy(char *dst, const char *src, size_t dstsize) _PDFIO_INTERNAL;
|
||||
extern double _pdfio_strtod(pdfio_file_t *pdf, const char *s) _PDFIO_INTERNAL;
|
||||
extern void _pdfio_utf16cpy(char *dst, const unsigned char *src, size_t srclen, size_t dstsize) _PDFIO_INTERNAL;
|
||||
extern ssize_t _pdfio_vsnprintf(pdfio_file_t *pdf, char *buffer, size_t bufsize, const char *format, va_list ap) _PDFIO_INTERNAL;
|
||||
|
||||
extern bool _pdfioArrayDecrypt(pdfio_file_t *pdf, pdfio_obj_t *obj, pdfio_array_t *a, size_t depth) _PDFIO_INTERNAL;
|
||||
@ -366,13 +381,13 @@ extern bool _pdfioFileAddPage(pdfio_file_t *pdf, pdfio_obj_t *obj) _PDFIO_INTER
|
||||
extern bool _pdfioFileConsume(pdfio_file_t *pdf, size_t bytes) _PDFIO_INTERNAL;
|
||||
extern pdfio_obj_t *_pdfioFileCreateObj(pdfio_file_t *pdf, pdfio_file_t *srcpdf, _pdfio_value_t *value) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileDefaultError(pdfio_file_t *pdf, const char *message, void *data) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileError(pdfio_file_t *pdf, const char *format, ...) _PDFIO_FORMAT(2,3) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileError(pdfio_file_t *pdf, const char *format, ...) _PDFIO_INTERNAL;
|
||||
extern pdfio_obj_t *_pdfioFileFindMappedObj(pdfio_file_t *pdf, pdfio_file_t *src_pdf, size_t src_number) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileFlush(pdfio_file_t *pdf) _PDFIO_INTERNAL;
|
||||
extern int _pdfioFileGetChar(pdfio_file_t *pdf) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileGets(pdfio_file_t *pdf, char *buffer, size_t bufsize) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFileGets(pdfio_file_t *pdf, char *buffer, size_t bufsize, bool discard) _PDFIO_INTERNAL;
|
||||
extern ssize_t _pdfioFilePeek(pdfio_file_t *pdf, void *buffer, size_t bytes) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFilePrintf(pdfio_file_t *pdf, const char *format, ...) _PDFIO_FORMAT(2,3) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFilePrintf(pdfio_file_t *pdf, const char *format, ...) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioFilePuts(pdfio_file_t *pdf, const char *s) _PDFIO_INTERNAL;
|
||||
extern ssize_t _pdfioFileRead(pdfio_file_t *pdf, void *buffer, size_t bytes) _PDFIO_INTERNAL;
|
||||
extern off_t _pdfioFileSeek(pdfio_file_t *pdf, off_t offset, int whence) _PDFIO_INTERNAL;
|
||||
@ -383,10 +398,13 @@ extern void _pdfioObjDelete(pdfio_obj_t *obj) _PDFIO_INTERNAL;
|
||||
extern void *_pdfioObjGetExtension(pdfio_obj_t *obj) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioObjLoad(pdfio_obj_t *obj) _PDFIO_INTERNAL;
|
||||
extern void _pdfioObjSetExtension(pdfio_obj_t *obj, void *data, _pdfio_extfree_t datafree) _PDFIO_INTERNAL;
|
||||
extern bool _pdfioObjWriteHeader(pdfio_obj_t *obj) _PDFIO_INTERNAL;
|
||||
|
||||
extern pdfio_stream_t *_pdfioStreamCreate(pdfio_obj_t *obj, pdfio_obj_t *length_obj, pdfio_filter_t compression) _PDFIO_INTERNAL;
|
||||
extern pdfio_stream_t *_pdfioStreamCreate(pdfio_obj_t *obj, pdfio_obj_t *length_obj, size_t cbsize, pdfio_filter_t compression) _PDFIO_INTERNAL;
|
||||
extern pdfio_stream_t *_pdfioStreamOpen(pdfio_obj_t *obj, bool decode) _PDFIO_INTERNAL;
|
||||
|
||||
extern char *_pdfioStringAllocBuffer(pdfio_file_t *pdf);
|
||||
extern void _pdfioStringFreeBuffer(pdfio_file_t *pdf, char *buffer);
|
||||
extern bool _pdfioStringIsAllocated(pdfio_file_t *pdf, const char *s) _PDFIO_INTERNAL;
|
||||
|
||||
extern void _pdfioTokenClear(_pdfio_token_t *tb) _PDFIO_INTERNAL;
|
||||
|
169
pdfio-stream.c
@ -1,7 +1,7 @@
|
||||
//
|
||||
// PDF stream functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2024 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -50,7 +50,7 @@ pdfioStreamClose(pdfio_stream_t *st) // I - Stream
|
||||
|
||||
while ((status = deflate(&st->flate, Z_FINISH)) != Z_STREAM_END)
|
||||
{
|
||||
size_t bytes = sizeof(st->cbuffer) - st->flate.avail_out,
|
||||
size_t bytes = st->cbsize - st->flate.avail_out,
|
||||
// Bytes to write
|
||||
outbytes; // Actual bytes written
|
||||
|
||||
@ -89,13 +89,13 @@ pdfioStreamClose(pdfio_stream_t *st) // I - Stream
|
||||
}
|
||||
|
||||
st->flate.next_out = (Bytef *)st->cbuffer + bytes;
|
||||
st->flate.avail_out = (uInt)(sizeof(st->cbuffer) - bytes);
|
||||
st->flate.avail_out = (uInt)(st->cbsize - bytes);
|
||||
}
|
||||
|
||||
if (st->flate.avail_out < (uInt)sizeof(st->cbuffer))
|
||||
if (st->flate.avail_out < (uInt)st->cbsize)
|
||||
{
|
||||
// Write any residuals...
|
||||
size_t bytes = sizeof(st->cbuffer) - st->flate.avail_out;
|
||||
size_t bytes = st->cbsize - st->flate.avail_out;
|
||||
// Bytes to write
|
||||
|
||||
if (st->crypto_cb)
|
||||
@ -140,7 +140,7 @@ pdfioStreamClose(pdfio_stream_t *st) // I - Stream
|
||||
// Update the length as needed...
|
||||
if (st->length_obj)
|
||||
{
|
||||
st->length_obj->value.value.number = st->obj->stream_length;
|
||||
st->length_obj->value.value.number = (double)st->obj->stream_length;
|
||||
pdfioObjClose(st->length_obj);
|
||||
}
|
||||
else if (st->obj->length_offset)
|
||||
@ -172,6 +172,7 @@ pdfioStreamClose(pdfio_stream_t *st) // I - Stream
|
||||
|
||||
st->pdf->current_obj = NULL;
|
||||
|
||||
free(st->cbuffer);
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
@ -190,6 +191,7 @@ pdfio_stream_t * // O - Stream or `NULL` on error
|
||||
_pdfioStreamCreate(
|
||||
pdfio_obj_t *obj, // I - Object
|
||||
pdfio_obj_t *length_obj, // I - Length object, if any
|
||||
size_t cbsize, // I - Size of compression buffer
|
||||
pdfio_filter_t compression) // I - Compression to apply
|
||||
{
|
||||
pdfio_stream_t *st; // Stream
|
||||
@ -257,7 +259,7 @@ _pdfioStreamCreate(
|
||||
{
|
||||
colors = 1;
|
||||
}
|
||||
else if (colors < 0 || colors > 4)
|
||||
else if (colors < 0 || colors > 32)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported Colors value %d.", colors);
|
||||
free(st);
|
||||
@ -268,7 +270,7 @@ _pdfioStreamCreate(
|
||||
{
|
||||
columns = 1;
|
||||
}
|
||||
else if (columns < 0)
|
||||
else if (columns < 0 || columns > 65536)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported Columns value %d.", columns);
|
||||
free(st);
|
||||
@ -302,8 +304,21 @@ _pdfioStreamCreate(
|
||||
else
|
||||
st->predictor = _PDFIO_PREDICTOR_NONE;
|
||||
|
||||
if (cbsize == 0)
|
||||
cbsize = 4096;
|
||||
|
||||
st->cbsize = cbsize;
|
||||
if ((st->cbuffer = malloc(cbsize)) == NULL)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unable to allocate %lu bytes for Flate output buffer: %s", (unsigned long)cbsize, strerror(errno));
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
return (NULL);
|
||||
}
|
||||
|
||||
st->flate.next_out = (Bytef *)st->cbuffer;
|
||||
st->flate.avail_out = (uInt)sizeof(st->cbuffer);
|
||||
st->flate.avail_out = (uInt)cbsize;
|
||||
|
||||
if ((status = deflateInit(&(st->flate), 9)) != Z_OK)
|
||||
{
|
||||
@ -362,15 +377,16 @@ pdfioStreamConsume(pdfio_stream_t *st, // I - Stream
|
||||
//
|
||||
// 'pdfioStreamGetToken()' - Read a single PDF token from a stream.
|
||||
//
|
||||
// This function reads a single PDF token from a stream. Operator tokens,
|
||||
// boolean values, and numbers are returned as-is in the provided string buffer.
|
||||
// String values start with the opening parenthesis ('(') but have all escaping
|
||||
// resolved and the terminating parenthesis removed. Hexadecimal string values
|
||||
// start with the opening angle bracket ('<') and have all whitespace and the
|
||||
// terminating angle bracket removed.
|
||||
// This function reads a single PDF token from a stream, skipping all whitespace
|
||||
// and comments. Operator tokens, boolean values, and numbers are returned
|
||||
// as-is in the provided string buffer. String values start with the opening
|
||||
// parenthesis ('(') but have all escaping resolved and the terminating
|
||||
// parenthesis removed. Hexadecimal string values start with the opening angle
|
||||
// bracket ('<') and have all whitespace and the terminating angle bracket
|
||||
// removed.
|
||||
//
|
||||
|
||||
bool // O - `true` on success, `false` on EOF
|
||||
bool // O - `true` on success, `false` on end-of-stream or error
|
||||
pdfioStreamGetToken(
|
||||
pdfio_stream_t *st, // I - Stream
|
||||
char *buffer, // I - String buffer
|
||||
@ -423,16 +439,16 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
st->pdf = obj->pdf;
|
||||
st->obj = obj;
|
||||
|
||||
if ((st->remaining = pdfioObjGetLength(obj)) == 0)
|
||||
if ((st->remaining = pdfioObjGetLength(obj)) == 0 && !_pdfioDictGetValue(pdfioObjGetDict(obj), "Length"))
|
||||
{
|
||||
free(st);
|
||||
return (NULL);
|
||||
_pdfioFileError(obj->pdf, "No stream data.");
|
||||
goto error;
|
||||
}
|
||||
|
||||
if (_pdfioFileSeek(st->pdf, obj->stream_offset, SEEK_SET) != obj->stream_offset)
|
||||
{
|
||||
free(st);
|
||||
return (NULL);
|
||||
_pdfioFileError(obj->pdf, "Unable to seek to stream data.");
|
||||
goto error;
|
||||
}
|
||||
|
||||
type = pdfioObjGetType(obj);
|
||||
@ -445,11 +461,7 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
ivlen = (size_t)_pdfioFilePeek(st->pdf, iv, sizeof(iv));
|
||||
|
||||
if ((st->crypto_cb = _pdfioCryptoMakeReader(st->pdf, obj, &st->crypto_ctx, iv, &ivlen)) == NULL)
|
||||
{
|
||||
// TODO: Add error message?
|
||||
free(st);
|
||||
return (NULL);
|
||||
}
|
||||
goto error;
|
||||
|
||||
PDFIO_DEBUG("_pdfioStreamOpen: ivlen=%d\n", (int)ivlen);
|
||||
if (ivlen > 0)
|
||||
@ -480,8 +492,7 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
{
|
||||
// TODO: Implement compound filters...
|
||||
_pdfioFileError(st->pdf, "Unsupported compound stream filter.");
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
// No filter, read as-is...
|
||||
@ -514,37 +525,33 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
else if (bpc < 1 || bpc == 3 || (bpc > 4 && bpc < 8) || (bpc > 8 && bpc < 16) || bpc > 16)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported BitsPerColor value %d.", bpc);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
if (colors == 0)
|
||||
{
|
||||
colors = 1;
|
||||
}
|
||||
else if (colors < 0 || colors > 4)
|
||||
else if (colors < 0 || colors > 32)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported Colors value %d.", colors);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
if (columns == 0)
|
||||
{
|
||||
columns = 1;
|
||||
}
|
||||
else if (columns < 0)
|
||||
else if (columns < 0 || columns > 65536)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported Columns value %d.", columns);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
if ((predictor > 2 && predictor < 10) || predictor > 15)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unsupported Predictor function %d.", predictor);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
else if (predictor > 1)
|
||||
{
|
||||
@ -555,31 +562,41 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
if (predictor >= 10)
|
||||
st->pbsize ++; // Add PNG predictor byte
|
||||
|
||||
if (st->pbsize < 2)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Bad Predictor buffer size %lu.", (unsigned long)st->pbsize);
|
||||
goto error;
|
||||
}
|
||||
|
||||
PDFIO_DEBUG("_pdfioStreamOpen: st->predictor=%d, st->pbpixel=%u, st->pbsize=%lu\n", st->predictor, (unsigned)st->pbpixel, (unsigned long)st->pbsize);
|
||||
if ((st->prbuffer = calloc(1, st->pbsize - 1)) == NULL || (st->psbuffer = calloc(1, st->pbsize)) == NULL)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unable to allocate %lu bytes for Predictor buffers.", (unsigned long)st->pbsize);
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
}
|
||||
else
|
||||
{
|
||||
st->predictor = _PDFIO_PREDICTOR_NONE;
|
||||
}
|
||||
|
||||
st->cbsize = 4096;
|
||||
if ((st->cbuffer = malloc(st->cbsize)) == NULL)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unable to allocate %lu bytes for Flate compression buffer.", (unsigned long)st->cbsize);
|
||||
goto error;
|
||||
}
|
||||
|
||||
PDFIO_DEBUG("_pdfioStreamOpen: pos=%ld\n", (long)_pdfioFileTell(st->pdf));
|
||||
if (sizeof(st->cbuffer) > st->remaining)
|
||||
if (st->cbsize > st->remaining)
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->remaining);
|
||||
else
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, sizeof(st->cbuffer));
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->cbsize);
|
||||
|
||||
if (rbytes <= 0)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unable to read bytes for stream.");
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
if (st->crypto_cb)
|
||||
@ -593,10 +610,7 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
if ((status = inflateInit(&(st->flate))) != Z_OK)
|
||||
{
|
||||
_pdfioFileError(st->pdf, "Unable to start Flate filter: %s", zstrerror(status));
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
return (NULL);
|
||||
goto error;
|
||||
}
|
||||
|
||||
st->remaining -= st->flate.avail_in;
|
||||
@ -609,9 +623,8 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
else
|
||||
{
|
||||
// Something else we don't support
|
||||
_pdfioFileError(st->pdf, "Unsupported stream filter '/%s'.", filter);
|
||||
free(st);
|
||||
return (NULL);
|
||||
_pdfioFileError(st->pdf, "Unsupported stream filter '%N'.", filter);
|
||||
goto error;
|
||||
}
|
||||
}
|
||||
else
|
||||
@ -621,6 +634,16 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
|
||||
}
|
||||
|
||||
return (st);
|
||||
|
||||
// If we get here something went wrong...
|
||||
error:
|
||||
|
||||
free(st->cbuffer);
|
||||
free(st->prbuffer);
|
||||
free(st->psbuffer);
|
||||
free(st);
|
||||
return (NULL);
|
||||
|
||||
}
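Review note: this change replaces the repeated `free(...); return (NULL);` sequences in `_pdfioStreamOpen()` with a single `error:` label, and moves `cbuffer` from a fixed array to a heap allocation sized by `cbsize`. The following is a simplified sketch of that cleanup pattern, assuming a hypothetical `sketch_stream_t` with only the buffer members; it is not the PDFio structure or function.

```c
#include <assert.h>
#include <stdlib.h>

// Hypothetical, reduced stream state with only the heap buffers.
typedef struct
{
  size_t        cbsize;                 // Compressed data buffer size
  unsigned char *cbuffer,               // Compressed data buffer
                *prbuffer,              // Previous line buffer, if any
                *psbuffer;              // Predictor buffer, if any
} sketch_stream_t;

// Allocate a stream; pbsize of 0 means "no predictor buffers" and
// cbsize of 0 selects a 4096 byte default, as in the diff.
static sketch_stream_t *sketch_stream_open(size_t cbsize, size_t pbsize)
{
  sketch_stream_t *st;                  // Stream

  // calloc zeroes the pointers so the error path can free them safely
  if ((st = calloc(1, sizeof(sketch_stream_t))) == NULL)
    return (NULL);

  if (pbsize > 0)
  {
    if (pbsize < 2)
      goto error;                       // Would make "pbsize - 1" zero

    if ((st->prbuffer = calloc(1, pbsize - 1)) == NULL ||
        (st->psbuffer = calloc(1, pbsize)) == NULL)
      goto error;
  }

  if (cbsize == 0)
    cbsize = 4096;

  st->cbsize = cbsize;
  if ((st->cbuffer = malloc(cbsize)) == NULL)
    goto error;

  return (st);

  // Single exit for every failure; free(NULL) is a no-op
  error:

  free(st->cbuffer);
  free(st->prbuffer);
  free(st->psbuffer);
  free(st);

  return (NULL);
}

// Release a stream created by sketch_stream_open().
static void sketch_stream_close(sketch_stream_t *st)
{
  assert(st != NULL);

  free(st->cbuffer);
  free(st->prbuffer);
  free(st->psbuffer);
  free(st);
}
```

The pattern only works if every pointer is either `NULL` or a valid allocation when the label is reached, which is why the structure is zero-initialized before any buffer is allocated.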
|
||||
|
||||
|
||||
@ -673,6 +696,11 @@ pdfioStreamPeek(pdfio_stream_t *st, // I - Stream
|
||||
//
|
||||
// 'pdfioStreamPrintf()' - Write a formatted string to a stream.
|
||||
//
|
||||
// This function writes a formatted string to a stream. In addition to the
|
||||
// standard `printf` format characters, you can use "%H" to format a HTML/XML
|
||||
// string value, "%N" to format a PDF name value ("/Name"), and "%S" to format
|
||||
// a PDF string ("(String)") value.
|
||||
//
|
||||
|
||||
bool // O - `true` on success, `false` on failure
|
||||
pdfioStreamPrintf(
|
||||
@ -1045,10 +1073,10 @@ stream_read(pdfio_stream_t *st, // I - Stream
|
||||
if (st->flate.avail_in == 0)
|
||||
{
|
||||
// Read more from the file...
|
||||
if (sizeof(st->cbuffer) > st->remaining)
|
||||
if (st->cbsize > st->remaining)
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->remaining);
|
||||
else
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, sizeof(st->cbuffer));
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->cbsize);
|
||||
|
||||
if (rbytes <= 0)
|
||||
return (-1); // End of file...
|
||||
@ -1101,10 +1129,10 @@ stream_read(pdfio_stream_t *st, // I - Stream
|
||||
if (st->flate.avail_in == 0)
|
||||
{
|
||||
// Read more from the file...
|
||||
if (sizeof(st->cbuffer) > st->remaining)
|
||||
if (st->cbsize > st->remaining)
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->remaining);
|
||||
else
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, sizeof(st->cbuffer));
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->cbsize);
|
||||
|
||||
if (rbytes <= 0)
|
||||
return (-1); // End of file...
|
||||
@ -1171,10 +1199,10 @@ stream_read(pdfio_stream_t *st, // I - Stream
|
||||
if (st->flate.avail_in == 0)
|
||||
{
|
||||
// Read more from the file...
|
||||
if (sizeof(st->cbuffer) > st->remaining)
|
||||
if (st->cbsize > st->remaining)
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->remaining);
|
||||
else
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, sizeof(st->cbuffer));
|
||||
rbytes = _pdfioFileRead(st->pdf, st->cbuffer, st->cbsize);
|
||||
|
||||
if (rbytes <= 0)
|
||||
return (-1); // End of file...
|
||||
@ -1207,7 +1235,18 @@ stream_read(pdfio_stream_t *st, // I - Stream
|
||||
}
|
||||
|
||||
// Apply predictor for this line
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X %02X %02X %02X.\n", sptr[-1], sptr[0], sptr[0], sptr[2], sptr[3]);
|
||||
#ifdef DEBUG
|
||||
if (remaining > 4)
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X %02X %02X %02X ...\n", sptr[-1], sptr[0], sptr[1], sptr[2], sptr[3]);
|
||||
else if (remaining > 3)
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X %02X %02X %02X.\n", sptr[-1], sptr[0], sptr[1], sptr[2], sptr[3]);
|
||||
else if (remaining > 2)
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X %02X %02X.\n", sptr[-1], sptr[0], sptr[1], sptr[2]);
|
||||
else if (remaining > 1)
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X %02X.\n", sptr[-1], sptr[0], sptr[1]);
|
||||
else
|
||||
PDFIO_DEBUG("stream_read: Line %02X %02X.\n", sptr[-1], sptr[0]);
|
||||
#endif // DEBUG
|
||||
|
||||
switch (sptr[-1])
|
||||
{
|
||||
@ -1278,10 +1317,10 @@ stream_write(pdfio_stream_t *st, // I - Stream
|
||||
|
||||
while (st->flate.avail_in > 0)
|
||||
{
|
||||
if (st->flate.avail_out < (sizeof(st->cbuffer) / 8))
|
||||
if (st->flate.avail_out < (st->cbsize / 8))
|
||||
{
|
||||
// Flush the compression buffer...
|
||||
size_t cbytes = sizeof(st->cbuffer) - st->flate.avail_out,
|
||||
size_t cbytes = st->cbsize - st->flate.avail_out,
|
||||
outbytes;
|
||||
|
||||
if (st->crypto_cb)
|
||||
@ -1310,7 +1349,7 @@ stream_write(pdfio_stream_t *st, // I - Stream
|
||||
}
|
||||
|
||||
st->flate.next_out = (Bytef *)st->cbuffer + cbytes;
|
||||
st->flate.avail_out = (uInt)(sizeof(st->cbuffer) - cbytes);
|
||||
st->flate.avail_out = (uInt)(st->cbsize - cbytes);
|
||||
}
|
||||
|
||||
// Deflate what we can this time...
|
||||
|
392
pdfio-string.c
@ -1,7 +1,7 @@
|
||||
//
|
||||
// PDF string functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2024 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -17,6 +17,83 @@
|
||||
static size_t find_string(pdfio_file_t *pdf, const char *s, int *rdiff);
|
||||
|
||||
|
||||
//
|
||||
// '_pdfio_strlcpy()' - Safe string copy.
|
||||
//
|
||||
|
||||
size_t // O - Length of source string
|
||||
_pdfio_strlcpy(char *dst, // I - Destination string buffer
|
||||
const char *src, // I - Source string
|
||||
size_t dstsize) // I - Size of destination
|
||||
{
|
||||
size_t srclen; // Length of source string
|
||||
|
||||
|
||||
// Range check input...
|
||||
if (!dst || !src || dstsize == 0)
|
||||
{
|
||||
if (dst)
|
||||
*dst = '\0';
|
||||
return (0);
|
||||
}
|
||||
|
||||
// Figure out how much room is needed...
|
||||
dstsize --;
|
||||
|
||||
srclen = strlen(src);
|
||||
|
||||
// Copy the appropriate amount...
|
||||
if (srclen <= dstsize)
|
||||
{
|
||||
// Source string will fit...
|
||||
memmove(dst, src, srclen);
|
||||
dst[srclen] = '\0';
|
||||
}
|
||||
else
|
||||
{
|
||||
// Source string too big, copy what we can and clean up the end...
|
||||
char *ptr = dst + dstsize - 1, // Pointer into string
|
||||
*end = ptr + 1; // Pointer to end of string
|
||||
|
||||
memmove(dst, src, dstsize);
|
||||
dst[dstsize] = '\0';
|
||||
|
||||
// Validate last character in destination buffer...
|
||||
if (ptr > dst && *ptr & 0x80)
|
||||
{
|
||||
while ((*ptr & 0xc0) == 0x80 && ptr > dst)
|
||||
ptr --;
|
||||
|
||||
if ((*ptr & 0xe0) == 0xc0)
|
||||
{
|
||||
// Verify 2-byte UTF-8 sequence...
|
||||
if ((end - ptr) != 2)
|
||||
*ptr = '\0';
|
||||
}
|
||||
else if ((*ptr & 0xf0) == 0xe0)
|
||||
{
|
||||
// Verify 3-byte UTF-8 sequence...
|
||||
if ((end - ptr) != 3)
|
||||
*ptr = '\0';
|
||||
}
|
||||
else if ((*ptr & 0xf8) == 0xf0)
|
||||
{
|
||||
// Verify 4-byte UTF-8 sequence...
|
||||
if ((end - ptr) != 4)
|
||||
*ptr = '\0';
|
||||
}
|
||||
else if (*ptr & 0x80)
|
||||
{
|
||||
// Invalid sequence at end...
|
||||
*ptr = '\0';
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
return (srclen);
|
||||
}
|
||||
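`_pdfio_strlcpy` copies first and then inspects the tail of the destination for an incomplete UTF-8 sequence. The sketch below reaches a similar result with a different, simpler technique: it shortens the copy length before copying so that a multi-byte character is never split. The name `utf8_safe_copy` is hypothetical and the input is assumed to be valid UTF-8.

```c
#include <stddef.h>
#include <string.h>

// Copy src into dst (dstsize bytes including the nul terminator), dropping a
// character that would otherwise be cut in half.  Returns strlen(src) so the
// caller can detect truncation, as strlcpy does.
size_t
utf8_safe_copy(char *dst, const char *src, size_t dstsize)
{
  size_t srclen,                        // Length of source string
         n;                             // Number of bytes to copy

  if (!dst || !src || dstsize == 0)
  {
    if (dst && dstsize > 0)
      *dst = '\0';
    return (0);
  }

  srclen = strlen(src);
  n      = srclen < dstsize ? srclen : dstsize - 1;

  if (n < srclen)
  {
    // src[n] is the first byte that will NOT be copied.  If it is a
    // continuation byte (10xxxxxx) the cut lands inside a character, so
    // back up until src[n] is the lead byte and exclude it as well.
    while (n > 0 && ((unsigned char)src[n] & 0xc0) == 0x80)
      n --;
  }

  memcpy(dst, src, n);
  dst[n] = '\0';

  return (srclen);
}
```

A return value greater than or equal to `dstsize` means the output was truncated.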
|
||||
|
||||
//
|
||||
// '_pdfio_strtod()' - Convert a string to a double value.
|
||||
//
|
||||
@ -81,6 +158,89 @@ _pdfio_strtod(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// '_pdfio_utf16cpy()' - Convert UTF-16 to UTF-8.
|
||||
//
|
||||
|
||||
void
|
||||
_pdfio_utf16cpy(
|
||||
char *dst, // I - Destination buffer for UTF-8
|
||||
const unsigned char *src, // I - Source UTF-16
|
||||
size_t srclen, // I - Length of UTF-16
|
||||
size_t dstsize) // I - Destination buffer size
|
||||
{
|
||||
char *dstptr = dst, // Pointer into buffer
|
||||
*dstend = dst + dstsize - 5; // End of buffer
|
||||
int ch; // Unicode character
|
||||
bool is_be = !memcmp(src, "\376\377", 2);
|
||||
// Big-endian strings?
|
||||
|
||||
|
||||
// Loop through the UTF-16 string, converting to Unicode then UTF-8...
|
||||
for (src += 2, srclen -= 2; srclen > 1 && dstptr < dstend; src += 2, srclen -= 2)
|
||||
{
|
||||
// Initial character...
|
||||
if (is_be)
|
||||
ch = (src[0] << 8) | src[1];
|
||||
else
|
||||
ch = (src[1] << 8) | src[0];
|
||||
|
||||
if (ch >= 0xd800 && ch <= 0xdbff && srclen > 3)
|
||||
{
|
||||
// Multi-word UTF-16 char...
|
||||
int lch; // Lower bits
|
||||
|
||||
if (is_be)
|
||||
lch = (src[2] << 8) | src[3];
|
||||
else
|
||||
lch = (src[3] << 8) | src[2];
|
||||
|
||||
if (lch < 0xdc00 || lch > 0xdfff)
|
||||
break;
|
||||
|
||||
ch = (((ch & 0x3ff) << 10) | (lch & 0x3ff)) + 0x10000;
|
||||
src += 2;
|
||||
srclen -= 2;
|
||||
}
|
||||
else if (ch >= 0xfffe)
|
||||
{
|
||||
continue;
|
||||
}
|
||||
|
||||
// Convert Unicode to UTF-8...
|
||||
if (ch < 128)
|
||||
{
|
||||
// ASCII
|
||||
*dstptr++ = (char)ch;
|
||||
}
|
||||
else if (ch < 0x800)
|
||||
{
|
||||
// 2-byte UTF-8
|
||||
*dstptr++ = (char)(0xc0 | (ch >> 6));
|
||||
*dstptr++ = (char)(0x80 | (ch & 0x3f));
|
||||
}
|
||||
else if (ch < 65536)
|
||||
{
|
||||
// 3-byte UTF-8
|
||||
*dstptr++ = (char)(0xe0 | (ch >> 12));
|
||||
*dstptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
|
||||
*dstptr++ = (char)(0x80 | (ch & 0x3f));
|
||||
}
|
||||
else
|
||||
{
|
||||
// 4-byte UTF-8
|
||||
*dstptr++ = (char)(0xf0 | (ch >> 18));
|
||||
*dstptr++ = (char)(0x80 | ((ch >> 12) & 0x3f));
|
||||
*dstptr++ = (char)(0x80 | ((ch >> 6) & 0x3f));
|
||||
*dstptr++ = (char)(0x80 | (ch & 0x3f));
|
||||
}
|
||||
}
|
||||
|
||||
// Nul-terminate the UTF-8 string...
|
||||
*dstptr = '\0';
|
||||
}
|
||||
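For reference when reading the conversion above: UTF-8 uses one byte below 0x80, two bytes below 0x800, three bytes below 0x10000, and four bytes (lead byte 0xF0) up to 0x10FFFF. A minimal standalone encoder under those rules might look like this; `encode_utf8` is a hypothetical helper, not part of PDFio.

```c
#include <stddef.h>

// Encode one Unicode code point as UTF-8.  Returns the number of bytes
// written (1 to 4), or 0 if ch is a surrogate or above 0x10FFFF.
size_t
encode_utf8(unsigned ch, unsigned char out[4])
{
  if (ch < 0x80)
  {
    // ASCII
    out[0] = (unsigned char)ch;
    return (1);
  }
  else if (ch < 0x800)
  {
    // 2-byte UTF-8: 110xxxxx 10xxxxxx
    out[0] = (unsigned char)(0xc0 | (ch >> 6));
    out[1] = (unsigned char)(0x80 | (ch & 0x3f));
    return (2);
  }
  else if (ch < 0x10000)
  {
    // Surrogates are not valid scalar values
    if (ch >= 0xd800 && ch <= 0xdfff)
      return (0);

    // 3-byte UTF-8: 1110xxxx 10xxxxxx 10xxxxxx
    out[0] = (unsigned char)(0xe0 | (ch >> 12));
    out[1] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
    out[2] = (unsigned char)(0x80 | (ch & 0x3f));
    return (3);
  }
  else if (ch < 0x110000)
  {
    // 4-byte UTF-8: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    out[0] = (unsigned char)(0xf0 | (ch >> 18));
    out[1] = (unsigned char)(0x80 | ((ch >> 12) & 0x3f));
    out[2] = (unsigned char)(0x80 | ((ch >> 6) & 0x3f));
    out[3] = (unsigned char)(0x80 | (ch & 0x3f));
    return (4);
  }

  return (0);
}
```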
|
||||
|
||||
//
|
||||
// '_pdfio_vsnprintf()' - Format a string.
|
||||
//
|
||||
@ -112,10 +272,9 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
|
||||
// Loop through the format string, formatting as needed...
|
||||
bufptr = buffer;
|
||||
bufend = buffer + bufsize - 1;
|
||||
*bufend = '\0';
|
||||
bytes = 0;
|
||||
bufptr = buffer;
|
||||
bufend = buffer + bufsize - 1;
|
||||
bytes = 0;
|
||||
|
||||
while (*format)
|
||||
{
|
||||
@ -178,14 +337,12 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
else
|
||||
{
|
||||
prec = 0;
|
||||
|
||||
while (isdigit(*format & 255))
|
||||
{
|
||||
if (tptr < (tformat + sizeof(tformat) - 1))
|
||||
*tptr++ = *format;
|
||||
|
||||
prec = prec * 10 + *format++ - '0';
|
||||
format ++;
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -259,7 +416,7 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
strncpy(bufptr, temp, (size_t)(bufend - bufptr - 1));
|
||||
_pdfio_strlcpy(bufptr, temp, (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
break;
|
||||
@ -289,7 +446,7 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
strncpy(bufptr, temp, (size_t)(bufend - bufptr - 1));
|
||||
_pdfio_strlcpy(bufptr, temp, (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
break;
|
||||
@ -304,7 +461,7 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
strncpy(bufptr, temp, (size_t)(bufend - bufptr - 1));
|
||||
_pdfio_strlcpy(bufptr, temp, (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
break;
|
||||
@ -329,19 +486,164 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
break;
|
||||
|
||||
case 's' : // String
|
||||
case 'H' : // XML/HTML string
|
||||
if ((s = va_arg(ap, char *)) == NULL)
|
||||
s = "(null)";
|
||||
|
||||
// Loop through the literal string...
|
||||
while (*s)
|
||||
{
|
||||
// Escape special characters
|
||||
if (*s == '&')
|
||||
{
|
||||
// &amp;
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
_pdfio_strlcpy(bufptr, "&amp;", (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
|
||||
bytes += 5;
|
||||
}
|
||||
else if (*s == '<')
|
||||
{
|
||||
// &lt;
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
_pdfio_strlcpy(bufptr, "&lt;", (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
|
||||
bytes += 4;
|
||||
}
|
||||
else if (*s == '>')
|
||||
{
|
||||
// &gt;
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
_pdfio_strlcpy(bufptr, "&gt;", (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
|
||||
bytes += 4;
|
||||
}
|
||||
else
|
||||
{
|
||||
// Literal character...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = *s;
|
||||
bytes ++;
|
||||
}
|
||||
|
||||
s ++;
|
||||
}
|
||||
break;
|
||||
|
||||
case 'S' : // PDF string
|
||||
if ((s = va_arg(ap, char *)) == NULL)
|
||||
s = "(null)";
|
||||
|
||||
// PDF strings start with "("...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = '(';
|
||||
|
||||
bytes ++;
|
||||
|
||||
// Loop through the literal string...
|
||||
while (*s)
|
||||
{
|
||||
// Escape special characters
|
||||
if (*s == '\\' || *s == '(' || *s == ')')
|
||||
{
|
||||
// Simple escape...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = '\\';
|
||||
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = *s;
|
||||
|
||||
bytes += 2;
|
||||
}
|
||||
else if (*s < ' ')
|
||||
{
|
||||
// Octal escape...
|
||||
snprintf(bufptr, (size_t)(bufend - bufptr + 1), "\\%03o", *s & 255);
|
||||
bufptr += strlen(bufptr);
|
||||
bytes += 4;
|
||||
}
|
||||
else
|
||||
{
|
||||
// Literal character...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = *s;
|
||||
bytes ++;
|
||||
}
|
||||
|
||||
s ++;
|
||||
}
|
||||
|
||||
// PDF strings end with ")"...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = ')';
|
||||
|
||||
bytes ++;
|
||||
break;
|
||||
|
||||
case 's' : // Literal string
|
||||
if ((s = va_arg(ap, char *)) == NULL)
|
||||
s = "(null)";
|
||||
|
||||
if (width != 0)
|
||||
{
|
||||
// Format string to fit inside the specified width...
|
||||
if ((size_t)(width + 1) > sizeof(temp))
|
||||
break;
|
||||
|
||||
snprintf(temp, sizeof(temp), tformat, s);
|
||||
s = temp;
|
||||
}
|
||||
|
||||
bytes += strlen(s);
|
||||
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
strncpy(bufptr, s, (size_t)(bufend - bufptr - 1));
|
||||
_pdfio_strlcpy(bufptr, s, (size_t)(bufend - bufptr + 1));
|
||||
bufptr += strlen(bufptr);
|
||||
}
|
||||
break;
|
||||
|
||||
case 'N' : // Output name string with proper escaping
|
||||
if ((s = va_arg(ap, char *)) == NULL)
|
||||
s = "(null)";
|
||||
|
||||
// PDF names start with "/"...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = '/';
|
||||
|
||||
bytes ++;
|
||||
|
||||
// Loop through the name string...
|
||||
while (*s)
|
||||
{
|
||||
if (*s < 0x21 || *s > 0x7e || *s == '#')
|
||||
{
|
||||
// Output #XX for character...
|
||||
snprintf(bufptr, (size_t)(bufend - bufptr + 1), "#%02X", *s & 255);
|
||||
bufptr += strlen(bufptr);
|
||||
bytes += 3;
|
||||
}
|
||||
else
|
||||
{
|
||||
// Output literal character...
|
||||
if (bufptr < bufend)
|
||||
*bufptr++ = *s;
|
||||
bytes ++;
|
||||
}
|
||||
|
||||
s ++;
|
||||
}
|
||||
break;
|
||||
|
||||
case 'n' : // Output number of chars so far
|
||||
*(va_arg(ap, int *)) = (int)bytes;
|
||||
break;
|
||||
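The `%N` case above writes a leading `/` and then emits `#XX` for any byte outside 0x21 to 0x7E and for `#` itself, while still counting the bytes that would have been needed. A standalone sketch of the same escaping rule is shown here; `format_pdf_name` is a hypothetical function, and unlike the hunk it skips an escape entirely when all three characters do not fit.

```c
#include <stddef.h>
#include <stdio.h>

// Write "/Name" into buf using #XX escapes.  Returns the number of bytes the
// full result needs (excluding the nul), even if buf was too small.
size_t
format_pdf_name(char *buf, size_t bufsize, const char *name)
{
  char   *bufptr,                       // Pointer into buffer
         *bufend;                       // Last byte, reserved for the nul
  size_t bytes = 1;                     // Bytes needed, starting with '/'

  if (!buf || bufsize == 0 || !name)
    return (0);

  bufptr = buf;
  bufend = buf + bufsize - 1;

  if (bufptr < bufend)
    *bufptr++ = '/';

  for (; *name; name ++)
  {
    unsigned char ch = (unsigned char)*name;

    if (ch < 0x21 || ch > 0x7e || ch == '#')
    {
      // Escape as #XX, but only if the whole escape fits
      if ((bufend - bufptr) >= 3)
      {
        snprintf(bufptr, 4, "#%02X", (unsigned)ch);
        bufptr += 3;
      }
      else
      {
        bufend = bufptr;                // Out of room: stop writing, keep counting
      }

      bytes += 3;
    }
    else
    {
      // Literal character
      if (bufptr < bufend)
        *bufptr++ = (char)ch;

      bytes ++;
    }
  }

  *bufptr = '\0';

  return (bytes);
}
```

For example, the name `A B#C` would be written as `/A#20B#23C`.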
@ -358,11 +660,7 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
|
||||
// Nul-terminate the string and return the number of characters needed.
|
||||
if (bufptr < bufend)
|
||||
{
|
||||
// Everything fit in the buffer...
|
||||
*bufptr = '\0';
|
||||
}
|
||||
*bufptr = '\0';
|
||||
|
||||
PDFIO_DEBUG("_pdfio_vsnprintf: Returning %ld \"%s\"\n", (long)bytes, buffer);
|
||||
|
||||
@ -370,6 +668,41 @@ _pdfio_vsnprintf(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// '_pdfioStringAllocBuffer()' - Allocate a string buffer.
|
||||
//
|
||||
|
||||
char * // O - Buffer or `NULL` on error
|
||||
_pdfioStringAllocBuffer(
|
||||
pdfio_file_t *pdf) // I - PDF file
|
||||
{
|
||||
_pdfio_strbuf_t *current; // Current string buffer
|
||||
|
||||
|
||||
// See if we have an available string buffer...
|
||||
for (current = pdf->strbuffers; current; current = current->next)
|
||||
{
|
||||
if (!current->bufused)
|
||||
{
|
||||
current->bufused = true;
|
||||
return (current->buffer);
|
||||
}
|
||||
}
|
||||
|
||||
// Didn't find one, allocate a new one...
|
||||
if ((current = calloc(1, sizeof(_pdfio_strbuf_t))) == NULL)
|
||||
return (NULL);
|
||||
|
||||
// Add to the linked list of string buffers...
|
||||
current->next = pdf->strbuffers;
|
||||
current->bufused = true;
|
||||
|
||||
pdf->strbuffers = current;
|
||||
|
||||
return (current->buffer);
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// 'pdfioStringCreate()' - Create a durable literal string.
|
||||
//
|
||||
@ -480,6 +813,29 @@ pdfioStringCreatef(
|
||||
}
|
||||
|
||||
|
||||
//
|
||||
// '_pdfioStringFreeBuffer()' - Free a string buffer.
|
||||
//
|
||||
|
||||
void
|
||||
_pdfioStringFreeBuffer(
|
||||
pdfio_file_t *pdf, // I - PDF file
|
||||
char *buffer) // I - String buffer
|
||||
{
|
||||
_pdfio_strbuf_t *current; // Current string buffer
|
||||
|
||||
|
||||
for (current = pdf->strbuffers; current; current = current->next)
|
||||
{
|
||||
if (current->buffer == buffer)
|
||||
{
|
||||
current->bufused = false;
|
||||
break;
|
||||
}
|
||||
}
|
||||
}
|
||||
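`_pdfioStringAllocBuffer` and `_pdfioStringFreeBuffer` together form a small pool: buffers live on a linked list, a flag marks each one busy or free, and a released buffer is reused by the next request instead of being returned to the heap. A minimal sketch of that idea, using hypothetical `demo_` names and an assumed 1024-byte buffer size:

```c
#include <stdbool.h>
#include <stdlib.h>

#define DEMO_BUFSIZE 1024               // Assumed size, for illustration only

typedef struct demo_strbuf_s
{
  struct demo_strbuf_s *next;           // Next buffer in the pool
  bool                 bufused;         // Is this buffer checked out?
  char                 buffer[DEMO_BUFSIZE];
} demo_strbuf_t;

// Hand out a free buffer, allocating a new one only when none is available.
char *
demo_alloc_buffer(demo_strbuf_t **pool)
{
  demo_strbuf_t *current;

  for (current = *pool; current; current = current->next)
  {
    if (!current->bufused)
    {
      current->bufused = true;
      return (current->buffer);
    }
  }

  if ((current = calloc(1, sizeof(demo_strbuf_t))) == NULL)
    return (NULL);

  current->next    = *pool;
  current->bufused = true;
  *pool            = current;

  return (current->buffer);
}

// Mark a buffer as free again; the memory stays in the pool.
void
demo_release_buffer(demo_strbuf_t *pool, char *buffer)
{
  for (; pool; pool = pool->next)
  {
    if (pool->buffer == buffer)
    {
      pool->bufused = false;
      break;
    }
  }
}

// Free every buffer in the pool.
void
demo_destroy_pool(demo_strbuf_t **pool)
{
  while (*pool)
  {
    demo_strbuf_t *next = (*pool)->next;

    free(*pool);
    *pool = next;
  }
}
```

The trade-off is that the pool only shrinks when it is destroyed, which suits short-lived scratch buffers that would otherwise sit on the stack.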
|
||||
|
||||
//
|
||||
// '_pdfioStringIsAllocated()' - Check whether a string has been allocated.
|
||||
//
|
||||
|
@ -1,7 +1,7 @@
|
||||
//
|
||||
// PDF token parsing functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2023 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -528,13 +528,6 @@ _pdfioTokenRead(_pdfio_token_t *tb, // I - Token buffer/stack
|
||||
return (false);
|
||||
}
|
||||
}
|
||||
|
||||
if (bufptr == (buffer + 1))
|
||||
{
|
||||
_pdfioFileError(tb->pdf, "Empty name.");
|
||||
*bufptr = '\0';
|
||||
return (false);
|
||||
}
|
||||
break;
|
||||
|
||||
case '<' : // Potential hex string
|
||||
|
323
pdfio-value.c
@ -1,7 +1,7 @@
|
||||
//
|
||||
// PDF value functions for PDFio.
|
||||
//
|
||||
// Copyright © 2021-2024 by Michael R Sweet.
|
||||
// Copyright © 2021-2025 by Michael R Sweet.
|
||||
//
|
||||
// Licensed under Apache License v2.0. See the file "LICENSE" for more
|
||||
// information.
|
||||
@ -125,7 +125,7 @@ _pdfioValueDecrypt(pdfio_file_t *pdf, // I - PDF file
|
||||
_pdfio_crypto_ctx_t ctx; // Decryption context
|
||||
_pdfio_crypto_cb_t cb; // Decryption callback
|
||||
size_t ivlen; // Number of initialization vector bytes
|
||||
uint8_t temp[32768]; // Temporary buffer for decryption
|
||||
uint8_t *temp = NULL; // Temporary buffer for decryption
|
||||
size_t templen; // Number of actual data bytes
|
||||
time_t timeval; // Date/time value
|
||||
|
||||
@ -152,11 +152,16 @@ _pdfioValueDecrypt(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
case PDFIO_VALTYPE_BINARY :
|
||||
// Decrypt the binary string...
|
||||
if (v->value.binary.datalen > (sizeof(temp) - 32))
|
||||
if (v->value.binary.datalen > PDFIO_MAX_STRING)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to read encrypted binary string - too long.");
|
||||
return (false);
|
||||
}
|
||||
else if ((temp = (uint8_t *)_pdfioStringAllocBuffer(pdf)) == NULL)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to read encrypted binary string - out of memory.");
|
||||
return (false);
|
||||
}
|
||||
|
||||
ivlen = v->value.binary.datalen;
|
||||
if ((cb = _pdfioCryptoMakeReader(pdf, obj, &ctx, v->value.binary.data, &ivlen)) == NULL)
|
||||
@ -167,29 +172,59 @@ _pdfioValueDecrypt(pdfio_file_t *pdf, // I - PDF file
|
||||
// Copy the decrypted string back to the value and adjust the length...
|
||||
memcpy(v->value.binary.data, temp, templen);
|
||||
|
||||
if (pdf->encryption >= PDFIO_ENCRYPTION_AES_128)
|
||||
if (pdf->encryption >= PDFIO_ENCRYPTION_AES_128 && temp[templen - 1] <= templen)
|
||||
v->value.binary.datalen = templen - temp[templen - 1];
|
||||
else
|
||||
v->value.binary.datalen = templen;
|
||||
|
||||
_pdfioStringFreeBuffer(pdf, (char *)temp);
|
||||
break;
|
||||
|
||||
case PDFIO_VALTYPE_STRING :
|
||||
// Decrypt regular string...
|
||||
templen = strlen(v->value.string);
|
||||
if (templen > (sizeof(temp) - 33))
|
||||
if (templen > (PDFIO_MAX_STRING - 1))
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to read encrypted string - too long.");
|
||||
return (false);
|
||||
}
|
||||
else if ((temp = (uint8_t *)_pdfioStringAllocBuffer(pdf)) == NULL)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to read encrypted string - out of memory.");
|
||||
return (false);
|
||||
}
|
||||
|
||||
ivlen = templen;
|
||||
if ((cb = _pdfioCryptoMakeReader(pdf, obj, &ctx, (uint8_t *)v->value.string, &ivlen)) == NULL)
|
||||
return (false);
|
||||
|
||||
templen = (cb)(&ctx, temp, (uint8_t *)v->value.string + ivlen, templen - ivlen);
|
||||
|
||||
if (pdf->encryption >= PDFIO_ENCRYPTION_AES_128 && temp[templen - 1] <= templen)
|
||||
templen -= temp[templen - 1];
|
||||
|
||||
temp[templen] = '\0';
|
||||
|
||||
if ((timeval = get_date_time((char *)temp)) != 0)
|
||||
if ((templen & 1) == 0 && (!memcmp(temp, "\376\377", 2) || !memcmp(temp, "\377\376", 2)))
|
||||
{
|
||||
// Convert UTF-16 to UTF-8...
|
||||
char utf8[4096]; // Temporary string
|
||||
|
||||
_pdfio_utf16cpy(utf8, temp, templen, sizeof(utf8));
|
||||
|
||||
if ((timeval = get_date_time((char *)utf8)) != 0)
|
||||
{
|
||||
// Change the type to date...
|
||||
v->type = PDFIO_VALTYPE_DATE;
|
||||
v->value.date = timeval;
|
||||
}
|
||||
else
|
||||
{
|
||||
// Copy the decrypted string back to the value...
|
||||
v->value.string = pdfioStringCreate(pdf, utf8);
|
||||
}
|
||||
}
|
||||
else if ((timeval = get_date_time((char *)temp)) != 0)
|
||||
{
|
||||
// Change the type to date...
|
||||
v->type = PDFIO_VALTYPE_DATE;
|
||||
@ -200,6 +235,8 @@ _pdfioValueDecrypt(pdfio_file_t *pdf, // I - PDF file
|
||||
// Copy the decrypted string back to the value...
|
||||
v->value.string = pdfioStringCreate(pdf, (char *)temp);
|
||||
}
|
||||
|
||||
_pdfioStringFreeBuffer(pdf, (char *)temp);
|
||||
break;
|
||||
}
|
||||
|
||||
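The decrypt hunks above trim the trailing pad only when the last decrypted byte does not exceed the decrypted length, which guards the subtraction against underflow. The sketch below shows a slightly stricter version of that check. It assumes PKCS#7-style padding with 16-byte AES blocks, which the diff itself does not state, and `strip_block_padding` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

// Return the payload length after removing trailing block padding, or len
// unchanged when the final byte is not a plausible pad count.
size_t
strip_block_padding(const uint8_t *data, size_t len)
{
  uint8_t pad;                          // Pad count from the last byte

  if (!data || len == 0)
    return (0);

  pad = data[len - 1];

  // Assumption: a valid pad count is 1..16 and never longer than the data
  if (pad == 0 || pad > 16 || pad > len)
    return (len);

  return (len - pad);
}
```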
@ -300,7 +337,9 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
_pdfio_value_t *v, // I - Value
|
||||
size_t depth) // I - Depth of value
|
||||
{
|
||||
char token[32768]; // Token buffer
|
||||
_pdfio_value_t *ret = NULL; // Return value
|
||||
char *token = _pdfioStringAllocBuffer(pdf);
|
||||
// Token buffer
|
||||
time_t timeval; // Date/time value
|
||||
#ifdef DEBUG
|
||||
static const char * const valtypes[] =
|
||||
@ -322,8 +361,11 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
PDFIO_DEBUG("_pdfioValueRead(pdf=%p, obj=%p, v=%p)\n", pdf, obj, v);
|
||||
|
||||
if (!_pdfioTokenGet(tb, token, sizeof(token)))
|
||||
return (NULL);
|
||||
if (!token)
|
||||
goto done;
|
||||
|
||||
if (!_pdfioTokenGet(tb, token, PDFIO_MAX_STRING))
|
||||
goto done;
|
||||
|
||||
if (!strcmp(token, "["))
|
||||
{
|
||||
@ -331,12 +373,14 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
if (depth >= PDFIO_MAX_DEPTH)
|
||||
{
|
||||
_pdfioFileError(pdf, "Too many nested arrays.");
|
||||
return (NULL);
|
||||
goto done;
|
||||
}
|
||||
|
||||
v->type = PDFIO_VALTYPE_ARRAY;
|
||||
if ((v->value.array = _pdfioArrayRead(pdf, obj, tb, depth + 1)) == NULL)
|
||||
return (NULL);
|
||||
goto done;
|
||||
|
||||
ret = v;
|
||||
}
|
||||
else if (!strcmp(token, "<<"))
|
||||
{
|
||||
@ -344,29 +388,38 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
if (depth >= PDFIO_MAX_DEPTH)
|
||||
{
|
||||
_pdfioFileError(pdf, "Too many nested dictionaries.");
|
||||
return (NULL);
|
||||
goto done;
|
||||
}
|
||||
|
||||
v->type = PDFIO_VALTYPE_DICT;
|
||||
if ((v->value.dict = _pdfioDictRead(pdf, obj, tb, depth + 1)) == NULL)
|
||||
return (NULL);
|
||||
}
|
||||
else if (!strncmp(token, "(D:", 3) && (timeval = get_date_time(token + 1)) != 0)
|
||||
{
|
||||
v->type = PDFIO_VALTYPE_DATE;
|
||||
v->value.date = timeval;
|
||||
goto done;
|
||||
|
||||
ret = v;
|
||||
}
|
||||
else if (token[0] == '(')
|
||||
{
|
||||
// String
|
||||
v->type = PDFIO_VALTYPE_STRING;
|
||||
v->value.string = pdfioStringCreate(pdf, token + 1);
|
||||
if ((timeval = get_date_time(token + 1)) != 0)
|
||||
{
|
||||
// Date
|
||||
v->type = PDFIO_VALTYPE_DATE;
|
||||
v->value.date = timeval;
|
||||
ret = v;
|
||||
}
|
||||
else
|
||||
{
|
||||
// String
|
||||
v->type = PDFIO_VALTYPE_STRING;
|
||||
v->value.string = pdfioStringCreate(pdf, token + 1);
|
||||
ret = v;
|
||||
}
|
||||
}
|
||||
else if (token[0] == '/')
|
||||
{
|
||||
// Name
|
||||
v->type = PDFIO_VALTYPE_NAME;
|
||||
v->value.name = pdfioStringCreate(pdf, token + 1);
|
||||
ret = v;
|
||||
}
|
||||
else if (token[0] == '<')
|
||||
{
|
||||
@ -379,7 +432,7 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
if ((v->value.binary.data = (unsigned char *)malloc(v->value.binary.datalen)) == NULL)
|
||||
{
|
||||
_pdfioFileError(pdf, "Out of memory for hex string.");
|
||||
return (NULL);
|
||||
goto done;
|
||||
}
|
||||
|
||||
// Convert hex to binary...
|
||||
@ -406,6 +459,8 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
*dataptr++ = (unsigned char)d;
|
||||
}
|
||||
|
||||
ret = v;
|
||||
}
|
||||
else if (strchr("0123456789-+.", token[0]) != NULL)
|
||||
{
|
||||
@ -493,7 +548,8 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
|
||||
PDFIO_DEBUG("_pdfioValueRead: Returning indirect value %lu %u R.\n", (unsigned long)v->value.indirect.number, v->value.indirect.generation);
|
||||
|
||||
return (v);
|
||||
ret = v;
|
||||
goto done;
|
||||
}
|
||||
}
|
||||
}
|
||||
@ -501,27 +557,41 @@ _pdfioValueRead(pdfio_file_t *pdf, // I - PDF file
|
||||
// If we get here, we have a number...
|
||||
v->type = PDFIO_VALTYPE_NUMBER;
|
||||
v->value.number = _pdfio_strtod(pdf, token);
|
||||
ret = v;
|
||||
}
|
||||
else if (!strcmp(token, "true") || !strcmp(token, "false"))
|
||||
{
|
||||
// Boolean value
|
||||
v->type = PDFIO_VALTYPE_BOOLEAN;
|
||||
v->value.boolean = !strcmp(token, "true");
|
||||
ret = v;
|
||||
}
|
||||
else if (!strcmp(token, "null"))
|
||||
{
|
||||
// null value
|
||||
v->type = PDFIO_VALTYPE_NULL;
|
||||
ret = v;
|
||||
}
|
||||
else
|
||||
{
|
||||
_pdfioFileError(pdf, "Unexpected '%s' token seen.", token);
|
||||
return (NULL);
|
||||
}
|
||||
|
||||
PDFIO_DEBUG("_pdfioValueRead: Returning %s value.\n", valtypes[v->type]);
|
||||
done:
|
||||
|
||||
return (v);
|
||||
if (token)
|
||||
_pdfioStringFreeBuffer(pdf, token);
|
||||
|
||||
if (ret)
|
||||
{
|
||||
PDFIO_DEBUG("_pdfioValueRead: Returning %s value.\n", valtypes[ret->type]);
|
||||
return (ret);
|
||||
}
|
||||
else
|
||||
{
|
||||
PDFIO_DEBUG("_pdfioValueRead: Returning NULL.\n");
|
||||
return (NULL);
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
@ -546,8 +616,10 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
case PDFIO_VALTYPE_BINARY :
|
||||
{
|
||||
size_t databytes; // Bytes to write
|
||||
uint8_t temp[32768], // Temporary buffer for encryption
|
||||
uint8_t *temp = NULL, // Temporary buffer for encryption
|
||||
*dataptr; // Pointer into data
|
||||
bool ret = false; // Return value
|
||||
|
||||
|
||||
if (obj && pdf->encryption)
|
||||
{
|
||||
@ -556,11 +628,16 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
_pdfio_crypto_cb_t cb; // Encryption callback
|
||||
size_t ivlen; // Number of initialization vector bytes
|
||||
|
||||
if (v->value.binary.datalen > (sizeof(temp) - 32))
|
||||
if (v->value.binary.datalen > PDFIO_MAX_STRING)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to write encrypted binary string - too long.");
|
||||
return (false);
|
||||
}
|
||||
else if ((temp = (uint8_t *)_pdfioStringAllocBuffer(pdf)) == NULL)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to write encrypted binary string - out of memory.");
|
||||
return (false);
|
||||
}
|
||||
|
||||
cb = _pdfioCryptoMakeWriter(pdf, obj, &ctx, temp, &ivlen);
|
||||
databytes = (cb)(&ctx, temp + ivlen, v->value.binary.data, v->value.binary.datalen) + ivlen;
|
||||
@ -573,18 +650,25 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
|
||||
if (!_pdfioFilePuts(pdf, "<"))
|
||||
return (false);
|
||||
goto bindone;
|
||||
|
||||
for (; databytes > 1; databytes -= 2, dataptr += 2)
|
||||
{
|
||||
if (!_pdfioFilePrintf(pdf, "%02X%02X", dataptr[0], dataptr[1]))
|
||||
return (false);
|
||||
goto bindone;
|
||||
}
|
||||
|
||||
if (databytes > 0)
|
||||
return (_pdfioFilePrintf(pdf, "%02X>", dataptr[0]));
|
||||
else
|
||||
return (_pdfioFilePuts(pdf, ">"));
|
||||
if (databytes > 0 && !_pdfioFilePrintf(pdf, "%02X", dataptr[0]))
|
||||
goto bindone;
|
||||
|
||||
ret = _pdfioFilePuts(pdf, ">");
|
||||
|
||||
bindone:
|
||||
|
||||
if (temp)
|
||||
_pdfioStringFreeBuffer(pdf, (char *)temp);
|
||||
|
||||
return (ret);
|
||||
}
|
||||
|
||||
case PDFIO_VALTYPE_BOOLEAN :
|
||||
@ -609,7 +693,7 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
if (obj && pdf->encryption)
|
||||
{
|
||||
// Write encrypted string...
|
||||
uint8_t temp[32768], // Encrypted bytes
|
||||
uint8_t temp[64], // Encrypted bytes
|
||||
*tempptr; // Pointer into encrypted bytes
|
||||
_pdfio_crypto_ctx_t ctx; // Encryption context
|
||||
_pdfio_crypto_cb_t cb; // Encryption callback
|
||||
@ -637,7 +721,7 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
}
|
||||
else
|
||||
{
|
||||
return (_pdfioFilePrintf(pdf, "(%s)", datestr));
|
||||
return (_pdfioFilePrintf(pdf, "%S", datestr));
|
||||
}
|
||||
}
|
||||
|
||||
@ -648,19 +732,19 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
return (_pdfioFilePrintf(pdf, " %lu %u R", (unsigned long)v->value.indirect.number, v->value.indirect.generation));
|
||||
|
||||
case PDFIO_VALTYPE_NAME :
|
||||
return (_pdfioFilePrintf(pdf, "/%s", v->value.name));
|
||||
return (_pdfioFilePrintf(pdf, "%N", v->value.name));
|
||||
|
||||
case PDFIO_VALTYPE_NULL :
|
||||
return (_pdfioFilePuts(pdf, " null"));
|
||||
|
||||
case PDFIO_VALTYPE_NUMBER :
|
||||
return (_pdfioFilePrintf(pdf, " %g", v->value.number));
|
||||
return (_pdfioFilePrintf(pdf, " %.6f", v->value.number));
|
||||
|
||||
case PDFIO_VALTYPE_STRING :
|
||||
if (obj && pdf->encryption)
|
||||
{
|
||||
// Write encrypted string...
|
||||
uint8_t temp[32768], // Encrypted bytes
|
||||
uint8_t *temp = NULL, // Encrypted bytes
|
||||
*tempptr; // Pointer into encrypted bytes
|
||||
_pdfio_crypto_ctx_t ctx; // Encryption context
|
||||
_pdfio_crypto_cb_t cb; // Encryption callback
|
||||
@ -668,74 +752,46 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
|
||||
// Length of value
|
||||
ivlen, // Number of initialization vector bytes
|
||||
tempbytes; // Number of output bytes
|
||||
bool ret = false; // Return value
|
||||
|
||||
if (len > (sizeof(temp) - 32))
|
||||
if (len > PDFIO_MAX_STRING)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to write encrypted string - too long.");
|
||||
return (false);
|
||||
}
|
||||
else if ((temp = (uint8_t *)_pdfioStringAllocBuffer(pdf)) == NULL)
|
||||
{
|
||||
_pdfioFileError(pdf, "Unable to write encrypted string - out of memory.");
|
||||
return (false);
|
||||
}
|
||||
|
||||
cb = _pdfioCryptoMakeWriter(pdf, obj, &ctx, temp, &ivlen);
|
||||
tempbytes = (cb)(&ctx, temp + ivlen, (const uint8_t *)v->value.string, len) + ivlen;
|
||||
|
||||
if (!_pdfioFilePuts(pdf, "<"))
|
||||
return (false);
|
||||
goto strdone;
|
||||
|
||||
for (tempptr = temp; tempbytes > 1; tempbytes -= 2, tempptr += 2)
|
||||
{
|
||||
if (!_pdfioFilePrintf(pdf, "%02X%02X", tempptr[0], tempptr[1]))
|
||||
return (false);
|
||||
goto strdone;
|
||||
}
|
||||
|
||||
if (tempbytes > 0)
|
||||
return (_pdfioFilePrintf(pdf, "%02X>", *tempptr));
|
||||
else
|
||||
return (_pdfioFilePuts(pdf, ">"));
|
||||
if (tempbytes > 0 && !_pdfioFilePrintf(pdf, "%02X", *tempptr))
|
||||
goto strdone;
|
||||
|
||||
ret = _pdfioFilePuts(pdf, ">");
|
||||
|
||||
strdone :
|
||||
|
||||
_pdfioStringFreeBuffer(pdf, (char *)temp);
|
||||
|
||||
return (ret);
|
||||
}
|
||||
else
|
||||
{
|
||||
// Write unencrypted string...
|
||||
const char *start, // Start of fragment
|
||||
*end; // End of fragment
|
||||
|
||||
if (!_pdfioFilePuts(pdf, "("))
|
||||
return (false);
|
||||
|
||||
// Write a quoted string value...
|
||||
for (start = v->value.string; *start; start = end)
|
||||
{
|
||||
// Find the next character that needs to be quoted...
|
||||
for (end = start; *end; end ++)
|
||||
{
|
||||
if (*end == '\\' || *end == ')' || (*end & 255) < ' ')
|
||||
break;
|
||||
}
|
||||
|
||||
if (end > start)
|
||||
{
|
||||
// Write unquoted (safe) characters...
|
||||
if (!_pdfioFileWrite(pdf, start, (size_t)(end - start)))
|
||||
return (false);
|
||||
}
|
||||
|
||||
if (*end)
|
||||
{
|
||||
// Quote this character...
|
||||
bool success; // Did the write work?
|
||||
|
||||
if (*end == '\\' || *end == ')')
|
||||
success = _pdfioFilePrintf(pdf, "\\%c", *end);
|
||||
else
|
||||
success = _pdfioFilePrintf(pdf, "\\%03o", *end);
|
||||
|
||||
if (!success)
|
||||
return (false);
|
||||
|
||||
end ++;
|
||||
}
|
||||
}
|
||||
|
||||
return (_pdfioFilePuts(pdf, ")"));
|
||||
return (_pdfioFilePrintf(pdf, "%S", v->value.string));
|
||||
}
|
||||
}
|
||||
|
||||
@@ -747,31 +803,59 @@ _pdfioValueWrite(pdfio_file_t *pdf, // I - PDF file
// 'get_date_time()' - Convert PDF date/time value to time_t.
//

static time_t // O - Time in seconds
static time_t // O - Time in seconds or `0` for none
get_date_time(const char *s) // I - PDF date/time value
{
  int i; // Looping var
  struct tm dateval; // Date value
  int offset; // Date offset
  int offset = 0; // Date offset in seconds
  time_t t; // Time value


  PDFIO_DEBUG("get_date_time(s=\"%s\")\n", s);

  // Possible date value of the form:
  //
  // (D:YYYYMMDDhhmmssZ)
  // (D:YYYYMMDDhhmmss+HH'mm)
  // (D:YYYYMMDDhhmmss-HH'mm)
  // D:YYYYMMDDhhmmssZ
  // D:YYYYMMDDhhmmss+HH'mm
  // D:YYYYMMDDhhmmss-HH'mm
  //

  if (strncmp(s, "D:", 2))
    return (0);

  for (i = 2; i < 16; i ++)
  {
    // Look for date/time digits...
    if (!isdigit(s[i] & 255) || !s[i])
      break;
  }

  if (i >= 16)
  if (i < 6 || (i & 1))
  {
    // Short year or missing digit...
    return (0);
  }

  memset(&dateval, 0, sizeof(dateval));

  dateval.tm_year = (s[2] - '0') * 1000 + (s[3] - '0') * 100 + (s[4] - '0') * 10 + s[5] - '0' - 1900;
  if (i > 6)
    dateval.tm_mon = (s[6] - '0') * 10 + s[7] - '0' - 1;
  if (i > 8)
    dateval.tm_mday = (s[8] - '0') * 10 + s[9] - '0';
  else
    dateval.tm_mday = 1;
  if (i > 10)
    dateval.tm_hour = (s[10] - '0') * 10 + s[11] - '0';
  if (i > 12)
    dateval.tm_min = (s[12] - '0') * 10 + s[13] - '0';
  if (i > 14)
    dateval.tm_sec = (s[14] - '0') * 10 + s[15] - '0';

  if (i >= 16 && s[i])
  {
    // Get zone info...
    if (s[i] == 'Z')
    {
      // UTC...
@@ -782,14 +866,20 @@ get_date_time(const char *s) // I - PDF date/time value
      // Timezone offset from UTC...
      if (isdigit(s[i + 1] & 255) && isdigit(s[i + 2] & 255) && s[i + 3] == '\'' && isdigit(s[i + 4] & 255) && isdigit(s[i + 5] & 255))
      {
        offset = (s[i + 1] - '0') * 36000 + (s[i + 2] - '0') * 3600 + (s[i + 4] - '0') * 600 + (s[i + 5] - '0') * 60;
        if (s[i] == '-')
          offset = -offset;

        i += 6;

        // Accept trailing quote, per PDF spec...
        if (s[i] == '\'')
          i ++;
      }
    }
    else if (!s[i])
    else
    {
      // Missing zone info, invalid date string...
      // Random zone info, invalid date string...
      return (0);
    }
  }
@@ -800,26 +890,31 @@ get_date_time(const char *s) // I - PDF date/time value
    return (0);
  }

  // Date value...
  memset(&dateval, 0, sizeof(dateval));
  // Convert date value to time_t...
#if _WIN32
  if ((t = _mkgmtime(&dateval)) <= 0)
    return (0);

  dateval.tm_year = (s[2] - '0') * 1000 + (s[3] - '0') * 100 + (s[4] - '0') * 10 + s[5] - '0' - 1900;
  dateval.tm_mon = (s[6] - '0') * 10 + s[7] - '0' - 1;
  dateval.tm_mday = (s[8] - '0') * 10 + s[9] - '0';
  dateval.tm_hour = (s[10] - '0') * 10 + s[11] - '0';
  dateval.tm_min = (s[12] - '0') * 10 + s[13] - '0';
  dateval.tm_sec = (s[14] - '0') * 10 + s[15] - '0';
#elif defined(HAVE_TIMEGM)
  if ((t = timegm(&dateval)) <= 0)
    return (0);

  if (s[16] == 'Z')
  {
    offset = 0;
  }
  else
  {
    offset = (s[17] - '0') * 600 + (s[18] - '0') * 60 + (s[19] - '0') * 10 + s[20] - '0';
    if (s[16] == '-')
      offset = -offset;
  }
#else
  if ((t = mktime(&dateval)) <= 0)
    return (0);

  return (mktime(&dateval) + offset);
# if defined(HAVE_TM_GMTOFF)
  // Adjust the time value using the "tm_gmtoff" and "tm_isdst" members. As
  // noted by M-HT on Github, this DST hack will fail in timezones where the
  // DST offset is not one hour, such as Australia/Lord_Howe. Fortunately,
  // this is unusual and most systems support the "timegm" function...
  t += dateval.tm_gmtoff - 3600 * dateval.tm_isdst;
# else
  // Adjust the time value using the even more legacy "timezone" variable,
  // which also reflects any DST offset...
  t += timezone;
# endif // HAVE_TM_GMTOFF
#endif // _WIN32

  return (t - offset);
}
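The zone handling in `get_date_time()` above turns a `Z`, `+HH'mm`, or `-HH'mm` suffix into an offset in seconds, which is then subtracted from the UTC-interpreted time. A minimal sketch of just that step is below. It assumes a hypothetical standalone function, `parse_pdf_tz_offset`, that is not part of PDFio and is stricter than the library code in that it rejects any trailing characters.

```c
#include <ctype.h>

// Hypothetical helper (not a PDFio function): parse the zone suffix of a PDF
// date string, i.e. "Z", "+HH'mm", or "-HH'mm" with an optional trailing
// quote. Returns 1 and stores the offset in seconds on success, 0 otherwise.
static int
parse_pdf_tz_offset(const char *s, int *offset)
{
  *offset = 0;

  if (s[0] == 'Z')
    return (s[1] == '\0'); // UTC, no offset

  if (s[0] != '+' && s[0] != '-')
    return (0); // Not a zone suffix

  // Short-circuit evaluation stops at the first non-digit, so a short string
  // is never read past its terminating nul...
  if (!isdigit(s[1] & 255) || !isdigit(s[2] & 255) || s[3] != '\'' || !isdigit(s[4] & 255) || !isdigit(s[5] & 255))
    return (0);

  // Same arithmetic as the diff: tens of hours, hours, tens of minutes, minutes
  *offset = (s[1] - '0') * 36000 + (s[2] - '0') * 3600 + (s[4] - '0') * 600 + (s[5] - '0') * 60;
  if (s[0] == '-')
    *offset = -*offset;

  // Accept an optional trailing quote...
  s += 6;
  if (*s == '\'')
    s ++;

  return (*s == '\0');
}
```

For example, a local time stamped `+05'30` is 19800 seconds ahead of UTC, so subtracting the parsed offset from the UTC-interpreted time value gives the actual UTC time, matching the `return (t - offset);` line in the diff.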
20
pdfio.h
@@ -20,10 +20,12 @@ extern "C" {


//
// Version number...
// Version numbers...
//

# define PDFIO_VERSION "1.4.1"
# define PDFIO_VERSION "1.5.3"
# define PDFIO_VERSION_MAJOR 1
# define PDFIO_VERSION_MINOR 5


//
@@ -32,11 +34,9 @@ extern "C" {

# if defined(__has_extension) || defined(__GNUC__)
# define _PDFIO_PUBLIC __attribute__ ((visibility("default")))
# define _PDFIO_FORMAT(a,b) __attribute__ ((__format__(__printf__, a,b)))
# define _PDFIO_DEPRECATED __attribute__ ((deprecated)) _PDFIO_PUBLIC
# else
# define _PDFIO_PUBLIC
# define _PDFIO_FORMAT(a,b)
# define _PDFIO_DEPRECATED
# endif // __has_extension || __GNUC__

@@ -46,7 +46,7 @@ extern "C" {
//

# if _WIN32
typedef __int64 ssize_t; // POSIX type not present on Windows... @private@
typedef __int64 ssize_t; // POSIX type not present on Windows @private@
# endif // _WIN32

typedef struct _pdfio_array_s pdfio_array_t;
@@ -62,7 +62,7 @@ typedef bool (*pdfio_error_cb_t)(pdfio_file_t *pdf, const char *message, void *d
typedef enum pdfio_encryption_e // PDF encryption modes
{
  PDFIO_ENCRYPTION_NONE = 0, // No encryption
  PDFIO_ENCRYPTION_RC4_40, // 40-bit RC4 encryption (PDF 1.3)
  PDFIO_ENCRYPTION_RC4_40, // 40-bit RC4 encryption (PDF 1.3, reading only)
  PDFIO_ENCRYPTION_RC4_128, // 128-bit RC4 encryption (PDF 1.4)
  PDFIO_ENCRYPTION_AES_128, // 128-bit AES encryption (PDF 1.6)
  PDFIO_ENCRYPTION_AES_256 // 256-bit AES encryption (PDF 2.0) @exclude all@
@@ -181,7 +181,7 @@ extern bool pdfioDictSetNumber(pdfio_dict_t *dict, const char *key, double valu
extern bool pdfioDictSetObj(pdfio_dict_t *dict, const char *key, pdfio_obj_t *value) _PDFIO_PUBLIC;
extern bool pdfioDictSetRect(pdfio_dict_t *dict, const char *key, pdfio_rect_t *value) _PDFIO_PUBLIC;
extern bool pdfioDictSetString(pdfio_dict_t *dict, const char *key, const char *value) _PDFIO_PUBLIC;
extern bool pdfioDictSetStringf(pdfio_dict_t *dict, const char *key, const char *format, ...) _PDFIO_PUBLIC _PDFIO_FORMAT(3,4);
extern bool pdfioDictSetStringf(pdfio_dict_t *dict, const char *key, const char *format, ...) _PDFIO_PUBLIC;

extern bool pdfioFileClose(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern pdfio_file_t *pdfioFileCreate(const char *filename, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_data) _PDFIO_PUBLIC;
@@ -201,6 +201,7 @@ extern time_t pdfioFileGetCreationDate(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern const char *pdfioFileGetCreator(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern pdfio_array_t *pdfioFileGetID(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern const char *pdfioFileGetKeywords(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern time_t pdfioFileGetModificationDate(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern const char *pdfioFileGetName(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern size_t pdfioFileGetNumObjs(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern size_t pdfioFileGetNumPages(pdfio_file_t *pdf) _PDFIO_PUBLIC;
@@ -216,6 +217,7 @@ extern void pdfioFileSetAuthor(pdfio_file_t *pdf, const char *value) _PDFIO_PUB
extern void pdfioFileSetCreationDate(pdfio_file_t *pdf, time_t value) _PDFIO_PUBLIC;
extern void pdfioFileSetCreator(pdfio_file_t *pdf, const char *value) _PDFIO_PUBLIC;
extern void pdfioFileSetKeywords(pdfio_file_t *pdf, const char *value) _PDFIO_PUBLIC;
extern void pdfioFileSetModificationDate(pdfio_file_t *pdf, time_t value) _PDFIO_PUBLIC;
extern bool pdfioFileSetPermissions(pdfio_file_t *pdf, pdfio_permission_t permissions, pdfio_encryption_t encryption, const char *owner_password, const char *user_password) _PDFIO_PUBLIC;
extern void pdfioFileSetSubject(pdfio_file_t *pdf, const char *value) _PDFIO_PUBLIC;
extern void pdfioFileSetTitle(pdfio_file_t *pdf, const char *value) _PDFIO_PUBLIC;
@@ -241,14 +243,14 @@ extern bool pdfioStreamClose(pdfio_stream_t *st) _PDFIO_PUBLIC;
extern bool pdfioStreamConsume(pdfio_stream_t *st, size_t bytes) _PDFIO_PUBLIC;
extern bool pdfioStreamGetToken(pdfio_stream_t *st, char *buffer, size_t bufsize) _PDFIO_PUBLIC;
extern ssize_t pdfioStreamPeek(pdfio_stream_t *st, void *buffer, size_t bytes) _PDFIO_PUBLIC;
extern bool pdfioStreamPrintf(pdfio_stream_t *st, const char *format, ...) _PDFIO_PUBLIC _PDFIO_FORMAT(2,3);
extern bool pdfioStreamPrintf(pdfio_stream_t *st, const char *format, ...) _PDFIO_PUBLIC;
extern bool pdfioStreamPutChar(pdfio_stream_t *st, int ch) _PDFIO_PUBLIC;
extern bool pdfioStreamPuts(pdfio_stream_t *st, const char *s) _PDFIO_PUBLIC;
extern ssize_t pdfioStreamRead(pdfio_stream_t *st, void *buffer, size_t bytes) _PDFIO_PUBLIC;
extern bool pdfioStreamWrite(pdfio_stream_t *st, const void *buffer, size_t bytes) _PDFIO_PUBLIC;

extern char *pdfioStringCreate(pdfio_file_t *pdf, const char *s) _PDFIO_PUBLIC;
extern char *pdfioStringCreatef(pdfio_file_t *pdf, const char *format, ...) _PDFIO_FORMAT(2,3) _PDFIO_PUBLIC;
extern char *pdfioStringCreatef(pdfio_file_t *pdf, const char *format, ...) _PDFIO_PUBLIC;


# ifdef __cplusplus
@@ -7,7 +7,7 @@ Name: pdfio
Description: PDF read/write library
Version: @PDFIO_VERSION@
URL: https://www.msweet.org/pdfio
Requires: @PKGCONFIG_REQUIRES@
Cflags: @PKGCONFIG_CFLAGS@
Libs: @PKGCONFIG_LIBS@
Libs.private: @PKGCONFIG_LIBS_PRIVATE@
Cflags: @PKGCONFIG_CFLAGS@
Requires: @PKGCONFIG_REQUIRES@
@@ -115,7 +115,7 @@
    <ClCompile>
      <WarningLevel>Level3</WarningLevel>
      <SDLCheck>true</SDLCheck>
      <PreprocessorDefinitions>_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
      <PreprocessorDefinitions>HAVE_LIBPNG;_DEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
      <ConformanceMode>true</ConformanceMode>
    </ClCompile>
    <Link>
@@ -130,7 +130,7 @@
      <FunctionLevelLinking>true</FunctionLevelLinking>
      <IntrinsicFunctions>true</IntrinsicFunctions>
      <SDLCheck>true</SDLCheck>
      <PreprocessorDefinitions>NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
      <PreprocessorDefinitions>HAVE_LIBPNG;NDEBUG;_CONSOLE;%(PreprocessorDefinitions)</PreprocessorDefinitions>
      <ConformanceMode>true</ConformanceMode>
    </ClCompile>
    <Link>
@@ -172,6 +172,8 @@
  </ItemGroup>
  <Import Project="$(VCTargetsPath)\Microsoft.Cpp.targets" />
  <ImportGroup Label="ExtensionTargets">
    <Import Project="packages\libpng_native.redist.1.6.30\build\native\libpng_native.redist.targets" Condition="Exists('packages\libpng_native.redist.1.6.30\build\native\libpng_native.redist.targets')" />
    <Import Project="packages\libpng_native.1.6.30\build\native\libpng_native.targets" Condition="Exists('packages\libpng_native.1.6.30\build\native\libpng_native.targets')" />
    <Import Project="packages\zlib_native.redist.1.2.11\build\native\zlib_native.redist.targets" Condition="Exists('packages\zlib_native.redist.1.2.11\build\native\zlib_native.redist.targets')" />
    <Import Project="packages\zlib_native.1.2.11\build\native\zlib_native.targets" Condition="Exists('packages\zlib_native.1.2.11\build\native\zlib_native.targets')" />
  </ImportGroup>
58
pdfio1.def
@@ -1,73 +1,20 @@
LIBRARY pdfio1
VERSION 1.4
VERSION 1.5
EXPORTS
_pdfioArrayDebug
_pdfioArrayDecrypt
_pdfioArrayDelete
_pdfioArrayGetValue
_pdfioArrayRead
_pdfioArrayWrite
_pdfioCryptoAESDecrypt
_pdfioCryptoAESEncrypt
_pdfioCryptoAESInit
_pdfioCryptoLock
_pdfioCryptoMD5Append
_pdfioCryptoMD5Finish
_pdfioCryptoMD5Init
_pdfioCryptoMakeRandom
_pdfioCryptoMakeReader
_pdfioCryptoMakeWriter
_pdfioCryptoRC4Crypt
_pdfioCryptoRC4Init
_pdfioCryptoSHA256Append
_pdfioCryptoSHA256Finish
_pdfioCryptoSHA256Init
_pdfioCryptoUnlock
_pdfioDictDebug
_pdfioDictDecrypt
_pdfioDictDelete
_pdfioDictGetValue
_pdfioDictRead
_pdfioDictSetValue
_pdfioDictWrite
_pdfioFileAddMappedObj
_pdfioFileAddPage
_pdfioFileConsume
_pdfioFileCreateObj
_pdfioFileDefaultError
_pdfioFileError
_pdfioFileFindMappedObj
_pdfioFileFlush
_pdfioFileGetChar
_pdfioFileGets
_pdfioFilePeek
_pdfioFilePrintf
_pdfioFilePuts
_pdfioFileRead
_pdfioFileSeek
_pdfioFileTell
_pdfioFileWrite
_pdfioObjDelete
_pdfioObjGetExtension
_pdfioObjLoad
_pdfioObjSetExtension
_pdfioStreamCreate
_pdfioStreamOpen
_pdfioStringIsAllocated
_pdfioTokenClear
_pdfioTokenFlush
_pdfioTokenGet
_pdfioTokenInit
_pdfioTokenPush
_pdfioTokenRead
_pdfioValueCopy
_pdfioValueDebug
_pdfioValueDecrypt
_pdfioValueDelete
_pdfioValueRead
_pdfioValueWrite
_pdfio_strtod
_pdfio_vsnprintf
pdfioArrayAppendArray
pdfioArrayAppendBinary
pdfioArrayAppendBoolean
@@ -187,6 +134,7 @@ pdfioFileCreate
pdfioFileCreateArrayObj
pdfioFileCreateFontObjFromBase
pdfioFileCreateFontObjFromFile
pdfioFileCreateICCObjFromData
pdfioFileCreateICCObjFromFile
pdfioFileCreateImageObjFromData
pdfioFileCreateImageObjFromFile
@@ -204,6 +152,7 @@ pdfioFileGetCreationDate
pdfioFileGetCreator
pdfioFileGetID
pdfioFileGetKeywords
pdfioFileGetModificationDate
pdfioFileGetName
pdfioFileGetNumObjs
pdfioFileGetNumPages
@@ -219,6 +168,7 @@ pdfioFileSetAuthor
pdfioFileSetCreationDate
pdfioFileSetCreator
pdfioFileSetKeywords
pdfioFileSetModificationDate
pdfioFileSetPermissions
pdfioFileSetSubject
pdfioFileSetTitle
@@ -3,7 +3,7 @@
  <metadata>
    <id>pdfio_native</id>
    <title>PDFio Library for VS2019+</title>
    <version>1.4.1</version>
    <version>1.5.3</version>
    <authors>Michael R Sweet</authors>
    <owners>michaelrsweet</owners>
    <projectUrl>https://github.com/michaelrsweet/pappl</projectUrl>
@@ -16,7 +16,8 @@
    <copyright>Copyright © 2019-2025 by Michael R Sweet</copyright>
    <tags>pdf file native</tags>
    <dependencies>
      <dependency id="pdfio_native.redist" version="1.4.1" />
      <dependency id="pdfio_native.redist" version="1.5.3" />
      <dependency id="libpng_native.redist" version="1.6.30" />
      <dependency id="zlib_native.redist" version="1.2.11" />
    </dependencies>
  </metadata>
@@ -3,7 +3,7 @@
  <metadata>
    <id>pdfio_native.redist</id>
    <title>PDFio Library for VS2019+</title>
    <version>1.4.1</version>
    <version>1.5.3</version>
    <authors>Michael R Sweet</authors>
    <owners>michaelrsweet</owners>
    <projectUrl>https://github.com/michaelrsweet/pappl</projectUrl>
@@ -16,6 +16,7 @@
    <copyright>Copyright © 2019-2025 by Michael R Sweet</copyright>
    <tags>pdf file native</tags>
    <dependencies>
      <dependency id="libpng_native.redist" version="1.6.30" />
      <dependency id="zlib_native.redist" version="1.2.11" />
    </dependencies>
  </metadata>
@@ -7,6 +7,8 @@

:: Copy dependent DLLs to the named build directory
echo Copying DLLs
copy packages\libpng_native.redist.1.6.30\build\native\bin\x64\Debug\*.dll %1
copy packages\libpng_native.redist.1.6.30\build\native\bin\x64\Release\*.dll %1
copy packages\zlib_native.redist.1.2.11\build\native\bin\x64\Debug\*.dll %1
copy packages\zlib_native.redist.1.2.11\build\native\bin\x64\Release\*.dll %1
9
testfiles/pngsuite-LICENSE.txt
Normal file
@@ -0,0 +1,9 @@
PngSuite
--------

Permission to use, copy, modify and distribute these images for any
purpose and without fee is hereby granted.


(c) Willem van Schaik, 1996, 2011
BIN testfiles/pngsuite/basi0g01.png (Normal file, 217 B)
BIN testfiles/pngsuite/basi0g02.png (Normal file, 154 B)
BIN testfiles/pngsuite/basi0g04.png (Normal file, 247 B)
BIN testfiles/pngsuite/basi0g08.png (Normal file, 254 B)
BIN testfiles/pngsuite/basi2c08.png (Normal file, 315 B)
BIN testfiles/pngsuite/basi3p01.png (Normal file, 132 B)
BIN testfiles/pngsuite/basi3p02.png (Normal file, 193 B)
BIN testfiles/pngsuite/basi3p04.png (Normal file, 327 B)
BIN testfiles/pngsuite/basi3p08.png (Normal file, 1.5 KiB)
BIN testfiles/pngsuite/basi4a08.png (Normal file, 214 B)
BIN testfiles/pngsuite/basi6a08.png (Normal file, 361 B)
BIN testfiles/pngsuite/basn0g01.png (Normal file, 164 B)
BIN testfiles/pngsuite/basn0g02.png (Normal file, 104 B)
BIN testfiles/pngsuite/basn0g04.png (Normal file, 145 B)
BIN testfiles/pngsuite/basn0g08.png (Normal file, 138 B)
BIN testfiles/pngsuite/basn2c08.png (Normal file, 145 B)
BIN testfiles/pngsuite/basn3p01.png (Normal file, 112 B)
BIN testfiles/pngsuite/basn3p02.png (Normal file, 146 B)
BIN testfiles/pngsuite/basn3p04.png (Normal file, 216 B)
BIN testfiles/pngsuite/basn3p08.png (Normal file, 1.3 KiB)
BIN testfiles/pngsuite/basn4a08.png (Normal file, 126 B)
BIN testfiles/pngsuite/basn6a08.png (Normal file, 184 B)
BIN testfiles/pngsuite/exif2c08.png (Normal file, 1.7 KiB)
BIN testfiles/pngsuite/g03n2c08.png (Normal file, 370 B)
BIN testfiles/pngsuite/g03n3p04.png (Normal file, 214 B)
BIN testfiles/pngsuite/g04n2c08.png (Normal file, 377 B)
BIN testfiles/pngsuite/g04n3p04.png (Normal file, 219 B)
BIN testfiles/pngsuite/g05n2c08.png (Normal file, 350 B)
BIN testfiles/pngsuite/g05n3p04.png (Normal file, 206 B)
BIN testfiles/pngsuite/g07n2c08.png (Normal file, 340 B)
BIN testfiles/pngsuite/g07n3p04.png (Normal file, 207 B)
BIN testfiles/pngsuite/g10n2c08.png (Normal file, 285 B)
BIN testfiles/pngsuite/g10n3p04.png (Normal file, 214 B)
BIN testfiles/pngsuite/g25n2c08.png (Normal file, 405 B)
BIN testfiles/pngsuite/g25n3p04.png (Normal file, 215 B)
BIN testfiles/pngsuite/s02i3p01.png (Normal file, 114 B)
BIN testfiles/pngsuite/s02n3p01.png (Normal file, 115 B)
BIN testfiles/pngsuite/s03i3p01.png (Normal file, 118 B)
BIN testfiles/pngsuite/s03n3p01.png (Normal file, 120 B)
BIN testfiles/pngsuite/s04i3p01.png (Normal file, 126 B)
BIN testfiles/pngsuite/s04n3p01.png (Normal file, 121 B)
BIN testfiles/pngsuite/s05i3p02.png (Normal file, 134 B)
BIN testfiles/pngsuite/s05n3p02.png (Normal file, 129 B)
BIN testfiles/pngsuite/s06i3p02.png (Normal file, 143 B)
BIN testfiles/pngsuite/s06n3p02.png (Normal file, 131 B)
BIN testfiles/pngsuite/s07i3p02.png (Normal file, 149 B)
BIN testfiles/pngsuite/s07n3p02.png (Normal file, 138 B)
BIN testfiles/pngsuite/s08i3p02.png (Normal file, 149 B)
BIN testfiles/pngsuite/s08n3p02.png (Normal file, 139 B)
BIN testfiles/pngsuite/s09i3p02.png (Normal file, 147 B)
BIN testfiles/pngsuite/s09n3p02.png (Normal file, 143 B)
BIN testfiles/pngsuite/s32i3p04.png (Normal file, 355 B)
BIN testfiles/pngsuite/s32n3p04.png (Normal file, 263 B)
BIN testfiles/pngsuite/s33i3p04.png (Normal file, 385 B)
BIN testfiles/pngsuite/s33n3p04.png (Normal file, 329 B)
BIN testfiles/pngsuite/s34i3p04.png (Normal file, 349 B)
BIN testfiles/pngsuite/s34n3p04.png (Normal file, 248 B)
BIN testfiles/pngsuite/s35i3p04.png (Normal file, 399 B)