63 Commits

Author SHA1 Message Date
63cdb13b1b Fix image name property - forgot to call pdfioStringCreate* APIs for formatted
string...
2024-12-12 07:11:25 -05:00
72e55b5bd1 Refactor block formatter to split content into lines and render the lines.
Also cache the current font for the whole page.
2024-12-11 20:14:32 -05:00
dc65eb8d2f Save work on image support. 2024-12-11 15:37:03 -05:00
a39b01ec9c Add metadata support. 2024-12-10 18:53:51 -05:00
4b29c9a1c2 Add text color and optimize text groups into whole blocks.
Add UNICODE_VALUE define to allow toggling between Unicode and ISO-8859-1 modes.
2024-12-10 18:41:23 -05:00
5a4afad566 Save work on markdown formatting example code. 2024-12-10 16:35:12 -05:00
7a45adb7f5 Drop cert.py from cppcheck invocation. 2024-12-10 08:24:47 -05:00
45ac66874c Add readme for example programs. 2024-12-09 19:28:15 -05:00
eb9dad9b51 Add Code 128 example code. 2024-12-09 19:13:48 -05:00
2ecb9cfb2d Changelog. 2024-12-08 19:19:13 -05:00
91a467e55c Merge pull request #81 from vlasovsoft1979/ttf_h_size_t_error
Fixed compilation error MSVC 19.16.27039.0 32 bit
2024-12-08 19:18:46 -05:00
d705d7eb5d Fix reading PDF files whose trailer is missing a newline (Issue #80) 2024-12-08 19:14:58 -05:00
55745bcea8 Fixed compilation error MSVC 19.16.27039.0 32 bit 2024-12-08 22:56:25 +03:00
2ea99597cc Update Windows DLL exports. 2024-10-25 17:57:17 -04:00
a3a3512ed8 Update docos. 2024-10-25 17:50:51 -04:00
afac83530f Add pdfioDictGetKey and pdfioDictGetNumPairs APIs (Issue #63)
Add pdfioArrayRemove and pdfioDictClear APIs (Issue #74)
2024-10-25 17:48:19 -04:00
21ac2b52d1 Clean up updated docos (Issue #78) 2024-10-25 17:32:38 -04:00
21b8e3b06f Changelog. 2024-10-25 17:17:39 -04:00
91392a931f Changelog. 2024-10-25 17:17:38 -04:00
1d8bcf4d73 Start v1.4.0. 2024-10-25 17:17:38 -04:00
1e55779906 Merge pull request #78 from uddhavphatak/master
Update in documentation
2024-10-25 17:16:46 -04:00
0e45e49ea4 Merge pull request #76 from vlasovsoft1979/master
Get name from object
2024-10-25 17:14:59 -04:00
0ab291a78b Update pdfio.md 2024-10-21 20:37:25 +05:30
cac6d4891c Update pdfio.md 2024-10-21 19:52:34 +05:30
4f29ad89da Merge branch 'michaelrsweet:master' into master 2024-10-21 17:09:38 +05:30
9c04d1dc20 Update changelog. 2024-10-15 13:10:06 -04:00
335472023e Bump version in header. 2024-10-15 13:06:40 -04:00
25834e07ef Update pdfio.md
addition of lines requested
2024-10-15 09:38:01 +05:30
2d2a7126d2 Update pdfio.md
updated doc
2024-10-14 13:34:27 +05:30
df1064ff39 Update pdfio.md 2024-10-14 13:20:44 +05:30
853fa4fe8f Update pdfio.md 2024-10-14 13:14:59 +05:30
2cadfd8a1e Update pdfio.md 2024-10-14 13:10:57 +05:30
f5d40a305e Update pdfio.md 2024-10-14 13:09:13 +05:30
eb5be57b4a Update pdfio.md
basics of pdf file
2024-10-14 13:06:06 +05:30
3de47ea63d Update pdfio.md
update documentation
2024-10-14 12:43:40 +05:30
8f2c47cb07 Make sure memory is freed on error conditions. 2024-10-09 15:32:48 -04:00
74dfefdcc1 Update documentation (Issue #77)
- Explain pdfioObjGetSubtype and pdfioObjGetType values
- Provide example code and documentation for accessing common page object values
2024-10-09 15:07:57 -04:00
ee31096019 PR comment 2024-09-27 20:38:15 +03:00
121b933307 minor 2024-09-25 18:44:34 +03:00
f4409146e3 minor 2024-09-25 18:42:38 +03:00
4312933409 pdfioFileCreateNameObj implemented 2024-09-25 18:40:36 +03:00
a19949834b PR comments 2024-09-25 18:06:17 +03:00
04c4f44324 Get name from simple object. For example, Image ColorSpace is the reference to other object. 2024-09-25 17:04:25 +03:00
206f75403a Add debug printfs. 2024-08-26 09:19:34 -04:00
7d22477917 Fix opening of certain encrypted PDF files (Issue #62) 2024-08-21 11:28:39 -04:00
7c3651671b Add NULL checks in the private debug APIs that testpdfio calls. 2024-08-21 09:22:58 -04:00
6cb661f0f4 Cleanup changelog. 2024-08-21 08:25:11 -04:00
7e01451b18 Merge 0-character font fix from TTF. 2024-08-21 08:22:31 -04:00
138f3955d1 Add --password option to PDFio test program. 2024-08-19 17:12:16 -04:00
82844ad2ce Merge TTF v1.0.0 source files. 2024-08-19 16:59:00 -04:00
d7cce4dfbc Merge TTF v1.0.0 source files. 2024-08-19 16:58:38 -04:00
1cec42f399 Bump version to 1.3.2. 2024-08-09 10:55:32 -04:00
f3f70e7877 Merge some TTF sanity check fixes from the TTF project. 2024-08-09 10:54:28 -04:00
90923c3818 Update DLL exports. 2024-08-05 21:55:32 -04:00
986cc512cd Bump NuGet project versions. 2024-08-05 21:50:18 -04:00
c35ddbec00 Changelog 2024-08-05 21:49:26 -04:00
e4e1c39578 Merge commit from fork
Add range checking to TTF loader.
2024-08-05 21:47:48 -04:00
1d4f77cab1 Add examples to documentation (Issue #69) 2024-08-05 21:44:56 -04:00
b035130cde Merge pull request #68 from devnibo/master
Update documentation
2024-08-05 19:56:40 -04:00
d6d5813b04 Update changelog with CVE number. 2024-08-05 16:34:12 -04:00
6492f210cf Bump version and changelog. 2024-08-05 10:23:51 -04:00
207062a996 Add size limiting for num_cmap and nGlyphs. 2024-08-05 10:16:00 -04:00
7d37abb0df Update documentation 2024-07-07 16:35:56 +02:00
36 changed files with 5506 additions and 117 deletions

4
.gitignore vendored

@ -1,5 +1,6 @@
*.1.dylib
*.a
*.dSYM
*.log
*.o
*.so.1
@ -8,7 +9,10 @@
/autom4te.cache
/config.log
/config.status
/configure~
/doc/pdfio.epub
/examples/code128
/examples/md2pdf
/Makefile
/packages
/pdfio.pc


@ -2,8 +2,36 @@ Changes in PDFio
================
v1.3.0 (June 28, 2024)
----------------------
v1.4.0 - YYYY-MM-DD
-------------------
- Added new `pdfioDictGetKey` and `pdfioDictGetNumPairs` APIs (Issue #63)
- Added new `pdfioArrayRemove` and `pdfioDictClear` APIs (Issue #74)
- Added new `pdfioFileCreateNameObj` and `pdfioObjGetName` APIs for creating and
getting name object values (Issue #76)
- Updated documentation (Issue #78)
- Fixed reading of PDF files whose trailer is missing a newline (Issue #80)
- Fixed builds with some versions of VC++ (Issue #81)
v1.3.2 - 2024-08-15
-------------------
- Added some more sanity checks to the TrueType font reader.
- Updated documentation (Issue #77)
- Fixed an issue when opening certain encrypted PDF files (Issue #62)
v1.3.1 - 2024-08-05
-------------------
- CVE 2024-42358: Updated TrueType font reader to avoid large memory
allocations.
- Fixed some documentation errors and added examples (Issue #68, Issue #69)
v1.3.0 - 2024-06-28
-------------------
- Added `pdfioFileGetCatalog` API for accessing the root/catalog object of a
PDF file (Issue #67)
@ -13,8 +41,8 @@ v1.3.0 (June 28, 2024)
- Optimized string pool code.
v1.2.0 (January 24, 2024)
-------------------------
v1.2.0 - 2024-01-24
-------------------
- Now use autoconf to configure the PDFio sources (Issue #54)
- Added `pdfioFileCreateNumberObj` and `pdfioFileCreateStringObj` functions
@ -37,8 +65,8 @@ v1.2.0 (January 24, 2024)
65536 in the xref table (Issue #59)
v1.1.4 (December 3, 2023)
-------------------------
v1.1.4 - 2023-12-03
-------------------
- Fixed detection of encrypted strings that are too short (Issue #52)
- Fixed a TrueType CMAP decoding bug.
@ -46,15 +74,15 @@ v1.1.4 (December 3, 2023)
- Added a ToUnicode map for Unicode text to support text copying.
v1.1.3 (November 15, 2023)
--------------------------
v1.1.3 - 2023-11-15
-------------------
- Fixed Unicode font support (Issue #16)
- Fixed missing initializer for 40-bit RC4 encryption (Issue #51)
v1.1.2 (October 10, 2023)
-------------------------
v1.1.2 - 2023-10-10
-------------------
- Updated `pdfioContentSetDashPattern` to support setting a solid (0 length)
dash pattern (Issue #41)
@ -69,15 +97,15 @@ v1.1.2 (October 10, 2023)
(Issue #48)
v1.1.1 (March 20, 2023)
-----------------------
v1.1.1 - 2023-03-20
-------------------
- CVE-2023-28428: Fixed a potential denial-of-service with corrupt PDF files.
- Fixed a few build issues.
v1.1.0 (February 6, 2023)
-------------------------
v1.1.0 - 2023-02-06
-------------------
- CVE-2023-24808: Fixed a potential denial-of-service with corrupt PDF files.
- Added `pdfioFileCreateTemporary` function (Issue #29)
@ -91,28 +119,28 @@ v1.1.0 (February 6, 2023)
- Fixed `pdfioContentMatrixRotate` function.
v1.0.1 (March 2, 2022)
----------------------
v1.0.1 - 2022-03-02
-------------------
- Added missing `pdfioPageGetNumStreams` and `pdfioPageOpenStream` functions.
- Added demo pdfiototext utility.
- Fixed bug in `pdfioStreamGetToken`.
v1.0.0 (December 14, 2021)
--------------------------
v1.0.0 - 2021-12-14
-------------------
- First stable release.
v1.0rc1 (November 30, 2021)
---------------------------
v1.0rc1 - 2021-11-30
--------------------
- Fixed a few stack/buffer overflow bugs discovered via fuzzing.
v1.0b2 (November 7, 2021)
-------------------------
v1.0b2 - 2021-11-07
-------------------
- Added `pdfioFileCreateOutput` API to support streaming output of PDF
(Issue #21)
@ -123,7 +151,7 @@ v1.0b2 (November 7, 2021)
- Fixed some issues identified by a Coverity scan.
v1.0b1 (August 30, 2021)
------------------------
v1.0b1 - 2021-08-30
-------------------
- Initial release

18
EXAMPLES.md Normal file

@ -0,0 +1,18 @@
PDFio Examples
==============
The "examples" subdirectory contains example code showing how to do different
things with PDFio.
code128.c
---------
This example shows how to embed and use a barcode font.
md2pdf.c
--------
This example shows how to generate pages with multiple fonts, embedded images,
and headers and footers.


@ -258,5 +258,5 @@ clang:
# Analyze code using Cppcheck <http://cppcheck.sourceforge.net>
cppcheck:
cppcheck $(CPPFLAGS) --template=gcc --addon=cert.py --suppressions-list=.cppcheck $(OBJS:.o=.c) 2>cppcheck.log
cppcheck $(CPPFLAGS) --template=gcc --suppressions-list=.cppcheck $(OBJS:.o=.c) 2>cppcheck.log
test -s cppcheck.log && (echo "$(GHA_ERROR)Cppcheck detected issues."; echo ""; cat cppcheck.log; exit 1) || exit 0

24
configure vendored
View File

@ -1,6 +1,6 @@
#! /bin/sh
# Guess values for system-dependent variables and create Makefiles.
# Generated by GNU Autoconf 2.71 for pdfio 1.3.0.
# Generated by GNU Autoconf 2.71 for pdfio 1.4.0.
#
# Report bugs to <https://github.com/michaelrsweet/pdfio/issues>.
#
@ -610,8 +610,8 @@ MAKEFLAGS=
# Identity of this package.
PACKAGE_NAME='pdfio'
PACKAGE_TARNAME='pdfio'
PACKAGE_VERSION='1.3.0'
PACKAGE_STRING='pdfio 1.3.0'
PACKAGE_VERSION='1.4.0'
PACKAGE_STRING='pdfio 1.4.0'
PACKAGE_BUGREPORT='https://github.com/michaelrsweet/pdfio/issues'
PACKAGE_URL='https://www.msweet.org/pdfio'
@ -1293,7 +1293,7 @@ if test "$ac_init_help" = "long"; then
# Omit some internal or obsolete options to make the list less imposing.
# This message is too long to be a string in the A/UX 3.1 sh.
cat <<_ACEOF
\`configure' configures pdfio 1.3.0 to adapt to many kinds of systems.
\`configure' configures pdfio 1.4.0 to adapt to many kinds of systems.
Usage: $0 [OPTION]... [VAR=VALUE]...
@ -1359,7 +1359,7 @@ fi
if test -n "$ac_init_help"; then
case $ac_init_help in
short | recursive ) echo "Configuration of pdfio 1.3.0:";;
short | recursive ) echo "Configuration of pdfio 1.4.0:";;
esac
cat <<\_ACEOF
@ -1456,7 +1456,7 @@ fi
test -n "$ac_init_help" && exit $ac_status
if $ac_init_version; then
cat <<\_ACEOF
pdfio configure 1.3.0
pdfio configure 1.4.0
generated by GNU Autoconf 2.71
Copyright (C) 2021 Free Software Foundation, Inc.
@ -1612,7 +1612,7 @@ cat >config.log <<_ACEOF
This file contains any messages produced by compilers while
running configure, to aid debugging if configure makes a mistake.
It was created by pdfio $as_me 1.3.0, which was
It was created by pdfio $as_me 1.4.0, which was
generated by GNU Autoconf 2.71. Invocation command line was
$ $0$ac_configure_args_raw
@ -2368,9 +2368,9 @@ ac_compiler_gnu=$ac_cv_c_compiler_gnu
PDFIO_VERSION="1.3.0"
PDFIO_VERSION_MAJOR="`echo 1.3.0 | awk -F. '{print $1}'`"
PDFIO_VERSION_MINOR="`echo 1.3.0 | awk -F. '{printf("%d\n",$2);}'`"
PDFIO_VERSION="1.4.0"
PDFIO_VERSION_MAJOR="`echo 1.4.0 | awk -F. '{print $1}'`"
PDFIO_VERSION_MINOR="`echo 1.4.0 | awk -F. '{printf("%d\n",$2);}'`"
@ -4935,7 +4935,7 @@ cat >>$CONFIG_STATUS <<\_ACEOF || ac_write_fail=1
# report actual input values of CONFIG_FILES etc. instead of their
# values after options handling.
ac_log="
This file was extended by pdfio $as_me 1.3.0, which was
This file was extended by pdfio $as_me 1.4.0, which was
generated by GNU Autoconf 2.71. Invocation command line was
CONFIG_FILES = $CONFIG_FILES
@ -4991,7 +4991,7 @@ ac_cs_config_escaped=`printf "%s\n" "$ac_cs_config" | sed "s/^ //; s/'/'\\\\\\\\
cat >>$CONFIG_STATUS <<_ACEOF || ac_write_fail=1
ac_cs_config='$ac_cs_config_escaped'
ac_cs_version="\\
pdfio config.status 1.3.0
pdfio config.status 1.4.0
configured by $0, generated by GNU Autoconf 2.71,
with options \\"\$ac_cs_config\\"


@ -21,7 +21,7 @@ AC_PREREQ([2.70])
dnl Package name and version...
AC_INIT([pdfio], [1.3.0], [https://github.com/michaelrsweet/pdfio/issues], [pdfio], [https://www.msweet.org/pdfio])
AC_INIT([pdfio], [1.4.0], [https://github.com/michaelrsweet/pdfio/issues], [pdfio], [https://www.msweet.org/pdfio])
PDFIO_VERSION="AC_PACKAGE_VERSION"
PDFIO_VERSION_MAJOR="`echo AC_PACKAGE_VERSION | awk -F. '{print $1}'`"


@ -1,4 +1,4 @@
.TH pdfio 3 "pdf read/write library" "2024-06-24" "pdf read/write library"
.TH pdfio 3 "pdf read/write library" "2024-10-25" "pdf read/write library"
.SH NAME
pdfio \- pdf read/write library
.SH Introduction
@ -138,6 +138,121 @@ PDFio also provides PDF content helper functions for producing PDF content that
#include <pdfio\-content.h>
.fi
.SS Understanding PDF Files
.PP
A PDF file provides data and commands for displaying pages of graphics and text, and is structured in a way that allows it to be displayed in the same way across multiple devices and platforms. The following is a PDF which shows "Hello, World!" on one page:
.nf
%PDF\-1.0 % Header starts here
%âãÏÓ
1 0 obj % Body starts here
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
2 0 obj
<<
/Rotate 0
/Parent 1 0 R
/Resources 3 0 R
/MediaBox [0 0 612 792]
/Contents [4 0 R]/Type /Page
>>
endobj
3 0 obj
<<
/Font
<<
/F0
<<
/BaseFont /Times\-Italic
/Subtype /Type1
/Type /Font
>>
>>
>>
endobj
4 0 obj
<<
/Length 65
>>
stream
1. 0. 0. 1. 50. 700. cm
BT
/F0 36. Tf
(Hello, World!) Tj
ET
endstream
endobj
5 0 obj
<<
/Pages 1 0 R
/Type /Catalog
>>
endobj
xref % Cross\-reference table starts here
0 6
0000000000 65535 f
0000000015 00000 n
0000000074 00000 n
0000000192 00000 n
0000000291 00000 n
0000000409 00000 n
trailer % Trailer starts here
<<
/Root 5 0 R
/Size 6
>>
startxref
459
%%EOF
.fi
.PP
Header
.PP
The header is the first line of a PDF file that specifies the version of the PDF format that has been used, for example %PDF\-1.0\.
.PP
Since PDF files almost always contain binary data, they can become corrupted if line endings are changed, for example if the file is transferred using FTP in text mode or is edited in Notepad on Windows. To allow legacy file transfer programs to determine that the file is binary, the PDF standard recommends including some bytes with character codes higher than 127 in the header, for example:
.nf
%âãÏÓ
.fi
.PP
The percent sign indicates a comment line while the other few bytes are arbitrary character codes in excess of 127. So, the whole header in our example is:
.nf
%PDF\-1.0
%âãÏÓ
.fi
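The header rules above can be checked with a few lines of plain C. The following is a minimal sketch, not PDFio code: parse_pdf_header and has_binary_marker are hypothetical helper names, and the sketch assumes a single\-digit major and minor version and that the binary marker, if present, is the comment on the second line.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): check that a buffer starts with a
// "%PDF-M.N" header and report the major and minor version digits.
bool
parse_pdf_header(const char *buf, size_t len, int *major, int *minor)
{
  assert(buf != NULL && major != NULL && minor != NULL);

  // "%PDF-" plus "M.N" needs at least 8 bytes
  if (len < 8 || memcmp(buf, "%PDF-", 5) != 0)
    return (false);

  if (buf[5] < '0' || buf[5] > '9' || buf[6] != '.' || buf[7] < '0' || buf[7] > '9')
    return (false);

  *major = buf[5] - '0';
  *minor = buf[7] - '0';

  return (true);
}

// Hypothetical helper (not part of PDFio): return true if the second line is
// a comment that contains at least one byte with a value above 127.
bool
has_binary_marker(const char *buf, size_t len)
{
  size_t i = 0;

  assert(buf != NULL);

  // Skip the first line and its line ending (CR, LF, or CR LF)
  while (i < len && buf[i] != '\n' && buf[i] != '\r')
    i ++;
  while (i < len && (buf[i] == '\n' || buf[i] == '\r'))
    i ++;

  // The marker line must be a comment
  if (i >= len || buf[i] != '%')
    return (false);

  // Look for a byte above 127 before the end of the comment line
  for (i ++; i < len && buf[i] != '\n' && buf[i] != '\r'; i ++)
  {
    if ((unsigned char)buf[i] > 127)
      return (true);
  }

  return (false);
}
```

A caller would typically read the first few dozen bytes of the file into a buffer and pass that buffer and its length to both functions. A missing marker does not make a file invalid, since the standard only recommends it.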
.PP
Body
.PP
The file body consists of a sequence of objects, each preceded by an object number, generation number, and the obj keyword on one line, and followed by the endobj keyword on another. For example:
.nf
1 0 obj
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
.fi
.PP
In this example, the object number is 1 and the generation number is 0, meaning it is the first version of the object. The content for object 1 is between the initial 1 0 obj and trailing endobj lines. In this case, the content is the dictionary <</Kids [2 0 R] /Count 1 /Type /Pages>>\.
.PP
Cross\-Reference Table
.PP
The cross\-reference table lists the byte offset of each object in the file body. This allows random access to objects, meaning they don't have to be read in order. Objects that are not used are never read, making the process efficient. Operations like counting the number of pages in a PDF document are fast, even in large files.
.PP
Each object has an object number and a generation number. Generation numbers are used when a cross\-reference table entry is reused. For simplicity, we will assume generation numbers to be always zero and ignore them. The cross\-reference table consists of a header line that indicates the number of entries, a free entry line for object 0, and a line for each of the objects in the file body. For example:
.nf
0 6 % Six entries in table, starting at 0
0000000000 65535 f % Free entry for object 0
0000000015 00000 n % Object 1 is at byte offset 15
0000000074 00000 n % Object 2 is at byte offset 74
0000000192 00000 n % etc...
0000000291 00000 n
0000000409 00000 n % Object 5 is at byte offset 409
.fi
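Each entry line in the table above has a fixed layout: a 10\-digit byte offset, a 5\-digit generation number, and a one\-letter type. As a rough illustration, the sketch below parses one such line with sscanf. The xref_entry_t type and parse_xref_entry function are hypothetical names made up for this example and are not part of the PDFio API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

// Hypothetical type (not a PDFio type) describing one cross-reference entry
typedef struct xref_entry_s
{
  unsigned long long offset;     // Byte offset of the object ('n' entries)
  unsigned           generation; // Generation number
  bool               in_use;     // true for 'n' entries, false for 'f' entries
} xref_entry_t;

// Hypothetical helper: parse a line such as "0000000015 00000 n"
bool
parse_xref_entry(const char *line, xref_entry_t *entry)
{
  unsigned long long offset;     // Parsed offset
  unsigned           generation; // Parsed generation number
  char               type;       // Parsed entry type

  assert(line != NULL && entry != NULL);

  // Offset is at most 10 digits, generation at most 5, then 'n' or 'f'
  if (sscanf(line, "%10llu %5u %c", &offset, &generation, &type) != 3)
    return (false);

  if (type != 'n' && type != 'f')
    return (false);

  entry->offset     = offset;
  entry->generation = generation;
  entry->in_use     = type == 'n';

  return (true);
}
```

With the example table, the line for object 1 yields an in\-use entry at offset 15 with generation 0, while the line for object 0 yields a free entry with generation 65535.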
.PP
Trailer
.PP
The first line of the trailer is just the trailer keyword. This is followed by the trailer dictionary which contains at least the /Size entry specifying the number of entries in the cross\-reference table and the /Root entry which references the object for the document catalog which is the root element of the graph of objects in the body.
.PP
There follows a line with just the startxref keyword, a line with a single number specifying the byte offset of the start of the cross\-reference table within the file, and then the line %%EOF which signals the end of the PDF file.
.nf
trailer % Trailer keyword
<< % The trailer dictionary
/Root 5 0 R
/Size 6
>>
startxref % startxref keyword
459 % Byte offset of cross\-reference table
%%EOF % End\-of\-file marker
.fi
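A reader usually starts at the end of the file: it finds the startxref keyword, reads the byte offset that follows, and seeks there to load the cross\-reference table. The sketch below shows only the first two steps on an in\-memory copy of the end of the file. The function name find_startxref is hypothetical, and the sketch assumes the buffer is NUL\-terminated and contains no embedded NUL bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): return the byte offset that
// follows the last "startxref" keyword in a NUL-terminated copy of the end of
// a PDF file, or -1 if the keyword or the number is missing.
long
find_startxref(const char *tail)
{
  const char *p;           // Current match
  const char *last = NULL; // Last match found
  char       *end;         // End of the parsed number
  long       offset;       // Parsed offset

  assert(tail != NULL);

  // Use the last occurrence of the keyword in the buffer
  for (p = strstr(tail, "startxref"); p != NULL; p = strstr(p + 9, "startxref"))
    last = p;

  if (last == NULL)
    return (-1);

  // strtol skips the line ending between the keyword and the number
  offset = strtol(last + 9, &end, 10);

  if (end == last + 9 || offset < 0)
    return (-1);

  return (offset);
}
```

For the example trailer this returns 459, the offset of the xref keyword in the sample file.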
.SH API Overview
.PP
PDFio exposes several types:
@ -218,7 +333,90 @@ Each PDF file contains one or more pages. The pdfioFileGetNumPages function retu
}
.fi
.PP
Each page is represented by a "page tree" object (what pdfioFileGetPage returns) that specifies information about the page and one or more "content" objects that contain the images, fonts, text, and graphics that appear on the page. Use the pdfioPageGetNumStreams and pdfioPageOpenStream functions to access the content streams for each page.
Each page is represented by a "page tree" object (what pdfioFileGetPage returns) that specifies information about the page and one or more "content" objects that contain the images, fonts, text, and graphics that appear on the page. Use the pdfioPageGetNumStreams and pdfioPageOpenStream functions to access the content streams for each page, and pdfioObjGetDict to get the associated page object dictionary. For example, if you want to display the media and crop boxes for a given page:
.nf
pdfio_file_t *pdf; // PDF file
size_t i; // Looping var
size_t count; // Number of pages
pdfio_obj_t *page; // Current page
pdfio_dict_t *dict; // Current page dictionary
pdfio_array_t *media_box; // MediaBox array
double media_values[4]; // MediaBox values
pdfio_array_t *crop_box; // CropBox array
double crop_values[4]; // CropBox values
// Iterate the pages in the PDF file
for (i = 0, count = pdfioFileGetNumPages(pdf); i < count; i ++)
{
page = pdfioFileGetPage(pdf, i);
dict = pdfioObjGetDict(page);
media_box = pdfioDictGetArray(dict, "MediaBox");
media_values[0] = pdfioArrayGetNumber(media_box, 0);
media_values[1] = pdfioArrayGetNumber(media_box, 1);
media_values[2] = pdfioArrayGetNumber(media_box, 2);
media_values[3] = pdfioArrayGetNumber(media_box, 3);
crop_box = pdfioDictGetArray(dict, "CropBox");
crop_values[0] = pdfioArrayGetNumber(crop_box, 0);
crop_values[1] = pdfioArrayGetNumber(crop_box, 1);
crop_values[2] = pdfioArrayGetNumber(crop_box, 2);
crop_values[3] = pdfioArrayGetNumber(crop_box, 3);
printf("Page %u: MediaBox=[%g %g %g %g], CropBox=[%g %g %g %g]\\n",
(unsigned)(i + 1),
media_values[0], media_values[1], media_values[2], media_values[3],
crop_values[0], crop_values[1], crop_values[2], crop_values[3]);
}
.fi
.PP
Page object dictionaries have several (mostly optional) key/value pairs, including:
.IP \(bu 5
.PP
"Annots": An array of annotation dictionaries for the page; use pdfioDictGetArray to get the array
.IP \(bu 5
.PP
"CropBox": The crop box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use pdfioDictGetArray to get a pointer to the array of numbers
.IP \(bu 5
.PP
"Dur": The number of seconds the page should be displayed; use pdfioDictGetNumber to get the page duration value
.IP \(bu 5
.PP
"Group": The dictionary of transparency group values for the page; use pdfioDictGetDict to get a pointer to the group dictionary
.IP \(bu 5
.PP
"LastModified": The date and time when this page was last modified; use pdfioDictGetDate to get the Unix time_t value
.IP \(bu 5
.PP
"Parent": The parent page tree node object for this page; use pdfioDictGetObj to get a pointer to the object
.IP \(bu 5
.PP
"MediaBox": The media box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use pdfioDictGetArray to get a pointer to the array of numbers
.IP \(bu 5
.PP
"Resources": The dictionary of resources for the page; use pdfioDictGetDict to get a pointer to the resources dictionary
.IP \(bu 5
.PP
"Rotate": A number indicating the number of degrees of clockwise rotation to apply to the page when viewing; use pdfioDictGetNumber to get the rotation angle
.IP \(bu 5
.PP
"Thumb": A thumbnail image object for the page; use pdfioDictGetObj to get a pointer to the thumbnail image object
.IP \(bu 5
.PP
"Trans": The page transition dictionary; use pdfioDictGetDict to get a pointer to the dictionary
.PP
The pdfioFileClose function closes a PDF file and frees all memory that was used for it:
.nf
@ -361,7 +559,7 @@ pdfioStreamWrite writes a buffer of data to the stream
.PP
The PDF content helper functions provide additional functions for writing specific PDF page stream commands.
.PP
When you are done writing the stream, call pdfioStreamCLose to close both the stream and the object.
When you are done writing the stream, call pdfioStreamClose to close both the stream and the object.
.SS PDF Content Helper Functions
.PP
PDFio includes many helper functions for embedding or writing specific kinds of content to a PDF file. These functions can be roughly grouped into five categories:
@ -787,6 +985,125 @@ pdfioContentTextShowf draws a formatted string in a text block
pdfioContentTextShowJustified draws an array of literal strings with offsets between them
.SH Examples
.SS Read PDF Metadata
.PP
The following example function will open a PDF file and print the title, author, creation date, and number of pages:
.nf
#include <pdfio.h>
#include <time.h>
void
show_pdf_info(const char *filename)
{
pdfio_file_t *pdf;
time_t creation_date;
struct tm *creation_tm;
char creation_text[256];
// Open the PDF file with the default callbacks...
pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_cbdata*/NULL, /*error_cb*/NULL, /*error_cbdata*/NULL);
if (pdf == NULL)
return;
// Get the creation date and convert to a string...
creation_date = pdfioFileGetCreationDate(pdf);
creation_tm = localtime(&creation_date);
strftime(creation_text, sizeof(creation_text), "%c", creation_tm);
// Print file information to stdout...
printf("%s:\\n", filename);
printf(" Title: %s\\n", pdfioFileGetTitle(pdf));
printf(" Author: %s\\n", pdfioFileGetAuthor(pdf));
printf(" Created On: %s\\n", creation_text);
printf(" Number Pages: %u\\n", (unsigned)pdfioFileGetNumPages(pdf));
// Close the PDF file...
pdfioFileClose(pdf);
}
.fi
.SS Create PDF File With Text and Image
.PP
The following example function will create a PDF file, embed a base font and the named JPEG or PNG image file, and then create a page with the image centered on the page and the caption text centered below it:
.nf
#include <pdfio.h>
#include <pdfio\-content.h>
#include <string.h>
void
create_pdf_image_file(const char *pdfname, const char *imagename, const char *caption)
{
pdfio_file_t *pdf;
pdfio_obj_t *font;
pdfio_obj_t *image;
pdfio_dict_t *dict;
pdfio_stream_t *page;
double width, height;
double swidth, sheight;
double tx, ty;
// Create the PDF file...
pdf = pdfioFileCreate(pdfname, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_cbdata*/NULL);
// Create a Courier base font for the caption
font = pdfioFileCreateFontObjFromBase(pdf, "Courier");
// Create an image object from the JPEG/PNG image file...
image = pdfioFileCreateImageObjFromFile(pdf, imagename, true);
// Create a page dictionary with the font and image...
dict = pdfioDictCreate(pdf);
pdfioPageDictAddFont(dict, "F1", font);
pdfioPageDictAddImage(dict, "IM1", image);
// Create the page and its content stream...
page = pdfioFileCreatePage(pdf, dict);
// Position and scale the image on the page...
width = pdfioImageGetWidth(image);
height = pdfioImageGetHeight(image);
// Default media_box is "universal" 595.28x792 points (8.27x11in or 210x279mm)
// Use margins of 36 points (0.5in or 12.7mm) with another 36 points for the
// caption underneath...
swidth = 595.28 \- 72.0;
sheight = swidth * height / width;
if (sheight > (792.0 \- 36.0 \- 72.0))
{
sheight = 792.0 \- 36.0 \- 72.0;
swidth = sheight * width / height;
}
tx = 0.5 * (595.28 \- swidth);
ty = 0.5 * (792 \- 36 \- sheight);
pdfioContentDrawImage(page, "IM1", tx, ty + 36.0, swidth, sheight);
// Draw the caption in black...
pdfioContentSetFillColorDeviceGray(page, 0.0);
// Compute the starting point for the text \- Courier is monospaced with a
// nominal width of 0.6 times the text height...
tx = 0.5 * (595.28 \- 18.0 * 0.6 * strlen(caption));
// Position and draw the caption underneath...
pdfioContentTextBegin(page);
pdfioContentSetTextFont(page, "F1", 18.0);
pdfioContentTextMoveTo(page, tx, ty);
pdfioContentTextShow(page, /*unicode*/false, caption);
pdfioContentTextEnd(page);
// Close the page stream and the PDF file...
pdfioStreamClose(page);
pdfioFileClose(pdf);
}
.fi
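The image scaling step in the example above can be pulled out into a small function that is easy to check in isolation. This is only a sketch of the same arithmetic. The name fit_image is hypothetical and not a PDFio function, and the box passed in corresponds to the 523.28 by 684 point area the example computes from the default media box and margins.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical helper (not part of PDFio): scale an image to fit inside a box
// of max_width by max_height while preserving its aspect ratio.
void
fit_image(double width, double height, double max_width, double max_height,
          double *swidth, double *sheight)
{
  assert(width > 0.0 && height > 0.0 && swidth != NULL && sheight != NULL);

  // Start by filling the available width...
  *swidth  = max_width;
  *sheight = max_width * height / width;

  // ...and fall back to filling the available height if that is too tall
  if (*sheight > max_height)
  {
    *sheight = max_height;
    *swidth  = max_height * width / height;
  }
}
```

A 1000x500 image fills the full 523.28 point width, while a 500x1000 image is limited by the 684 point height and ends up 342 points wide.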
.SH ENUMERATIONS
.SS pdfio_cs_e
@ -1285,6 +1602,15 @@ pdfio_valtype_t pdfioArrayGetType (
size_t n
);
.fi
.SS pdfioArrayRemove
Remove an array entry.
.PP
.nf
bool pdfioArrayRemove (
pdfio_array_t *a,
size_t n
);
.fi
.SS pdfioContentClip
Clip output to the current path.
.PP
@ -1865,6 +2191,15 @@ bool pdfioContentTextShowf (
...
);
.fi
.SS pdfioDictClear
Remove a key/value pair from a dictionary.
.PP
.nf
bool pdfioDictClear (
pdfio_dict_t *dict,
const char *key
);
.fi
.SS pdfioDictCopy
Copy a dictionary to a PDF file.
.PP
@ -1928,6 +2263,15 @@ pdfio_dict_t * pdfioDictGetDict (
const char *key
);
.fi
.SS pdfioDictGetKey
Get the key for the specified pair.
.PP
.nf
const char * pdfioDictGetKey (
pdfio_dict_t *dict,
size_t n
);
.fi
.SS pdfioDictGetName
Get a key name value from a dictionary.
.PP
@ -1937,6 +2281,14 @@ const char * pdfioDictGetName (
const char *key
);
.fi
.SS pdfioDictGetNumPairs
Get the number of key/value pairs in a dictionary.
.PP
.nf
size_t pdfioDictGetNumPairs (
pdfio_dict_t *dict
);
.fi
.SS pdfioDictGetNumber
Get a key number value from a dictionary.
.PP
@ -2298,6 +2650,18 @@ Note: Currently PNG support is limited to grayscale, RGB, or indexed files
without interlacing or alpha. Transparency (masking) based on color/index
.IP 5
is supported.
.SS pdfioFileCreateNameObj
Create a new object in a PDF file containing a name.
.PP
.nf
pdfio_obj_t * pdfioFileCreateNameObj (
pdfio_file_t *pdf,
const char *name
);
.fi
.PP
This function creates a new object with a name value in a PDF file.
You must call \fIpdfioObjClose\fR to write the object to the file.
.SS pdfioFileCreateNumberObj
Create a new object in a PDF file containing a number.
.PP
@ -2734,6 +3098,14 @@ size_t pdfioObjGetLength (
pdfio_obj_t *obj
);
.fi
.SS pdfioObjGetName
Get the name value associated with an object.
.PP
.nf
const char * pdfioObjGetName (
pdfio_obj_t *obj
);
.fi
.SS pdfioObjGetNumber
Get the object's number.
.PP
@ -2750,6 +3122,29 @@ const char * pdfioObjGetSubtype (
pdfio_obj_t *obj
);
.fi
.PP
This function returns an object's PDF subtype name, if any. Common subtype
names include:
.PP
.IP \(bu 5
"CIDFontType0": A CID Type0 font
.IP \(bu 5
"CIDFontType2": A CID TrueType font
.IP \(bu 5
"Image": An image or image mask
.IP \(bu 5
"Form": A fillable form
.IP \(bu 5
"OpenType": An OpenType font
.IP \(bu 5
"Type0": A composite font
.IP \(bu 5
"Type1": A PostScript Type1 font
.IP \(bu 5
"Type3": A PDF Type3 font
.IP \(bu 5
"TrueType": A TrueType font
.SS pdfioObjGetType
Get an object's type.
.PP
@ -2758,6 +3153,27 @@ const char * pdfioObjGetType (
pdfio_obj_t *obj
);
.fi
.PP
This function returns an object's PDF type name, if any. Common type names
include:
.PP
.IP \(bu 5
"CMap": A character map for composite fonts
.IP \(bu 5
"Font": An embedded font (\fIpdfioObjGetSubtype\fR will tell you the
font format)
.IP \(bu 5
"FontDescriptor": A font descriptor
.IP \(bu 5
"Page": A (visible) page
.IP \(bu 5
"Pages": A page tree node
.IP \(bu 5
"Template": An invisible template page
.IP \(bu 5
"XObject": An image, image mask, or form (\fIpdfioObjGetSubtype\fR will
tell you which)
.SS pdfioObjOpenStream
Open an object's (data) stream for reading.
.PP


@ -1,13 +1,13 @@
<!DOCTYPE html>
<html lang="en-US">
<head>
<title>PDFio Programming Manual v1.3.0</title>
<title>PDFio Programming Manual v1.4.0</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<meta name="generator" content="codedoc v3.7">
<meta name="author" content="Michael R Sweet">
<meta name="language" content="en-US">
<meta name="copyright" content="Copyright © 2021-2024 by Michael R Sweet">
<meta name="version" content="1.3.0">
<meta name="version" content="1.4.0">
<style type="text/css"><!--
body {
background: white;
@ -251,7 +251,7 @@ span.string {
<body>
<div class="header">
<p><img class="title" src="pdfio-512.png"></p>
<h1 class="title">PDFio Programming Manual v1.3.0</h1>
<h1 class="title">PDFio Programming Manual v1.4.0</h1>
<p>Michael R Sweet</p>
<p>Copyright © 2021-2024 by Michael R Sweet</p>
</div>
@ -265,6 +265,7 @@ span.string {
<li><a href="#xcode-project">Xcode Project</a></li>
<li><a href="#detecting-pdfio">Detecting PDFio</a></li>
<li><a href="#header-files">Header Files</a></li>
<li><a href="#understanding-pdf-files">Understanding PDF Files</a></li>
</ul></li>
<li><a href="#api-overview">API Overview</a><ul class="subcontents">
<li><a href="#reading-pdf-files">Reading PDF Files</a></li>
@ -273,6 +274,10 @@ span.string {
<li><a href="#pdf-streams">PDF Streams</a></li>
<li><a href="#pdf-content-helper-functions">PDF Content Helper Functions</a></li>
</ul></li>
<li><a href="#examples">Examples</a><ul class="subcontents">
<li><a href="#read-pdf-metadata">Read PDF Metadata</a></li>
<li><a href="#create-pdf-file-with-text-and-image">Create PDF File With Text and Image</a></li>
</ul></li>
<li><a href="#FUNCTIONS">Functions</a><ul class="subcontents">
<li><a href="#pdfioArrayAppendArray">pdfioArrayAppendArray</a></li>
<li><a href="#pdfioArrayAppendBinary">pdfioArrayAppendBinary</a></li>
@ -301,6 +306,7 @@ span.string {
<li><a href="#pdfioArrayGetSize">pdfioArrayGetSize</a></li>
<li><a href="#pdfioArrayGetString">pdfioArrayGetString</a></li>
<li><a href="#pdfioArrayGetType">pdfioArrayGetType</a></li>
<li><a href="#pdfioArrayRemove">pdfioArrayRemove</a></li>
<li><a href="#pdfioContentClip">pdfioContentClip</a></li>
<li><a href="#pdfioContentDrawImage">pdfioContentDrawImage</a></li>
<li><a href="#pdfioContentFill">pdfioContentFill</a></li>
@ -357,6 +363,7 @@ span.string {
<li><a href="#pdfioContentTextShow">pdfioContentTextShow</a></li>
<li><a href="#pdfioContentTextShowJustified">pdfioContentTextShowJustified</a></li>
<li><a href="#pdfioContentTextShowf">pdfioContentTextShowf</a></li>
<li><a href="#pdfioDictClear">pdfioDictClear</a></li>
<li><a href="#pdfioDictCopy">pdfioDictCopy</a></li>
<li><a href="#pdfioDictCreate">pdfioDictCreate</a></li>
<li><a href="#pdfioDictGetArray">pdfioDictGetArray</a></li>
@ -364,7 +371,9 @@ span.string {
<li><a href="#pdfioDictGetBoolean">pdfioDictGetBoolean</a></li>
<li><a href="#pdfioDictGetDate">pdfioDictGetDate</a></li>
<li><a href="#pdfioDictGetDict">pdfioDictGetDict</a></li>
<li><a href="#pdfioDictGetKey">pdfioDictGetKey</a></li>
<li><a href="#pdfioDictGetName">pdfioDictGetName</a></li>
<li><a href="#pdfioDictGetNumPairs">pdfioDictGetNumPairs</a></li>
<li><a href="#pdfioDictGetNumber">pdfioDictGetNumber</a></li>
<li><a href="#pdfioDictGetObj">pdfioDictGetObj</a></li>
<li><a href="#pdfioDictGetRect">pdfioDictGetRect</a></li>
@ -391,6 +400,7 @@ span.string {
<li><a href="#pdfioFileCreateICCObjFromFile">pdfioFileCreateICCObjFromFile</a></li>
<li><a href="#pdfioFileCreateImageObjFromData">pdfioFileCreateImageObjFromData</a></li>
<li><a href="#pdfioFileCreateImageObjFromFile">pdfioFileCreateImageObjFromFile</a></li>
<li><a href="#pdfioFileCreateNameObj">pdfioFileCreateNameObj</a></li>
<li><a href="#pdfioFileCreateNumberObj">pdfioFileCreateNumberObj</a></li>
<li><a href="#pdfioFileCreateObj">pdfioFileCreateObj</a></li>
<li><a href="#pdfioFileCreateOutput">pdfioFileCreateOutput</a></li>
@ -432,6 +442,7 @@ span.string {
<li><a href="#pdfioObjGetDict">pdfioObjGetDict</a></li>
<li><a href="#pdfioObjGetGeneration">pdfioObjGetGeneration</a></li>
<li><a href="#pdfioObjGetLength">pdfioObjGetLength</a></li>
<li><a href="#pdfioObjGetName">pdfioObjGetName</a></li>
<li><a href="#pdfioObjGetNumber">pdfioObjGetNumber</a></li>
<li><a href="#pdfioObjGetSubtype">pdfioObjGetSubtype</a></li>
<li><a href="#pdfioObjGetType">pdfioObjGetType</a></li>
@ -568,6 +579,104 @@ LIBS += `pkg-config --libs pdfio`
<p>PDFio also provides <a href="#pdf-content-helper-functions">PDF content helper functions</a> for producing PDF content that are defined in a separate header file:</p>
<pre><code class="language-c"><span class="directive">#include &lt;pdfio-content.h&gt;</span>
</code></pre>
<h3 class="title" id="understanding-pdf-files">Understanding PDF Files</h3>
<p>A PDF file provides data and commands for displaying pages of graphics and text, and is structured in a way that allows it to be displayed in the same way across multiple devices and platforms. The following is a PDF which shows &quot;Hello, World!&quot; on one page:</p>
<pre><code>%PDF-1.0 % Header starts here
%âãÏÓ
1 0 obj % Body starts here
&lt;&lt;
/Kids [2 0 R]
/Count 1
/Type /Pages
&gt;&gt;
endobj
2 0 obj
&lt;&lt;
/Rotate 0
/Parent 1 0 R
/Resources 3 0 R
/MediaBox [0 0 612 792]
/Contents [4 0 R]/Type /Page
&gt;&gt;
endobj
3 0 obj
&lt;&lt;
/Font
&lt;&lt;
/F0
&lt;&lt;
/BaseFont /Times-Italic
/Subtype /Type1
/Type /Font
&gt;&gt;
&gt;&gt;
&gt;&gt;
endobj
4 0 obj
&lt;&lt;
/Length 65
&gt;&gt;
stream
1. 0. 0. 1. 50. 700. cm
BT
/F0 36. Tf
(Hello, World!) Tj
ET
endstream
endobj
5 0 obj
&lt;&lt;
/Pages 1 0 R
/Type /Catalog
&gt;&gt;
endobj
xref % Cross-reference table starts here
0 6
0000000000 65535 f
0000000015 00000 n
0000000074 00000 n
0000000192 00000 n
0000000291 00000 n
0000000409 00000 n
trailer % Trailer starts here
&lt;&lt;
/Root 5 0 R
/Size 6
&gt;&gt;
startxref
459
%%EOF
</code></pre>
<h4 id="header">Header</h4>
<p>The header is the first line of a PDF file that specifies the version of the PDF format that has been used, for example <code>%PDF-1.0</code>.</p>
<p>Since PDF files almost always contain binary data, they can become corrupted if line endings are changed, for example when the file is transferred using FTP in text mode or is edited in Notepad on Windows. To allow legacy file transfer programs to determine that the file is binary, the PDF standard recommends including some bytes with character codes higher than 127 in the header, for example:</p>
<pre><code>%âãÏÓ
</code></pre>
<p>The percent sign indicates a comment line while the other few bytes are arbitrary character codes in excess of 127. So, the whole header in our example is:</p>
<pre><code>%PDF-1.0
%âãÏÓ
</code></pre>
<h4 id="body">Body</h4>
<p>The file body consists of a sequence of objects, each preceded by an object number, generation number, and the obj keyword on one line, and followed by the endobj keyword on another. For example:</p>
<pre><code>1 0 obj
&lt;&lt;
/Kids [2 0 R]
/Count 1
/Type /Pages
&gt;&gt;
endobj
</code></pre>
<p>In this example, the object number is 1 and the generation number is 0, meaning it is the first version of the object. The content for object 1 is between the initial <code>1 0 obj</code> and trailing <code>endobj</code> lines. In this case, the content is the dictionary <code>&lt;&lt;/Kids [2 0 R] /Count 1 /Type /Pages&gt;&gt;</code>.</p>
<h4 id="cross-reference-table">Cross-Reference Table</h4>
<p>The cross-reference table lists the byte offset of each object in the file body. This allows random access to objects, meaning they don't have to be read in order. Objects that are not used are never read, making the process efficient. Operations like counting the number of pages in a PDF document are fast, even in large files.</p>
<p>Each object has an object number and a generation number. Generation numbers are used when a cross-reference table entry is reused. For simplicity, we will assume generation numbers to be always zero and ignore them. The cross-reference table consists of a header line that indicates the number of entries, a free entry line for object 0, and a line for each of the objects in the file body. For example:</p>
<pre><code>0 6 % Six entries in table, starting at 0
0000000000 65535 f % Free entry for object 0
0000000015 00000 n % Object 1 is at byte offset 15
0000000074 00000 n % Object 2 is at byte offset 74
0000000192 00000 n % etc...
0000000291 00000 n
0000000409 00000 n % Object 5 is at byte offset 409
</code></pre>
<h4 id="trailer">Trailer</h4>
<p>The first line of the trailer is just the <code>trailer</code> keyword. This is followed by the trailer dictionary, which contains at least the <code>/Size</code> entry, specifying the number of entries in the cross-reference table, and the <code>/Root</code> entry, which references the document catalog object, the root of the graph of objects in the body.</p>
<p>There follows a line with just the <code>startxref</code> keyword, a line with a single number specifying the byte offset of the start of the cross-reference table within the file, and then the line <code>%%EOF</code> which signals the end of the PDF file.</p>
<pre><code>trailer % Trailer keyword
&lt;&lt; % The trailer dictionary
/Root 5 0 R
/Size 6
&gt;&gt;
startxref % startxref keyword
459 % Byte offset of cross-reference table
%%EOF % End-of-file marker
</code></pre>
<h2 class="title" id="api-overview">API Overview</h2>
<p>PDFio exposes several types:</p>
<ul>
@ -624,7 +733,66 @@ pdfio_obj_t *page; <span class="comment">// Current page</span>
<span class="comment">// do something with page</span>
}
</code></pre>
<p>Each page is represented by a &quot;page tree&quot; object (what <a href="#pdfioFileGetPage"><code>pdfioFileGetPage</code></a> returns) that specifies information about the page and one or more &quot;content&quot; objects that contain the images, fonts, text, and graphics that appear on the page. Use the <a href="#pdfioPageGetNumStreams"><code>pdfioPageGetNumStreams</code></a> and <a href="#pdfioPageOpenStream"><code>pdfioPageOpenStream</code></a> functions to access the content streams for each page.</p>
<p>Each page is represented by a &quot;page tree&quot; object (what <a href="#pdfioFileGetPage"><code>pdfioFileGetPage</code></a> returns) that specifies information about the page and one or more &quot;content&quot; objects that contain the images, fonts, text, and graphics that appear on the page. Use the <a href="#pdfioPageGetNumStreams"><code>pdfioPageGetNumStreams</code></a> and <a href="#pdfioPageOpenStream"><code>pdfioPageOpenStream</code></a> functions to access the content streams for each page, and <a href="#pdfioObjGetDict"><code>pdfioObjGetDict</code></a> to get the associated page object dictionary. For example, if you want to display the media and crop boxes for a given page:</p>
<pre><code class="language-c">pdfio_file_t *pdf; <span class="comment">// PDF file</span>
size_t i; <span class="comment">// Looping var</span>
size_t count; <span class="comment">// Number of pages</span>
pdfio_obj_t *page; <span class="comment">// Current page</span>
pdfio_dict_t *dict; <span class="comment">// Current page dictionary</span>
pdfio_array_t *media_box; <span class="comment">// MediaBox array</span>
<span class="reserved">double</span> media_values[<span class="number">4</span>]; <span class="comment">// MediaBox values</span>
pdfio_array_t *crop_box; <span class="comment">// CropBox array</span>
<span class="reserved">double</span> crop_values[<span class="number">4</span>]; <span class="comment">// CropBox values</span>
<span class="comment">// Iterate the pages in the PDF file</span>
<span class="reserved">for</span> (i = <span class="number">0</span>, count = pdfioFileGetNumPages(pdf); i &lt; count; i ++)
{
page = pdfioFileGetPage(pdf, i);
dict = pdfioObjGetDict(page);
media_box = pdfioDictGetArray(dict, <span class="string">&quot;MediaBox&quot;</span>);
media_values[<span class="number">0</span>] = pdfioArrayGetNumber(media_box, <span class="number">0</span>);
media_values[<span class="number">1</span>] = pdfioArrayGetNumber(media_box, <span class="number">1</span>);
media_values[<span class="number">2</span>] = pdfioArrayGetNumber(media_box, <span class="number">2</span>);
media_values[<span class="number">3</span>] = pdfioArrayGetNumber(media_box, <span class="number">3</span>);
crop_box = pdfioDictGetArray(dict, <span class="string">&quot;CropBox&quot;</span>);
crop_values[<span class="number">0</span>] = pdfioArrayGetNumber(crop_box, <span class="number">0</span>);
crop_values[<span class="number">1</span>] = pdfioArrayGetNumber(crop_box, <span class="number">1</span>);
crop_values[<span class="number">2</span>] = pdfioArrayGetNumber(crop_box, <span class="number">2</span>);
crop_values[<span class="number">3</span>] = pdfioArrayGetNumber(crop_box, <span class="number">3</span>);
printf(<span class="string">&quot;Page %u: MediaBox=[%g %g %g %g], CropBox=[%g %g %g %g]\n&quot;</span>,
(<span class="reserved">unsigned</span>)(i + <span class="number">1</span>),
media_values[<span class="number">0</span>], media_values[<span class="number">1</span>], media_values[<span class="number">2</span>], media_values[<span class="number">3</span>],
crop_values[<span class="number">0</span>], crop_values[<span class="number">1</span>], crop_values[<span class="number">2</span>], crop_values[<span class="number">3</span>]);
}
</code></pre>
<p>Page object dictionaries have several (mostly optional) key/value pairs, including:</p>
<ul>
<li><p>&quot;Annots&quot;: An array of annotation dictionaries for the page; use <a href="#pdfioDictGetArray"><code>pdfioDictGetArray</code></a> to get the array</p>
</li>
<li><p>&quot;CropBox&quot;: The crop box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use <a href="#pdfioDictGetArray"><code>pdfioDictGetArray</code></a> to get a pointer to the array of numbers</p>
</li>
<li><p>&quot;Dur&quot;: The number of seconds the page should be displayed; use <a href="#pdfioDictGetNumber"><code>pdfioDictGetNumber</code></a> to get the page duration value</p>
</li>
<li><p>&quot;Group&quot;: The dictionary of transparency group values for the page; use <a href="#pdfioDictGetDict"><code>pdfioDictGetDict</code></a> to get a pointer to the transparency group dictionary</p>
</li>
<li><p>&quot;LastModified&quot;: The date and time when this page was last modified; use <a href="#pdfioDictGetDate"><code>pdfioDictGetDate</code></a> to get the Unix <code>time_t</code> value</p>
</li>
<li><p>&quot;Parent&quot;: The parent page tree node object for this page; use <a href="#pdfioDictGetObj"><code>pdfioDictGetObj</code></a> to get a pointer to the object</p>
</li>
<li><p>&quot;MediaBox&quot;: The media box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use <a href="#pdfioDictGetArray"><code>pdfioDictGetArray</code></a> to get a pointer to the array of numbers</p>
</li>
<li><p>&quot;Resources&quot;: The dictionary of resources for the page; use <a href="#pdfioDictGetDict"><code>pdfioDictGetDict</code></a> to get a pointer to the resources dictionary</p>
</li>
<li><p>&quot;Rotate&quot;: The number of degrees of clockwise rotation to apply to the page when viewing; use <a href="#pdfioDictGetNumber"><code>pdfioDictGetNumber</code></a> to get the rotation angle</p>
</li>
<li><p>&quot;Thumb&quot;: A thumbnail image object for the page; use <a href="#pdfioDictGetObj"><code>pdfioDictGetObj</code></a> to get a pointer to the thumbnail image object</p>
</li>
<li><p>&quot;Trans&quot;: The page transition dictionary; use <a href="#pdfioDictGetDict"><code>pdfioDictGetDict</code></a> to get a pointer to the dictionary</p>
</li>
</ul>
<p>The <a href="#pdfioFileClose"><code>pdfioFileClose</code></a> function closes a PDF file and frees all memory that was used for it:</p>
<pre><code class="language-c">pdfioFileClose(pdf);
</code></pre>
@ -709,7 +877,7 @@ pdfio_stream_t *st = pdfioFileCreatePage(pdf, dict);
</li>
</ul>
<p>The <a href="#pdf-content-helper-functions">PDF content helper functions</a> provide additional functions for writing specific PDF page stream commands.</p>
<p>When you are done writing the stream, call <a href="#pdfioStreamCLose"><code>pdfioStreamCLose</code></a> to close both the stream and the object.</p>
<p>When you are done writing the stream, call <a href="#pdfioStreamClose"><code>pdfioStreamClose</code></a> to close both the stream and the object.</p>
<h3 class="title" id="pdf-content-helper-functions">PDF Content Helper Functions</h3>
<p>PDFio includes many helper functions for embedding or writing specific kinds of content to a PDF file. These functions can be roughly grouped into five categories:</p>
<ul>
@ -943,6 +1111,119 @@ pdfio_obj_t *img = pdfioFileCreateImageObjFromFile(pdf, <span class="string">&qu
<li><p><a href="#pdfioContentTextShowJustified"><code>pdfioContentTextShowJustified</code></a> draws an array of literal strings with offsets between them</p>
</li>
</ul>
<h2 class="title" id="examples">Examples</h2>
<h3 class="title" id="read-pdf-metadata">Read PDF Metadata</h3>
<p>The following example function will open a PDF file and print the title, author, creation date, and number of pages:</p>
<pre><code class="language-c"><span class="directive">#include &lt;pdfio.h&gt;</span>
<span class="directive">#include &lt;stdio.h&gt;</span>
<span class="directive">#include &lt;time.h&gt;</span>
<span class="reserved">void</span>
show_pdf_info(<span class="reserved">const</span> <span class="reserved">char</span> *filename)
{
pdfio_file_t *pdf;
time_t creation_date;
<span class="reserved">struct</span> tm *creation_tm;
<span class="reserved">char</span> creation_text[<span class="number">256</span>];
<span class="comment">// Open the PDF file with the default callbacks...</span>
pdf = pdfioFileOpen(filename, <span class="comment">/*password_cb*/</span>NULL, <span class="comment">/*password_cbdata*/</span>NULL, <span class="comment">/*error_cb*/</span>NULL, <span class="comment">/*error_cbdata*/</span>NULL);
<span class="reserved">if</span> (pdf == NULL)
<span class="reserved">return</span>;
<span class="comment">// Get the creation date and convert to a string...</span>
creation_date = pdfioFileGetCreationDate(pdf);
creation_tm = localtime(&amp;creation_date);
strftime(creation_text, <span class="reserved">sizeof</span>(creation_text), <span class="string">&quot;%c&quot;</span>, creation_tm);
<span class="comment">// Print file information to stdout...</span>
printf(<span class="string">&quot;%s:\n&quot;</span>, filename);
printf(<span class="string">&quot; Title: %s\n&quot;</span>, pdfioFileGetTitle(pdf));
printf(<span class="string">&quot; Author: %s\n&quot;</span>, pdfioFileGetAuthor(pdf));
printf(<span class="string">&quot; Created On: %s\n&quot;</span>, creation_text);
printf(<span class="string">&quot; Number Pages: %u\n&quot;</span>, (<span class="reserved">unsigned</span>)pdfioFileGetNumPages(pdf));
<span class="comment">// Close the PDF file...</span>
pdfioFileClose(pdf);
}
</code></pre>
<h3 class="title" id="create-pdf-file-with-text-and-image">Create PDF File With Text and Image</h3>
<p>The following example function will create a PDF file, embed a base font and the named JPEG or PNG image file, and then create a page with the image centered on the page and the caption text centered below it:</p>
<pre><code class="language-c"><span class="directive">#include &lt;pdfio.h&gt;</span>
<span class="directive">#include &lt;pdfio-content.h&gt;</span>
<span class="directive">#include &lt;string.h&gt;</span>
<span class="reserved">void</span>
create_pdf_image_file(<span class="reserved">const</span> <span class="reserved">char</span> *pdfname, <span class="reserved">const</span> <span class="reserved">char</span> *imagename, <span class="reserved">const</span> <span class="reserved">char</span> *caption)
{
pdfio_file_t *pdf;
pdfio_obj_t *font;
pdfio_obj_t *image;
pdfio_dict_t *dict;
pdfio_stream_t *page;
<span class="reserved">double</span> width, height;
<span class="reserved">double</span> swidth, sheight;
<span class="reserved">double</span> tx, ty;
<span class="comment">// Create the PDF file...</span>
pdf = pdfioFileCreate(pdfname, <span class="comment">/*version*/</span>NULL, <span class="comment">/*media_box*/</span>NULL, <span class="comment">/*crop_box*/</span>NULL, <span class="comment">/*error_cb*/</span>NULL, <span class="comment">/*error_cbdata*/</span>NULL);
<span class="comment">// Create a Courier base font for the caption</span>
font = pdfioFileCreateFontObjFromBase(pdf, <span class="string">&quot;Courier&quot;</span>);
<span class="comment">// Create an image object from the JPEG/PNG image file...</span>
image = pdfioFileCreateImageObjFromFile(pdf, imagename, <span class="reserved">true</span>);
<span class="comment">// Create a page dictionary with the font and image...</span>
dict = pdfioDictCreate(pdf);
pdfioPageDictAddFont(dict, <span class="string">&quot;F1&quot;</span>, font);
pdfioPageDictAddImage(dict, <span class="string">&quot;IM1&quot;</span>, image);
<span class="comment">// Create the page and its content stream...</span>
page = pdfioFileCreatePage(pdf, dict);
<span class="comment">// Position and scale the image on the page...</span>
width = pdfioImageGetWidth(image);
height = pdfioImageGetHeight(image);
<span class="comment">// Default media_box is &quot;universal&quot; 595.28x792 points (8.27x11in or 210x279mm)</span>
<span class="comment">// Use margins of 36 points (0.5in or 12.7mm) with another 36 points for the</span>
<span class="comment">// caption underneath...</span>
swidth = <span class="number">595.28</span> - <span class="number">72.0</span>;
sheight = swidth * height / width;
<span class="reserved">if</span> (sheight &gt; (<span class="number">792.0</span> - <span class="number">36.0</span> - <span class="number">72.0</span>))
{
sheight = <span class="number">792.0</span> - <span class="number">36.0</span> - <span class="number">72.0</span>;
swidth = sheight * width / height;
}
tx = <span class="number">0.5</span> * (<span class="number">595.28</span> - swidth);
ty = <span class="number">0.5</span> * (<span class="number">792</span> - <span class="number">36</span> - sheight);
pdfioContentDrawImage(page, <span class="string">&quot;IM1&quot;</span>, tx, ty + <span class="number">36.0</span>, swidth, sheight);
<span class="comment">// Draw the caption in black...</span>
pdfioContentSetFillColorDeviceGray(page, <span class="number">0.0</span>);
<span class="comment">// Compute the starting point for the text - Courier is monospaced with a</span>
<span class="comment">// nominal width of 0.6 times the text height...</span>
tx = <span class="number">0.5</span> * (<span class="number">595.28</span> - <span class="number">18.0</span> * <span class="number">0.6</span> * strlen(caption));
<span class="comment">// Position and draw the caption underneath...</span>
pdfioContentTextBegin(page);
pdfioContentSetTextFont(page, <span class="string">&quot;F1&quot;</span>, <span class="number">18.0</span>);
pdfioContentTextMoveTo(page, tx, ty);
pdfioContentTextShow(page, <span class="comment">/*unicode*/</span><span class="reserved">false</span>, caption);
pdfioContentTextEnd(page);
<span class="comment">// Close the page stream and the PDF file...</span>
pdfioStreamClose(page);
pdfioFileClose(pdf);
}
</code></pre>
<h2 class="title"><a id="FUNCTIONS">Functions</a></h2>
<h3 class="function"><a id="pdfioArrayAppendArray">pdfioArrayAppendArray</a></h3>
<p class="description">Add an array value to an array.</p>
@ -1326,6 +1607,19 @@ size_t pdfioArrayGetSize(<a href="#pdfio_array_t">pdfio_array_t</a> *a);</p>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Value type</p>
<h3 class="function"><a id="pdfioArrayRemove">pdfioArrayRemove</a></h3>
<p class="description">Remove an array entry.</p>
<p class="code">
<span class="reserved">bool</span> pdfioArrayRemove(<a href="#pdfio_array_t">pdfio_array_t</a> *a, size_t n);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>a</th>
<td class="description">Array</td></tr>
<tr><th>n</th>
<td class="description">Index</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description"><code>true</code> on success, <code>false</code> otherwise</p>
<h3 class="function"><a id="pdfioContentClip">pdfioContentClip</a></h3>
<p class="description">Clip output to the current path.</p>
<p class="code">
@ -2180,6 +2474,19 @@ argument specifies an array of UTF-8 encoded strings.</p>
<p class="discussion">This function shows some formatted text in a PDF content stream. The
&quot;unicode&quot; argument specifies that the current font maps to full Unicode.
The &quot;format&quot; argument specifies a UTF-8 encoded <code>printf</code>-style format string.</p>
<h3 class="function"><a id="pdfioDictClear">pdfioDictClear</a></h3>
<p class="description">Remove a key/value pair from a dictionary.</p>
<p class="code">
<span class="reserved">bool</span> pdfioDictClear(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span class="reserved">const</span> <span class="reserved">char</span> *key);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>dict</th>
<td class="description">Dictionary</td></tr>
<tr><th>key</th>
<td class="description">Key</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description"><code>true</code> if cleared, <code>false</code> otherwise</p>
<h3 class="function"><a id="pdfioDictCopy">pdfioDictCopy</a></h3>
<p class="description">Copy a dictionary to a PDF file.</p>
<p class="code">
@ -2271,6 +2578,19 @@ time_t pdfioDictGetDate(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span cl
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Value</p>
<h3 class="function"><a id="pdfioDictGetKey">pdfioDictGetKey</a></h3>
<p class="description">Get the key for the specified pair.</p>
<p class="code">
<span class="reserved">const</span> <span class="reserved">char</span> *pdfioDictGetKey(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, size_t n);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>dict</th>
<td class="description">Dictionary</td></tr>
<tr><th>n</th>
<td class="description">Pair index (<code>0</code>-based)</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Key for specified pair</p>
<h3 class="function"><a id="pdfioDictGetName">pdfioDictGetName</a></h3>
<p class="description">Get a key name value from a dictionary.</p>
<p class="code">
@ -2284,6 +2604,17 @@ time_t pdfioDictGetDate(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict, <span cl
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Value</p>
<h3 class="function"><a id="pdfioDictGetNumPairs">pdfioDictGetNumPairs</a></h3>
<p class="description">Get the number of key/value pairs in a dictionary.</p>
<p class="code">
size_t pdfioDictGetNumPairs(<a href="#pdfio_dict_t">pdfio_dict_t</a> *dict);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>dict</th>
<td class="description">Dictionary</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Number of pairs</p>
<h3 class="function"><a id="pdfioDictGetNumber">pdfioDictGetNumber</a></h3>
<p class="description">Get a key number value from a dictionary.</p>
<p class="code">
@ -2771,6 +3102,22 @@ image on the page.<br>
Note: Currently PNG support is limited to grayscale, RGB, or indexed files
without interlacing or alpha. Transparency (masking) based on color/index
is supported.</blockquote>
<h3 class="function"><a id="pdfioFileCreateNameObj">pdfioFileCreateNameObj</a></h3>
<p class="description">Create a new object in a PDF file containing a name.</p>
<p class="code">
<a href="#pdfio_obj_t">pdfio_obj_t</a> *pdfioFileCreateNameObj(<a href="#pdfio_file_t">pdfio_file_t</a> *pdf, <span class="reserved">const</span> <span class="reserved">char</span> *name);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>pdf</th>
<td class="description">PDF file</td></tr>
<tr><th>name</th>
<td class="description">Name value</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">New object</p>
<h4 class="discussion">Discussion</h4>
<p class="discussion">This function creates a new object with a name value in a PDF file.
You must call <a href="#pdfioObjClose"><code>pdfioObjClose</code></a> to write the object to the file.</p>
<h3 class="function"><a id="pdfioFileCreateNumberObj">pdfioFileCreateNumberObj</a></h3>
<p class="description">Create a new object in a PDF file containing a number.</p>
<p class="code">
@ -3352,6 +3699,17 @@ size_t pdfioObjGetLength(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Length in bytes or <code>0</code> for none</p>
<h3 class="function"><a id="pdfioObjGetName">pdfioObjGetName</a></h3>
<p class="description">Get the name value associated with an object.</p>
<p class="code">
<span class="reserved">const</span> <span class="reserved">char</span> *pdfioObjGetName(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
<h4 class="parameters">Parameters</h4>
<table class="list"><tbody>
<tr><th>obj</th>
<td class="description">Object</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Name value or <code>NULL</code> on error</p>
<h3 class="function"><a id="pdfioObjGetNumber">pdfioObjGetNumber</a></h3>
<p class="description">Get the object's number.</p>
<p class="code">
@ -3373,7 +3731,30 @@ size_t pdfioObjGetNumber(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
<td class="description">Object</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Object subtype</p>
<p class="description">Object subtype name or <code>NULL</code> for none</p>
<h4 class="discussion">Discussion</h4>
<p class="discussion">This function returns an object's PDF subtype name, if any. Common subtype
names include:
</p><ul>
<li>&quot;CIDFontType0&quot;: A CID Type0 font
</li>
<li>&quot;CIDFontType2&quot;: A CID TrueType font
</li>
<li>&quot;Image&quot;: An image or image mask
</li>
<li>&quot;Form&quot;: A form XObject (a reusable group of page content)
</li>
<li>&quot;OpenType&quot;: An OpenType font
</li>
<li>&quot;Type0&quot;: A composite font
</li>
<li>&quot;Type1&quot;: A PostScript Type1 font
</li>
<li>&quot;Type3&quot;: A PDF Type3 font
</li>
<li>&quot;TrueType&quot;: A TrueType font</li>
</ul>
<h3 class="function"><a id="pdfioObjGetType">pdfioObjGetType</a></h3>
<p class="description">Get an object's type.</p>
<p class="code">
@ -3384,7 +3765,28 @@ size_t pdfioObjGetNumber(<a href="#pdfio_obj_t">pdfio_obj_t</a> *obj);</p>
<td class="description">Object</td></tr>
</tbody></table>
<h4 class="returnvalue">Return Value</h4>
<p class="description">Object type</p>
<p class="description">Object type name or <code>NULL</code> for none</p>
<h4 class="discussion">Discussion</h4>
<p class="discussion">This function returns an object's PDF type name, if any. Common type names
include:
</p><ul>
<li>&quot;CMap&quot;: A character map for composite fonts
</li>
<li>&quot;Font&quot;: An embedded font (<a href="#pdfioObjGetSubtype"><code>pdfioObjGetSubtype</code></a> will tell you the
font format)
</li>
<li>&quot;FontDescriptor&quot;: A font descriptor
</li>
<li>&quot;Page&quot;: A (visible) page
</li>
<li>&quot;Pages&quot;: A page tree node
</li>
<li>&quot;Template&quot;: An invisible template page
</li>
<li>&quot;XObject&quot;: An image, image mask, or form (<a href="#pdfioObjGetSubtype"><code>pdfioObjGetSubtype</code></a> will
tell you which)</li>
</ul>
<h3 class="function"><a id="pdfioObjOpenStream">pdfioObjOpenStream</a></h3>
<p class="description">Open an object's (data) stream for reading.</p>
<p class="code">
@ -3447,7 +3849,7 @@ array that was created using the
<tr><th>dict</th>
<td class="description">Page dictionary</td></tr>
<tr><th>name</th>
<td class="description">Font name</td></tr>
<td class="description">Font name; must not contain spaces</td></tr>
<tr><th>obj</th>
<td class="description">Font object</td></tr>
</tbody></table>


@ -120,6 +120,182 @@ that are defined in a separate header file:
```
Understanding PDF Files
-----------------------
A PDF file provides data and commands for displaying pages of graphics and text,
and is structured in a way that allows it to be displayed in the same way across
multiple devices and platforms. The following is a PDF which shows "Hello,
World!" on one page:
```
%PDF-1.0 % Header starts here
%âãÏÓ
1 0 obj % Body starts here
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
2 0 obj
<<
/Rotate 0
/Parent 1 0 R
/Resources 3 0 R
/MediaBox [0 0 612 792]
/Contents [4 0 R]/Type /Page
>>
endobj
3 0 obj
<<
/Font
<<
/F0
<<
/BaseFont /Times-Italic
/Subtype /Type1
/Type /Font
>>
>>
>>
endobj
4 0 obj
<<
/Length 65
>>
stream
1. 0. 0. 1. 50. 700. cm
BT
/F0 36. Tf
(Hello, World!) Tj
ET
endstream
endobj
5 0 obj
<<
/Pages 1 0 R
/Type /Catalog
>>
endobj
xref % Cross-reference table starts here
0 6
0000000000 65535 f
0000000015 00000 n
0000000074 00000 n
0000000192 00000 n
0000000291 00000 n
0000000409 00000 n
trailer % Trailer starts here
<<
/Root 5 0 R
/Size 6
>>
startxref
459
%%EOF
```
### Header
The header is the first line of a PDF file that specifies the version of the PDF
format that has been used, for example `%PDF-1.0`.
Since PDF files almost always contain binary data, they can become corrupted if
line endings are changed, for example when the file is transferred using FTP in
text mode or is edited in Notepad on Windows. To allow legacy file transfer
programs to determine that the file is binary, the PDF standard recommends
including some bytes with character codes higher than 127 in the header, for
example:
```
%âãÏÓ
```
The percent sign indicates a comment line while the other few bytes are
arbitrary character codes in excess of 127. So, the whole header in our example
is:
```
%PDF-1.0
%âãÏÓ
```
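As a rough illustration of the header rules above, the following sketch checks
a header line and the binary marker line in plain C. The helper names
`parse_pdf_header` and `has_binary_marker` are hypothetical and are not part of
the PDFio API; PDFio does its own header validation when it opens a file.
```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not a PDFio API): parse a "%PDF-M.N" header line.
// Returns true and fills in *major and *minor when the line is a PDF header.
bool
parse_pdf_header(const char *line, int *major, int *minor)
{
  // The header must start with the literal "%PDF-" marker...
  if (strncmp(line, "%PDF-", 5) != 0)
    return (false);

  // ...followed by the version as two integers separated by a period.
  return (sscanf(line + 5, "%d.%d", major, minor) == 2);
}

// Hypothetical helper: report whether a comment line carries at least one
// byte above 127, which is what marks the file as binary for transfer tools.
bool
has_binary_marker(const char *line)
{
  const unsigned char *p = (const unsigned char *)line;

  // The marker line is a comment, so it must begin with a percent sign
  if (*p != '%')
    return (false);

  // Scan to the end of the line looking for a high-bit byte
  for (p ++; *p && *p != '\n' && *p != '\r'; p ++)
  {
    if (*p > 127)
      return (true);
  }

  return (false);
}
```
A reader would typically call `parse_pdf_header` on the first line of the file
and `has_binary_marker` on the second; the marker is only a recommendation, so
its absence should not be treated as an error.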
### Body
The file body consists of a sequence of objects, each preceded by an object
number, generation number, and the obj keyword on one line, and followed by the
endobj keyword on another. For example:
```
1 0 obj
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
```
In this example, the object number is 1 and the generation number is 0, meaning
it is the first version of the object. The content for object 1 is between the
initial `1 0 obj` and trailing `endobj` lines. In this case, the content is the
dictionary `<</Kids [2 0 R] /Count 1 /Type /Pages>>`.
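
The object header line can be recognized with a few lines of C. This is only a
sketch under the assumption that the header sits on a line of its own, as in
the example; `parse_obj_header` is a hypothetical name and not a PDFio
function:

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>

// Hypothetical helper (not a PDFio function): parse an "N G obj" line.
// Returns `true` and stores the object and generation numbers on success.
static bool
parse_obj_header(const char *line, unsigned *number, unsigned *generation)
{
  unsigned num, gen;   // Parsed values
  int      pos = -1;   // Position after the "obj" keyword, -1 if not seen

  assert(line != NULL && number != NULL && generation != NULL);

  // %n is only stored when the literal "obj" has been matched...
  if (sscanf(line, "%u %u obj%n", &num, &gen, &pos) != 2 || pos < 0)
    return (false);

  // The keyword must end the line or be followed by whitespace
  if (line[pos] != '\0' && !isspace((unsigned char)line[pos]))
    return (false);

  *number     = num;
  *generation = gen;

  return (true);
}
```

With this sketch, `1 0 obj` yields object 1, generation 0, while an indirect
reference such as `2 0 R` is rejected because it lacks the `obj` keyword.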
### Cross-Reference Table
The cross-reference table lists the byte offset of each object in the file body.
This allows random access to objects, meaning they don't have to be read in
order. Objects that are not used are never read, making the process efficient.
Operations like counting the number of pages in a PDF document are fast, even in
large files.
Each object has an object number and a generation number. Generation numbers
are used when a cross-reference table entry is reused. For simplicity, we will
assume generation numbers to be always zero and ignore them. The
cross-reference table consists of a header line that indicates the number of
entries, a free entry line for object 0, and a line for each of the objects in
the file body. For example:
```
0 6 % Six entries in table, starting at 0
0000000000 65535 f % Free entry for object 0
0000000015 00000 n % Object 1 is at byte offset 15
0000000074 00000 n % Object 2 is at byte offset 74
0000000192 00000 n % etc...
0000000291 00000 n
0000000409 00000 n % Object 5 is at byte offset 409
```
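
Each entry line in the table above has the same fixed layout (a 10-digit
offset, a 5-digit generation number, and `n` or `f`), so it can be decoded
with `sscanf`. The following is a minimal sketch; the `xref_entry_t` type and
`parse_xref_entry` function are assumptions made for this example and are not
how PDFio itself is implemented:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct xref_entry_s   // Hypothetical type, not part of PDFio
{
  unsigned long offset;       // Byte offset of the object in the file
  unsigned      generation;   // Generation number
  bool          in_use;       // `true` for 'n' entries, `false` for 'f'
} xref_entry_t;

// Hypothetical helper: parse one cross-reference entry line such as
// "0000000015 00000 n".  Returns `true` on success.
static bool
parse_xref_entry(const char *line, xref_entry_t *entry)
{
  unsigned long offset;       // Parsed offset
  unsigned      generation;   // Parsed generation number
  char          type;         // 'n' (in use) or 'f' (free)

  assert(line != NULL && entry != NULL);

  if (sscanf(line, "%10lu %5u %c", &offset, &generation, &type) != 3)
    return (false);

  if (type != 'n' && type != 'f')
    return (false);

  entry->offset     = offset;
  entry->generation = generation;
  entry->in_use     = type == 'n';

  return (true);
}
```

Seeking to `entry.offset` for an in-use entry should land on the matching
`N G obj` line, which is what makes random access to objects possible.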
### Trailer
The first line of the trailer is just the `trailer` keyword. This is followed
by the trailer dictionary, which contains at least the `/Size` entry specifying
the number of entries in the cross-reference table and the `/Root` entry
referencing the document catalog object, the root element of the graph of
objects in the body.
There follows a line with just the `startxref` keyword, a line with a single
number specifying the byte offset of the start of the cross-reference table
within the file, and then the line `%%EOF` which signals the end of the PDF
file.
```
trailer % Trailer keyword
<< % The trailer dictionary
/Root 5 0 R
/Size 6
>>
startxref % startxref keyword
459 % Byte offset of cross-reference table
%%EOF % End-of-file marker
```
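
A reader usually starts at the end of the file: it looks for the last
`startxref` keyword and reads the number after it. The sketch below shows one
way to do that for data already in memory; `find_startxref` is a hypothetical
helper written for this example, not a PDFio function:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

// Hypothetical helper (not a PDFio function): return the byte offset that
// follows the last "startxref" keyword in the buffer, or -1 if not found.
static long
find_startxref(const char *data, size_t datalen)
{
  static const char keyword[] = "startxref";
  size_t            keylen = sizeof(keyword) - 1;
  size_t            i;

  assert(data != NULL);

  if (datalen < keylen)
    return (-1);

  // Scan backwards so that the last occurrence in the file wins...
  for (i = datalen - keylen + 1; i > 0; i --)
  {
    const char *ptr = data + i - 1;
    const char *end = data + datalen;

    if (!memcmp(ptr, keyword, keylen))
    {
      char   number[32];   // Digits of the offset
      size_t numlen = 0;   // Number of digits copied

      // Skip the keyword and the line ending after it
      ptr += keylen;
      while (ptr < end && (*ptr == ' ' || *ptr == '\r' || *ptr == '\n'))
        ptr ++;

      // Copy the decimal digits of the offset
      while (ptr < end && *ptr >= '0' && *ptr <= '9' && numlen < (sizeof(number) - 1))
        number[numlen ++] = *ptr++;

      if (numlen == 0)
        return (-1);

      number[numlen] = '\0';

      return (strtol(number, NULL, 10));
    }
  }

  return (-1);
}
```

For the example trailer this returns 459, the offset at which the `xref`
keyword is expected.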
API Overview
============
@ -132,6 +308,7 @@ PDFio exposes several types:
- `pdfio_stream_t`: An object stream
Reading PDF Files
-----------------
@ -202,7 +379,74 @@ Each page is represented by a "page tree" object (what [`pdfioFileGetPage`](@@)
returns) that specifies information about the page and one or more "content"
objects that contain the images, fonts, text, and graphics that appear on the
page. Use the [`pdfioPageGetNumStreams`](@@) and [`pdfioPageOpenStream`](@@)
functions to access the content streams for each page, and
[`pdfioObjGetDict`](@@) to get the associated page object dictionary. For
example, if you want to display the media and crop boxes for a given page:
```c
pdfio_file_t *pdf; // PDF file
size_t i; // Looping var
size_t count; // Number of pages
pdfio_obj_t *page; // Current page
pdfio_dict_t *dict; // Current page dictionary
pdfio_array_t *media_box; // MediaBox array
double media_values[4]; // MediaBox values
pdfio_array_t *crop_box; // CropBox array
double crop_values[4]; // CropBox values
// Iterate the pages in the PDF file
for (i = 0, count = pdfioFileGetNumPages(pdf); i < count; i ++)
{
page = pdfioFileGetPage(pdf, i);
dict = pdfioObjGetDict(page);
media_box = pdfioDictGetArray(dict, "MediaBox");
media_values[0] = pdfioArrayGetNumber(media_box, 0);
media_values[1] = pdfioArrayGetNumber(media_box, 1);
media_values[2] = pdfioArrayGetNumber(media_box, 2);
media_values[3] = pdfioArrayGetNumber(media_box, 3);
crop_box = pdfioDictGetArray(dict, "CropBox");
crop_values[0] = pdfioArrayGetNumber(crop_box, 0);
crop_values[1] = pdfioArrayGetNumber(crop_box, 1);
crop_values[2] = pdfioArrayGetNumber(crop_box, 2);
crop_values[3] = pdfioArrayGetNumber(crop_box, 3);
printf("Page %u: MediaBox=[%g %g %g %g], CropBox=[%g %g %g %g]\n",
(unsigned)(i + 1),
media_values[0], media_values[1], media_values[2], media_values[3],
crop_values[0], crop_values[1], crop_values[2], crop_values[3]);
}
```
Page object dictionaries have several (mostly optional) key/value pairs,
including:
- "Annots": An array of annotation dictionaries for the page; use
[`pdfioDictGetArray`](@@) to get the array
- "CropBox": The crop box as an array of four numbers for the left, bottom,
right, and top coordinates of the target media; use [`pdfioDictGetArray`](@@)
to get a pointer to the array of numbers
- "Dur": The number of seconds the page should be displayed; use
[`pdfioDictGetNumber`](@@) to get the page duration value
- "Group": The dictionary of transparency group values for the page; use
[`pdfioDictGetDict`](@@) to get a pointer to the transparency group dictionary
- "LastModified": The date and time when this page was last modified; use
[`pdfioDictGetDate`](@@) to get the Unix `time_t` value
- "Parent": The parent page tree node object for this page; use
[`pdfioDictGetObj`](@@) to get a pointer to the object
- "MediaBox": The media box as an array of four numbers for the left, bottom,
right, and top coordinates of the target media; use [`pdfioDictGetArray`](@@)
to get a pointer to the array of numbers
- "Resources": The dictionary of resources for the page; use
[`pdfioDictGetDict`](@@) to get a pointer to the resources dictionary
- "Rotate": A number indicating the number of degrees of clockwise rotation
to apply to the page when viewing (a multiple of 90); use
[`pdfioDictGetNumber`](@@) to get the rotation angle
- "Thumb": A thumbnail image object for the page; use [`pdfioDictGetObj`](@@)
to get a pointer to the thumbnail image object
- "Trans": The page transition dictionary; use [`pdfioDictGetDict`](@@) to get
a pointer to the dictionary
The [`pdfioFileClose`](@@) function closes a PDF file and frees all memory that
was used for it:
@ -345,7 +589,7 @@ to the stream:
The [PDF content helper functions](@) provide additional functions for writing
specific PDF page stream commands.
When you are done writing the stream, call [`pdfioStreamClose`](@@) to close
both the stream and the object.
@ -586,3 +830,133 @@ escaping, as needed:
- [`pdfioContentTextShowf`](@@) draws a formatted string in a text block
- [`pdfioContentTextShowJustified`](@@) draws an array of literal strings with
offsets between them
Examples
========
Read PDF Metadata
-----------------
The following example function will open a PDF file and print the title, author,
creation date, and number of pages:
```c
#include <pdfio.h>
#include <time.h>
void
show_pdf_info(const char *filename)
{
pdfio_file_t *pdf;
time_t creation_date;
struct tm *creation_tm;
char creation_text[256];
// Open the PDF file with the default callbacks...
pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_cbdata*/NULL, /*error_cb*/NULL, /*error_cbdata*/NULL);
if (pdf == NULL)
return;
// Get the creation date and convert to a string...
creation_date = pdfioFileGetCreationDate(pdf);
creation_tm = localtime(&creation_date);
strftime(creation_text, sizeof(creation_text), "%c", creation_tm);
// Print file information to stdout...
printf("%s:\n", filename);
printf(" Title: %s\n", pdfioFileGetTitle(pdf));
printf(" Author: %s\n", pdfioFileGetAuthor(pdf));
printf(" Created On: %s\n", creation_text);
printf(" Number Pages: %u\n", (unsigned)pdfioFileGetNumPages(pdf));
// Close the PDF file...
pdfioFileClose(pdf);
}
```
Create PDF File With Text and Image
-----------------------------------
The following example function will create a PDF file, embed a base font and the
named JPEG or PNG image file, and then create a page with the image centered on
the page and the caption text centered below it:
```c
#include <pdfio.h>
#include <pdfio-content.h>
#include <string.h>
void
create_pdf_image_file(const char *pdfname, const char *imagename, const char *caption)
{
pdfio_file_t *pdf;
pdfio_obj_t *font;
pdfio_obj_t *image;
pdfio_dict_t *dict;
pdfio_stream_t *page;
double width, height;
double swidth, sheight;
double tx, ty;
// Create the PDF file...
pdf = pdfioFileCreate(pdfname, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_cbdata*/NULL);
// Create a Courier base font for the caption
font = pdfioFileCreateFontObjFromBase(pdf, "Courier");
// Create an image object from the JPEG/PNG image file...
image = pdfioFileCreateImageObjFromFile(pdf, imagename, true);
// Create a page dictionary with the font and image...
dict = pdfioDictCreate(pdf);
pdfioPageDictAddFont(dict, "F1", font);
pdfioPageDictAddImage(dict, "IM1", image);
// Create the page and its content stream...
page = pdfioFileCreatePage(pdf, dict);
// Position and scale the image on the page...
width = pdfioImageGetWidth(image);
height = pdfioImageGetHeight(image);
// Default media_box is "universal" 595.28x792 points (8.27x11in or 210x279mm)
// Use margins of 36 points (0.5in or 12.7mm) with another 36 points for the
// caption underneath...
swidth = 595.28 - 72.0;
sheight = swidth * height / width;
if (sheight > (792.0 - 36.0 - 72.0))
{
sheight = 792.0 - 36.0 - 72.0;
swidth = sheight * width / height;
}
tx = 0.5 * (595.28 - swidth);
ty = 0.5 * (792 - 36 - sheight);
pdfioContentDrawImage(page, "IM1", tx, ty + 36.0, swidth, sheight);
// Draw the caption in black...
pdfioContentSetFillColorDeviceGray(page, 0.0);
// Compute the starting point for the text - Courier is monospaced with a
// nominal width of 0.6 times the text height...
tx = 0.5 * (595.28 - 18.0 * 0.6 * strlen(caption));
// Position and draw the caption underneath...
pdfioContentTextBegin(page);
pdfioContentSetTextFont(page, "F1", 18.0);
pdfioContentTextMoveTo(page, tx, ty);
pdfioContentTextShow(page, /*unicode*/false, caption);
pdfioContentTextEnd(page);
// Close the page stream and the PDF file...
pdfioStreamClose(page);
pdfioFileClose(pdf);
}
```

46
examples/Makefile Normal file

@ -0,0 +1,46 @@
#
# Makefile for PDFio examples.
#
# Copyright © 2024 by Michael R Sweet.
#
# Licensed under Apache License v2.0. See the file "LICENSE" for more
# information.
#
# POSIX makefile
.POSIX:
# Common options
CFLAGS = -g $(CPPFLAGS)
CPPFLAGS = -I..
LIBS = -L.. -lpdfio -lz
# Targets
TARGETS = \
code128 \
md2pdf
# Make everything
all: $(TARGETS)
# Clean everything
clean:
rm -f $(TARGETS)
# code128
code128: code128.c
$(CC) $(CFLAGS) -o $@ code128.c $(LIBS)
# md2pdf
md2pdf: md2pdf.c mmd.c mmd.h
$(CC) $(CFLAGS) -o $@ md2pdf.c mmd.c $(LIBS)
# Common dependencies...
$(TARGETS): Makefile ../pdfio.h ../pdfio-content.h

BIN
examples/Roboto-Bold.ttf Normal file

Binary file not shown.

BIN
examples/Roboto-Italic.ttf Normal file

Binary file not shown.

BIN
examples/Roboto-Regular.ttf Normal file

Binary file not shown.

Binary file not shown.

209
examples/code128.c Normal file

@ -0,0 +1,209 @@
//
// Code 128 barcode example for PDFio.
//
// Copyright © 2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
//
// Usage:
//
// ./code128 "BARCODE" ["TEXT"] >FILENAME.pdf
//
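#include <pdfio.h>
#include <pdfio-content.h>
#ifdef _WIN32
# include <io.h> // For setmode()
# include <fcntl.h> // For O_BINARY
#endif // _WIN32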
//
// 'make_code128()' - Make a Code 128 barcode string.
//
// This function produces a Code B (printable ASCII) representation of the
// source string and doesn't try to optimize using Code C. Non-printable and
// extended characters are ignored in the source string.
//
static char * // O - Output string
make_code128(char *dst, // I - Destination buffer
const char *src, // I - Source string
size_t dstsize) // I - Size of destination buffer
{
char *dstptr, // Pointer into destination buffer
*dstend; // End of destination buffer
int sum; // Weighted sum
static const char *code128_chars = // Code 128 characters
" !\"#$%&'()*+,-./0123456789:;<=>?"
"@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
"`abcdefghijklmnopqrstuvwxyz{|}~\303"
"\304\305\306\307\310\311\312";
static const char code128_fnc_3 = '\304';
// FNC 3
static const char code128_fnc_2 = '\305';
// FNC 2
static const char code128_shift_b = '\306';
// Shift B (for lowercase)
static const char code128_code_c = '\307';
// Code C
static const char code128_code_b = '\310';
// Code B
static const char code128_fnc_4 = '\311';
// FNC 4
static const char code128_fnc_1 = '\312';
// FNC 1
static const char code128_start_code_a = '\313';
// Start code A
static const char code128_start_code_b = '\314';
// Start code B
static const char code128_start_code_c = '\315';
// Start code C
static const char code128_stop = '\316';
// Stop pattern
// Start a Code B barcode...
dstptr = dst;
dstend = dst + dstsize - 3;
*dstptr++ = code128_start_code_b;
sum = (code128_start_code_b & 255) - 100; // Start B value (104), masked in case char is signed
while (*src && dstptr < dstend)
{
if (*src >= ' ' && *src < 0x7f)
{
sum += (dstptr - dst) * (*src - ' ');
*dstptr++ = *src;
}
src ++;
}
// Add the weighted sum modulo 103
*dstptr++ = code128_chars[sum % 103];
// Add the stop pattern and return...
*dstptr++ = code128_stop;
*dstptr = '\0';
return (dst);
}
//
// 'output_cb()' - Write PDF data to the standard output...
//
static ssize_t // O - Number of bytes written
output_cb(void *output_cbdata, // I - Callback data (not used)
const void *buffer, // I - Buffer to write
size_t bytes) // I - Number of bytes to write
{
(void)output_cbdata;
return ((ssize_t)fwrite(buffer, 1, bytes, stdout));
}
//
// 'main()' - Produce a single-page barcode file.
//
int // O - Exit status
main(int argc, // I - Number of command-line arguments
char *argv[]) // I - Command-line arguments
{
const char *barcode, // Barcode to show
*text; // Text to display under barcode
pdfio_file_t *pdf; // Output PDF file
pdfio_obj_t *barcode_font; // Barcode font object
pdfio_obj_t *text_font = NULL; // Text font object
pdfio_dict_t *page_dict; // Page dictionary
pdfio_rect_t media_box; // Media/CropBox for page
pdfio_stream_t *page_st; // Page stream
char barcode_temp[256]; // Barcode buffer
double barcode_height = 36.0, // Height of barcode
barcode_width, // Width of barcode
text_height = 0.0, // Height of text
text_width = 0.0; // Width of text
// Get the barcode and optional text from the command-line...
if (argc < 2 || argc > 3)
{
fputs("Usage: code128 \"BARCODE\" [\"TEXT\"] >FILENAME.pdf\n", stderr);
return (1);
}
barcode = argv[1];
text = argv[2];
// Output a PDF file to the standard output...
#ifdef _WIN32
setmode(1, O_BINARY); // Force binary output on Windows
#endif // _WIN32
if ((pdf = pdfioFileCreateOutput(output_cb, /*output_cbdata*/NULL, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_data*/NULL)) == NULL)
return (1);
// Load fonts...
barcode_font = pdfioFileCreateFontObjFromFile(pdf, "code128.ttf", /*unicode*/false);
if (text)
text_font = pdfioFileCreateFontObjFromFile(pdf, "../testfiles/OpenSans-Regular.ttf", /*unicode*/true);
// Generate Code128 characters for the desired barcode...
if (!(barcode[0] & 0x80))
barcode = make_code128(barcode_temp, barcode, sizeof(barcode_temp));
// Compute sizes of the text...
barcode_width = pdfioContentTextMeasure(barcode_font, barcode, barcode_height);
if (text && text_font)
{
text_height = 9.0;
text_width = pdfioContentTextMeasure(text_font, text, text_height);
}
// Compute the size of the PDF page...
media_box.x1 = 0.0;
media_box.y1 = 0.0;
media_box.x2 = (barcode_width > text_width ? barcode_width : text_width) + 18.0;
media_box.y2 = barcode_height + text_height + 18.0;
// Start a page for the barcode...
page_dict = pdfioDictCreate(pdf);
pdfioDictSetRect(page_dict, "MediaBox", &media_box);
pdfioDictSetRect(page_dict, "CropBox", &media_box);
pdfioPageDictAddFont(page_dict, "B128", barcode_font);
if (text_font)
pdfioPageDictAddFont(page_dict, "TEXT", text_font);
page_st = pdfioFileCreatePage(pdf, page_dict);
// Draw the page...
pdfioContentSetStrokeColorGray(page_st, 0.0);
pdfioContentSetTextFont(page_st, "B128", barcode_height);
pdfioContentTextBegin(page_st);
pdfioContentTextMoveTo(page_st, 0.5 * (media_box.x2 - barcode_width), 9.0 + text_height);
pdfioContentTextShow(page_st, /*unicode*/false, barcode);
pdfioContentTextEnd(page_st);
if (text && text_font)
{
pdfioContentSetTextFont(page_st, "TEXT", text_height);
pdfioContentTextBegin(page_st);
pdfioContentTextMoveTo(page_st, 0.5 * (media_box.x2 - text_width), 9.0);
pdfioContentTextShow(page_st, /*unicode*/true, text);
pdfioContentTextEnd(page_st);
}
pdfioStreamClose(page_st);
// Close and return...
pdfioFileClose(pdf);
return (0);
}

BIN
examples/code128.ttf Normal file

Binary file not shown.

865
examples/md2pdf.c Normal file

@ -0,0 +1,865 @@
//
// Simple markdown to PDF converter example for PDFio.
//
// Copyright © 2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
//
// Usage:
//
// ./md2pdf FILENAME.md >FILENAME.pdf
//
// The generated PDF file is formatted for a "universal" paper size (8.27x11",
// the intersection of US Letter and ISO A4) with 1" top and bottom margins and
// 0.5" side margins. The document title (if present) is centered at the top
// of the second and subsequent pages while the current heading and page number
// are provided at the bottom of each page.
//
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <math.h>
#ifdef _WIN32
# include <io.h>
#else
# include <unistd.h>
#endif // _WIN32
#include "mmd.h"
#include <pdfio.h>
#include <pdfio-content.h>
//
// Types...
//
typedef enum doccolor_e // Document color enumeration
{
DOCCOLOR_BLACK, // #000
DOCCOLOR_RED, // #900
DOCCOLOR_GREEN, // #090
DOCCOLOR_BLUE, // #00C
DOCCOLOR_GRAY // #555
} doccolor_t;
typedef enum docfont_e // Document font enumeration
{
DOCFONT_REGULAR, // Roboto-Regular
DOCFONT_BOLD, // Roboto-Bold
DOCFONT_ITALIC, // Roboto-Italic
DOCFONT_MONOSPACE, // RobotoMono-Regular
DOCFONT_MAX // Number of fonts
} docfont_t;
typedef struct docimage_s // Document image info
{
const char *url; // Reference URL
pdfio_obj_t *obj; // Image object
} docimage_t;
#define DOCIMAGE_MAX 1000 // Maximum number of images
typedef struct doclink_s // Document link info
{
const char *url; // Reference URL
pdfio_rect_t box; // Link box
} doclink_t;
#define DOCLINK_MAX 1000 // Maximum number of links/page
typedef struct docdata_s // Document formatting data
{
pdfio_file_t *pdf; // PDF file
pdfio_rect_t media_box; // Media (page) box
pdfio_rect_t crop_box; // Crop box (for margins)
pdfio_rect_t art_box; // Art box (for markdown content)
pdfio_obj_t *fonts[DOCFONT_MAX]; // Embedded fonts
size_t num_images; // Number of embedded images
docimage_t images[DOCIMAGE_MAX]; // Embedded images
const char *title; // Document title
char *heading; // Current document heading
pdfio_stream_t *st; // Current page stream
double y; // Current position on page
docfont_t font; // Current font
double fsize; // Current font size
doccolor_t color; // Current color
pdfio_obj_t *annots; // Annotations object (for links)
size_t num_links; // Number of links for this page
doclink_t links[DOCLINK_MAX]; // Links for this page
} docdata_t;
typedef struct linefrag_s // Line fragment
{
double x, // X position of item
width, // Width of item
height; // Height of item
size_t imagenum; // Image number
const char *text; // Text string
bool ws; // Whitespace before text?
docfont_t font; // Text font
doccolor_t color; // Text color
} linefrag_t;
#define LINEFRAG_MAX 200 // Maximum number of fragments on a line
//
// Macros...
//
#define in2pt(in) ((in) * 72.0)
#define mm2pt(mm) ((mm) * 72.0 / 25.4)
//
// Constants...
//
static const char * const docfont_filenames[] =
{
"Roboto-Regular.ttf",
"Roboto-Bold.ttf",
"Roboto-Italic.ttf",
"RobotoMono-Regular.ttf"
};
static const char * const docfont_names[] =
{
"FR",
"FB",
"FI",
"FM"
};
#define LINE_HEIGHT 1.4 // Multiplier for line height
#define SIZE_BODY 11.0 // Size of body text (points)
#define SIZE_CODEBLOCK 10.0 // Size of code block text (points)
#define SIZE_HEADFOOT 9.0 // Size of header/footer text (points)
#define SIZE_HEADING_1 18.0 // Size of first level heading (points)
#define SIZE_HEADING_2 16.0 // Size of second level heading (points)
#define SIZE_HEADING_3 15.0 // Size of third level heading (points)
#define SIZE_HEADING_4 14.0 // Size of fourth level heading (points)
#define SIZE_HEADING_5 13.0 // Size of fifth level heading (points)
#define SIZE_HEADING_6 12.0 // Size of sixth level heading (points)
#define SIZE_TABLE 10.0 // Size of table text (points)
#define PAGE_WIDTH mm2pt(210) // Page width in points
#define PAGE_LENGTH in2pt(11) // Page length in points
#define PAGE_LEFT in2pt(0.5) // Left margin in points
#define PAGE_RIGHT (PAGE_WIDTH - in2pt(0.5))
// Right margin in points
#define PAGE_BOTTOM in2pt(1.0) // Bottom margin in points
#define PAGE_TOP (PAGE_LENGTH - in2pt(1.0))
// Top margin in points
#define PAGE_HEADER (PAGE_LENGTH - in2pt(0.5))
// Vertical position of header
#define PAGE_FOOTER in2pt(0.5) // Vertical position of footer
#define UNICODE_VALUE true // `true` for Unicode text, `false` for ISO-8859-1
//
// 'mmd_walk_next()' - Find the next markdown node.
//
static mmd_t * // O - Next node or `NULL` at end
mmd_walk_next(mmd_t *top, // I - Top node
mmd_t *node) // I - Current node
{
mmd_t *next, // Next node
*parent; // Parent node
// Figure out the next node under "top"...
if ((next = mmdGetFirstChild(node)) == NULL)
{
if ((next = mmdGetNextSibling(node)) == NULL)
{
if ((parent = mmdGetParent(node)) != top)
{
while ((next = mmdGetNextSibling(parent)) == NULL)
{
if ((parent = mmdGetParent(parent)) == top)
break;
}
}
}
}
return (next);
}
//
// 'add_images()' - Scan the markdown document for images.
//
static void
add_images(docdata_t *dd, // I - Document data
mmd_t *doc) // I - Markdown document
{
mmd_t *current, // Current node
*next; // Next node
// Scan the entire document for images...
for (current = mmdGetFirstChild(doc); current; current = next)
{
// Get next node
next = mmd_walk_next(doc, current);
// Look for image nodes...
if (mmdGetType(current) == MMD_TYPE_IMAGE)
{
const char *url, // URL for image
*ext; // Extension
url = mmdGetURL(current);
ext = strrchr(url, '.');
fprintf(stderr, "IMAGE(%s), ext=\"%s\"\n", url, ext ? ext : ""); // ext is NULL when the URL has no '.'
if (!access(url, 0) && ext && (!strcmp(ext, ".png") || !strcmp(ext, ".jpg") || !strcmp(ext, ".jpeg")))
{
// Local JPEG or PNG file, so add it if we haven't already...
size_t i; // Looping var
for (i = 0; i < dd->num_images; i ++)
{
if (!strcmp(dd->images[i].url, url))
break;
}
if (i >= dd->num_images && dd->num_images < DOCIMAGE_MAX)
{
dd->images[i].url = url;
if ((dd->images[i].obj = pdfioFileCreateImageObjFromFile(dd->pdf, url, false)) != NULL)
dd->num_images ++;
}
}
}
}
}
//
// 'set_color()' - Set the stroke and fill color as needed.
//
static void
set_color(docdata_t *dd, // I - Document data
doccolor_t color) // I - Document color
{
if (color == dd->color)
return;
switch (color)
{
case DOCCOLOR_BLACK :
pdfioContentSetFillColorDeviceGray(dd->st, 0.0);
pdfioContentSetStrokeColorDeviceGray(dd->st, 0.0);
break;
case DOCCOLOR_RED :
pdfioContentSetFillColorDeviceRGB(dd->st, 0.6, 0.0, 0.0);
pdfioContentSetStrokeColorDeviceRGB(dd->st, 0.6, 0.0, 0.0);
break;
case DOCCOLOR_GREEN :
pdfioContentSetFillColorDeviceRGB(dd->st, 0.0, 0.6, 0.0);
pdfioContentSetStrokeColorDeviceRGB(dd->st, 0.0, 0.6, 0.0);
break;
case DOCCOLOR_BLUE :
pdfioContentSetFillColorDeviceRGB(dd->st, 0.0, 0.0, 0.8);
pdfioContentSetStrokeColorDeviceRGB(dd->st, 0.0, 0.0, 0.8);
break;
case DOCCOLOR_GRAY :
pdfioContentSetFillColorDeviceGray(dd->st, 0.333);
pdfioContentSetStrokeColorDeviceGray(dd->st, 0.333);
break;
}
dd->color = color;
}
//
// 'set_font()' - Set the font typeface and size as needed.
//
static void
set_font(docdata_t *dd, // I - Document data
docfont_t font, // I - Font
double fsize) // I - Font size
{
if (font == dd->font && fabs(fsize - dd->fsize) < 0.1)
return;
if (font == DOCFONT_MAX)
return;
pdfioContentSetTextFont(dd->st, docfont_names[font], fsize);
dd->font = font;
dd->fsize = fsize;
}
//
// 'new_page()' - Start a new page.
//
static void
new_page(docdata_t *dd) // I - Document data
{
pdfio_dict_t *page_dict; // Page dictionary
docfont_t fontface; // Current font face
size_t i; // Looping var
char temp[32]; // Temporary string
double width; // Width of fragment
// Close the current page...
pdfioStreamClose(dd->st);
// Prep the new page...
page_dict = pdfioDictCreate(dd->pdf);
pdfioDictSetRect(page_dict, "MediaBox", &dd->media_box);
// pdfioDictSetRect(page_dict, "CropBox", &dd->crop_box);
pdfioDictSetRect(page_dict, "ArtBox", &dd->art_box);
for (fontface = DOCFONT_REGULAR; fontface < DOCFONT_MAX; fontface ++)
pdfioPageDictAddFont(page_dict, docfont_names[fontface], dd->fonts[fontface]);
for (i = 0; i < dd->num_images; i ++)
pdfioPageDictAddImage(page_dict, pdfioStringCreatef(dd->pdf, "I%u", (unsigned)i), dd->images[i].obj);
dd->st = pdfioFileCreatePage(dd->pdf, page_dict);
dd->color = DOCCOLOR_BLACK;
dd->font = DOCFONT_MAX;
dd->fsize = 0.0;
dd->y = dd->art_box.y2;
// Add header/footer text
set_color(dd, DOCCOLOR_GRAY);
set_font(dd, DOCFONT_REGULAR, SIZE_HEADFOOT);
if (pdfioFileGetNumPages(dd->pdf) > 1 && dd->title)
{
// Show title in header...
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], dd->title, SIZE_HEADFOOT);
pdfioContentTextBegin(dd->st);
pdfioContentTextMoveTo(dd->st, dd->crop_box.x1 + 0.5 * (dd->crop_box.x2 - dd->crop_box.x1 - width), dd->crop_box.y2 - SIZE_HEADFOOT);
pdfioContentTextShow(dd->st, UNICODE_VALUE, dd->title);
pdfioContentTextEnd(dd->st);
pdfioContentPathMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y2 - 2 * SIZE_HEADFOOT * LINE_HEIGHT + SIZE_HEADFOOT);
pdfioContentPathLineTo(dd->st, dd->crop_box.x2, dd->crop_box.y2 - 2 * SIZE_HEADFOOT * LINE_HEIGHT + SIZE_HEADFOOT);
pdfioContentStroke(dd->st);
}
// Show page number and current heading...
pdfioContentPathMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y1 + SIZE_HEADFOOT * LINE_HEIGHT);
pdfioContentPathLineTo(dd->st, dd->crop_box.x2, dd->crop_box.y1 + SIZE_HEADFOOT * LINE_HEIGHT);
pdfioContentStroke(dd->st);
pdfioContentTextBegin(dd->st);
snprintf(temp, sizeof(temp), "%u", (unsigned)pdfioFileGetNumPages(dd->pdf));
if (pdfioFileGetNumPages(dd->pdf) & 1)
{
// Page number on right...
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], temp, SIZE_HEADFOOT);
pdfioContentTextMoveTo(dd->st, dd->crop_box.x2 - width, dd->crop_box.y1);
}
else
{
// Page number on left...
pdfioContentTextMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y1);
}
pdfioContentTextShow(dd->st, UNICODE_VALUE, temp);
pdfioContentTextEnd(dd->st);
if (dd->heading)
{
pdfioContentTextBegin(dd->st);
if (pdfioFileGetNumPages(dd->pdf) & 1)
{
// Current heading on left...
pdfioContentTextMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y1);
}
else
{
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], dd->heading, SIZE_HEADFOOT);
pdfioContentTextMoveTo(dd->st, dd->crop_box.x2 - width, dd->crop_box.y1);
}
pdfioContentTextShow(dd->st, UNICODE_VALUE, dd->heading);
pdfioContentTextEnd(dd->st);
}
}
//
// 'render_line()' - Render a line of text/graphics.
//
static void
render_line(docdata_t *dd, // I - Document data
double margin_top, // I - Top margin
double lineheight, // I - Height of line
size_t num_frags, // I - Number of line fragments
linefrag_t *frags) // I - Line fragments
{
size_t i; // Looping var
linefrag_t *frag; // Current line fragment
bool in_text = false; // Are we in a text block?
if (!dd->st)
new_page(dd);
dd->y -= margin_top + lineheight;
if (dd->y < dd->art_box.y1)
{
new_page(dd);
dd->y -= lineheight;
}
fprintf(stderr, "num_frags=%u, y=%g\n", (unsigned)num_frags, dd->y);
for (i = 0, frag = frags; i < num_frags; i ++, frag ++)
{
if (frag->text)
{
// Draw text
fprintf(stderr, " text=\"%s\", font=%d, color=%d, x=%g\n", frag->text, frag->font, frag->color, frag->x);
set_color(dd, frag->color);
set_font(dd, frag->font, frag->height);
if (!in_text)
{
pdfioContentTextBegin(dd->st);
pdfioContentTextMoveTo(dd->st, frag->x, dd->y);
in_text = true;
}
if (frag->ws)
pdfioContentTextShowf(dd->st, UNICODE_VALUE, " %s", frag->text);
else
pdfioContentTextShow(dd->st, UNICODE_VALUE, frag->text);
}
else
{
// Draw image
char imagename[32]; // Current image name
fprintf(stderr, " imagenum=%u, x=%g, width=%g, height=%g\n", (unsigned)frag->imagenum, frag->x, frag->width, frag->height);
if (in_text)
{
pdfioContentTextEnd(dd->st);
in_text = false;
}
snprintf(imagename, sizeof(imagename), "I%u", (unsigned)frag->imagenum);
pdfioContentDrawImage(dd->st, imagename, frag->x, dd->y, frag->width, frag->height);
}
}
if (in_text)
pdfioContentTextEnd(dd->st);
}
//
// 'format_block()' - Format a block of text
//
static void
format_block(docdata_t *dd, // I - Document data
mmd_t *block, // I - Block to format
docfont_t deffont, // I - Default font
double fsize, // I - Size of font
double left, // I - Left margin
double right, // I - Right margin
const char *leader) // I - Leader text on the first line
{
mmd_type_t blocktype; // Block type
mmd_t *current, // Current node
*next; // Next node
size_t num_frags; // Number of line fragments
linefrag_t frags[LINEFRAG_MAX], // Line fragments
*frag; // Current fragment
mmd_type_t type; // Current node type
const char *text, // Current text
*url; // Current URL, if any
bool ws; // Current whitespace
pdfio_obj_t *image; // Current image, if any
size_t imagenum; // Current image number
doccolor_t color = DOCCOLOR_BLACK; // Current text color
docfont_t font = deffont; // Current text font
double x, // Current position
width, // Width of current fragment
wswidth, // Width of whitespace
margin_top, // Top margin
height, // Height of current fragment
lineheight; // Height of current line
blocktype = mmdGetType(block);
margin_top = fsize * LINE_HEIGHT;
if (leader)
{
// Add leader text on first line...
frags[0].width = pdfioContentTextMeasure(dd->fonts[deffont], leader, fsize);
frags[0].height = fsize;
frags[0].x = left - frags[0].width;
frags[0].text = leader;
frags[0].font = deffont;
frags[0].color = DOCCOLOR_BLACK;
num_frags = 1;
lineheight = fsize * LINE_HEIGHT;
}
else
{
// No leader text...
num_frags = 0;
lineheight = 0.0;
}
frag = frags + num_frags;
// Loop through the block and render lines...
for (current = mmdGetFirstChild(block), x = left; current; current = next)
{
// Get information about the current node...
type = mmdGetType(current);
text = mmdGetText(current);
image = NULL;
imagenum = 0;
url = mmdGetURL(current);
ws = mmdGetWhitespace(current);
wswidth = 0.0;
next = mmd_walk_next(block, current);
// Process the node...
if (type == MMD_TYPE_IMAGE && url)
{
// Embed an image
size_t i; // Looping var
for (i = 0; i < dd->num_images; i ++)
{
if (!strcmp(dd->images[i].url, url))
{
image = dd->images[i].obj;
imagenum = i;
break;
}
}
if (!image)
continue;
// Image - treat as 100dpi
width = 72.0 * pdfioImageGetWidth(image) / 100.0;
height = 72.0 * pdfioImageGetHeight(image) / 100.0;
text = NULL;
if (width > (right - left))
{
// Too wide, scale to width...
width = right - left;
height = width * pdfioImageGetHeight(image) / pdfioImageGetWidth(image);
}
else if (height > (dd->art_box.y2 - dd->art_box.y1))
{
// Too tall, scale to height...
height = dd->art_box.y2 - dd->art_box.y1;
width = height * pdfioImageGetWidth(image) / pdfioImageGetHeight(image);
}
}
else if (!text)
{
continue;
}
else
{
// Text fragment...
if (type == MMD_TYPE_EMPHASIZED_TEXT)
font = DOCFONT_ITALIC;
else if (type == MMD_TYPE_STRONG_TEXT)
font = DOCFONT_BOLD;
else if (type == MMD_TYPE_CODE_TEXT)
font = DOCFONT_MONOSPACE;
else
font = deffont;
if (type == MMD_TYPE_CODE_TEXT)
color = DOCCOLOR_RED;
else if (type == MMD_TYPE_LINKED_TEXT)
color = DOCCOLOR_BLUE;
else
color = DOCCOLOR_BLACK;
width = pdfioContentTextMeasure(dd->fonts[font], text, fsize);
height = fsize * LINE_HEIGHT;
if (ws)
wswidth = pdfioContentTextMeasure(dd->fonts[font], " ", fsize);
}
// See if this node will fit on the current line...
if ((num_frags > 0 && (x + width + wswidth) >= right) || num_frags == LINEFRAG_MAX)
{
// No, render this line and start over...
render_line(dd, margin_top, lineheight, num_frags, frags);
num_frags = 0;
frag = frags;
x = left;
lineheight = 0.0;
margin_top = 0.0;
}
// Add the current node to the fragment list
if (num_frags == 0)
wswidth = 0.0;
frag->x = x;
frag->width = width + wswidth;
frag->height = text ? fsize : height;
frag->imagenum = imagenum;
frag->text = text;
frag->ws = ws;
frag->font = font;
frag->color = color;
num_frags ++;
frag ++;
x += width + wswidth;
if (height > lineheight)
lineheight = height;
}
if (num_frags > 0)
render_line(dd, margin_top, lineheight, num_frags, frags);
}
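
The image branch of `format_block()` above treats pixel dimensions as 100dpi and shrinks oversized images to the available area. A minimal standalone sketch of that aspect-preserving fit follows. `fit_size_t` and `fit_image()` are hypothetical names that are not part of md2pdf or PDFio, and the box limits are passed in as arguments rather than read from `docdata_t`. The sketch tests the height limit after any width scaling, so a tall image that was first scaled to the page width is still reduced to fit.

```c
#include <assert.h>

// Hypothetical result type: image size in points (1/72 inch).
typedef struct
{
  double width;
  double height;
} fit_size_t;

// Hypothetical helper: convert pixel dimensions to points at 100dpi, then
// shrink to fit max_width x max_height while keeping the aspect ratio.
fit_size_t
fit_image(double px_width, double px_height, double max_width, double max_height)
{
  fit_size_t size;

  assert(px_width > 0.0 && px_height > 0.0);

  // 100 pixels per inch, 72 points per inch
  size.width  = 72.0 * px_width / 100.0;
  size.height = 72.0 * px_height / 100.0;

  if (size.width > max_width)
  {
    // Too wide, scale to width
    size.width  = max_width;
    size.height = max_width * px_height / px_width;
  }

  if (size.height > max_height)
  {
    // Too tall, scale to height
    size.height = max_height;
    size.width  = max_height * px_width / px_height;
  }

  return (size);
}
```

For example, `fit_image(1000.0, 500.0, 468.0, 648.0)` gives a 468 by 234 point box: the 720-point natural width is clamped to 468 and the height follows the 2:1 aspect ratio.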
//
// 'format_doc()' - Format a document.
//
static void
format_doc(docdata_t *dd, // I - Document data
mmd_t *doc, // I - Document node to format
double left, // I - Left margin
double right) // I - Right margin
{
int i; // Child number
mmd_type_t doctype; // Document node type
mmd_t *current; // Current node
mmd_type_t curtype; // Current node type
char leader[32]; // Leader
static const double heading_sizes[] = // Heading font sizes
{
SIZE_HEADING_1,
SIZE_HEADING_2,
SIZE_HEADING_3,
SIZE_HEADING_4,
SIZE_HEADING_5,
SIZE_HEADING_6
};
doctype = mmdGetType(doc);
for (i = 1, current = mmdGetFirstChild(doc); current; i ++, current = mmdGetNextSibling(current))
{
switch (curtype = mmdGetType(current))
{
default :
break;
case MMD_TYPE_BLOCK_QUOTE :
format_doc(dd, current, left + 36.0, right - 36.0);
break;
case MMD_TYPE_ORDERED_LIST :
case MMD_TYPE_UNORDERED_LIST :
format_doc(dd, current, left + 36.0, right);
break;
case MMD_TYPE_LIST_ITEM :
if (doctype == MMD_TYPE_ORDERED_LIST)
{
snprintf(leader, sizeof(leader), "%d. ", i);
format_block(dd, current, DOCFONT_REGULAR, SIZE_BODY, left, right, leader);
}
else
{
format_block(dd, current, DOCFONT_REGULAR, SIZE_BODY, left, right, /*leader*/"");
}
break;
case MMD_TYPE_HEADING_1 :
case MMD_TYPE_HEADING_2 :
case MMD_TYPE_HEADING_3 :
case MMD_TYPE_HEADING_4 :
case MMD_TYPE_HEADING_5 :
case MMD_TYPE_HEADING_6 :
if (dd->heading)
free(dd->heading);
dd->heading = mmdCopyAllText(current);
format_block(dd, current, DOCFONT_BOLD, heading_sizes[curtype - MMD_TYPE_HEADING_1], left, right, /*leader*/NULL);
break;
case MMD_TYPE_PARAGRAPH :
format_block(dd, current, DOCFONT_REGULAR, SIZE_BODY, left, right, /*leader*/NULL);
break;
case MMD_TYPE_CODE_BLOCK :
{
mmd_t *code; // Current code block
linefrag_t frag; // Line fragment
double margin_top; // Top margin
frag.x = left + 36.0;
frag.width = 0.0;
frag.height = SIZE_CODEBLOCK;
frag.imagenum = 0;
frag.ws = false;
frag.font = DOCFONT_MONOSPACE;
frag.color = DOCCOLOR_RED;
margin_top = SIZE_CODEBLOCK * LINE_HEIGHT;
for (code = mmdGetFirstChild(current); code; code = mmdGetNextSibling(code))
{
frag.text = mmdGetText(code);
render_line(dd, margin_top, SIZE_CODEBLOCK * LINE_HEIGHT, 1, &frag);
margin_top = 0.0;
}
}
break;
}
}
}
//
// 'output_cb()' - Write PDF data to the standard output...
//
static ssize_t // O - Number of bytes written
output_cb(void *output_cbdata, // I - Callback data (not used)
const void *buffer, // I - Buffer to write
size_t bytes) // I - Number of bytes to write
{
(void)output_cbdata;
return ((ssize_t)fwrite(buffer, 1, bytes, stdout));
}
//
// 'main()' - Convert markdown to PDF.
//
int // O - Exit status
main(int argc, // I - Number of command-line arguments
char *argv[]) // I - Command-line arguments
{
docdata_t dd; // Document data
docfont_t fontface; // Current font
mmd_t *doc; // Markdown document
const char *value; // Metadata value
setbuf(stderr, NULL);
// Get the markdown file from the command-line...
if (argc != 2)
{
fputs("Usage: md2pdf FILENAME.md >FILENAME.pdf\n", stderr);
return (1);
}
if ((doc = mmdLoad(/*root*/NULL, argv[1])) == NULL)
return (1);
// Initialize the document data
memset(&dd, 0, sizeof(dd));
dd.media_box.x2 = PAGE_WIDTH;
dd.media_box.y2 = PAGE_LENGTH;
dd.crop_box.x1 = PAGE_LEFT;
dd.crop_box.y1 = PAGE_FOOTER;
dd.crop_box.x2 = PAGE_RIGHT;
dd.crop_box.y2 = PAGE_HEADER;
dd.art_box.x1 = PAGE_LEFT;
dd.art_box.y1 = PAGE_BOTTOM;
dd.art_box.x2 = PAGE_RIGHT;
dd.art_box.y2 = PAGE_TOP;
dd.title = mmdGetMetadata(doc, "title");
// Output a PDF file to the standard output...
#ifdef _WIN32
setmode(1, O_BINARY); // Force binary output on Windows
#endif // _WIN32
if ((dd.pdf = pdfioFileCreateOutput(output_cb, /*output_cbdata*/NULL, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL, /*error_cb*/NULL, /*error_data*/NULL)) == NULL)
return (1);
if ((value = mmdGetMetadata(doc, "author")) != NULL)
pdfioFileSetAuthor(dd.pdf, value);
if ((value = mmdGetMetadata(doc, "keywords")) != NULL)
pdfioFileSetKeywords(dd.pdf, value);
if ((value = mmdGetMetadata(doc, "subject")) != NULL)
pdfioFileSetSubject(dd.pdf, value);
else if ((value = mmdGetMetadata(doc, "copyright")) != NULL)
pdfioFileSetSubject(dd.pdf, value);
if (dd.title)
pdfioFileSetTitle(dd.pdf, dd.title);
// Add fonts...
for (fontface = DOCFONT_REGULAR; fontface < DOCFONT_MAX; fontface ++)
{
if ((dd.fonts[fontface] = pdfioFileCreateFontObjFromFile(dd.pdf, docfont_filenames[fontface], UNICODE_VALUE)) == NULL)
return (1);
}
// Add images...
add_images(&dd, doc);
// Parse the markdown document...
format_doc(&dd, doc, dd.art_box.x1, dd.art_box.x2);
// Close the PDF and return...
pdfioStreamClose(dd.st);
pdfioFileClose(dd.pdf);
mmdFree(doc);
return (0);
}

235
examples/md2pdf.md Normal file

@ -0,0 +1,235 @@
---
title: Mini-Markdown Test Document
...
All heading levels are supported from 1 to 6, using both the ATX and Setext
forms. As an indented code block:
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6
Setext Heading 1
================
Setext Heading 2
----------------
As block headings:
# Heading 1
## Heading 2
### Heading 3
#### Heading 4
##### Heading 5
###### Heading 6
Setext Heading 1
================
Setext Heading 2
----------------
And block quotes:
> # BQ Heading 1
> ## BQ Heading 2
> ### BQ Heading 3
> #### BQ Heading 4
> ##### BQ Heading 5
> ###### BQ Heading 6
>
> Setext Heading 1
> ================
>
> Setext Heading 2
> ----------------
And ordered lists:
1. First item.
2. Second item.
3. Third item with very long text that wraps
across multiple lines.
With a secondary paragraph associated with
the third item.
And unordered lists:
- First item.
+ Second item.
* Third item.
* [ ] Fourth item (unchecked)
- [x] Fifth item (checked)
Code block with `\``:
```
#include <stdio.h>
int main(void)
{
puts("Hello, World!");
return (0);
}
~~~
```
Code block with `~`:
~~~
#include <stdio.h>
int main(void)
{
puts("Hello, World!");
return (0);
}
```
~~~
Link to [mmd web site](https://michaelrsweet.github.io/mmd).
Normal link to [Heading 1](@).
Code link to [`Heading 2`](@).
Inner emphasized link to [*Heading 3*](@).
Outer emphasized link to *[Heading 3](@)*.
Inner strong link to [**Heading 4**](@).
Outer strong link to **[Heading 4](@)**.
Implicit link to [reference1][].
Shortcut link to [reference1] without a link title.
[reference1]: https://michaelrsweet.github.io/mmd 'MMD Home Page'
[reference2]: https://michaelrsweet.github.io/mmd/mmd.html 'MMD Documentation'
[reference3]: https://michaelrsweet.github.io/mmd/mmd-160.png "MMD Logo"
Link to [mmd web site][reference1] works.
Link to [mmd documentation][reference2] works.
Link to ![mmd logo][reference3] image.
Link to [bad reference][reference4] doesn't work.
Autolink to <https://michaelrsweet.github.io/mmd>.
Autolink in parenthesis (<https://michaelrsweet.github.io/mmd>).
[Link broken
across two lines](https://michaelrsweet.github.io/mmd)
Color JPEG Image: ![Color JPEG Image](../testfiles/color.jpg)
Grayscale JPEG Image: ![Grayscale JPEG Image](../testfiles/gray.jpg)
Color PNG Image: ![Color PNG Image](../testfiles/pdfio-color.png)
Grayscale PNG Image: ![Grayscale PNG Image](../testfiles/pdfio-gray.png)
Indexed PNG Image: ![Indexed PNG Image](../testfiles/pdfio-indexed.png)
This sentence contains *Emphasized Text*, **Bold Text**, and `Code Text` for
testing the MMD parser. The `<mmd.h>` header file.
This sentence contains _Emphasized Text_, __Bold Text__, and
~~Strikethrough Text~~ for testing the MMD parser.
*Emphasized Text Split
Across Two Lines*
**Bold Text Split
Across Two Lines**
`Code Text Split
Across Two lines`
_Emphasized Text Split
Across Two Lines_
__Bold Text Split
Across Two Lines__
~~Strikethrough Text Split
Across Two Lines~~
All work and no play makes Johnny a dull boy.
All work and no play makes Johnny a dull boy.
All work and no play makes Johnny a dull boy.
All work and no play makes Johnny a dull boy.
All work and no play makes Johnny a dull boy.
All work and no play makes Johnny a dull boy.
\(Escaped Parenthesis)
\(*Emphasized Parenthesis*)
\(**Boldface Parenthesis**)
\(`Code Parenthesis`)
Escaped backtick (`\``)
Table as code:
| Heading 1 | Heading 2 | Heading 3 |
| --------- | --------- | --------- |
| Cell 1,1 | Cell 1,2 | Cell 1,3 |
| Cell 2,1 | Cell 2,2 | Cell 2,3 |
| Cell 3,1 | Cell 3,2 | Cell 3,3 |
Table with leading/trailing pipes:
| Heading 1 | Heading 2 | Heading 3 |
| --------- | --------- | --------- |
| Cell 1,1 | Cell 1,2 | Cell 1,3 |
| Cell 2,1 | Cell 2,2 | Cell 2,3 |
| Cell 3,1 | Cell 3,2 | Cell 3,3 |
Table without leading/trailing pipes:
Heading 1 | Heading 2 | Heading 3
--------- | --------- | ---------
Cell 1,1 | Cell 1,2 | Cell 1,3
Cell 2,1 | Cell 2,2 | Cell 2,3
Cell 3,1 | Cell 3,2 | Cell 3,3
Table with alignment:
Left Alignment | Center Alignment | Right Alignment
:-------- | :-------: | --------:
Cell 1,1 | Cell 1,2 | 1
Cell 2,1 | Cell 2,2 | 12
Cell 3,1 | Cell 3,2 | 123
Table in block quote:
> Heading 1 | Heading 2 | Heading 3
> --------- | --------- | ---------
> Cell 1,1 | Cell 1,2 | Cell 1,3
> Cell 2,1 | Cell 2,2 | Cell 2,3
> Cell 3,1 | Cell 3,2 | Cell 3,3
# Tests for Bugs/Edge Cases
Paragraph with "|" that should not
be interpreted as a table.
code before a bulleted list
- First item
- Second item
- Some pathological nested link and inline style features supported by
CommonMark like "`******Really Strong Text******`".

2381
examples/mmd.c Normal file

File diff suppressed because it is too large.

112
examples/mmd.h Normal file

@ -0,0 +1,112 @@
//
// Header file for miniature markdown library.
//
// https://www.msweet.org/mmd
//
// Copyright © 2017-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
//
#ifndef MMD_H
# define MMD_H
# include <stdio.h>
# include <stdbool.h>
# ifdef __cplusplus
extern "C" {
# endif // __cplusplus
//
// Constants...
//
enum mmd_option_e
{
MMD_OPTION_NONE = 0x00, // No markdown extensions
MMD_OPTION_METADATA = 0x01, // Jekyll metadata extension
MMD_OPTION_TABLES = 0x02, // Github table extension
MMD_OPTION_TASKS = 0x04, // Github task item extension (check boxes)
MMD_OPTION_ALL = 0x07 // All supported markdown extensions
};
typedef unsigned mmd_option_t;
typedef enum mmd_type_e
{
MMD_TYPE_NONE = -1,
MMD_TYPE_DOCUMENT, // The document root
MMD_TYPE_METADATA, // Document metadata
MMD_TYPE_BLOCK_QUOTE, // <blockquote>
MMD_TYPE_ORDERED_LIST, // <ol>
MMD_TYPE_UNORDERED_LIST, // <ul>
MMD_TYPE_LIST_ITEM, // <li>
MMD_TYPE_TABLE, // <table>
MMD_TYPE_TABLE_HEADER, // <thead>
MMD_TYPE_TABLE_BODY, // <tbody>
MMD_TYPE_TABLE_ROW, // <tr>
MMD_TYPE_HEADING_1 = 10, // <h1>
MMD_TYPE_HEADING_2, // <h2>
MMD_TYPE_HEADING_3, // <h3>
MMD_TYPE_HEADING_4, // <h4>
MMD_TYPE_HEADING_5, // <h5>
MMD_TYPE_HEADING_6, // <h6>
MMD_TYPE_PARAGRAPH, // <p>
MMD_TYPE_CODE_BLOCK, // <pre><code>
MMD_TYPE_THEMATIC_BREAK, // <hr />
MMD_TYPE_TABLE_HEADER_CELL, // <th>
MMD_TYPE_TABLE_BODY_CELL_LEFT, // <td align="left">
MMD_TYPE_TABLE_BODY_CELL_CENTER, // <td align="center">
MMD_TYPE_TABLE_BODY_CELL_RIGHT, // <td align="right">
MMD_TYPE_NORMAL_TEXT = 100, // Normal text
MMD_TYPE_EMPHASIZED_TEXT, // <em>text</em>
MMD_TYPE_STRONG_TEXT, // <strong>text</strong>
MMD_TYPE_STRUCK_TEXT, // <del>text</del>
MMD_TYPE_LINKED_TEXT, // <a href="link">text</a>
MMD_TYPE_CODE_TEXT, // <code>text</code>
MMD_TYPE_IMAGE, // <img src="link" />
MMD_TYPE_HARD_BREAK, // <br />
MMD_TYPE_SOFT_BREAK, // <wbr />
MMD_TYPE_METADATA_TEXT, // name: value
MMD_TYPE_CHECKBOX // [ ] or [x]
} mmd_type_t;
//
// Types...
//
typedef struct _mmd_s mmd_t; // Markdown node
typedef size_t (*mmd_iocb_t)(void *cbdata, char *buffer, size_t bytes);
// mmdLoadIO callback function
//
// Functions...
//
extern char *mmdCopyAllText(mmd_t *node);
extern void mmdFree(mmd_t *node);
extern const char *mmdGetExtra(mmd_t *node);
extern mmd_t *mmdGetFirstChild(mmd_t *node);
extern mmd_t *mmdGetLastChild(mmd_t *node);
extern const char *mmdGetMetadata(mmd_t *doc, const char *keyword);
extern mmd_t *mmdGetNextSibling(mmd_t *node);
extern mmd_option_t mmdGetOptions(void);
extern mmd_t *mmdGetParent(mmd_t *node);
extern mmd_t *mmdGetPrevSibling(mmd_t *node);
extern const char *mmdGetText(mmd_t *node);
extern mmd_type_t mmdGetType(mmd_t *node);
extern const char *mmdGetURL(mmd_t *node);
extern bool mmdGetWhitespace(mmd_t *node);
extern bool mmdIsBlock(mmd_t *node);
extern mmd_t *mmdLoad(mmd_t *root, const char *filename);
extern mmd_t *mmdLoadFile(mmd_t *root, FILE *fp);
extern mmd_t *mmdLoadIO(mmd_t *root, mmd_iocb_t cb, void *cbdata);
extern mmd_t *mmdLoadString(mmd_t *root, const char *s);
extern void mmdSetOptions(mmd_option_t options);
# ifdef __cplusplus
}
# endif // __cplusplus
#endif // !MMD_H


@ -1,7 +1,7 @@
//
// PDF array functions for PDFio.
//
// Copyright © 2021 by Michael R Sweet.
// Copyright © 2021-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -363,6 +363,9 @@ _pdfioArrayDebug(pdfio_array_t *a, // I - Array
_pdfio_value_t *v; // Current value
if (!a)
return;
putc('[', fp);
for (i = a->num_values, v = a->values; i > 0; i --, v ++)
_pdfioValueDebug(v, fp);
@ -634,6 +637,28 @@ _pdfioArrayRead(pdfio_file_t *pdf, // I - PDF file
}
//
// 'pdfioArrayRemove()' - Remove an array entry.
//
bool // O - `true` on success, `false` otherwise
pdfioArrayRemove(pdfio_array_t *a, // I - Array
size_t n) // I - Index
{
if (!a || n >= a->num_values)
return (false);
if (a->values[n].type == PDFIO_VALTYPE_BINARY)
free(a->values[n].value.binary.data);
a->num_values --;
if (n < a->num_values)
memmove(a->values + n, a->values + n + 1, (a->num_values - n) * sizeof(_pdfio_value_t));
return (true);
}
//
// '_pdfioArrayWrite()' - Write an array to a PDF file.
//
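
The new `pdfioArrayRemove()` shown above deletes one entry by decrementing the count and then sliding the tail of the array down with `memmove()`. The same pattern can be sketched in isolation as follows. `demo_array_t` and `demo_array_remove()` are hypothetical stand-ins rather than PDFio types, and the sketch stores plain `int` values, so it has no equivalent of freeing binary data before removal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

// Hypothetical fixed-capacity array standing in for pdfio_array_t.
typedef struct
{
  size_t num_values;			// Number of values in use
  int    values[8];			// Values
} demo_array_t;

// Remove the entry at index n, returning true on success.
bool
demo_array_remove(demo_array_t *a, size_t n)
{
  // Range check input, as the PDFio function does
  if (!a || n >= a->num_values)
    return (false);

  assert(a->num_values <= sizeof(a->values) / sizeof(a->values[0]));

  a->num_values --;

  // Only move memory when the removed entry was not the last one;
  // memmove is used because source and destination overlap.
  if (n < a->num_values)
    memmove(a->values + n, a->values + n + 1, (a->num_values - n) * sizeof(a->values[0]));

  return (true);
}
```

Decrementing the count first means `a->num_values - n` is exactly the number of entries that follow the removed one, so no separate "tail length" variable is needed.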


@ -2349,7 +2349,7 @@ pdfioPageDictAddColorSpace(
bool // O - `true` on success, `false` on failure
pdfioPageDictAddFont(
pdfio_dict_t *dict, // I - Page dictionary
const char *name, // I - Font name
const char *name, // I - Font name; must not contain spaces
pdfio_obj_t *obj) // I - Font object
{
pdfio_dict_t *resources; // Resource dictionary


@ -1,7 +1,7 @@
//
// PDF dictionary functions for PDFio.
//
// Copyright © 2021-2023 by Michael R Sweet.
// Copyright © 2021-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -18,19 +18,22 @@ static int compare_pairs(_pdfio_pair_t *a, _pdfio_pair_t *b);
//
// '_pdfioDictClear()' - Remove a key/value pair from a dictionary.
// 'pdfioDictClear()' - Remove a key/value pair from a dictionary.
//
void
_pdfioDictClear(pdfio_dict_t *dict, // I - Dictionary
const char *key) // I - Key
bool // O - `true` if cleared, `false` otherwise
pdfioDictClear(pdfio_dict_t *dict, // I - Dictionary
const char *key) // I - Key
{
size_t idx; // Index into pairs
_pdfio_pair_t *pair, // Current pair
pkey; // Search key
PDFIO_DEBUG("_pdfioDictClear(dict=%p, key=\"%s\")\n", dict, key);
PDFIO_DEBUG("pdfioDictClear(dict=%p, key=\"%s\")\n", dict, key);
if (!dict || !key)
return (false);
// See if the key is already set...
if (dict->num_pairs > 0)
@ -48,8 +51,12 @@ _pdfioDictClear(pdfio_dict_t *dict, // I - Dictionary
if (idx < dict->num_pairs)
memmove(pair, pair + 1, (dict->num_pairs - idx) * sizeof(_pdfio_pair_t));
return (true);
}
}
return (false);
}
@ -194,6 +201,9 @@ _pdfioDictDebug(pdfio_dict_t *dict, // I - Dictionary
_pdfio_pair_t *pair; // Current pair
if (!dict)
return;
for (i = dict->num_pairs, pair = dict->pairs; i > 0; i --, pair ++)
{
fprintf(fp, "/%s", pair->key);
@ -332,6 +342,18 @@ pdfioDictGetDict(pdfio_dict_t *dict, // I - Dictionary
}
//
// 'pdfioDictGetKey()' - Get the key for the specified pair.
//
const char * // O - Key for specified pair
pdfioDictGetKey(pdfio_dict_t *dict, // I - Dictionary
size_t n) // I - Pair index (`0`-based)
{
return ((dict && n < dict->num_pairs) ? dict->pairs[n].key : NULL);
}
//
// 'pdfioDictGetName()' - Get a key name value from a dictionary.
//
@ -350,6 +372,17 @@ pdfioDictGetName(pdfio_dict_t *dict, // I - Dictionary
}
//
// 'pdfioDictGetNumPairs()' - Get the number of key/value pairs in a dictionary.
//
size_t // O - Number of pairs
pdfioDictGetNumPairs(pdfio_dict_t *dict)// I - Dictionary
{
return (dict ? dict->num_pairs : 0);
}
//
// 'pdfioDictGetNumber()' - Get a key number value from a dictionary.
//
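
Together, `pdfioDictGetNumPairs()` and `pdfioDictGetKey()` let a caller enumerate a dictionary by index without knowing its keys in advance. The sketch below mirrors the two accessors on a hypothetical `demo_dict_t` (not a PDFio type) so that it compiles on its own, and adds an illustrative `demo_dict_count_prefix()` helper that walks the pairs using only those accessors.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

// Hypothetical key/value pair and dictionary, standing in for PDFio's types.
typedef struct
{
  const char *key;			// Key string
  int        value;			// Value
} demo_pair_t;

typedef struct
{
  size_t      num_pairs;		// Number of pairs in use
  demo_pair_t pairs[4];			// Pairs
} demo_dict_t;

// Same shape as pdfioDictGetNumPairs(): a NULL dictionary has 0 pairs.
size_t
demo_dict_get_num_pairs(demo_dict_t *dict)
{
  return (dict ? dict->num_pairs : 0);
}

// Same shape as pdfioDictGetKey(): NULL for a bad dictionary or index.
const char *
demo_dict_get_key(demo_dict_t *dict, size_t n)
{
  return ((dict && n < dict->num_pairs) ? dict->pairs[n].key : NULL);
}

// Illustrative helper: count the keys that start with "prefix".
size_t
demo_dict_count_prefix(demo_dict_t *dict, const char *prefix)
{
  size_t i;				// Looping var
  size_t count = 0;			// Number of matching keys
  size_t num = demo_dict_get_num_pairs(dict);
  size_t len;				// Length of prefix

  assert(prefix != NULL);
  len = strlen(prefix);

  for (i = 0; i < num; i ++)
  {
    const char *key = demo_dict_get_key(dict, i);

    if (key && !strncmp(key, prefix, len))
      count ++;
  }

  return (count);
}
```

Because both accessors tolerate a `NULL` dictionary, the loop needs no separate guard: a missing dictionary simply yields zero iterations.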


@ -188,6 +188,8 @@ pdfioFileCreate(
int fd; // File descriptor
PDFIO_DEBUG("pdfioFileCreate(filename=\"%s\", version=\"%s\", media_box=%p, crop_box=%p, error_cb=%p, error_cbdata=%p)\n", filename, version, (void *)media_box, (void *)crop_box, (void *)error_cb, (void *)error_cbdata);
// Range check input...
if (!filename)
return (NULL);
@ -242,6 +244,35 @@ pdfioFileCreateArrayObj(
}
//
// 'pdfioFileCreateNameObj()' - Create a new object in a PDF file containing a name.
//
// This function creates a new object with a name value in a PDF file.
// You must call @link pdfioObjClose@ to write the object to the file.
//
pdfio_obj_t * // O - New object
pdfioFileCreateNameObj(
pdfio_file_t *pdf, // I - PDF file
const char *name) // I - Name value
{
_pdfio_value_t value; // Object value
// Range check input...
if (!pdf || !name)
return (NULL);
value.type = PDFIO_VALTYPE_NAME;
value.value.name = pdfioStringCreate(pdf, name);
if (!value.value.name)
return (NULL);
return (_pdfioFileCreateObj(pdf, NULL, &value));
}
//
// 'pdfioFileCreateNumberObj()' - Create a new object in a PDF file containing a number.
//
@ -390,6 +421,8 @@ pdfioFileCreateOutput(
pdfio_error_cb_t error_cb, // I - Error callback or `NULL` for default
void *error_cbdata) // I - Error callback data, if any
{
PDFIO_DEBUG("pdfioFileCreateOutput(output_cb=%p, output_cbdata=%p, version=\"%s\", media_box=%p, crop_box=%p, error_cb=%p, error_cbdata=%p)\n", (void *)output_cb, (void *)output_cbdata, version, (void *)media_box, (void *)crop_box, (void *)error_cb, (void *)error_cbdata);
return (create_common("output.pdf", /*fd*/-1, output_cb, output_cbdata, version, media_box, crop_box, error_cb, error_cbdata));
}
@ -524,6 +557,8 @@ pdfioFileCreateTemporary(
unsigned tmpnum; // Temporary filename number
PDFIO_DEBUG("pdfioFileCreateTemporary(buffer=%p, bufsize=%lu, version=\"%s\", media_box=%p, crop_box=%p, error_cb=%p, error_cbdata=%p)\n", (void *)buffer, (unsigned long)bufsize, version, (void *)media_box, (void *)crop_box, (void *)error_cb, (void *)error_cbdata);
// Range check input...
if (!buffer || bufsize < 32)
{
@ -648,11 +683,12 @@ pdfioFileFindObj(
if ((current = number - 1) >= pdf->num_objs)
current = pdf->num_objs / 2;
PDFIO_DEBUG("pdfioFileFindObj: objs[current=%lu]=%p\n", (unsigned long)current, (void *)pdf->objs[current]);
PDFIO_DEBUG("pdfioFileFindObj: objs[current=%lu]=%p(%lu)\n", (unsigned long)current, (void *)pdf->objs[current], (unsigned long)(pdf->objs[current] ? pdf->objs[current]->number : 0));
if (number == pdf->objs[current]->number)
{
// Fast match...
PDFIO_DEBUG("pdfioFileFindObj: Returning %lu (%p)\n", (unsigned long)current, pdf->objs[current]);
return (pdf->objs[current]);
}
else if (number < pdf->objs[current]->number)
@ -679,11 +715,20 @@ pdfioFileFindObj(
}
if (number == pdf->objs[left]->number)
{
PDFIO_DEBUG("pdfioFileFindObj: Returning %lu (%p)\n", (unsigned long)left, pdf->objs[left]);
return (pdf->objs[left]);
}
else if (number == pdf->objs[right]->number)
{
PDFIO_DEBUG("pdfioFileFindObj: Returning %lu (%p)\n", (unsigned long)right, pdf->objs[right]);
return (pdf->objs[right]);
}
else
{
PDFIO_DEBUG("pdfioFileFindObj: Returning NULL\n");
return (NULL);
}
}
@ -928,6 +973,8 @@ pdfioFileOpen(
off_t xref_offset; // Offset to xref table
PDFIO_DEBUG("pdfioFileOpen(filename=\"%s\", password_cb=%p, password_cbdata=%p, error_cb=%p, error_cbdata=%p)\n", filename, (void *)password_cb, (void *)password_cbdata, (void *)error_cb, (void *)error_cbdata);
// Range check input...
if (!filename)
return (NULL);
@ -1285,6 +1332,8 @@ create_common(
unsigned char id_value[16]; // File ID value
PDFIO_DEBUG("create_common(filename=\"%s\", fd=%d, output_cb=%p, output_cbdata=%p, version=\"%s\", media_box=%p, crop_box=%p, error_cb=%p, error_cbdata=%p)\n", filename, fd, (void *)output_cb, (void *)output_cbdata, version, (void *)media_box, (void *)crop_box, (void *)error_cb, (void *)error_cbdata);
// Range check input...
if (!filename || (fd < 0 && !output_cb))
return (NULL);
@ -1928,7 +1977,7 @@ load_xref(
{
PDFIO_DEBUG("load_xref: '%s' at offset %lu\n", line, (unsigned long)trailer_offset);
if (!strncmp(line, "trailer", 7) && (!line[7] || isspace(line[7] & 255)))
if (!strncmp(line, "trailer", 7) && (!line[7] || isspace(line[7] & 255) || line[7] == '<'))
{
if (line[7])
{
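
The `load_xref` change above widens the trailer test so that a dictionary opening directly after the keyword, with no newline or space, is still recognized. A standalone sketch of just that predicate follows; `is_trailer_line()` is a hypothetical name used for illustration, and the surrounding xref parsing is not reproduced.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

// Hypothetical predicate: does "line" start a trailer section?
//
// Accepts the bare keyword, the keyword followed by whitespace, and the
// keyword followed immediately by '<' (the start of a "<<" dictionary).
bool
is_trailer_line(const char *line)
{
  assert(line != NULL);

  // line[7] is only read when the first 7 characters matched "trailer",
  // so the index is always within the string (possibly at its terminator).
  return (!strncmp(line, "trailer", 7) && (!line[7] || isspace(line[7] & 255) || line[7] == '<'));
}
```

Masking with `& 255` before calling `isspace()` keeps the argument non-negative on platforms where `char` is signed, which is the same idiom the diff uses.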

View File

@ -1,7 +1,7 @@
//
// PDF object functions for PDFio.
//
// Copyright © 2021-2023 by Michael R Sweet.
// Copyright © 2021-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -99,7 +99,7 @@ pdfioObjCopy(pdfio_file_t *pdf, // I - PDF file
return (NULL);
if (dstobj->value.type == PDFIO_VALTYPE_DICT)
_pdfioDictClear(dstobj->value.value.dict, "Length");
pdfioDictClear(dstobj->value.value.dict, "Length");
if (srcobj->stream_offset)
{
@ -333,6 +333,26 @@ pdfioObjGetLength(pdfio_obj_t *obj) // I - Object
}
//
// 'pdfioObjGetName()' - Get the name value associated with an object.
//
const char * // O - Name or `NULL` on error
pdfioObjGetName(pdfio_obj_t *obj) // I - Object
{
if (!obj)
return (NULL);
if (obj->value.type == PDFIO_VALTYPE_NONE)
_pdfioObjLoad(obj);
if (obj->value.type == PDFIO_VALTYPE_NAME)
return (obj->value.value.name);
else
return (NULL);
}
//
// 'pdfioObjGetNumber()' - Get the object's number.
//
@ -347,8 +367,21 @@ pdfioObjGetNumber(pdfio_obj_t *obj) // I - Object
//
// 'pdfioObjGetSubtype()' - Get an object's subtype.
//
// This function returns an object's PDF subtype name, if any. Common subtype
// names include:
//
// - "CIDFontType0": A CID Type0 font
// - "CIDFontType2": A CID TrueType font
// - "Image": An image or image mask
// - "Form": A fillable form
// - "OpenType": An OpenType font
// - "Type0": A composite font
// - "Type1": A PostScript Type1 font
// - "Type3": A PDF Type3 font
// - "TrueType": A TrueType font
//
const char * // O - Object subtype
const char * // O - Object subtype name or `NULL` for none
pdfioObjGetSubtype(pdfio_obj_t *obj) // I - Object
{
pdfio_dict_t *dict; // Object dictionary
@ -364,8 +397,21 @@ pdfioObjGetSubtype(pdfio_obj_t *obj) // I - Object
//
// 'pdfioObjGetType()' - Get an object's type.
//
// This function returns an object's PDF type name, if any. Common type names
// include:
//
// - "CMap": A character map for composite fonts
// - "Font": An embedded font (@link pdfioObjGetSubtype@ will tell you the
// font format)
// - "FontDescriptor": A font descriptor
// - "Page": A (visible) page
// - "Pages": A page tree node
// - "Template": An invisible template page
// - "XObject": An image, image mask, or form (@link pdfioObjGetSubtype@ will
// tell you which)
//
const char * // O - Object type
const char * // O - Object type name or `NULL` for none
pdfioObjGetType(pdfio_obj_t *obj) // I - Object
{
pdfio_dict_t *dict; // Object dictionary


@ -353,7 +353,6 @@ extern void _pdfioCryptoSHA256Init(_pdfio_sha256_t *ctx) _PDFIO_INTERNAL;
extern void _pdfioCryptoSHA256Finish(_pdfio_sha256_t *ctx, uint8_t *Message_Digest) _PDFIO_INTERNAL;
extern bool _pdfioCryptoUnlock(pdfio_file_t *pdf, pdfio_password_cb_t password_cb, void *password_data) _PDFIO_INTERNAL;
extern void _pdfioDictClear(pdfio_dict_t *dict, const char *key) _PDFIO_INTERNAL;
extern bool _pdfioDictDecrypt(pdfio_file_t *pdf, pdfio_obj_t *obj, pdfio_dict_t *dict, size_t depth) _PDFIO_INTERNAL;
extern void _pdfioDictDebug(pdfio_dict_t *dict, FILE *fp) _PDFIO_INTERNAL;
extern void _pdfioDictDelete(pdfio_dict_t *dict) _PDFIO_INTERNAL;


@ -408,6 +408,7 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
pdfio_stream_t *st; // Stream
pdfio_dict_t *dict = pdfioObjGetDict(obj);
// Object dictionary
const char *type; // Object type
PDFIO_DEBUG("_pdfioStreamOpen(obj=%p(%u), decode=%s)\n", obj, (unsigned)obj->number, decode ? "true" : "false");
@ -434,7 +435,9 @@ _pdfioStreamOpen(pdfio_obj_t *obj, // I - Object
return (NULL);
}
if (obj->pdf->encryption)
type = pdfioObjGetType(obj);
if (obj->pdf->encryption && (!type || strcmp(type, "XRef")))
{
uint8_t iv[64]; // Initialization vector
size_t ivlen; // Length of initialization vector, if any
@ -1061,19 +1064,11 @@ stream_read(pdfio_stream_t *st, // I - Stream
st->flate.next_out = (Bytef *)buffer;
st->flate.avail_out = (uInt)bytes;
avail_in = st->flate.avail_in;
avail_out = st->flate.avail_out;
if ((status = inflate(&(st->flate), Z_NO_FLUSH)) < Z_OK)
{
_pdfioFileError(st->pdf, "Unable to decompress stream data for object %ld: %s", (long)st->obj->number, zstrerror(status));
return (-1);
}
else if (avail_in == st->flate.avail_in && avail_out == st->flate.avail_out)
{
_pdfioFileError(st->pdf, "Corrupt stream data.");
return (-1);
}
return (st->flate.next_out - (Bytef *)buffer);
}


@ -215,6 +215,9 @@ void
_pdfioValueDebug(_pdfio_value_t *v, // I - Value
FILE *fp) // I - Output file
{
if (!v)
return;
switch (v->type)
{
case PDFIO_VALTYPE_ARRAY :


@ -23,7 +23,7 @@ extern "C" {
// Version number...
//
# define PDFIO_VERSION "1.3.0"
# define PDFIO_VERSION "1.4.0"
//
@ -151,7 +151,9 @@ extern pdfio_obj_t *pdfioArrayGetObj(pdfio_array_t *a, size_t n) _PDFIO_PUBLIC;
extern size_t pdfioArrayGetSize(pdfio_array_t *a) _PDFIO_PUBLIC;
extern const char *pdfioArrayGetString(pdfio_array_t *a, size_t n) _PDFIO_PUBLIC;
extern pdfio_valtype_t pdfioArrayGetType(pdfio_array_t *a, size_t n) _PDFIO_PUBLIC;
extern bool pdfioArrayRemove(pdfio_array_t *a, size_t n) _PDFIO_PUBLIC;
extern bool pdfioDictClear(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern pdfio_dict_t *pdfioDictCopy(pdfio_file_t *pdf, pdfio_dict_t *dict) _PDFIO_PUBLIC;
extern pdfio_dict_t *pdfioDictCreate(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern pdfio_array_t *pdfioDictGetArray(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
@ -159,7 +161,9 @@ extern unsigned char *pdfioDictGetBinary(pdfio_dict_t *dict, const char *key, si
extern bool pdfioDictGetBoolean(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern time_t pdfioDictGetDate(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern pdfio_dict_t *pdfioDictGetDict(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern const char *pdfioDictGetKey(pdfio_dict_t *dict, size_t n) _PDFIO_PUBLIC;
extern const char *pdfioDictGetName(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern size_t pdfioDictGetNumPairs(pdfio_dict_t *dict) _PDFIO_PUBLIC;
extern double pdfioDictGetNumber(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern pdfio_obj_t *pdfioDictGetObj(pdfio_dict_t *dict, const char *key) _PDFIO_PUBLIC;
extern pdfio_rect_t *pdfioDictGetRect(pdfio_dict_t *dict, const char *key, pdfio_rect_t *rect) _PDFIO_PUBLIC;
@ -182,6 +186,7 @@ extern bool pdfioDictSetStringf(pdfio_dict_t *dict, const char *key, const char
extern bool pdfioFileClose(pdfio_file_t *pdf) _PDFIO_PUBLIC;
extern pdfio_file_t *pdfioFileCreate(const char *filename, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_data) _PDFIO_PUBLIC;
extern pdfio_obj_t *pdfioFileCreateArrayObj(pdfio_file_t *pdf, pdfio_array_t *array) _PDFIO_PUBLIC;
extern pdfio_obj_t *pdfioFileCreateNameObj(pdfio_file_t *pdf, const char *name) _PDFIO_PUBLIC;
extern pdfio_obj_t *pdfioFileCreateNumberObj(pdfio_file_t *pdf, double number) _PDFIO_PUBLIC;
extern pdfio_obj_t *pdfioFileCreateObj(pdfio_file_t *pdf, pdfio_dict_t *dict) _PDFIO_PUBLIC;
extern pdfio_file_t *pdfioFileCreateOutput(pdfio_output_cb_t output_cb, void *output_ctx, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_data) _PDFIO_PUBLIC;
@ -222,6 +227,7 @@ extern pdfio_array_t *pdfioObjGetArray(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern pdfio_dict_t *pdfioObjGetDict(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern unsigned short pdfioObjGetGeneration(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern size_t pdfioObjGetLength(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern const char *pdfioObjGetName(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern size_t pdfioObjGetNumber(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern const char *pdfioObjGetSubtype(pdfio_obj_t *obj) _PDFIO_PUBLIC;
extern const char *pdfioObjGetType(pdfio_obj_t *obj) _PDFIO_PUBLIC;


@ -1,7 +1,8 @@
LIBRARY pdfio1
VERSION 1.2
VERSION 1.4
EXPORTS
_pdfioArrayDebug
_pdfioArrayDecrypt
_pdfioArrayDelete
_pdfioArrayGetValue
_pdfioArrayRead
@ -22,8 +23,8 @@ _pdfioCryptoSHA256Append
_pdfioCryptoSHA256Finish
_pdfioCryptoSHA256Init
_pdfioCryptoUnlock
_pdfioDictClear
_pdfioDictDebug
_pdfioDictDecrypt
_pdfioDictDelete
_pdfioDictGetValue
_pdfioDictRead
@ -61,9 +62,12 @@ _pdfioTokenPush
_pdfioTokenRead
_pdfioValueCopy
_pdfioValueDebug
_pdfioValueDecrypt
_pdfioValueDelete
_pdfioValueRead
_pdfioValueWrite
_pdfio_strtod
_pdfio_vsnprintf
pdfioArrayAppendArray
pdfioArrayAppendBinary
pdfioArrayAppendBoolean
@ -91,6 +95,7 @@ pdfioArrayGetObj
pdfioArrayGetSize
pdfioArrayGetString
pdfioArrayGetType
pdfioArrayRemove
pdfioContentClip
pdfioContentDrawImage
pdfioContentFill
@ -148,6 +153,7 @@ pdfioContentTextNextLine
pdfioContentTextShow
pdfioContentTextShowJustified
pdfioContentTextShowf
pdfioDictClear
pdfioDictCopy
pdfioDictCreate
pdfioDictGetArray
@ -155,7 +161,9 @@ pdfioDictGetBinary
pdfioDictGetBoolean
pdfioDictGetDate
pdfioDictGetDict
pdfioDictGetKey
pdfioDictGetName
pdfioDictGetNumPairs
pdfioDictGetNumber
pdfioDictGetObj
pdfioDictGetRect
@ -182,6 +190,7 @@ pdfioFileCreateFontObjFromFile
pdfioFileCreateICCObjFromFile
pdfioFileCreateImageObjFromData
pdfioFileCreateImageObjFromFile
pdfioFileCreateNameObj
pdfioFileCreateNumberObj
pdfioFileCreateObj
pdfioFileCreateOutput
@ -190,6 +199,7 @@ pdfioFileCreateStringObj
pdfioFileCreateTemporary
pdfioFileFindObj
pdfioFileGetAuthor
pdfioFileGetCatalog
pdfioFileGetCreationDate
pdfioFileGetCreator
pdfioFileGetID
@ -222,6 +232,7 @@ pdfioObjGetArray
pdfioObjGetDict
pdfioObjGetGeneration
pdfioObjGetLength
pdfioObjGetName
pdfioObjGetNumber
pdfioObjGetSubtype
pdfioObjGetType


@ -3,7 +3,7 @@
<metadata>
<id>pdfio_native</id>
<title>PDFio Library for VS2019+</title>
<version>1.3.0</version>
<version>1.3.2</version>
<authors>Michael R Sweet</authors>
<owners>michaelrsweet</owners>
<projectUrl>https://github.com/michaelrsweet/pappl</projectUrl>
@ -16,7 +16,7 @@
<copyright>Copyright © 2019-2024 by Michael R Sweet</copyright>
<tags>pdf file native</tags>
<dependencies>
<dependency id="pdfio_native.redist" version="1.3.0" />
<dependency id="pdfio_native.redist" version="1.3.2" />
<dependency id="zlib_native.redist" version="1.2.11" />
</dependencies>
</metadata>


@ -3,7 +3,7 @@
<metadata>
<id>pdfio_native.redist</id>
<title>PDFio Library for VS2019+</title>
<version>1.3.0</version>
<version>1.3.2</version>
<authors>Michael R Sweet</authors>
<owners>michaelrsweet</owners>
<projectUrl>https://github.com/michaelrsweet/pappl</projectUrl>


@ -27,7 +27,7 @@
//
static int do_crypto_tests(void);
static int do_test_file(const char *filename, int objnum, bool verbose);
static int do_test_file(const char *filename, int objnum, const char *password, bool verbose);
static int do_unit_tests(void);
static int draw_image(pdfio_stream_t *st, const char *name, double x, double y, double w, double h, const char *label);
static bool error_cb(pdfio_file_t *pdf, const char *message, bool *error);
@ -37,6 +37,7 @@ static const char *password_cb(void *data, const char *filename);
static int read_unit_file(const char *filename, size_t num_pages, size_t first_image, bool is_output);
static ssize_t token_consume_cb(const char **s, size_t bytes);
static ssize_t token_peek_cb(const char **s, char *buffer, size_t bytes);
static int usage(FILE *fp);
static int verify_image(pdfio_file_t *pdf, size_t number);
static int write_alpha_test(pdfio_file_t *pdf, int number, pdfio_obj_t *font);
static int write_color_patch(pdfio_stream_t *st, bool device);
@ -59,22 +60,33 @@ int // O - Exit status
main(int argc, // I - Number of command-line arguments
char *argv[]) // I - Command-line arguments
{
int ret = 0; // Return value
int ret = 0; // Return value
fprintf(stderr, "testpdfio: Test locale is \"%s\".\n", setlocale(LC_ALL, getenv("LANG")));
if (argc > 1)
{
int i; // Looping var
const char *password = NULL; // Password
bool verbose = false; // Be verbose?
for (i = 1; i < argc; i ++)
{
if (!strcmp(argv[i], "--help"))
{
puts("Usage: ./testpdfio [--help] [--verbose] [filename [objnum] ...]");
return (0);
return (usage(stdout));
}
else if (!strcmp(argv[i], "--password"))
{
i ++;
if (i < argc)
{
password = argv[i];
}
else
{
fputs("testpdfio: Missing password after '--password'.\n", stderr);
return (usage(stderr));
}
}
else if (!strcmp(argv[i], "--verbose"))
{
@ -82,24 +94,27 @@ main(int argc, // I - Number of command-line arguments
}
else if (argv[i][0] == '-')
{
printf("Unknown option '%s'.\n\n", argv[i]);
puts("Usage: ./testpdfio [--help] [--verbose] [filename [objnum] ...]");
return (1);
fprintf(stderr, "testpdfio: Unknown option '%s'.\n", argv[i]);
return (usage(stderr));
}
else if ((i + 1) < argc && isdigit(argv[i + 1][0] & 255))
{
// filename.pdf object-number
if (do_test_file(argv[i], atoi(argv[i + 1]), verbose))
if (do_test_file(argv[i], atoi(argv[i + 1]), password, verbose))
ret = 1;
i ++;
}
else if (do_test_file(argv[i], 0, verbose))
else if (do_test_file(argv[i], 0, password, verbose))
{
ret = 1;
}
}
}
else
{
fprintf(stderr, "testpdfio: Test locale is \"%s\".\n", setlocale(LC_ALL, getenv("LANG")));
#if _WIN32
// Windows puts executables in Platform/Configuration subdirs...
if (!_access("../../testfiles", 0))
@ -363,6 +378,7 @@ do_crypto_tests(void)
static int // O - Exit status
do_test_file(const char *filename, // I - PDF filename
int objnum, // I - Object number to dump, if any
const char *password, // I - Password for file
bool verbose) // I - Be verbose?
{
bool error = false; // Have we shown an error yet?
@ -381,7 +397,7 @@ do_test_file(const char *filename, // I - PDF filename
fflush(stdout);
}
if ((pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_data*/NULL, (pdfio_error_cb_t)error_cb, &error)) != NULL)
if ((pdf = pdfioFileOpen(filename, password_cb, (void *)password, (pdfio_error_cb_t)error_cb, &error)) != NULL)
{
if (objnum)
{
@ -1559,6 +1575,23 @@ token_peek_cb(const char **s, // IO - Test string
}
//
// 'usage()' - Show program usage.
//
static int // O - Exit status
usage(FILE *fp) // I - Output file
{
fputs("Usage: ./testpdfio [OPTIONS] [FILENAME [OBJNUM]] ...\n", fp);
fputs("Options:\n", fp);
fputs(" --help Show program help.\n", fp);
fputs(" --password PASSWORD Set PDF password.\n", fp);
fputs(" --verbose Be verbose.\n", fp);
return (fp != stdout);
}
//
// 'verify_image()' - Verify an image object.
//
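
The `--password` change above forwards the command-line password to `pdfioFileOpen` through the callback's data pointer. A minimal sketch of that hand-off follows, assuming a callback that simply returns its data pointer. The names `example_password_cb_t`, `example_password_cb`, and `example_unlock` are hypothetical stand-ins and not part of PDFio; only the callback shape (`void *data, const char *filename`) is taken from the `password_cb` prototype in the diff.

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical callback type mirroring the shape of password_cb in the diff:
// it receives the caller's data pointer and the filename and returns the
// password to try, or NULL when there is none.
typedef const char *(*example_password_cb_t)(void *data, const char *filename);

// Return the password that the caller passed through the data pointer.
static const char *
example_password_cb(void *data, const char *filename)
{
  (void)filename;                       // Not needed in this sketch

  return ((const char *)data);
}

// Hypothetical opener: asks the callback for a password and compares it with
// the expected one.  A real library would derive an encryption key instead.
static bool
example_unlock(const char *filename, const char *expected, example_password_cb_t cb, void *cb_data)
{
  const char *password = cb ? (cb)(cb_data, filename) : NULL;

  return (password != NULL && !strcmp(password, expected));
}
```

Passing the password as callback data keeps the callback free of global state, so the same function can serve several files with different passwords.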


@ -3,7 +3,7 @@
//
// https://github.com/michaelrsweet/ttf
//
// Copyright © 2018-2023 by Michael R Sweet.
// Copyright © 2018-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -120,6 +120,7 @@ test_font(const char *filename) // I - Font filename
printf("ttfCreate(\"%s\"): ", filename);
fflush(stdout);
if ((font = ttfCreate(filename, 0, error_cb, NULL)) != NULL)
puts("PASS");
else

ttf.c

@ -3,7 +3,7 @@
//
// https://github.com/michaelrsweet/ttf
//
// Copyright © 2018-2023 by Michael R Sweet.
// Copyright © 2018-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -62,7 +62,7 @@
# define O_CREAT _O_CREAT
# define O_TRUNC _O_TRUNC
typedef __int64 ssize_t; // POSIX type not present on Windows...
typedef __int64 ssize_t; // POSIX type not present on Windows... @private@
#else
# include <unistd.h>
@ -99,6 +99,8 @@ typedef __int64 ssize_t; // POSIX type not present on Windows...
//
#define TTF_FONT_MAX_CHAR 262144 // Maximum number of character values
#define TTF_FONT_MAX_GROUPS 65536 // Maximum number of sub-groups
#define TTF_FONT_MAX_NAMES 16777216// Maximum size of names table we support
//
@ -254,7 +256,7 @@ typedef struct _ttf_off_hhea_s // Horizontal header
{
short ascender, // Ascender
descender; // Descender
int numberOfHMetrics; // Number of horizontal metrics
unsigned short numberOfHMetrics; // Number of horizontal metrics
} _ttf_off_hhea_t;
typedef struct _ttf_off_os_2_s // OS/2 information
@ -297,7 +299,28 @@ static unsigned seek_table(ttf_t *font, unsigned tag, unsigned offset, bool requ
//
// 'ttfCreate()' - Create a new font object for the named font family.
// 'ttfCreate()' - Create a new font object for the named font file.
//
// This function creates a new font object for the named TrueType or OpenType
// font file or collection. The "filename" argument specifies the name of the
// file to read.
//
// The "idx" argument specifies the font to load from a collection - the first
// font is number `0`. Once created, you can call the @link ttfGetNumFonts@
// function to determine whether the loaded font file is a collection with more
// than one font.
//
// The "err_cb" and "err_data" arguments specify a callback function and data
// pointer for receiving error messages. If `NULL`, errors are sent to the
// `stderr` file. The callback function receives the data pointer and a text
// message string, for example:
//
// ```
// void my_err_cb(void *err_data, const char *message)
// {
// fprintf(stderr, "ERROR: %s\n", message);
// }
// ```
//
ttf_t * // O - New font object
@ -550,6 +573,10 @@ ttfGetAscent(ttf_t *font) // I - Font
//
// 'ttfGetBounds()' - Get the bounds of all characters in a font.
//
// This function gets the bounds of all characters in a font. The "bounds"
// argument is a pointer to a `ttf_rect_t` structure that will be filled with
// the limits for characters in the font scaled to a 1000x1000 unit square.
//
ttf_rect_t * // O - Bounds or `NULL` on error
ttfGetBounds(ttf_t *font, // I - Font
@ -631,8 +658,11 @@ ttfGetDescent(ttf_t *font) // I - Font
//
// 'ttfGetExtents()' - Get the extents of a UTF-8 string.
//
// This function computes the extents of a UTF-8 string when rendered using the
// specified font and size.
// This function computes the extents of the UTF-8 string "s" when rendered
// using the specified font "font" and size "size". The "extents" argument is
// a pointer to a `ttf_rect_t` structure that is filled with the extents of a
// simple rendering of the string with no kerning or rewriting applied. The
// values are scaled using the specified font size.
//
ttf_rect_t * // O - Pointer to extents or `NULL` on error
@ -1272,20 +1302,38 @@ read_cmap(ttf_t *font) // I - Font
for (i = 0; i < numGlyphIdArray; i ++)
glyphIdArray[i] = read_ushort(font);
#ifdef DEBUG
for (i = 0; i < segCount; i ++)
TTF_DEBUG("read_cmap: segment[%d].startCode=%d, endCode=%d, idDelta=%d, idRangeOffset=%d\n", i, segments[i].startCode, segments[i].endCode, segments[i].idDelta, segments[i].idRangeOffset);
for (i = 0, segment = segments; i < segCount; i ++, segment ++)
{
TTF_DEBUG("read_cmap: segment[%d].startCode=%d, endCode=%d, idDelta=%d, idRangeOffset=%d\n", i, segment->startCode, segment->endCode, segment->idDelta, segment->idRangeOffset);
if (segment->startCode > segment->endCode)
{
errorf(font, "Bad cmap table segment %u to %u.", segment->startCode, segment->endCode);
free(segments);
free(glyphIdArray);
return (false);
}
// Based on the end code of the segment table, allocate space for the
// uncompressed cmap table...
if (segment->endCode >= font->num_cmap)
font->num_cmap = segment->endCode + 1;
}
#ifdef DEBUG
for (i = 0; i < numGlyphIdArray; i ++)
TTF_DEBUG("read_cmap: glyphIdArray[%d]=%d\n", i, glyphIdArray[i]);
#endif /* DEBUG */
// Based on the end code of the segment table, allocate space for the
// uncompressed cmap table...
// segCount --; // Last segment is not used (sigh)
if (font->num_cmap == 0 || font->num_cmap > TTF_FONT_MAX_CHAR)
{
errorf(font, "Invalid cmap table with %u characters.", (unsigned)font->num_cmap);
free(segments);
free(glyphIdArray);
return (false);
}
font->num_cmap = segments[segCount - 1].endCode + 1;
font->cmap = cmapptr = (int *)malloc(font->num_cmap * sizeof(int));
font->cmap = cmapptr = (int *)malloc(font->num_cmap * sizeof(int));
if (!font->cmap)
{
@ -1356,6 +1404,12 @@ read_cmap(ttf_t *font) // I - Font
TTF_DEBUG("read_cmap: nGroups=%u\n", nGroups);
if (nGroups > TTF_FONT_MAX_GROUPS)
{
errorf(font, "Invalid cmap table with %u groups.", nGroups);
return (false);
}
if ((groups = (_ttf_off_cmap12_t *)calloc(nGroups, sizeof(_ttf_off_cmap12_t))) == NULL)
{
errorf(font, "Unable to allocate memory for cmap.");
@ -1369,6 +1423,13 @@ read_cmap(ttf_t *font) // I - Font
group->startGlyphID = read_ulong(font);
TTF_DEBUG("read_cmap: [%u] startCharCode=%u, endCharCode=%u, startGlyphID=%u\n", gidx, group->startCharCode, group->endCharCode, group->startGlyphID);
if (group->startCharCode > group->endCharCode)
{
errorf(font, "Bad cmap table segment %u to %u.", group->startCharCode, group->endCharCode);
free(groups);
return (false);
}
if (group->endCharCode >= font->num_cmap)
font->num_cmap = group->endCharCode + 1;
}
@ -1376,6 +1437,14 @@ read_cmap(ttf_t *font) // I - Font
// Based on the end code of the segment table, allocate space for the
// uncompressed cmap table...
TTF_DEBUG("read_cmap: num_cmap=%u\n", (unsigned)font->num_cmap);
if (font->num_cmap == 0 || font->num_cmap > TTF_FONT_MAX_CHAR)
{
errorf(font, "Invalid cmap table with %u characters.", (unsigned)font->num_cmap);
free(groups);
return (false);
}
font->cmap = cmapptr = (int *)malloc(font->num_cmap * sizeof(int));
if (!font->cmap)
@ -1426,6 +1495,12 @@ read_cmap(ttf_t *font) // I - Font
TTF_DEBUG("read_cmap: nGroups=%u\n", nGroups);
if (nGroups > TTF_FONT_MAX_GROUPS)
{
errorf(font, "Invalid cmap table with %u groups.", nGroups);
return (false);
}
if ((groups = (_ttf_off_cmap13_t *)calloc(nGroups, sizeof(_ttf_off_cmap13_t))) == NULL)
{
errorf(font, "Unable to allocate memory for cmap.");
@ -1439,6 +1514,13 @@ read_cmap(ttf_t *font) // I - Font
group->glyphID = read_ulong(font);
TTF_DEBUG("read_cmap: [%u] startCharCode=%u, endCharCode=%u, glyphID=%u\n", gidx, group->startCharCode, group->endCharCode, group->glyphID);
if (group->startCharCode > group->endCharCode)
{
errorf(font, "Bad cmap table segment %u to %u.", group->startCharCode, group->endCharCode);
free(groups);
return (false);
}
if (group->endCharCode >= font->num_cmap)
font->num_cmap = group->endCharCode + 1;
}
@ -1446,6 +1528,14 @@ read_cmap(ttf_t *font) // I - Font
// Based on the end code of the segment table, allocate space for the
// uncompressed cmap table...
TTF_DEBUG("read_cmap: num_cmap=%u\n", (unsigned)font->num_cmap);
if (font->num_cmap == 0 || font->num_cmap > TTF_FONT_MAX_CHAR)
{
errorf(font, "Invalid cmap table with %u characters.", (unsigned)font->num_cmap);
free(groups);
return (false);
}
font->cmap = cmapptr = (int *)malloc(font->num_cmap * sizeof(int));
if (!font->cmap)
@ -1565,7 +1655,7 @@ read_hmtx(ttf_t *font, // I - Font
_ttf_off_hhea_t *hhea) // O - hhea table data
{
unsigned length; // Length of hmtx table
int i; // Looping var
unsigned i; // Looping var
_ttf_metric_t *widths; // Glyph metrics array
@ -1644,8 +1734,15 @@ read_names(ttf_t *font) // I - Font
return (false);
font->names.storage_size = length - (unsigned)offset;
if (font->names.storage_size > TTF_FONT_MAX_NAMES)
{
errorf(font, "Name table too large - %u bytes.", (unsigned)font->names.storage_size);
return (false);
}
if ((font->names.storage = malloc(font->names.storage_size)) == NULL)
return (false);
memset(font->names.storage, 'A', font->names.storage_size);
for (i = font->names.num_names, name = font->names.names; i > 0; i --, name ++)
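
The cmap hunks above all apply the same validation before allocating: reject any range whose start exceeds its end, track the highest end code, and refuse a table that is empty or larger than a fixed limit. A minimal standalone sketch of that pattern follows. The names `example_range_t`, `example_cmap_size`, and `EXAMPLE_MAX_CHAR` are hypothetical; only the limit value (262144, as in `TTF_FONT_MAX_CHAR`) and the order of checks come from the diff.

```c
#include <stdbool.h>
#include <stddef.h>

#define EXAMPLE_MAX_CHAR 262144         // Same value as TTF_FONT_MAX_CHAR in the diff

// Hypothetical character range with inclusive start and end codes.
typedef struct example_range_s
{
  unsigned start;                       // First character code
  unsigned end;                         // Last character code
} example_range_t;

// Compute the number of cmap entries needed for a set of ranges, or return
// false when a range is reversed or the resulting size is out of bounds.
static bool
example_cmap_size(const example_range_t *ranges, size_t num_ranges, size_t *num_cmap)
{
  size_t i;                             // Looping var
  size_t count = 0;                     // Entries needed so far

  for (i = 0; i < num_ranges; i ++)
  {
    // A reversed range means the table is corrupt...
    if (ranges[i].start > ranges[i].end)
      return (false);

    // Grow the table to cover the highest end code seen...
    if (ranges[i].end >= count)
      count = (size_t)ranges[i].end + 1;
  }

  // Validate the size before the caller allocates memory...
  if (count == 0 || count > EXAMPLE_MAX_CHAR)
    return (false);

  *num_cmap = count;

  return (true);
}
```

Checking the size before calling `malloc` means a malformed font fails with an error message rather than a huge or zero-length allocation.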

ttf.h

@ -3,7 +3,7 @@
//
// https://github.com/michaelrsweet/ttf
//
// Copyright © 2018-2023 by Michael R Sweet.
// Copyright © 2018-2024 by Michael R Sweet.
//
// Licensed under Apache License v2.0. See the file "LICENSE" for more
// information.
@ -11,6 +11,7 @@
#ifndef TTF_H
# define TTF_H
# include <stddef.h>
# include <stdbool.h>
# include <sys/types.h>
# ifdef __cplusplus
@ -22,12 +23,12 @@ extern "C" {
// Types...
//
typedef struct _ttf_s ttf_t; //// Font object
typedef struct _ttf_s ttf_t; // Font object
typedef void (*ttf_err_cb_t)(void *data, const char *message);
//// Font error callback
// Font error callback
typedef enum ttf_stretch_e //// Font stretch
typedef enum ttf_stretch_e // Font stretch
{
TTF_STRETCH_NORMAL, // normal
TTF_STRETCH_ULTRA_CONDENSED, // ultra-condensed
@ -40,20 +41,20 @@ typedef enum ttf_stretch_e //// Font stretch
TTF_STRETCH_ULTRA_EXPANDED // ultra-expanded
} ttf_stretch_t;
typedef enum ttf_style_e //// Font style
typedef enum ttf_style_e // Font style
{
TTF_STYLE_NORMAL, // Normal font
TTF_STYLE_ITALIC, // Italic font
TTF_STYLE_OBLIQUE // Oblique (angled) font
} ttf_style_t;
typedef enum ttf_variant_e //// Font variant
typedef enum ttf_variant_e // Font variant
{
TTF_VARIANT_NORMAL, // Normal font
TTF_VARIANT_SMALL_CAPS // Font whose lowercase letters are small capitals
} ttf_variant_t;
typedef enum ttf_weight_e //// Font weight
typedef enum ttf_weight_e // Font weight
{
TTF_WEIGHT_100 = 100, // Weight 100 (Thin)
TTF_WEIGHT_200 = 200, // Weight 200 (Extra/Ultra-Light)
@ -66,7 +67,7 @@ typedef enum ttf_weight_e //// Font weight
TTF_WEIGHT_900 = 900 // Weight 900 (Black/Heavy)
} ttf_weight_t;
typedef struct ttf_rect_s //// Bounding rectangle
typedef struct ttf_rect_s // Bounding rectangle
{
float left; // Left offset
float top; // Top offset