Michael R Sweet
Copyright © 2021-2024 by Michael R Sweet
PDFio is a simple C library for reading and writing PDF files. The primary goals of PDFio are:
Read and write any version of PDF file
Provide access to pages, objects, and streams within a PDF file
Support reading and writing of encrypted PDF files
Extract or embed useful metadata (author, creator, page information, etc.)
"Filter" PDF files, for example to extract a range of pages or to embed fonts that are missing from a PDF
Provide access to objects used for each page
PDFio is not concerned with rendering or viewing a PDF file, although a PDF RIP or viewer could be written using it.
PDFio is Copyright © 2021-2024 by Michael R Sweet and is licensed under the Apache License Version 2.0 with an (optional) exception to allow linking against GPL2/LGPL2 software. See the files "LICENSE" and "NOTICE" for more information.
PDFio requires the following to build the software:
A C99 compiler such as Clang, GCC, or MS Visual C
A POSIX-compliant make program
A POSIX-compliant sh program
ZLIB (https://www.zlib.net) 1.0 or higher
IDE files for Xcode (macOS/iOS) and Visual Studio (Windows) are also provided.
PDFio comes with a configure script that creates a portable makefile that will work on any POSIX-compliant system with ZLIB installed. To make it, run:
./configure
make
To test it, run:
make test
To install it, run:
sudo make install
If you want a shared library, run:
./configure --enable-shared
make
sudo make install
The default installation location is "/usr/local". Pass the --prefix option to configure to install it to another location:
./configure --prefix=/some/other/directory
Other configure options can be found using the --help option:
./configure --help
The Visual Studio solution ("pdfio.sln") is provided for Windows developers and generates both a static library and DLL.
There is also an Xcode project ("pdfio.xcodeproj") you can use on macOS which generates a static library that will be installed under "/usr/local" with:
sudo xcodebuild install
PDFio can be detected using the pkg-config command, for example:
if pkg-config --exists pdfio; then
...
fi
In a makefile you can add the necessary compiler and linker options with:
CFLAGS += `pkg-config --cflags pdfio`
LIBS += `pkg-config --libs pdfio`
On Windows, you need to link to the PDFIO1.LIB (DLL) library and include the zlib_native NuGet package dependency. You can also use the published pdfio_native NuGet package.
PDFio provides a primary header file that is always used:
#include <pdfio.h>
PDFio also provides PDF content helper functions for producing PDF content that are defined in a separate header file:
#include <pdfio-content.h>
A PDF file provides data and commands for displaying pages of graphics and text, and is structured in a way that allows it to be displayed in the same way across multiple devices and platforms. The following is a PDF which shows "Hello, World!" on one page:
%PDF-1.0 % Header starts here
%âãÏÓ
1 0 obj % Body starts here
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
2 0 obj
<<
/Rotate 0
/Parent 1 0 R
/Resources 3 0 R
/MediaBox [0 0 612 792]
/Contents [4 0 R]/Type /Page
>>
endobj
3 0 obj
<<
/Font
<<
/F0
<<
/BaseFont /Times-Italic
/Subtype /Type1
/Type /Font
>>
>>
>>
endobj
4 0 obj
<<
/Length 65
>>
stream
1. 0. 0. 1. 50. 700. cm
BT
/F0 36. Tf
(Hello, World!) Tj
ET
endstream
endobj
5 0 obj
<<
/Pages 1 0 R
/Type /Catalog
>>
endobj
xref % Cross-reference table starts here
0 6
0000000000 65535 f
0000000015 00000 n
0000000074 00000 n
0000000192 00000 n
0000000291 00000 n
0000000409 00000 n
trailer % Trailer starts here
<<
/Root 5 0 R
/Size 6
>>
startxref
459
%%EOF
The header is the first line of a PDF file that specifies the version of the PDF format that has been used, for example %PDF-1.0.
Since PDF files almost always contain binary data, they can become corrupted if line endings are changed, for example if the file is transferred using FTP in text mode or is edited in Notepad on Windows. To allow legacy file transfer programs to determine that the file is binary, the PDF standard recommends including some bytes with character codes higher than 127 in the header, for example:
%âãÏÓ
The percent sign indicates a comment line while the other few bytes are arbitrary character codes in excess of 127. So, the whole header in our example is:
%PDF-1.0
%âãÏÓ
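As a minimal sketch of how a reader might check this header, the following function extracts the version numbers from the first bytes of a file. The name parse_pdf_header is hypothetical and is not part of PDFio, which performs its own header checks when a file is opened:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): parse the "%PDF-M.N" header from
// the first bytes of a file and return the major and minor version numbers.
bool
parse_pdf_header(const char *buf, size_t len, int *major, int *minor)
{
  char   line[16];                      // NUL-terminated copy of the first bytes
  size_t n = len < sizeof(line) - 1 ? len : sizeof(line) - 1;

  memcpy(line, buf, n);
  line[n] = '\0';

  // The shortest valid header is 8 bytes long and starts with "%PDF-"
  if (n < 8 || strncmp(line, "%PDF-", 5) != 0)
    return (false);

  // The version follows as "major.minor"
  return (sscanf(line + 5, "%d.%d", major, minor) == 2);
}
```

Only the first line is examined here; the binary marker on the second line is ignored.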
The file body consists of a sequence of objects, each preceded by an object number, generation number, and the obj keyword on one line, and followed by the endobj keyword on another. For example:
1 0 obj
<<
/Kids [2 0 R]
/Count 1
/Type /Pages
>>
endobj
In this example, the object number is 1 and the generation number is 0, meaning it is the first version of the object. The content for object 1 is between the initial 1 0 obj and trailing endobj lines. In this case, the content is the dictionary <</Kids [2 0 R] /Count 1 /Type /Pages>>.
The cross-reference table lists the byte offset of each object in the file body. This allows random access to objects, meaning they don't have to be read in order. Objects that are not used are never read, making the process efficient. Operations like counting the number of pages in a PDF document are fast, even in large files.
Each object has an object number and a generation number. Generation numbers are used when a cross-reference table entry is reused. For simplicity, we will assume generation numbers to be always zero and ignore them. The cross-reference table consists of a header line that indicates the number of entries, a free entry line for object 0, and a line for each of the objects in the file body. For example:
0 6 % Six entries in table, starting at 0
0000000000 65535 f % Free entry for object 0
0000000015 00000 n % Object 1 is at byte offset 15
0000000074 00000 n % Object 2 is at byte offset 74
0000000192 00000 n % etc...
0000000291 00000 n
0000000409 00000 n % Object 5 is at byte offset 409
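To make the entry layout concrete, here is a small sketch that parses a single cross-reference entry line. The xref_entry_t type and parse_xref_entry function are hypothetical names used for illustration only; PDFio reads the table internally and does not expose such a function:

```c
#include <stdbool.h>
#include <stdio.h>

// Hypothetical type for illustration; not part of PDFio.
typedef struct
{
  unsigned long offset;                 // Byte offset of the object in the file
  unsigned      generation;             // Generation number
  bool          in_use;                 // true for "n" entries, false for "f"
} xref_entry_t;

// Parse one entry such as "0000000015 00000 n".
bool
parse_xref_entry(const char *line, xref_entry_t *entry)
{
  char type;                            // Entry type character

  // 10-digit offset, 5-digit generation, then 'n' (in use) or 'f' (free)
  if (sscanf(line, "%10lu %5u %c", &entry->offset, &entry->generation, &type) != 3)
    return (false);

  if (type != 'n' && type != 'f')
    return (false);

  entry->in_use = type == 'n';

  return (true);
}
```

With the offset in hand, a reader can seek directly to an object instead of scanning the file body from the start.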
The first line of the trailer is just the trailer keyword. This is followed by the trailer dictionary, which contains at least the /Size entry specifying the number of entries in the cross-reference table and the /Root entry, which references the object for the document catalog, the root element of the graph of objects in the body.
There follows a line with just the startxref keyword, a line with a single number specifying the byte offset of the start of the cross-reference table within the file, and then the line %%EOF, which signals the end of the PDF file.
trailer % Trailer keyword
<< % The trailer dictionary
/Root 5 0 R
/Size 6
>>
startxref % startxref keyword
459 % Byte offset of cross-reference table
%%EOF % End-of-file marker
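A reader typically starts at the end of the file and works backwards. The sketch below assumes the last bytes of the file have already been read into a NUL-terminated text buffer; the function name find_startxref is hypothetical and not part of PDFio:

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper (not part of PDFio): given the tail of a PDF file as a
// NUL-terminated string, find the last "startxref" keyword and parse the
// byte offset of the cross-reference table that follows it.
bool
find_startxref(const char *tail, unsigned long *offset)
{
  const char *found = NULL;             // Last "startxref" keyword seen
  const char *ptr   = tail;             // Current search position

  while ((ptr = strstr(ptr, "startxref")) != NULL)
  {
    found = ptr;
    ptr   += 9;                         // Skip past the 9-character keyword
  }

  // The offset is the number on the line after the keyword
  return (found != NULL && sscanf(found + 9, "%lu", offset) == 1);
}
```

Real files may contain binary data near the end, so production code would search a byte buffer with an explicit length instead of relying on string functions.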
PDFio exposes several types:
pdfio_file_t: A PDF file (for reading or writing)
pdfio_array_t: An array of values
pdfio_dict_t: A dictionary of key/value pairs in a PDF file, object, etc.
pdfio_obj_t: An object in a PDF file
pdfio_stream_t: An object stream
You open an existing PDF file using the pdfioFileOpen function:
pdfio_file_t *pdf =
pdfioFileOpen("myinputfile.pdf", password_cb, password_data, error_cb,
error_data);
where the five arguments to the function are the filename ("myinputfile.pdf"), an optional password callback function (password_cb) and data pointer value (password_data), and an optional error callback function (error_cb) and data pointer value (error_data). The password callback is called for encrypted PDF files that are not using the default password, for example:
const char *
password_cb(void *data, const char *filename)
{
(void)data; // This callback doesn't use the data pointer
(void)filename; // This callback doesn't use the filename
// Return a password string for the file...
return ("Password42");
}
The error callback is called for both errors and warnings and accepts the pdfio_file_t pointer, a message string, and the callback pointer value, for example:
bool
error_cb(pdfio_file_t *pdf, const char *message, void *data)
{
(void)data; // This callback does not use the data pointer
fprintf(stderr, "%s: %s\n", pdfioFileGetName(pdf), message);
// Return false to treat warnings as errors
return (false);
}
The default error callback (NULL) does the equivalent of the above.
Each PDF file contains one or more pages. The pdfioFileGetNumPages function returns the number of pages in the file while the pdfioFileGetPage function gets the specified page in the PDF file:
pdfio_file_t *pdf; // PDF file
size_t i; // Looping var
size_t count; // Number of pages
pdfio_obj_t *page; // Current page
// Iterate the pages in the PDF file
for (i = 0, count = pdfioFileGetNumPages(pdf); i < count; i ++)
{
page = pdfioFileGetPage(pdf, i);
// do something with page
}
Each page is represented by a "page tree" object (what pdfioFileGetPage returns) that specifies information about the page and one or more "content" objects that contain the images, fonts, text, and graphics that appear on the page. Use the pdfioPageGetNumStreams and pdfioPageOpenStream functions to access the content streams for each page, and pdfioObjGetDict to get the associated page object dictionary. For example, if you want to display the media and crop boxes for a given page:
pdfio_file_t *pdf; // PDF file
size_t i; // Looping var
size_t count; // Number of pages
pdfio_obj_t *page; // Current page
pdfio_dict_t *dict; // Current page dictionary
pdfio_array_t *media_box; // MediaBox array
double media_values[4]; // MediaBox values
pdfio_array_t *crop_box; // CropBox array
double crop_values[4]; // CropBox values
// Iterate the pages in the PDF file
for (i = 0, count = pdfioFileGetNumPages(pdf); i < count; i ++)
{
page = pdfioFileGetPage(pdf, i);
dict = pdfioObjGetDict(page);
media_box = pdfioDictGetArray(dict, "MediaBox");
media_values[0] = pdfioArrayGetNumber(media_box, 0);
media_values[1] = pdfioArrayGetNumber(media_box, 1);
media_values[2] = pdfioArrayGetNumber(media_box, 2);
media_values[3] = pdfioArrayGetNumber(media_box, 3);
crop_box = pdfioDictGetArray(dict, "CropBox");
crop_values[0] = pdfioArrayGetNumber(crop_box, 0);
crop_values[1] = pdfioArrayGetNumber(crop_box, 1);
crop_values[2] = pdfioArrayGetNumber(crop_box, 2);
crop_values[3] = pdfioArrayGetNumber(crop_box, 3);
printf("Page %u: MediaBox=[%g %g %g %g], CropBox=[%g %g %g %g]\n",
(unsigned)(i + 1),
media_values[0], media_values[1], media_values[2], media_values[3],
crop_values[0], crop_values[1], crop_values[2], crop_values[3]);
}
Page object dictionaries have several (mostly optional) key/value pairs, including:
"Annots": An array of annotation dictionaries for the page; use pdfioDictGetArray to get the array
"CropBox": The crop box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use pdfioDictGetArray to get a pointer to the array of numbers
"Dur": The number of seconds the page should be displayed; use pdfioDictGetNumber to get the page duration value
"Group": The dictionary of transparency group values for the page; use pdfioDictGetDict to get a pointer to the transparency group dictionary
"LastModified": The date and time when this page was last modified; use pdfioDictGetDate to get the Unix time_t value
"Parent": The parent page tree node object for this page; use pdfioDictGetObj to get a pointer to the object
"MediaBox": The media box as an array of four numbers for the left, bottom, right, and top coordinates of the target media; use pdfioDictGetArray to get a pointer to the array of numbers
"Resources": The dictionary of resources for the page; use pdfioDictGetDict to get a pointer to the resources dictionary
"Rotate": A number indicating the number of degrees of clockwise rotation to apply to the page when viewing; use pdfioDictGetNumber to get the rotation angle
"Thumb": A thumbnail image object for the page; use pdfioDictGetObj to get a pointer to the thumbnail image object
"Trans": The page transition dictionary; use pdfioDictGetDict to get a pointer to the dictionary
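As an illustration of how the "MediaBox" and "Rotate" values combine, the following sketch computes the size of a page as displayed. The displayed_size function is a hypothetical helper, not a PDFio function; it assumes the four box values and the angle have already been read with pdfioArrayGetNumber and pdfioDictGetNumber:

```c
// Hypothetical helper (not part of PDFio): compute the displayed width and
// height of a page in points from its box values (left, bottom, right, top)
// and its rotation angle in degrees.
void
displayed_size(const double box[4], int rotate, double *width, double *height)
{
  double w = box[2] - box[0];           // right - left
  double h = box[3] - box[1];           // top - bottom

  // Normalize the angle to 0, 90, 180, or 270 (it is a multiple of 90)
  rotate = ((rotate % 360) + 360) % 360;

  if (rotate == 90 || rotate == 270)
  {
    // A quarter turn swaps the width and height
    *width  = h;
    *height = w;
  }
  else
  {
    *width  = w;
    *height = h;
  }
}
```

For a US Letter page rotated by 90 degrees this yields a landscape size of 792 by 612 points.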
The pdfioFileClose function closes a PDF file and frees all memory that was used for it:
pdfioFileClose(pdf);
You create a new PDF file using the pdfioFileCreate function:
pdfio_rect_t media_box = { 0.0, 0.0, 612.0, 792.0 }; // US Letter
pdfio_rect_t crop_box = { 36.0, 36.0, 576.0, 756.0 }; // w/0.5" margins
pdfio_file_t *pdf = pdfioFileCreate("myoutputfile.pdf", "2.0", &media_box, &crop_box,
error_cb, error_data);
where the six arguments to the function are the filename ("myoutputfile.pdf"), PDF version ("2.0"), media box (media_box), crop box (crop_box), an optional error callback function (error_cb), and an optional pointer value for the error callback function (error_data). The units for the media and crop boxes are points (1/72nd of an inch).
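Since all coordinates are in points, it can help to convert from physical units when computing boxes. The following is a small sketch of that arithmetic; box_t is a hypothetical stand-in for a rectangle type and the function names are not part of PDFio:

```c
// Hypothetical stand-in for a rectangle in points; use PDFio's own
// pdfio_rect_t in real code.
typedef struct
{
  double x1, y1;                        // Lower-left corner
  double x2, y2;                        // Upper-right corner
} box_t;

// There are 72 points per inch and 25.4 millimeters per inch.
double inches_to_points(double inches) { return (inches * 72.0); }
double mm_to_points(double mm)         { return (mm * 72.0 / 25.4); }

// Shrink a media box by the same margin on all four sides.
box_t
inset_box(box_t media, double margin)
{
  box_t crop = { media.x1 + margin, media.y1 + margin,
                 media.x2 - margin, media.y2 - margin };

  return (crop);
}
```

For example, an 8.5x11in page is 612x792 points, and a 0.5in margin yields the crop box values 36, 36, 576, and 756 used above.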
Alternatively, you can stream a PDF file using the pdfioFileCreateOutput function:
pdfio_rect_t media_box = { 0.0, 0.0, 612.0, 792.0 }; // US Letter
pdfio_rect_t crop_box = { 36.0, 36.0, 576.0, 756.0 }; // w/0.5" margins
pdfio_file_t *pdf = pdfioFileCreateOutput(output_cb, output_ctx, "2.0", &media_box,
&crop_box, error_cb, error_data);
Once the file is created, use the pdfioFileCreateObj, pdfioFileCreatePage, and pdfioPageCopy functions to create objects and pages in the file.
Finally, the pdfioFileClose function writes the PDF cross-reference and "trailer" information, closes the file, and frees all memory that was used for it.
PDF objects are identified using two numbers: the object number (1 to N) and the object generation (0 to 65535) that specifies a particular version of an object. An object's numbers are returned by the pdfioObjGetNumber and pdfioObjGetGeneration functions. You can find a numbered object using the pdfioFileFindObj function.
Objects contain values (typically dictionaries) and usually an associated data stream containing images, fonts, ICC profiles, and page content. PDFio provides several accessor functions to get the value(s) associated with an object:
pdfioObjGetArray returns an object's array value, if any
pdfioObjGetDict returns an object's dictionary value, if any
pdfioObjGetLength returns the length of the data stream, if any
pdfioObjGetSubtype returns the sub-type name of the object, for example "Image" for an image object
pdfioObjGetType returns the type name of the object, for example "XObject" for an image object
Some PDF objects have an associated data stream, such as for pages, images, ICC color profiles, and fonts. You access the stream for an existing object using the pdfioObjOpenStream function:
pdfio_file_t *pdf = pdfioFileOpen(...);
pdfio_obj_t *obj = pdfioFileFindObj(pdf, number);
pdfio_stream_t *st = pdfioObjOpenStream(obj, true);
The first argument is the object pointer. The second argument is a boolean value that specifies whether you want to decode (typically decompress) the stream data or return it as-is.
When reading a page stream you'll use the pdfioPageOpenStream function instead:
pdfio_file_t *pdf = pdfioFileOpen(...);
pdfio_obj_t *obj = pdfioFileGetPage(pdf, number);
pdfio_stream_t *st = pdfioPageOpenStream(obj, 0, true);
Once you have the stream open, you can use one of several functions to read from it:
pdfioStreamConsume reads and discards a number of bytes in the stream
pdfioStreamGetToken reads a PDF token from the stream
pdfioStreamPeek peeks at the next stream data without advancing or "consuming" it
pdfioStreamRead reads a buffer of data
When you are done reading from the stream, call the pdfioStreamClose function:
pdfioStreamClose(st);
To create a stream for a new object, call the pdfioObjCreateStream function:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_obj_t *obj = pdfioFileCreateObj(pdf, ...);
pdfio_stream_t *st = pdfioObjCreateStream(obj, PDFIO_FILTER_FLATE);
The first argument is the newly created object. The second argument is either PDFIO_FILTER_NONE to specify that any encoding is done by your program or PDFIO_FILTER_FLATE to specify that PDFio should Flate compress the stream.
To create a page content stream, call the pdfioFileCreatePage function:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_dict_t *dict = pdfioDictCreate(pdf);
... set page dictionary keys and values ...
pdfio_stream_t *st = pdfioFileCreatePage(pdf, dict);
Once you have created the stream, use any of the following functions to write to the stream:
pdfioStreamPrintf writes a formatted string to the stream
pdfioStreamPutChar writes a single character to the stream
pdfioStreamPuts writes a C string to the stream
pdfioStreamWrite writes a buffer of data to the stream
The PDF content helper functions provide additional functions for writing specific PDF page stream commands.
When you are done writing the stream, call pdfioStreamClose to close both the stream and the object.
PDFio includes many helper functions for embedding or writing specific kinds of content to a PDF file. These functions can be roughly grouped into five categories: color spaces, fonts, images, page dictionaries, and page streams.
PDF color spaces are specified using well-known names like "DeviceCMYK", "DeviceGray", and "DeviceRGB" or using arrays that define so-called calibrated color spaces. PDFio provides several functions for embedding ICC profiles and creating color space arrays:
pdfioArrayCreateColorFromICCObj creates a color array for an ICC color profile object
pdfioArrayCreateColorFromMatrix creates a color array using a CIE XYZ color transform matrix, a gamma value, and a CIE XYZ white point
pdfioArrayCreateColorFromPalette creates an indexed color array from an array of sRGB values
pdfioArrayCreateColorFromPrimaries creates a color array using CIE XYZ primaries and a gamma value
pdfioArrayCreateColorFromStandard creates a color array for a standard color space
You can embed an ICC color profile using the pdfioFileCreateICCObjFromFile function:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_obj_t *icc = pdfioFileCreateICCObjFromFile(pdf, "filename.icc");
where the first argument is the PDF file and the second argument is the filename of the ICC color profile.
PDFio also includes predefined constants for creating a few standard color spaces:
pdfio_file_t *pdf = pdfioFileCreate(...);
// Create an AdobeRGB color array
pdfio_array_t *adobe_rgb =
pdfioArrayCreateColorFromStandard(pdf, 3, PDFIO_CS_ADOBE);
// Create a Display P3 color array
pdfio_array_t *display_p3 =
pdfioArrayCreateColorFromStandard(pdf, 3, PDFIO_CS_P3_D65);
// Create an sRGB color array
pdfio_array_t *srgb =
pdfioArrayCreateColorFromStandard(pdf, 3, PDFIO_CS_SRGB);
PDF supports many kinds of fonts, including PostScript Type1, PDF Type3, TrueType/OpenType, and CID. PDFio provides two functions for creating font objects. The first is pdfioFileCreateFontObjFromBase, which creates a font object for one of the base PDF fonts:
"Courier"
"Courier-Bold"
"Courier-BoldItalic"
"Courier-Italic"
"Helvetica"
"Helvetica-Bold"
"Helvetica-BoldOblique"
"Helvetica-Oblique"
"Symbol"
"Times-Bold"
"Times-BoldItalic"
"Times-Italic"
"Times-Roman"
"ZapfDingbats"
Except for Symbol and ZapfDingbats (which use a custom 8-bit character set), PDFio always uses the Windows CP1252 subset of Unicode for these fonts.
The second function is pdfioFileCreateFontObjFromFile, which creates a font object from a TrueType/OpenType font file, for example:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_obj_t *opensans =
pdfioFileCreateFontObjFromFile(pdf, "OpenSans-Regular.ttf", false);
will embed an OpenSans Regular TrueType font using the Windows CP1252 subset of Unicode. Pass true for the third argument to embed it as a Unicode CID font instead, for example:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_obj_t *notosans =
pdfioFileCreateFontObjFromFile(pdf, "NotoSansJP-Regular.otf", true);
will embed the NotoSansJP Regular OpenType font with full support for Unicode.
Note: Not all fonts support Unicode, and most do not contain a full complement of Unicode characters. pdfioFileCreateFontObjFromFile does not perform any character subsetting, so the entire font file is embedded in the PDF file.
PDF supports images with many different color spaces and bit depths with optional transparency. PDFio provides two helper functions for creating image objects that can be referenced in page streams. The first function is pdfioFileCreateImageObjFromData, which creates an image object from data in memory, for example:
pdfio_file_t *pdf = pdfioFileCreate(...);
unsigned char data[1024 * 1024 * 4]; // 1024x1024 RGBA image data
pdfio_obj_t *img =
pdfioFileCreateImageObjFromData(pdf, data, /*width*/1024, /*height*/1024,
/*num_colors*/3, /*color_data*/NULL,
/*alpha*/true, /*interpolate*/false);
will create an object for a 1024x1024 RGBA image in memory, using the default color space for 3 colors ("DeviceRGB"). We can use one of the color space functions to use a specific color space for this image, for example:
pdfio_file_t *pdf = pdfioFileCreate(...);
// Create an AdobeRGB color array
pdfio_array_t *adobe_rgb =
pdfioArrayCreateColorFromMatrix(pdf, 3, pdfioAdobeRGBGamma,
pdfioAdobeRGBMatrix, pdfioAdobeRGBWhitePoint);
// Create a 1024x1024 RGBA image using AdobeRGB
unsigned char data[1024 * 1024 * 4]; // 1024x1024 RGBA image data
pdfio_obj_t *img =
pdfioFileCreateImageObjFromData(pdf, data, /*width*/1024, /*height*/1024,
/*num_colors*/3, /*color_data*/adobe_rgb,
/*alpha*/true, /*interpolate*/false);
The "interpolate" argument specifies whether the colors in the image should be smoothed/interpolated when scaling. This is most useful for photographs but should be false for screenshot and barcode images.
If you have a JPEG or PNG file, use the pdfioFileCreateImageObjFromFile function to copy the image into a PDF image object, for example:
pdfio_file_t *pdf = pdfioFileCreate(...);
pdfio_obj_t *img =
pdfioFileCreateImageObjFromFile(pdf, "myphoto.jpg", /*interpolate*/true);
Note: Currently pdfioFileCreateImageObjFromFile does not support 12-bit JPEG files or PNG files with an alpha channel.
PDF pages each have an associated dictionary to specify the images, fonts, and color spaces used by the page. PDFio provides functions to add these resources to the dictionary:
pdfioPageDictAddColorSpace adds a named color space to the page dictionary
pdfioPageDictAddFont adds a named font to the page dictionary
pdfioPageDictAddImage adds a named image to the page dictionary
PDF page streams contain textual commands for drawing on the page. PDFio provides many functions for writing these commands with the correct format and escaping, as needed:
pdfioContentClip clips future drawing to the current path
pdfioContentDrawImage draws an image object
pdfioContentFill fills the current path
pdfioContentFillAndStroke fills and strokes the current path
pdfioContentMatrixConcat concatenates a matrix with the current transform matrix
pdfioContentMatrixRotate concatenates a rotation matrix with the current transform matrix
pdfioContentMatrixScale concatenates a scaling matrix with the current transform matrix
pdfioContentMatrixTranslate concatenates a translation matrix with the current transform matrix
pdfioContentPathClose closes the current path
pdfioContentPathCurve appends a Bezier curve to the current path
pdfioContentPathCurve13 appends a Bezier curve with 2 control points to the current path
pdfioContentPathCurve23 appends a Bezier curve with 2 control points to the current path
pdfioContentPathLineTo appends a line to the current path
pdfioContentPathMoveTo moves the current point in the current path
pdfioContentPathRect appends a rectangle to the current path
pdfioContentRestore restores a previous graphics state
pdfioContentSave saves the current graphics state
pdfioContentSetDashPattern sets the line dash pattern
pdfioContentSetFillColorDeviceCMYK sets the current fill color using a device CMYK color
pdfioContentSetFillColorDeviceGray sets the current fill color using a device gray color
pdfioContentSetFillColorDeviceRGB sets the current fill color using a device RGB color
pdfioContentSetFillColorGray sets the current fill color using a calibrated gray color
pdfioContentSetFillColorRGB sets the current fill color using a calibrated RGB color
pdfioContentSetFillColorSpace sets the current fill color space
pdfioContentSetFlatness sets the flatness for curves
pdfioContentSetLineCap sets how the ends of lines are stroked
pdfioContentSetLineJoin sets how connections between lines are stroked
pdfioContentSetLineWidth sets the width of stroked lines
pdfioContentSetMiterLimit sets the miter limit for stroked lines
pdfioContentSetStrokeColorDeviceCMYK sets the current stroke color using a device CMYK color
pdfioContentSetStrokeColorDeviceGray sets the current stroke color using a device gray color
pdfioContentSetStrokeColorDeviceRGB sets the current stroke color using a device RGB color
pdfioContentSetStrokeColorGray sets the current stroke color using a calibrated gray color
pdfioContentSetStrokeColorRGB sets the current stroke color using a calibrated RGB color
pdfioContentSetStrokeColorSpace sets the current stroke color space
pdfioContentSetTextCharacterSpacing sets the spacing between characters for text
pdfioContentSetTextFont sets the font and size for text
pdfioContentSetTextLeading sets the line height for text
pdfioContentSetTextMatrix concatenates a matrix with the current text matrix
pdfioContentSetTextRenderingMode sets the text rendering mode
pdfioContentSetTextRise adjusts the baseline for text
pdfioContentSetTextWordSpacing sets the spacing between words for text
pdfioContentSetTextXScaling sets the horizontal scaling for text
pdfioContentStroke strokes the current path
pdfioContentTextBegin begins a block of text
pdfioContentTextEnd ends a block of text
pdfioContentTextMoveLine moves to the next line with an offset in a text block
pdfioContentTextMoveTo moves within the current line in a text block
pdfioContentTextNewLine moves to the beginning of the next line in a text block
pdfioContentTextNewLineShow moves to the beginning of the next line in a text block and shows literal text with optional word and character spacing
pdfioContentTextNewLineShowf moves to the beginning of the next line in a text block and shows formatted text with optional word and character spacing
pdfioContentTextShow draws a literal string in a text block
pdfioContentTextShowf draws a formatted string in a text block
pdfioContentTextShowJustified draws an array of literal strings with offsets between them
The pdfioinfo.c example program opens a PDF file and prints the title, author, creation date, and number of pages:
#include <pdfio.h>
#include <time.h>
int // O - Exit status
main(int argc, // I - Number of command-line arguments
char *argv[]) // Command-line arguments
{
const char *filename; // PDF filename
pdfio_file_t *pdf; // PDF file
time_t creation_date; // Creation date
struct tm *creation_tm; // Creation date/time information
char creation_text[256]; // Creation date/time as a string
// Get the filename from the command-line...
if (argc != 2)
{
fputs("Usage: ./pdfioinfo FILENAME.pdf\n", stderr);
return (1);
}
filename = argv[1];
// Open the PDF file with the default callbacks...
pdf = pdfioFileOpen(filename, /*password_cb*/NULL, /*password_cbdata*/NULL,
/*error_cb*/NULL, /*error_cbdata*/NULL);
if (pdf == NULL)
return (1);
// Get the creation date and convert to a string...
creation_date = pdfioFileGetCreationDate(pdf);
creation_tm = localtime(&creation_date);
strftime(creation_text, sizeof(creation_text), "%c", creation_tm);
// Print file information to stdout...
printf("%s:\n", filename);
printf(" Title: %s\n", pdfioFileGetTitle(pdf));
printf(" Author: %s\n", pdfioFileGetAuthor(pdf));
printf(" Created On: %s\n", creation_text);
printf(" Number Pages: %u\n", (unsigned)pdfioFileGetNumPages(pdf));
// Close the PDF file...
pdfioFileClose(pdf);
return (0);
}
The pdf2text.c example code extracts non-Unicode text from a PDF file by scanning each page for strings and text drawing commands. Since it doesn't look at the font encoding or support Unicode text, it is really only useful to extract plain ASCII text from a PDF file. And since it writes text in the order it appears in the page stream, it may not come out in the same order as it appears on the page.
The pdfioStreamGetToken function is used to read individual tokens from the page streams. Tokens starting with the open parenthesis are text strings, while PDF operators are left as-is. We use some simple logic to make sure that we include spaces between text strings and add newlines for the text operators that start a new line in a text block:
pdfio_stream_t *st; // Page stream
bool first = true; // First string on line?
char buffer[1024]; // Token buffer
// Read PDF tokens from the page stream...
while (pdfioStreamGetToken(st, buffer, sizeof(buffer)))
{
if (buffer[0] == '(')
{
// Text string using an 8-bit encoding
if (first)
first = false;
else if (buffer[1] != ' ')
putchar(' ');
fputs(buffer + 1, stdout);
}
else if (!strcmp(buffer, "Td") || !strcmp(buffer, "TD") || !strcmp(buffer, "T*") ||
!strcmp(buffer, "\'") || !strcmp(buffer, "\""))
{
// Text operators that advance to the next line in the block
putchar('\n');
first = true;
}
}
if (!first)
putchar('\n');
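The spacing rules can be tried without a PDF file. The sketch below applies the same logic to an in-memory array of tokens and writes the result to a buffer. Both function names are hypothetical, and the tokens are assumed to have the form shown above, where a string token starts with an open parenthesis and has no closing parenthesis:

```c
#include <stdbool.h>
#include <string.h>

// Hypothetical helper: true if the token is a text operator that starts a
// new line in a text block.
bool
is_newline_operator(const char *token)
{
  static const char * const ops[] = { "Td", "TD", "T*", "'", "\"" };
  size_t i;                             // Looping var

  for (i = 0; i < sizeof(ops) / sizeof(ops[0]); i ++)
  {
    if (!strcmp(token, ops[i]))
      return (true);
  }

  return (false);
}

// Hypothetical helper: build plain text from tokens using the same rules as
// the loop above.  Returns the length of the (possibly truncated) output.
size_t
tokens_to_text(const char *tokens[], size_t count, char *out, size_t outsize)
{
  size_t len   = 0;                     // Current output length
  bool   first = true;                  // First string on line?
  size_t i;                             // Looping var

  if (outsize == 0)
    return (0);

  for (i = 0; i < count; i ++)
  {
    const char *t = tokens[i];          // Current token

    if (t[0] == '(')
    {
      // Separate strings on the same line with a single space
      if (first)
        first = false;
      else if (t[1] != ' ' && len + 1 < outsize)
        out[len ++] = ' ';

      // Copy the string without the leading parenthesis
      for (t ++; *t && len + 1 < outsize; t ++)
        out[len ++] = *t;
    }
    else if (is_newline_operator(t))
    {
      if (len + 1 < outsize)
        out[len ++] = '\n';
      first = true;
    }
  }

  // Finish the last line if it has any text
  if (!first && len + 1 < outsize)
    out[len ++] = '\n';

  out[len] = '\0';

  return (len);
}
```

Operators such as Tj and Tf are neither strings nor line operators, so they are skipped.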
The image2pdf.c example code creates a PDF file containing a JPEG or PNG image file and optional caption on a single page. The create_pdf_image_file function creates the PDF file, embeds a base font and the named JPEG or PNG image file, and then creates a page with the image centered on the page with any text centered below:
#include <pdfio.h>
#include <pdfio-content.h>
#include <string.h>
bool // O - True on success, false on failure
create_pdf_image_file(
const char *pdfname, // I - PDF filename
const char *imagename, // I - Image filename
const char *caption) // I - Caption text
{
pdfio_file_t *pdf; // PDF file
pdfio_obj_t *font; // Caption font
pdfio_obj_t *image; // Image
pdfio_dict_t *dict; // Page dictionary
pdfio_stream_t *page; // Page stream
double width, height; // Width and height of image
double swidth, sheight; // Scaled width and height on page
double tx, ty; // Position on page
// Create the PDF file...
pdf = pdfioFileCreate(pdfname, /*version*/NULL, /*media_box*/NULL, /*crop_box*/NULL,
/*error_cb*/NULL, /*error_cbdata*/NULL);
if (!pdf)
return (false);
// Create a Courier base font for the caption
font = pdfioFileCreateFontObjFromBase(pdf, "Courier");
if (!font)
{
pdfioFileClose(pdf);
return (false);
}
// Create an image object from the JPEG/PNG image file...
image = pdfioFileCreateImageObjFromFile(pdf, imagename, true);
if (!image)
{
pdfioFileClose(pdf);
return (false);
}
// Create a page dictionary with the font and image...
dict = pdfioDictCreate(pdf);
pdfioPageDictAddFont(dict, "F1", font);
pdfioPageDictAddImage(dict, "IM1", image);
// Create the page and its content stream...
page = pdfioFileCreatePage(pdf, dict);
// Position and scale the image on the page...
width = pdfioImageGetWidth(image);
height = pdfioImageGetHeight(image);
// Default media_box is "universal" 595.28x792 points (8.27x11in or 210x279mm).
// Use margins of 36 points (0.5in or 12.7mm) with another 36 points for the
// caption underneath...
swidth = 595.28 - 72.0;
sheight = swidth * height / width;
if (sheight > (792.0 - 36.0 - 72.0))
{
sheight = 792.0 - 36.0 - 72.0;
swidth = sheight * width / height;
}
tx = 0.5 * (595.28 - swidth);
ty = 0.5 * (792 - 36 - sheight);
pdfioContentDrawImage(page, "IM1", tx, ty + 36.0, swidth, sheight);
// Draw the caption in black...
pdfioContentSetFillColorDeviceGray(page, 0.0);
// Compute the starting point for the text - Courier is monospaced with a
// nominal width of 0.6 times the text height...
tx = 0.5 * (595.28 - 18.0 * 0.6 * strlen(caption));
// Position and draw the caption underneath...
pdfioContentTextBegin(page);
pdfioContentSetTextFont(page, "F1", 18.0);
pdfioContentTextMoveTo(page, tx, ty);
pdfioContentTextShow(page, /*unicode*/false, caption);
pdfioContentTextEnd(page);
// Close the page stream and the PDF file...
pdfioStreamClose(page);
pdfioFileClose(pdf);
return (true);
}
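The scaling step in the middle of this function is a general "fit inside a box" calculation. Pulled out on its own it might look like the following sketch, where scale_to_fit is a hypothetical helper name that is not part of PDFio or the example program:

```c
// Hypothetical helper: scale a width x height image to fit inside a
// max_width x max_height area while preserving its aspect ratio.
void
scale_to_fit(double width, double height, double max_width, double max_height,
             double *swidth, double *sheight)
{
  // First try to fill the available width...
  *swidth  = max_width;
  *sheight = max_width * height / width;

  // ...and fill the available height instead if the result is too tall
  if (*sheight > max_height)
  {
    *sheight = max_height;
    *swidth  = max_height * width / height;
  }
}
```

In the example above the available area is 595.28 - 72.0 points wide and 792.0 - 36.0 - 72.0 points tall.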
One-dimensional barcodes are often rendered using special fonts that map ASCII characters to sequences of bars that can be read. The examples directory contains such a font (code128.ttf) to create "Code 128" barcodes, with an accompanying bit of example code in code128.c.
The first thing you need to do is prepare the barcode string to use with the font. Each barcode begins with a start pattern followed by the characters or digits you want to encode, a weighted sum digit, and a stop pattern. The make_code128
function creates this string:
static char * // O - Output string
make_code128(char *dst, // I - Destination buffer
const char *src, // I - Source string
size_t dstsize) // I - Size of destination buffer
{
char *dstptr, // Pointer into destination buffer
*dstend; // End of destination buffer
int sum; // Weighted sum
static const char *code128_chars = // Code 128 characters
" !\"#$%&'()*+,-./0123456789:;<=>?"
"@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\\]^_"
"`abcdefghijklmnopqrstuvwxyz{|}~\303"
"\304\305\306\307\310\311\312";
static const char code128_start_code_b = '\314';
// Start code B
static const char code128_stop = '\316';
// Stop pattern
// Start a Code B barcode...
dstptr = dst;
dstend = dst + dstsize - 3;
*dstptr++ = code128_start_code_b;
sum = code128_start_code_b - 100;
while (*src && dstptr < dstend)
{
if (*src >= ' ' && *src < 0x7f)
{
sum += (dstptr - dst) * (*src - ' ');
*dstptr++ = *src;
}
src ++;
}
// Add the weighted sum modulo 103
*dstptr++ = code128_chars[sum % 103];
// Add the stop pattern and return...
*dstptr++ = code128_stop;
*dstptr = '\0';
return (dst);
}
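To make the weighted sum concrete, here is a minimal sketch that computes only the check value for a Code Set B string. The code128b_check_value name is hypothetical, and the sketch uses the standard Start Code B value of 104 directly instead of deriving it from the font's character code, which avoids depending on whether char is signed.

```c
#include <assert.h>
#include <stddef.h>

// Compute the Code 128 check value (0 to 102) for a Code Set B string.
// Characters outside the printable ASCII range are skipped, as in the
// example above.
static int
code128b_check_value(const char *s)
{
  int sum    = 104;  // Value of Start Code B
  int weight = 1;    // Position weight of the next data character

  assert(s != NULL);

  for (; *s; s ++)
  {
    if (*s >= ' ' && *s < 0x7f)
    {
      // Each character value (ASCII code - 32) is multiplied by its position
      sum += weight * (*s - ' ');
      weight ++;
    }
  }

  return (sum % 103);
}
```

For the string "AB" the sum is 104 + 1 x 33 + 2 x 34 = 205, and 205 modulo 103 is 102.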
The main function does the rest of the work. The barcode font is imported using the pdfioFileCreateFontObjFromFile function. We pass false for the "unicode" argument since we just want the (default) ASCII encoding:
barcode_font = pdfioFileCreateFontObjFromFile(pdf, "code128.ttf", /*unicode*/false);
Since barcodes usually have the number or text represented by the barcode printed underneath, we also need a regular text font, for which we can choose one of the standard 14 PostScript base fonts using the pdfioFileCreateFontObjFromBase function:
text_font = pdfioFileCreateFontObjFromBase(pdf, "Helvetica");
Once we have these fonts we can measure the barcode and regular text labels using the pdfioContentTextMeasure function to determine how large the PDF page needs to be to hold the barcode and text:
// Compute sizes of the text...
const char *barcode = argv[1];
char barcode_temp[256];
if (!(barcode[0] & 0x80))
barcode = make_code128(barcode_temp, barcode, sizeof(barcode_temp));
double barcode_height = 36.0;
double barcode_width =
pdfioContentTextMeasure(barcode_font, barcode, barcode_height);
const char *text = argv[2];
double text_height = 0.0;
double text_width = 0.0;
if (text && text_font)
{
text_height = 9.0;
text_width = pdfioContentTextMeasure(text_font, text, text_height);
}
// Compute the size of the PDF page...
pdfio_rect_t media_box;
media_box.x1 = 0.0;
media_box.y1 = 0.0;
media_box.x2 = (barcode_width > text_width ? barcode_width : text_width) + 18.0;
media_box.y2 = barcode_height + text_height + 18.0;
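As a worked example of the page size calculation, the following sketch applies the same 9 point margins on each side (18 points in total) to measured widths and heights. The box_t type and barcode_page_box name are hypothetical stand-ins so that the block compiles without PDFio; in the real example the result is stored in a pdfio_rect_t.

```c
#include <assert.h>

// Hypothetical stand-in for pdfio_rect_t
typedef struct box_s
{
  double x1, y1;  // Lower-left corner
  double x2, y2;  // Upper-right corner
} box_t;

// Compute a page box that holds a barcode with optional text underneath,
// leaving 9 points of margin on every side.
static box_t
barcode_page_box(double barcode_width, double barcode_height,
                 double text_width, double text_height)
{
  box_t box;

  assert(barcode_width >= 0.0 && barcode_height >= 0.0);
  assert(text_width >= 0.0 && text_height >= 0.0);

  box.x1 = 0.0;
  box.y1 = 0.0;
  box.x2 = (barcode_width > text_width ? barcode_width : text_width) + 18.0;
  box.y2 = barcode_height + text_height + 18.0;

  return (box);
}
```

A 120 point wide, 36 point tall barcode with a 40 point wide, 9 point tall label yields a 138x63 point page.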
Finally, we just need to create a page of the specified size that references the two fonts:
// Start a page for the barcode...
page_dict = pdfioDictCreate(pdf);
pdfioDictSetRect(page_dict, "MediaBox", &media_box);
pdfioDictSetRect(page_dict, "CropBox", &media_box);
pdfioPageDictAddFont(page_dict, "B128", barcode_font);
if (text_font)
pdfioPageDictAddFont(page_dict, "TEXT", text_font);
page_st = pdfioFileCreatePage(pdf, page_dict);
With the barcode font called "B128" and the text font called "TEXT", we can use them to draw two strings:
// Draw the page...
pdfioContentSetFillColorGray(page_st, 0.0);
pdfioContentSetTextFont(page_st, "B128", barcode_height);
pdfioContentTextBegin(page_st);
pdfioContentTextMoveTo(page_st, 0.5 * (media_box.x2 - barcode_width),
9.0 + text_height);
pdfioContentTextShow(page_st, /*unicode*/false, barcode);
pdfioContentTextEnd(page_st);
if (text && text_font)
{
pdfioContentSetTextFont(page_st, "TEXT", text_height);
pdfioContentTextBegin(page_st);
pdfioContentTextMoveTo(page_st, 0.5 * (media_box.x2 - text_width), 9.0);
pdfioContentTextShow(page_st, /*unicode*/false, text);
pdfioContentTextEnd(page_st);
}
pdfioStreamClose(page_st);
Markdown is a simple plain text format that supports things like headings, links, character styles, tables, and embedded images. The md2pdf.c example code uses the mmd library to convert markdown to a PDF file that can be distributed.
Note: The md2pdf example is by far the most complex example code included with PDFio and shows how to lay out text, add headers and footers, add links, embed images, format tables, and add an outline (table of contents) for navigation.
The md2pdf program needs to maintain three sets of state - one for the markdown document, which is represented by nodes of type mmd_t, and the others for the PDF document and current PDF page, which are contained in the docdata_t structure:
typedef struct docdata_s // Document formatting data
{
// State for the whole document
pdfio_file_t *pdf; // PDF file
pdfio_rect_t media_box; // Media (page) box
pdfio_rect_t crop_box; // Crop box (for margins)
pdfio_rect_t art_box; // Art box (for markdown content)
pdfio_obj_t *fonts[DOCFONT_MAX]; // Embedded fonts
double font_space; // Unit width of a space
size_t num_images; // Number of embedded images
docimage_t images[DOCIMAGE_MAX]; // Embedded images
const char *title; // Document title
char *heading; // Current document heading
size_t num_actions; // Number of actions for this document
docaction_t actions[DOCACTION_MAX]; // Actions for this document
size_t num_targets; // Number of targets for this document
doctarget_t targets[DOCTARGET_MAX]; // Targets for this document
size_t num_toc; // Number of table-of-contents entries
doctoc_t toc[DOCTOC_MAX]; // Table-of-contents entries
// State for the current page
pdfio_stream_t *st; // Current page stream
double y; // Current position on page
docfont_t font; // Current font
double fsize; // Current font size
doccolor_t color; // Current color
pdfio_array_t *annots_array; // Annotations array (for links)
pdfio_obj_t *annots_obj; // Annotations object (for links)
size_t num_links; // Number of links for this page
doclink_t links[DOCLINK_MAX]; // Links for this page
} docdata_t;
The output is fixed to the "universal" media size (the intersection of US Letter and ISO A4) with 1/2 inch margins - the PAGE_ constants can be changed to select a different size or margins. The media_box member contains the "MediaBox" rectangle for the PDF pages, while the crop_box and art_box members contain the "CropBox" and "ArtBox" values, respectively.
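The "universal" size can be derived rather than memorized. The sketch below is an illustration only: the rect_t type and the universal_media_box and inset_box names are hypothetical, and exactly how md2pdf applies its margins to the crop and art boxes is not shown here.

```c
#include <assert.h>

// Hypothetical stand-in for pdfio_rect_t
typedef struct rect_s
{
  double x1, y1;  // Lower-left corner
  double x2, y2;  // Upper-right corner
} rect_t;

// Intersection of US Letter (8.5x11in) and ISO A4 (210x297mm) in points
static rect_t
universal_media_box(void)
{
  double letter_width  = 8.5 * 72.0;
  double letter_height = 11.0 * 72.0;
  double a4_width      = 210.0 / 25.4 * 72.0;
  double a4_height     = 297.0 / 25.4 * 72.0;
  rect_t box;

  box.x1 = 0.0;
  box.y1 = 0.0;
  box.x2 = a4_width < letter_width ? a4_width : letter_width;
  box.y2 = a4_height < letter_height ? a4_height : letter_height;

  return (box);
}

// Shrink a box by the same margin on all four sides
static rect_t
inset_box(rect_t box, double margin)
{
  assert(margin >= 0.0);

  box.x1 += margin;
  box.y1 += margin;
  box.x2 -= margin;
  box.y2 -= margin;

  return (box);
}
```

The width comes from A4 (about 595.28 points) and the height from US Letter (792 points); a 1/2 inch margin is 36 points.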
Four embedded fonts are used:
DOCFONT_REGULAR: the default font used for text,
DOCFONT_BOLD: a boldface font used for headings and strong text,
DOCFONT_ITALIC: an italic/oblique font used for emphasized text, and
DOCFONT_MONOSPACE: a fixed-width font used for code.
By default the code uses the base PostScript fonts Helvetica, Helvetica-Bold, Helvetica-Oblique, and Courier. The USE_TRUETYPE define can be used to replace these with the Roboto TrueType fonts.
Embedded JPEG and PNG images are copied into the PDF document, with the images array containing the list of the images and their objects.
The title member contains the document title, while the heading member contains the current heading text.
The actions array contains a list of action dictionaries for interior document links that need to be resolved, while the targets array keeps track of the location of the headings in the PDF document.
The toc array contains a list of headings and is used to construct the PDF outlines dictionaries/objects, which provide a table of contents for navigation in most PDF readers.
The st member provides the stream for the current page content. The color, font, fsize, and y members provide the current graphics state on the page.
The annots_array, annots_obj, num_links, and links members contain a list of hyperlinks on the current page.
The new_page function is used to start a new page. Aside from creating the new page object and stream, it adds a standard header and footer to the page. It starts by closing the current page if it is open:
// Close the current page...
if (dd->st)
{
pdfioStreamClose(dd->st);
add_links(dd);
}
The new page needs a dictionary containing any link annotations, the media and art boxes, the four fonts, and any images:
// Prep the new page...
page_dict = pdfioDictCreate(dd->pdf);
dd->annots_array = pdfioArrayCreate(dd->pdf);
dd->annots_obj = pdfioFileCreateArrayObj(dd->pdf, dd->annots_array);
pdfioDictSetObj(page_dict, "Annots", dd->annots_obj);
pdfioDictSetRect(page_dict, "MediaBox", &dd->media_box);
pdfioDictSetRect(page_dict, "ArtBox", &dd->art_box);
for (fontface = DOCFONT_REGULAR; fontface < DOCFONT_MAX; fontface ++)
pdfioPageDictAddFont(page_dict, docfont_names[fontface], dd->fonts[fontface]);
for (i = 0; i < dd->num_images; i ++)
pdfioPageDictAddImage(page_dict, pdfioStringCreatef(dd->pdf, "I%u", (unsigned)i),
dd->images[i].obj);
Once the page dictionary is initialized, we create a new page and initialize the current graphics state:
dd->st = pdfioFileCreatePage(dd->pdf, page_dict);
dd->color = DOCCOLOR_BLACK;
dd->font = DOCFONT_MAX;
dd->fsize = 0.0;
dd->y = dd->art_box.y2;
The header consists of a dark gray separating line and the document title. We don't show the header on the first page:
// Add header/footer text
set_color(dd, DOCCOLOR_GRAY);
set_font(dd, DOCFONT_REGULAR, SIZE_HEADFOOT);
if (pdfioFileGetNumPages(dd->pdf) > 1 && dd->title)
{
// Show title in header...
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], dd->title,
SIZE_HEADFOOT);
pdfioContentTextBegin(dd->st);
pdfioContentTextMoveTo(dd->st,
dd->crop_box.x1 + 0.5 * (dd->crop_box.x2 -
dd->crop_box.x1 - width),
dd->crop_box.y2 - SIZE_HEADFOOT);
pdfioContentTextShow(dd->st, UNICODE_VALUE, dd->title);
pdfioContentTextEnd(dd->st);
pdfioContentPathMoveTo(dd->st, dd->crop_box.x1,
dd->crop_box.y2 - 2 * SIZE_HEADFOOT * LINE_HEIGHT +
SIZE_HEADFOOT);
pdfioContentPathLineTo(dd->st, dd->crop_box.x2,
dd->crop_box.y2 - 2 * SIZE_HEADFOOT * LINE_HEIGHT +
SIZE_HEADFOOT);
pdfioContentStroke(dd->st);
}
The footer contains the same dark gray separating line with the current heading and page number on opposite sides. The page number is always positioned on the outer edge for a two-sided print - right justified on odd numbered pages and left justified on even numbered pages:
// Show page number and current heading...
pdfioContentPathMoveTo(dd->st, dd->crop_box.x1,
dd->crop_box.y1 + SIZE_HEADFOOT * LINE_HEIGHT);
pdfioContentPathLineTo(dd->st, dd->crop_box.x2,
dd->crop_box.y1 + SIZE_HEADFOOT * LINE_HEIGHT);
pdfioContentStroke(dd->st);
pdfioContentTextBegin(dd->st);
snprintf(temp, sizeof(temp), "%u", (unsigned)pdfioFileGetNumPages(dd->pdf));
if (pdfioFileGetNumPages(dd->pdf) & 1)
{
// Page number on right...
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], temp, SIZE_HEADFOOT);
pdfioContentTextMoveTo(dd->st, dd->crop_box.x2 - width, dd->crop_box.y1);
}
else
{
// Page number on left...
pdfioContentTextMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y1);
}
pdfioContentTextShow(dd->st, UNICODE_VALUE, temp);
pdfioContentTextEnd(dd->st);
if (dd->heading)
{
pdfioContentTextBegin(dd->st);
if (pdfioFileGetNumPages(dd->pdf) & 1)
{
// Current heading on left...
pdfioContentTextMoveTo(dd->st, dd->crop_box.x1, dd->crop_box.y1);
}
else
{
width = pdfioContentTextMeasure(dd->fonts[DOCFONT_REGULAR], dd->heading,
SIZE_HEADFOOT);
pdfioContentTextMoveTo(dd->st, dd->crop_box.x2 - width, dd->crop_box.y1);
}
pdfioContentTextShow(dd->st, UNICODE_VALUE, dd->heading);
pdfioContentTextEnd(dd->st);
}
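The outer-edge rule for the footer reduces to a parity test plus a left or right justified position. The following sketch uses hypothetical helper names (page_number_on_right and footer_text_x) and assumes page numbers start at 1, as in the code above.

```c
#include <assert.h>
#include <stdbool.h>

// Odd pages are right-hand pages in a two-sided print, so the page number
// goes on the right; even pages put it on the left.
static bool
page_number_on_right(unsigned page_number)
{
  assert(page_number > 0);

  return ((page_number & 1) != 0);
}

// X position for footer text that is either right justified against the
// right edge or left justified against the left edge.
static double
footer_text_x(bool right_side, double left, double right, double text_width)
{
  return (right_side ? right - text_width : left);
}
```

The current heading uses the same helper with the opposite side, so the two strings never compete for the same edge.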
Four functions handle the formatting of the markdown document:
format_block: formats a single paragraph, heading, or table cell,
format_code: formats a block of code,
format_doc: formats the document as a whole, and
format_table: formats a table.
Formatted content is organized into arrays of linefrag_t and tablerow_t structures for a line of content or row of table cells, respectively.
The format_doc function iterates over the block nodes in the markdown document. We map a "thematic break" (horizontal rule) to a page break, which is implemented by moving the current vertical position to the bottom of the page:
case MMD_TYPE_THEMATIC_BREAK :
// Force a page break
dd->y = dd->art_box.y1;
break;
A block quote is indented and uses the italic font by default:
case MMD_TYPE_BLOCK_QUOTE :
format_doc(dd, current, DOCFONT_ITALIC, left + BQ_PADDING, right - BQ_PADDING);
break;
Lists have a leading blank line and are indented:
case MMD_TYPE_ORDERED_LIST :
case MMD_TYPE_UNORDERED_LIST :
if (dd->st)
dd->y -= SIZE_BODY * LINE_HEIGHT;
format_doc(dd, current, deffont, left + LIST_PADDING, right);
break;
List items do not have a leading blank line and make use of leader text that is shown in front of the list text. The leader text is either the current item number or a bullet, which is then formatted directly using the format_block function:
case MMD_TYPE_LIST_ITEM :
if (doctype == MMD_TYPE_ORDERED_LIST)
{
snprintf(leader, sizeof(leader), "%d. ", i);
format_block(dd, current, deffont, SIZE_BODY, left, right, leader);
}
else
{
format_block(dd, current, deffont, SIZE_BODY, left, right, /*leader*/"• ");
}
break;
Paragraphs have a leading blank line and are likewise directly formatted:
case MMD_TYPE_PARAGRAPH :
// Add a blank line before the paragraph...
dd->y -= SIZE_BODY * LINE_HEIGHT;
// Format the paragraph...
format_block(dd, current, deffont, SIZE_BODY, left, right, /*leader*/NULL);
break;
Tables have a leading blank line and are formatted using the format_table function:
case MMD_TYPE_TABLE :
// Add a blank line before the table...
dd->y -= SIZE_BODY * LINE_HEIGHT;
// Format the table...
format_table(dd, current, left, right);
break;
Code blocks have a leading blank line, are indented slightly (to account for the padded background), and are formatted using the format_code function:
case MMD_TYPE_CODE_BLOCK :
// Add a blank line before the code block...
dd->y -= SIZE_BODY * LINE_HEIGHT;
// Format the code block...
format_code(dd, current, left + CODE_PADDING, right - CODE_PADDING);
break;
Headings get some extra processing. First, the current heading is remembered in the docdata_t structure so it can be used in the page footer:
case MMD_TYPE_HEADING_1 :
case MMD_TYPE_HEADING_2 :
case MMD_TYPE_HEADING_3 :
case MMD_TYPE_HEADING_4 :
case MMD_TYPE_HEADING_5 :
case MMD_TYPE_HEADING_6 :
// Update the current heading
free(dd->heading);
dd->heading = mmdCopyAllText(current);
Then we add a blank line and format the heading with the boldface font at a larger size using the format_block function:
// Add a blank line before the heading...
dd->y -= heading_sizes[curtype - MMD_TYPE_HEADING_1] * LINE_HEIGHT;
// Format the heading...
format_block(dd, current, DOCFONT_BOLD,
heading_sizes[curtype - MMD_TYPE_HEADING_1], left, right,
/*leader*/NULL);
Once the heading is formatted, we record it in the toc array as a PDF outline item object/dictionary:
// Add the heading to the table-of-contents...
if (dd->num_toc < DOCTOC_MAX)
{
doctoc_t *t = dd->toc + dd->num_toc;
// New TOC
pdfio_array_t *dest; // Destination array
t->level = curtype - MMD_TYPE_HEADING_1;
t->dict = pdfioDictCreate(dd->pdf);
t->obj = pdfioFileCreateObj(dd->pdf, t->dict);
dest = pdfioArrayCreate(dd->pdf);
pdfioArrayAppendObj(dest,
pdfioFileGetPage(dd->pdf, pdfioFileGetNumPages(dd->pdf) - 1));
pdfioArrayAppendName(dest, "XYZ");
pdfioArrayAppendNumber(dest, PAGE_LEFT);
pdfioArrayAppendNumber(dest,
dd->y + heading_sizes[curtype - MMD_TYPE_HEADING_1] * LINE_HEIGHT);
pdfioArrayAppendNumber(dest, 0.0);
pdfioDictSetArray(t->dict, "Dest", dest);
pdfioDictSetString(t->dict, "Title", pdfioStringCreate(dd->pdf, dd->heading));
dd->num_toc ++;
}
Finally, we also save the heading's target name and its location in the targets array to allow interior links to work:
// Add the heading to the list of link targets...
if (dd->num_targets < DOCTARGET_MAX)
{
doctarget_t *t = dd->targets + dd->num_targets;
// New target
make_target_name(t->name, dd->heading, sizeof(t->name));
t->page = pdfioFileGetNumPages(dd->pdf) - 1;
t->y = dd->y + heading_sizes[curtype - MMD_TYPE_HEADING_1] * LINE_HEIGHT;
dd->num_targets ++;
}
break;
Paragraphs, headings, list items, and table cells all use the same basic formatting algorithm. Text, checkboxes, and images are collected until the nodes in the current block are used up or the content reaches the right margin.
In order to keep adjacent blocks of text together, the formatting algorithm makes sure that at least 3 lines of text can fit before the bottom edge of the page:
if (mmdGetNextSibling(block))
need_bottom = 3.0 * SIZE_BODY * LINE_HEIGHT;
else
need_bottom = 0.0;
Leader text (used for list items) is right justified to the left margin and becomes the first fragment on the line when present.
if (leader)
{
// Add leader text on first line...
frags[0].type = MMD_TYPE_NORMAL_TEXT;
frags[0].width = pdfioContentTextMeasure(dd->fonts[deffont], leader, fsize);
frags[0].height = fsize;
frags[0].x = left - frags[0].width;
frags[0].imagenum = 0;
frags[0].text = leader;
frags[0].url = NULL;
frags[0].ws = false;
frags[0].font = deffont;
frags[0].color = DOCCOLOR_BLACK;
num_frags = 1;
lineheight = fsize * LINE_HEIGHT;
}
else
{
// No leader text...
num_frags = 0;
lineheight = 0.0;
}
frag = frags + num_frags;
If the current content fragment won't fit, we call render_line to draw what we have, adjusting the left margin as needed for table cells:
// See if this node will fit on the current line...
if ((num_frags > 0 && (x + width + wswidth) >= right) || num_frags == LINEFRAG_MAX)
{
// No, render this line and start over...
if (blocktype == MMD_TYPE_TABLE_HEADER_CELL ||
blocktype == MMD_TYPE_TABLE_BODY_CELL_CENTER)
margin_left = 0.5 * (right - x);
else if (blocktype == MMD_TYPE_TABLE_BODY_CELL_RIGHT)
margin_left = right - x;
else
margin_left = 0.0;
render_line(dd, margin_left, need_bottom, lineheight, num_frags, frags);
num_frags = 0;
frag = frags;
x = left;
lineheight = 0.0;
need_bottom = 0.0;
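The left margin adjustment is simply the unused width on the line, either split in half (centered cells) or used entirely (right-aligned cells). A minimal sketch of that calculation follows; the align_t enumeration and align_offset name are hypothetical and stand in for the block type checks in the code above.

```c
#include <assert.h>

// Hypothetical alignment values standing in for the table cell block types
typedef enum align_e
{
  ALIGN_LEFT,
  ALIGN_CENTER,
  ALIGN_RIGHT
} align_t;

// Compute how far to shift a line whose content currently ends at "x" when
// the right margin is at "right".
static double
align_offset(align_t align, double x, double right)
{
  assert(x <= right);

  if (align == ALIGN_CENTER)
    return (0.5 * (right - x));  // Half of the unused width on each side
  else if (align == ALIGN_RIGHT)
    return (right - x);          // All of the unused width on the left
  else
    return (0.0);                // Left aligned, no shift
}
```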
Block quotes (blocks whose default font is italic) have an orange bar to the left of the block:
if (deffont == DOCFONT_ITALIC)
{
// Add an orange bar to the left of block quotes...
set_color(dd, DOCCOLOR_ORANGE);
pdfioContentSave(dd->st);
pdfioContentSetLineWidth(dd->st, 3.0);
pdfioContentPathMoveTo(dd->st, left - 6.0, dd->y - (LINE_HEIGHT - 1.0) * fsize);
pdfioContentPathLineTo(dd->st, left - 6.0, dd->y + fsize);
pdfioContentStroke(dd->st);
pdfioContentRestore(dd->st);
}
Finally, we add the current content fragment to the array:
// Add the current node to the fragment list
if (num_frags == 0)
{
// No leading whitespace at the start of the line
ws = false;
wswidth = 0.0;
}
frag->type = type;
frag->x = x;
frag->width = width + wswidth;
frag->height = text ? fsize : height;
frag->imagenum = imagenum;
frag->text = text;
frag->url = url;
frag->ws = ws;
frag->font = font;
frag->color = color;
num_frags ++;
frag ++;
x += width + wswidth;
if (height > lineheight)
lineheight = height;
Code blocks consist of one or more lines of plain monospaced text. We draw a light gray background behind each line with a small bit of padding at the top and bottom:
// Draw the top padding...
set_color(dd, DOCCOLOR_LTGRAY);
pdfioContentPathRect(dd->st, left - CODE_PADDING, dd->y + SIZE_CODEBLOCK,
right - left + 2.0 * CODE_PADDING, CODE_PADDING);
pdfioContentFillAndStroke(dd->st, false);
// Start a code text block...
set_font(dd, DOCFONT_MONOSPACE, SIZE_CODEBLOCK);
pdfioContentTextBegin(dd->st);
pdfioContentTextMoveTo(dd->st, left, dd->y);
for (code = mmdGetFirstChild(block); code; code = mmdGetNextSibling(code))
{
set_color(dd, DOCCOLOR_LTGRAY);
pdfioContentPathRect(dd->st, left - CODE_PADDING,
dd->y - (LINE_HEIGHT - 1.0) * SIZE_CODEBLOCK,
right - left + 2.0 * CODE_PADDING, lineheight);
pdfioContentFillAndStroke(dd->st, false);
set_color(dd, DOCCOLOR_RED);
pdfioContentTextShow(dd->st, UNICODE_VALUE, mmdGetText(code));
dd->y -= lineheight;
if (dd->y < dd->art_box.y1)
{
// End the current text block...
pdfioContentTextEnd(dd->st);
// Start a new page...
new_page(dd);
set_font(dd, DOCFONT_MONOSPACE, SIZE_CODEBLOCK);
dd->y -= lineheight;
pdfioContentTextBegin(dd->st);
pdfioContentTextMoveTo(dd->st, left, dd->y);
}
}
// End the current text block...
pdfioContentTextEnd(dd->st);
dd->y += lineheight;
// Draw the bottom padding...
set_color(dd, DOCCOLOR_LTGRAY);
pdfioContentPathRect(dd->st, left - CODE_PADDING,
dd->y - CODE_PADDING - (LINE_HEIGHT - 1.0) * SIZE_CODEBLOCK,
right - left + 2.0 * CODE_PADDING, CODE_PADDING);
pdfioContentFillAndStroke(dd->st, false);
Tables are the most difficult to format. We start by scanning the entire table and measuring every cell with the measure_cell function:
for (num_cols = 0, num_rows = 0, rowptr = rows, current = mmdGetFirstChild(table);
current && num_rows < TABLEROW_MAX;
current = next)
{
next = mmd_walk_next(table, current);
type = mmdGetType(current);
if (type == MMD_TYPE_TABLE_ROW)
{
// Parse row...
for (col = 0, current = mmdGetFirstChild(current);
current && num_cols < TABLECOL_MAX;
current = mmdGetNextSibling(current), col ++)
{
rowptr->cells[col] = current;
measure_cell(dd, current, cols + col);
if (col >= num_cols)
num_cols = col + 1;
}
rowptr ++;
num_rows ++;
}
}
The measure_cell function also updates the minimum and maximum width needed for each column. To this we add the cell padding to compute the total table width:
// Figure out the width of each column...
for (col = 0, table_width = 0.0; col < num_cols; col ++)
{
cols[col].max_width += 2.0 * TABLE_PADDING;
table_width += cols[col].max_width;
cols[col].width = cols[col].max_width;
}
If the calculated width is more than the available width, we need to adjust the width of the columns. The algorithm used here breaks the available width into N equal-width columns - any columns wider than this will be scaled proportionately. This works out as two steps - one to calculate the base width of "narrow" columns and a second to distribute the remaining width amongst the wider columns:
format_width = right - left - 2.0 * TABLE_PADDING * num_cols;
if (table_width > format_width)
{
// Content too wide, try scaling the widths...
double avg_width, // Average column width
base_width, // Base width
remaining_width, // Remaining width
scale_width; // Width for scaling
size_t num_remaining_cols = 0; // Number of remaining columns
// First mark any columns that are narrower than the average width...
avg_width = format_width / num_cols;
for (col = 0, base_width = 0.0, remaining_width = 0.0; col < num_cols; col ++)
{
if (cols[col].width > avg_width)
{
remaining_width += cols[col].width;
num_remaining_cols ++;
}
else
{
base_width += cols[col].width;
}
}
// Then proportionately distribute the remaining width to the other columns...
format_width -= base_width;
for (col = 0, table_width = 0.0; col < num_cols; col ++)
{
if (cols[col].width > avg_width)
cols[col].width = cols[col].width * format_width / remaining_width;
table_width += cols[col].width;
}
}
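The two-step distribution is easier to follow on plain arrays. The sketch below restates the same idea outside of md2pdf; the fit_columns name is hypothetical and the widths are assumed to already include any padding.

```c
#include <assert.h>
#include <stddef.h>

// Scale column widths in place so that they fit in "format_width".  Columns
// no wider than the average keep their width; wider columns share whatever
// space is left in proportion to their original widths.  Returns the new
// total width.
static double
fit_columns(double *widths, size_t num_cols, double format_width)
{
  double avg_width;              // Average column width
  double base_width = 0.0;       // Total width of the narrow columns
  double remaining_width = 0.0;  // Total width of the wide columns
  double total = 0.0;            // Total table width
  size_t col;                    // Current column

  assert(widths != NULL && num_cols > 0 && format_width > 0.0);

  for (col = 0; col < num_cols; col ++)
    total += widths[col];

  if (total <= format_width)
    return (total);              // Already fits, nothing to do

  // Step 1: split the columns into narrow and wide groups...
  avg_width = format_width / (double)num_cols;

  for (col = 0; col < num_cols; col ++)
  {
    if (widths[col] > avg_width)
      remaining_width += widths[col];
    else
      base_width += widths[col];
  }

  // Step 2: scale the wide columns into the space the narrow ones leave...
  for (col = 0, total = 0.0; col < num_cols; col ++)
  {
    if (widths[col] > avg_width)
      widths[col] = widths[col] * (format_width - base_width) / remaining_width;

    total += widths[col];
  }

  return (total);
}
```

For example, columns of 50, 100, and 300 points in 300 points of space have an average of 100 points, so the first two keep their widths and the third is scaled to the remaining 150 points.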
Now that we have the widths of the columns, we can calculate the left and right margins of each column for formatting the cell text:
// Calculate the margins of each column in preparation for formatting
for (col = 0, x = left + TABLE_PADDING; col < num_cols; col ++)
{
cols[col].left = x;
cols[col].right = x + cols[col].width;
x += cols[col].width + 2.0 * TABLE_PADDING;
}
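As a small worked example of the margin calculation, the sketch below lays out columns with padding on both sides of each cell. The col_t type and layout_columns name are hypothetical simplifications of the column structure used by md2pdf.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical, simplified column record
typedef struct col_s
{
  double width;  // Content width of the column
  double left;   // Left margin for cell content
  double right;  // Right margin for cell content
} col_t;

// Assign left/right content margins to each column, leaving "padding" points
// on both sides of every cell.
static void
layout_columns(col_t *cols, size_t num_cols, double left, double padding)
{
  size_t col;                  // Current column
  double x = left + padding;   // Current content position

  assert(cols != NULL && padding >= 0.0);

  for (col = 0; col < num_cols; col ++)
  {
    cols[col].left  = x;
    cols[col].right = x + cols[col].width;
    x += cols[col].width + 2.0 * padding;
  }
}
```

With a left edge at 36 points, 5 points of padding, and widths of 100 and 50 points, the content margins are 41 to 141 and 151 to 201.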
Then we re-measure the cells using the final column widths to determine the height of each cell and row:
// Calculate the height of each row and cell in preparation for formatting
for (row = 0, rowptr = rows; row < num_rows; row ++, rowptr ++)
{
for (col = 0; col < num_cols; col ++)
{
height = measure_cell(dd, rowptr->cells[col], cols + col) + 2.0 * TABLE_PADDING;
if (height > rowptr->height)
rowptr->height = height;
}
}
Finally, we render each row in the table:
// Render each table row...
for (row = 0, rowptr = rows; row < num_rows; row ++, rowptr ++)
render_row(dd, num_cols, cols, rowptr);
The formatted content in arrays of linefrag_t and tablerow_t structures is passed to the render_line and render_row functions, respectively, to produce content in the PDF document.
The render_line function adds content from the linefrag_t array to a PDF page. It starts by determining whether a new page is needed:
if (!dd->st)
{
new_page(dd);
margin_top = 0.0;
}
dd->y -= margin_top + lineheight;
if ((dd->y - need_bottom) < dd->art_box.y1)
{
new_page(dd);
dd->y -= lineheight;
}
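The page break test can be expressed as a pure function of the current position. This is a sketch with a hypothetical name (line_needs_new_page); it mirrors the comparison above but does not create pages or update any state.

```c
#include <assert.h>
#include <stdbool.h>

// Return true when a line of the given height, plus any space that must
// remain below it, would extend past the bottom of the content area.
static bool
line_needs_new_page(double y, double margin_top, double lineheight,
                    double need_bottom, double page_bottom)
{
  double new_y;  // Baseline position after moving down for this line

  assert(lineheight >= 0.0 && need_bottom >= 0.0);

  new_y = y - margin_top - lineheight;

  return ((new_y - need_bottom) < page_bottom);
}
```

This also shows why setting the current position to the bottom of the art box forces a page break: any line with a non-zero height then falls below the bottom edge.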
We then loop through the fragments for the current line, drawing checkboxes, images, and text as needed. When a hyperlink is present, we add the link to the links array in the docdata_t structure, mapping "@" and "@@" to an internal link corresponding to the linked text:
if (frag->url && dd->num_links < DOCLINK_MAX)
{
doclink_t *l = dd->links + dd->num_links;
// Pointer to this link record
if (!strcmp(frag->url, "@"))
{
// Use mapped text as link target...
char targetlink[129]; // Targeted link
targetlink[0] = '#';
make_target_name(targetlink + 1, frag->text, sizeof(targetlink) - 1);
l->url = pdfioStringCreate(dd->pdf, targetlink);
}
else if (!strcmp(frag->url, "@@"))
{
// Use literal text as anchor...
l->url = pdfioStringCreatef(dd->pdf, "#%s", frag->text);
}
else
{
// Use URL as-is...
l->url = frag->url;
}
l->box.x1 = frag->x;
l->box.y1 = dd->y;
l->box.x2 = frag->x + frag->width;
l->box.y2 = dd->y + frag->height;
dd->num_links ++;
}
These are later written as annotations in the add_links function.
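The "@" and "@@" mapping can be tried out without a PDF file. The sketch below is an illustration under stated assumptions: slug_name is a hypothetical stand-in for make_target_name (whose actual rules are not shown in this section and may differ), and resolve_link is a hypothetical helper that writes the result to a caller-supplied buffer instead of creating a PDFio string.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

// Hypothetical stand-in for make_target_name: lowercase letters and digits,
// keep '.' and '-', turn spaces into '-', and drop everything else.
static void
slug_name(char *dst, const char *src, size_t dstsize)
{
  char *dstptr = dst;                 // Pointer into destination buffer
  char *dstend = dst + dstsize - 1;   // End of destination buffer

  assert(dst != NULL && src != NULL && dstsize > 0);

  while (*src && dstptr < dstend)
  {
    if (isalnum((unsigned char)*src) || *src == '.' || *src == '-')
      *dstptr++ = (char)tolower((unsigned char)*src);
    else if (*src == ' ')
      *dstptr++ = '-';

    src ++;
  }

  *dstptr = '\0';
}

// Resolve a link URL: "@" maps the link text to a target name, "@@" uses the
// link text literally as the anchor, and anything else is used as-is.
static void
resolve_link(char *dst, size_t dstsize, const char *url, const char *text)
{
  assert(dst != NULL && url != NULL && text != NULL && dstsize > 1);

  if (!strcmp(url, "@"))
  {
    dst[0] = '#';
    slug_name(dst + 1, text, dstsize - 1);
  }
  else if (!strcmp(url, "@@"))
  {
    snprintf(dst, dstsize, "#%s", text);
  }
  else
  {
    snprintf(dst, dstsize, "%s", url);
  }
}
```

Under these assumed rules, a link with the URL "@" and the text "Getting Started" resolves to "#getting-started", while the same text with "@@" resolves to "#Getting Started".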
The render_row function takes a row of cells and the corresponding column definitions. It starts by drawing the border boxes around body cells:
if (mmdGetType(row->cells[0]) == MMD_TYPE_TABLE_HEADER_CELL)
{
// Header row, no border...
deffont = DOCFONT_BOLD;
}
else
{
// Regular body row, add borders...
deffont = DOCFONT_REGULAR;
set_color(dd, DOCCOLOR_GRAY);
pdfioContentPathRect(dd->st, cols[0].left - TABLE_PADDING, dd->y - row->height,
cols[num_cols - 1].right - cols[0].left +
2.0 * TABLE_PADDING, row->height);
for (col = 1; col < num_cols; col ++)
{
pdfioContentPathMoveTo(dd->st, cols[col].left - TABLE_PADDING, dd->y);
pdfioContentPathLineTo(dd->st, cols[col].left - TABLE_PADDING, dd->y - row->height);
}
pdfioContentStroke(dd->st);
}
Then it formats each cell using the format_block function described previously. The page y value is reset before formatting each cell:
row_y = dd->y;
for (col = 0; col < num_cols; col ++)
{
dd->y = row_y;
format_block(dd, row->cells[col], deffont, SIZE_TABLE, cols[col].left,
cols[col].right, /*leader*/NULL);
}
dd->y = row_y - row->height;
Add an array value to an array.
bool pdfioArrayAppendArray(pdfio_array_t *a, pdfio_array_t *value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a binary string value to an array.
bool pdfioArrayAppendBinary(pdfio_array_t *a, const unsigned char *value, size_t valuelen);
a | Array |
---|---|
value | Value |
valuelen | Length of value |
true on success, false on failure
Add a boolean value to an array.
bool pdfioArrayAppendBoolean(pdfio_array_t *a, bool value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a date value to an array.
bool pdfioArrayAppendDate(pdfio_array_t *a, time_t value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a dictionary to an array.
bool pdfioArrayAppendDict(pdfio_array_t *a, pdfio_dict_t *value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a name to an array.
bool pdfioArrayAppendName(pdfio_array_t *a, const char *value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a number to an array.
bool pdfioArrayAppendNumber(pdfio_array_t *a, double value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add an indirect object reference to an array.
bool pdfioArrayAppendObj(pdfio_array_t *a, pdfio_obj_t *value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Add a string to an array.
bool pdfioArrayAppendString(pdfio_array_t *a, const char *value);
a | Array |
---|---|
value | Value |
true on success, false on failure
Copy an array.
pdfio_array_t *pdfioArrayCopy(pdfio_file_t *pdf, pdfio_array_t *a);
pdf | PDF file |
---|---|
a | Original array |
New array or NULL on error
Create an empty array.
pdfio_array_t *pdfioArrayCreate(pdfio_file_t *pdf);
pdf | PDF file |
---|
New array or NULL on error
Create an ICC-based color space array.
pdfio_array_t *pdfioArrayCreateColorFromICCObj(pdfio_file_t *pdf, pdfio_obj_t *icc_object);
pdf | PDF file |
---|---|
icc_object | ICC profile object |
Color array
Create a calibrated color space array using a CIE XYZ transform matrix.
pdfio_array_t *pdfioArrayCreateColorFromMatrix(pdfio_file_t *pdf, size_t num_colors, double gamma, const double matrix[3][3], const double white_point[3]);
pdf | PDF file |
---|---|
num_colors | Number of colors (1 or 3) |
gamma | Gamma value |
matrix[3][3] | XYZ transform |
white_point[3] | White point |
Color space array
Create an indexed color space array.
pdfio_array_t *pdfioArrayCreateColorFromPalette(pdfio_file_t *pdf, size_t num_colors, const unsigned char *colors);
pdf | PDF file |
---|---|
num_colors | Number of colors |
colors | RGB values for colors |
Color array
Create a calibrated color space array using CIE xy primary chromaticities.
pdfio_array_t *pdfioArrayCreateColorFromPrimaries(pdfio_file_t *pdf, size_t num_colors, double gamma, double wx, double wy, double rx, double ry, double gx, double gy, double bx, double by);
pdf | PDF file |
---|---|
num_colors | Number of colors (1 or 3) |
gamma | Gamma value |
wx | White point X chromaticity |
wy | White point Y chromaticity |
rx | Red X chromaticity |
ry | Red Y chromaticity |
gx | Green X chromaticity |
gy | Green Y chromaticity |
bx | Blue X chromaticity |
by | Blue Y chromaticity |
Color space array
Create a color array for a standard color space.
pdfio_array_t *pdfioArrayCreateColorFromStandard(pdfio_file_t *pdf, size_t num_colors, pdfio_cs_t cs);
pdf | PDF file |
---|---|
num_colors | Number of colors (1 or 3) |
cs | Color space enumeration |
Color array
This function creates a color array for a standard PDFIO_CS_ enumerated color space.
The "num_colors" argument must be 1 for grayscale and 3 for RGB color.
Get an array value from an array.
pdfio_array_t *pdfioArrayGetArray(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a binary string value from an array.
unsigned char *pdfioArrayGetBinary(pdfio_array_t *a, size_t n, size_t *length);
a | Array |
---|---|
n | Index |
length | Length of string |
Value
Get a boolean value from an array.
bool pdfioArrayGetBoolean(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a date value from an array.
time_t pdfioArrayGetDate(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a dictionary value from an array.
pdfio_dict_t *pdfioArrayGetDict(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a name value from an array.
const char *pdfioArrayGetName(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a number from an array.
double pdfioArrayGetNumber(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get an indirect object reference from an array.
pdfio_obj_t *pdfioArrayGetObj(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get the length of an array.
size_t pdfioArrayGetSize(pdfio_array_t *a);
a | Array |
---|
Length of array
Get a string value from an array.
const char *pdfioArrayGetString(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value
Get a value type from an array.
pdfio_valtype_t pdfioArrayGetType(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
Value type
Remove an array entry.
bool pdfioArrayRemove(pdfio_array_t *a, size_t n);
a | Array |
---|---|
n | Index |
true on success, false otherwise
Clip output to the current path.
bool pdfioContentClip(pdfio_stream_t *st, bool even_odd);
st | Stream |
---|---|
even_odd | Even/odd fill vs. non-zero winding rule |
true on success, false on failure
Draw an image object.
bool pdfioContentDrawImage(pdfio_stream_t *st, const char *name, double x, double y, double width, double height);
st | Stream |
---|---|
name | Image name |
x | X offset of image |
y | Y offset of image |
width | Width of image |
height | Height of image |
true on success, false on failure
The object name must be part of the page dictionary resources, typically using the pdfioPageDictAddImage function.
Fill the current path.
bool pdfioContentFill(pdfio_stream_t *st, bool even_odd);
st | Stream |
---|---|
even_odd | Even/odd fill vs. non-zero winding rule |
true on success, false on failure
Fill and stroke the current path.
bool pdfioContentFillAndStroke(pdfio_stream_t *st, bool even_odd);
st | Stream |
---|---|
even_odd | Even/odd fill vs. non-zero winding rule |
true on success, false on failure
Concatenate a matrix to the current graphics state.
bool pdfioContentMatrixConcat(pdfio_stream_t *st, pdfio_matrix_t m);
st | Stream |
---|---|
m | Transform matrix |
true on success, false on failure
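As an illustration of what concatenation means, the sketch below multiplies two 3x2 affine matrices using the row-vector convention that PDF defines for transforms. The mat32 type and both helper functions are hypothetical stand-ins so the block compiles without PDFio; they are not part of the library.

```c
/* Hypothetical 3x2 affine matrix: rows are (a b), (c d), (e f). */
typedef struct
{
  double v[3][2];
} mat32;

/* result = m * ctm, treating both as 3x3 matrices with an implied
 * third column of (0 0 1). */
static void mat32_concat(const mat32 *m, const mat32 *ctm, mat32 *result)
{
  for (int row = 0; row < 3; row ++)
  {
    for (int col = 0; col < 2; col ++)
    {
      result->v[row][col] = m->v[row][0] * ctm->v[0][col] +
                            m->v[row][1] * ctm->v[1][col];

      /* Only the last row has a 1 in the implied third column. */
      if (row == 2)
        result->v[row][col] += ctm->v[2][col];
    }
  }
}

/* Map a point through a matrix: (x y 1) * m. */
static void mat32_apply(const mat32 *m, double x, double y, double *ox, double *oy)
{
  *ox = x * m->v[0][0] + y * m->v[1][0] + m->v[2][0];
  *oy = x * m->v[0][1] + y * m->v[1][1] + m->v[2][1];
}
```

With this convention, the most recently concatenated matrix is applied to coordinates first, which is why a translate followed by a scale still places the origin at the translated position.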
Rotate the current transform matrix.
bool pdfioContentMatrixRotate(pdfio_stream_t *st, double degrees);
st | Stream |
---|---|
degrees | Rotation angle in degrees counter-clockwise |
true on success, false on failure
Scale the current transform matrix.
bool pdfioContentMatrixScale(pdfio_stream_t *st, double sx, double sy);
st | Stream |
---|---|
sx | X scale |
sy | Y scale |
true on success, false on failure
Translate the current transform matrix.
bool pdfioContentMatrixTranslate(pdfio_stream_t *st, double tx, double ty);
st | Stream |
---|---|
tx | X offset |
ty | Y offset |
true on success, false on failure
Close the current path.
bool pdfioContentPathClose(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Add a Bezier curve with two control points.
bool pdfioContentPathCurve(pdfio_stream_t *st, double x1, double y1, double x2, double y2, double x3, double y3);
st | Stream |
---|---|
x1 | X position 1 |
y1 | Y position 1 |
x2 | X position 2 |
y2 | Y position 2 |
x3 | X position 3 |
y3 | Y position 3 |
true on success, false on failure
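To show how the control points shape the curve, the sketch below evaluates a cubic Bezier curve at a parameter t. It assumes the curve starts at the current point (p0), uses (x1,y1) and (x2,y2) as control points, and ends at (x3,y3). The point2d type and bezier_point function are hypothetical helpers and are not part of PDFio.

```c
/* Hypothetical 2D point type used only for this illustration. */
typedef struct
{
  double x, y;
} point2d;

/* Evaluate a cubic Bezier curve at t in [0,1]. p0 is the current point,
 * p1 and p2 are the control points, and p3 is the end point. */
static point2d bezier_point(point2d p0, point2d p1, point2d p2, point2d p3, double t)
{
  double u  = 1.0 - t;
  double b0 = u * u * u;          /* weight of the start point */
  double b1 = 3.0 * u * u * t;    /* weight of control point 1 */
  double b2 = 3.0 * u * t * t;    /* weight of control point 2 */
  double b3 = t * t * t;          /* weight of the end point */
  point2d p;

  p.x = b0 * p0.x + b1 * p1.x + b2 * p2.x + b3 * p3.x;
  p.y = b0 * p0.y + b1 * p1.y + b2 * p2.y + b3 * p3.y;

  return (p);
}
```

The curve always passes through the start and end points but generally not through the control points, which only pull the curve toward them.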
Add a Bezier curve with an initial control point.
bool pdfioContentPathCurve13(pdfio_stream_t *st, double x1, double y1, double x3, double y3);
st | Stream |
---|---|
x1 | X position 1 |
y1 | Y position 1 |
x3 | X position 3 |
y3 | Y position 3 |
true on success, false on failure
Add a Bezier curve with a trailing control point.
bool pdfioContentPathCurve23(pdfio_stream_t *st, double x2, double y2, double x3, double y3);
st | Stream |
---|---|
x2 | X position 2 |
y2 | Y position 2 |
x3 | X position 3 |
y3 | Y position 3 |
true on success, false on failure
Clear the current path.
bool pdfioContentPathEnd(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Add a straight line to the current path.
bool pdfioContentPathLineTo(pdfio_stream_t *st, double x, double y);
st | Stream |
---|---|
x | X position |
y | Y position |
true on success, false on failure
Start a new subpath.
bool pdfioContentPathMoveTo(pdfio_stream_t *st, double x, double y);
st | Stream |
---|---|
x | X position |
y | Y position |
true on success, false on failure
Add a rectangle to the current path.
bool pdfioContentPathRect(pdfio_stream_t *st, double x, double y, double width, double height);
st | Stream |
---|---|
x | X offset |
y | Y offset |
width | Width |
height | Height |
true on success, false on failure
Restore a previous graphics state.
bool pdfioContentRestore(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Save the current graphics state.
bool pdfioContentSave(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Set the stroke pattern.
bool pdfioContentSetDashPattern(pdfio_stream_t *st, double phase, double on, double off);
st | Stream |
---|---|
phase | Phase (offset within pattern) |
on | On length |
off | Off length |
true on success, false on failure
This function sets the stroke pattern when drawing lines. If "on" and "off" are 0, a solid line is drawn.
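For reference, the PDF content stream syntax for a dash setting is a dash array followed by a phase and the "d" operator, with an empty array meaning a solid line. The sketch below formats that syntax from the same three values. The format_dash helper is hypothetical, and the exact text PDFio writes may differ.

```c
#include <stdio.h>

/* Format a dash setting in PDF content stream syntax: "[on off] phase d".
 * Returns the value from snprintf. */
static int format_dash(char *buf, size_t bufsize, double phase, double on, double off)
{
  /* An empty dash array selects a solid line. */
  if (on <= 0.0 && off <= 0.0)
    return (snprintf(buf, bufsize, "[] %g d", phase));

  return (snprintf(buf, bufsize, "[%g %g] %g d", on, off, phase));
}
```

For example, an "on" length of 3 and an "off" length of 2 with no phase describes a line that repeats 3 units of ink followed by a 2 unit gap.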
Set device CMYK fill color.
bool pdfioContentSetFillColorDeviceCMYK(pdfio_stream_t *st, double c, double m, double y, double k);
st | Stream |
---|---|
c | Cyan value (0.0 to 1.0) |
m | Magenta value (0.0 to 1.0) |
y | Yellow value (0.0 to 1.0) |
k | Black value (0.0 to 1.0) |
true on success, false on failure
Set the device gray fill color.
bool pdfioContentSetFillColorDeviceGray(pdfio_stream_t *st, double g);
st | Stream |
---|---|
g | Gray value (0.0 to 1.0) |
true on success, false on failure
Set the device RGB fill color.
bool pdfioContentSetFillColorDeviceRGB(pdfio_stream_t *st, double r, double g, double b);
st | Stream |
---|---|
r | Red value (0.0 to 1.0) |
g | Green value (0.0 to 1.0) |
b | Blue value (0.0 to 1.0) |
true on success, false on failure
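Colors are often stored as 8-bit values while the color functions take values from 0.0 to 1.0. A minimal sketch of the conversion follows; the channel_to_unit helper is hypothetical and not part of PDFio.

```c
/* Convert an 8-bit channel value (0 to 255) to the 0.0 to 1.0 range. */
static double channel_to_unit(unsigned char value)
{
  return (value / 255.0);
}
```

An 8-bit color such as (255, 51, 0) would then be passed as the three converted values for "r", "g", and "b".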
Set the calibrated gray fill color.
bool pdfioContentSetFillColorGray(pdfio_stream_t *st, double g);
st | Stream |
---|---|
g | Gray value (0.0 to 1.0) |
true on success, false on failure
Set the calibrated RGB fill color.
bool pdfioContentSetFillColorRGB(pdfio_stream_t *st, double r, double g, double b);
st | Stream |
---|---|
r | Red value (0.0 to 1.0) |
g | Green value (0.0 to 1.0) |
b | Blue value (0.0 to 1.0) |
true on success, false on failure
Set the fill colorspace.
bool pdfioContentSetFillColorSpace(pdfio_stream_t *st, const char *name);
st | Stream |
---|---|
name | Color space name |
true on success, false on failure
Set the flatness tolerance.
bool pdfioContentSetFlatness(pdfio_stream_t *st, double flatness);
st | Stream |
---|---|
flatness | Flatness value (0.0 to 100.0) |
true on success, false on failure
Set the line ends style.
bool pdfioContentSetLineCap(pdfio_stream_t *st, pdfio_linecap_t lc);
st | Stream |
---|---|
lc | Line cap value |
true on success, false on failure
Set the line joining style.
bool pdfioContentSetLineJoin(pdfio_stream_t *st, pdfio_linejoin_t lj);
st | Stream |
---|---|
lj | Line join value |
true on success, false on failure
Set the line width.
bool pdfioContentSetLineWidth(pdfio_stream_t *st, double width);
st | Stream |
---|---|
width | Line width value |
true on success, false on failure
Set the miter limit.
bool pdfioContentSetMiterLimit(pdfio_stream_t *st, double limit);
st | Stream |
---|---|
limit | Miter limit value |
true on success, false on failure
Set the device CMYK stroke color.
bool pdfioContentSetStrokeColorDeviceCMYK(pdfio_stream_t *st, double c, double m, double y, double k);
st | Stream |
---|---|
c | Cyan value (0.0 to 1.0) |
m | Magenta value (0.0 to 1.0) |
y | Yellow value (0.0 to 1.0) |
k | Black value (0.0 to 1.0) |
true on success, false on failure
Set the device gray stroke color.
bool pdfioContentSetStrokeColorDeviceGray(pdfio_stream_t *st, double g);
st | Stream |
---|---|
g | Gray value (0.0 to 1.0) |
true on success, false on failure
Set the device RGB stroke color.
bool pdfioContentSetStrokeColorDeviceRGB(pdfio_stream_t *st, double r, double g, double b);
st | Stream |
---|---|
r | Red value (0.0 to 1.0) |
g | Green value (0.0 to 1.0) |
b | Blue value (0.0 to 1.0) |
true on success, false on failure
Set the calibrated gray stroke color.
bool pdfioContentSetStrokeColorGray(pdfio_stream_t *st, double g);
st | Stream |
---|---|
g | Gray value (0.0 to 1.0) |
true on success, false on failure
Set the calibrated RGB stroke color.
bool pdfioContentSetStrokeColorRGB(pdfio_stream_t *st, double r, double g, double b);
st | Stream |
---|---|
r | Red value (0.0 to 1.0) |
g | Green value (0.0 to 1.0) |
b | Blue value (0.0 to 1.0) |
true on success, false on failure
Set the stroke color space.
bool pdfioContentSetStrokeColorSpace(pdfio_stream_t *st, const char *name);
st | Stream |
---|---|
name | Color space name |
true on success, false on failure
Set the spacing between characters.
bool pdfioContentSetTextCharacterSpacing(pdfio_stream_t *st, double spacing);
st | Stream |
---|---|
spacing | Character spacing |
true on success, false on failure
Set the text font and size.
bool pdfioContentSetTextFont(pdfio_stream_t *st, const char *name, double size);
st | Stream |
---|---|
name | Font name |
size | Font size |
true on success, false on failure
Set text leading (line height) value.
bool pdfioContentSetTextLeading(pdfio_stream_t *st, double leading);
st | Stream |
---|---|
leading | Leading (line height) value |
true on success, false on failure
Set the text transform matrix.
bool pdfioContentSetTextMatrix(pdfio_stream_t *st, pdfio_matrix_t m);
st | Stream |
---|---|
m | Transform matrix |
true on success, false on failure
Set the text rendering mode.
bool pdfioContentSetTextRenderingMode(pdfio_stream_t *st, pdfio_textrendering_t mode);
st | Stream |
---|---|
mode | Text rendering mode |
true on success, false on failure
Set the text baseline offset.
bool pdfioContentSetTextRise(pdfio_stream_t *st, double rise);
st | Stream |
---|---|
rise | Y offset |
true on success, false on failure
Set the inter-word spacing.
bool pdfioContentSetTextWordSpacing(pdfio_stream_t *st, double spacing);
st | Stream |
---|---|
spacing | Spacing between words |
true on success, false on failure
Set the horizontal scaling value.
bool pdfioContentSetTextXScaling(pdfio_stream_t *st, double percent);
st | Stream |
---|---|
percent | Horizontal scaling in percent |
true on success, false on failure
Stroke the current path.
bool pdfioContentStroke(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Begin a text block.
bool pdfioContentTextBegin(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
End a text block.
bool pdfioContentTextEnd(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Measure a text string and return its width.
double pdfioContentTextMeasure(pdfio_obj_t *font, const char *s, double size);
font | Font object created by pdfioFileCreateFontObjFromFile |
---|---|
s | UTF-8 string |
size | Font size/height |
Width
This function measures the given text string "s" and returns its width based on "size". The text string must always use the UTF-8 (Unicode) encoding but any control characters (such as newlines) are ignored.
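A measured width is typically used to position text before showing it. The sketch below computes the starting X position for centered and right-aligned text from a width such as the one this function returns. Both helper names are hypothetical and not part of PDFio.

```c
/* Return the X position that centers text of the given width between
 * the left and right edges. */
static double centered_x(double left, double right, double text_width)
{
  return (left + ((right - left) - text_width) / 2.0);
}

/* Return the X position that ends text of the given width at the right edge. */
static double right_aligned_x(double right, double text_width)
{
  return (right - text_width);
}
```

The resulting value could then be used as the "tx" argument when moving the text position, assuming the text matrix is otherwise unchanged.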
Move to the next line and offset.
bool pdfioContentTextMoveLine(pdfio_stream_t *st, double tx, double ty);
st | Stream |
---|---|
tx | X offset |
ty | Y offset |
true on success, false on failure
Offset within the current line.
bool pdfioContentTextMoveTo(pdfio_stream_t *st, double tx, double ty);
st | Stream |
---|---|
tx | X offset |
ty | Y offset |
true on success, false on failure
Move to the next line.
bool pdfioContentTextNewLine(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Move to the next line and show text.
bool pdfioContentTextNewLineShow(pdfio_stream_t *st, double ws, double cs, bool unicode, const char *s);
st | Stream |
---|---|
ws | Word spacing or 0.0 for none |
cs | Character spacing or 0.0 for none |
unicode | Unicode text? |
s | String to show |
true on success, false on failure
This function moves to the next line and then shows some text with optional word and character spacing in a PDF content stream. The "unicode" argument specifies that the current font maps to full Unicode. The "s" argument specifies a UTF-8 encoded string.
Show formatted text.
bool pdfioContentTextNewLineShowf(pdfio_stream_t *st, double ws, double cs, bool unicode, const char *format, ...);
st | Stream |
---|---|
ws | Word spacing or 0.0 for none |
cs | Character spacing or 0.0 for none |
unicode | Unicode text? |
format | printf-style format string |
... | Additional arguments as needed |
true on success, false on failure
This function moves to the next line and shows some formatted text with optional word and character spacing in a PDF content stream. The "unicode" argument specifies that the current font maps to full Unicode. The "format" argument specifies a UTF-8 encoded printf-style format string.
Show text.
bool pdfioContentTextShow(pdfio_stream_t *st, bool unicode, const char *s);
st | Stream |
---|---|
unicode | Unicode text? |
s | String to show |
true on success, false on failure
This function shows some text in a PDF content stream. The "unicode" argument specifies that the current font maps to full Unicode. The "s" argument specifies a UTF-8 encoded string.
Show justified text.
bool pdfioContentTextShowJustified(pdfio_stream_t *st, bool unicode, size_t num_fragments, const double *offsets, const char *const *fragments);
st | Stream |
---|---|
unicode | Unicode text? |
num_fragments | Number of text fragments |
offsets | Text offsets before fragments |
fragments | Text fragments |
true on success, false on failure
This function shows some text in a PDF content stream. The "unicode" argument specifies that the current font maps to full Unicode. The "fragments" argument specifies an array of UTF-8 encoded strings.
Show formatted text.
bool pdfioContentTextShowf(pdfio_stream_t *st, bool unicode, const char *format, ...);
st | Stream |
---|---|
unicode | Unicode text? |
format | printf-style format string |
... | Additional arguments as needed |
true on success, false on failure
This function shows some formatted text in a PDF content stream. The "unicode" argument specifies that the current font maps to full Unicode. The "format" argument specifies a UTF-8 encoded printf-style format string.
Remove a key/value pair from a dictionary.
bool pdfioDictClear(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
true if cleared, false otherwise
Copy a dictionary to a PDF file.
pdfio_dict_t *pdfioDictCopy(pdfio_file_t *pdf, pdfio_dict_t *dict);
pdf | PDF file |
---|---|
dict | Original dictionary |
New dictionary
Create a dictionary to hold key/value pairs.
pdfio_dict_t *pdfioDictCreate(pdfio_file_t *pdf);
pdf | PDF file |
---|
New dictionary
Get a key array value from a dictionary.
pdfio_array_t *pdfioDictGetArray(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a key binary string value from a dictionary.
unsigned char *pdfioDictGetBinary(pdfio_dict_t *dict, const char *key, size_t *length);
dict | Dictionary |
---|---|
key | Key |
length | Length of value |
Value
Get a key boolean value from a dictionary.
bool pdfioDictGetBoolean(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a date value from a dictionary.
time_t pdfioDictGetDate(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a key dictionary value from a dictionary.
pdfio_dict_t *pdfioDictGetDict(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get the key for the specified pair.
const char *pdfioDictGetKey(pdfio_dict_t *dict, size_t n);
dict | Dictionary |
---|---|
n | Pair index (0-based) |
Key for specified pair
Get a key name value from a dictionary.
const char *pdfioDictGetName(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get the number of key/value pairs in a dictionary.
size_t pdfioDictGetNumPairs(pdfio_dict_t *dict);
dict | Dictionary |
---|
Number of pairs
Get a key number value from a dictionary.
double pdfioDictGetNumber(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a key indirect object value from a dictionary.
pdfio_obj_t *pdfioDictGetObj(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a key rectangle value from a dictionary.
pdfio_rect_t *pdfioDictGetRect(pdfio_dict_t *dict, const char *key, pdfio_rect_t *rect);
dict | Dictionary |
---|---|
key | Key |
rect | Rectangle |
Rectangle
Get a key string value from a dictionary.
const char *pdfioDictGetString(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value
Get a key value type from a dictionary.
pdfio_valtype_t pdfioDictGetType(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
Value type
Iterate the keys in a dictionary.
void pdfioDictIterateKeys(pdfio_dict_t *dict, pdfio_dict_cb_t cb, void *cb_data);
dict | Dictionary |
---|---|
cb | Callback function |
cb_data | Callback data |
This function iterates the keys in a dictionary, calling the supplied function "cb":
bool my_dict_cb(pdfio_dict_t *dict, const char *key, void *cb_data)
{
  ... "key" contains the dictionary key ...
  ... return true to continue or false to stop ...
}
The iteration continues until the callback returns false or all keys have been iterated.
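As a minimal sketch of such a callback, the following stops the iteration as soon as a particular key is seen. The find_key_t structure and find_key_cb function are hypothetical, and the pdfio_dict_t typedef here is only a stand-in for the opaque type from the PDFio header so that the block compiles by itself.

```c
#include <stdbool.h>
#include <string.h>

/* Stand-in for the opaque dictionary type; remove this line and include
 * the PDFio header instead when building against the library. */
typedef struct pdfio_dict_stub pdfio_dict_t;

/* Hypothetical callback data: the key to look for and what was seen. */
typedef struct
{
  const char *wanted;   /* Key to search for */
  size_t      visited;  /* Number of keys visited so far */
  bool        found;    /* Was the key found? */
} find_key_t;

/* Callback that counts keys and stops once the wanted key is seen. */
static bool find_key_cb(pdfio_dict_t *dict, const char *key, void *cb_data)
{
  find_key_t *data = (find_key_t *)cb_data;

  (void)dict;           /* Not needed for a simple key comparison */

  data->visited ++;

  if (!strcmp(key, data->wanted))
  {
    data->found = true;
    return (false);     /* Stop iterating */
  }

  return (true);        /* Continue with the next key */
}
```

A pointer to a find_key_t structure would be passed as the "cb_data" argument, and its "found" member checked after the iteration returns.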
Set a key array in a dictionary.
bool pdfioDictSetArray(pdfio_dict_t *dict, const char *key, pdfio_array_t *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key binary string in a dictionary.
bool pdfioDictSetBinary(pdfio_dict_t *dict, const char *key, const unsigned char *value, size_t valuelen);
dict | Dictionary |
---|---|
key | Key |
value | Value |
valuelen | Length of value |
true on success, false on failure
Set a key boolean in a dictionary.
bool pdfioDictSetBoolean(pdfio_dict_t *dict, const char *key, bool value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a date value in a dictionary.
bool pdfioDictSetDate(pdfio_dict_t *dict, const char *key, time_t value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key dictionary in a dictionary.
bool pdfioDictSetDict(pdfio_dict_t *dict, const char *key, pdfio_dict_t *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key name in a dictionary.
bool pdfioDictSetName(pdfio_dict_t *dict, const char *key, const char *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key null in a dictionary.
bool pdfioDictSetNull(pdfio_dict_t *dict, const char *key);
dict | Dictionary |
---|---|
key | Key |
true on success, false on failure
Set a key number in a dictionary.
bool pdfioDictSetNumber(pdfio_dict_t *dict, const char *key, double value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key indirect object reference in a dictionary.
bool pdfioDictSetObj(pdfio_dict_t *dict, const char *key, pdfio_obj_t *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key rectangle in a dictionary.
bool pdfioDictSetRect(pdfio_dict_t *dict, const char *key, pdfio_rect_t *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key literal string in a dictionary.
bool pdfioDictSetString(pdfio_dict_t *dict, const char *key, const char *value);
dict | Dictionary |
---|---|
key | Key |
value | Value |
true on success, false on failure
Set a key formatted string in a dictionary.
bool pdfioDictSetStringf(pdfio_dict_t *dict, const char *key, const char *format, ...);
dict | Dictionary |
---|---|
key | Key |
format | printf-style format string |
... | Additional arguments as needed |
true on success, false on failure
Close a PDF file and free all memory used for it.
bool pdfioFileClose(pdfio_file_t *pdf);
pdf | PDF file |
---|
true on success and false on failure
Create a PDF file.
pdfio_file_t *pdfioFileCreate(const char *filename, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_cbdata);
filename | Filename |
---|---|
version | PDF version number or NULL for default (2.0) |
media_box | Default MediaBox for pages |
crop_box | Default CropBox for pages |
error_cb | Error callback or NULL for default |
error_cbdata | Error callback data, if any |
PDF file or NULL on error
This function creates a new PDF file. The "filename" argument specifies the name of the PDF file to create.
The "version" argument specifies the PDF version number for the file or NULL for the default ("2.0").
The "media_box" and "crop_box" arguments specify the default MediaBox and CropBox for pages in the PDF file - if NULL then a default "Universal" size of 8.27x11in (the intersection of US Letter and ISO A4) is used.
The "error_cb" and "error_cbdata" arguments specify an error handler callback and its data pointer - if NULL the default error handler is used that writes error messages to stderr.
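To show where the "Universal" size comes from, the sketch below intersects US Letter (8.5x11in) and ISO A4 (210x297mm) in points, at 72 points per inch. The box_t type and helper functions are hypothetical illustrations and are not part of PDFio.

```c
/* Hypothetical box type with lower-left and upper-right corners in points. */
typedef struct
{
  double x1, y1, x2, y2;
} box_t;

/* Return the smaller of two values. */
static double min_d(double a, double b)
{
  return (a < b ? a : b);
}

/* Intersect US Letter and ISO A4, both anchored at the origin. */
static box_t universal_box(void)
{
  const double letter_w = 8.5 * 72.0;            /* 612 points */
  const double letter_h = 11.0 * 72.0;           /* 792 points */
  const double a4_w     = 210.0 / 25.4 * 72.0;   /* about 595.28 points */
  const double a4_h     = 297.0 / 25.4 * 72.0;   /* about 841.89 points */
  box_t box;

  box.x1 = 0.0;
  box.y1 = 0.0;
  box.x2 = min_d(letter_w, a4_w);                /* A4 is narrower */
  box.y2 = min_d(letter_h, a4_h);                /* Letter is shorter */

  return (box);
}
```

The result is roughly 595.28 by 792 points, which corresponds to the 8.27x11in size described above.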
Create a new object in a PDF file containing an array.
pdfio_obj_t *pdfioFileCreateArrayObj(pdfio_file_t *pdf, pdfio_array_t *array);
pdf | PDF file |
---|---|
array | Object array |
New object
This function creates a new object with an array value in a PDF file. You must call pdfioObjClose to write the object to the file.
Create one of the base 14 PDF fonts.
pdfio_obj_t *pdfioFileCreateFontObjFromBase(pdfio_file_t *pdf, const char *name);
pdf | PDF file |
---|---|
name | Font name |
Font object
This function creates one of the base 14 PDF fonts. The "name" parameter specifies the font name:
Aside from "Symbol" and "Zapf-Dingbats", Base fonts use the Windows CP1252 (ISO-8859-1 with additional characters such as the Euro symbol) subset of Unicode.
Add a font object to a PDF file.
pdfio_obj_t *pdfioFileCreateFontObjFromFile(pdfio_file_t *pdf, const char *filename, bool unicode);
pdf | PDF file |
---|---|
filename | Filename |
unicode | Force Unicode |
Font object
This function embeds a TrueType/OpenType font into a PDF file. The "unicode" parameter controls whether the font is encoded for two-byte characters (potentially full Unicode, but more typically a subset) or to only support the Windows CP1252 (ISO-8859-1 with additional characters such as the Euro symbol) subset of Unicode.
Add an ICC profile object to a PDF file.
pdfio_obj_t *pdfioFileCreateICCObjFromFile(pdfio_file_t *pdf, const char *filename, size_t num_colors);
pdf | PDF file |
---|---|
filename | Filename |
num_colors | Number of color components (1, 3, or 4) |
Object
Add image object(s) to a PDF file from memory.
pdfio_obj_t *pdfioFileCreateImageObjFromData(pdfio_file_t *pdf, const unsigned char *data, size_t width, size_t height, size_t num_colors, pdfio_array_t *color_data, bool alpha, bool interpolate);
pdf | PDF file |
---|---|
data | Pointer to image data |
width | Width of image |
height | Height of image |
num_colors | Number of colors |
color_data | Colorspace data or NULL for default |
alpha | true if data contains an alpha channel |
interpolate | Interpolate image data? |
Object
This function creates image object(s) in a PDF file from a data buffer in memory. The "data" parameter points to the image data as 8-bit color values. The "width" and "height" parameters specify the image dimensions. The "num_colors" parameter specifies the number of color components (1 for grayscale, 3 for RGB, and 4 for CMYK) and the "alpha" parameter specifies whether each color tuple is followed by an alpha value. The "color_data" parameter specifies an optional color space array for the image - if NULL, the image is encoded in the corresponding device color space. The "interpolate" parameter specifies whether to interpolate when scaling the image on the page.
Note: When creating an image object with alpha, a second image object is created to hold the "soft mask" data for the primary image.
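The layout described above implies a buffer of width x height pixels with one byte per color component plus one byte per pixel when alpha is present. The sketch below computes that size and fills an RGB plus alpha buffer with a simple gradient. Both helper functions are hypothetical, and the size calculation does not guard against overflow for very large images.

```c
#include <stdbool.h>
#include <stdlib.h>

/* Bytes needed for an 8-bit interleaved image; alpha adds one byte per pixel. */
static size_t image_data_size(size_t width, size_t height, size_t num_colors, bool alpha)
{
  return (width * height * (num_colors + (alpha ? 1 : 0)));
}

/* Allocate an RGB plus alpha buffer holding a horizontal red gradient that
 * is fully opaque. The caller frees the returned buffer. */
static unsigned char *make_gradient_rgba(size_t width, size_t height)
{
  unsigned char *data = malloc(image_data_size(width, height, 3, true));
  unsigned char *ptr  = data;

  if (!data)
    return (NULL);

  for (size_t y = 0; y < height; y ++)
  {
    for (size_t x = 0; x < width; x ++)
    {
      *ptr++ = (unsigned char)(width > 1 ? 255 * x / (width - 1) : 255); /* Red */
      *ptr++ = 0;                                                        /* Green */
      *ptr++ = 0;                                                        /* Blue */
      *ptr++ = 255;                                                      /* Alpha */
    }
  }

  return (data);
}
```

A buffer built this way would be described with a "num_colors" value of 3 and an "alpha" value of true.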
Add an image object to a PDF file from a file.
pdfio_obj_t *pdfioFileCreateImageObjFromFile(pdfio_file_t *pdf, const char *filename, bool interpolate);
pdf | PDF file |
---|---|
filename | Filename |
interpolate | Interpolate image data? |
Object
This function creates an image object in a PDF file from a JPEG or PNG file. The "filename" parameter specifies the name of the JPEG or PNG file, while the "interpolate" parameter specifies whether to interpolate when scaling the image on the page.
Note: Currently PNG support is limited to grayscale, RGB, or indexed files without interlacing or alpha. Transparency (masking) based on color/index is supported.
Create a new object in a PDF file containing a name.
pdfio_obj_t *pdfioFileCreateNameObj(pdfio_file_t *pdf, const char *name);
pdf | PDF file |
---|---|
name | Name value |
New object
This function creates a new object with a name value in a PDF file. You must call pdfioObjClose to write the object to the file.
Create a new object in a PDF file containing a number.
pdfio_obj_t *pdfioFileCreateNumberObj(pdfio_file_t *pdf, double number);
pdf | PDF file |
---|---|
number | Number value |
New object
This function creates a new object with a number value in a PDF file. You must call pdfioObjClose to write the object to the file.
Create a new object in a PDF file.
pdfio_obj_t *pdfioFileCreateObj(pdfio_file_t *pdf, pdfio_dict_t *dict);
pdf | PDF file |
---|---|
dict | Object dictionary |
New object
Create a PDF file through an output callback.
pdfio_file_t *pdfioFileCreateOutput(pdfio_output_cb_t output_cb, void *output_cbdata, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_cbdata);
output_cb | Output callback function |
---|---|
output_cbdata | Output callback data |
version | PDF version number or NULL for default (2.0) |
media_box | Default MediaBox for pages |
crop_box | Default CropBox for pages |
error_cb | Error callback or NULL for default |
error_cbdata | Error callback data, if any |
PDF file or NULL on error
This function creates a new PDF file that is streamed through an output callback. The "output_cb" and "output_cbdata" arguments specify the output callback and its data pointer which is called whenever data needs to be written:
ssize_t output_cb(void *output_cbdata, const void *buffer, size_t bytes)
{
  // Write buffer to output and return the number of bytes written
}
The "version" argument specifies the PDF version number for the file or NULL for the default ("2.0").
The "media_box" and "crop_box" arguments specify the default MediaBox and CropBox for pages in the PDF file - if NULL then a default "Universal" size of 8.27x11in (the intersection of US Letter and ISO A4) is used.
The "error_cb" and "error_cbdata" arguments specify an error handler callback and its data pointer - if NULL the default error handler is used that writes error messages to stderr.
Note: Files created using this API are slightly larger than those created using the pdfioFileCreate function since stream lengths are stored as indirect object references.
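As a minimal sketch of an output callback with the signature shown above, the following appends each chunk to a growable memory buffer. The membuf_t structure and membuf_output_cb function are hypothetical, and returning -1 on an allocation failure is an assumption about how errors should be reported rather than documented behavior.

```c
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>   /* ssize_t on POSIX systems */

/* Hypothetical growable memory buffer used as the callback data. */
typedef struct
{
  unsigned char *data;   /* Bytes written so far */
  size_t         used;   /* Number of bytes in use */
  size_t         alloc;  /* Number of bytes allocated */
} membuf_t;

/* Append "bytes" bytes from "buffer" to the memory buffer. */
static ssize_t membuf_output_cb(void *output_cbdata, const void *buffer, size_t bytes)
{
  membuf_t *mb = (membuf_t *)output_cbdata;

  if (mb->used + bytes > mb->alloc)
  {
    /* Grow geometrically so repeated small writes stay cheap. */
    size_t         newalloc = mb->alloc ? mb->alloc : 4096;
    unsigned char *temp;

    while (newalloc < mb->used + bytes)
      newalloc *= 2;

    if ((temp = realloc(mb->data, newalloc)) == NULL)
      return (-1);       /* Assumed error convention */

    mb->data  = temp;
    mb->alloc = newalloc;
  }

  memcpy(mb->data + mb->used, buffer, bytes);
  mb->used += bytes;

  return ((ssize_t)bytes);
}
```

A zero-initialized membuf_t would be passed as "output_cbdata", and its "data" member freed once the PDF file has been closed.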
Create a page in a PDF file.
pdfio_stream_t *pdfioFileCreatePage(pdfio_file_t *pdf, pdfio_dict_t *dict);
pdf | PDF file |
---|---|
dict | Page dictionary |
Contents stream
Create a new object in a PDF file containing a string.
pdfio_obj_t *pdfioFileCreateStringObj(pdfio_file_t *pdf, const char *string);
pdf | PDF file |
---|---|
string | String |
New object
This function creates a new object with a string value in a PDF file. You must call pdfioObjClose to write the object to the file.
Create a temporary PDF file.
pdfio_file_t *pdfioFileCreateTemporary(char *buffer, size_t bufsize, const char *version, pdfio_rect_t *media_box, pdfio_rect_t *crop_box, pdfio_error_cb_t error_cb, void *error_cbdata);
buffer | Filename buffer |
---|---|
bufsize | Size of filename buffer |
version | PDF version number or NULL for default (2.0) |
media_box | Default MediaBox for pages |
crop_box | Default CropBox for pages |
error_cb | Error callback or NULL for default |
error_cbdata | Error callback data, if any |
This function creates a PDF file with a unique filename in the current temporary directory. The temporary filename is stored in the string "buffer" and will have a ".pdf" extension. Otherwise, this function works the same as the pdfioFileCreate function.
Find an object using its object number.
pdfio_obj_t *pdfioFileFindObj(pdfio_file_t *pdf, size_t number);
pdf | PDF file |
---|---|
number | Object number (1 to N) |
Object or NULL if not found
This differs from pdfioFileGetObj which takes an index into the list of objects while this function takes the object number.
Get the author for a PDF file.
const char *pdfioFileGetAuthor(pdfio_file_t *pdf);
pdf | PDF file |
---|
Author or NULL for none
Get the document catalog dictionary.
pdfio_dict_t *pdfioFileGetCatalog(pdfio_file_t *pdf);
pdf | PDF file |
---|
Catalog dictionary
Get the creation date for a PDF file.
time_t pdfioFileGetCreationDate(pdfio_file_t *pdf);
pdf | PDF file |
---|
Creation date or 0 for none
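Since the date is returned as a time_t with 0 meaning "none", callers usually need to check for 0 before formatting it. A minimal sketch using only standard C library calls follows; the format_pdf_date helper and its output format are hypothetical.

```c
#include <stdio.h>
#include <time.h>

/* Format a date value as UTC text, treating 0 as "no date". */
static const char *format_pdf_date(time_t t, char *buf, size_t bufsize)
{
  struct tm *utc;

  if (t == 0)
  {
    snprintf(buf, bufsize, "(none)");
    return (buf);
  }

  /* gmtime can fail for out-of-range values and strftime returns 0 when
   * the buffer is too small, so both are treated as invalid. */
  if ((utc = gmtime(&t)) == NULL || strftime(buf, bufsize, "%Y-%m-%d %H:%M:%S UTC", utc) == 0)
    snprintf(buf, bufsize, "(invalid)");

  return (buf);
}
```

Note that gmtime uses a shared static result, so a multi-threaded program would want a reentrant alternative where one is available.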
Get the creator string for a PDF file.
const char *pdfioFileGetCreator(pdfio_file_t *pdf);
pdf | PDF file |
---|
Creator string or NULL for none
Get the PDF file's ID strings.
pdfio_array_t *pdfioFileGetID(pdfio_file_t *pdf);
pdf | PDF file |
---|
Array with binary strings
Get the keywords for a PDF file.
const char *pdfioFileGetKeywords(pdfio_file_t *pdf);
pdf | PDF file |
---|
Keywords string or NULL for none
Get a PDF's filename.
const char *pdfioFileGetName(pdfio_file_t *pdf);
pdf | PDF file |
---|
Filename
Get the number of objects in a PDF file.
size_t pdfioFileGetNumObjs(pdfio_file_t *pdf);
pdf | PDF file |
---|
Number of objects
Get the number of pages in a PDF file.
size_t pdfioFileGetNumPages(pdfio_file_t *pdf);
pdf | PDF file |
---|
Number of pages
Get an object from a PDF file.
pdfio_obj_t *pdfioFileGetObj(pdfio_file_t *pdf, size_t n);
pdf | PDF file |
---|---|
n | Object index (starting at 0) |
Object
Get a page object from a PDF file.
pdfio_obj_t *pdfioFileGetPage(pdfio_file_t *pdf, size_t n);
pdf | PDF file |
---|---|
n | Page index (starting at 0) |
Object
Get the access permissions of a PDF file.
pdfio_permission_t pdfioFileGetPermissions(pdfio_file_t *pdf, pdfio_encryption_t *encryption);
pdf | PDF file |
---|---|
encryption | Type of encryption used or NULL to ignore |
Permission bits
This function returns the access permissions of a PDF file and (optionally) the type of encryption that has been used.
Get the producer string for a PDF file.
const char *pdfioFileGetProducer(pdfio_file_t *pdf);
pdf | PDF file |
---|
Producer string or NULL for none
Get the subject for a PDF file.
const char *pdfioFileGetSubject(pdfio_file_t *pdf);
pdf | PDF file |
---|
Subject or NULL for none
Get the title for a PDF file.
const char *pdfioFileGetTitle(pdfio_file_t *pdf);
pdf | PDF file |
---|
Title or NULL for none
Get the PDF version number for a PDF file.
const char *pdfioFileGetVersion(pdfio_file_t *pdf);
pdf | PDF file |
---|
Version number or NULL
Open a PDF file for reading.
pdfio_file_t *pdfioFileOpen(const char *filename, pdfio_password_cb_t password_cb, void *password_cbdata, pdfio_error_cb_t error_cb, void *error_cbdata);
filename | Filename |
---|---|
password_cb | Password callback or NULL for none |
password_cbdata | Password callback data, if any |
error_cb | Error callback or NULL for default |
error_cbdata | Error callback data, if any |
PDF file
This function opens an existing PDF file. The "filename" argument specifies the name of the PDF file to open.
The "password_cb" and "password_cbdata" arguments specify a password callback and its data pointer for PDF files that use one of the standard Adobe "security" handlers. The callback returns a password string or NULL to cancel the open. If NULL is specified for the callback function and the PDF file requires a password, the open will always fail.
The "error_cb" and "error_cbdata" arguments specify an error handler callback and its data pointer - if NULL the default error handler is used that writes error messages to stderr.
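A minimal sketch of a password callback follows. The callback signature used here (a data pointer and the filename, returning a string or NULL) is an assumption that should be checked against the pdfio_password_cb_t definition in the PDFio header, and the password_state_t structure is hypothetical. The attempt limit is there in case the callback is invoked again after a wrong password.

```c
#include <stddef.h>

/* Hypothetical callback data: a fixed password and an attempt limit. */
typedef struct
{
  const char *password;      /* Password to offer */
  int         attempts;      /* Number of times the password was offered */
  int         max_attempts;  /* Give up after this many attempts */
} password_state_t;

/* Return the stored password, or NULL to cancel the open once the
 * attempt limit has been reached. */
static const char *limited_password_cb(void *data, const char *filename)
{
  password_state_t *state = (password_state_t *)data;

  (void)filename;            /* A real callback might prompt using the name */

  if (state->attempts >= state->max_attempts)
    return (NULL);

  state->attempts ++;

  return (state->password);
}
```

A pointer to a password_state_t structure would be passed as the "password_cbdata" argument.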
Set the author for a PDF file.
void pdfioFileSetAuthor(pdfio_file_t *pdf, const char *value);
pdf | PDF file |
---|---|
value | Value |
Set the creation date for a PDF file.
void pdfioFileSetCreationDate(pdfio_file_t *pdf, time_t value);
pdf | PDF file |
---|---|
value | Value |
Set the creator string for a PDF file.
void pdfioFileSetCreator(pdfio_file_t *pdf, const char *value);
pdf | PDF file |
---|---|
value | Value |
Set the keywords string for a PDF file.
void pdfioFileSetKeywords(pdfio_file_t *pdf, const char *value);
pdf | PDF file |
---|---|
value | Value |
Set the PDF permissions, encryption mode, and passwords.
bool pdfioFileSetPermissions(pdfio_file_t *pdf, pdfio_permission_t permissions, pdfio_encryption_t encryption, const char *owner_password, const char *user_password);
pdf | PDF file |
---|---|
permissions | Use permissions |
encryption | Type of encryption to use |
owner_password | Owner password, if any |
user_password | User password, if any |
true on success, false otherwise
This function sets the PDF usage permissions, encryption mode, and passwords.
Note: This function must be called before creating or copying any objects. Due to fundamental limitations in the PDF format, PDF encryption offers little protection from disclosure. Permissions are not enforced in any meaningful way.
Set the subject for a PDF file.
void pdfioFileSetSubject(pdfio_file_t *pdf, const char *value);
pdf | PDF file |
---|---|
value | Value |
Set the title for a PDF file.
void pdfioFileSetTitle(pdfio_file_t *pdf, const char *value);
pdf | PDF file |
---|---|
value | Value |
Get the number of bytes to read for each line.
size_t pdfioImageGetBytesPerLine(pdfio_obj_t *obj);
obj | Image object |
---|
Number of bytes per line
Get the height of an image object.
double pdfioImageGetHeight(pdfio_obj_t *obj);
obj | Image object |
---|
Height in lines
Get the width of an image object.
double pdfioImageGetWidth(pdfio_obj_t *obj);
obj | Image object |
---|
Width in columns
Close an object, writing any data as needed to the PDF file.
bool pdfioObjClose(pdfio_obj_t *obj);
obj | Object |
---|
true on success, false on failure
Copy an object to another PDF file.
pdfio_obj_t *pdfioObjCopy(pdfio_file_t *pdf, pdfio_obj_t *srcobj);
pdf | PDF file |
---|---|
srcobj | Object to copy |
New object or NULL on error
Create an object (data) stream for writing.
pdfio_stream_t *pdfioObjCreateStream(pdfio_obj_t *obj, pdfio_filter_t filter);
obj | Object |
---|---|
filter | Type of compression to apply |
Stream or NULL on error
Get the array associated with an object.
pdfio_array_t *pdfioObjGetArray(pdfio_obj_t *obj);
obj | Object |
---|
Array or NULL on error
Get the dictionary associated with an object.
pdfio_dict_t *pdfioObjGetDict(pdfio_obj_t *obj);
obj | Object |
---|
Dictionary or NULL on error
Get the object's generation number.
unsigned short pdfioObjGetGeneration(pdfio_obj_t *obj);
obj | Object |
---|
Generation number (0 to 65535)
Get the length of the object's (data) stream.
size_t pdfioObjGetLength(pdfio_obj_t *obj);
obj | Object |
---|
Length in bytes or 0 for none
Get the name value associated with an object.
const char *pdfioObjGetName(pdfio_obj_t *obj);
obj | Object |
---|
Name value or NULL on error
Get the object's number.
size_t pdfioObjGetNumber(pdfio_obj_t *obj);
obj | Object |
---|
Object number (1 to 9999999999)
Get an object's subtype.
const char *pdfioObjGetSubtype(pdfio_obj_t *obj);
obj | Object |
---|
Object subtype name or NULL for none
This function returns an object's PDF subtype name, if any. Common subtype names include "Image" and "Form" for XObjects and "Type0", "Type1", and "TrueType" for fonts.
Get an object's type.
const char *pdfioObjGetType(pdfio_obj_t *obj);
obj | Object |
---|
Object type name or NULL for none
This function returns an object's PDF type name, if any. Common type names include:
"Font": A font (pdfioObjGetSubtype will tell you the font format)
"XObject": An image or form (pdfioObjGetSubtype will tell you which)
Open an object's (data) stream for reading.
pdfio_stream_t *pdfioObjOpenStream(pdfio_obj_t *obj, bool decode);
obj | Object |
---|---|
decode | Decode/decompress data? |
Stream or NULL on error
Copy a page to a PDF file.
bool pdfioPageCopy(pdfio_file_t *pdf, pdfio_obj_t *srcpage);
pdf | PDF file |
---|---|
srcpage | Source page |
true on success, false on failure
Add a color space to the page dictionary.
bool pdfioPageDictAddColorSpace(pdfio_dict_t *dict, const char *name, pdfio_array_t *data);
dict | Page dictionary |
---|---|
name | Color space name |
data | Color space array |
true on success, false on failure
This function adds a named color space to the page dictionary.
The names "DefaultCMYK", "DefaultGray", and "DefaultRGB" specify the default
device color space used for the page.
The "data" array contains a calibrated, indexed, or ICC-based color space
array that was created using the pdfioArrayCreateCalibratedColorFromMatrix,
pdfioArrayCreateCalibratedColorFromPrimaries, pdfioArrayCreateICCBasedColor,
or pdfioArrayCreateIndexedColor functions.
Add a font object to the page dictionary.
bool pdfioPageDictAddFont(pdfio_dict_t *dict, const char *name, pdfio_obj_t *obj);
dict | Page dictionary |
---|---|
name | Font name; must not contain spaces |
obj | Font object |
true on success, false on failure
Add an image object to the page dictionary.
bool pdfioPageDictAddImage(pdfio_dict_t *dict, const char *name, pdfio_obj_t *obj);
dict | Page dictionary |
---|---|
name | Image name |
obj | Image object |
true on success, false on failure
Get the number of content streams for a page object.
size_t pdfioPageGetNumStreams(pdfio_obj_t *page);
page | Page object |
---|
Number of streams
Open a content stream for a page.
pdfio_stream_t *pdfioPageOpenStream(pdfio_obj_t *page, size_t n, bool decode);
page | Page object |
---|---|
n | Stream index (0-based) |
decode | true to decode/decompress stream |
Stream
Close a (data) stream in a PDF file.
bool pdfioStreamClose(pdfio_stream_t *st);
st | Stream |
---|
true on success, false on failure
Consume bytes from the stream.
bool pdfioStreamConsume(pdfio_stream_t *st, size_t bytes);
st | Stream |
---|---|
bytes | Number of bytes to consume |
true on success, false on EOF
Read a single PDF token from a stream.
bool pdfioStreamGetToken(pdfio_stream_t *st, char *buffer, size_t bufsize);
st | Stream |
---|---|
buffer | String buffer |
bufsize | Size of string buffer |
true on success, false on EOF
This function reads a single PDF token from a stream. Operator tokens, boolean values, and numbers are returned as-is in the provided string buffer. String values start with the opening parenthesis ('(') but have all escaping resolved and the terminating parenthesis removed. Hexadecimal string values start with the opening angle bracket ('<') and have all whitespace and the terminating angle bracket removed.
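The token conventions described above can be sketched as a small classifier that an application might run on each buffer returned by pdfioStreamGetToken. The names `token_kind_t` and `classify_token` are hypothetical, and treating "<<" as a delimiter that is returned as-is is an assumption not stated in this section.

```c
typedef enum
{
  TOKEN_STRING,      // Literal string: "(" followed by the unescaped contents
  TOKEN_HEX_STRING,  // Hex string: "<" followed by the hex digits
  TOKEN_OTHER        // Operator, boolean, number, or other token, as-is
} token_kind_t;

// Classify a token and point "*value" at its payload (the text after the
// leading "(" or "<" for strings, otherwise the whole token).
static token_kind_t classify_token(const char *token, const char **value)
{
  if (token[0] == '(')
  {
    *value = token + 1;
    return (TOKEN_STRING);
  }
  else if (token[0] == '<' && token[1] != '<')
  {
    // Assumption: "<<" starts a dictionary and is not a hex string
    *value = token + 1;
    return (TOKEN_HEX_STRING);
  }

  *value = token;
  return (TOKEN_OTHER);
}
```

Because the terminating parenthesis or angle bracket has already been removed, the payload can be used directly as a C string without further trimming.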
Peek at data in a stream.
ssize_t pdfioStreamPeek(pdfio_stream_t *st, void *buffer, size_t bytes);
st | Stream |
---|---|
buffer | Buffer |
bytes | Size of buffer |
Bytes returned or -1 on error
Write a formatted string to a stream.
bool pdfioStreamPrintf(pdfio_stream_t *st, const char *format, ...);
st | Stream |
---|---|
format | printf-style format string |
... | Additional arguments as needed |
true on success, false on failure
Write a single character to a stream.
bool pdfioStreamPutChar(pdfio_stream_t *st, int ch);
st | Stream |
---|---|
ch | Character |
true on success, false on failure
Write a literal string to a stream.
bool pdfioStreamPuts(pdfio_stream_t *st, const char *s);
st | Stream |
---|---|
s | Literal string |
true on success, false on failure
Read data from a stream.
ssize_t pdfioStreamRead(pdfio_stream_t *st, void *buffer, size_t bytes);
st | Stream |
---|---|
buffer | Buffer |
bytes | Bytes to read |
Number of bytes read or -1 on error
This function reads data from a stream. When reading decoded image data
from a stream, you must read whole scanlines. The pdfioImageGetBytesPerLine
function can be used to determine the proper read length.
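One way to honor the whole-scanline requirement is to round the buffer size down to a multiple of the line length before reading. The helper below is a minimal sketch; `whole_scanline_bytes` is a hypothetical name, and "bytes_per_line" is assumed to be the value returned by pdfioImageGetBytesPerLine for the image object.

```c
#include <stddef.h>

// Largest read length that fits in "bufsize" and is a whole number of
// scanlines of "bytes_per_line" bytes each; 0 if not even one line fits.
static size_t whole_scanline_bytes(size_t bufsize, size_t bytes_per_line)
{
  if (bytes_per_line == 0)
    return (0);

  return ((bufsize / bytes_per_line) * bytes_per_line);
}
```

The result would then be passed as the "bytes" argument of pdfioStreamRead; a result of 0 indicates the buffer is too small to hold a single scanline.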
Write data to a stream.
bool pdfioStreamWrite(pdfio_stream_t *st, const void *buffer, size_t bytes);
st | Stream |
---|---|
buffer | Data to write |
bytes | Number of bytes to write |
true on success or false on failure
Create a durable literal string.
char *pdfioStringCreate(pdfio_file_t *pdf, const char *s);
pdf | PDF file |
---|---|
s | Nul-terminated string |
Durable string pointer or NULL on error
This function creates a literal string associated with the PDF file
"pdf". The "s" string points to a nul-terminated C string.
NULL is returned on error, otherwise a char * that is valid until
pdfioFileClose is called.
Create a durable formatted string.
char *pdfioStringCreatef(pdfio_file_t *pdf, const char *format, ...);
pdf | PDF file |
---|---|
format | printf-style format string |
... | Additional args as needed |
Durable string pointer or NULL on error
This function creates a formatted string associated with the PDF file
"pdf". The "format" string contains printf-style format characters.
NULL is returned on error, otherwise a char * that is valid until
pdfioFileClose is called.
Array of PDF values
typedef struct _pdfio_array_s pdfio_array_t;
Standard color spaces
typedef enum pdfio_cs_e pdfio_cs_t;
Dictionary iterator callback
typedef bool (*pdfio_dict_cb_t)(pdfio_dict_t *dict, const char *key, void *cb_data);
Key/value dictionary
typedef struct _pdfio_dict_s pdfio_dict_t;
PDF encryption modes
typedef enum pdfio_encryption_e pdfio_encryption_t;
Error callback
typedef bool (*pdfio_error_cb_t)(pdfio_file_t *pdf, const char *message, void *data);
PDF file
typedef struct _pdfio_file_s pdfio_file_t;
Compression/decompression filters for streams
typedef enum pdfio_filter_e pdfio_filter_t;
Line capping modes
typedef enum pdfio_linecap_e pdfio_linecap_t;
Line joining modes
typedef enum pdfio_linejoin_e pdfio_linejoin_t;
Transform matrix
typedef double pdfio_matrix_t[3][2];
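To make the 3x2 layout concrete, the sketch below applies a matrix to a point. It assumes the three rows hold the operands of a PDF transformation matrix [a b c d e f] as {a, b}, {c, d}, {e, f}, which this section does not confirm; `transform_point` is a hypothetical helper, and the typedef is repeated here only so the block compiles on its own.

```c
typedef double pdfio_matrix_t[3][2];  // Copied from this reference

// Apply "m" to the point (x, y), assuming the rows of "m" hold the PDF
// matrix operands {a, b}, {c, d}, {e, f} in that order.
static void transform_point(pdfio_matrix_t m, double x, double y, double *tx, double *ty)
{
  *tx = m[0][0] * x + m[1][0] * y + m[2][0];
  *ty = m[0][1] * x + m[1][1] * y + m[2][1];
}
```

Under that assumption, a matrix of {{2, 0}, {0, 2}, {10, 20}} scales by 2 and then translates by (10, 20).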
Numbered object in PDF file
typedef struct _pdfio_obj_s pdfio_obj_t;
Output callback for pdfioFileCreateOutput
typedef ssize_t (*pdfio_output_cb_t)(void *ctx, const void *data, size_t datalen);
Password callback for pdfioFileOpen
typedef const char *(*pdfio_password_cb_t)(void *data, const char *filename);
PDF permission bitfield
typedef int pdfio_permission_t;
PDF rectangle
typedef struct pdfio_rect_s pdfio_rect_t;
Object data stream in PDF file
typedef struct _pdfio_stream_s pdfio_stream_t;
Text rendering modes
typedef enum pdfio_textrendering_e pdfio_textrendering_t;
PDF value types
typedef enum pdfio_valtype_e pdfio_valtype_t;
PDF rectangle
struct pdfio_rect_s {
double x1;
double x2;
double y1;
double y2;
};
x1 | Lower-left X coordinate |
---|---|
x2 | Upper-right X coordinate |
y1 | Lower-left Y coordinate |
y2 | Upper-right Y coordinate |
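As a small illustration of the corner fields, the following sketch derives a rectangle's width and height. The structure is repeated from this reference so the block compiles on its own, and `rect_width` and `rect_height` are hypothetical helpers rather than PDFio functions.

```c
// Copied from this reference; real code would include <pdfio.h> instead.
struct pdfio_rect_s
{
  double x1;  // Lower-left X coordinate
  double x2;  // Upper-right X coordinate
  double y1;  // Lower-left Y coordinate
  double y2;  // Upper-right Y coordinate
};
typedef struct pdfio_rect_s pdfio_rect_t;

// Width of the rectangle
static double rect_width(const pdfio_rect_t *r)
{
  return (r->x2 - r->x1);
}

// Height of the rectangle
static double rect_height(const pdfio_rect_t *r)
{
  return (r->y2 - r->y1);
}
```

Initializing with designated initializers, for example `{ .x1 = 0.0, .y1 = 0.0, .x2 = 612.0, .y2 = 792.0 }` for a US Letter page in points, avoids depending on the order of the members.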
Standard color spaces
PDFIO_CS_ADOBE | AdobeRGB 1998 |
---|---|
PDFIO_CS_P3_D65 | Display P3 |
PDFIO_CS_SRGB | sRGB |
PDF encryption modes
PDFIO_ENCRYPTION_AES_128 | 128-bit AES encryption (PDF 1.6) |
---|---|
PDFIO_ENCRYPTION_NONE | No encryption |
PDFIO_ENCRYPTION_RC4_128 | 128-bit RC4 encryption (PDF 1.4) |
PDFIO_ENCRYPTION_RC4_40 | 40-bit RC4 encryption (PDF 1.3) |
Compression/decompression filters for streams
PDFIO_FILTER_ASCII85 | ASCII85Decode filter (reading only) |
---|---|
PDFIO_FILTER_ASCIIHEX | ASCIIHexDecode filter (reading only) |
PDFIO_FILTER_CCITTFAX | CCITTFaxDecode filter |
PDFIO_FILTER_CRYPT | Encryption filter |
PDFIO_FILTER_DCT | DCTDecode (JPEG) filter |
PDFIO_FILTER_FLATE | FlateDecode filter |
PDFIO_FILTER_JBIG2 | JBIG2Decode filter |
PDFIO_FILTER_JPX | JPXDecode filter (reading only) |
PDFIO_FILTER_LZW | LZWDecode filter (reading only) |
PDFIO_FILTER_NONE | No filter |
PDFIO_FILTER_RUNLENGTH | RunLengthDecode filter (reading only) |
Line capping modes
PDFIO_LINECAP_BUTT | Butt ends |
---|---|
PDFIO_LINECAP_ROUND | Round ends |
PDFIO_LINECAP_SQUARE | Square ends |
Line joining modes
PDFIO_LINEJOIN_BEVEL | Bevel joint |
---|---|
PDFIO_LINEJOIN_MITER | Miter joint |
PDFIO_LINEJOIN_ROUND | Round joint |
PDF permission bits
PDFIO_PERMISSION_ANNOTATE | PDF allows annotation |
---|---|
PDFIO_PERMISSION_ASSEMBLE | PDF allows assembly (insert, delete, or rotate pages, add document outlines and thumbnails) |
PDFIO_PERMISSION_COPY | PDF allows copying |
PDFIO_PERMISSION_FORMS | PDF allows filling in forms |
PDFIO_PERMISSION_MODIFY | PDF allows modification |
PDFIO_PERMISSION_NONE | No permissions |
PDFIO_PERMISSION_PRINT | PDF allows printing |
PDFIO_PERMISSION_PRINT_HIGH | PDF allows high quality printing |
PDFIO_PERMISSION_READING | PDF allows screen reading/accessibility (deprecated in PDF 2.0) |
PDFIO_PERMISSION_ALL (~0) | All permissions |
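Since pdfio_permission_t is a bitfield, individual permissions are combined with bitwise OR and tested with bitwise AND. The sketch below shows the idea; the numeric values are assumptions based on the permission bit positions in the PDF specification and are defined locally only so the block compiles on its own, and `permission_allows` is a hypothetical helper.

```c
#include <stdbool.h>

typedef int pdfio_permission_t;  // Copied from this reference

// Assumed values based on the permission bit positions in the PDF
// specification; real code should use the constants from <pdfio.h>.
enum
{
  PDFIO_PERMISSION_NONE = 0,
  PDFIO_PERMISSION_PRINT = 0x0004,
  PDFIO_PERMISSION_COPY = 0x0010,
  PDFIO_PERMISSION_PRINT_HIGH = 0x0800
};

// Return true if every bit in "wanted" is also set in "granted"
static bool permission_allows(pdfio_permission_t granted, pdfio_permission_t wanted)
{
  return ((granted & wanted) == wanted);
}
```

A value such as `PDFIO_PERMISSION_PRINT | PDFIO_PERMISSION_PRINT_HIGH` would be the kind of argument passed as "permissions" to pdfioFileSetPermissions, while `~0` sets every bit.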
Text rendering modes
PDFIO_TEXTRENDERING_FILL | Fill text |
---|---|
PDFIO_TEXTRENDERING_FILL_AND_STROKE | Fill then stroke text |
PDFIO_TEXTRENDERING_FILL_AND_STROKE_PATH | Fill then stroke text and add to path |
PDFIO_TEXTRENDERING_FILL_PATH | Fill text and add to path |
PDFIO_TEXTRENDERING_INVISIBLE | Don't fill or stroke (invisible) |
PDFIO_TEXTRENDERING_STROKE | Stroke text |
PDFIO_TEXTRENDERING_STROKE_PATH | Stroke text and add to path |
PDFIO_TEXTRENDERING_TEXT_PATH | Add text to path (invisible) |
PDF value types
PDFIO_VALTYPE_ARRAY | Array |
---|---|
PDFIO_VALTYPE_BINARY | Binary data |
PDFIO_VALTYPE_BOOLEAN | Boolean |
PDFIO_VALTYPE_DATE | Date/time |
PDFIO_VALTYPE_DICT | Dictionary |
PDFIO_VALTYPE_INDIRECT | Indirect object (N G obj) |
PDFIO_VALTYPE_NAME | Name |
PDFIO_VALTYPE_NONE | No value, not set |
PDFIO_VALTYPE_NULL | Null object |
PDFIO_VALTYPE_NUMBER | Number (integer or real) |
PDFIO_VALTYPE_STRING | String |