109 Commits

Author SHA1 Message Date
Michael R Sweet
ccf3a90c97 Document how warning messages work (Issue #118) 2025-08-26 15:18:36 -04:00
Michael R Sweet
57a01a7317 Fix object map to use unique file hash instead of pointer values (Issue #125) 2025-04-24 14:00:24 -04:00
Michael R Sweet
cad8f450ab Multiple fixes to allow PDFio to read more edge-case PDFs.
- Update _pdfioFileGets to allow for really long lines where it
  doesn't matter if we lose the end of the line.
- Update "startxref" detection at the end of the file.
- Refactor repair logic so that you just get a single WARNING about
  the repair (debug messages available for testing)
- Allow whitespace after the "obj" in the object header.
- Make sure to close xref stream on error.
- Update predictor code to support Colors <= 32 (some implementations
  set Colors to the number of bytes per record in the xref stream,
  which prevents the predictor from doing anything...)
- Allow CR CR in xref table.
- Clear old trailer/root/pages/etc. objects when repairing, update
  existing objects that were already found in load_xref.
- Don't set current object in pdfioObjectCreate/OpenStream if the
  stream can't be created/opened.
2025-04-24 11:09:54 -04:00
Michael R Sweet
278ddb7fa7 Clarify error callback API, and actually use the return value.
Improve repair implementation.
2025-04-23 14:43:14 -04:00
Michael R Sweet
4ca93bd34f Add support for EncryptMetadata key in encryption dictionary. 2025-04-23 10:07:44 -04:00
Michael R Sweet
078985fc20 Try to eliminate more Windows build warnings. 2025-04-18 17:58:06 -04:00
Michael R Sweet
1116e929f7 Add pdfioFileGet/SetLanguage functions (Issue #124) 2025-04-13 20:21:23 -04:00
Michael R Sweet
c75611e274 Update documentation. 2025-04-13 17:15:53 -04:00
Michael R Sweet
81aeef46d2 Add XMP metadata to output (Issue #103) 2025-04-13 16:56:30 -04:00
Michael R Sweet
ba7371b2e1 Fix location of OutputIntents (catalog, not info dict) 2025-04-13 14:31:14 -04:00
Michael R Sweet
ec64af8b20 Add pdfioFileAddOutputIntent API (Issue #104) 2025-04-13 14:16:53 -04:00
Michael R Sweet
acd68df592 Start work on OutputIntent and better color support in PDFio:
- Add CGATS001-compatible "micro" ICC profile as a standard CMYK color space
  (this is the default used by several Adobe applications)
- Add `PDFIO_CS_CGATS001` color space enum.
- Extend `pdfioArrayCreateColorFromStandard` to support CMYK.
- Extend `pdfioFileCreateImageObjFromFile` to support CMYK JPEG files.
- Update `pdfioFileCreatePage` to add default grayscale, RGB, and CMYK color
  space resources as needed.
2025-04-13 13:31:19 -04:00
Michael R Sweet
0391df5bbd Add logging of when we are repairing the xref table. 2025-04-07 09:01:41 -04:00
Michael R Sweet
0bd9edc845 Move token buffers off the stack (Issue #117) 2025-04-04 21:20:23 -04:00
Michael R Sweet
1237599dea Clean up some compiler warnings. 2025-02-22 19:48:09 -05:00
Michael R Sweet
e996898b57 Back out object stream changes, as they would require much more significant
reworking of the "write value" private API that I don't want to do right now.
2025-02-21 16:57:01 -05:00
Michael R Sweet
aa6a20c042 Lay the groundwork for object streams. 2025-02-21 15:33:27 -05:00
Michael R Sweet
f09105dd3f Add support for writing the PCLm subset of PDF (Issue #99) 2025-02-20 18:18:53 -05:00
Michael R Sweet
44827bac1a Cleanup. 2025-02-16 12:40:39 -05:00
Michael R Sweet
3fad0d6f15 Support xref streams with encrypted output. 2025-02-16 12:35:45 -05:00
Michael R Sweet
aeee24b856 Add xref stream support (Issue #10) 2025-02-15 21:54:16 -05:00
Michael R Sweet
8d72f22efe Add support for 'repairing' damaged PDF files (Issue #45) 2025-02-15 17:26:23 -05:00
Michael R Sweet
5f98c7838c Rename pdfioFileGetModDate to pdfioFileGetModificationDate.
Add pdfioFileSetModificationDate API.

Update DLL exports file.

Update docos and changelog.
2025-02-13 18:56:43 -05:00
Thierry LARONDE
d032483ed4 Merge branch 'michaelrsweet:master' into info 2025-02-12 15:54:47 +01:00
Michael R Sweet
9e2f3aba10 Fix reading of compressed object streams (Issue #92) 2025-01-23 15:27:22 -05:00
Thierry LARONDE
8b2b013b36 Extend by adding pdfioGetModDate and extend the pdfioinfo example
When exploring a PDF, it can be convenient to see the typical
information shown in a "Document Properties" dialog, plus some more
about the MediaBox(es).

So add a function to get the ModDate and extend the
pdfioinfo example, both to show what the library offers
and to make pdfioinfo useful as a debugging tool.

Signed-off-by: Thierry LARONDE <tlaronde@kergis.com>
2025-01-18 11:25:36 +01:00
Michael R Sweet
3bc041e6d3 Delay loading of the Info object and clean up the pdfioinfo example (Issue #87) 2025-01-17 16:50:30 -05:00
Michael R Sweet
d705d7eb5d Fix reading PDF files whose trailer is missing a newline (Issue #80) 2024-12-08 19:14:58 -05:00
Sergey Vlasov
4312933409 pdfioFileCreateNameObj implemented 2024-09-25 18:40:36 +03:00
Michael R Sweet
206f75403a Add debug printfs. 2024-08-26 09:19:34 -04:00
Michael R Sweet
7d22477917 Fix opening of certain encrypted PDF files (Issue #62) 2024-08-21 11:28:39 -04:00
Michael R Sweet
a81907bdb9 Refactor get_info_string to rely on pdfioDictGetString to convert binary strings to regular ones. 2024-06-24 11:49:38 -04:00
Michael R Sweet
23883268e3 Add pdfioFileGetCatalog function (Issue #67)
Refactor the pdfioFileCreateXxx functions to use a common (private) function to
handle creating/initializing the pdfio_file_t object and base file objects.

Update unit tests to display the filename for the pdfioFileClose test.
2024-06-24 08:56:16 -04:00
Michael R Sweet
b117959725 Make sure all output code paths set the locale information (Issue #61) 2024-01-27 19:23:51 -05:00
Michael R Sweet
e882622233 Fix locale support (Issue #61) 2024-01-27 18:22:16 -05:00
Michael R Sweet
2a85baaf81 Increase the maximum number of object streams in a file (Issue #58) - most files
only contain 1 or 2...

Change the implementation of add/find object to use a custom binary insertion
sort algorithm rather than doing a qsort after every addition.  This results in
a significant improvement in open speed - from 2371 seconds (about 39.5 minutes)
to 3.1 seconds for one large test file (an ESRI standard).
2023-12-13 12:26:25 -05:00
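The speedup described in this commit comes from keeping the object array sorted at all times, so each addition costs one binary search plus one `memmove` instead of a full re-sort. The following is a minimal sketch of that general technique, not PDFio's actual code: `obj_t`, `obj_list_t`, `lower_bound`, `add_obj`, and `find_obj` are hypothetical names, and the object record is reduced to just its number.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object record; only the number matters here. */
typedef struct
{
  size_t number;
} obj_t;

/* Hypothetical growable array that is always kept sorted by object number. */
typedef struct
{
  obj_t  *objs;
  size_t num_objs;
  size_t alloc_objs;
} obj_list_t;

/* Binary search: index of the first element whose number is >= "number". */
size_t lower_bound(const obj_list_t *list, size_t number)
{
  size_t left = 0, right = list->num_objs;

  while (left < right)
  {
    size_t mid = left + (right - left) / 2;

    if (list->objs[mid].number < number)
      left = mid + 1;
    else
      right = mid;
  }

  return left;
}

/* Insert in sorted position; returns 1 on success, 0 if allocation fails. */
int add_obj(obj_list_t *list, size_t number)
{
  size_t idx;

  assert(list != NULL);

  if (list->num_objs >= list->alloc_objs)
  {
    size_t new_alloc = list->alloc_objs ? list->alloc_objs * 2 : 16;
    obj_t  *temp = realloc(list->objs, new_alloc * sizeof(obj_t));

    if (!temp)
      return 0;

    list->objs       = temp;
    list->alloc_objs = new_alloc;
  }

  idx = lower_bound(list, number);

  /* Shift the tail up by one slot to make room at idx. */
  if (idx < list->num_objs)
    memmove(list->objs + idx + 1, list->objs + idx,
            (list->num_objs - idx) * sizeof(obj_t));

  list->objs[idx].number = number;
  list->num_objs++;

  return 1;
}

/* Find an object by number using the same binary search; NULL if absent. */
obj_t *find_obj(const obj_list_t *list, size_t number)
{
  size_t idx = lower_bound(list, number);

  if (idx < list->num_objs && list->objs[idx].number == number)
    return list->objs + idx;

  return NULL;
}
```

Because the array is never out of order, lookups can use the binary search immediately after any insertion, which is what removes the need for a sort pass.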
Michael R Sweet
f4aa951165 Fix _pdfioFileSeek with whence==SEEK_CUR
Fix seek offset after trailer.

Look at the last 1k of the file to find the startxref marker.
2023-12-12 12:24:49 -05:00
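Scanning the tail of the file for the `startxref` marker, as this commit mentions, can be sketched with standard C I/O. This is an illustration under stated assumptions rather than PDFio's implementation: `TAIL_SIZE`, `find_last_startxref`, and `locate_startxref` are hypothetical names, and the sketch uses `FILE *` instead of the library's own file layer.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define TAIL_SIZE 1024  /* Hypothetical name for the "last 1k" window */

/* Offset within buf of the last "startxref" keyword, or -1 if not present. */
long find_last_startxref(const char *buf, size_t len)
{
  static const char marker[] = "startxref";
  size_t            mlen = sizeof(marker) - 1;
  size_t            i;

  assert(buf != NULL);

  if (len < mlen)
    return -1;

  /* Walk backwards so the match closest to the end of the file wins. */
  for (i = len - mlen + 1; i > 0; i--)
  {
    if (!memcmp(buf + i - 1, marker, mlen))
      return (long)(i - 1);
  }

  return -1;
}

/* File offset of the last "startxref" in the final TAIL_SIZE bytes, or -1. */
long locate_startxref(FILE *fp)
{
  char   buf[TAIL_SIZE];
  long   size, start, pos;
  size_t bytes;

  if (fseek(fp, 0, SEEK_END) != 0 || (size = ftell(fp)) < 0)
    return -1;

  /* Files shorter than the window are read from the beginning. */
  start = size > TAIL_SIZE ? size - TAIL_SIZE : 0;

  if (fseek(fp, start, SEEK_SET) != 0)
    return -1;

  bytes = fread(buf, 1, sizeof(buf), fp);
  pos   = find_last_startxref(buf, bytes);

  return pos < 0 ? -1 : start + pos;
}
```

Searching backwards matters for files with trailing junk or multiple incremental updates, since the marker nearest the end is the one that applies.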
Michael R Sweet
038fd8686b Fix trailer dictionary handling (Issue #58)
Fix generation number handling for object 0 (Issue #59)
2023-12-11 19:56:00 -05:00
Michael R Sweet
c992b2ba89 Update the token reading code to protect against obvious format abuses.
Update the xref loading code to protect against looping xref tables.
2023-12-07 17:50:52 -05:00
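One common way to guard against looping xref tables is to remember every section offset already visited while following the chain of previous-section offsets, and to stop as soon as one repeats. The sketch below assumes that approach; the commit does not say how PDFio does it. `xref_link_t`, `MAX_XREF_SECTIONS`, and `walk_xref_chain` are hypothetical, and the chain is modeled as a small in-memory table instead of being parsed from a file.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_XREF_SECTIONS 100  /* Hypothetical cap on chained sections */

/* Hypothetical model of one xref section: its offset and its Prev offset. */
typedef struct
{
  long offset;  /* Where this section starts */
  long prev;    /* Offset of the previous section, or -1 for none */
} xref_link_t;

/*
 * Follow the Prev chain starting at "start".
 * Returns the number of sections visited, or -1 if the chain loops,
 * points at an unknown offset, or exceeds MAX_XREF_SECTIONS.
 */
int walk_xref_chain(const xref_link_t *links, size_t num_links, long start)
{
  long seen[MAX_XREF_SECTIONS];
  int  num_seen = 0;
  long offset = start;

  assert(links != NULL);

  while (offset >= 0)
  {
    size_t i;
    int    j;
    long   next = -2;  /* -2 means "offset not found in the table" */

    /* Reject any offset we have already processed: that is a loop. */
    for (j = 0; j < num_seen; j++)
    {
      if (seen[j] == offset)
        return -1;
    }

    if (num_seen >= MAX_XREF_SECTIONS)
      return -1;

    seen[num_seen++] = offset;

    /* Look up the section to find where the chain goes next. */
    for (i = 0; i < num_links; i++)
    {
      if (links[i].offset == offset)
      {
        next = links[i].prev;
        break;
      }
    }

    if (next == -2)
      return -1;

    offset = next;
  }

  return num_seen;
}
```

The fixed-size `seen` array doubles as a hard limit on chain length, so a corrupt file cannot make the loader do unbounded work even without an exact repeat.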
Michael R Sweet
86d842167a Bring back mis-named pdfioContentTextNextLine. 2023-12-05 13:33:07 -05:00
Michael R Sweet
16c8b830b8 Add pdfioFileCreateNumber/StringObj functions (Issue #14) 2023-12-05 08:16:41 -05:00
Michael R Sweet
c6f17cc20f Fix some warnings. 2023-12-03 19:23:36 -05:00
Michael R Sweet
ddd984215a Save work (debug printfs, etc.) 2023-11-15 08:38:47 -05:00
Michael R Sweet
b0a66eef78 Fix reading of PDF files from Crystal Reports (Issue #45) 2023-10-09 10:04:20 -04:00
Michael R Sweet
b0e4646f9d Rework CR/LF skip code to be more consistent. 2023-10-06 14:41:55 -04:00
Michael R Sweet
7f6ffcda22 Fix a couple issues with parsing PDF files produced by Microsoft Reporting
Services (Issue #46)

- Odd cross-reference stream containing 3-byte generation number field for this
  16-bit value
- Odd empty hex strings
2023-10-06 10:46:30 -04:00
Michael R Sweet
4f10021e7e Fix denial-of-service attack when reading corrupt PDF files. 2023-02-03 20:39:04 -05:00
Michael R Sweet
a3f3bbfe11 Fix pdfioFileGetAuthor, etc. APIs (Issue #33) 2022-07-12 18:36:08 -04:00
Michael R Sweet
316b0ad559 Add pdfioFileCreateTemporary function (Issue #29) 2022-05-15 22:52:53 -04:00
Michael R Sweet
a431d7806f Fix a few stack/buffer overflow bugs discovered by Bart, Steffan, and Mark from
the Radboud University NL (thanks!)

- Add depth argument to all value read functions that recurse
- Add depth argument to page tree loading code
- Validate xref stream sizes individually to avoid out-of-bounds access to local
  xref buffer.
2021-11-29 17:46:56 -05:00
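Adding a depth argument to recursive readers, as the first two bullets describe, bounds stack use on maliciously nested input. The sketch below shows the pattern on a toy grammar of nested brackets; it is not PDFio's parser, and `MAX_DEPTH` and `read_array` are hypothetical names with an arbitrarily chosen limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEPTH 32  /* Hypothetical nesting limit */

/*
 * Parse one array made only of '[' and ']' characters, starting at s[*pos].
 * "depth" is the current nesting level; callers at the top level pass 0.
 * Returns true and advances *pos past the array on success.
 */
bool read_array(const char *s, size_t *pos, size_t depth)
{
  assert(s != NULL && pos != NULL);

  /* Refuse to recurse further once the limit is reached. */
  if (depth >= MAX_DEPTH)
    return false;

  if (s[*pos] != '[')
    return false;

  (*pos)++;

  /* Each nested array is read one level deeper. */
  while (s[*pos] == '[')
  {
    if (!read_array(s, pos, depth + 1))
      return false;
  }

  if (s[*pos] != ']')
    return false;

  (*pos)++;

  return true;
}
```

Passing the depth explicitly keeps the check local to each call, so every recursive entry point enforces the same limit without shared state.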