pdfio/test-corpus.sh
Michael R Sweet cad8f450ab
Multiple fixes to allow PDFio to read more edge-case PDFs.
- Update _pdfioFileGets to allow for really long lines where it
  doesn't matter if we lose the end of the line.
- Update "startxref" detection at the end of the file.
- Refactor repair logic so that you just get a single WARNING about
  the repair (debug messages available for testing)
- Allow whitespace after the "obj" in the object header.
- Make sure to close xref stream on error.
- Update predictor code to support Colors <= 32 (some implementations
  set Colors to the number of bytes per record in the xref stream,
  which prevents the predictor from doing anything...)
- Allow CR CR in xref table.
- Clear old trailer/root/pages/etc. objects when repairing, update
  existing objects that were already found in load_xref.
- Don't set current object in pdfioObjectCreate/OpenStream if the
  stream can't be created/opened.
2025-04-24 11:09:54 -04:00

#!/bin/sh
#
# Script to test PDFio against a directory of PDF files.
#
# Copyright © 2025 by Michael R Sweet.
#
# Licensed under Apache License v2.0. See the file "LICENSE" for more
# information.
#
# Usage:
#
# ./test-corpus.sh DIRECTORY
#
if test $# = 0; then
    echo "Usage: ./test-corpus.sh DIRECTORY" >&2
    exit 1
fi

# Read one path per line so filenames containing spaces stay intact...
find "$@" -name '*.pdf' -print | while IFS= read -r file; do
    # Don't worry about test files containing MIME garbage...
    (head -n 4 "$file" | grep -q Content-Type) && continue

    # Or test files containing MacBinary garbage...
    (file "$file" | grep -q MacBinary) && continue

    # Don't worry about test files that Xpdf can't handle...
    pdfinfo "$file" >/dev/null 2>&1 </dev/null || continue

    # Run testpdfio to test loading the file...
    if ./testpdfio "$file" >"$file.log" 2>&1 </dev/null; then
        # Passed, remove the log...
        rm -f "$file.log"
    else
        # Failed, preserve log and write filename to stdout...
        echo "$file"
    fi
done
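
The script leaves a FILE.pdf.log next to every PDF that testpdfio rejects and deletes the log on success, so the leftover logs double as a failure report. A minimal sketch of summarizing them after a run follows; the helper names `list_failures` and `count_failures` are hypothetical and not part of PDFio, and the sketch assumes no filename contains a newline.

```shell
#!/bin/sh
# Hypothetical helpers (not part of PDFio) for reviewing a finished run.

# List the logs left behind for failed files, one path per line.
list_failures() {
    # Only failures keep their *.pdf.log, since passing files have it removed.
    find "$1" -name '*.pdf.log' -print | sort
}

# Print the number of failed files under the given directory.
count_failures() {
    # tr strips the padding that some wc implementations add.
    list_failures "$1" | wc -l | tr -d ' '
}
```

After running `./test-corpus.sh DIRECTORY`, sourcing these functions and calling `count_failures DIRECTORY` would print how many files failed, and `list_failures DIRECTORY` would print the logs to inspect.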