public class PreflightParser
extends org.apache.pdfbox.pdfparser.PDFParser
| Modifier and Type | Field and Description |
|---|---|
| protected PreflightContext | `ctx` |
| protected DataSource | `dataSource` |
| static Charset | `encoding` Defines a one-byte encoding that has no specific encoding in the UTF-8 charset. |
| protected PreflightDocument | `preflightDocument` |
| protected ValidationResult | `validationResult` |
| Constructor and Description |
|---|
| `PreflightParser(DataSource dataSource)` Constructor. |
| `PreflightParser(DataSource dataSource, org.apache.pdfbox.io.ScratchFile scratch)` Constructor. |
| `PreflightParser(File file)` Constructor. |
| `PreflightParser(File file, org.apache.pdfbox.io.ScratchFile scratch)` Constructor. |
| `PreflightParser(String filename)` Constructor. |
| `PreflightParser(String filename, org.apache.pdfbox.io.ScratchFile scratch)` Constructor. |
| Modifier and Type | Method and Description |
|---|---|
| protected void | `addValidationError(ValidationResult.ValidationError error)` Adds the error to the ValidationResult. |
| protected void | `addValidationErrors(List<ValidationResult.ValidationError> errors)` |
| protected void | `checkEndstreamKeyWord()` 'endstream' must be preceded by an EOL. |
| protected void | `checkPdfHeader()` Checks that the PDF header matches the rules of the PDF/A specification. |
| protected void | `checkStreamKeyWord()` 'stream' must be followed by `<CR><LF>` or only `<LF>`. |
| protected void | `createContext()` Creates a validation context. |
| protected void | `createPdfADocument(Format format, PreflightConfiguration config)` |
| protected static ValidationResult | `createUnknownErrorResult()` Creates an instance of ValidationResult with a ValidationError(UNKNOWN_ERROR). |
| org.apache.pdfbox.pdmodel.PDDocument | `getPDDocument()` |
| PreflightDocument | `getPreflightDocument()` |
| protected void | `initialParse()` |
| protected int | `lastIndexOf(char[] pattern, byte[] buf, int endOff)` |
| void | `parse()` |
| void | `parse(Format format)` Parses the given file and checks whether it conforms to the given format. |
| void | `parse(Format format, PreflightConfiguration config)` Parses the given file and checks whether it conforms to the given format. |
| protected org.apache.pdfbox.cos.COSArray | `parseCOSArray()` |
| protected org.apache.pdfbox.cos.COSName | `parseCOSName()` |
| protected org.apache.pdfbox.cos.COSStream | `parseCOSStream(org.apache.pdfbox.cos.COSDictionary dic)` Wraps `COSParser.parseCOSStream(org.apache.pdfbox.cos.COSDictionary)` to check the rules on the 'stream' and 'endstream' keywords. |
| protected org.apache.pdfbox.cos.COSString | `parseCOSString()` Checks that a hexadecimal string contains an even number of hexadecimal characters. |
| protected org.apache.pdfbox.cos.COSBase | `parseDirObject()` Calls `BaseParser.parseDirObject()` and checks the limit ranges for Float and Integer values and the number of Dictionary entries. |
| protected org.apache.pdfbox.cos.COSBase | `parseObjectDynamically(long objNr, int objGenNr, boolean requireExistingNotCompressedObj)` |
| protected boolean | `parseXrefTable(long startByteOffset)` Same as `COSParser.parseXrefTable(long)`, with additional checks: an EOL is mandatory after the 'xref' keyword, the cross-reference subsection header uses a single white space as separator, and so on. |
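
The EOL rules stated above for checkStreamKeyWord() and checkEndstreamKeyWord() can be illustrated on a raw byte buffer. The following is a minimal sketch in plain Java, not the PDFBox implementation: the class `StreamKeywordCheck`, its method names, and the offset parameters are hypothetical, and it assumes that CR, LF, and CR LF all count as an EOL before 'endstream'.

```java
/** Illustrative only: plain-Java checks on raw PDF bytes, not PDFBox code. */
final class StreamKeywordCheck {

    private static final byte CR = '\r';
    private static final byte LF = '\n';

    private StreamKeywordCheck() { }

    /**
     * Returns true if the bytes starting at {@code pos} (the first byte after
     * the 'stream' keyword) are a CR LF pair or a lone LF.
     */
    static boolean streamFollowedByEol(byte[] pdf, int pos) {
        if (pos < 0 || pos >= pdf.length) {
            return false;
        }
        if (pdf[pos] == LF) {
            return true;
        }
        // A lone CR is rejected: it must be followed by LF.
        return pdf[pos] == CR && pos + 1 < pdf.length && pdf[pos + 1] == LF;
    }

    /**
     * Returns true if the byte just before {@code pos} (the offset of the
     * 'endstream' keyword) is CR or LF, which covers CR, LF, and CR LF.
     */
    static boolean endstreamPrecededByEol(byte[] pdf, int pos) {
        if (pos <= 0 || pos > pdf.length) {
            return false;
        }
        byte previous = pdf[pos - 1];
        return previous == CR || previous == LF;
    }
}
```

For the bytes of `stream\r\ndata\nendstream`, both checks pass with offsets 6 and 13, while a lone CR directly after 'stream' fails the first check.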
Methods inherited from class org.apache.pdfbox.pdfparser.COSParser:
checkPages, getAccessPermission, getDocument, getEncryption, getStartxrefOffset, isCatalog, isLenient, parseDictObjects, parseFDFHeader, parseObjectDynamically, parsePDFHeader, parseTrailerValuesDynamically, parseXref, rebuildTrailer, retrieveTrailer, setEOFLookupRange, setLenient
Methods inherited from class org.apache.pdfbox.pdfparser.BaseParser:
isClosing, isClosing, isDigit, isDigit, isEndOfName, isEOL, isEOL, isSpace, isSpace, isWhitespace, isWhitespace, parseBoolean, parseCOSDictionary, readExpectedChar, readExpectedString, readExpectedString, readGenerationNumber, readInt, readLine, readLong, readObjectNumber, readString, readString, readStringNumber, skipSpaces, skipWhiteSpaces
public static final Charset encoding
protected DataSource dataSource
protected ValidationResult validationResult
protected PreflightDocument preflightDocument
protected PreflightContext ctx
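Why a one-byte charset matters for a field such as `encoding` can be shown with a byte round trip: a single-byte charset maps every byte value to one character, so decoding and re-encoding binary PDF content leaves it unchanged, whereas UTF-8 does not guarantee this. The sketch below is illustrative only. It assumes ISO-8859-1 as the single-byte charset (the actual value of `encoding` is not stated on this page), and the class `SingleByteRoundTrip` is hypothetical.

```java
import java.nio.charset.Charset;
import java.util.Arrays;

/** Illustrative only: compares charsets by a decode/encode round trip. */
final class SingleByteRoundTrip {

    private SingleByteRoundTrip() { }

    /** Decodes the bytes with the given charset, then encodes the text again. */
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    /** Returns true when the round trip yields exactly the input bytes. */
    static boolean survivesRoundTrip(byte[] raw, Charset charset) {
        return Arrays.equals(raw, roundTrip(raw, charset));
    }
}
```

Bytes above 127 that do not form valid UTF-8 sequences, which are common in binary PDF content, survive the round trip with ISO-8859-1 but are replaced during UTF-8 decoding.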
public PreflightParser(File file) throws IOException
Parameters:
file
Throws:
IOException - if there is a reading error.

public PreflightParser(File file, org.apache.pdfbox.io.ScratchFile scratch) throws IOException
Parameters:
file
scratch
Throws:
IOException - if there is a reading error.

public PreflightParser(String filename) throws IOException
Parameters:
filename
Throws:
IOException - if there is a reading error.

public PreflightParser(String filename, org.apache.pdfbox.io.ScratchFile scratch) throws IOException
Parameters:
filename
scratch
Throws:
IOException - if there is a reading error.

public PreflightParser(DataSource dataSource) throws IOException
Parameters:
dataSource - the datasource
Throws:
IOException - if there is a reading error.

public PreflightParser(DataSource dataSource, org.apache.pdfbox.io.ScratchFile scratch) throws IOException
Parameters:
dataSource - the datasource
scratch
Throws:
IOException - if there is a reading error.

protected static ValidationResult createUnknownErrorResult()

protected void addValidationError(ValidationResult.ValidationError error)
Parameters:
error

protected void addValidationErrors(List<ValidationResult.ValidationError> errors)
public void parse() throws IOException
Overrides:
parse in class org.apache.pdfbox.pdfparser.PDFParser
Throws:
IOException

public void parse(Format format) throws IOException
Parameters:
format - format that the document should follow (default Format.PDF_A1B)
Throws:
IOException

public void parse(Format format, PreflightConfiguration config) throws IOException
Parameters:
format - format that the document should follow (default Format.PDF_A1B)
config - Configuration bean that will be used by the PreflightDocument. If null, the format is used to determine the default configuration.
Throws:
IOException

protected void createPdfADocument(Format format, PreflightConfiguration config) throws IOException
Throws:
IOException

protected void createContext()
public org.apache.pdfbox.pdmodel.PDDocument getPDDocument() throws IOException
Overrides:
getPDDocument in class org.apache.pdfbox.pdfparser.PDFParser
Throws:
IOException

public PreflightDocument getPreflightDocument() throws IOException
Throws:
IOException

protected void initialParse() throws IOException
Overrides:
initialParse in class org.apache.pdfbox.pdfparser.PDFParser
Throws:
IOException

protected void checkPdfHeader()
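As an illustration of the kind of rule checkPdfHeader() enforces, the sketch below tests the file header requirements of PDF/A-1 (ISO 19005-1): the '%' of the header sits at byte offset 0, and the header line is immediately followed by a comment made of '%' and at least four bytes with values above 127. It is not the PDFBox implementation. The class name `PdfAHeaderCheck`, the method name, and the accepted version pattern (`%PDF-1.` plus one digit) are assumptions made for this example.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative only: a plain-Java header check, not PDFBox code. */
final class PdfAHeaderCheck {

    // Assumed header prefix; one version digit is expected right after it.
    private static final byte[] PREFIX = "%PDF-1.".getBytes(StandardCharsets.US_ASCII);

    private PdfAHeaderCheck() { }

    static boolean hasPdfAHeader(byte[] pdf) {
        // The '%' of the header must sit at byte offset 0.
        if (pdf.length < PREFIX.length + 1) {
            return false;
        }
        for (int i = 0; i < PREFIX.length; i++) {
            if (pdf[i] != PREFIX[i]) {
                return false;
            }
        }
        byte version = pdf[PREFIX.length];
        if (version < '0' || version > '9') {
            return false;
        }

        // The header line must end with CR, LF, or CR LF.
        int pos = PREFIX.length + 1;
        int eolStart = pos;
        if (pos < pdf.length && pdf[pos] == '\r') {
            pos++;
        }
        if (pos < pdf.length && pdf[pos] == '\n') {
            pos++;
        }
        if (pos == eolStart) {
            return false;
        }

        // Next comes '%' followed by at least four bytes above 127.
        if (pos + 5 > pdf.length || pdf[pos] != '%') {
            return false;
        }
        for (int i = 1; i <= 4; i++) {
            if ((pdf[pos + i] & 0xFF) <= 127) {
                return false;
            }
        }
        return true;
    }
}
```

The mask `& 0xFF` is needed because Java bytes are signed, so a raw comparison against 127 would never succeed for binary marker bytes.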
protected boolean parseXrefTable(long startByteOffset) throws IOException
Overrides:
parseXrefTable in class org.apache.pdfbox.pdfparser.COSParser
Parameters:
startByteOffset - the offset to start at
Throws:
IOException - If an IO error occurs.

protected org.apache.pdfbox.cos.COSStream parseCOSStream(org.apache.pdfbox.cos.COSDictionary dic) throws IOException
Wraps COSParser.parseCOSStream(org.apache.pdfbox.cos.COSDictionary) to check the rules on the 'stream' and 'endstream' keywords. See checkStreamKeyWord() and checkEndstreamKeyWord().
Overrides:
parseCOSStream in class org.apache.pdfbox.pdfparser.COSParser
Parameters:
dic - dictionary that goes with this stream.
Throws:
IOException - if an error occurred reading the stream, such as a problem reading the length attribute, the stream not ending with 'endstream' after the data is read, or the stream being too short.

protected void checkStreamKeyWord() throws IOException
Throws:
IOException

protected void checkEndstreamKeyWord() throws IOException
Throws:
IOException

protected org.apache.pdfbox.cos.COSArray parseCOSArray() throws IOException
Overrides:
parseCOSArray in class org.apache.pdfbox.pdfparser.BaseParser
Throws:
IOException

protected org.apache.pdfbox.cos.COSName parseCOSName() throws IOException
Overrides:
parseCOSName in class org.apache.pdfbox.pdfparser.BaseParser
Throws:
IOException

protected org.apache.pdfbox.cos.COSString parseCOSString() throws IOException
See BaseParser.parseCOSString().
Overrides:
parseCOSString in class org.apache.pdfbox.pdfparser.BaseParser
Throws:
IOException - If there is an error reading from the stream.

protected org.apache.pdfbox.cos.COSBase parseDirObject() throws IOException
Calls BaseParser.parseDirObject() and checks the limit ranges for Float and Integer values and the number of Dictionary entries.
Overrides:
parseDirObject in class org.apache.pdfbox.pdfparser.BaseParser
Throws:
IOException - if there is an error during parsing.

protected org.apache.pdfbox.cos.COSBase parseObjectDynamically(long objNr, int objGenNr, boolean requireExistingNotCompressedObj) throws IOException
Overrides:
parseObjectDynamically in class org.apache.pdfbox.pdfparser.COSParser
Throws:
IOException

protected int lastIndexOf(char[] pattern, byte[] buf, int endOff)
Overrides:
lastIndexOf in class org.apache.pdfbox.pdfparser.COSParser

Copyright © 2002–2025 The Apache Software Foundation. All rights reserved.