Uses of Class com.extractpdf4j.parsers.BaseParser (ExtractPDF4J 2.1.0 API)

Packages that use BaseParser

Package

Description

com.extractpdf4j.annotations

Defines annotations used throughout ExtractPDF4J for configuration, metadata declaration, and extension points.

com.extractpdf4j.parsers

Implements the primary PDF parsing strategies and extraction components used to convert document content into structured tabular output.

Uses of BaseParser in com.extractpdf4j.annotations

Methods in com.extractpdf4j.annotations that return BaseParser

Modifier and Type

Method

Description

static BaseParser

ExtractPdfAnnotations.parserFrom(Class<?> type)

Builds a parser instance, without a filepath, from the ExtractPdfConfig annotation on a class.

static BaseParser

ExtractPdfAnnotations.parserFrom(Class<?> type, String filepath)

Builds a parser instance from the ExtractPdfConfig annotation on a class.
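
The two parserFrom overloads above describe an annotation-driven factory. The following is a minimal, self-contained sketch of that pattern, not the library's implementation. SketchConfig, SketchParser, InvoiceReport, and the annotation elements pages and stripText are hypothetical stand-ins; the real elements of ExtractPdfConfig are not listed on this page, and the "1-3" page string is an assumed format.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

class AnnotationFactorySketch {

    // Hypothetical stand-in for ExtractPdfConfig; the real annotation's elements may differ.
    @Retention(RetentionPolicy.RUNTIME) // must be visible to reflection at runtime
    @Target(ElementType.TYPE)
    @interface SketchConfig {
        String pages() default "all";
        boolean stripText() default true;
    }

    // Hypothetical stand-in for BaseParser.
    static class SketchParser {
        final String filepath; // null when built without a filepath
        String pages = "all";
        boolean stripText = true;

        SketchParser(String filepath) { this.filepath = filepath; }
        SketchParser pages(String pages) { this.pages = pages; return this; }
        SketchParser stripText(boolean strip) { this.stripText = strip; return this; }
    }

    // Overload without a filepath: delegates with null.
    static SketchParser parserFrom(Class<?> type) {
        return parserFrom(type, null);
    }

    // Reads the annotation from the class and copies its values onto a new parser.
    static SketchParser parserFrom(Class<?> type, String filepath) {
        SketchConfig cfg = type.getAnnotation(SketchConfig.class);
        if (cfg == null) {
            throw new IllegalArgumentException(type.getName() + " has no @SketchConfig annotation");
        }
        return new SketchParser(filepath).pages(cfg.pages()).stripText(cfg.stripText());
    }

    // Example of a class that carries its parser configuration as an annotation.
    @SketchConfig(pages = "1-3", stripText = false)
    static class InvoiceReport { }

    public static void main(String[] args) {
        SketchParser p = parserFrom(InvoiceReport.class, "invoice.pdf");
        System.out.println(p.filepath + " " + p.pages + " " + p.stripText);
    }
}
```

In this sketch, a class without the annotation is rejected with an IllegalArgumentException; how ExtractPdfAnnotations handles that case is not stated here.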

Uses of BaseParser in com.extractpdf4j.parsers

Subclasses of BaseParser in com.extractpdf4j.parsers

Modifier and Type

Class

Description

class

HybridParser

HybridParser

class

LatticeParser

LatticeParser

class

OcrStreamParser

Header-aware OcrStreamParser. Removes both horizontal and vertical rules before OCR.

class

StreamParser

StreamParser

Methods in com.extractpdf4j.parsers that return BaseParser

Modifier and Type

Method

Description

BaseParser

BaseParser.pages(String pages)

Sets the pages to parse.

BaseParser

HybridParser.pages(String pages)

Sets the page selection for this parser and propagates the same selection to all underlying strategies.

BaseParser

BaseParser.stripText(boolean strip)

Enables or disables text normalization for stream-style extraction.
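
Because pages and stripText return BaseParser, calls can be chained, and the HybridParser override forwards the page selection to its underlying strategies. The sketch below illustrates both behaviors with hypothetical stand-in classes (SketchBaseParser, SketchHybridParser); the default values, the selectedPages accessor, and the "2-4" page string are assumptions, not taken from the library.

```java
import java.util.List;

class FluentPagesSketch {

    // Hypothetical stand-in for BaseParser: setters return the parser so calls chain.
    static class SketchBaseParser {
        protected String pages = "all";     // assumed default
        protected boolean stripText = true; // assumed default

        SketchBaseParser pages(String pages) { this.pages = pages; return this; }
        SketchBaseParser stripText(boolean strip) { this.stripText = strip; return this; }
        String selectedPages() { return pages; }
    }

    // Hypothetical stand-in for HybridParser: wraps several strategies.
    static class SketchHybridParser extends SketchBaseParser {
        private final List<SketchBaseParser> strategies;

        SketchHybridParser(SketchBaseParser... strategies) {
            this.strategies = List.of(strategies);
        }

        // Sets the selection on this parser, then propagates it to every strategy.
        @Override
        SketchBaseParser pages(String pages) {
            super.pages(pages);
            for (SketchBaseParser strategy : strategies) {
                strategy.pages(pages);
            }
            return this;
        }
    }

    public static void main(String[] args) {
        SketchBaseParser stream = new SketchBaseParser();
        SketchBaseParser lattice = new SketchBaseParser();
        SketchBaseParser hybrid = new SketchHybridParser(stream, lattice)
                .pages("2-4")
                .stripText(false);
        System.out.println(stream.selectedPages() + " " + lattice.selectedPages()
                + " " + hybrid.selectedPages());
    }
}
```

One consequence of the declared return type: after calling pages or stripText, the chain is typed as the base class, so any subclass-specific setter has to be called earlier in the chain or reached through a cast.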