C# code tutorial

A PDF to text converter for extracting text data from PDF files without having to install any other software.

The Portable Document Format (PDF) is designed for final-form files: documents that will be viewed and printed, but not substantially modified. Even so, you may need to extract the text from a PDF file for use elsewhere.

The XsPDF text extractor is designed to extract text from Adobe PDF files for use in other applications. The PDF file must actually contain text and not just images of text; otherwise, you may need the PDF OCR tool, which can recognize text in PDF files and images.

In this C# guide, we will show how you can easily extract text from PDF files or convert PDF files to text files.

Extract text content from each PDF page using C#.
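// Read a local PDF file from disk
PdfTextExtractor extractor = new PdfTextExtractor("sample.pdf");

// The loop body needs braces: C# does not allow a variable
// declaration as the unbraced body of a for statement
for (int i = 0; i < extractor.PageCount; i++)
{
    // Extract the text of page i (the loop counts pages from 0)
    string text = extractor.PageToText(i);
    Console.WriteLine(text);
}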

The text extracted from a PDF page is returned as one combined plain-text string. All style and layout information is removed, and titles, paragraphs, lists, forms, and tables are not distinguished from one another.
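To convert a whole PDF file to a text file, the per-page text can be joined and written to disk. The following is a minimal sketch and not part of the XsPDF SDK: `PdfTextExport`, `JoinPages`, and `SaveAsText` are hypothetical helper names, and the page reader is passed in as a delegate so that the helper relies only on the `PageCount` and `PageToText` members used in the example above.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper (not part of the XsPDF SDK) that joins
// per-page text and saves it as a plain text file.
public static class PdfTextExport
{
    // Joins the text of every page, with a blank line between pages.
    // readPage returns the text for a zero-based page index.
    public static string JoinPages(int pageCount, Func<int, string> readPage)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < pageCount; i++)
        {
            if (i > 0)
            {
                builder.AppendLine(); // blank line between pages
            }
            builder.AppendLine(readPage(i));
        }
        return builder.ToString();
    }

    // Writes the joined text to txtPath (File.WriteAllText uses UTF-8 by default).
    public static void SaveAsText(string txtPath, int pageCount, Func<int, string> readPage)
    {
        File.WriteAllText(txtPath, JoinPages(pageCount, readPage));
    }
}

// Usage with the extractor from the example above (requires the XsPDF SDK):
// PdfTextExtractor extractor = new PdfTextExtractor("sample.pdf");
// PdfTextExport.SaveAsText("sample.txt", extractor.PageCount, i => extractor.PageToText(i));
```

Because the extracted text carries no layout information, the blank line between pages is the only structure the output file keeps; adjust the separator if your application needs explicit page markers.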

Notice - If you use the trial version of the PDF SDK, you can only extract text from the first 3 pages.
