I am a newbie here. I'm working on a console application that converts a .pdf file to a .txt file. How can I convert a PDF to a .txt or .doc file, especially .txt?
This article on XsPDF.com is what you are looking for. Check out: Convert a pdf file to text in C#
Answers
Your question is about converting PDF to text, but this is not simple to accomplish without external dependencies. As far as I can see, there is no published code snippet that will drop straight into your application and convert PDF to text. What you need is a library that does this work for you. Search on Google and you will find several candidates.
I once tried the Ghostscript converter program, but ran into the problem that it didn't create the .txt file. Oops... It did not work as I expected. In the end, I used another library for my project.
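For anyone who still wants to try the Ghostscript route from a console application, here is a minimal sketch, not a tested solution for the problem above (I don't know why the .txt file was not created in that case). It assumes Ghostscript is installed and that its console executable is reachable by name (usually gswin64c.exe on Windows or gs on Linux). The txtwrite device and the -sOutputFile switch are standard Ghostscript options; the class and method names are made up for this example.

```csharp
using System;
using System.Diagnostics;
using System.IO;

static class GhostscriptTextExport
{
    // Builds the command line that asks Ghostscript's txtwrite device
    // to write the PDF's text into txtPath.
    public static string BuildArguments(string pdfPath, string txtPath)
    {
        return "-q -dNOPAUSE -dBATCH -sDEVICE=txtwrite " +
               "-sOutputFile=\"" + txtPath + "\" \"" + pdfPath + "\"";
    }

    // Runs Ghostscript and returns its exit code.
    // ghostscriptExe is an assumption: pass the name or full path
    // of the Ghostscript console executable on your machine.
    public static int Convert(string ghostscriptExe, string pdfPath, string txtPath)
    {
        var startInfo = new ProcessStartInfo
        {
            FileName = ghostscriptExe,
            Arguments = BuildArguments(pdfPath, txtPath),
            UseShellExecute = false,
            RedirectStandardError = true,
            CreateNoWindow = true
        };

        using (var process = Process.Start(startInfo))
        {
            // Read stderr before waiting so a full pipe cannot block the child.
            string errors = process.StandardError.ReadToEnd();
            process.WaitForExit();

            // Report the case where no output file was produced.
            if (process.ExitCode != 0 || !File.Exists(txtPath))
            {
                Console.Error.WriteLine("Ghostscript did not produce " + txtPath);
                Console.Error.WriteLine(errors);
            }
            return process.ExitCode;
        }
    }
}
```

A call would look like GhostscriptTextExport.Convert("gswin64c.exe", "sample.pdf", "sample.txt"). Checking the exit code and File.Exists makes a silent failure visible instead of leaving you wondering where the .txt went.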
XsPDF SDK can do what you need: convert a .pdf file to a .txt file. Below is sample code that shows how to do PDF text conversion or text extraction. Copy it directly to test and see if it works for you:
// Read a local PDF file from disk
PdfTextExtractor extractor = new PdfTextExtractor("sample.pdf");
for (int i = 0; i < extractor.PageCount; i++)
{
    // Extract text from each page of the PDF
    string text = extractor.PageToText(i);
    Console.WriteLine(text);
}
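The sample above only prints to the console, while the question asks for a .txt file. As a rough sketch of the missing step, the helper below writes a sequence of page strings to a UTF-8 text file using only the .NET base library. It does not call the PDF library itself. TextFileWriter and SavePages are hypothetical names for this example, and the strings would come from whatever extractor you end up using.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class TextFileWriter
{
    // Writes each page's text to txtPath, separated by a blank line.
    // An existing file at txtPath is overwritten.
    public static void SavePages(IEnumerable<string> pages, string txtPath)
    {
        using (var writer = new StreamWriter(txtPath, false, Encoding.UTF8))
        {
            foreach (string page in pages)
            {
                writer.WriteLine(page);
                writer.WriteLine();
            }
        }
    }
}
```

To use it, collect the text of each page in a List<string> inside the loop instead of calling Console.WriteLine, then call TextFileWriter.SavePages(pages, "sample.txt") once the loop finishes.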