Reading PDF file to text in C#
I used PDFBox. PDFBox is a Java PDF library, but a .NET version is also available.
So the first step is to download PDFBox from http://sourceforge.net/projects/pdfbox/files/
Then add references to the following two files from the bin directory of the downloaded package:
PDFBox-0.7.2.dll
IKVM.GNU.Classpath
Then put the following code in a class file to read a PDF file:
using org.pdfbox.pdmodel;
using org.pdfbox.util;

/// <summary>
/// Extracts plain text from PDF files using PDFBox.
/// </summary>
public class ConvertFromPDF
{
    /// <summary>
    /// Returns the text content of the PDF file at the given path.
    /// </summary>
    public static string parseUsingPDFBox(string filename)
    {
        PDDocument doc = PDDocument.load(filename);
        try
        {
            PDFTextStripper stripper = new PDFTextStripper();
            return stripper.getText(doc);
        }
        finally
        {
            // Close the document so the underlying file is released.
            doc.close();
        }
    }
}
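Once the class is in place, calling it could look roughly like the sketch below. It is only an illustration: PdfToTextExample and SaveAsText are hypothetical names that are not part of PDFBox, and the extraction function is passed in as a parameter so the helper compiles even without the PDFBox references. It saves the extracted text in a .txt file next to the PDF.

```csharp
using System;
using System.IO;

// Hypothetical helper, not part of PDFBox.
public static class PdfToTextExample
{
    // Runs the given extraction function on the PDF and writes the result
    // to a .txt file with the same name in the same directory.
    // Returns the path of the text file that was written.
    public static string SaveAsText(string pdfPath, Func<string, string> extract)
    {
        if (!File.Exists(pdfPath))
        {
            throw new FileNotFoundException("PDF file not found.", pdfPath);
        }

        string text = extract(pdfPath);
        string txtPath = Path.ChangeExtension(pdfPath, ".txt");
        File.WriteAllText(txtPath, text);
        return txtPath;
    }
}
```

With the ConvertFromPDF class from above, the call would be PdfToTextExample.SaveAsText(path, ConvertFromPDF.parseUsingPDFBox), where path is the location of your PDF file.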