itextsharp - iText not returning text contents of a PDF after first page


I am trying to use the iText library with C# to capture the text portion of PDF files.

I created a PDF from Excel 2013 (via Export) and then copied a sample from the web that shows how to use iText (the library reference has been added to the project).

It reads the first page perfectly, but the output gets garbled after that: it keeps part of the first page and merges that information with the next page. The commented-out lines are from my attempts to solve this problem; the string "thePage" is now recreated inside the loop.

Here is the code. I can email the PDF to anyone who can help with this issue.

Thanks in advance.

    public static string ExtractTextFromPdf(string path)
    {
        ITextExtractionStrategy its = new iTextSharp.text.pdf.parser.LocationTextExtractionStrategy();

        using (PdfReader reader = new PdfReader(path))
        {
            StringBuilder text = new StringBuilder();
            // string[] theLines;
            // theLines = new string[COLUMNS];
            // string thePage;

            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                string thePage = "";
                thePage = PdfTextExtractor.GetTextFromPage(reader, i, its);
                string[] theLines = thePage.Split('\n');
                foreach (var theLine in theLines)
                {
                    text.AppendLine(theLine);
                }
                // text.AppendLine(" ");
                // Array.Clear(theLines, 0, theLines.Length);
                // thePage = "";
            }
            return text.ToString();
        }
    }

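A strategy object collects text data and is not aware of whether a new page has started. Because the same object is reused for every page, the text it collected from earlier pages is still there when the next page is processed.

So use a new strategy object for each page.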
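As a rough sketch of that advice, the method from the question could create the strategy inside the loop instead of once before it. This assumes iTextSharp 5.x, as in the question; the wrapper class name PdfTextHelper is made up for illustration, and the rest mirrors the original code.

```csharp
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical wrapper class; only the method body matters here.
public static class PdfTextHelper
{
    public static string ExtractTextFromPdf(string path)
    {
        using (PdfReader reader = new PdfReader(path))
        {
            StringBuilder text = new StringBuilder();

            for (int i = 1; i <= reader.NumberOfPages; i++)
            {
                // Create a fresh strategy for every page so that text
                // collected from earlier pages does not carry over.
                ITextExtractionStrategy strategy = new LocationTextExtractionStrategy();

                string thePage = PdfTextExtractor.GetTextFromPage(reader, i, strategy);
                foreach (string theLine in thePage.Split('\n'))
                {
                    text.AppendLine(theLine);
                }
            }

            return text.ToString();
        }
    }
}
```

The only functional change from the code in the question is where the strategy object is constructed; the line splitting and the StringBuilder are kept as they were.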
