PdfDocumentProcessor.GetText() Method

Retrieves the text content of the document.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v25.2.dll

NuGet Package: DevExpress.Document.Processor

Declaration

```csharp
public string GetText()
```

```vb
Public Function GetText As String
```

Returns

| Type | Description |
|---|---|
| String | The text obtained from the document. |

Remarks

The GetText method uses the page coordinate system. Refer to the following help topic for more details: Coordinate Systems.

To extract document content without clipping to the crop box, set the PdfTextExtractionOptions.ClipToCropBox property to false and pass the PdfTextExtractionOptions object to the GetText method overload that accepts it as a parameter.

The code sample below retrieves all document content:

```csharp
using (DevExpress.Pdf.PdfDocumentProcessor processor =
    new DevExpress.Pdf.PdfDocumentProcessor())
{
    // Load the PDF document to read.
    processor.LoadDocument("TextExtraction.pdf");

    // Retrieve the text of the entire document.
    string documentText = processor.GetText();
    Console.WriteLine(documentText);
}
```

```vb
Using processor As New DevExpress.Pdf.PdfDocumentProcessor()
    ' Load the PDF document to read.
    processor.LoadDocument("TextExtraction.pdf")

    ' Retrieve the text of the entire document.
    Dim documentText As String = processor.GetText()
    Console.WriteLine(documentText)
End Using
```
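
The remark about ClipToCropBox can be illustrated with a minimal sketch. It assumes the GetText overload that accepts a PdfTextExtractionOptions object, as described in the Remarks, and reuses the sample file name from the example above. The class and variable names are illustrative only.

```csharp
using System;
using DevExpress.Pdf;

class ExtractWithoutCropBoxClipping
{
    static void Main()
    {
        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            // Load the PDF document to read (sample file name, adjust as needed).
            processor.LoadDocument("TextExtraction.pdf");

            // Disable clipping so that text outside the crop box is also returned.
            PdfTextExtractionOptions options = new PdfTextExtractionOptions();
            options.ClipToCropBox = false;

            // Assumption: the overload that takes extraction options,
            // as mentioned in the Remarks section.
            string documentText = processor.GetText(options);
            Console.WriteLine(documentText);
        }
    }
}
```

Compared with the parameterless call, this variant may return additional text on pages whose crop box is smaller than the area that contains content.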

See Also

PdfDocumentProcessor Class

PdfDocumentProcessor Members

DevExpress.Pdf Namespace