
PdfDocumentProcessor.GetText(PdfDocumentPosition, PdfDocumentPosition) Method

Retrieves the text located between the specified document positions.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v25.2.dll

NuGet Package: DevExpress.Document.Processor

Declaration

```csharp
public string GetText(
    PdfDocumentPosition startPosition,
    PdfDocumentPosition endPosition
)
```

```vb
Public Function GetText(
    startPosition As PdfDocumentPosition,
    endPosition As PdfDocumentPosition
) As String
```

Parameters

| Name | Type | Description |
|---|---|---|
| startPosition | PdfDocumentPosition | A PdfDocumentPosition object that is the initial document position. |
| endPosition | PdfDocumentPosition | A PdfDocumentPosition object that is the final document position. |

Returns

| Type | Description |
|---|---|
| String | A String value that contains the retrieved text. |

Remarks

The overloaded GetText method uses the page coordinate system. See the following help topic to learn more: Coordinate Systems.

This method overload selects text the way cursor selection does in Adobe Acrobat Reader. To get text from a rectangle instead, use the GetText method overloads that accept a PdfDocumentArea parameter.

If there is no text between the specified positions, this method returns the text nearest to these positions.

```csharp
using System;
using DevExpress.Pdf;

using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    processor.LoadDocument("TextExtraction.pdf");

    // Start and end positions on page 1, in page coordinates.
    PdfDocumentPosition startPosition = new PdfDocumentPosition(1, new PdfPoint(0, 0));
    PdfDocumentPosition endPosition = new PdfDocumentPosition(1, new PdfPoint(500, 500));

    // Retrieve the text between the two positions.
    string pageText = processor.GetText(startPosition, endPosition);
    Console.WriteLine(pageText);
}
```

```vb
Imports System
Imports DevExpress.Pdf

Using processor As New PdfDocumentProcessor()
  processor.LoadDocument("TextExtraction.pdf")

  ' Start and end positions on page 1, in page coordinates.
  Dim startPosition As New PdfDocumentPosition(1, New PdfPoint(0, 0))
  Dim endPosition As New PdfDocumentPosition(1, New PdfPoint(500, 500))

  ' Retrieve the text between the two positions.
  Dim pageText As String = processor.GetText(startPosition, endPosition)
  Console.WriteLine(pageText)
End Using
```
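
The remarks contrast this overload with rectangle-based extraction. The following sketch runs both approaches on the same page so the results can be compared. It assumes that PdfDocumentArea is created from a page number and a PdfRectangle whose arguments are ordered left, bottom, right, top, and that TextExtraction.pdf is the same sample file used above. The coordinates are illustrative only.

```csharp
using System;
using DevExpress.Pdf;

using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
{
    processor.LoadDocument("TextExtraction.pdf");

    // Cursor-style selection: text between a start and an end position on page 1.
    string betweenPositions = processor.GetText(
        new PdfDocumentPosition(1, new PdfPoint(0, 0)),
        new PdfDocumentPosition(1, new PdfPoint(500, 500)));

    // Rectangle selection on page 1.
    // Assumption: PdfRectangle arguments are (left, bottom, right, top).
    PdfDocumentArea area = new PdfDocumentArea(1, new PdfRectangle(0, 0, 500, 500));
    string insideRectangle = processor.GetText(area);

    Console.WriteLine(betweenPositions);
    Console.WriteLine(insideRectangle);
}
```

The two strings may differ even for matching coordinates, because a position-based selection is defined by where it starts and ends rather than by a bounding rectangle.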

See Also

PdfDocumentProcessor Class

PdfDocumentProcessor Members

DevExpress.Pdf Namespace