PdfDocumentProcessor.Text Property

Provides access to the PDF text.

Namespace : DevExpress.Pdf

Assembly : DevExpress.Docs.v25.2.dll

NuGet Package : DevExpress.Document.Processor

Declaration

csharp
public string Text { get; }
vb
Public ReadOnly Property Text As String

Property Value

| Type | Description |
|---|---|
| String | A String value that is the target text. |

Remarks

The Text property obtains the document content clipped to the crop box. Use the PdfDocumentProcessor.GetText() method to obtain content without clipping.
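The difference between the two calls can be sketched as follows. This is a minimal sketch, not the library's own sample: the class and helper names (`TextClippingSketch`, `CompareClippedAndFullText`) are hypothetical, the file path is supplied by the caller, and the code relies only on the `Text` property and the parameterless `GetText()` method named in the remark above.

```csharp
using System;
using DevExpress.Pdf;

static class TextClippingSketch
{
    // Hypothetical helper: returns the character counts of the clipped
    // and the unclipped text so the two results can be compared.
    public static (int Clipped, int Full) CompareClippedAndFullText(string filePath)
    {
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor())
        {
            documentProcessor.LoadDocument(filePath);

            // Text: content clipped to the crop box (per the remark above).
            string clipped = documentProcessor.Text;

            // GetText(): content without clipping (per the remark above).
            string full = documentProcessor.GetText();

            // Guard against null so the sketch never throws on Length.
            return (clipped?.Length ?? 0, full?.Length ?? 0);
        }
    }
}
```

If the two counts differ for a given file, that suggests some of its text lies outside the crop box.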

Text Normalization in PDF Document API

PdfDocumentProcessor applies FormKC normalization (Unicode NFKC) when it processes bidirectional or right-to-left (RTL) text. In all other cases, no normalization is applied.
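To illustrate what FormKC normalization does to a string, here is a small sketch that uses only the standard .NET `String.Normalize` method. It assumes that "FormKC" refers to `System.Text.NormalizationForm.FormKC`; it does not show how PdfDocumentProcessor applies normalization internally, and the class and method names are hypothetical.

```csharp
using System;
using System.Text;

static class NormalizationSketch
{
    // Normalizes a string to Unicode Normalization Form KC (NFKC).
    public static string ToFormKC(string input) =>
        input.Normalize(NormalizationForm.FormKC);

    public static void Demo()
    {
        // U+FEFB (ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM) is a
        // presentation form; FormKC replaces it with the base letters
        // U+0644 (LAM) followed by U+0627 (ALEF).
        string presentationForm = "\uFEFB";
        string normalized = ToFormKC(presentationForm);
        Console.WriteLine(normalized == "\u0644\u0627"); // True

        // Plain ASCII text is already in FormKC and stays unchanged.
        Console.WriteLine(ToFormKC("plain") == "plain"); // True
    }
}
```

Because presentation forms are replaced by base letters, a search for the base letters can match normalized text even when the original string stored a ligature.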

Example

The following code snippet uses the PdfDocumentProcessor.Text property to extract the text of a PDF file at runtime.

csharp
using DevExpress.Pdf;

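string ExtractTextFromPDF(string filePath)
{
    // Stays empty if the document cannot be loaded or read.
    string documentText = "";
    try
    {
        // The using block disposes the processor once the text is read.
        using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor())
        {
            documentProcessor.LoadDocument(filePath);
            // Text returns the content clipped to the crop box.
            documentText = documentProcessor.Text;
        }
    }
    catch
    {
        // Any error is swallowed; the caller receives an empty string.
    }
    return documentText;
}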
vb
Imports DevExpress.Pdf

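Private Function ExtractTextFromPDF(ByVal filePath As String) As String
    ' Stays empty if the document cannot be loaded or read.
    Dim documentText As String = ""
    Try
        ' The Using block disposes the processor once the text is read.
        Using documentProcessor As New PdfDocumentProcessor()
            documentProcessor.LoadDocument(filePath)
            ' Text returns the content clipped to the crop box.
            documentText = documentProcessor.Text
        End Using
    Catch
        ' Any error is swallowed; the caller receives an empty string.
    End Try
    Return documentText
End Function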
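
The example above returns an empty string both when the document contains no text and when loading fails. The following is a minimal sketch of a variant that lets the caller tell the two cases apart. The names `PdfTextSketch` and `TryExtractTextFromPdf` are hypothetical, and catching the general `Exception` type is an assumption, because this page does not list the exceptions that `LoadDocument` can throw.

```csharp
using System;
using DevExpress.Pdf;

static class PdfTextSketch
{
    // Hypothetical helper: returns true on success and reports the
    // failure message through 'error' otherwise.
    public static bool TryExtractTextFromPdf(
        string filePath, out string documentText, out string error)
    {
        documentText = "";
        error = null;
        try
        {
            using (PdfDocumentProcessor documentProcessor = new PdfDocumentProcessor())
            {
                documentProcessor.LoadDocument(filePath);
                // Fall back to an empty string if no text is returned.
                documentText = documentProcessor.Text ?? "";
            }
            return true;
        }
        catch (Exception ex)
        {
            // Assumption: any exception means the text could not be read.
            error = ex.Message;
            return false;
        }
    }
}
```

A caller can then log `error` or show it to the user instead of silently treating a failed load as an empty document.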

The following code snippets (auto-collected from DevExpress Examples) contain references to the Text property.

winforms-pdf-viewer-operate-pdf-content-at-runtime/CS/WindowsFormsApplication1/MainForm.cs#L28

csharp
documentProcessor.LoadDocument(filePath);
documentText = documentProcessor.Text;

winforms-pdf-viewer-operate-pdf-content-at-runtime/VB/WindowsFormsApplication1/MainForm.vb#L30

vb
documentProcessor.LoadDocument(filePath)
documentText = documentProcessor.Text

See Also

PdfDocumentProcessor Class

PdfDocumentProcessor Members

DevExpress.Pdf Namespace