PdfPageWord Class

An individual word related to a specific PDF page.

Namespace: DevExpress.Pdf

Assembly: DevExpress.Docs.v25.2.dll

NuGet Package: DevExpress.Document.Processor

Declaration

csharp
public class PdfPageWord :
    PdfWord
vb
Public Class PdfPageWord
    Inherits PdfWord

Remarks

The PdfPageWord class uses the page coordinate system. Refer to the Coordinate Systems topic for more information.

Use the PageNumber property to obtain the number of the page that contains the word. The Characters property provides access to the word's individual characters and their settings (for example, the font).

In the PDF Document API, the PdfDocumentProcessor.NextWord and PdfDocumentProcessor.PrevWord methods return a PdfPageWord instance.

The code sample below uses the NextWord method to iterate through the words of a document and collect the names of the fonts they use.

csharp
using System;
using System.Collections.Generic;
using System.Linq;
using DevExpress.Pdf;

class Program
{
    static void Main(string[] args)
    {
        // A set stores each font name only once
        HashSet<string> fontNames = new HashSet<string>();

        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            processor.LoadDocument("Document.pdf");

            // Iterate through all words in the document
            PdfPageWord currentWord = processor.NextWord();
            while (currentWord != null)
            {
                // Record the font name of each character in the current word
                for (int i = 0; i < currentWord.Characters.Count; i++)
                {
                    fontNames.Add(currentWord.Characters[i].Font.FontName);
                }
                currentWord = processor.NextWord();
            }
        }

        Console.WriteLine("The loaded document contains the following fonts:\r\n{0}",
            string.Join("\r\n", fontNames.ToArray()));
    }
}
vb
Imports System
Imports System.Collections.Generic
Imports System.Linq
Imports DevExpress.Pdf

Class Program
    Public Shared Sub Main(ByVal args As String())
        ' A set stores each font name only once
        Dim fontNames As HashSet(Of String) = New HashSet(Of String)()

        Using processor As PdfDocumentProcessor = New PdfDocumentProcessor()
            processor.LoadDocument("Document.pdf")

            ' Iterate through all words in the document
            Dim currentWord As PdfPageWord = processor.NextWord()
            While currentWord IsNot Nothing
                ' Record the font name of each character in the current word
                For i As Integer = 0 To currentWord.Characters.Count - 1
                    fontNames.Add(currentWord.Characters(i).Font.FontName)
                Next

                currentWord = processor.NextWord()
            End While
        End Using

        Console.WriteLine("The loaded document contains the following fonts:" & vbCrLf & "{0}", String.Join(vbCrLf, fontNames.ToArray()))
    End Sub
End Class
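
The PageNumber property mentioned in the Remarks can be used in the same kind of loop. The following is a minimal sketch, not an official sample: it counts the words found on each page. It assumes that PageNumber is an integer value and that NextWord returns null once no words remain (as the loop in the sample above implies). The WordsPerPage class name and the "Document.pdf" path are placeholders chosen for illustration.

```csharp
using System;
using System.Collections.Generic;
using DevExpress.Pdf;

static class WordsPerPage
{
    static void Main()
    {
        // Maps a page number to the number of words found on that page.
        // Assumption: PdfPageWord.PageNumber is an int.
        var wordCounts = new SortedDictionary<int, int>();

        using (PdfDocumentProcessor processor = new PdfDocumentProcessor())
        {
            // Placeholder path; replace with a real document.
            processor.LoadDocument("Document.pdf");

            // Walk the document word by word until NextWord returns null.
            PdfPageWord word = processor.NextWord();
            while (word != null)
            {
                // TryGetValue leaves count at 0 for a page seen for the first time.
                wordCounts.TryGetValue(word.PageNumber, out int count);
                wordCounts[word.PageNumber] = count + 1;

                word = processor.NextWord();
            }
        }

        // SortedDictionary enumerates its entries in ascending page order.
        foreach (KeyValuePair<int, int> entry in wordCounts)
        {
            Console.WriteLine($"Page {entry.Key}: {entry.Value} word(s)");
        }
    }
}
```

A SortedDictionary is used here only so that the report comes out in page order; a plain Dictionary would work equally well if the order does not matter.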

Inheritance

Object → PdfWord → PdfPageWord

See Also

PdfPageWord Members

DevExpress.Pdf Namespace