website/docs/ocr-testing/what-is-wdio-ocr-service.md
Automated testing of mobile native apps and desktop sites can be particularly challenging when elements lack unique identifiers, and standard WebdriverIO selectors may not always help you. Enter the world of `@wdio/ocr-service`, a powerful service that leverages OCR (Optical Character Recognition) to search, wait for, and interact with on-screen elements based on their visible text.
The following custom commands are added to the `browser`/`driver` object, so you get the right toolset to do your job.
- `await browser.ocrGetText`
- `await browser.ocrGetElementPositionByText`
- `await browser.ocrWaitForTextDisplayed`
- `await browser.ocrClickOnText`
- `await browser.ocrSetValue`

This service will

- find text that approximately matches the search value, so that, for example, `Username` can also find the text `Usename` or vice versa.
- provide a CLI (`npx ocr-service`) to validate your images and retrieve text through your terminal.

An example of steps 1, 2 and 3 can be found in this image.
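To make the approximate matching above more concrete, here is a minimal sketch of how `Username` and `Usename` can be treated as the same text. It uses plain Levenshtein edit distance, which is an assumption chosen for illustration and not necessarily the algorithm the service uses internally. The names `editDistance`, `approximatelyEquals` and the `maxEdits` parameter are hypothetical.

```typescript
// Illustrative only: plain Levenshtein edit distance. The service's own
// matching logic may differ; this just shows the idea of "close enough".
function editDistance(a: string, b: string): number {
  // prev[j] = distance between the first (i - 1) chars of a and the first j chars of b
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(
        prev[j] + 1, // delete a character from a
        curr[j - 1] + 1, // insert a character into a
        prev[j - 1] + cost, // substitute (or keep) a character
      );
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical helper: accept a candidate that is within maxEdits of the search text.
function approximatelyEquals(search: string, found: string, maxEdits = 1): boolean {
  return editDistance(search.toLowerCase(), found.toLowerCase()) <= maxEdits;
}

console.log(approximatelyEquals("Username", "Usename")); // true (one missing letter)
console.log(approximatelyEquals("Username", "Password")); // false
```

A small tolerance like this is what lets a search survive minor OCR misreads, at the cost of occasionally matching a similar but unintended word.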
It works with ZERO system dependencies (besides what WebdriverIO uses), but if needed it can also work with a local installation of Tesseract, which will reduce the execution time drastically! (See also Test Execution Optimization for how to speed up your tests.)
Enthusiastic? Start using it today by following the Getting Started guide.
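As a rough idea of how the custom commands might be combined in a test, the sketch below fills in a login form purely by visible text. The `OcrBrowser` interface, the option shapes (`{ text }` and `{ text, value }`), the `loginByVisibleText` helper and the on-screen labels are assumptions made for illustration; check each command's own documentation for the options it actually accepts.

```typescript
// Assumed minimal shape of the commands used below. The real service's
// typings and options are richer than this sketch.
interface OcrBrowser {
  ocrWaitForTextDisplayed(options: { text: string }): Promise<unknown>;
  ocrSetValue(options: { text: string; value: string }): Promise<unknown>;
  ocrClickOnText(options: { text: string }): Promise<unknown>;
}

// Hypothetical flow: log in by locating fields and buttons via their visible text.
async function loginByVisibleText(
  browser: OcrBrowser,
  user: string,
  password: string,
): Promise<void> {
  // Wait until the label is recognised on screen before interacting.
  await browser.ocrWaitForTextDisplayed({ text: "Username" });
  // Type into the fields found via their visible labels.
  await browser.ocrSetValue({ text: "Username", value: user });
  await browser.ocrSetValue({ text: "Password", value: password });
  // Click the button identified by its visible text.
  await browser.ocrClickOnText({ text: "Login" });
}
```

In a real spec file the global `browser` object would be passed in once the service is enabled; taking it as a parameter here keeps the sketch self-contained.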
:::caution Important
There are a variety of reasons you might not get good quality output from Tesseract. One of the biggest reasons related to your app and this module is a lack of clear color distinction between the text that needs to be found and the background. For example, white text on a dark background can easily be found, but light text on a white background or dark text on a dark background can hardly be found.

See also this page for more information from Tesseract.

Also don't forget to read the FAQ.
:::
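To get a feel for what "proper color distinction" means, the sketch below computes the WCAG contrast ratio between a text color and a background color. This is an assumption used as a rough proxy: it is not something Tesseract or this service computes, and the `Rgb` type and function names are hypothetical. Higher ratios (up to 21 for white on black) suggest text that should be easier to separate from its background.

```typescript
type Rgb = [number, number, number]; // red, green, blue in the range 0-255

// Relative luminance as defined by WCAG 2.x for sRGB colors.
function relativeLuminance([r, g, b]: Rgb): number {
  const linear = (channel: number): number => {
    const c = channel / 255;
    return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
  };
  return 0.2126 * linear(r) + 0.7152 * linear(g) + 0.0722 * linear(b);
}

// Contrast ratio between two colors: 1 (identical) up to 21 (black vs. white).
function contrastRatio(a: Rgb, b: Rgb): number {
  const la = relativeLuminance(a);
  const lb = relativeLuminance(b);
  const [lighter, darker] = la >= lb ? [la, lb] : [lb, la];
  return (lighter + 0.05) / (darker + 0.05);
}

// White text on a black background: strong distinction.
console.log(contrastRatio([255, 255, 255], [0, 0, 0]).toFixed(1)); // "21.0"
// Light gray text on white, and dark gray text on black: weak distinction.
console.log(contrastRatio([230, 230, 230], [255, 255, 255]) < 2); // true
console.log(contrastRatio([40, 40, 40], [0, 0, 0]) < 2); // true
```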