website/docs/ocr-testing/ocr-faq.md
The `@wdio/ocr-service` is not meant to speed up your tests. You use it because you have a hard time locating elements in your web/mobile app and you want an easier way to locate them. And as we all hopefully know, when you gain something, you lose something else. But there is a way to make the `@wdio/ocr-service` execute faster than normal. More information about that can be found here.
Yes, you can combine the commands to make your script even more powerful! The advice is to use the default WebdriverIO commands/selectors as much as possible and to use this service only when you can't find a unique selector or when your selector would become too brittle.
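As a rough sketch of what such a combination could look like, the helper below uses regular selectors for the inputs and falls back to an OCR command only for the button. The function name, the selectors, and the button text are hypothetical placeholders; only `$`, `setValue`, and `ocrClickOnText` come from WebdriverIO and this service.

```javascript
// Hypothetical helper: `browser` is the WebdriverIO browser/driver object
// with @wdio/ocr-service enabled. Selectors and texts are placeholders.
async function login(browser, username, password) {
    // Stable selectors exist for the inputs, so use the default commands.
    await browser.$('#username').setValue(username);
    await browser.$('#password').setValue(password);

    // Assume the button has no unique selector, so locate it by its visible text.
    await browser.ocrClickOnText({ text: 'Login' });
}
```

In a spec file this would be called as `await login(browser, 'user', 'secret')`.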
First, it's important to understand how the OCR process in this module works, so please read this page. If you still can't find your text, you might try the following things.
When the module needs to process a large area of the screenshot, it might not find the text. You can narrow the search to a smaller area by providing a haystack when you use a command. Please check the commands documentation to see which commands support providing a haystack.
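A minimal sketch of narrowing the search area is shown below. It assumes that `haystack` accepts a WebdriverIO element as well as a rectangle; the helper name and its parameters are hypothetical.

```javascript
// Hypothetical helper: limit the OCR search to a single container element.
// Assumption: `haystack` accepts a WebdriverIO element, not only a rectangle.
async function clickTextInside(browser, containerSelector, text) {
    const container = await browser.$(containerSelector);
    // Pass the container as the haystack so a smaller area needs to be processed.
    await browser.ocrClickOnText({ text, haystack: container });
}
```

If the element variant does not fit your case, a rectangle of the form `{ x, y, width, height }` can be passed as the haystack instead.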
The contrast between the text and the background might be too low, for example light text on a white background or dark text on a dark background. This can result in the text not being found. In the examples below, the text `Why WebdriverIO?` is white and surrounded by a grey button, so with the default contrast it is not found. By increasing the contrast for the specific command, the module finds the text and can click on it, see the second image.
```js
await driver.ocrClickOnText({
    haystack: { height: 44, width: 1108, x: 129, y: 590 },
    text: "WebdriverIO?",
    // With the default contrast of 0.25, the text is not found
    contrast: 1,
});
```
This can happen on some text fields where the click takes too long and is treated as a long tap. You can use the `clickDuration` option on `ocrClickOnText` and `ocrSetValue` to alleviate this. See here.
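The sketch below shows one way to pass a shorter click to `ocrSetValue`. It assumes `clickDuration` is given in milliseconds; the value `100`, the helper name, and its parameters are arbitrary and hypothetical, so tune them for your device.

```javascript
// Hypothetical helper: type into a field that is located by its visible text,
// using a short click so that it is less likely to be treated as a long tap.
// Assumption: `clickDuration` is in milliseconds; 100 is an arbitrary value.
async function typeIntoField(driver, label, value) {
    await driver.ocrSetValue({
        text: label,
        value,
        clickDuration: 100,
    });
}
```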
No, this is currently not possible. If the module finds multiple elements that match the provided selector, it automatically picks the element with the highest matching score.
I've never done it, but in theory, it should be possible. Please let us know if you succeed with that ☺️.
If you see a `{languageCode}.traineddata` file being added, this is a language data file used by Tesseract. It contains the training data for the selected language, which includes the information Tesseract needs to recognize that language's characters and words effectively.

The `{languageCode}.traineddata` file generally contains:

`{languageCode}.traineddata` is important because without it, Tesseract would not be able to recognize text in the selected language. When the `{languageCode}.traineddata` file is included in your project, it is easier to replicate the OCR environment across different systems or team members' machines.

Including `{languageCode}.traineddata` in your version control system is recommended for the following reasons:
Yes, you can use our CLI wizard for that. Documentation can be found here.