Screen Analysis Tools

Choosing the analysis tool that best fits the situation makes a scenario more complete and is the fastest, most robust way to author it.
Stego currently offers seven screen analysis tools, plus a Screenshot URL feature for analyzing external images. The tools are updated continuously through ongoing research and development.

OD (Object Detection) #

The AI engine, trained on commonly used icons and components in mobile apps, analyzes the screen and displays available screen elements.

The AI engine recognizes 18 types of screen elements (LABEL), listed below:

Home
Back
HamburgerMenu
Search
Clickable
AD
More
Spinner
SearchArea
Swipe
EditText
Keyboard
Tabs
Close
Switch
Selectable
PatternLock
Slider

OD Result Inspector #

Detailed information about the screen elements recognized by AI through Object Detection can be found by clicking “Inspector” at the bottom of the screen analysis panel.
Clicking “Inspector” will display the detected screen elements in a table format.

1. LABEL : Indicates the type of screen element.
2. TEXT : Displays any text included in the screen element.
3. BOX : Displays the position information of the screen element.

Clicking an item in the Inspector table highlights the corresponding section in yellow within the screen analysis results.
Conversely, clicking a specific screen element in the analysis results will also highlight the corresponding item in the Inspector table.
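
The two-way highlighting described above can be pictured as a lookup between table rows and boxes. The following is a minimal sketch, not Stego's implementation: the `InspectorRow` type, the `row_at` helper, and the BOX layout of (left, top, right, bottom) in pixels are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[int, int, int, int]  # assumed order: left, top, right, bottom


@dataclass
class InspectorRow:
    label: str  # LABEL column
    text: str   # TEXT column
    box: Box    # BOX column


def row_at(rows: List[InspectorRow], x: int, y: int) -> Optional[int]:
    """Return the index of the smallest row box containing (x, y), or None."""
    best = None
    best_area = None
    for i, row in enumerate(rows):
        left, top, right, bottom = row.box
        if left <= x <= right and top <= y <= bottom:
            area = (right - left) * (bottom - top)
            # Prefer the tightest box so nested elements stay selectable.
            if best_area is None or area < best_area:
                best, best_area = i, area
    return best


rows = [
    InspectorRow("Tabs", "", (0, 1800, 1080, 1920)),
    InspectorRow("Home", "Home", (40, 1820, 200, 1900)),
]
```

Clicking a table row maps directly to `rows[i].box`; clicking the screen goes the other way through `row_at`.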

OD Attributes #

Clicking a screen element in a step that uses OD lets you configure its OD attributes.
Refer to ‘Detailed Screen Elements Settings’ for more information.

Label : The UI type assigned by the AI engine.
Text : Sets the comparison condition for the text within the screen element during testing.
    = : Checks if the screen element’s text matches the value.
    *= : Checks if the value is included in the screen element’s text.
    ^= : Checks if the value matches the beginning of the screen element’s text.
    $= : Checks if the value matches the end of the screen element’s text.
    search : Uses a regular expression to check if any part of the screen element’s text matches the pattern.
    not used : Text is not used as a comparison condition.
Text Similarity(Threshold) : Specifies the similarity threshold for text comparison (applicable only when ‘=’ is used).
    This is the required similarity between the text at the time of scenario creation and the text recognized during test execution.
    Due to AI characteristics, slight recognition errors may occur depending on resolution and other environmental factors. Adjusting the threshold can help address such issues.
    A higher value requires a closer match.
    – Default : 0.8
    – Min : 0
    – Max : 1
Case Sensitive : Specifies whether to distinguish between uppercase and lowercase letters.
Selector : Specifies which instance to use when multiple matching elements are found during testing.
    Can be set to any integer except 0. (Note: ‘-1’ selects the last element.)
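
The Text conditions above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Stego's implementation: `text_matches` is a hypothetical helper, the similarity metric (`difflib.SequenceMatcher.ratio`) is a stand-in because the actual metric is not documented here, and applying Case Sensitive to every operator is also an assumption.

```python
import re
from difflib import SequenceMatcher


def text_matches(op, value, actual, threshold=0.8, case_sensitive=True):
    """Evaluate one Text condition against the text recognized at run time."""
    if op == "not used":
        return True  # text is not part of the comparison
    if op == "search":
        flags = 0 if case_sensitive else re.IGNORECASE
        return re.search(value, actual, flags) is not None
    if not case_sensitive:
        value, actual = value.lower(), actual.lower()
    if op == "=":
        # Assumed metric: difflib ratio in [0, 1]; 1.0 means identical strings.
        return SequenceMatcher(None, value, actual).ratio() >= threshold
    if op == "*=":
        return value in actual
    if op == "^=":
        return actual.startswith(value)
    if op == "$=":
        return actual.endswith(value)
    raise ValueError(f"unknown operator: {op}")
```

With the default threshold of 0.8, a one-character recognition error such as "Settlngs" for "Settings" still passes the ‘=’ check in this sketch, while a threshold of 1 demands an exact match.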

OCR (Optical Character Recognition) #

OCR reads text information displayed on the screen and provides it as screen elements.
Currently, it can recognize Korean, English, numbers, and some special characters.

OCR recognizes text on a word-by-word basis by default.

If a sentence consisting of multiple words needs to be used as a single screen element,
you can group multiple words into one element using drag-and-drop.
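
Conceptually, grouping joins the word texts and takes the union of their boxes. The sketch below assumes each word is a `(text, box)` pair with the box given as (left, top, right, bottom) and that the words sit on one line read left to right; `group_words` is a hypothetical helper, not a Stego function.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # assumed order: left, top, right, bottom
Word = Tuple[str, Box]


def group_words(words: List[Word]) -> Word:
    """Merge words into one element: joined text plus the union of the boxes."""
    if not words:
        raise ValueError("need at least one word")
    # Assumed reading order: left to right on a single line.
    ordered = sorted(words, key=lambda w: w[1][0])
    text = " ".join(t for t, _ in ordered)
    left = min(b[0] for _, b in ordered)
    top = min(b[1] for _, b in ordered)
    right = max(b[2] for _, b in ordered)
    bottom = max(b[3] for _, b in ordered)
    return text, (left, top, right, bottom)
```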

OCR Result Inspector #

Detailed information about recognized words via OCR can be checked by clicking the “Inspector” button at the bottom of the screen analysis panel.
Clicking “Inspector” will display the recognized words in a table format.

TEXT : Displays the recognized word content.
BOX : Displays the position information of the word on the screen.

Clicking an item in the Inspector table highlights the selected section in yellow within the screen analysis results.
Conversely, clicking a specific screen element in the analysis results will also highlight the corresponding item in the Inspector table.

(Example of Activated OCR Inspector)

OCR Attributes #

Clicking a screen element in a step that uses OCR lets you configure its OCR attributes.
Refer to ‘Detailed Screen Element Settings’ for more information.

Text : Sets the comparison condition for the text within the screen element during testing.
    = : Checks if the screen element’s text matches the value.
    *= : Checks if the value is included in the screen element’s text.
    ^= : Checks if the value matches the beginning of the screen element’s text.
    $= : Checks if the value matches the end of the screen element’s text.
    search : Uses a regular expression to check if any part of the screen element’s text matches the pattern.
    not used : Text is not used as a comparison condition.
Ignore Line Break : Ignores line breaks when checking text equality (applicable only when ‘=’ is used).
Text Similarity(Threshold) : Specifies the similarity threshold for text comparison (applicable only when ‘=’ is used).
    This is the required similarity between the text at the time of scenario creation and the text recognized during test execution.
    Due to AI characteristics, slight recognition errors may occur depending on resolution and other environmental factors. Adjusting the threshold can help address such issues.
    A higher value requires a closer match.
    – Default : 0.8
    – Min : 0
    – Max : 1
Case Sensitive : Specifies whether to distinguish between uppercase and lowercase letters.
Font Style Sensitivity : Specifies whether to consider font style variations.
Selector : Specifies which instance to use when multiple matching elements are found during testing.
    Can be set to any integer except 0. (Note: ‘-1’ selects the last element.)
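
The Selector rule can be sketched as follows. The source states only that 0 is invalid and that -1 selects the last element; treating positive values as 1-based positions and other negative values as counting back from the end is an assumption, and `select_match` is a hypothetical helper.

```python
def select_match(matches, selector):
    """Pick one element from the list of matches found during a test run."""
    if selector == 0:
        raise ValueError("selector must not be 0")
    # Assumed: 1 is the first match; negative values count from the end.
    index = selector - 1 if selector > 0 else selector
    try:
        return matches[index]
    except IndexError:
        return None  # fewer matches than the selector asks for


found = ["Buy (top banner)", "Buy (product card)", "Buy (footer)"]
first = select_match(found, 1)
last = select_match(found, -1)
```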

Crop Image #

Used to detect images displayed on the screen.

Users can directly specify an area on the device screen using drag-and-drop.
The selected screen element is analyzed using the “Feature Matching” technique to find similar images on the device.

When creating screen elements using Crop Image, there are a few important considerations:

1. Complex images have higher recognition accuracy than simple ones.

Since feature points are extracted from the image for comparison,
complex images contain more feature points, leading to higher recognition accuracy compared to simple images.

2. Avoid including the background when selecting an image to improve recognition accuracy.

As the proportion of solid color areas increases, the ratio of feature points decreases.
Selecting an image with minimal background improves recognition accuracy compared to including the background.

(Example of a selection that includes the background)
(Example of a selection with the background minimized)
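
To see why solid-color areas dilute feature points, consider the toy measurement below. It is only a crude stand-in: it counts pixels whose intensity changes sharply both horizontally and vertically, which is not the Feature Matching technique Stego uses. The function name, threshold, and sample images are all invented for illustration.

```python
def feature_ratio(image, threshold=30):
    """Fraction of pixels with a sharp intensity change in both directions."""
    height, width = len(image), len(image[0])
    hits = 0
    for y in range(height - 1):
        for x in range(width - 1):
            dx = abs(image[y][x + 1] - image[y][x])
            dy = abs(image[y + 1][x] - image[y][x])
            if dx >= threshold and dy >= threshold:
                hits += 1
    return hits / (width * height)


# A 4x4 high-contrast checkerboard standing in for a detailed icon.
icon = [[255 * ((x + y) % 2) for x in range(4)] for y in range(4)]

# The same icon selected together with a solid 12x12 background.
padded = [[0] * 12 for _ in range(12)]
for y in range(4):
    for x in range(4):
        padded[y + 4][x + 4] = icon[y][x]

tight_ratio = feature_ratio(icon)
loose_ratio = feature_ratio(padded)
```

In this toy case the tight selection has a much higher ratio than the padded one, which mirrors the guidance to minimize background.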

Crop Image Attributes #

Clicking a screen element in a step that uses Crop Image lets you modify the image coordinates.

Left : X-coordinate of the top-left corner of the screen element.
Top : Y-coordinate of the top-left corner of the screen element.
Right : X-coordinate of the bottom-right corner of the screen element.
Bottom : Y-coordinate of the bottom-right corner of the screen element.

Custom Box #

Used to define a specific area on the screen.

Users can directly specify an area on the device screen via drag-and-drop.
Primarily used for area-based actions such as scrolling and swiping, rather than specific elements.

Custom Box Attributes #

Clicking a screen element in a step that uses Custom Box lets you modify the area coordinates.

Left : X-coordinate of the top-left corner of the screen element.
Top : Y-coordinate of the top-left corner of the screen element.
Right : X-coordinate of the bottom-right corner of the screen element.
Bottom : Y-coordinate of the bottom-right corner of the screen element.
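
As an illustration of how the four coordinates can drive an area-based action, the sketch below derives start and end points for a swipe that stays inside the box. `swipe_points` and its `margin_percent` parameter are hypothetical; how Stego actually computes gesture coordinates is not described here.

```python
def swipe_points(box, direction, margin_percent=20):
    """Return (start, end) points for a swipe kept inside the given box."""
    left, top, right, bottom = box
    if right <= left or bottom <= top:
        raise ValueError("box must have positive width and height")
    cx = (left + right) // 2
    cy = (top + bottom) // 2
    # Keep the gesture away from the box edges by a fixed percentage.
    dx = (right - left) * margin_percent // 100
    dy = (bottom - top) * margin_percent // 100
    if direction == "up":
        return (cx, bottom - dy), (cx, top + dy)
    if direction == "down":
        return (cx, top + dy), (cx, bottom - dy)
    if direction == "left":
        return (right - dx, cy), (left + dx, cy)
    if direction == "right":
        return (left + dx, cy), (right - dx, cy)
    raise ValueError(f"unknown direction: {direction}")


start, end = swipe_points((0, 400, 1080, 1400), "up")
```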

Full Screen #

Used to define the entire screen as a screen element.

Primarily used for actions such as scrolling or swiping the entire screen.
Similar to Custom Box, but does not require specifying an area; instead, the entire screen is used directly.

Full Screen Attributes #

For Full Screen, the following attributes cannot be modified.

Left : X-coordinate of the top-left corner of the screen element.
Top : Y-coordinate of the top-left corner of the screen element.
Right : X-coordinate of the bottom-right corner of the screen element.
Bottom : Y-coordinate of the bottom-right corner of the screen element.

Accessibility #

Used when mirroring does not properly display the screen due to DRM restrictions (e.g., black screen or hidden secure areas).

Using the Inspector to select and set elements via drag-and-drop is recommended.

Accessibility Result Inspector #

Detailed information about analyzed screen elements via Accessibility can be checked by clicking the “Inspector” button at the bottom of the screen analysis panel.
Clicking “Inspector” will display the recognized elements in a table format.

IDENTIFIER : Displays the identifier string provided to identify the screen element.
TYPE : Displays the UIObject type.
TEXT : Displays the text contained within the screen element.
BOX : Displays the position information of the screen element.

(Example of Activated Accessibility Inspector)

Accessibility Attributes #

Clicking a screen element in a step that uses Accessibility lets you configure its Accessibility attributes.

Identifier : Sets the comparison condition for the identifier string provided to identify the screen element.
    = : Checks if the element’s identifier matches the value.
    not used : Identifier is not used as a comparison condition.
Type : Sets the comparison condition for the UIObject type provided by the test device.
    = : Checks if the element’s type matches the value.
    not used : Type is not used as a comparison condition.
Text : Sets the comparison condition for the text within the screen element.
    = : Checks if the element’s text matches the value.
    *= : Checks if the value is included in the element’s text.
    ^= : Checks if the value matches the beginning of the element’s text.
    $= : Checks if the value matches the end of the element’s text.
    search : Uses a regular expression to check if any part of the screen element’s text matches the pattern.
    not used : Text is not used as a comparison condition.
Text Similarity(Threshold) : Specifies the similarity threshold for text comparison (applicable only when ‘=’ is used).
    This is the required similarity between the text at the time of scenario creation and the text found during test execution.
    Due to AI characteristics, slight recognition errors may occur depending on resolution and other environmental factors. Adjusting the threshold can help address such issues.
    A higher value requires a closer match.
    – Default : 0.8
    – Min : 0
    – Max : 1
Case Sensitive : Specifies whether to distinguish between uppercase and lowercase letters.
Selector : Specifies which instance to use when multiple matching elements are found during testing.
    Can be set to any integer except 0. (Note: ‘-1’ selects the last element.)
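
The way the Identifier, Type, and Text conditions combine can be sketched as below. Two things here are assumptions: that every condition in use must hold at once, and that ‘not used’ simply skips a condition (modelled as `None`). The `AccessibilityNode` type, the `node_matches` helper, and the sample identifier are invented for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class AccessibilityNode:
    identifier: str
    type: str
    text: str


def node_matches(
    node: AccessibilityNode,
    identifier: Optional[str] = None,             # None stands for "not used"
    type_: Optional[str] = None,                  # None stands for "not used"
    text: Optional[Callable[[str], bool]] = None  # predicate for the Text condition
) -> bool:
    """Return True when every condition that is in use holds for the node."""
    if identifier is not None and node.identifier != identifier:
        return False
    if type_ is not None and node.type != type_:
        return False
    if text is not None and not text(node.text):
        return False
    return True


button = AccessibilityNode("com.example:id/login", "android.widget.Button", "Log in")
```

For example, `node_matches(button, type_="android.widget.Button", text=lambda t: t.startswith("Log"))` checks the type with ‘=’ and the text with a ‘^=’ style prefix test while leaving Identifier unused.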

Relative #

If it is difficult to specify screen elements using OD or OCR due to real-time changes,
a fixed screen element can be used as a reference to define elements based on relative positioning.

The reference point is set using OD, OCR, or Crop Image and is highlighted in yellow,
while the actual element selected and set via drag-and-drop appears in blue.

Refer to the following examples for detailed usage:
– Verifying a Specific Word at a Given Location
– Using Conditional Actions

Relative Attributes #

Relative screen elements provide two sets of attributes: reference point attributes and Relative attributes.

Reference Point Attributes
Refer to the attributes of the screen analysis tool used to set the reference point.

OD Attributes
OCR Attributes
Crop Image Attributes

Relative Attributes

Left : X-coordinate of the top-left corner of the screen element.
Top : Y-coordinate of the top-left corner of the screen element.
Right : X-coordinate of the bottom-right corner of the screen element.
Bottom : Y-coordinate of the bottom-right corner of the screen element.
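
One way to picture relative positioning: record where the target box sits relative to the reference element at authoring time, then reapply that offset wherever the reference is found during the test. Whether Stego stores the Relative attributes as offsets like this is an assumption; `to_offsets` and `resolve` are hypothetical helpers.

```python
def to_offsets(reference, target):
    """Express the target box relative to the reference's top-left corner."""
    ref_left, ref_top = reference[0], reference[1]
    left, top, right, bottom = target
    return (left - ref_left, top - ref_top, right - ref_left, bottom - ref_top)


def resolve(reference_now, offsets):
    """Rebuild the target box from the reference box found at test time."""
    ref_left, ref_top = reference_now[0], reference_now[1]
    return (
        ref_left + offsets[0],
        ref_top + offsets[1],
        ref_left + offsets[2],
        ref_top + offsets[3],
    )


# Authoring time: a fixed label (reference) with a changing value to its right.
offsets = to_offsets(reference=(100, 200, 300, 260), target=(320, 200, 500, 260))

# Test time: the label was found 300 px lower, so the target moves with it.
target_now = resolve((100, 500, 300, 560), offsets)
```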

Screenshot URL #

This feature analyzes screens from external images loaded via URL, in addition to mirrored devices, so no device needs to be connected.
When a scenario must be modified based on test execution results, this is more convenient than reconnecting the device to reproduce the situation.

① Click the ‘Add screenshot URL’ button.
② Enter the URL of the external image to be analyzed in the Screenshot URL input popup.
③ Click ‘OK’ to load the image via the URL and analyze the screen.

(Example of analyzing a screen using an image URL via OCR from an Apptest.ai blog post)

Updated on 02-06-2025
