frontend/packages/browser-plugin/README.md
This repository is for the RPA browser plugin, which handles web-related automation.
Run npm run build to package the universal extension for Chrome/Edge browsers.
Run npm run build:browser to package the plugin for a specific browser, where browser is the executable file name of the target browser. Any Chromium-based or Firefox-based browser variant is supported: set custom_agent in src/3rd/rpa_websocket.js and add the corresponding command to package.json.
To trigger deep search, hover the mouse over the element for more than 5 seconds. Deep search is designed for elements that are covered by other elements. For elements that are not covered, it may select an element other than the one the user wants, so it should be used only when needed.
Deep search calls document.elementsFromPoint(x, y) to get all elements under the mouse coordinates (x, y). For each element, it calculates the distance from the element's left, top, right, and bottom edges to the target point (x, y) and picks the closest element. If several elements tie, the first one is taken.
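The selection step can be sketched roughly as follows. This is a minimal sketch, not the plugin's actual code: the function names are hypothetical, and combining the four edge distances by summing them is an assumption, since the text does not say how they are combined.

```javascript
// Hypothetical scoring: sum of the distances from (x, y) to the four edges.
// The exact metric used by the plugin is an assumption here.
function edgeDistance(rect, x, y) {
  return (
    Math.abs(x - rect.left) +
    Math.abs(rect.right - x) +
    Math.abs(y - rect.top) +
    Math.abs(rect.bottom - y)
  );
}

// Returns the index of the closest rectangle, or -1 for an empty list.
// The strict comparison keeps the first rectangle when several tie.
function closestIndex(rects, x, y) {
  let best = -1;
  let bestScore = Infinity;
  rects.forEach((rect, i) => {
    const score = edgeDistance(rect, x, y);
    if (score < bestScore) {
      best = i;
      bestScore = score;
    }
  });
  return best;
}

// Browser-only wrapper: collects the elements under the point and picks one.
function deepSearch(x, y) {
  const elements = document.elementsFromPoint(x, y);
  const rects = elements.map((el) => el.getBoundingClientRect());
  const i = closestIndex(rects, x, y);
  return i === -1 ? null : elements[i];
}
```

Because every element returned by elementsFromPoint contains the point, this particular metric reduces to width plus height, so it favors the smallest box under the cursor.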
See the shadowRootElement function for details. It recursively checks whether an element hosts a shadowRoot. If so, the search continues inside the shadowRoot until a normal element is found. The segments of the resulting path are joined with $shadow$ so that shadow boundaries are easy to identify. When an element is retrieved by path, $shadow$ is recognized as well, and the getElementBySelector function is used to find the target element. Currently, elements inside a shadowRoot support only CSS selectors, not XPath selectors.
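Resolving a path that contains $shadow$ might look like the sketch below. It is a hypothetical stand-in for illustration, not the plugin's getElementBySelector: the function name resolveShadowPath and the exact path layout (CSS selectors separated by $shadow$) are assumptions.

```javascript
const SHADOW_SEPARATOR = "$shadow$";

// Walks a path such as "my-widget$shadow$button.ok": each segment is a CSS
// selector evaluated inside the shadow root of the previous match.
// `root` is anything with querySelector, for example `document`.
function resolveShadowPath(root, path) {
  const segments = path
    .split(SHADOW_SEPARATOR)
    .map((segment) => segment.trim())
    .filter(Boolean);

  let scope = root;
  let element = null;
  for (const segment of segments) {
    if (!scope) return null; // previous match has no open shadow root
    element = scope.querySelector(segment);
    if (!element) return null; // selector did not match in this scope
    scope = element.shadowRoot; // null for closed or missing shadow roots
  }
  return element;
}
```

In a page, a call such as resolveShadowPath(document, "my-widget$shadow$button.ok") would return the button inside the shadow root of my-widget, or null if any step fails. Both selectors here are made up for the example.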
For cross-origin iframes, finding the desired iframe by src is inaccurate: the same-origin policy blocks direct access to the frame's content, the src attribute can stay fixed while the URL inside the frame changes, and the same tab may contain several iframes with an identical src. The solution is to obtain the nesting relationship (parentFrameId) from the plugin's main process, derive the nesting order, and then search layer by layer using the x, y coordinates and that order. At each layer, the x, y coordinates are reduced by the iframe's position and box model (borderLeft, paddingLeft) to get the accurate position within the iframe content. See the getIframeElement function in backgroundInject and contentInject for details.
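The layer-by-layer coordinate reduction can be sketched as follows. This is a minimal sketch, not the plugin's getIframeElement: the frame objects and their field names are hypothetical, and the top-edge fields (borderTop, paddingTop) are assumed by analogy with borderLeft and paddingLeft.

```javascript
// Converts a point in the parent's viewport into the coordinate space of
// one iframe's content by removing the iframe's offset, border, and padding.
function toFrameCoordinates(point, frame) {
  return {
    x: point.x - frame.left - frame.borderLeft - frame.paddingLeft,
    y: point.y - frame.top - frame.borderTop - frame.paddingTop,
  };
}

// Applies the conversion once per nesting level, outermost iframe first.
function descend(point, frameChain) {
  return frameChain.reduce(toFrameCoordinates, point);
}

// Two nested iframes, described by hypothetical measurements in pixels.
const chain = [
  { left: 100, top: 50, borderLeft: 2, borderTop: 2, paddingLeft: 8, paddingTop: 8 },
  { left: 20, top: 30, borderLeft: 1, borderTop: 1, paddingLeft: 0, paddingTop: 0 },
];
console.log(descend({ x: 300, y: 200 }, chain)); // { x: 169, y: 109 }
```

In a content script, the measurements for each level would typically come from the iframe element's getBoundingClientRect() and getComputedStyle() results.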
Same-origin iframes are handled the same way as cross-origin iframes. Although the same-origin restriction is less strict, using a single method for both cases avoids redundancy, reduces variables, and minimizes errors.
In the content script, all frame elements in the current window are marked, and each iframe's XPath is sent to the corresponding iframe via iframe.contentWindow.postMessage. The iframe receives the XPath via window.addEventListener('message', function(e) { }) and forwards it to the main process, which binds the iframe's XPath to its id. The full iframe path can then be assembled from the id nesting relationship, and the target element inside the iframe can be located via the iframeXpath field.
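Once each frame id is bound to an XPath, assembling the full iframe path from the nesting relationship could look like this. It is a sketch under stated assumptions: the function name framePath and the data shapes are hypothetical, and the top-level frame is assumed to have frameId 0 and parentFrameId -1, as in the chrome.webNavigation API.

```javascript
// frames: [{ frameId, parentFrameId }], e.g. collected by the main process.
// xpathByFrameId: Map of frameId -> XPath of that iframe in its parent.
// Returns the chain from the outermost iframe down to the target frame.
function framePath(frames, xpathByFrameId, targetFrameId) {
  const parentOf = new Map(frames.map((f) => [f.frameId, f.parentFrameId]));
  const path = [];
  const seen = new Set(); // guards against malformed, cyclic input
  let id = targetFrameId;
  while (id !== 0 && id !== -1 && parentOf.has(id) && !seen.has(id)) {
    seen.add(id);
    path.unshift({ frameId: id, iframeXpath: xpathByFrameId.get(id) });
    id = parentOf.get(id);
  }
  return path;
}

// Hypothetical data: frame 9 is nested in frame 4, which sits in the top frame.
const frames = [
  { frameId: 0, parentFrameId: -1 },
  { frameId: 4, parentFrameId: 0 },
  { frameId: 9, parentFrameId: 4 },
];
const xpaths = new Map([
  [4, "//iframe[1]"],
  [9, "//div/iframe"],
]);
console.log(framePath(frames, xpaths, 9));
```

The returned order (outermost first) matches the layer-by-layer search: each entry says which iframe to enter next and how to find it in its parent.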
The iframeXpath is located with the same logic as any other element, so it is subject to the same problems when the page changes dynamically.
The extension can be allowed through the ExtensionInstallAllowlist policy, and the applied policies can be inspected at chrome://policy. On Windows, add it to the registry at Software\Policies\Google\Chrome\ExtensionInstallAllowlist. On Linux, add it to /etc/opt/chrome/policies/managed/policy.json. See https://chromeenterprise.google/policies/?policy=ExtensionInstallAllowlist for details.
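As an illustration of the Linux case, the contents of policy.json could be generated with a small script like the one below. The helper name and the extension ID are placeholders, not values from this repository; substitute the plugin's real ID.

```javascript
// Chrome extension IDs are 32 characters drawn from the letters a to p.
const EXTENSION_ID_PATTERN = /^[a-p]{32}$/;

// Hypothetical helper: returns the JSON text for a managed policy file
// that allowlists the given extension IDs.
function buildAllowlistPolicy(extensionIds) {
  for (const id of extensionIds) {
    if (!EXTENSION_ID_PATTERN.test(id)) {
      throw new Error(`Not a valid extension ID: ${id}`);
    }
  }
  return JSON.stringify({ ExtensionInstallAllowlist: extensionIds }, null, 2);
}

// Placeholder ID for illustration only.
const PLACEHOLDER_ID = "abcdefghijklmnopabcdefghijklmnop";
console.log(buildAllowlistPolicy([PLACEHOLDER_ID]));
```

The printed JSON is an object with a single ExtensionInstallAllowlist array, which is the shape Chrome expects for list-valued policies in a managed policy file.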