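How to extract the text on the phone's current screen and its position · Q&A

To extract the text on the current screen together with its position, you can use Hamibot's widget selectors to query the UI tree exposed by the accessibility service. The method and sample code follow.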

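// Documentation: https://docs.hamibot.com/

// Make sure the accessibility service is enabled.
// auto.waitFor() opens the accessibility settings if needed and blocks until the service is on.
if (!auto.service) {
    auto.waitFor();
}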

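// Walk the common text-bearing widgets and record each one's text and on-screen position.
// classNameMatches takes a regular expression, so one selector covers all three classes.
classNameMatches(/android\.widget\.(TextView|Button|EditText)/).find().forEach(function(widget) {
    var text = widget.text() || widget.desc(); // prefer the text, fall back to the content description
    var bounds = widget.bounds(); // android.graphics.Rect with left, top, right, bottom
    var center = {
        x: bounds.centerX(), // integer pixel coordinates
        y: bounds.centerY()
    };

    if (text) {
        log("Text: " + text +
            " | Center: (" + center.x + ", " + center.y + ")" +
            " | Bounds: [" + bounds.left + "," + bounds.top + "," +
                       bounds.right + "," + bounds.bottom + "]");
    }
});

Notes:

classNameMatches(...) selects widgets whose class name matches the regular expression; android.widget.TextView is the usual carrier of on-screen text.
The alternation (TextView|Button|EditText) extends the match to buttons and input fields, which may also contain text.
widget.text() returns the widget's text content; widget.desc() returns its content description (the accessibility description).
widget.bounds() returns a Rect object for the widget's area on screen, with the fields left, top, right, and bottom.
The center point (x, y) can be used to simulate a tap or to locate the widget.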
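If you need the results as data rather than log lines, the loop can be wrapped in a small helper. The following is a minimal sketch, not part of the Hamibot API: collectTextItems and findByText are hypothetical names introduced here, and the code assumes only that each widget exposes text(), desc(), and bounds() as described above. The last few lines run only inside Hamibot, where auto, className, and log are supplied by the runtime.

```javascript
// Convert widgets into plain records: { text, x, y, left, top, right, bottom }.
// `widgets` is anything with forEach whose items expose text(), desc() and bounds().
function collectTextItems(widgets) {
    var items = [];
    widgets.forEach(function (w) {
        var label = w.text() || w.desc();
        if (!label) return; // skip widgets with neither text nor description
        var b = w.bounds();
        items.push({
            text: String(label),
            x: Math.floor((b.left + b.right) / 2), // center of the bounding box
            y: Math.floor((b.top + b.bottom) / 2),
            left: b.left, top: b.top, right: b.right, bottom: b.bottom
        });
    });
    // Reading order: top to bottom, then left to right.
    items.sort(function (p, q) { return (p.top - q.top) || (p.left - q.left); });
    return items;
}

// Return the first record whose text contains `keyword`, or null if none does.
function findByText(items, keyword) {
    for (var i = 0; i < items.length; i++) {
        if (items[i].text.indexOf(keyword) !== -1) return items[i];
    }
    return null;
}

// Runs only inside Hamibot; elsewhere `auto` is undefined and this is skipped.
if (typeof auto !== "undefined") {
    var found = collectTextItems(className("TextView").find());
    log(JSON.stringify(found));
}
```

A record returned by findByText carries the x and y you would pass to a tap call, so the lookup and the tap can be kept as separate steps.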

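⚠️ Note: this method relies on Android's accessibility service to parse the structure of the current screen, so the service must be enabled (auto.service is non-null) and the target app's UI must be accessible. Some protected screens and natively rendered game UIs cannot be read this way.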

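For debugging, you can also print the entire UI hierarchy with a dump() function, if your Hamibot version provides one:

// Print the full UI layout tree of every window (useful for analysis)
auto.windowRoots.forEach(function(root) {
    log(dump(root));
});

Keep in mind that this output is very verbose; it is suited to debugging, not to production scripts.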
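If dump() is not available in your runtime, a similar outline can be built by walking the tree yourself. This is a rough sketch under the assumption that each node exposes className(), text(), bounds(), childCount(), and child(i); outline is a hypothetical helper name, not a Hamibot function. The final block runs only inside Hamibot.

```javascript
// Build an indented outline of a node tree, one line per node.
// Each node is assumed to expose className(), text(), bounds(), childCount() and child(i).
function outline(node, depth, lines) {
    lines = lines || [];
    depth = depth || 0;
    if (!node) return lines; // a child lookup may return null; skip it
    var b = node.bounds();
    var indent = new Array(depth + 1).join("  "); // two spaces per level
    lines.push(indent + node.className() + ' "' + (node.text() || "") + '" [' +
        b.left + "," + b.top + "," + b.right + "," + b.bottom + "]");
    for (var i = 0; i < node.childCount(); i++) {
        outline(node.child(i), depth + 1, lines);
    }
    return lines;
}

// Runs only inside Hamibot; elsewhere `auto` is undefined and this is skipped.
if (typeof auto !== "undefined") {
    auto.windowRoots.forEach(function (root) {
        log(outline(root).join("\n"));
    });
}
```

Because the helper returns an array of lines, you can filter it (for example, keep only lines with non-empty text) before logging, which keeps the output manageable on complex screens.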

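If the text is rendered as an image, the widget tree will not contain it; in that case, consider an image-recognition module (such as OpenCV or a dedicated OCR plugin). By default, though, the widget-based approach above is the most efficient and accurate option.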
