# Pipeline Protocol

## Basic Format

The task pipeline is described in JSON and consists of a set of nodes. Each node contains the following core attributes:
```json
{
    "NodeA": {
        "recognition": "OCR",   // Recognition algorithm
        "action": "Click",      // Action to be executed
        "next": [               // List of subsequent nodes
            "NodeB",
            "NodeC"
        ]
        // Other extended properties...
    }
    // Other node definitions...
}
```

## Execution Logic
### Process Control Mechanism

1. **Task Triggering**
    - Start the task by specifying the entry node through the `tasker.post_task` interface
2. **Sequential Detection**
    - Perform detection on the current node's `next` list in order
    - Attempt to match the recognition features configured for each sub-node, one by one
3. **Interruption Mechanism**
    - As soon as a sub-node is matched, detection of the remaining nodes stops immediately
    - The `action` defined by the matched node is executed
4. **Successor Processing**
    - After the action completes, the matched node becomes the new current node
    - The detection process above is then repeated

### Termination Conditions

The task terminates when any of the following conditions is met:

- The current node's `next` list is empty
- Detection of all subsequent nodes keeps failing until the timeout is reached
## Application Examples

### Scenario Description

The Android settings interface contains the menus Display / Storage / Accessibility. Opening Storage reveals the submenus Games / Apps.

### Configuration Example
```json
{
    "Start Example": {
        "next": [
            "Enter Display",
            "Enter Storage",
            "Enter Accessibility"
        ]
    },
    "Enter Display": {
        "recognition": "XXX",
        "action": "Click"
        // ...
    },
    "Enter Storage": {
        "recognition": "XXX",
        "action": "Click",
        "next": [
            "Enter Games Submenu",
            "Enter Applications Submenu"
        ]
    },
    "Enter Accessibility": {
        // ...
    },
    "Enter Games Submenu": {
        "action": "Swipe",
        "next": []
    },
    "Enter Applications Submenu": {
        // ...
    }
    // ...
}
```

### Execution Process Simulation
Two flowcharts, Case 1 and Case 2, illustrate the detection and execution flow for this configuration.
## Property Fields

> [!TIP]
>
> Mandatory fields may be left empty in the pipeline JSON file, but they must be set via the interface before actual execution.
>
> Folders and JSON files whose names begin with a dot (.) will not be read.
>
> JSON root fields whose names begin with $ will not be parsed.
### Pipeline v1

- `recognition`: *string*

    Recognition algorithm type. Optional, default is `DirectHit`.
    Possible values: `DirectHit` | `TemplateMatch` | `FeatureMatch` | `ColorMatch` | `OCR` | `NeuralNetworkClassify` | `NeuralNetworkDetect` | `Custom`.
    See Algorithm Types for details.

- `action`: *string*

    Action to execute. Optional, default is `DoNothing`.
    Possible values: `DoNothing` | `Click` | `LongPress` | `Swipe` | `MultiSwipe` | `Scroll` | `ClickKey` | `LongPressKey` | `InputText` | `StartApp` | `StopApp` | `StopTask` | `Command` | `Shell` | `Custom`.
    See Action Types for details.

- `next`: *string | NodeAttr | list<string | NodeAttr, >*

    List of nodes to execute next. Optional, default is empty.
    Each node in the list is recognized in sequence, and the first one recognized is executed.
    💡 Since v5.1, nodes with attributes are supported: NodeAttr objects, or heterogeneous arrays combining both forms. See Node Attributes for details.

- `is_sub`: *bool*

    (Deprecated in version 5.1; use `[JumpBack]` in Node Attributes instead.)
    We provide a migration script to automatically convert your pipeline to the new style losslessly.

- `interrupt`: *string | list<string, >*

    (Deprecated in version 5.1; use `[JumpBack]` in Node Attributes instead.)
    We provide a migration script to automatically convert your pipeline to the new style losslessly.

- `rate_limit`: *uint*

    Recognition rate limit, in milliseconds. Optional, default is 1000.
    Each round of recognition takes at least `rate_limit` milliseconds; if it finishes sooner, the remaining time is spent sleeping.

- `timeout`: *uint*

    Timeout for recognizing the `next` nodes, in milliseconds. Optional, default is 20,000 (20 seconds).
    The detailed logic is `while (!timeout) { foreach(next); sleep_until(rate_limit); }`.

- `on_error`: *string | NodeAttr | list<string | NodeAttr, >*

    When recognition times out or the action fails to execute, the nodes in this list are executed next. Optional, default is empty.
    💡 Since v5.1, nodes with attributes are supported: NodeAttr objects, or heterogeneous arrays combining both forms. See Node Attributes for details.

- `anchor`: *string | list<string, >*

    Anchor name. Optional, default is empty.
    When the node executes successfully, the anchor name is bound to the current node. Multiple nodes may set the same anchor name; later executions overwrite earlier ones.
    In `next` or `on_error`, you can reference an anchor with the `[Anchor]` attribute, which resolves at runtime to the last node that set that anchor.
    See Node Attributes for details.

- `inverse`: *bool*

    Invert the recognition result: recognized is treated as not recognized, and vice versa. Optional, default is false.
    Note that a node recognized through this setting has its own click action disabled (because nothing was actually recognized). If needed, set `target` explicitly.

- `enabled`: *bool*

    Whether this node is enabled. Optional, default is true.
    If false, the node is skipped when it appears in other nodes' `next` lists, i.e., it is neither recognized nor executed.

- `max_hit`: *uint*

    Maximum number of times this node can be successfully recognized. Optional, default is UINT_MAX (unlimited).
    Once this count is exceeded, the node is skipped when it appears in other nodes' `next` lists, i.e., it is neither recognized nor executed.

- `pre_delay`: *uint*

    Delay in milliseconds between recognizing the node and executing the action. Optional, default is 200.
    It is recommended to add intermediate nodes wherever possible and keep delays short, to maintain both speed and stability.

- `post_delay`: *uint*

    Delay in milliseconds between executing the action and recognizing the `next` nodes. Optional, default is 200.
    It is recommended to add intermediate nodes wherever possible and keep delays short, to maintain both speed and stability.

- `pre_wait_freezes`: *uint | object*

    Time in milliseconds to wait for the screen to stop changing, between recognizing the node and executing the action. Optional, default is 0 (no waiting).
    The action is executed only after the screen has shown no significant change for `pre_wait_freezes` milliseconds in a row.
    If it's an object, more parameters can be set; see Waiting for the Screen to Stabilize for details.
    The specific order is `pre_wait_freezes` - `pre_delay` - action - `post_wait_freezes` - `post_delay`.

- `post_wait_freezes`: *uint | object*

    Time in milliseconds to wait for the screen to stop changing, between executing the action and recognizing the `next` nodes. Optional, default is 0 (no waiting).
    Other logic is the same as `pre_wait_freezes`.

- `repeat`: *uint*

    Number of times to repeat the action. Optional, default is 1 (no repetition).
    The execution flow is action - [`repeat_wait_freezes` - `repeat_delay` - action] × (`repeat` - 1).

- `repeat_delay`: *uint*

    Delay between repeated actions, in milliseconds. Optional, default is 0.
    Only effective when `repeat` > 1; the wait is applied before the second and subsequent executions.

- `repeat_wait_freezes`: *uint | object*

    Time in milliseconds to wait for the screen to stop changing between repeated actions. Optional, default is 0 (no waiting).
    Only effective when `repeat` > 1; the wait is applied before the second and subsequent executions.
    If it's an object, more parameters can be set; see Waiting for the Screen to Stabilize for details.

- `focus`: *any*

    Focus on the node, producing additional callback messages. Optional, default is null (no messages).
    See Node Notifications for details.

- `attach`: *object*

    Additional JSON object for storing extra node configuration. Optional, default is an empty object.
    This field can hold custom configuration that does not affect the node's execution logic but can be retrieved through the related interfaces.
    Note: this field is merged with the `attach` in the default values using a dict merge rather than overwritten; keys present in both take the node's value, while other keys are preserved.
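For example, a minimal sketch (the keys inside `attach` are hypothetical; any JSON content is allowed):

```json
{
    "NodeA": {
        "recognition": "OCR",
        "expected": "Confirm",
        "attach": {
            // Hypothetical metadata: ignored by the execution logic, retrievable via the query interfaces
            "author": "somebody",
            "description": "Clicks the confirm button"
        }
    }
}
```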
The lifecycle of a node is as follows: recognition → `pre_wait_freezes` → `pre_delay` → action (repeated `repeat` times, with `repeat_wait_freezes` / `repeat_delay` before each repetition) → `post_wait_freezes` → `post_delay` → recognition of the `next` list.
### Pipeline v2

> [!NOTE]
>
> MaaFW supports the Pipeline v2 protocol since v4.4.0 and remains compatible with v1.

Compared with v1, the main difference is that the recognition- and action-related fields move into a nested dictionary: the type goes into the `type` field and the other parameters into the `param` field. Everything else is unchanged. Example:
```json
{
    "NodeA": {
        "recognition": {
            "type": "TemplateMatch",
            "param": {
                // Recognition-related fields go into recognition.param; keys and values are unchanged from v1
                "template": "A.png",
                "roi": [100, 100, 10, 10]
            }
        },
        "action": {
            "type": "Click",
            "param": {
                // Action-related fields go into action.param; keys and values are unchanged from v1
                "target": "XXX"
            }
        },
        // Fields other than recognition and action are unchanged from v1
        "next": ["NodeB"],
        "pre_delay": 1000
        // ...
    }
}
```

### Default Properties
Please refer to `default_pipeline.json`.

`default_pipeline.json` sets default parameters for all nodes and for specific algorithm/action types, reducing redundant configuration. The file is loaded automatically during resource loading, and its configuration serves as the baseline defaults inherited by all nodes.

#### Configuration Structure
```json
{
    "Default": {
        // Default values for common fields, applicable to all nodes
        "rate_limit": 2000,
        "timeout": 30000,
        "pre_delay": 500
    },
    "TemplateMatch": {
        // Default parameters for the TemplateMatch algorithm
        // Either the v1 or the v2 format can be used
        "recognition": "TemplateMatch", // v1 format: directly in the root object
        "threshold": 0.7
        // Or the v2 format:
        // "recognition": { "type": "TemplateMatch", "param": { "threshold": 0.7 } }
    },
    "Click": {
        // Default parameters for the Click action
        "action": "Click", // v1 format
        "target": true
        // Or the v2 format:
        // "action": { "type": "Click", "param": { "target": true } }
    }
}
```

#### Field Description
- `Default` object: sets default values for all common pipeline fields (such as `rate_limit`, `timeout`, `pre_delay`, `post_delay`, etc.)
- Algorithm-name objects (e.g., `TemplateMatch`, `OCR`, `FeatureMatch`): set default parameters for the corresponding recognition algorithm type
- Action-name objects (e.g., `Click`, `Swipe`, `StartApp`): set default parameters for the corresponding action type
#### Inheritance Priority

Parameter priority, from highest to lowest:

1. Parameters defined directly in the node
2. Default parameters for the corresponding algorithm/action type in `default_pipeline.json`
3. Default parameters in the `Default` object of `default_pipeline.json`
4. Built-in default values of the framework
#### Usage Example

Assuming `default_pipeline.json` is configured as follows:

```json
{
    "Default": {
        "rate_limit": 2000
    },
    "TemplateMatch": {
        "recognition": "TemplateMatch",
        "threshold": 0.7
    }
}
```

In the pipeline JSON:
```json
{
    "NodeA": {
        "recognition": "TemplateMatch",
        "template": "A.png"
        // rate_limit inherits 2000 from Default
        // threshold inherits 0.7 from TemplateMatch
    },
    "NodeB": {
        "recognition": "TemplateMatch",
        "template": "B.png",
        "threshold": 0.9,   // Overrides the default value
        "rate_limit": 1000  // Overrides the default value
    }
}
```

#### Notes

- `default_pipeline.json` should be placed in the root directory of the resource bundle (Bundle), at the same level as the `pipeline` folder
- The key names of algorithm/action objects must exactly match the corresponding algorithm/action type names (e.g., `TemplateMatch`, `OCR`, `Click`) and are case-sensitive
- The `attach` field is dictionary-merged (dict merge) rather than overwritten, meaning the node's `attach` merges with the default `attach`
- If you don't need a certain default, simply set it explicitly to another value in the node to override it
#### Multiple Bundle Loading Behavior

When loading multiple Bundles sequentially:

- Their `default_pipeline.json` files are merged (dict merge) in load order, with later-loaded fields overriding earlier ones
- Already-loaded nodes are NOT affected by a subsequent Bundle's `default_pipeline.json`
- Each node uses the merged defaults in effect when it is first loaded, and remains unchanged thereafter
Example: Bundle_Base → Bundle_Debug

```json
// Bundle_Base/default_pipeline.json
{ "Default": { "rate_limit": 2000 } }

// Bundle_Base/pipeline/main.json
{ "NodeA": { "recognition": "OCR" } } // uses rate_limit: 2000

// Bundle_Debug/default_pipeline.json
{ "Default": { "rate_limit": 100 } } // overrides the defaults

// Bundle_Debug/pipeline/debug.json
{ "NodeB": { "recognition": "OCR" } } // uses rate_limit: 100
```

After loading, NodeA keeps `rate_limit: 2000` and NodeB uses `rate_limit: 100`.
#### Best Practice: Separating Special Configurations

Organize nodes with special configurations into a separate Bundle with its own `default_pipeline.json` to achieve configuration isolation.

Example: separate debug nodes into an independent Bundle:

```cpp
MaaResourcePostPath(resource, "resource/base");  // base nodes use rate_limit: 2000
MaaResourcePostPath(resource, "resource/debug"); // debug nodes use rate_limit: 100
```

Since already-loaded nodes are not affected by subsequently loaded defaults, the base Bundle's nodes retain their original configuration while the debug Bundle's nodes use the new defaults.
## Algorithm Types

### DirectHit

Direct hit, meaning no recognition is performed and the action is executed directly.

### TemplateMatch

Template matching, also known as "find image."

This algorithm requires the following additional fields:
- `roi`: *array<int, 4> | string*

    Recognition area coordinates. Optional, default is [0, 0, 0, 0], i.e., the full screen.
    - *array<int, 4>*: recognition area coordinates [x, y, w, h]; for the full screen, use [0, 0, 0, 0].
    - *string*: a node name; recognition is performed within the target range recognized by that previously executed node.

- `roi_offset`: *array<int, 4>*

    Additional movement on top of `roi`; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `template`: *string | list<string, >*

    Path to the template image, relative to the "image" folder. Required.
    The images used need to be cropped from the lossless original image and scaled to 720p; see the related documentation for details.
    Folder paths are also supported; all image files in them are loaded recursively.

- `threshold`: *double | list<double, >*

    Template matching threshold. Optional, default is 0.7.
    If it's an array, its length should match the length of the `template` array.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Score` | `Random`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.

- `method`: *int*

    Template matching algorithm, equivalent to cv::TemplateMatchModes. Optional, default is 5.
    For more details, refer to the OpenCV official documentation.
    10001 is the inverted version of TM_SQDIFF_NORMED, where higher scores indicate better matches (opposite to the original).

    | method | Algorithm Name | Speed | Color Sensitivity | Lighting Robustness | Use Case |
    |---|---|---|---|---|---|
    | 10001 | TM_SQDIFF_NORMED (Inverted) | Fast | Sensitive | Poor | Precise matching, easy threshold setting |
    | 3 | TM_CCORR_NORMED | Medium | Sensitive | Medium | Works well with bright templates |
    | 5 | TM_CCOEFF_NORMED | Slower | Insensitive | Good | Recommended, easy threshold setting |

- `green_mask`: *bool*

    Whether to apply a green mask. Optional, default is false.
    If true, you can paint the unwanted parts of the template green with RGB (0, 255, 0), and those green parts won't be matched.
    Note: the algorithm itself is quite robust, so this feature is usually unnecessary for normal background variations. If you do use it, mask only the interfering areas; excessive masking can destroy the edge features of the main subject.
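Putting these fields together, a minimal sketch of a template-matching node (the image name and coordinates are hypothetical):

```json
{
    "FindConfirmButton": {
        "recognition": "TemplateMatch",
        "template": "confirm_button.png",  // hypothetical file under the "image" folder
        "roi": [0, 360, 1280, 360],        // search only the lower half of a 1280x720 screen
        "threshold": 0.8,                  // stricter than the 0.7 default
        "action": "Click"
    }
}
```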
### FeatureMatch

Feature matching: a more powerful "find image" with better generalization, resistant to perspective and size changes.

This algorithm requires the following additional fields:

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`.

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.

- `template`: *string | list<string, >*

    Path to the template image, relative to the "image" folder. Required.
    Folder paths are also supported; all image files in them are loaded recursively.

    Notes:
    - Template images should not be too small; at least 64×64 pixels is recommended. Too few feature points may cause matching failures or false matches.
    - Templates should contain sufficient texture detail (such as edges and corners); solid-color or gradient areas have sparse feature points.
    - Avoid templates with large amounts of repetitive texture (such as grids or stripes), which may confuse feature-point matching.

- `count`: *uint*

    The minimum number of matching feature points required (threshold). Optional, default is 4.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Score` | `Area` | `Random`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.

- `green_mask`: *bool*

    Whether to apply a green mask. Optional, default is false.
    If true, you can paint the unwanted parts of the template green with RGB (0, 255, 0), and those green parts won't be matched.
    Note: the algorithm itself is quite robust, so this feature is usually unnecessary for normal background variations. If you do use it, mask only the interfering areas; excessive masking can destroy the edge features of the main subject.

- `detector`: *string*

    Feature detector. Optional, default is `SIFT`.

    | Detector | Speed | Scale Invariance | Rotation Invariance | Accuracy | Use Case |
    |---|---|---|---|---|---|
    | SIFT | Slow | Yes | Yes | Highest | Recommended, high accuracy requirements |
    | KAZE | Slower | Yes | Yes | High | 2D/3D images, good edge preservation |
    | AKAZE | Medium | Yes | Yes | Good | Balance between speed and accuracy |
    | BRISK | Fast | Yes | Yes | Medium | Real-time requirements |
    | ORB | Fastest | No | Yes | Lower | Use when template and target have the same scale |

- `ratio`: *double*

    The distance ratio for KNN matching, in [0, 1.0]; larger values make the matching more lenient (easier to match). Optional, default is 0.6.
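For instance, a hedged sketch (the image name is hypothetical) that tolerates the target being rendered at a different size than the template:

```json
{
    "FindLogo": {
        "recognition": "FeatureMatch",
        "template": "logo.png",  // hypothetical template with rich texture
        "count": 8,              // require at least 8 matched feature points
        "detector": "SIFT",
        "action": "Click"
    }
}
```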
### ColorMatch

Color matching, also known as "find color."

This algorithm requires the following additional fields:

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`.

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.

- `method`: *int*

    Color matching method, equivalent to cv::ColorConversionCodes. Optional, default is 4 (RGB).
    Common values are 4 (RGB, 3 channels), 40 (HSV, 3 channels), and 6 (GRAY, 1 channel).
    For more details, refer to the OpenCV official documentation.

- `lower`: *list<int, > | list<list<int, >>*

    Lower bound for colors. Required. The innermost list length should match the number of channels of `method`.

- `upper`: *list<int, > | list<list<int, >>*

    Upper bound for colors. Required. The innermost list length should match the number of channels of `method`.

- `count`: *uint*

    The minimum number of matching pixels required (threshold). Optional, default is 1.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Score` | `Area` | `Random`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.

- `connected`: *bool*

    Whether to count only connected pixels. Optional, default is false.
    If true, after color filtering only the largest connected block of pixels is counted; if false, connectivity is ignored.
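For example, a minimal sketch that matches a mostly red area in the default RGB mode (the bounds and ROI are hypothetical):

```json
{
    "FindRedDot": {
        "recognition": "ColorMatch",
        "roi": [1000, 0, 280, 100],
        "lower": [200, 0, 0],    // per-channel lower bound (R, G, B)
        "upper": [255, 80, 80],  // per-channel upper bound
        "count": 50,             // require at least 50 matching pixels
        "connected": true,       // count only the largest connected block
        "action": "Click"
    }
}
```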
### OCR

Text recognition.

This algorithm requires the following additional fields:

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`.

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.

- `expected`: *string | list<string, >*

    The expected results; supports regular expressions. Required.

- `threshold`: *double*

    Model confidence threshold. Optional, default is 0.3.

- `replace`: *array<string, 2> | list<array<string, 2>>*

    Some text recognition results may be inaccurate, so replacements can be applied to them. Optional.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Area` | `Length` | `Random` | `Expected`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.

- `only_rec`: *bool*

    Whether to run recognition only (skip detection; requires a precise `roi`). Optional, default is false.

- `model`: *string*

    Model folder path, relative to the "model/ocr" folder. Optional, default is empty.
    If empty, the models in the root of the "model/ocr" folder are used. The folder should contain three files: `rec.onnx`, `det.onnx`, and `keys.txt`.
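For example, a hedged sketch (the expected text and replacement pair are hypothetical) that clicks a button labeled "Confirm", first correcting a common misread of the letter o as the digit 0:

```json
{
    "ClickConfirm": {
        "recognition": "OCR",
        "roi": [0, 600, 1280, 120],
        "expected": "Confirm",    // supports regular expressions
        "replace": [["0", "o"]],  // rewrite a misread digit 0 to the letter o before matching
        "action": "Click"
    }
}
```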
### NeuralNetworkClassify

Deep learning classification, used to determine whether the image at a fixed position matches the expected category.

This algorithm requires the following additional fields:

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`.

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.

- `labels`: *list<string, >*

    Labels, i.e., the names of each category. Optional.
    Only affects debug images and logs; if not set, entries are filled with "Unknown."

- `model`: *string*

    Model file path, relative to the "model/classify" folder. Required.
    Currently, only ONNX models are supported. Refer to the NNClassify Recipe.

- `expected`: *int | list<int, >*

    The expected category index(es). Required.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Score` | `Random` | `Expected`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.
For example, suppose you want to recognize whether a cat or a mouse appears at a fixed position in the image, you have trained a three-category model for it, and you want to click when a cat or a mouse is recognized but not when a dog is recognized. The relevant fields would be:

```json
{
    "labels": ["Cat", "Dog", "Mouse"],
    "expected": [0, 2]
}
```

Please note that these values must match the actual model output.
### NeuralNetworkDetect

Deep learning object detection, an advanced version of "find image."

The main difference from classification is the flexibility to find objects at arbitrary positions. However, this usually requires more complex models, more training data, longer training times, and significantly higher resource usage during inference.

This algorithm requires the following additional fields:

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`.

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.

- `labels`: *list<string, >*

    Labels, i.e., the names of each category. Optional.
    Only affects debug images and logs. If not set, labels are read automatically from the model metadata (the `names`, `name`, `labels`, or `class_names` fields), so there is generally no need to set this manually. If the model metadata contains no labels either, entries are filled with "Unknown."

- `model`: *string*

    Model file path, relative to the "model/detect" folder. Required.
    Currently supports ONNX models exported from YOLOv8 and YOLOv11; other YOLO models with the same input/output should theoretically work but are untested.
    Training reference: NNDetect Recipe.

- `expected`: *int | list<int, >*

    The expected category index(es).

- `threshold`: *double | list<double, >*

    Model confidence threshold. Optional, default is 0.3.
    If it's an array, its length should match the length of the `expected` array.

- `order_by`: *string*

    How the results are sorted. Optional, default is `Horizontal`.
    Possible values: `Horizontal` | `Vertical` | `Score` | `Area` | `Random` | `Expected`.
    Can be used together with the `index` field.

- `index`: *int*

    Index of the result to hit. Optional, default is 0.
    If there are N results in total, the valid range of `index` is [-N, N - 1]; negative values are converted Python-style to N + `index`. Out-of-range values are treated as no result for the current recognition.
For example, to detect cats, dogs, and mice in an image and click only when a cat or a mouse is detected but not when a dog is detected, the relevant fields would be:

```json
{
    "labels": ["Cat", "Dog", "Mouse"],
    "expected": [0, 2]
}
```

Please note that these values must match the actual model output.
### Custom

Executes the recognition handler registered through the `MaaResourceRegisterCustomRecognition` interface.

This algorithm requires the following additional fields:

- `custom_recognition`: *string*

    Recognition name, the same as the one passed to the registration interface; it is also passed through `MaaCustomRecognitionCallback.custom_recognition_name`. Required.

- `custom_recognition_param`: *any*

    Recognition parameter of any type, passed through `MaaCustomRecognitionCallback.custom_recognition_param`. Optional, default is null.

- `roi`: *array<int, 4> | string*

    Same as `TemplateMatch.roi`; passed through `MaaCustomRecognitionCallback.roi`. Optional, default is [0, 0, 0, 0].

- `roi_offset`: *array<int, 4>*

    Same as `TemplateMatch.roi_offset`.
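For example, a minimal sketch (the recognition name and parameter are hypothetical and must match whatever you registered):

```json
{
    "NodeA": {
        "recognition": "Custom",
        "custom_recognition": "MyReco",             // name passed to MaaResourceRegisterCustomRecognition
        "custom_recognition_param": { "mode": 1 },  // arbitrary JSON forwarded to the callback
        "roi": [100, 100, 300, 300],
        "action": "Click"
    }
}
```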
## Action Types

### DoNothing

Does nothing.

### Click

Clicks.

Additional properties for this action:

- `target`: *true | string | array<int, 2> | array<int, 4>*

    The position of the click target. Optional, default is true.
    - *true*: the target is the position just recognized by this node (i.e., itself).
    - *string*: a node name; uses the position recognized by that previously executed node.
    - *array<int, 2>*: a fixed coordinate point [x, y].
    - *array<int, 4>*: a fixed coordinate area [x, y, w, h]. A point is sampled inside the rectangle, with higher probability near the center and lower probability near the edges. To target the entire screen, set it to [0, 0, 0, 0].

- `target_offset`: *array<int, 4>*

    Additional movement on top of `target` before clicking; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `contact`: *uint*

    Contact id used to distinguish different touch points. Optional, default is 0.
    - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
    - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)
### LongPress

Long press.

Additional properties for this action:

- `target`: *true | string | array<int, 2> | array<int, 4>*

    The position of the long-press target. Optional, default is true. The values are the same as `Click.target`.

- `target_offset`: *array<int, 4>*

    Additional movement on top of `target` before long-pressing; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `duration`: *uint*

    Duration of the long press in milliseconds. Optional, default is 1000.

- `contact`: *uint*

    Contact id used to distinguish different touch points. Optional, default is 0.
    - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
    - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)
### Swipe

Linear swipe.

Additional properties for this action:

- `begin`: *true | string | array<int, 2> | array<int, 4>*

    The starting point of the swipe. Optional, default is true. The values are the same as `Click.target`.

- `begin_offset`: *array<int, 4>*

    Additional movement on top of `begin` before swiping; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `end`: *true | string | array<int, 2> | array<int, 4> | list<true | string | array<int, 2> | array<int, 4>>*

    The end point of the swipe. Optional, default is true. The values are the same as `Click.target`.
    💡 v4.5.x adds list support, which can be used to add swipe waypoints (see the sketch after this list). Unlike multiple swipes, multiple end points do not lift in between, i.e., a single polyline swipe.

- `end_offset`: *array<int, 4> | list<array<int, 4>>*

    Additional movement on top of `end` before swiping; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `duration`: *uint | list<uint, >*

    Duration of the swipe in milliseconds. Optional, default is 200.

- `end_hold`: *uint | list<uint, >*

    Extra time to wait at the end point before lifting, in milliseconds. Optional, default is 0.

- `only_hover`: *bool*

    Only move the mouse cursor, without press/release. Optional, default is false.

- `contact`: *uint*

    Contact id used to distinguish different touch points. Optional, default is 0.
    - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
    - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)
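For example, a hedged sketch of a polyline swipe using the v4.5.x waypoint list (all coordinates are hypothetical):

```json
{
    "DragThroughWaypoints": {
        "action": "Swipe",
        "begin": [100, 600],
        "end": [
            [400, 600],  // waypoint: move right first
            [400, 200]   // final end point; the pointer is not lifted in between
        ],
        "duration": 500,
        "end_hold": 200  // hold 200 ms at the end before lifting
    }
}
```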
### MultiSwipe

Multi-finger linear swipe.

Additional properties for this action:

- `swipes`: *list<object, >*

    The array of swipes. Required.
    The order of the array elements has no effect; timing is determined solely by `starting`.

    Each object in the array supports the following fields:

    - `starting`: *uint*

        Swipe start time, in milliseconds. Optional, default is 0. A `MultiSwipe`-specific field: this swipe presses down at millisecond `starting` within the action.

    - `begin`: *true | string | array<int, 2> | array<int, 4>*

        The starting point of the swipe. Optional, default is true. The values are the same as `Click.target`.

    - `begin_offset`: *array<int, 4>*

        Additional movement on top of `begin` before swiping; the four values are added respectively. Optional, default is [0, 0, 0, 0].

    - `end`: *true | string | array<int, 2> | array<int, 4> | list<true | string | array<int, 2> | array<int, 4>>*

        The end point of the swipe. Optional, default is true. The values are the same as `Click.target`.
        💡 v4.5.x adds list support, which can be used to add swipe waypoints. Unlike multiple swipes, multiple end points do not lift in between, i.e., a single polyline swipe.

    - `end_offset`: *array<int, 4> | list<array<int, 4>>*

        Additional movement on top of `end` before swiping; the four values are added respectively. Optional, default is [0, 0, 0, 0].

    - `duration`: *uint | list<uint, >*

        Duration of the swipe in milliseconds. Optional, default is 200.

    - `end_hold`: *uint | list<uint, >*

        Extra time to wait at the end point before lifting, in milliseconds. Optional, default is 0.

    - `only_hover`: *bool*

        Only move the mouse cursor, without press/release. Optional, default is false.

    - `contact`: *uint*

        Contact id used to distinguish different touch points. Optional, default is 0.
        - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
        - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)

Note: in `MultiSwipe`, if `contact` is 0, the swipe's index in the array is used as the contact id.
For example:

```json
{
    "A": {
        "action": "MultiSwipe",
        "swipes": [
            {
                "begin": [],
                "end": []
            },
            {
                "starting": 500,
                "begin": [],
                "end": []
            }
        ]
    }
}
```

### TouchDown
Press a touch contact.

Additional properties for this action:

- `contact`: *uint*

    Contact id used to distinguish different touch points. Optional, default is 0.
    - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
    - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)

- `target`: *true | string | array<int, 4>*

    Touch target position. Optional, default is true. Same semantics as `Click.target`.

- `target_offset`: *array<int, 4>*

    Offset applied on top of `target`. Optional, default is [0, 0, 0, 0].

- `pressure`: *int*

    Touch pressure. Optional, default is 0. The actual range depends on the controller implementation.
### TouchMove

Move an existing touch contact. The fields are identical to `TouchDown` and update the contact's position.

### TouchUp

Release a touch contact.

Additional properties for this action:

- `contact`: *uint*

    Contact id used to distinguish different touch points. Optional, default is 0.
    - Adb Controller: finger id (0 for the first finger, 1 for the second, etc.)
    - Win32 Controller: mouse button id (0 left, 1 right, 2 middle, 3 XBUTTON1, 4 XBUTTON2)
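For instance, a hedged sketch of a manual drag composed from the three primitives and chained via `next` (coordinates are hypothetical; each node relies on the default `DirectHit` recognition):

```json
{
    "PressPiece": {
        "action": "TouchDown",
        "target": [200, 500, 10, 10],  // press inside a small area
        "next": ["MovePiece"]
    },
    "MovePiece": {
        "action": "TouchMove",
        "target": [600, 500, 10, 10],  // drag the contact to the right
        "pre_delay": 500,              // pause before moving
        "next": ["ReleasePiece"]
    },
    "ReleasePiece": {
        "action": "TouchUp"            // release contact 0
    }
}
```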
### Scroll

Mouse wheel scroll.

Additional properties for this action:

- `dx`: *int*

    Horizontal scroll distance; positive values scroll right, negative values scroll left. Optional, default is 0.

- `dy`: *int*

    Vertical scroll distance; positive values scroll down, negative values scroll up. Optional, default is 0.

> [!NOTE]
>
> - The Adb Controller does not support scroll operations; only the Win32 Controller is supported.
> - The `dx` / `dy` values are sent directly as scroll increments. The Windows standard wheel increment is 120 (WHEEL_DELTA) per notch; using multiples of 120 is recommended for best compatibility.
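For example, a minimal sketch that scrolls down two wheel notches on a Win32 controller:

```json
{
    "ScrollDown": {
        "action": "Scroll",
        "dy": 240  // 2 × WHEEL_DELTA (120); positive scrolls down
    }
}
```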
### ClickKey

Clicks (presses and releases) a key.

Additional properties for this action:

- `key`: *int | list<int, >*

    The key(s) to click; only virtual key codes of the corresponding controller are supported. Required.
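For example, a minimal sketch that presses the Android BACK key on an Adb controller (Android keycode 4 is AKEYCODE_BACK):

```json
{
    "PressBack": {
        "action": "ClickKey",
        "key": 4  // Android keycode: AKEYCODE_BACK
    }
}
```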
### LongPressKey

Long-presses a key.

Additional properties for this action:

- `key`: *int | list<int, >*

    The key(s) to press; only virtual key codes of the corresponding controller are supported. Required.

- `duration`: *uint*

    Duration of the long press in milliseconds. Optional, default is 1000.
### KeyDown

Presses a key without releasing it. Combine with KeyUp to build custom key sequences.

Additional properties for this action:

- `key`: *int*

    The key to press; only virtual key codes of the corresponding controller are supported. Required.

### KeyUp

Releases a key. Typically follows a KeyDown to end the key-press state.

Additional properties for this action:

- `key`: *int*

    The key to release; only virtual key codes of the corresponding controller are supported. Required.
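For example, a hedged sketch of a Shift-modified key press on a Win32 controller, chained via `next` (Win32 virtual key codes: 16 is VK_SHIFT, 65 is 'A'):

```json
{
    "HoldShift": {
        "action": "KeyDown",
        "key": 16,  // VK_SHIFT
        "next": ["TypeA"]
    },
    "TypeA": {
        "action": "ClickKey",
        "key": 65,  // virtual key code for 'A'
        "next": ["ReleaseShift"]
    },
    "ReleaseShift": {
        "action": "KeyUp",
        "key": 16   // release VK_SHIFT
    }
}
```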
### InputText

Inputs text.

Additional properties for this action:

- `input_text`: *string*

    The text to input; some controllers only support ASCII. Required.
### StartApp

Starts an app.

Additional properties for this action:

- `package`: *string*

    Launch entry. Required.
    Enter the package name or activity, for example, `com.hypergryph.arknights` or `com.hypergryph.arknights/com.u8.sdk.U8UnityContext`.

### StopApp

Closes an app.

Additional properties for this action:

- `package`: *string*

    The app to close. Required.
    Enter the package name, for example, `com.hypergryph.arknights`.
### StopTask

Stops the current task chain (the individual task chain passed to MaaTaskerPostTask).
### Command

Executes a command.

This action requires the following additional fields:

- `exec`: *string*

    Path of the program to execute. Required.

- `args`: *list<string, >*

    The arguments to pass. Optional.
    Supports runtime parameter replacement:
    - `{ENTRY}`: entry name.
    - `{NODE}`: node name.
    - `{IMAGE}`: path to a file holding the screenshot. The file is deleted before the process exits; copy it elsewhere if you want to keep it.
    - `{BOX}`: the recognized target, in the format [x, y, w, h].
    - `{RESOURCE_DIR}`: path of the most recently loaded resource folder.
    - `{LIBRARY_DIR}`: path of the folder containing the MaaFW library.

- `detach`: *bool*

    Detach the child process, i.e., do not wait for the child process to finish and continue directly to the next task. Optional, default is false.
Example:

```json
{
    "NodeA": {
        "action": "Command",
        "exec": "Python",
        "args": [
            "{RESOURCE_DIR}/my_script/test.py",
            "Haha",
            "{IMAGE}",
            "{NODE}",
            "{BOX}"
        ]
    },
    "NodeB": {
        "action": "Command",
        "exec": "{RESOURCE_DIR}/my_exec/my_exec.exe"
    }
}
```

The actual commands are:

```bash
# NodeA
Python C:/MaaXXX/resource/my_script/test.py Haha C:/temp/123.png NodeA [0,0,0,0]

# NodeB
C:/MaaXXX/resource/my_exec/my_exec.exe
```

### Shell
Executes a shell command on an ADB device.

Additional properties for this action:

- `cmd`: *string*

    The shell command to execute. Required.
    For example, `getprop ro.build.version.sdk` or `settings put global animator_duration_scale 0`.

> [!NOTE]
>
> This action is only valid for ADB controllers. The command output can be retrieved via `MaaTaskerGetActionDetail` in the action detail.
### Custom

Executes the action handler registered through the `MaaResourceRegisterCustomAction` interface.

This action requires the following additional fields:

- `custom_action`: *string*

    Action name, the same as the one passed to the registration interface; it is also passed through `MaaCustomActionCallback.custom_action_name`. Required.

- `custom_action_param`: *any*

    Action parameter of any type, passed through `MaaCustomActionCallback.custom_action_param`. Optional, default is null.

- `target`: *true | string | array<int, 2> | array<int, 4>*

    The position of the target, passed through `MaaCustomActionCallback.box`. Optional, default is true. Same as `Click.target`.

- `target_offset`: *array<int, 4>*

    Same as `Click.target_offset`.
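For example, a minimal sketch (the action name and parameter are hypothetical and must match whatever you registered):

```json
{
    "NodeA": {
        "recognition": "OCR",
        "expected": "Battle",
        "action": "Custom",
        "custom_action": "MyAction",            // name passed to MaaResourceRegisterCustomAction
        "custom_action_param": { "speed": 2 },  // arbitrary JSON forwarded to the callback
        "target": true                          // the recognized box is passed to the callback
    }
}
```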
## Node Attributes

Node attributes allow you to specify additional behavioral control parameters for individual entries in `next` and `on_error` lists. Since v5.1, node attributes can be set using two syntax forms.

### Basic Syntax

By default, node lists contain only node names:

```json
{
    "A": {
        "next": [
            "B",
            "C",
            "D"
        ]
    }
}
```

### Attribute Setting Methods
#### Object Form

Use a NodeAttr object to explicitly specify the node name and attributes:

```json
{
    "A": {
        "next": [
            "B",
            {
                "name": "C",
                "jump_back": true
            },
            "D"
        ]
    }
}
```

#### Prefix Form
Use a square-bracket prefix to specify attributes directly before the node name:

```json
{
    "A": {
        "next": [
            "B",
            "[JumpBack]C",
            "D"
        ]
    }
}
```

Both forms are functionally equivalent; choose based on your preference. The two forms can also be mixed within one array, producing a heterogeneous array.
### Available Attributes

#### jump_back / [JumpBack]

Type: *boolean*
Default: false

Description: enables the jump-back mechanism. When this node is matched, then after its entire subsequent node chain has finished executing, control returns to the parent node, which continues attempting to recognize its `next` list.

Execution flow:

1. The parent node recognizes the nodes in its `next` list sequentially
2. If a node carrying the `jump_back` attribute is matched, that node and its subsequent node chain are executed
3. After the node chain finishes, control returns to the parent node
4. The parent node resumes recognition from the beginning of its `next` list
Example:

```json
{
    "A": {
        "next": [
            "B",
            "C",
            "[JumpBack]D",
            "E"
        ]
    }
}
```

Execution flow:
1. Node A attempts to recognize B, C, D, E in sequence
2. If D is matched (and B and C are not), D and its subsequent node chain are executed
3. After D's node chain completes, control returns to node A
4. Node A restarts recognition from B, again attempting B, C, D, E

Use Cases:

Suitable for exception-handling scenarios, such as recognizing and dismissing network-disconnection prompts, permission dialogs, and other transient interfaces. After handling, the original flow resumes.

Compatibility Note:

Similar to the deprecated `is_sub` field, but `jump_back` applies only to the specific entry in the current `next` list rather than to the entire node definition.
#### anchor / [Anchor]

Type: *boolean*
Default: false

Description: enables an anchor reference. When this attribute is true, the `name` field is treated as an anchor name rather than a node name and resolves at runtime to the last node that set that anchor.

Usage:

1. First, set an anchor name on a node using the `anchor` field
2. Then reference the anchor in `next` or `on_error` using `[Anchor]` or `anchor: true`
Example:

```json
{
    "A": {
        "anchor": "X",
        "next": ["C"]
    },
    "B": {
        "anchor": "X",
        "next": ["C"]
    },
    "C": {
        "next": [
            "D",
            "[Anchor]X"
        ]
    }
}
```

Equivalent object form:
```json
{
    "C": {
        "next": [
            "D",
            { "name": "X", "anchor": true }
        ]
    }
}
```

Use Cases:
Suitable for scenarios requiring dynamic node references, such as when multiple nodes may handle the same type of event and you need to return to "the last node that handled this event" to continue execution.
### Future Extensions

More node attributes are planned and will be introduced in future versions.
## Result Sorting Methods

Recognition algorithms may return multiple results. Use the `order_by` field to specify the sorting method, combined with the `index` field to select a specific result, as in the sketch below.
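For example, a hedged sketch (the template name is hypothetical) that clicks the second-leftmost of all template matches:

```json
{
    "ClickSecondIcon": {
        "recognition": "TemplateMatch",
        "template": "icon.png",    // hypothetical; may match several places on screen
        "order_by": "Horizontal",  // sort results left to right
        "index": 1,                // pick the second result (0-based)
        "action": "Click"
    }
}
```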
### Horizontal

Sort horizontally (left to right, then top to bottom within the same column).

Sorting rule: sort by x coordinate ascending first, then by y coordinate ascending when x is equal.

```text
| 1 3 5 7 |
| 2 4 6 8 |
```

### Vertical

Sort vertically (top to bottom, then left to right within the same row).

Sorting rule: sort by y coordinate ascending first, then by x coordinate ascending when y is equal.

```text
| 1 2 3 4 |
| 5 6 7 8 |
```

### Score
Sort by matching score in descending order. Higher scores are ranked first.
Suitable for scenarios where you want to prioritize the highest confidence result.
### Area

Sort by bounding-box area in descending order; larger areas rank first.

Applicable to the FeatureMatch, ColorMatch, OCR, and NeuralNetworkDetect algorithms.

### Length

Sort by recognized text length; longer text ranks first.

Only applicable to the OCR algorithm.

### Random

Randomly shuffle the result order.

Suitable for scenarios requiring random selection.

### Expected

Sort by the order specified in the `expected` field.

For OCR, results are sorted by the order of the regex patterns in the `expected` list; results matching earlier patterns rank first.

For NeuralNetworkClassify and NeuralNetworkDetect, results are sorted by the order of the class indices in the `expected` list.

Results not matching any expected value rank last.
## Waiting for the Screen to Stabilize

Waits for the screen to stabilize: this step exits only when the screen has shown no significant change for a given continuous time.

The field value can be a uint or an object. For example:

```json
{
    "NodeA": {
        "pre_wait_freezes": 500
    },
    "NodeB": {
        "post_wait_freezes": {
            // more properties ...
        }
    }
}
```

If the value is an object, the following additional fields can be set:
- `time`: *uint*

    The wait ends only when the screen has shown no significant change for `time` milliseconds in a row. Optional, default is 1.

- `target`: *true | string | array<int, 2> | array<int, 4>*

    The position of the target to wait on. Optional, default is true. The values are the same as `Click.target`.

- `target_offset`: *array<int, 4>*

    Additional movement on top of `target`, used as the waiting target; the four values are added respectively. Optional, default is [0, 0, 0, 0].

- `threshold`: *double*

    The template matching threshold used to determine "no significant change." Optional, default is 0.95.

- `method`: *int*

    The template matching algorithm used to determine "no significant change," i.e., cv::TemplateMatchModes. Optional, default is 5. Same as `TemplateMatch.method`.

- `rate_limit`: *uint*

    Recognition rate limit, in milliseconds. Optional, default is 1000.
    Each comparison takes at least `rate_limit` milliseconds; if it finishes sooner, the remaining time is spent sleeping.

- `timeout`: *uint*

    Timeout for the recognition, in milliseconds. Optional, default is 20,000 (20 seconds).
## Node Notifications

By configuring the `focus` field, the UI can display custom messages to users at specific stages of node execution.

### Configuration

Add a `focus` field to the node, where keys are message types and values are the template strings to display. Templates can use `{field_name}` placeholders, which the UI automatically replaces with actual values.

```json
{
    "NodeA": {
        "focus": {
            "Node.Recognition.Succeeded": "{name} recognition hit, ready to start execution",
            "Node.Action.Starting": "{name} starts execution, task ID: {task_id}"
        }
    }
}
```

### Available Message Types
| Message Type | Trigger | Available Placeholders |
|---|---|---|
| `Node.Recognition.Starting` | Recognition starts | `task_id`, `reco_id`, `name` |
| `Node.Recognition.Succeeded` | Recognition succeeded | `task_id`, `reco_id`, `name` |
| `Node.Recognition.Failed` | Recognition failed | `task_id`, `reco_id`, `name` |
| `Node.Action.Starting` | Action starts | `task_id`, `action_id`, `name` |
| `Node.Action.Succeeded` | Action succeeded | `task_id`, `action_id`, `name` |
| `Node.Action.Failed` | Action failed | `task_id`, `action_id`, `name` |
### More Message Types

For a complete list of callback messages and detailed descriptions, please refer to the Callback Protocol and MaaMsg.h.
