Quick Start
Development Approach
MaaFramework provides three integration approaches to suit different development scenarios:
Approach 1: Pure JSON Low-Code Programming
Applicable Scenarios: Quick start, simple logic implementation
Features:
- No coding experience required
- Automated processes configured through JSON
- Comes with a 🎞️ video tutorial and a ⭐ project template
- Supports drag-and-drop development with the MPE visual editor
```jsonc
{
    "Click Start Button": {
        "recognition": "OCR",           // Text recognition engine
        "expected": "Start",            // Target text
        "action": "Click",              // Execute click action
        "next": ["Click Confirm Icon"]  // Subsequent task chain
    },
    "Click Confirm Icon": {
        "recognition": "TemplateMatch", // Image template matching
        "template": "confirm.png",      // Matching asset path
        "action": "Click"
    }
}
```

Approach 2: JSON + Custom Logic Extension (Recommended)
Features:
- Retains the low-code advantage of JSON; core flows remain visual and easy to edit
- Hosts custom recognition/actions in the Agent process, making it easier to encapsulate advanced logic
- Seamlessly integrates with the ⭐ boilerplate to provide scaffolding and examples
```jsonc
{
    "Click Confirm Icon": {
        "next": ["Custom Processing Module"]
    },
    "Custom Processing Module": {
        "recognition": "Custom",
        "custom_recognition": "MyReco", // Custom recognizer ID
        "action": "Custom",
        "custom_action": "MyAct"        // Custom action ID
    }
}
```

💡 The General UI automatically connects to your Agent process and invokes the registered recognition/action implementations when executing MyReco/MyAct.
```python
# Python pseudo-code example
from maa.agent.agent_server import AgentServer

# Register a custom recognizer
@AgentServer.custom_recognition("MyReco")
class CustomReco:
    def analyze(self, ctx):
        return (10, 10, 100, 100)  # Return your own processed recognition result

# Register a custom action
@AgentServer.custom_action("MyAct")
class CustomAction:
    def run(self, ctx):
        ctx.controller.post_click(100, 10).wait()  # Execute click
        ctx.override_next(["TaskA", "TaskB"])      # Dynamically adjust the task flow

# Start the Agent service
AgentServer.start_up(sock_id)
```

For a complete example, refer to the template commit.
Approach 3: Full-Code Development
NOTE
MaaFramework offers full multi-language APIs, but code-only workflows lose ecosystem tools (visual editor, visual debugger, General UI). In most cases, the custom extensions in Approach 2 already cover advanced requirements without sacrificing those capabilities.
Applicable Scenarios:
- Deep customization requirements
- Implementation of complex business logic
- Need for flexible control over the execution flow
```python
# Python pseudo-code example
def main():
    # Execute the predefined JSON task
    result = tasker.post_task("Click Start Button").wait().get()
    if result.completed:
        # Execute code-level operations
        tasker.controller.post_click(100, 100)
    else:
        # Get the current screenshot
        image = tasker.controller.cached_image
        # Register a custom action
        tasker.resource.register_custom_action("MyAction", MyAction())
        # Execute a mixed task chain
        tasker.post_task("Click Confirm Icon").wait()
```

Resource Preparation
After you confirm your development approach, prepare the corresponding resource files. The example below uses the project boilerplate as a baseline.
TIP
- If you use the project boilerplate, follow the TIP marks below for a ready-made path.
- If you choose full-code development, you still need resource files such as image assets and OCR models; otherwise related image-recognition features will be unavailable.
File Structure Specification
TIP
⭐ If you use the boilerplate, modify the folder directly.
```
my_resource/
├── image/                # Image asset library
│   ├── my_button_ok.png
│   └── my_icon_close.png
├── model/
│   └── ocr/              # Text recognition models
│       ├── det.onnx
│       ├── keys.txt
│       └── rec.onnx
└── pipeline/             # Task pipelines
    ├── my_main.json
    └── my_subflow.json
```

You can modify the names of files and folders starting with "my_", but the others have fixed file names and should not be changed. Here's a breakdown:
Task Pipeline
The files in my_resource/pipeline contain the main script execution logic; MaaFramework recursively reads every JSON file in this directory.
Refer to the Task Pipeline Protocol when writing these files; a simple demo is also available for reference.
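Because every JSON file in the directory is merged into one pipeline, a node in my_main.json can chain to a node defined in my_subflow.json. A minimal sketch (the node names and field values here are illustrative, not from the demo):

```jsonc
// my_main.json
{
    "Open Settings": {
        "recognition": "OCR",
        "expected": "Settings",
        "roi": [0, 0, 1280, 120], // Limit recognition to the top bar
        "action": "Click",
        "next": ["Handle Popup"]  // Defined in my_subflow.json
    }
}
```

```jsonc
// my_subflow.json
{
    "Handle Popup": {
        "recognition": "TemplateMatch",
        "template": "my_icon_close.png", // Asset from my_resource/image/
        "action": "Click"
    }
}
```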
Recommended tools:
- JSON Schema
- VSCode Extension
  - Configure resources based on interface.json
  - Go to task definitions, find task references, rename tasks, auto-complete task names, and click to launch tasks
  - Launch as MaaPiCli
  - Take screenshots and crop images after connecting
- MaaPipelineEditor
  - No-code visual editor; supports drag-and-drop nodes and JSON import/export
Image Files
The files in my_resource/image are primarily used for template matching images, feature detection images, and other images required by the pipeline. They are read based on the template and other fields specified in the pipeline.
Crop your assets from lossless source screenshots scaled to 720p. Unless you're very familiar with MaaFramework's image processing, use the capture tools mentioned above (such as the VSCode extension's screencap and crop support) to obtain them.
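As a sketch of how these assets are referenced, a pipeline node points at an image by its path relative to the image directory (the threshold value below is an illustrative assumption):

```jsonc
{
    "Click OK": {
        "recognition": "TemplateMatch",
        "template": ["my_button_ok.png"], // Relative to my_resource/image/; a list allows multiple candidates
        "threshold": 0.8,                 // Minimum match score to accept
        "action": "Click"
    }
}
```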
Text Recognition Model Files
⭐ If you use the boilerplate, just follow its documentation and run configure.py to deploy the model files automatically.
The files in my_resource/model/ocr are ONNX models obtained from PaddleOCR after conversion.
You can use our pre-converted files: MaaCommonAssets. Choose the language you need and store them according to the directory structure above.
If needed, you can also fine-tune the official pre-trained models of PaddleOCR yourself (please refer to the official PaddleOCR documentation) and convert them to ONNX files for use. You can find conversion commands here.
Debug
After you finish preparing resources, you can start debugging.
NOTE
If you choose full-code development, some tools in this section may not work; consider writing your own debug helpers instead.
Most tools will generate a config/maa_option.json file in the same directory, including:
- logging: Save logs and generate debug/maa.log. Default true.
- save_draw: Save visualized image-recognition results during runtime. Default false.
- stdout_level: Console log level. Default 2 (Error); set 0 to silence logs, or 7 to show all logs.
- save_on_error: Save the current screenshot when a task fails. Default true.
- draw_quality: JPEG quality for visualized image-recognition results (0-100). Default 85.
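Assembled from the defaults above, a freshly generated config/maa_option.json looks roughly like this:

```json
{
    "logging": true,
    "save_draw": false,
    "stdout_level": 2,
    "save_on_error": true,
    "draw_quality": 85
}
```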
If you integrate it yourself, you can enable debugging options through the Toolkit.init_option / MaaToolkitConfigInitOption interface. The generated JSON file is the same as above.
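For the Python binding, a minimal sketch of enabling this yourself (the "./" path argument is an assumption; the config directory is created relative to it):

```python
from maa.toolkit import Toolkit

# Creates/loads config/maa_option.json under the given directory
Toolkit.init_option("./")
```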
Run
NOTE
If you choose full-code development, the UI apps in this chapter may not work; consider writing your own interaction UI.
We define a ProjectInterface protocol to describe the resource files and runtime configuration so General UI can correctly load and run your project.
In short, write an interface.json to tell the General UI where your resources are and which tasks can be executed, so it can run them for you.
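A minimal interface.json sketch (the top-level fields follow the ProjectInterface protocol; the concrete names, types, and paths here are illustrative assumptions):

```jsonc
{
    "controller": [
        { "name": "ADB", "type": "Adb" } // Control an Android device over ADB
    ],
    "resource": [
        { "name": "Default", "path": ["{PROJECT_DIR}/my_resource"] }
    ],
    "task": [
        { "name": "Start Game", "entry": "Click Start Button" } // Entry node from the pipeline
    ]
}
```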
Best-practice references:
Full-Code Development
Please refer to the Integration Documentation and the Integrated Interface Overview.
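To give a flavor of full-code integration, here is a minimal Python setup sketch (method names such as post_bundle and find_adb_devices are assumptions based on the Python binding; treat the Integration Documentation as authoritative):

```python
from maa.controller import AdbController
from maa.resource import Resource
from maa.tasker import Tasker
from maa.toolkit import Toolkit

# Load the pipelines, images, and OCR models prepared earlier
resource = Resource()
resource.post_bundle("./my_resource").wait()

# Connect to the first ADB device the toolkit can find
device = Toolkit.find_adb_devices()[0]
controller = AdbController(device.adb_path, device.address)
controller.post_connection().wait()

# Bind resource and controller, then run a pipeline entry
tasker = Tasker()
tasker.bind(resource, controller)
tasker.post_task("Click Start Button").wait()
```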
Communication
Developers are welcome to join the official QQ group (595990173) for integration and development discussions. The group is reserved for engineering topics; product-usage support is not provided, and off-topic or spam accounts may be removed to keep the channel focused.
