OpenJiuwen Unveils MANGO: A Data-Driven Framework for Optimizing Multi-Agent Flow Networks
Unlocking the Potential of Agent Swarms: openJiuwen's New Masterpiece in Multi-Agent Flow Networks
June 8, 2026
May 13, 2026
While multi-agent systems hold immense promise for solving complex problems, their architecture is inherently prone to error propagation. Errors arising from suboptimal workflow generation or from hallucinations by individual agents can cascade along the collaboration chain, severely impacting final results. From early multi-agent frameworks like CAMEL, AutoGen, and MetaGPT, which relied heavily on manual configuration, to automated workflow generation systems such as ADAS, AFlow, AgentSquare, and AgentSwift, the technology is transitioning from "human-designed" to "autonomous optimization." However, existing methods are often limited by heuristic strategy searches, which cap performance at what pre-defined rules allow. The key challenge remains: how can agents autonomously discover superior collaboration patterns to push the boundaries of intelligence?
To address this, researchers from OpenJiuwen have proposed the MANGO (Multi-Agent Network Gradient Optimization) framework. Supported by the AgentOS unified execution and scheduling backbone, MANGO falls within the research category of end-to-end collaborative optimization. It holistically models system structure, task decomposition, and path selection, jointly optimizing collaboration paths and execution strategies at the workflow level to enhance system stability and efficiency.
Core Features:
End-to-End Reinforcement Learning: Ensures the attainment of global objectives.
Textual Gradient Updates: Allows local nodes to flexibly adapt to dynamic tasks.
Node Skipping Mechanism: Significantly reduces computational overhead while maintaining accuracy.
Core Methodology
MANGO adopts a data-driven strategy, utilizing historical experience and process-supervised Reinforcement Learning to dynamically learn workflow structures. It integrates local gradient signals into textual gradients for backpropagation, enabling continuous iterative optimization. The framework consists of four key components:
Flow Network Construction: Actions from workflows are iteratively inserted into the network. To maintain transition integrity, adjacent operations are prevented from residing in the same node. New nodes are created based on vector similarity thresholds, with each node assigned a distinct large language model (LLM), forming a multi-agent ecosystem.
Reinforcement Learning (RL) for Edge Optimization:
Given a constructed flow network, MANGO treats edge selection as a Markov Decision Process (MDP).
State: Vector similarity between the current node's problem content and role description and those of its neighboring nodes.
Action: Selecting an outgoing edge (next agent).
Reward: A composite metric balancing process correctness and final task performance.
Policy: Optimized using the REINFORCE algorithm to maximize cumulative rewards.
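The edge-selection loop above can be illustrated with a minimal sketch. It is not MANGO's implementation: the toy network, the node names, the reward weights, and the tabular softmax policy (one learnable preference per edge, instead of a state built from vector similarities) are all assumptions made for illustration. Only the overall shape follows the description: sample a path, score it with a composite of process and final-outcome rewards, and apply a REINFORCE update.

```python
import math
import random

# Hypothetical toy flow network: node -> candidate next nodes ("END" is terminal).
EDGES = {
    "plan": ["code", "guess"],
    "code": ["test", "END"],
    "guess": ["END"],
    "test": ["END"],
}
CHECKPOINTS = {"code", "test"}  # assumed "process-correct" intermediate steps


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def composite_reward(path, alpha=0.5):
    """Assumed reward: alpha * process score + (1 - alpha) * final score."""
    process = len(CHECKPOINTS & set(path)) / len(CHECKPOINTS)
    final = 1.0 if path[-1] == "test" else 0.0
    return alpha * process + (1 - alpha) * final


def sample_path(theta, rng, start="plan"):
    """Roll out one episode; record (node, chosen edge index, probabilities)."""
    path, steps, node = [start], [], start
    while node != "END":
        probs = softmax(theta[node])
        idx = rng.choices(range(len(probs)), weights=probs)[0]
        steps.append((node, idx, probs))
        node = EDGES[node][idx]
        if node != "END":
            path.append(node)
    return path, steps


def train(episodes=2000, lr=0.1, seed=0):
    rng = random.Random(seed)
    theta = {n: [0.0] * len(nxt) for n, nxt in EDGES.items()}
    baseline = 0.0  # moving-average baseline to reduce variance
    for _ in range(episodes):
        path, steps = sample_path(theta, rng)
        reward = composite_reward(path)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        for node, idx, probs in steps:
            for j, p in enumerate(probs):
                # Gradient of log softmax w.r.t. logit j: 1[j == idx] - p_j
                theta[node][j] += lr * advantage * ((1.0 if j == idx else 0.0) - p)
    return theta


def greedy_path(theta, start="plan"):
    """Follow the highest-preference edge from each node until END."""
    path, node = [start], start
    while True:
        best = max(range(len(theta[node])), key=lambda j: theta[node][j])
        node = EDGES[node][best]
        if node == "END":
            return path
        path.append(node)


if __name__ == "__main__":
    print(greedy_path(train()))
```

In this toy setup the only path that earns both the process and the final reward is plan, code, test, so the learned preferences should concentrate on those edges. A policy closer to the article's description would replace the per-edge table with a function of the similarity-based state.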
Textual Gradient for Node Optimization: Each node's prompt (task content and role description) is updated based on both the final task outcome (global signal) and intermediate execution feedback (local signal). This ensures gradient signals remain strong even in long workflow chains, addressing the vanishing gradient problem.
Node Skipping Mechanism: Recognizing that fully optimized nodes yield diminishing returns, MANGO introduces a "Skip-k" mechanism. By selectively skipping updates to well-optimized nodes and reusing proven intermediate steps, the framework reduces computational costs significantly. This creates a symbiotic relationship where parameter updates and path selection influence each other in an optimization loop.
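As a rough illustration of how prompt updates and node skipping might interact, here is a minimal sketch. The article does not define what k counts, so this assumes one plausible reading: a node is skipped once its local score has met a threshold for k consecutive rounds. The `Node` fields, the `threshold` parameter, and `textual_gradient` (a stand-in for an LLM call that rewrites a prompt from feedback) are hypothetical names, not part of MANGO.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    prompt: str
    good_streak: int = 0  # consecutive rounds with a passing local score
    updates: int = 0      # number of prompt rewrites performed


def textual_gradient(prompt, local_feedback, global_feedback):
    """Stand-in for an LLM rewrite step that combines local and global signals."""
    return f"{prompt}\n[revise: local={local_feedback}; global={global_feedback}]"


def optimize_round(nodes, local_scores, global_feedback, k=3, threshold=0.9):
    """Update every node's prompt unless it qualifies for skipping.

    Assumption: a node is treated as well optimized, and skipped, once its
    local score has reached `threshold` for `k` consecutive rounds.
    """
    skipped = []
    for node in nodes:
        score = local_scores[node.name]
        node.good_streak = node.good_streak + 1 if score >= threshold else 0
        if node.good_streak >= k:
            skipped.append(node.name)  # reuse the existing prompt as-is
            continue
        node.prompt = textual_gradient(
            node.prompt, f"score={score:.2f}", global_feedback
        )
        node.updates += 1
    return skipped


if __name__ == "__main__":
    nodes = [Node("parser", "Extract the numbers."), Node("solver", "Solve it.")]
    scores = {"parser": 0.95, "solver": 0.50}
    for round_no in range(1, 6):
        skipped = optimize_round(nodes, scores, "final answer wrong")
        print(round_no, skipped)
```

Under these assumptions the consistently strong node stops receiving rewrites from the third round onward, while the weak node is updated every round, which is where the savings in model calls would come from.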
Experimental Results
The team evaluated MANGO across seven benchmarks, including coding (HumanEval, MBPP), mathematics (MATH500, GSM8K), reading comprehension (DROP), and multidisciplinary QA (MMLU, GPQA-Diamond). Using GPT-4o-mini as the base model, MANGO outperformed state-of-the-art baselines across the board.
Key Results:
Effectiveness: MANGO achieved a 12.8% accuracy improvement over the best baseline (MaAS) on the MATH500 task and a 5.1% F1-score boost over AFlow on DROP.
Efficiency: Leveraging the Skip-3 mechanism, MANGO reduced API costs and runtime significantly. Compared to MaAS, it cut inference time by 47.4% and training time by 41.5% while maintaining the highest accuracy.
The research also includes a demonstration in a financial business scenario, showcasing MANGO's ability to optimize workflow paths and node prompts in real-world applications.
Conclusion
MANGO represents a significant leap in multi-agent systems. By integrating reinforcement learning, textual gradients, and a computationally efficient skipping mechanism, it effectively mitigates error propagation and sets a new standard for autonomous collaboration in AI.