Tensegrity robots combine rigid rods and elastic cables, offering high resilience and deployability while posing major challenges for locomotion control due to their underactuated and highly coupled dynamics. This paper introduces a morphology-aware reinforcement learning framework that integrates a graph neural network (GNN) into the Soft Actor-Critic (SAC) algorithm.
By representing the robot's physical topology as a graph, the proposed GNN-based policy captures coupling among components, enabling faster and more stable learning than conventional multilayer perceptron (MLP) policies. The method is validated on a physical 3-bar tensegrity robot across three locomotion primitives: straight-line tracking, clockwise turning, and counterclockwise turning.
The learned policies show improved sample efficiency, robustness to noise and stiffness variations, and better trajectory accuracy. They also transfer directly from simulation to hardware without fine-tuning, achieving stable real-world locomotion.
Contribution 1
Morphology-aware GNN Actor
We design a GNN-based actor that encodes the physical topology of tensegrity robots and captures the intrinsic coupling between rods, tendons, and structural components.
Contribution 2
Efficient and High-Performance GNN-SAC
We integrate the proposed GNN actor into the Soft Actor-Critic framework and demonstrate substantial gains in both training efficiency and final task performance compared with conventional MLP baselines.
Contribution 3
Hardware Sim-to-Real Transfer
We validate the learned locomotion primitives on hardware, including straight-line tracking and clockwise/counterclockwise turning, with direct transfer from simulation to the physical robot without fine-tuning.
Tensegrity robots are not independent-joint systems. Their locomotion emerges from global tension, structural coupling, and morphology.
Global tensional coupling
Every rod and cable contributes to the overall equilibrium. A local actuation can influence the entire structure, making the dynamics strongly interdependent.
Natural graph structure
The network of tensile and compressive elements naturally forms a graph, making tensegrity systems suitable for graph-based learning architectures.
Morphology-aware control
A GNN policy mirrors the physical topology of the robot, allowing the controller to learn coordinated behaviors that respect structural coupling.
The framework converts robot morphology into a graph, learns structural interactions through message passing, and trains the resulting policy with SAC.
Step 1
Graph Construction
The physical topology of the 3-bar tensegrity robot is modeled as a directed graph G = (V, E). Nodes represent rod end-caps, while edges represent rigid rods, passive tendons, and active tendons.
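As a rough illustration of this step, the sketch below builds a directed, typed edge list for a 3-bar prism. The node numbering, the choice of which tendons are active, and the function name `build_three_bar_graph` are assumptions made for the example and are not taken from the paper.

```python
# Edge-type labels (hypothetical encoding).
ROD, PASSIVE_TENDON, ACTIVE_TENDON = 0, 1, 2


def build_three_bar_graph():
    """Return (nodes, edges) for a 3-bar prism tensegrity.

    Nodes 0-2 are the bottom end-caps and 3-5 the top end-caps
    (assumed numbering). Each physical element becomes two directed
    edges so that messages can flow both ways.
    """
    nodes = list(range(6))

    # Each rod joins a bottom end-cap to a top end-cap of the twisted prism.
    rods = [(i, 3 + (i + 1) % 3) for i in range(3)]
    # Tendons: bottom triangle, top triangle, and three side tendons.
    bottom = [(i, (i + 1) % 3) for i in range(3)]
    top = [(3 + i, 3 + (i + 1) % 3) for i in range(3)]
    side = [(i, 3 + i) for i in range(3)]

    # Assumption: triangle tendons are actuated, side tendons are passive.
    undirected = (
        [(u, v, ROD) for u, v in rods]
        + [(u, v, ACTIVE_TENDON) for u, v in bottom + top]
        + [(u, v, PASSIVE_TENDON) for u, v in side]
    )

    edges = []
    for u, v, edge_type in undirected:
        edges.append((u, v, edge_type))
        edges.append((v, u, edge_type))
    return nodes, edges
```

With this layout every end-cap touches one rod and three tendons, so each node receives four incoming edges.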
Step 2
GNN-based SAC
Node and edge features are processed through message passing. The GNN actor extracts a morphology-aware representation and outputs tendon length commands for actuation.
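A minimal sketch of one message-passing round and a per-tendon readout follows. It is a generic sum-aggregation layer with one message matrix per edge type, not the architecture used in the paper. All function names, weight shapes, and the readout rule are assumptions. The readout returns only a deterministic value in (-1, 1) for each active tendon; a SAC actor would also output a log standard deviation and sample from the resulting distribution.

```python
import math
import random


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def random_matrix(rows, cols, rng):
    """Small random weights, standing in for learned parameters."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def message_passing_round(h, edges, W_msg, W_upd):
    """One round of sum-aggregation message passing.

    h:     dict node -> feature vector of length d
    edges: list of (src, dst, edge_type) directed edges
    W_msg: dict edge_type -> d x d message matrix (shared per type)
    W_upd: d x 2d update matrix (shared by all nodes)
    """
    d = len(next(iter(h.values())))
    agg = {v: [0.0] * d for v in h}
    for src, dst, edge_type in edges:
        msg = matvec(W_msg[edge_type], h[src])
        agg[dst] = [a + m for a, m in zip(agg[dst], msg)]
    # Update each node from its own state concatenated with its messages.
    return {v: [math.tanh(z) for z in matvec(W_upd, h[v] + agg[v])] for v in h}


def tendon_commands(h, active_edges, w_out):
    """One squashed command per active tendon from its two end-cap embeddings."""
    return [
        math.tanh(sum(w * x for w, x in zip(w_out, h[u] + h[v])))
        for u, v in active_edges
    ]
```

Because the message and update weights are shared across nodes and edge types, the parameter count does not depend on how many rods or tendons the graph contains.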
Step 3
Training and Evaluation
Policies are trained on straight-line tracking and bidirectional turning, then compared against MLP-based SAC and other reinforcement learning baselines.
We evaluate the proposed GNN-SAC policy across learning efficiency, motion primitive quality, robustness, cross-morphology generalization, and zero-shot sim-to-real transfer on a physical 3-bar tensegrity robot.
Learning
Efficient Learning and Better Motion Performance
The proposed GNN-SAC policy improves both training efficiency and final task performance, achieving more accurate tracking and more stable turning than MLP-based baselines.
Robustness
Robustness and Generalization
The learned policy remains effective under stiffness variation, observation noise, sloped terrain, and actuation errors, and further generalizes across robot morphologies.
Composition
Composable Motion Primitives
Straight-line tracking and bidirectional turning primitives can be sequenced by a high-level planner to follow complex waypoint trajectories such as infinity, spiral, and flower paths.
Hardware
Zero-shot Sim-to-Real Transfer
Policies trained only in simulation transfer directly to the physical 3-bar tensegrity robot without fine-tuning, demonstrating stable real-world locomotion.
Result 1 / Training
Key Findings
Result 2 / Locomotion
Primitive Quality
We evaluate straight-line tracking and bidirectional turning under randomized initial poses.
Result 3 / Robustness
The policy remains effective under stiffness variation (including dynamic variation), observation noise, inclined terrain, and actuation error. Across these perturbations, the GNN-based policy (G-SAC) maintains smoother and more stable behavior than the MLP-based baseline (M-SAC).
The same policy trained on the 3-bar robot can also be deployed to 4-bar and 5-bar tensegrity configurations without retraining, demonstrating zero-shot generalization across morphologies.
Result 4 / Generalization
A straight-line tracking policy trained on the 3-bar prism tensegrity robot is directly deployed to 4-bar and 5-bar configurations without retraining. Each subplot aggregates 100 trials.
While accuracy degrades as the structural difference from the training morphology increases, the locomotion remains functional.
Conventional MLP-based policies cannot be directly applied in this setting, because their fixed input and output dimensions depend on the specific robot configuration.
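The dimension argument can be made concrete with a toy comparison. In the sketch below, a dense layer sized for the flattened 3-bar observation rejects a 4-bar observation, while a weight vector shared across nodes applies to any number of end-caps. The feature sizes and all function names are placeholders chosen for illustration, not values from the paper.

```python
def node_features(n_bars, feats_per_node=2):
    """Placeholder features for the 2 * n_bars end-caps of an n-bar robot."""
    return [[0.1 * i] * feats_per_node for i in range(2 * n_bars)]


def flatten(node_feats):
    """Concatenate per-node features into one observation vector."""
    return [x for feats in node_feats for x in feats]


def mlp_layer(W, obs):
    """Dense layer over a flattened observation; W has one column per input."""
    if any(len(row) != len(obs) for row in W):
        raise ValueError(f"layer expects {len(W[0])} inputs, got {len(obs)}")
    return [sum(w * x for w, x in zip(row, obs)) for row in W]


def shared_node_layer(w, node_feats):
    """Apply the same weight vector to every node, whatever the node count."""
    return [sum(wi * xi for wi, xi in zip(w, feats)) for feats in node_feats]


# A layer sized for the 3-bar robot: 6 end-caps x 2 features = 12 inputs.
W_three_bar = [[0.0] * 12 for _ in range(4)]
w_shared = [1.0, -1.0]
```

Calling `mlp_layer(W_three_bar, flatten(node_features(4)))` raises a `ValueError` because the 4-bar observation has 16 entries, whereas `shared_node_layer(w_shared, node_features(n))` runs for any `n`.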
Result 5 / Composition
From Motion Primitives to Trajectories
Straight-line tracking and bidirectional turning can be sequenced by a high-level planner to follow complex waypoint trajectories. The robot follows infinity-shaped, spiral, and six-petal flower paths by composing learned primitives.
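One simple way such a planner could choose among the three primitives is a heading-error rule, sketched below. The rule, the 15-degree tolerance, and the primitive names are assumptions for illustration; the page does not describe the planner's actual logic.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))


def select_primitive(x, y, heading, waypoint, heading_tol=math.radians(15)):
    """Pick a primitive from the robot pose and the next waypoint.

    heading is measured counterclockwise from the +x axis, in radians.
    Returns "straight", "turn_ccw", or "turn_cw" (hypothetical names).
    """
    wx, wy = waypoint
    bearing = math.atan2(wy - y, wx - x)
    heading_error = wrap_angle(bearing - heading)
    if heading_error > heading_tol:
        return "turn_ccw"  # waypoint lies to the left
    if heading_error < -heading_tol:
        return "turn_cw"  # waypoint lies to the right
    return "straight"
```

Calling this function at each decision step with the next waypoint of a path yields a sequence of primitives; a waypoint-reached test and the path generation itself would be additional pieces.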
Result 6 / Sim-to-Real
Policies trained only in simulation are deployed on the physical 3-bar tensegrity robot without fine-tuning. The robot successfully executes straight-line tracking, clockwise turning, and counterclockwise turning.
In hardware experiments, the robot achieves stable coordinated rolling, with a mean forward tracking speed of 0.256 m/s, a counterclockwise turning rate of 2.76°/s, and a clockwise turning rate of 1.72°/s.
@article{zhang2026morphologyawaregraphreinforcementlearning,
title={Morphology-Aware Graph Reinforcement Learning for Tensegrity Robot Locomotion},
author={Chi Zhang and Mingrui Li and Wenzhe Tong and Xiaonan Huang},
journal={IEEE Robotics and Automation Letters},
year={2026},
url={https://tensegrity-graph-rl.github.io}
}