Classical Robotics: The Perception-Planning-Control Pipeline
Classical robotics follows an explicit, hand-engineered pipeline. Perception algorithms (often structured point cloud processing, CAD-model matching, or calibrated stereo vision) produce a geometric scene representation. A planning layer (RRT*, CHOMP, trajectory optimization, or model predictive control) computes a collision-free path to the goal. A control layer (PID, impedance control, or computed torque control) tracks the planned trajectory with tight real-time guarantees. Each stage has clear inputs, outputs, and failure modes that engineers can inspect, debug, and formally verify.
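As a minimal illustration of those explicit interfaces, the three stages can be sketched as plain functions with inspectable inputs and outputs. All names and the toy logic here are hypothetical, not drawn from any particular framework:

```python
import numpy as np

def perceive(point_cloud):
    """Perception: raw sensor data -> geometric scene representation."""
    # Toy stand-in: treat the cloud centroid as the object pose estimate.
    return {"object_pose": point_cloud.mean(axis=0)}

def plan(scene, goal):
    """Planning: scene + goal -> waypoint list (a straight line here)."""
    start = scene["object_pose"]
    return [start + t * (goal - start) for t in np.linspace(0.0, 1.0, 5)]

def control(waypoint, state, kp=2.0):
    """Control: proportional command tracking the current waypoint."""
    return kp * (waypoint - state)

# Each stage's output can be logged, visualized, and unit-tested in isolation
cloud = np.array([[0.1, 0.0, 0.2], [0.3, 0.0, 0.2]])
scene = perceive(cloud)
path = plan(scene, goal=np.array([0.5, 0.5, 0.5]))
cmd = control(path[0], state=np.zeros(3))
```

The point of the sketch is the interface boundaries: a failure can be localized to one stage by inspecting the data passed between them.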
The strengths of this approach are not subtle. Classical controllers operate at 1 kHz control rates with deterministic latency. They provide formal stability guarantees through Lyapunov analysis, safety constraints through control barrier functions, and trajectory optimality through well-understood cost functions. They require no training data. And when they fail, the failure mode is typically interpretable: a perception error, an infeasible plan, or a tracking overshoot. For anyone deploying robots in environments where a regulator will ask "why did the robot do that," classical control provides answers that learned policies cannot.
The limitations are equally clear. Every hand-engineered pipeline is brittle to conditions its designer did not anticipate. A perception module tuned for brushed-metal parts fails on transparent objects. A motion planner optimized for a structured workstation fails in clutter. An impedance controller tuned for rigid grasping fails on deformable objects. Extending classical systems to novel situations requires more engineering, not more data, and the cost scales with the number of edge cases you need to handle.
Classical Tooling Deep Dive: IK, Motion Planning, Force Control
Understanding the specific tools in the classical stack is essential for teams evaluating which components to replace with learning and which to keep.
Inverse Kinematics (IK). Given a desired end-effector pose, IK computes the joint angles that achieve it. Analytical IK solutions exist for 6-DOF robots with specific geometries (most industrial arms) and run in microseconds. For 7-DOF redundant arms (Franka, OpenArm, Kinova Gen3), numerical solvers such as KDL or TRAC-IK compute solutions in 1-10ms, while IKFast generates analytical solvers when the arm geometry permits. IK is reliable, fast, and well-understood -- there is almost never a reason to replace it with a learned component. Even fully learned policies typically output end-effector targets that are converted to joint commands via IK.
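To make the numerical approach concrete, here is a damped least-squares iteration on a toy planar 2-link arm -- a sketch of the math inside solvers like these, with illustrative link lengths, damping, and iteration counts, not production IK:

```python
import numpy as np

L1, L2 = 0.3, 0.25  # link lengths [m], illustrative

def fk(q):
    """Forward kinematics of a planar 2-link arm: joint angles -> (x, y)."""
    return np.array([
        L1 * np.cos(q[0]) + L2 * np.cos(q[0] + q[1]),
        L1 * np.sin(q[0]) + L2 * np.sin(q[0] + q[1]),
    ])

def jacobian(q):
    """Analytical Jacobian d(fk)/dq."""
    s1, s12 = np.sin(q[0]), np.sin(q[0] + q[1])
    c1, c12 = np.cos(q[0]), np.cos(q[0] + q[1])
    return np.array([
        [-L1 * s1 - L2 * s12, -L2 * s12],
        [ L1 * c1 + L2 * c12,  L2 * c12],
    ])

def ik_dls(target, q0, damping=0.1, iters=500, tol=1e-5):
    """Damped least-squares IK: q += J^T (J J^T + lambda^2 I)^-1 * error."""
    q = np.array(q0, dtype=float)
    for _ in range(iters):
        err = target - fk(q)
        if np.linalg.norm(err) < tol:
            break
        J = jacobian(q)
        q = q + J.T @ np.linalg.solve(J @ J.T + damping**2 * np.eye(2), err)
    return q

q = ik_dls(np.array([0.3, 0.3]), q0=[0.5, 0.5])
```

The damping term keeps the iteration well-behaved near kinematic singularities, where an undamped Newton step would explode -- the same trick production numerical solvers rely on.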
Motion Planning. RRT* (the asymptotically optimal variant of the Rapidly-exploring Random Tree), sampling-based successors such as BIT*, and trajectory optimizers such as CHOMP and STOMP compute collision-free trajectories in joint space or task space. Planning time ranges from 50ms for simple environments to 2-5 seconds for cluttered scenes. The key limitation: the planner needs an accurate collision model of the scene, which requires either known object geometry (from a CAD model) or real-time perception. In cluttered, unknown environments, the perception bottleneck makes classical planning brittle. MoveIt2 is the standard open-source motion planning framework, with support for most research arms.
Force Control. Impedance and admittance control regulate the mechanical interaction between the robot and its environment. Impedance control makes the robot behave like a mass-spring-damper system: when external forces act on the end-effector, the robot yields according to the specified stiffness and damping. Admittance control inverts this: the robot reads force-torque sensor data and generates position corrections proportional to the measured force error. These controllers are mathematically elegant, tunable, and provably stable -- but they require accurate knowledge of the robot's dynamics and contact model parameters. For a deeper dive on force sensing integration, see our F/T sensing guide.
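The impedance law itself is compact. A one-dimensional sketch, with illustrative stiffness and damping values chosen for a near-critically-damped response:

```python
def impedance_force(x, v, x_des, v_des=0.0, k=500.0, d=40.0):
    """Mass-spring-damper behavior: F = K (x_d - x) + D (v_d - v).

    k is stiffness [N/m], d is damping [N*s/m]. Lower k makes the robot
    yield more under external contact forces; higher k tracks more stiffly.
    """
    return k * (x_des - x) + d * (v_des - v)

# Simulate a 1-DOF unit mass driven toward a 10 cm target by the impedance
m, dt = 1.0, 0.001
x, v = 0.0, 0.0
for _ in range(2000):
    f = impedance_force(x, v, x_des=0.1)
    v += (f / m) * dt   # semi-implicit Euler integration
    x += v * dt
```

Tuning k and d is exactly the "accurate knowledge of dynamics" requirement in practice: the same gains that track crisply on a rigid part will crush a deformable one.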
The Classical Perception Pipeline. A typical structured perception stack: RGB-D camera captures a point cloud. Plane segmentation removes the table. Euclidean clustering isolates individual objects. Each object cluster is matched against a known CAD model library (using ICP or PPF matching) to estimate 6-DOF pose. This pipeline is fast (10-30ms per frame), accurate for known objects (sub-millimeter pose estimation), and completely fails for objects not in the model library. Every new object requires a new CAD model or a manual calibration step -- the engineering cost that drives teams toward learned perception.
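The plane-segmentation step at the head of that pipeline is straightforward RANSAC. A self-contained numpy sketch on a fabricated synthetic scene (a production stack would use PCL or Open3D for this):

```python
import numpy as np

def segment_plane(points, n_iters=200, threshold=0.01, seed=0):
    """RANSAC plane fit: return (inlier_mask, plane) for the dominant plane.

    plane = (normal, d), with normal . p + d ~= 0 for points on the plane.
    """
    rng = np.random.default_rng(seed)
    best_mask, best_plane = None, None
    for _ in range(n_iters):
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-9:          # degenerate sample (collinear points)
            continue
        normal /= norm
        d = -normal @ sample[0]
        dist = np.abs(points @ normal + d)
        mask = dist < threshold
        if best_mask is None or mask.sum() > best_mask.sum():
            best_mask, best_plane = mask, (normal, d)
    return best_mask, best_plane

# Synthetic scene: a table plane at z = 0 plus a small object above it
rng = np.random.default_rng(1)
table = np.column_stack([rng.uniform(-0.5, 0.5, (500, 2)), np.zeros(500)])
obj = rng.uniform(0.05, 0.15, (50, 3))
cloud = np.vstack([table, obj])
mask, plane = segment_plane(cloud)
objects = cloud[~mask]           # what remains after removing the table
```

Everything after this step in the classical pipeline depends on the model library; the RANSAC itself generalizes, but the CAD matching that follows does not.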
Robot Learning: End-to-End Learned Policies
Robot learning replaces hand-engineered pipelines with data-driven models. In imitation learning, a neural network observes human demonstrations (camera images plus robot joint states) and learns a direct mapping from observation to action. In reinforcement learning, the agent learns through trial and error in simulation or the real world. In the emerging vision-language-action (VLA) paradigm, large pre-trained models take natural language task instructions and visual observations as input and produce motor commands as output.
The defining advantage of learned policies is that they handle perceptual complexity and environmental variation implicitly. A policy trained on 500 demonstrations of "pick up the cup" across 30 different cups, 5 lighting conditions, and varied table positions learns a representation that generalizes to the 31st cup without any explicit feature engineering. The policy does not have a perception module, a planning module, and a control module. It has a single model that maps pixels and proprioception to joint commands, and the relevant abstractions are learned from data rather than designed by engineers.
This capability comes at a cost. Learned policies are opaque: when they fail, diagnosing whether the failure is perceptual, planning-related, or a control execution error is difficult. They require substantial training data, typically hundreds to thousands of demonstrations for imitation learning, or millions of simulation steps for reinforcement learning. They offer no formal safety guarantees. And their behavior can change in unpredictable ways when the deployment environment shifts even slightly from the training distribution.
When Classical Robotics Wins
Precision assembly and machining. Tasks requiring sub-millimeter repeatability in known geometry. CNC machining, semiconductor wafer handling, PCB component insertion, and precision welding all demand tolerances that classical controllers achieve routinely and learned policies cannot guarantee. When the environment is fully specified and the physics well-modeled, classical control is both faster to deploy and more reliable in operation.
Known, structured environments. Automotive assembly lines, pharmaceutical packaging, and logistics sortation systems with controlled lighting, fixed object positions, and predictable physics are the natural domain of classical robotics. The engineering investment to cover all cases is finite and manageable. There is no reason to collect training data when you can write a deterministic controller that handles every situation your robot will encounter.
Safety-critical applications. Surgical robotics, collaborative robots operating near humans, and any deployment requiring regulatory certification benefit from the formal verification tools available to classical control. Control barrier functions, reachability analysis, and worst-case trajectory bounds give classical systems a safety assurance level that learned policies have not yet achieved. The FDA, for example, currently has no pathway for certifying an end-to-end learned surgical control policy.
Low-latency requirements. Applications requiring sub-millisecond control response, such as high-speed pick-and-place, balancing, or contact-sensitive assembly, need the deterministic timing that classical control loops provide. Neural network inference, even on optimized hardware, introduces variable latency that is problematic at control rates above 500 Hz.
When Robot Learning Wins
Unstructured environments. Sorting mixed items in warehouse bins, navigating cluttered homes, operating in kitchens or restaurants where object layouts change constantly. Writing a classical controller for bin picking across thousands of SKU geometries is a never-ending engineering project. Training a learned policy on diverse demonstrations is a data collection project with diminishing but continuous returns.
Dexterous manipulation. Tasks requiring finger-level coordination, deformable object handling, or contact-rich interaction. Folding laundry, tying knots, inserting flexible cables, and food preparation all involve physics that are prohibitively expensive to model analytically. Learned policies that observe the physical outcome of their actions and adapt implicitly through training data handle these tasks far more naturally than any engineered controller.
Generalization across object instances. When your robot needs to pick up any mug, not a specific mug. When your mobile robot needs to navigate any office, not a specific floor plan. When your cooking robot needs to handle any brand of pasta box. The moment your deployment requires handling novel instances within a category, learned representations from diverse training data become essential. Classical perception would need re-engineering for every new object variant.
Tasks that are hard to specify programmatically. "Wipe the table until it looks clean." "Pack the items so nothing shifts during shipping." "Arrange the flowers attractively." These tasks have success criteria that humans evaluate easily but that are difficult to express as mathematical cost functions. Imitation learning sidesteps the specification problem entirely by learning the task implicitly from demonstrations of the desired behavior.
Approach Comparison Table
| Dimension | Classical Robotics | Learned Policies | Hybrid |
|---|---|---|---|
| Control frequency | 1 kHz deterministic | 10-50 Hz variable | 100-1000 Hz (classical inner loop) |
| Novel objects | Requires new models | Generalizes from data | Learned perception + classical plan |
| Safety guarantees | Formal verification available | No formal guarantees | Classical safety envelope |
| Setup time | Weeks-months (engineering) | Days-weeks (data collection) | Weeks (both) |
| Debugging | Inspect each module | Black-box, need ablation | Learned modules harder |
| Deformable objects | Very difficult to model | Learns from demonstrations | Learned contact + classical motion |
| Scaling cost | O(edge cases) engineering | O(data diversity) | Both, but reduced |
Tooling Comparison for Existing Teams
| Tool Category | Classical Stack | Learning Stack |
|---|---|---|
| Middleware | ROS2 Humble/Iron | LeRobot, RoboCasa, robomimic |
| Motion planning | MoveIt2, OMPL, Drake | N/A (end-to-end) |
| Perception | PCL, Open3D, FoundationPose | DINOv2, SigLIP (learned backbone) |
| Simulation | Gazebo, Drake | Isaac Sim, MuJoCo, Genesis |
| Control | ros2_control, impedance/admittance | ACT, Diffusion Policy, VLA inference |
| Languages | C++ (real-time), Python (scripts) | Python (PyTorch), C++ (deployment) |
The Hybrid Approach: Learned Perception + Classical Planning + Learned Control
The most capable deployed robot systems in 2026 are hybrids, and the specific hybrid architecture that has emerged as dominant is worth understanding in detail.
Learned perception layer. A neural network (often a pre-trained vision foundation model like DINOv2 or CLIP, fine-tuned on task-specific data) processes camera images and produces a structured scene representation: object poses, semantic labels, surface normals, grasp candidates. This replaces the brittle hand-engineered perception of classical systems with learned representations that generalize across lighting, textures, and object instances. The perception layer runs at 10-30 Hz and outputs structured data, not raw actions.
Classical planning layer. A model predictive controller (MPC) or sampling-based planner takes the perceived scene state and computes a collision-free, dynamically feasible trajectory to achieve the task goal. This layer operates on the clean geometric representation from the perception module and applies all the safety constraints, joint limits, and optimality criteria that classical planning excels at. Planning runs at 10-50 Hz.
Learned low-level control. For contact-rich tasks, a learned residual policy adjusts the classical controller's commands in real time based on force-torque sensor feedback and visual observations of the contact. This handles the deformable-object and contact-dynamics cases where classical control models break down, while the classical controller provides the overall trajectory structure and safety envelope. The residual policy runs at 100-500 Hz, adding corrections to the classical control output.
This architecture captures the strengths of both paradigms. The learned perception handles visual complexity. The classical planner provides safety guarantees and interpretable behavior. The learned residual controller handles contact dynamics that cannot be modeled analytically. Google DeepMind's manipulation systems, several production-deployed Amazon warehouse robotics cells, and multiple surgical robotics platforms use variants of this architecture in 2026.
Transition Path for Existing Classical Robotics Teams
If your team has a working classical robotics stack and wants to add learning capabilities, here is the recommended incremental path that minimizes risk:
- Phase 1: Replace perception only. Swap your hand-engineered object detection and pose estimation with a learned model (FoundationPose, Grounding DINO, or a fine-tuned DINOv2 detector). Keep your classical planner and controller unchanged. This is the lowest-risk learning introduction and typically provides the largest immediate improvement (handles novel objects without CAD models). Timeline: 2-4 weeks of integration work.
- Phase 2: Add learned grasp planning. Replace your analytical grasp planner (if any) with a learned grasp quality predictor (GraspNet, Contact-GraspNet, or AnyGrasp). The learned model proposes grasp candidates scored by predicted success, and your classical planner generates a trajectory to the selected grasp. Timeline: 2-6 weeks.
- Phase 3: Add learned residual control. For contact-rich tasks where your classical impedance controller struggles, train a residual policy that adds corrections to the classical output. Collect 100-200 demonstrations of the contact phase only (not the full task). The residual policy handles the "last centimeter" that classical control cannot model. Timeline: 4-8 weeks including data collection.
- Phase 4: Evaluate end-to-end. Once you have experience with learned components, evaluate whether an end-to-end learned policy (ACT or Diffusion Policy) outperforms your hybrid stack on your specific tasks. For some tasks, the answer will be yes -- particularly tasks with high visual complexity and moderate precision requirements. For precision tasks, the hybrid approach typically continues to win.
SVRC can provide data collection for any of these phases through our data services, and our engineering team advises on hybrid architecture design. The SVRC platform supports both ROS2-based classical workflows and PyTorch-based learning workflows.
A Practical Decision Framework: Five Questions
When starting a new robot application, answer these five questions to determine your approach.
1. Is the environment fully specified and stable? If yes (factory line, clean room, structured warehouse cell), start with classical control. You will deploy faster and with higher reliability than any learned approach. If no (homes, restaurants, unstructured warehouses), you need learning at least in the perception layer.
2. Do you need to handle novel object instances? If the robot will encounter objects it has never seen before, you need a learned perception and possibly a learned policy. Classical perception requires explicit models of every object. If the object set is fixed and known, classical perception is faster to implement and more reliable.
3. Is the task contact-rich or involving deformable objects? If yes, you need learning in the control layer. Classical contact models are inadequate for deformable manipulation, food handling, or textile tasks. A learned residual controller or a fully learned policy trained on contact-rich demonstrations is the practical path.
4. Do you need formal safety guarantees or regulatory certification? If yes, your system architecture must include a classical safety layer, even if other components are learned. Control barrier functions, emergency stop logic, and workspace boundary enforcement should be classical and formally verified. Learned components operate within the safety envelope defined by the classical layer.
5. What is your data budget? Learned policies require demonstrations (hundreds for imitation learning) or simulation environments (for RL). If you have the budget to collect 200-500 high-quality demonstrations of your specific task, imitation learning is practical. If not, classical control or a fine-tuned foundation model with minimal task-specific data is your path. SVRC's data collection services ($2,500 pilot / $8,000 campaign) can help you build the dataset efficiently if learning is the right approach.
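For question 4, the principle that learned outputs run inside a classically enforced envelope can be made concrete in a few lines. A sketch with illustrative workspace bounds -- in a real system the limits come from the formally verified safety specification and would include joint-space and force limits as well:

```python
import numpy as np

# Illustrative limits -- a real envelope is derived from the verified
# safety specification, never from the learned policy itself.
WORKSPACE_MIN = np.array([0.2, -0.4, 0.01])   # meters
WORKSPACE_MAX = np.array([0.8,  0.4, 0.60])
MAX_STEP = 0.01                               # max commanded motion per tick [m]

def safety_envelope(current_pos, commanded_pos):
    """Clamp any (possibly learned) position command into the safe set."""
    # Rate limit: bound per-tick motion regardless of what the policy asks for
    step = np.clip(commanded_pos - current_pos, -MAX_STEP, MAX_STEP)
    # Workspace limit: never command a pose outside the verified boundary
    return np.clip(current_pos + step, WORKSPACE_MIN, WORKSPACE_MAX)

safe = safety_envelope(np.array([0.5, 0.0, 0.3]),
                       np.array([0.9, 0.0, 0.3]))  # policy asks to leave workspace
```

Because the clamp is a few lines of deterministic arithmetic, it can be formally verified and certified independently of whatever learned components sit upstream.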
Learning Approach Taxonomy: Choosing an Algorithm
Within the learning paradigm, the choice of algorithm has dramatic implications for data requirements, compute costs, and deployment characteristics. This taxonomy maps the landscape as of 2026.
| Algorithm | Data Source | Sample Efficiency | Reward Required? | Best Use Case |
|---|---|---|---|---|
| Behavioral Cloning (BC) | 50-500 demos | High | No | Short-horizon tasks with consistent strategy |
| ACT (Action Chunking) | 50-200 demos | High | No | Bimanual tasks, long-horizon with action chunks |
| Diffusion Policy | 200-1000 demos | Medium | No | Multimodal tasks with multiple valid strategies |
| VLA Fine-Tune (Octo/OpenVLA) | 20-200 demos | Very High | No | Novel object generalization, language-conditioned tasks |
| PPO (on-policy RL) | 10M-100M sim steps | Low | Yes (dense preferred) | Locomotion, continuous control with clear reward |
| SAC (off-policy RL) | 1M-50M sim steps | Medium | Yes | Dexterous manipulation in sim, sample-efficient RL |
| GAIL / IRL | 10-50 demos + sim | Medium | Learned from demos | Few demonstrations + good simulator available |
| Model-Based RL (Dreamer, MBPO) | 100K-1M steps | High (for RL) | Yes | Data-limited RL where world model can be learned |
The practical decision for most manipulation teams in 2026: start with ACT or Diffusion Policy (imitation learning), move to VLA fine-tuning if you need generalization across objects or language conditioning, and reserve RL for locomotion or cases where you have an accurate simulator and a clear reward function. GAIL and model-based RL occupy niche roles for now.
Classical Pipeline Failure Modes by Stage
Understanding exactly how classical pipelines fail helps teams identify which stages to replace with learning and which to keep. Each stage in the perception-planning-control pipeline has characteristic failure modes tied to specific environmental conditions.
| Pipeline Stage | Failure Mode | Trigger Condition | Impact |
|---|---|---|---|
| Perception | Object not detected | Novel object geometry, transparent/reflective material | Complete task failure (no target) |
| Perception | Pose estimate off by >5mm | Symmetrical objects, partial occlusion, glare | Grasp misalignment, placement error |
| Planning | No feasible path found | Dense clutter, narrow passages, conflicting constraints | Task abort or timeout |
| Planning | Stale scene model | Dynamic environment, objects moved between perception and execution | Collision with moved objects |
| Control | Tracking overshoot | Aggressive trajectories, under-damped PID gains | Impact damage, position error at target |
| Control | Inadequate contact model | Deformable objects, unknown friction, compliant surfaces | Crush damage, slip, grasp failure |
| Integration | Timing desync between modules | High CPU load, ROS2 DDS congestion, GC pauses | Stale data used for planning, jerky execution |
The pattern is clear: perception failures dominate in unstructured environments, planning failures dominate in cluttered scenes, and control failures dominate in contact-rich tasks. Teams should replace with learning the stage that causes the most failures in their specific deployment, and keep classical the stages that are working reliably.
The Residual Policy Pattern: Adding Learning to Classical Control
The residual policy pattern is the safest way to introduce learning into an existing classical system. Instead of replacing the classical controller, a learned residual policy adds corrections on top of the classical output. The total commanded action is: a_total = a_classical + a_residual, where a_residual is constrained to a small range (typically +/- 5mm position, +/- 2 degrees orientation per timestep).
```python
# residual_policy.py -- Classical + learned residual controller
import numpy as np
import torch


class ResidualPolicyController:
    """Adds learned corrections to a classical impedance controller."""

    def __init__(self, classical_controller, residual_model, max_residual=0.005):
        self.classical = classical_controller
        self.residual = residual_model      # Trained policy network
        self.max_residual = max_residual    # 5mm max correction

    def compute_action(self, obs, ft_reading, target_pose):
        # Classical controller: impedance control toward target
        a_classical = self.classical.compute(obs["joint_pos"], target_pose, ft_reading)

        # Learned residual: correct for contact dynamics
        with torch.no_grad():
            residual_input = torch.cat([
                torch.as_tensor(obs["joint_pos"], dtype=torch.float32),
                torch.as_tensor(ft_reading, dtype=torch.float32),  # Force-torque sensor
                torch.as_tensor(obs["wrist_image"], dtype=torch.float32).flatten(),
            ])
            a_residual = self.residual(residual_input).numpy()

        # Safety clamp: residual cannot exceed max_residual per joint
        a_residual = np.clip(a_residual, -self.max_residual, self.max_residual)
        return a_classical + a_residual
```
This pattern has been deployed successfully in insertion tasks (peg-in-hole, connector mating), where the classical controller handles the approach trajectory and the residual policy handles the contact-phase corrections that require sensitivity to force feedback. The residual is trained on 100-200 demonstrations of the contact phase only, keeping data requirements low. At SVRC, we use this pattern with the OpenArm 101 for precision assembly tasks where classical control alone achieves 85% success and the residual policy pushes it to 96%.
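The residual's training signal follows directly from the decomposition above: at each demonstration timestep, the supervised label is the demonstrated action minus what the classical controller would have commanded, clipped to the same bound enforced at deployment. A sketch of the label computation with synthetic arrays -- the proportional stand-in below is hypothetical, not a real impedance controller:

```python
import numpy as np

def classical_action(joint_pos, target):
    """Hypothetical stand-in for the classical controller's command."""
    return 0.5 * (target - joint_pos)   # simple proportional term

def residual_labels(demo_states, demo_actions, target, max_residual=0.005):
    """Supervised targets for the residual: demo action minus classical
    action, clipped to the same bound enforced at deployment."""
    labels = []
    for s, a in zip(demo_states, demo_actions):
        r = a - classical_action(s, target)
        labels.append(np.clip(r, -max_residual, max_residual))
    return np.array(labels)

# Synthetic demo: 3 timesteps of a 2-DOF contact phase
states = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.1]])
actions = np.array([[0.26, 0.0], [0.20, 0.003], [0.15, 0.05]])
target = np.array([0.5, 0.0])
labels = residual_labels(states, actions, target)
```

Clipping the labels to the deployment bound matters: the network never learns to request corrections the safety clamp would reject at runtime.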
Computational Requirements Comparison
The infrastructure cost of each approach differs dramatically. Teams must understand these requirements before committing to an architecture.
| Resource | Classical Pipeline | IL (ACT / DP) | VLA Fine-Tune | RL (Sim) |
|---|---|---|---|---|
| Training GPU | None | 1x RTX 3090/4090 | 1-4x A100/H100 | 1-8x A100 (Isaac Sim) |
| Training time | N/A (hand-tuned) | 2-8 hrs | 12-48 hrs | 24-120 hrs |
| Inference GPU | None (CPU only) | 1x RTX 3060+ | 1x A100 or H100 | Same as IL at deploy |
| Inference latency | < 1 ms | 20-100 ms | 200-500 ms | Same as IL at deploy |
| Disk/storage | < 100 MB (URDF, configs) | 50-200 GB (dataset) | 200 GB-2 TB | 50-500 GB (replay buf) |
| Engineering labor | High (weeks-months) | Medium (data + train) | Low-medium (fine-tune) | High (sim engineering) |
| Cloud cost estimate | $0/month | $50-200/train run | $500-3,000/train run | $1,000-10,000/run |
These costs are for a single-task training cycle. Multi-task policies, hyperparameter sweeps, and iterative data collection multiply the numbers accordingly. Teams with tight budgets should consider SVRC's data collection service, which amortizes hardware and operator costs across multiple projects.
Failure Mode Analysis: Diagnosing Classical vs. Learned Systems
When a robot system fails, diagnosing the root cause follows fundamentally different pathways depending on the paradigm. Understanding these diagnostic frameworks saves significant debugging time.
Classical pipeline failure modes:
- Perception failure. The point cloud is noisy, the object is not detected, or the pose estimate is off by more than the controller's tolerance. Diagnostic: visualize the point cloud and detection output at the failure timestep. Fix: tune segmentation parameters, add a camera viewpoint, or improve lighting. Time to diagnose: minutes to hours.
- Planning failure. The planner returns no solution (infeasible), times out, or produces a collision. Diagnostic: visualize the planning scene and collision objects in RViz2. Fix: increase planning time, add clearance margins, or simplify the collision model. Time to diagnose: minutes.
- Control failure. The robot overshoots the target, oscillates, or fails to maintain contact. Diagnostic: plot joint position tracking error, velocity profiles, and force-torque signals. Fix: retune PID gains, adjust impedance parameters, or reduce trajectory speed. Time to diagnose: hours.
- Integration failure. Timing issues between modules -- the planner uses a stale perception output, or the controller receives a trajectory update mid-execution. Diagnostic: check message timestamps in ROS2 logs. Fix: add synchronization barriers or switch to a reactive replanning architecture. Time to diagnose: hours to days.
Learned policy failure modes:
- Distribution shift. The object is in a position, orientation, or lighting condition not sufficiently covered by training data. Diagnostic: compare the failure observation to the training distribution (e.g., by computing embedding distances using the policy's vision encoder). Fix: collect more diverse demonstrations covering the failure case. Time to diagnose: hours to days.
- Mode averaging. The policy outputs the average of two valid strategies, producing a trajectory that matches neither. Diagnostic: rollout visualization shows the robot hesitating between two approaches. Fix: switch from MSE loss to a multimodal architecture (Diffusion Policy, CVAE). Time to diagnose: hours.
- Compounding error. The policy drifts off-trajectory after 20-30 steps and cannot recover. Diagnostic: track per-step action error over time and observe accelerating divergence. Fix: increase action chunk length, add temporal ensembling, or collect DAgger data. Time to diagnose: hours.
- Calibration mismatch. Camera extrinsics shifted between data collection and deployment, causing consistent spatial offset in policy actions. Diagnostic: measure camera pose against the calibration used during data collection. Fix: recalibrate cameras or add camera pose to the observation space. Time to diagnose: minutes once suspected, days if not.
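One of the fixes above, temporal ensembling, is worth seeing concretely. Action chunking produces several overlapping predictions for each timestep; ensembling averages them with exponential weights. This is an ACT-style sketch -- the weight constant and the toy chunks are illustrative:

```python
import numpy as np

def temporal_ensemble(chunk_buffer, t, k=0.1):
    """Average every chunk's prediction for timestep t, ACT-style.

    chunk_buffer: list of (start_step, actions), ordered oldest-first,
    where actions has shape (chunk_len, action_dim). Weights w_i = exp(-k*i)
    favor the oldest prediction, which smooths out per-chunk jitter.
    """
    preds = []
    for start, actions in chunk_buffer:
        idx = t - start
        if 0 <= idx < len(actions):    # this chunk covers timestep t
            preds.append(actions[idx])
    w = np.exp(-k * np.arange(len(preds)))
    return (w / w.sum()) @ np.array(preds)

# Two overlapping 4-step chunks both predicting timestep 2
buffer = [
    (0, np.array([[0.0], [0.1], [0.2], [0.3]])),      # chunk from step 0
    (2, np.array([[0.25], [0.35], [0.45], [0.55]])),  # newer chunk from step 2
]
a = temporal_ensemble(buffer, t=2)   # blends 0.2 (old) and 0.25 (new)
```

Because each timestep's command draws on multiple independently predicted chunks, a single bad chunk cannot drag the trajectory far off course before fresher predictions pull it back.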
The key asymmetry: classical failures are generally faster to diagnose because each module has inspectable inputs and outputs. Learned policy failures require inference about the training data distribution, which is inherently more difficult. Hybrid architectures partially address this by isolating learned components so that classical diagnostic tools apply to most of the pipeline.
Real-World Case Studies
These examples illustrate how the choice between classical, learned, and hybrid approaches plays out in practice.
Case 1: Electronics connector insertion (classical wins). A contract manufacturer needed a robot to insert USB-C connectors into PCB sockets. Tolerance: +/- 0.15mm. The connector geometry is known, the PCB is fixtured, and the insertion trajectory is a straight line with controlled force. A classical impedance controller with spiral search at the insertion point achieved 99.2% success in 10,000 trials. No training data was needed. An IL approach was prototyped and achieved 94% success after 500 demonstrations -- worse performance at higher cost.
Case 2: Warehouse bin picking (learning wins). An e-commerce fulfillment center needed a robot to pick arbitrary items from bins containing 50+ SKU categories. Items ranged from soft pouches to rigid boxes to oddly shaped electronics. A classical pose estimation + grasp planning pipeline achieved 78% pick success, limited by perception failures on novel and reflective objects. A learned grasp planner (Contact-GraspNet) with a DINOv2 backbone achieved 93% pick success across all categories, including items never seen during training. The learned system took 3 weeks of data collection (4,000 pick demonstrations) versus 4 months of engineering for the classical system.
Case 3: Food plating (hybrid wins). A food preparation startup needed a robot to plate salad ingredients in an aesthetically pleasing arrangement. Classical control handled the precise placement of individual items (known portion sizes, calibrated dispensers). A learned perception model identified ingredient types and current plate state from overhead camera images. A learned high-level planner generated the composition layout based on training images of plated meals. The hybrid system achieved 87% acceptance rate from human quality evaluators, compared to 62% for a fully classical rule-based system and 79% for a fully learned end-to-end policy.
MoveIt2 + Learned Perception: A Minimal Hybrid Example
For teams looking to build their first hybrid system, here is the minimal integration pattern using MoveIt2 for motion planning with a learned object detector replacing classical perception.
```python
# hybrid_pick.py -- Minimal hybrid: learned perception + classical planning
import rclpy
import numpy as np
from moveit2 import MoveIt2
from groundingdino import GroundingDINO


def hybrid_pick(node, moveit, detector, camera, prompt="the red mug"):
    # --- Learned perception layer ---
    rgb, depth = camera.capture()
    detections = detector.predict(rgb, prompt)  # GroundingDINO
    best = max(detections, key=lambda d: d.confidence)

    # Back-project 2D detection center to 3D using depth
    cx, cy = best.center
    z = depth[int(cy), int(cx)] / 1000.0        # mm to meters
    x = (cx - camera.cx) * z / camera.fx
    y = (cy - camera.cy) * z / camera.fy
    target_pose = [x, y, z, 0, 0, 0, 1]         # position + quaternion

    # --- Classical planning layer (MoveIt2) ---
    # Pre-grasp: approach from above
    pre_grasp = target_pose.copy()
    pre_grasp[2] += 0.10                        # 10cm above
    moveit.move_to_pose(pre_grasp)

    # Grasp: descend at reduced speed
    moveit.move_to_pose(target_pose, velocity_scaling=0.3)
    moveit.close_gripper(force=20.0)            # 20N grip

    # Lift
    lift_pose = target_pose.copy()
    lift_pose[2] += 0.15
    moveit.move_to_pose(lift_pose)
```
This short example captures the essence of the hybrid pattern: a learned model (GroundingDINO) handles the perceptual complexity of finding arbitrary objects from language descriptions, while MoveIt2 handles collision-free trajectory planning with proper joint limits and velocity constraints. At SVRC, our OpenArm 101 ships with a MoveIt2 configuration package and RealSense camera driver integration, making this type of hybrid system deployable in an afternoon.
Data Requirements Compared
Classical control requires system identification data: joint position, velocity, torque, and force-torque sensor readings from carefully designed calibration experiments. A few hours of structured experiments typically suffice. The data is low-volume but must be high-precision. No neural network training is involved.
Imitation learning typically requires 200-1,000 demonstration episodes per task, each containing synchronized camera images and robot state at 30-50 Hz. Collection time ranges from 2 hours (200 demos of a simple task) to 2 weeks (1,000 demos of a complex task with diverse objects). Data quality dominates quantity: 200 clean demonstrations outperform 1,000 noisy ones. For details on collection cost, see our cost per demonstration analysis.
Foundation model fine-tuning (starting from Octo, OpenVLA, or RT-2) requires far fewer task-specific demonstrations, typically 50-200, because the pre-trained model provides a strong visual and behavioral prior. This is the most practical approach for teams with limited data budgets who need learned behavior. Pre-trained models are available through the Open X-Embodiment ecosystem.
Reinforcement learning requires a simulation environment that accurately models the task physics. Building that simulation is itself a significant engineering effort, but once available, RL can generate millions of training episodes at near-zero marginal cost. The challenge is sim-to-real transfer: policies trained in simulation often fail on real hardware due to physics modeling inaccuracies.
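The standard mitigation for that sim-to-real gap is domain randomization: each simulated episode samples its physics and sensing parameters from ranges wide enough to cover the real system's uncertainty, so the learned policy cannot overfit to any single simulator configuration. A sketch of the sampling step -- parameter names and ranges are illustrative:

```python
import numpy as np

def randomize_physics(rng):
    """Sample one episode's simulation parameters (ranges illustrative)."""
    return {
        "friction":      rng.uniform(0.4, 1.2),
        "object_mass":   rng.uniform(0.05, 0.5),      # kg
        "motor_delay":   rng.integers(0, 3),          # control timesteps
        "camera_jitter": rng.normal(0.0, 0.005, 3),   # meters, per axis
    }

rng = np.random.default_rng(0)
episodes = [randomize_physics(rng) for _ in range(1000)]
```

A policy that succeeds across all sampled variations treats the real robot as just one more draw from the distribution -- provided the ranges actually bracket reality, which is where the engineering judgment lives.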
The Trend: Learning Is Expanding, Classical Is Not Disappearing
The trajectory of the field is clear. Robot learning is expanding into domains previously dominated by classical control: factory assembly, logistics, quality inspection. Foundation models are reducing the data requirements that previously made learning impractical for many applications. And the hybrid architecture pattern is making it possible to add learned capabilities incrementally to classical systems without replacing the classical safety and control infrastructure.
But classical robotics is not disappearing. It is becoming the safety and precision substrate on which learned capabilities are layered. Every production robot system in 2026 that handles real-world variability through learning also contains a classical controller ensuring that the learned policy does not drive the robot into a table, exceed joint limits, or apply dangerous forces. The debate between learning and classical is resolving into a question of architecture: which components are learned, which are engineered, and how do they interface.
For teams starting new projects, SVRC supports both paradigms. Our data services provide the demonstrations needed for imitation learning. Our hardware catalog includes arms and sensors compatible with both classical and learned control stacks. And our engineering team can advise on architecture decisions for hybrid systems that combine the best of both approaches.
Related Reading
Imitation Learning Guide · Force-Torque Sensing Guide · Sim-to-Real Transfer Guide · Deployment Checklist · Arm Comparison · Data Services · SVRC Platform