Sherbot Holmes
By Eirik Møller Nilsen and Eivind Nordahl Andersen
1) Topic C – Animatronics
We chose Animatronics because it connects creative character design with practical robotics and modern AI.
Inspired by compliant mechanisms and soft robotics from the animation industry, we aim to explore how a physical “creature” can feel alive when it understands its surroundings and responds naturally.
This topic provides a holistic learning arena — from perception (video analysis) to decision-making (AI) and safe, compliant actuation. It also has high demo value: a robot that follows a target, recognizes objects, and responds to speech is both technically challenging and engaging to present.
Our motivation lies in achieving measurable, iterative improvements. Vision can be optimized for accuracy and robustness, motion control for smooth tracking, and the voice interface for reliable understanding.
2) Goals
Project Goal
The goal of this project is to design and build a small mobile animatronic robot capable of visually detecting, locating, and guiding users toward specific objects or areas.
The robot should demonstrate a natural and lifelike interaction loop where it can:
- Find and approach a relevant target, such as an empty chair, a trash bin, or a specific lecture room sign (“Smalltalk” or “Simula”).
- Respond to user input either through voice commands (“Find the red chair”, “Show me the Simula room”) or through predefined physical buttons.
  - Each button corresponds to a specific task (e.g., Locate chair, Locate trashcan, Locate room sign).
  - The button fallback ensures reliable operation even if voice recognition fails.
The overall goal is to achieve robust, repeatable, and expressive behavior that feels intuitive and engaging during live demonstrations — focusing on the quality of perception–decision–motion integration rather than raw computational performance.
Intended Outcomes
By the end of the project, the robot should be able to:
- Perceive its surroundings using an edge device with a camera, microphone, and proximity sensors (plus a speaker for audio feedback).
- Interpret a command (spoken or button-based) to identify what to look for.
- Detect and locate the specified target in real time (> 5 fps).
- Move toward the target and stop at a safe distance (~0.5 m).
- Perform a simple expressive gesture upon task activation and completion (a minimal control-loop sketch follows this list).
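Taken together, these outcomes amount to a perceive, decide, act loop. The sketch below is a minimal, hypothetical version of that loop: `detect_target`, `read_distance_m`, `drive`, `stop`, and `gesture` are assumed helper functions standing in for the vision, proximity, and motor layers described later in this document.

```python
# Minimal perceive -> decide -> act loop sketch. All five helpers passed in
# are assumptions standing in for the perception and actuation layers.
import time

STOP_DISTANCE_M = 0.5    # stop roughly half a metre from the target
FRAME_WIDTH_PX = 640     # assumed camera resolution

def control_loop(detect_target, read_distance_m, drive, stop, gesture):
    """Repeat perceive -> decide -> act until the target is reached."""
    while True:
        box = detect_target()               # None or (cx, cy) in pixels
        if box is None:
            drive(turn=0.3, forward=0.0)    # search: rotate slowly in place
        else:
            cx, _ = box
            # Steer proportionally to how far the target sits from image centre.
            error = (cx - FRAME_WIDTH_PX / 2) / (FRAME_WIDTH_PX / 2)
            if read_distance_m() <= STOP_DISTANCE_M:
                stop()
                gesture()                   # expressive gesture on completion
                break
            drive(turn=0.8 * error, forward=0.4)
        time.sleep(0.2)                     # ~5 Hz, matching the > 5 fps target
```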
Inspiration and Design Character
The design draws inspiration from Alonso Martinez’s expressive 3D-printed robots[^1], but differs in a key aspect: it emphasizes autonomous perception and interaction rather than purely expressive animation.
The goal is to blend soft-robotic motion with intelligent behavior, creating an approachable and “alive” demonstration platform for embodied AI.
[^1]: Alonso Martinez's 3D-Printed Animated Robots! — Adam Savage's Tested: https://www.youtube.com/watch?v=0vfuOW1tsX0&t=300s/
3) Sketch
Project Overview
The system combines computer vision, speech or button-based interaction, and motor control into one integrated loop.
It is composed of two main subsystems, plus the user-interaction inputs:
1. Perception & Intelligence
- Uses the camera to capture the environment and identify objects of interest.
- Uses the microphone for voice commands.
- Runs a fine-tuned, compact edge-device vision model (YOLO11s)[^2][^3] for detecting relevant items such as chairs, trash bins, or room labels (see the detection sketch after the references below).
- Interprets simple commands:
  - “Find an empty chair”
  - “Show me Smalltalk”
  - “Find the trashcan”
- Communicates target coordinates and control signals wirelessly (via WiFi or Bluetooth Low Energy) to the robot’s microcontroller.
[^2]: You Only Look Once: Unified, Real-Time Object Detection — Redmon, J., Divvala, S., Girshick, R., Farhadi, A.: https://arxiv.org/abs/1506.02640
[^3]: Ultralytics YOLO11 — Model documentation: https://docs.ultralytics.com/models/yolo11/
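As a reference for the detection step, here is a minimal sketch using the Ultralytics Python API and OpenCV. The weights file name `sherbot_yolo11s.pt` is a placeholder for our fine-tuned model, and exposing the Pi camera as a standard `/dev/video0` device is an assumption (the `picamera2` library is an alternative capture path).

```python
# Detection sketch: find the best-scoring box for a requested target label.
# "sherbot_yolo11s.pt" is a hypothetical name for our fine-tuned weights.
import cv2
from ultralytics import YOLO

model = YOLO("sherbot_yolo11s.pt")    # fine-tuned YOLO11s weights
cap = cv2.VideoCapture(0)             # assumes the Pi camera appears as /dev/video0

def find_target(target_label: str):
    """Return the pixel centre of the best detection of target_label, or None."""
    ok, frame = cap.read()
    if not ok:
        return None
    results = model(frame, verbose=False)[0]
    best = None
    for box in results.boxes:
        label = model.names[int(box.cls)]
        conf = float(box.conf)
        if label == target_label and (best is None or conf > best[0]):
            x1, y1, x2, y2 = box.xyxy[0].tolist()
            best = (conf, ((x1 + x2) / 2, (y1 + y2) / 2))
    return best[1] if best else None
```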
2. Actuation & Mobility (Microcontroller Layer)
- Controls four Dynamixel smart servos configured in wheel mode for differential drive and balance (see the wheel-mode sketch after this list).
- Executes simple behaviors:
  - Navigate forward, backward, turn left/right, or stop.
  - Perform expressive gestures (LED blink, spin, chime) once a target is reached.
- Houses the control electronics, power source, and communication module in a compact 3D-printed chassis.
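A minimal sketch of the wheel-mode setup and differential drive, using the Dynamixel SDK for Python. The serial port, baud rate, and servo IDs (left: 1 and 2, right: 3 and 4) are assumptions; the control-table addresses follow the AX-12A documentation.

```python
# Differential-drive sketch for AX-12A servos in wheel mode, using the
# Dynamixel SDK (pip install dynamixel-sdk). Port, baud rate and servo IDs
# are assumptions for illustration.
from dynamixel_sdk import PortHandler, PacketHandler

ADDR_CW_LIMIT, ADDR_CCW_LIMIT, ADDR_MOVING_SPEED = 6, 8, 32  # AX-12A control table
LEFT_IDS, RIGHT_IDS = (1, 2), (3, 4)

port = PortHandler("/dev/ttyUSB0")
packet = PacketHandler(1.0)           # AX-12A uses protocol 1.0
port.openPort()
port.setBaudRate(1000000)

def init_wheel_mode():
    # Setting both angle limits to 0 puts an AX-12A into continuous-rotation (wheel) mode.
    for dxl_id in LEFT_IDS + RIGHT_IDS:
        packet.write2ByteTxRx(port, dxl_id, ADDR_CW_LIMIT, 0)
        packet.write2ByteTxRx(port, dxl_id, ADDR_CCW_LIMIT, 0)

def set_speed(dxl_id, speed):
    # Moving-speed register: 0-1023 drives CCW, 1024-2047 drives CW.
    value = speed if speed >= 0 else 1024 + abs(speed)
    packet.write2ByteTxRx(port, dxl_id, ADDR_MOVING_SPEED, int(value))

def drive(left_speed, right_speed):
    # Right-side servos are mounted mirrored, so their sign is flipped.
    for dxl_id in LEFT_IDS:
        set_speed(dxl_id, left_speed)
    for dxl_id in RIGHT_IDS:
        set_speed(dxl_id, -right_speed)

# Example: init_wheel_mode(); drive(300, 300) to roll forward at moderate speed.
```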
3. Interaction Inputs
- Voice control (primary): User issues natural commands via speech.
- Physical buttons (fallback):
  - Multiple buttons are assigned to specific pre-defined tasks (e.g., Chair, Trashcan, Simula, Smalltalk).
  - Each button directly informs the perception system which target to look for — ensuring functionality even in noisy environments or without microphone access (a button-mapping sketch follows this list).
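A sketch of the button fallback using `gpiozero` on the Raspberry Pi; the GPIO pin numbers and target labels are illustrative assumptions.

```python
# Button-fallback sketch: each GPIO button selects a pre-defined target.
# Pin numbers and labels are assumptions for illustration.
from gpiozero import Button
from signal import pause

BUTTON_TO_TARGET = {
    17: "chair",
    27: "trashcan",
    22: "simula_sign",
    23: "smalltalk_sign",
}

def select_target(label):
    # In the full system this would hand the label to the perception loop.
    print(f"Target selected: {label}")

buttons = []
for pin, label in BUTTON_TO_TARGET.items():
    btn = Button(pin)
    btn.when_pressed = (lambda l=label: select_target(l))
    buttons.append(btn)               # keep references so callbacks stay alive

pause()                               # wait for button presses
```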
Physical and Aesthetic Design
The robot’s outer shell and motion elements will be 3D-printed, inspired by compliant-mechanism and soft-robotic principles.
This gives the robot smooth, lifelike movement and an approachable character suitable for interactive demonstrations.
Intended Demonstration
In the final demonstration, the robot will be asked to:
“Find an empty chair” — or, if voice fails, the Chair button is pressed.
The robot will visually search, approach, and stop near the correct object, signaling success with an expressive gesture.
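For this small vocabulary, mapping the spoken request onto a target label can be a simple keyword match over the recognized text. The sketch below assumes the `SpeechRecognition` package with its Google web recognizer (which needs a network connection); an offline recognizer such as Vosk could be swapped in, and a `None` result triggers the button fallback.

```python
# Voice-command sketch: recognize speech, then keyword-match to a target label.
# The SpeechRecognition package and its Google recognizer are assumptions.
import speech_recognition as sr

KEYWORD_TO_TARGET = {
    "chair": "chair",
    "trash": "trashcan",
    "simula": "simula_sign",
    "smalltalk": "smalltalk_sign",
}

def listen_for_target():
    recognizer = sr.Recognizer()
    with sr.Microphone() as source:
        recognizer.adjust_for_ambient_noise(source)
        audio = recognizer.listen(source, phrase_time_limit=5)
    try:
        text = recognizer.recognize_google(audio).lower()
    except (sr.UnknownValueError, sr.RequestError):
        return None                   # fall back to the physical buttons
    for keyword, target in KEYWORD_TO_TARGET.items():
        if keyword in text:
            return target
    return None
```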
Summary of Key Integrations

| Layer | Components | Communication | Function |
|-------|------------|---------------|----------|
| Perception | Camera, mic, AI | WiFi / BLE / On-device | Detects, interprets, decides |
| Control | Microcontroller | WiFi / BLE / On-device | Receives motion/gesture commands |
| Actuation | 4× Dynamixel Servos | Serial / TTL | Drives movement and expression |
| User Input | Voice or 4 Physical Buttons | Local | Command selection |
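The Perception to Control link in the table could carry a small JSON payload; below is a minimal sketch over UDP/WiFi. The robot address, port, and message fields are assumptions for illustration, and the same payload could be sent over a BLE characteristic instead.

```python
# Minimal sketch of the perception -> control message over WiFi (UDP + JSON).
# Address, port and field names are assumptions for illustration.
import json
import socket

ROBOT_ADDR = ("192.168.4.1", 9999)    # assumed address of the robot's controller

def send_target_update(sock, target, cx, cy, distance_m):
    msg = {
        "target": target,             # e.g. "chair"
        "centre_px": [cx, cy],        # bounding-box centre in the camera frame
        "distance_m": distance_m,     # latest proximity reading, if available
    }
    sock.sendto(json.dumps(msg).encode("utf-8"), ROBOT_ADDR)

# Usage example:
# sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
# send_target_update(sock, "chair", 320, 240, 1.2)
```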
Illustration
First demo design

Sketch with annotations

Demo: Real-Time CV — Bounding Boxes on Target Objects

4) Bill of Materials (BOM)
| Component | Description / Function | Qty | Notes |
|---|---|---|---|
| Microcontroller | Main control unit for motors and BLE/WiFi communication | 1 | Raspberry Pi 5 (WiFi) |
| Microphone | Small microphone for microcontroller | 1 | |
| Camera Module | Camera for interpreting surroundings | 1 | Pi camera module v3 |
| Small speaker | Communication to users | 1 | |
| Dynamixel Smart Servos | Drive + feedback; run in Wheel Mode for base | 4 | AX-12A |
| Battery Pack | Power source for logic and motors | 1 | |
| Edge Device | Either microcontroller or smartphone | 1 | Handles object detection & voice command recognition |
| Chassis (3D-printed) | Structural frame for electronics | 1 | Custom PLA design |
| Wheels | Mobility system | 4 | 3D printed or RC-type |
| Proximity Sensors | For obstacle avoidance | 3 | Ultrasonic or IR sensors |
| Optional Components | LED, buttons, cables etc. | – | Status and feedback |
5) Plan
We divide the work into three main iterations, each with clear milestones, responsibilities, and deliverables.
Progress will be documented through logs, photos, sketches, and short demo videos.
| Week | Milestone / Goal | Description | Responsible | Deliverables |
|---|---|---|---|---|
| 42 | Concept & Design | Define functional scope, design layout. | Eirik & Eivind | Initial sketch, component list (Assignment 4). |
| 43–46 | Hardware Assembly | 3D-print chassis, mount components, verify motor control manually. | Eirik & Eivind | Working 4WD base, photos, wiring diagram. |
| 44–46 | Voice & Perception Integration | Implement command recognition and YOLO detection; send BLE commands. | Eirik (voice) / Eivind (vision) | BLE verified, YOLO functional, demo video (Assignment 6). |
| 47 | Autonomous Navigation | Integrate movement logic and gesture behavior. | Eirik & Eivind | “Find and approach” demo (Assignment 6). |
| 48 | Documentation & Demo | Summarize results, reflect on challenges, finalize report. | Eirik & Eivind | Final demo and report. |
Documentation Notes
During each phase, we will:
- Take photos and videos of key milestones and test setups.
- Capture screenshots of detection results and BLE logs.
- Write short development notes on what worked, what failed, and what was improved.
- Maintain all material for the final reflection and documentation.