Nemo: The Server Robot

We've all been out to restaurants: you place your order with a server, then wait to be served. Most of us make small talk while we wait, and the server typically interrupts it when the order arrives. In this project, we build a prototype of a robot server that determines the best time to interject in a customer's conversation.

Project Dates: 10/1/21 - 12/1/21

Date Published: 6/23/24

Overview

In this project, I had the privilege of leading the software development and implementation of the ROS-based architecture for our robot, Nemo. My responsibilities included designing the software architecture, integrating various sensory inputs, and ensuring smooth communication between the decision-making and control modules. This project would not have been possible without the invaluable contributions of my collaborators, Matthew Toven and Soham Gaggenapally, who played critical roles in the physical design and experimental setup of Nemo.

The project term paper, final presentation, and code repository are publicly available.

Introduction

Going out to restaurants and bars is a common part of many people’s lives. The process is fairly simple: you go to a designated person, order your items, and then wait to receive them. Especially at restaurants, there is often a wait between placing an order and receiving it. People usually fill that time with conversation or some other social activity, turning their attention away from the order. Interruptions to these conversations are often perceived as intrusive or disruptive. With the advent of robots that can physically bring food and drinks to people, choosing the right moment to interrupt a conversation becomes a key consideration in their design.

In our research, we explore the optimal moments for a robot to interrupt a conversation in order to serve drinks. We investigate different social cues, such as speech, gaze, affect, and posture, to create an algorithm that helps the robot determine the right time to serve. Prior research has focused on identifying who to serve and how to take orders, but there has been less emphasis on how to re-enter the interaction to deliver the order. Our project addresses this gap by focusing on the second part of the restaurant or bar experience: getting your order from the server or, in the future, from a robot.

Background

Conversations, Interruptions, and Their Importance

Conversations are an integral part of daily life, and managing interruptions is crucial for maintaining their flow. Research shows that humans take various factors into account before interrupting a conversation. In human-robot interaction (HRI), it has been found that people respond better to an interruption that comes with a pretext than to a blind interruption. In service settings, robots that observe human social cues such as proximity, affect, and gestures tend to produce better interactions and higher customer satisfaction. However, re-entering an engagement after an initial interaction remains an open area for exploration, which our research aims to address.

Robotic Design

The design of social robots must balance functionality and social cues to encourage interaction. Our robot, Nemo, incorporates anthropomorphic features to facilitate human-robot interaction. The physical design includes a laser-cut acrylic base, plywood boxes, and 3D printed arms with a mechanical gripper. The robot’s head houses a camera for running the serving algorithm, and the overall design aims to be both functional and engaging.

Software Architecture and Implementation

Nemo’s software architecture integrates multiple sensory inputs to model the environment and make decisions about when to serve. Using ROS (Robot Operating System), the architecture includes percept nodes for integrating data from cameras and microphones, decision nodes for processing this data, and a central decision node that triggers actions based on weighted inputs from the sensory data. This modular approach allows for flexibility and scalability in incorporating additional sensory inputs.
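
To make the idea of a central decision node acting on weighted inputs concrete, here is a minimal sketch in plain Python. It is not the project's actual code: the cue names, weights, the 0-to-1 score convention, and the 0.6 threshold are all assumptions chosen for illustration, and the ROS publish/subscribe plumbing is left out so the fusion logic stands on its own.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CueReading:
    """Output of one (hypothetical) percept/decision node.

    score is assumed to lie in [0, 1], where higher means
    "this cue suggests now is a good time to serve".
    """
    name: str
    score: float
    weight: float


def serve_score(readings: List[CueReading]) -> float:
    """Weighted average of all cue scores; 0.0 if there is nothing to weigh."""
    total_weight = sum(r.weight for r in readings)
    if total_weight <= 0:
        return 0.0
    return sum(r.score * r.weight for r in readings) / total_weight


def should_serve(readings: List[CueReading], threshold: float = 0.6) -> bool:
    """Central decision: trigger the serve action once the fused score is high enough."""
    return serve_score(readings) >= threshold


# Illustrative values only; these are not measurements from the study.
readings = [
    CueReading("speech_pause", score=0.9, weight=2.0),
    CueReading("gaze_toward_robot", score=0.5, weight=1.0),
    CueReading("affect", score=0.7, weight=1.0),
]
print(round(serve_score(readings), 2), should_serve(readings))
```

Because each cue is just another entry in the list, adding a new sensory input in this sketch means adding one more reading and choosing its weight, which mirrors the flexibility the modular design is meant to provide.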

Methodology

Physical Design

Nemo’s physical design consists of a laser-cut acrylic base, birch plywood boxes, and 3D printed arms with a mechanical gripper. The robot’s head houses a camera and an LCD display for visual feedback. The design is intended to be functional and to encourage human interaction.

Software Architecture

Nemo’s control system is managed by a Raspberry Pi 4, which handles motor control, audio playback, and video display. The robot’s movements are controlled by PWM signals sent to the servos, and the decision-making process is integrated with ROS for managing sensory inputs and triggering actions.
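
As a rough illustration of how a target joint angle turns into a PWM signal, the sketch below converts an angle to a pulse width and then to a duty cycle. The numbers are assumptions based on typical hobby servos (a 50 Hz signal and a 500 to 2500 microsecond pulse range over 180 degrees), not the specifications of Nemo's servos, and the function names are hypothetical.

```python
def angle_to_pulse_us(angle_deg, min_us=500.0, max_us=2500.0, max_angle=180.0):
    """Map a servo angle to a pulse width in microseconds.

    The angle is clamped to [0, max_angle] so an out-of-range request
    cannot drive the servo past its (assumed) mechanical limits.
    """
    clamped = max(0.0, min(max_angle, float(angle_deg)))
    return min_us + (max_us - min_us) * clamped / max_angle


def pulse_to_duty_percent(pulse_us, frequency_hz=50.0):
    """Convert a pulse width to a duty cycle (percent) at the given PWM frequency."""
    period_us = 1_000_000.0 / frequency_hz
    return 100.0 * pulse_us / period_us


pulse = angle_to_pulse_us(90)          # 1500.0 microseconds under the assumed range
duty = pulse_to_duty_percent(pulse)    # 7.5 percent at 50 Hz
print(pulse, duty)
```

On a Raspberry Pi, the resulting duty cycle or pulse width would then be handed to whichever PWM library drives the servo pins.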

Integration

The logical decision-making module and the robot control module are integrated through serial I/O communication. This setup allows the robot to receive commands from the decision-making module and act on them, so that its serving behavior stays coordinated.
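
One simple way to structure such a link is a newline-terminated text command per action. The sketch below shows only the encoding and decoding step; the command names (`SERVE`, `ARM`) and the `ACTION:value` format are hypothetical and are not taken from the project's code.

```python
def encode_command(action, value=None):
    """Build one newline-terminated ASCII command, e.g. b"ARM:90\n"."""
    line = action if value is None else f"{action}:{value}"
    return (line + "\n").encode("ascii")


def decode_command(raw):
    """Parse bytes read from the serial line into (action, value or None)."""
    text = raw.decode("ascii").strip()
    action, sep, value = text.partition(":")
    return action, (int(value) if sep else None)


# Decision-making computer side: bytes that would be written to the port.
outgoing = encode_command("ARM", 90)

# Raspberry Pi side: one line read from the port, turned back into a command.
print(decode_command(outgoing))
```

With a library such as pyserial, the sender would pass these bytes to `Serial.write` and the receiver would obtain them from `Serial.readline`. Keeping the framing in small pure functions like these makes it testable without any hardware attached.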

Challenges

We faced several challenges during the implementation of our project. Physical challenges included material limitations and the need for a reliable communication method between the Raspberry Pi and the decision-making computer. Software challenges included compatibility issues and the complexity of integrating multiple sensory inputs. Despite these challenges, we were able to develop a functional prototype of Nemo.

Experimental Setup

Participants interacted with Nemo in a controlled environment where the robot introduced itself and demonstrated its movements. The camera detected participants’ facial expressions to determine the appropriate time to serve a drink. Participants were then asked to rate their experience based on the timing of the service and their likelihood of allowing Nemo to serve them in a real setting.
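
A camera-based cue is noisy from frame to frame, so a trigger of this kind usually waits until the cue has persisted for a short while. The sketch below illustrates that idea under stated assumptions: each video frame is reduced to a boolean "receptive expression detected" flag by some upstream classifier (not shown), and the window size and hit count are arbitrary illustrative values rather than the settings used in the experiment.

```python
from collections import deque


class SustainedCueDetector:
    """Fire only when a per-frame cue has been present in most recent frames."""

    def __init__(self, window=5, min_hits=4):
        self.min_hits = min_hits
        self.history = deque(maxlen=window)  # oldest frames drop off automatically

    def update(self, cue_present):
        """Record one frame; return True once the window is full and has enough hits."""
        self.history.append(bool(cue_present))
        window_full = len(self.history) == self.history.maxlen
        return window_full and sum(self.history) >= self.min_hits


detector = SustainedCueDetector(window=3, min_hits=3)
frames = [True, True, True, False]   # hypothetical per-frame classifier outputs
decisions = [detector.update(f) for f in frames]
print(decisions)  # [False, False, True, False]
```

Requiring several agreeing frames trades a short delay for fewer false triggers from a single misclassified expression.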

Results

Participants rated the appropriateness of Nemo’s serving timing at an average of 3.33 out of 5. The likelihood of allowing Nemo to serve in a real restaurant or bar was rated at 4.66 out of 5. These results suggest that, with further development, Nemo could be a viable solution for service settings.

Discussion

Our study suggests that robots that account for the timing of their interruptions can improve user satisfaction in service settings. Although our study had limitations, such as limited input modes and communication issues, the results are promising. Future work will focus on improving the robustness of the system and exploring additional sensory inputs to enhance the robot’s decision-making capabilities.

In conclusion, our research highlights the potential of social robots in service settings and the importance of timing in human-robot interactions. By developing robots that can seamlessly integrate into social environments, we can enhance customer experiences and pave the way for more advanced HRI applications.