Boston Dynamics has used ChatGPT to make its robot dog talk like a seasoned tour guide, combining scripted dialogue with visual question-answering models for an interactive experience.
To give its robot, Spot, the ability to “speak,” Boston Dynamics used OpenAI’s ChatGPT API and incorporated open-source large language models (LLMs) to fine-tune its responses. It also equipped the robot with a speaker and integrated text-to-speech capabilities so that it could mimic human speech.
For each room in its facilities, the team gave Spot a concise script to follow. Spot then combined this script with visual data captured by the cameras mounted on its gripper and body, which gave the robot more context about its surroundings before it generated a response. It also adopted different personas, including a 1920s archaeologist, a teenager, and a Shakespearean time traveller, which made its tours more engaging.
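The way a room script, camera context, and a persona might be merged into a single request to a language model can be sketched as follows. This is an illustrative sketch only, not Boston Dynamics' code: the `TourStop` class, the `build_prompt` function, the prompt wording, and the sample room data are all hypothetical, and the step that would send the prompt to an LLM and speak the reply is deliberately left out.

```python
from dataclasses import dataclass


@dataclass
class TourStop:
    """Hypothetical container for what the robot knows at one stop."""
    room: str      # name of the room (made up for this example)
    script: str    # short per-room script written by the team
    caption: str   # text description of what the cameras currently see


def build_prompt(stop: TourStop, persona: str, question: str = "") -> str:
    """Combine persona, room script, and visual context into one prompt string."""
    parts = [
        f"You are a robot tour guide speaking as {persona}.",
        f"Location: {stop.room}",
        f"Script for this room: {stop.script}",
        f"What the cameras see: {stop.caption}",
    ]
    # A visitor question is optional; include it only when one was asked.
    if question:
        parts.append(f"Visitor question: {question}")
    parts.append("Reply in one or two short spoken sentences.")
    return "\n".join(parts)


if __name__ == "__main__":
    stop = TourStop(
        room="Lobby",
        script="Welcome visitors and point out the robots on display.",
        caption="a row of older robots on a shelf",
    )
    print(build_prompt(stop, "a 1920s archaeologist", "Who are your parents?"))
```

Keeping the persona as a separate argument means the same script and camera description can be reused for each character, which is one plausible way to switch between the archaeologist, teenager, and time-traveller voices.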
While experimenting with Spot as a tour guide, Boston Dynamics encountered a few surprises. For instance, when asked about its “parents,” Spot moved towards the older Spot models on display in the company’s office. The company also noted that the LLMs occasionally produced amusing inaccuracies, such as the suggestion that Stretch, its box-moving robot, was originally designed for yoga.