Google Teaches Robot to Toss Bananas Better Than You Do

Source: IEEE Spectrum / By Evan Ackerman

TossingBot, developed by Google and Princeton, can teach itself to throw arbitrary objects with better accuracy than most humans.

As anyone who’s ever tried to learn how to throw something properly can attest, it takes a lot of practice to get it right. Once you have it down, though, it makes you much more efficient at a variety of weird tasks: Want to pass an orange ball through a hoop that’s inconveniently far off the ground? Just throw it! Want to knock over some small sticks placed on top of large sticks with a ball? Just throw it! Want to move a telephone pole in Scotland? You get the idea.

Unfortunately, most of us aren’t talented enough for the throwing skills we’ve developed for strange reasons to translate well to everyday practical tasks. But just imagine what we’d be capable of if we could throw arbitrary objects to arbitrary locations with high reliability: It would be so much easier to do things like cleaning a room or sorting laundry, and it would completely change work environments like warehouses, where it could potentially cut out all of that time spent walking.

Now Google researchers, working with collaborators from Princeton, Columbia, and MIT, have developed a robot arm called TossingBot that can teach itself to pick up and toss arbitrary objects very accurately. The goal is to significantly speed up pick-and-place tasks by replacing the whole “place” bit with an elegant and efficient throw.

Throwing is, in general, quite a hard problem, and it’s worth noting from the outset that humans aggressively simplify throwing by almost always using objects that are well balanced, aerodynamic, and/or symmetrical. Robots can accurately throw unbalanced or asymmetrical objects if you program them to, but usually you have to specify how to grasp and toss each object individually, figuring out the optimal motion and then instructing the robot to repeat it.

Learning to throw arbitrary objects is much harder, especially for robots that are self-taught—that is, robots that learn to pick things up and toss them through trial-and-error experiments as opposed to being explicitly trained by a human. TossingBot is notable in that it teaches itself to both grasp and throw with minimal human intervention, and in a relatively short amount of time, it’s able to operate both quickly enough and accurately enough that these techniques could potentially be applied to practical real-world picking systems.

Part of what makes TossingBot so useful is that the tossing technique significantly decreases the time that the robot spends on the “place” part of a pick-and-place task. Rather than being put down, objects are instead (as the researchers put it) “immediately passed to Newton,” and the toss also means that the robot’s effective reach is significantly longer than its physical workspace.

At a mean pick rate of over 500 objects per hour, TossingBot is right up there with humans in terms of efficiency, at least for the particular set of items that it has experience with. While humans are likely always going to be better at dealing with novel items, TossingBot does pretty well with new types of objects. It takes only about an hour or two of training on something new for TossingBot to achieve performance similar to that with known objects, and it can also quickly learn to throw things to locations that it hasn’t previously trained on.

The interesting bit of TossingBot itself is a deep neural network that starts with a depth image of objects in a bin, and goes all the way through from successful grasp to parameters for the throw itself. Since the throwing of an object (especially an unbalanced object) depends heavily on how it’s being held, grasping and throwing are learned at the same time. By measuring whether a grasp is successful by whether a throw is successful, TossingBot learns to favor grasps that result in accurate throws. As you can see from the video, the learning process itself is fairly clever, and the robot can be mostly just left alone to figure things out for itself, managing 10,000 grasp-and-throw attempts in 14 hours of training time.
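The idea of judging a grasp by the throw that follows it can be sketched in a few lines of Python. This is only an illustration, not the researchers’ code: the `Attempt` record, the function names, and the assumption that a bin is an axis-aligned rectangle centered on the target are all hypothetical, and the default bin size simply reuses the 25 x 15 cm opening that Zeng mentions in the interview. How the real system detects where an object landed is not described here.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class Attempt:
    """One grasp-and-throw trial (hypothetical record, for illustration only)."""
    grasp_pixel: Tuple[int, int]      # where in the depth image the grasp was made
    landing_xy: Tuple[float, float]   # where the object landed, in meters
    target_xy: Tuple[float, float]    # center of the target bin, in meters


def throw_landed_in_bin(landing_xy, target_xy, bin_size=(0.25, 0.15)):
    """True if the landing point falls inside a bin centered on the target.

    Assumes an axis-aligned rectangular opening of bin_size meters.
    """
    dx = abs(landing_xy[0] - target_xy[0])
    dy = abs(landing_xy[1] - target_xy[1])
    return dx <= bin_size[0] / 2 and dy <= bin_size[1] / 2


def grasp_label(attempt: Attempt) -> float:
    """Training label for the grasp: 1.0 only if the throw that followed was accurate."""
    hit = throw_landed_in_bin(attempt.landing_xy, attempt.target_xy)
    return 1.0 if hit else 0.0


# A grasp whose throw lands 5 cm and 2 cm off center counts as a good grasp;
# a grasp whose throw overshoots by 30 cm counts as a bad one.
good = Attempt(grasp_pixel=(10, 20), landing_xy=(1.0, 0.5), target_xy=(1.05, 0.52))
bad = Attempt(grasp_pixel=(40, 12), landing_xy=(1.3, 0.5), target_xy=(1.0, 0.5))
labels = [grasp_label(good), grasp_label(bad)]
```

Because the label depends on the outcome of the throw rather than on whether the gripper merely closed around something, a grasp that reliably leads to wild throws is treated as a failure, which is how the system comes to prefer grasps that throw well.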

Photo: Regina Hickman. TossingBot uses a deep neural network that starts with a depth image of objects in a bin, and then trains itself through successful grasps and throws.

An important component of this process is what the researchers call “residual physics,” which provides a sort of baseline knowledge of the world to help TossingBot learn and adapt more quickly. Lead author Andy Zeng explains in a blog post:

“Physics provides prior models of how the world works, and we can leverage these models to develop initial controllers for our robots. In the case of throwing, for example, we can use projectile ballistics to provide an estimate for the throwing velocity that is needed to get an object to land at a target location. We can then use neural networks to predict adjustments on top of that estimate from physics, in order to compensate for unknown dynamics as well as the noise and variability of the real world. We call this hybrid formulation Residual Physics, and it enables TossingBot to achieve throwing accuracies of 85 percent.”
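To make the residual physics idea concrete, here is a minimal Python sketch, not the researchers’ implementation. It assumes an ideal drag-free projectile, a fixed 45-degree release angle, and a target at a known height below the release point; the function names are hypothetical, and `predicted_residual` is just a number standing in for whatever correction a trained network would output.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def ballistic_release_speed(distance_m, release_angle_deg=45.0, height_drop_m=0.0):
    """Release speed for an ideal projectile to land distance_m away.

    height_drop_m is how far the landing point sits below the release point.
    Derived from y = x*tan(theta) - g*x^2 / (2*v^2*cos^2(theta)) with y = -height_drop_m.
    """
    theta = math.radians(release_angle_deg)
    denom = 2.0 * math.cos(theta) ** 2 * (distance_m * math.tan(theta) + height_drop_m)
    if denom <= 0.0:
        raise ValueError("target is not reachable at this release angle")
    return math.sqrt(G * distance_m ** 2 / denom)


def residual_physics_speed(distance_m, predicted_residual, **ballistic_kwargs):
    """Physics estimate plus a learned correction.

    predicted_residual is a placeholder for a network output (m/s) that would
    compensate for effects the ideal model ignores, such as drag or how the
    object is held.
    """
    return ballistic_release_speed(distance_m, **ballistic_kwargs) + predicted_residual


# Physics-only estimate for a bin 2 meters away at release height, then the
# same throw with a hypothetical +0.3 m/s correction applied on top.
physics_only = ballistic_release_speed(2.0)
corrected = residual_physics_speed(2.0, predicted_residual=0.3)
```

The appeal of this split is that the physics term already gives a sensible throw for a location the robot has never practiced on, so the learned part only has to account for the difference between the ideal model and what actually happens.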

For some context on that 85 percent accuracy: “We even tried this task ourselves, and we were pleasantly surprised to learn that TossingBot is more accurate than any of us engineers! Though take that with a grain of salt, as we’ve yet to test TossingBot against anyone with any actual athletic talent.”

I don’t know, it seems like even for people with actual athletic talent, accurate banana tossing would still be a challenge.

For more details, we spoke with Andy Zeng from Princeton via email.

IEEE Spectrum: What kind of accuracy does TossingBot have, and what would that imply about throwing objects longer distances? How easy would it be to adapt this technique for larger or more powerful arms?

Andy Zeng: TossingBot has an accuracy of 85 percent into bins outside its natural range, where each bin has a 25 x 15 cm opening. In simulation, we’ve tested generalizations to longer distances (up to 5 meters) and the method works quite well. But in the real-world setup, the farthest box is only about 2 meters away from the robot. We haven’t tested any farther because throwing any harder would cause the UR5 to reach force-torque limits. Our guess is that the method should have reasonable generalization to longer distances, thanks to the initial estimates from physics/ballistics.

TossingBot might start off with lower accuracies at first to these farther locations (due to unexpected dynamics), but it should quickly adapt to the new training samples as it continues to do online learning. In terms of adapting this technique for larger and more powerful arms, it should be easy as long as the arms have good repeatability (e.g., the ones used in manufacturing) and real-time control.

Were there any particularly entertaining failures or classes of failures?

The most entertaining failure is when it grasps long objects (e.g., marker pens) by one of its tips. After it picks it up and tosses it, the object swings forward at much higher velocities, usually landing up to 3 meters away on a co-worker’s desk.

This occurs most frequently towards the beginning of training. But since we supervise grasps based on whether or not subsequent throws were accurate, the system starts to learn to avoid picking up marker pens by the tips. So as training progresses, we see TossingBot pick up marker pens by the middle more frequently.

Can you speculate about some specific use cases in which robots throwing objects might be particularly useful?

I view tossing as essentially a significantly more time-efficient version of “placing” for pick-and-place of objects that people don’t care about, or can sustain the damage from landing collisions. A good example is debris clearing in disaster response scenarios, where time is of the essence.

Would TossingBot have any trouble with a half-full bottle of water?

We haven’t tested bottle flipping, but TossingBot should have no trouble doing it. Bottle flipping should be an easier task than throwing arbitrary objects, which can have a wider range of interesting dynamics that the system needs to learn to compensate for.

Where would you like to take this research from here?

Tossing is a form of dynamic manipulation, where a robot leverages the dynamics (i.e., physics) of the world as a way to improve its capabilities. Other examples of dynamic manipulation include sliding, spinning, swinging, or catching. We don’t think about it often, but people use dynamic manipulation all the time. For example, sliding a beer across a bar table to a friend, spinning a phone sideways after pulling it out of your pocket to bring it upright, dropping vegetables into a boiling pot. Dynamic manipulation is an under-studied research area in robotics, and I think that it has significant potential for improving the efficiency of our robots.