Deep Learning Drops the Ball and Dribbles
Video games benefit from a deep reinforcement learning system and a physics-based model for dribbling. PlayStation basketball bounces back ... handily?
Of all the major league sports in the United States, basketball is the most balletic...in my humble opinion. Basketball demonstrates a fluidity of complex full-body motion while interactively and iteratively guiding and correcting the trajectory of a ball that repeatedly rebounds from the hard court surface. Watching real basketball players drive through the crowd toward the hoop gives us a hint of the artfulness. Watching it in slow motion makes us stare in wonder at the complexity of the performance.
Transitioning to the world of video games (where many sports seem to find their way), the observer has quite a different impression watching today's state-of-the-art synthetic players. The rendered players go through the motions, but the simulation just doesn't seem real. Even though the characters themselves look quite good in static poses, there is something clearly counterfeit about how they move. No matter how great the animator's skills are, it seems impossible to specify all of the angular velocities at all of the joints for even the most basic moves. It's not that we search to find subtle flaws in their movement, but rather that we are instantly struck by how unnatural these players are.
Throughout the history of animation, Disney has been credited with some of the most natural movements in their caricatures. But (in case you didn't know) their most natural animation sequences are done with a technological cheat: the rotoscope. Essentially, the rotoscope used film of a live-action actor doing the desired activity and rear projected it onto a translucent glass drawing board that the animators would place paper over. In this way, they could draw the character frame by frame in such a way as to track the joints (control points) of the human actor.
Following suit, almost all of the CGI characters we see in film today are the product of the next-gen, high-tech version of rotoscoping. Humans wear funny suits with ping-pong balls attached to their joints and act out their parts in a special studio that can record the movement of those control points. Then (eliminating the animator), we employ a computer to match a 3D model of the character and its equivalent control points to the already recorded motion of the human actor. In fact, James Cameron took this a step further: with the help of special head-mounted cameras that recorded the actors' faces, he further animated the expressions on the CGI avatars' faces. (Obviously, that movie was Avatar.) All these techniques work well for cinematic sequences because once they are done, they are "in the can." The performances are frozen, and the characters do not have to dynamically interact with other characters or props in the future.
But with video games, the situation is different. The characters must interact within a wide range of variable contexts. It may not be infinite but the variations are far too numerous to record separately and stitch together. And even if you could record all the pieces separately and stitch them together, the seams would show.
Voice applications today still try to manage a lot of variable context with prerecorded snippets that are assembled as needed. Those of us working in the conversational voice industry refer to this kind of assembled audio as Frankenspeech. It makes us laugh...it leaves us in stitches.
Recently, some work was presented that starts with motion capture and adds a lot of AI deep reinforcement learning to refine the motions of our athletic avatars. DeepMotion Inc., a California company that develops smart avatars, has joined forces with researchers Libin Liu and Jessica Hodgins at Carnegie Mellon University to create a unique physics-based real-time method that allows avatars to improve their dribbling skills. Initially, the system was seeded with motion capture of skilled players performing basic dribbling scenarios. The motion capture is where it all starts, but the refinement and perfection of the skill are done in a physics-based simulation that is iterated through millions of epochs. Once these skills have been trained, the avatar can generate new, consistent, and believable arbitrary motions. And just as importantly, this smart avatar can generate smooth and natural transitions between these learned skills (transitioning from dribbling, to passing the ball behind the back, to dribbling between the legs, and so on). The thing that makes this at all practical is that the method can be simulated as fast as or faster than real time, making it suitable for a gaming experience. This research was presented under the title "Learning Basketball Dribbling Skills Using Trajectory Optimization and Deep Reinforcement Learning" at SIGGRAPH 2018.
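To get a feel for the "physics simulation plus learned policy" idea, here is a deliberately tiny sketch: a one-dimensional ball-bounce model and a gradient-free search over a single policy parameter (the downward push speed at hand height). Everything here — the constants, the reward, the hill-climbing search — is an illustrative assumption of mine, not the paper's method, which trains neural network controllers over full-body state with trajectory optimization.

```python
import math
import random

# Assumed constants for this toy model (not from the paper).
GRAVITY = 9.8        # m/s^2
HAND_HEIGHT = 1.0    # metres: the ball is pushed down from here
RESTITUTION = 0.8    # fraction of speed the ball keeps after the floor bounce

def bounce_apex(push_speed):
    """Height the ball returns to after one dribble, given the downward
    push speed applied at hand height (projectile motion + restitution)."""
    speed_at_floor = math.sqrt(push_speed**2 + 2 * GRAVITY * HAND_HEIGHT)
    speed_up = RESTITUTION * speed_at_floor
    return speed_up**2 / (2 * GRAVITY)

def reward(push_speed):
    """Negative apex error: a perfect dribble returns the ball
    exactly to hand height, so the best reward is zero."""
    return -abs(bounce_apex(push_speed) - HAND_HEIGHT)

def train(iterations=500, seed=0):
    """Hill-climb the single policy parameter inside the physics model.
    A real system optimizes a high-dimensional neural policy instead,
    but the loop shape — perturb, simulate, keep what scores better —
    is the same basic idea."""
    rng = random.Random(seed)
    best = 1.0  # initial guess for push speed, m/s
    for _ in range(iterations):
        candidate = best + rng.gauss(0.0, 0.1)
        if reward(candidate) > reward(best):
            best = candidate
    return best

if __name__ == "__main__":
    policy = train()
    error = abs(bounce_apex(policy) - HAND_HEIGHT)
    print(f"learned push speed: {policy:.2f} m/s, apex error: {error:.3f} m")
```

Because the bounce loses energy to restitution, the learned policy has to push the ball down harder than a simple drop would suggest — the same kind of compensation the trained avatars discover, at vastly larger scale, inside the full-body simulation.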
A video accompanying the research gives an overview of the approach and methodology and demonstrates some of the behaviors that can be expressed with the skills that the avatars learned.
While the final result of the interactive simulation is not perfect, I think you have to agree that the science of naturally animated athletic skills has taken a major step forward. It has become just a bit easier to suspend our disbelief. And just as in theater, the suspension of disbelief goes hand in hand with the perceived immersion of the experience.
This research team definitely did not drop the ball!
Opinions expressed by DZone contributors are their own.