Monica Manney
Welcome back to UVA Data Points. I'm your host, Monica Manney. For our fourth and final episode of season one, we're featuring a conversation in the area of analytics. As a recap, here's Professor Don Brown with a description of analytics.
Don Brown
So analytics, the really easy way to think about it is analytics is the part of our model that turns the data into useful information. For instance, in health, people have been using tissues since Leeuwenhoek, I guess, since the advent of the microscope. We've been looking at samples of the body and trying to understand what's going on. In pathology, you take tissue samples, and you're trying to understand what that tissue is telling you about possible disease states. It's very difficult to interpret. Pathologists spend a lot of time in school understanding how to look at those images and interpret them in ways that are meaningful to understanding disease states. Well, we're finding that we can build machine learning analytics that can really do as well, or better in some cases, because looking at that kind of data is not something that humans come about naturally. I mean, we're accustomed to looking at, frankly, the savanna, you know, trees, bushes, antelopes, lions, things like that. We're not accustomed to looking at cells and cell lines and trying to understand how these things are normal or different, or what's happening in them. So by applying analytics to that problem, we can really start to understand characteristics that are meaningful to predicting disease states, or to estimating disease states, and predicting trajectories of those disease states. Frankly, when people think about data science, they typically think only about analytics. And there's a reason for it, because analytics is really what turns the data, as I say, into something that people can use in order to make decisions and to understand what's happening in the world.
Monica Manney
For our exploration of analytics, we're diving into the world of sports. But just so we're clear upfront, this isn't simply a discussion of Moneyball. As you'll hear in the conversation, Moneyball was only the very tip of the iceberg. Because of the advances in machine learning, wearable technology, and computer vision, the field of sports analytics is a new game. Today's conversation gets into the details of these advances, as well as the ethics surrounding the implementation of this technology. To discuss all this, we've brought in two experts from the UVA School of Data Science, Natalie Kupperman and Stephen Baek. Natalie conducts research within the domain of sports medicine, and Stephen's research has a focus on geometric data and computer vision. Their combined knowledge leads to a wide-ranging conversation on the cutting edge of sports analytics. Given the complexities and nuances of the topic, we asked Professor Don Brown, who you just heard from, to moderate the discussion. Don is a heavy hitter here at UVA. In addition to his expertise in engineering, data science, medicine, and business, Don is also a key reason that the UVA School of Data Science exists. He was instrumental in the formation of the Data Science Institute, which eventually became the School of Data Science. Don also has a wealth of knowledge within the area of analytics, so he leads a wonderful discussion between Stephen and Natalie. So with that, here's episode four of UVA Data Points.
Don Brown
All right, well, hello, I'm Don Brown. I'm the Senior Associate Dean for Research in the School of Data Science. I'm also the co-director of the integrated Translational Health Research Institute of Virginia, and I'm a faculty member in the School of Data Science and in the School of Engineering. And I'll kick it over to Natalie.
Natalie Kupperman
My name is Natalie Kupperman. I'm an assistant professor in the School of Data Science.
Stephen Baek
All right, my name is Steve Baek. I'm an associate professor of Data Science here at UVA.
Don Brown
Okay, and I guess to kick things off, we'll just go around and talk a little bit about research that we've been doing that's related to sports and sports analytics. For me, it's been fairly recent. I've done a lot of work in the past in areas including time series analysis, and recently I've been looking at lots of time series data that's coming out of sports, particularly around exercise testing and things like that, using these methods and deep learning. How about you, Natalie?
Natalie Kupperman
So my research started on the sports medicine side, I'm trained as an athletic trainer, that's most of my education. That pivoted a little bit when I learned about the sensors that sports teams were using and how we could use those sensors to start to understand injury risk on the field. So then when I was working on my PhD, I was specifically working on how do we take all of that sensor data and other data that we collect on the medical side and try to put it together to reduce injury risk. And so I'm continuing that on and also broadening my scope a bit in the sports analytics realm.
Don Brown
That's great. Thanks. And Steve?
Stephen Baek
So my research is about shapes. I'm interested in analyzing geometric data. And from my perspective, the human body is such an interesting geometric object. So my PhD research actually started, you know, with this concept of modeling human body shapes in a data-driven way, so that we can have a computational model of people, and then analyze a variety of different aspects of people using those computer models. So that's kind of how all this got started. And then sports analytics, I think, is kind of a general connection to my original research, because, you know, it is related to the human body and how it moves and things like that. So I use the tools of computer vision and computational geometry and machine learning to understand and analyze how people move from the perspective of their geometrical change of posture and body shape, those kinds of things.
Don Brown
That's great. You know, when people think about sports analytics, they naturally gravitate to Moneyball and those ideas. And I think it's natural, because that, obviously, was a cool book and a cool movie and all that kind of stuff. But in listening to you, it seems like you're both doing things that are quite different from that, as am I. So maybe, Natalie, can you talk a little bit about the evolution of sports analytics as you've seen it, and how you're involved in different aspects of it?
Natalie Kupperman
Yeah. I like to tell people that we've been interested in athletes since the Greeks and the Olympics. The Greeks studied their athletes to make them better performers. And then, in modern times, since the, you know, 70s and 80s, we started to collect more data on our athletes. Stats have gotten better, and stats started to become more publicly available, which is where Moneyball came from. Moneyball was looking through just tons and tons of stats and amalgamating that data and creating models. Since then, we've gotten even more technology, including sensor technology. We store our data better, from the medical side to the stats side. We have Next Gen Stats. So where I think things are really going now is not just how do we make the stats better and the game outcomes better. We're now looking at individual athletes, and how do we make them healthier and better performers on the field, whether they're at the high school, college, or professional level. And that's where some exercise testing comes in. That's where the sports medicine comes in. There are more cameras on fields, on pitches, which is where Steve's research is really coming into play on a large scale. So some of the problems now are, how do we pull this data together? Some of it is, how do we do this at scale? And then also, I think, how do we think about an individual athlete? And also, how do we think about a whole team, and not just the outcome of the game or who's the best player out there? It's really looking at these dynamical systems of sports.
Don Brown
That's great. Steve, what do you think about the evolution of sports analytics?
Stephen Baek
So I think Natalie summarized things pretty well, so I can only talk from the perspective of computer vision. Computer vision has been around for a while in the sports industry. It's been used to, you know, produce all these cool replays and analysis results in TV broadcasts and things like that. But recently, there has been huge progress with all this deep learning and artificial intelligence stuff, where computers now have this cognitive capacity to tell where people's major body joints are located in each video frame. And then from there, we can generate a huge quantity of motion data of the players during the game and during practice. All of this creates a very exciting opportunity today, where you can collect all this data and then distill some interesting knowledge about how athletes perform and get injured and things like that. So that's kind of where all this exciting new trend is going, so to speak.
Natalie Kupperman
I think that's really where mine and Steve's fields are colliding: video analysis without markers. People who might be listening who know more about things like Vicon, where we use markers to look at motion analysis, know how time intensive that is, and how you only get maybe 10 steps of motion if you want to watch someone run, where now we can get an entire game. And the fact that we're to the point where they can get major joint angles is going to revolutionize how we look at injury in these sporting contexts. I don't think I can emphasize enough to people that that's a huge deal in our world.
Don Brown
Can you say a little bit more about this, I mean, how is it possible to get these joint angles, particularly in a game that has, you know, 22 players on the field just to pick one?
Stephen Baek
Right. With the traditional motion capture systems, because computers back in the day were not smart enough to understand where in the image or video people's body joints are located, people had to physically place visual markers on the body of the athletes. Those markers basically provide a visual cue for the computer systems to analyze and guess where the body joints are located, and then how the bones move as the athletes perform certain things. But those markers are, if you think about it, quite burdensome, because each time you want to scan the movement, you have to physically place the markers on the body of the subject. And that takes a lot of time, because you have to place the markers in a very meticulous way. And there are like hundreds of markers that have to be attached to the subject, which is super time consuming and tedious. But nowadays, with the advances of data science in general, what computers are now able to do is analyze all the pixels in the videos and images, and then make a very good estimate of where those body joints are located. And the fact that you are able to tell the body joints without any markers means that you can reconstruct the 3D body configuration of the people from videos and images. That means you don't really need any expensive motion capture system, and you don't need to spend your time on placing markers and things like that. You can use plain video cameras, even your smartphone camera, and you can record videos of people doing various things, and then eventually run these machine learning algorithms to tell where the body joints are located.
And then from there, you can compute the 3D joint angles, the joint velocities, and things like that, which can all become really useful information for people like Natalie to understand and analyze people's movement.
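As a rough illustration of the last step described here, the sketch below computes a joint angle from three estimated 3D joint positions and a joint angular velocity from a per-frame angle series. The function names, the coordinates, and the 30 frames-per-second rate are assumptions made for the example, not details from the conversation, and a real pipeline would first need the pose estimates themselves.

```python
import numpy as np

def joint_angle_deg(proximal, joint, distal):
    """Angle at `joint`, in degrees, between the segments joint->proximal and joint->distal."""
    u = np.asarray(proximal, dtype=float) - np.asarray(joint, dtype=float)
    v = np.asarray(distal, dtype=float) - np.asarray(joint, dtype=float)
    cos_theta = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    # Clip guards against tiny floating-point overshoot outside [-1, 1].
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

def angular_velocity_deg_per_s(angles_deg, fps):
    """Finite-difference angular velocity for an angle sampled once per video frame."""
    return np.gradient(np.asarray(angles_deg, dtype=float)) * fps

# Hypothetical hip, knee, and ankle positions in metres for one frame.
hip, knee, ankle = [0.0, 1.0, 0.0], [0.0, 0.5, 0.0], [0.5, 0.5, 0.0]
knee_angle = joint_angle_deg(hip, knee, ankle)

# Hypothetical knee angles over three frames of 30 fps video.
knee_velocity = angular_velocity_deg_per_s([0.0, 10.0, 20.0], fps=30)
```

The same two functions apply to any joint once its neighbouring joints are known, which is why marker-free joint locations are enough to get angles and velocities for a whole game.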
Natalie Kupperman
Yeah. And also being able to capture that with multiple people. Often when we use the markered systems, we have to be in a lab environment that doesn't carry over onto a field or a pitch. So not only do we get the individual, we also get the individual in the context of the team and the opponent, which then gets into a bit more of the competition uses for computer vision.
Don Brown
So let's talk a little bit more about that. In order to capture, again, a large game with a lot of players in it, you're going to need multiple cameras and multiple angles. Do you do integration of those data in order to make sure you've got the right player at the right spot across those cameras? Or how do you put that together to get one picture? Because obviously, depending on where the camera is, the player is going to be blocked, they're only going to be seen by certain other cameras, and you're going to have to look for the correct angles. It seems like it'd be a tedious thing. Obviously the algorithms that we've built will allow us to go through that very rapidly. But still, it's an issue how to integrate those different perspectives from the different cameras. So can you talk a little about how that's done?
Stephen Baek
Yeah, that's a great question, Don. There are various different recipes, so to speak, in terms of how we set up the cameras and how we point those cameras. I think the typical setup is that there are a multitude of cameras installed in a stadium from various different angles. I think people prefer to be more abundant, as opposed to finding the optimal number of cameras, exactly because of the reason that you just talked about. In a lot of team sports specifically, there are a lot of collisions, physical contacts, and things like that. And from a computer vision perspective, that's kind of a nightmare, because there's a player in an image, and then on top of that player there's another player obstructing the view. For example, football is kind of the most challenging problem, I would argue. And that's because all the players during the play are in very close physical contact, and that poses a lot of challenges in terms of finding the right view of where the important event has happened. So right now, I think the current paradigm is to just place as many cameras as possible, and then use the technology called camera calibration, which is the process of finding the parameters of how the cameras are set up. So basically, how the cameras are oriented, where the cameras are located in 3D space, what their focal lengths are, and those kinds of things can be precomputed.
And then once you have the precomputed set of camera parameters, what you can do is run what is called a pose estimation algorithm, which is essentially a machine learning algorithm to find the major body joint locations in the videos. That two-dimensional information then gets reconstructed in 3D by using the camera calibration matrices. And eventually, you can reconstruct the 3D view of the players on the field or in the practice areas, and things like that.
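To make the reconstruction step concrete, here is a minimal sketch of linear triangulation: given calibrated projection matrices for two cameras and the 2D location of the same joint in each view, it recovers the joint's 3D position. The camera intrinsics, camera placement, and joint coordinates are made up for the example, and the 2D points are generated by projecting a known 3D point rather than by a real pose estimation model. This is one standard textbook approach, not necessarily what any particular stadium system uses.

```python
import numpy as np

def projection_matrix(K, R, t):
    """3x4 camera matrix from intrinsics K, rotation R, and translation t."""
    return K @ np.hstack([R, np.asarray(t, dtype=float).reshape(3, 1)])

def project(P, X):
    """Project a 3D point X into pixel coordinates with camera matrix P."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

def triangulate(points_2d, projections):
    """Linear (DLT) triangulation of one joint seen by two or more calibrated cameras."""
    rows = []
    for (x, y), P in zip(points_2d, projections):
        # Each view contributes two linear constraints on the homogeneous 3D point.
        rows.append(x * P[2] - P[0])
        rows.append(y * P[2] - P[1])
    _, _, vt = np.linalg.svd(np.stack(rows))
    X = vt[-1]                 # null-space direction of the constraint matrix
    return X[:3] / X[3]

# Hypothetical calibration: identical intrinsics, second camera offset 1 m along x.
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
P1 = projection_matrix(K, np.eye(3), [0.0, 0.0, 0.0])
P2 = projection_matrix(K, np.eye(3), [-1.0, 0.0, 0.0])

true_knee = np.array([0.3, -0.2, 5.0])   # made-up 3D joint position in metres
observed = [project(P1, true_knee), project(P2, true_knee)]
recovered_knee = triangulate(observed, [P1, P2])
```

With more cameras, each extra view simply adds two more rows, which is one reason an abundant camera setup helps when some views are blocked.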
Don Brown
So, it seems to me, as you say, football has got to be one of the most challenging ones. And I'm imagining, for instance, baseball and cricket would be among the least challenging, but correct me if I'm wrong on this, and maybe hockey and basketball in between, because of the speed of motion. There's a lot of contact in those, but they're very, very fast. Talk a little bit about the kinds of sports that this has been used for, the ones where you think it's been particularly successful, and the ones for which we've got some work to do.
Stephen Baek
So, from the things that I know, I think baseball was the sport where all this computer vision application started. Baseball, obviously, is a sport that has, I think, one of the least amounts of contact between players. And if you think about it, all the players are sort of distributed on the field, and they're kind of far apart from each other. So from a computer vision perspective, it's a much easier problem compared to high-contact sports like football. I think about a decade ago, there were people developing an app, basically, where you can point a camera towards a pitcher, and the algorithm behind the camera basically analyzes all the body movement of the pitcher, so that they can calculate the release velocity of the ball, how their joints move when they throw a ball, and so on and so forth. But now, with the advances of computer vision technology, I think we are gradually tackling more complicated problems like football. And one of the things that allows us to solve these kinds of highly challenging situations is the fact that machine learning algorithms can now do so much more than just recognizing pixels. They can actually make a pretty reasonable guess in case there's an occlusion, for example. So when the player's posture is hidden because of some occlusion, nowadays what machine learning algorithms can do is make a reasonable guess from the information that's available.
Given things like what happened before and after the current frame, you can sort of interpolate their body movement from those adjacent frames. Or, in case there are multiple camera views, you can combine all this information to reconstruct and make a reasonable guess in terms of what happened in that occluded scene, that sort of thing.
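The idea of filling in an occluded joint from the frames before and after it can be sketched with plain linear interpolation. This is a deliberately simple stand-in for the learned models described here: the function name and the wrist track are hypothetical, occluded frames are marked with NaN, and the sketch assumes each coordinate is visible in at least one frame.

```python
import numpy as np

def fill_occluded(track):
    """Linearly interpolate NaN (occluded) entries of a (frames, dims) joint track."""
    track = np.asarray(track, dtype=float)
    filled = track.copy()
    frames = np.arange(len(track))
    for d in range(track.shape[1]):
        visible = ~np.isnan(track[:, d])
        # np.interp fills gaps from neighbouring visible frames and holds the
        # nearest visible value at the start or end of the sequence.
        filled[:, d] = np.interp(frames, frames[visible], track[visible, d])
    return filled

# Hypothetical 2D wrist track: frames 2 and 3 are hidden behind another player.
wrist = np.array([
    [0.0, 1.0],
    [1.0, 1.5],
    [np.nan, np.nan],
    [np.nan, np.nan],
    [4.0, 3.0],
])
filled_wrist = fill_occluded(wrist)
```

A learned model can do better than a straight line because it knows how bodies tend to move, but the input is the same: the visible frames on either side of the gap.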
Natalie Kupperman
I think it's important to note, too, that a lot of the computer vision first came up for competition and rules' sake. We saw Hawk-Eye being used in tennis, first at the US Open, to corroborate the line judging, and that was really revolutionary in the sport of tennis. We've seen some of it in baseball too, with the rules, with umpiring calls. And now we're going to see it in the World Cup with FIFA. They are instituting computer vision to do their offsides calls. And what I think is really interesting there is that as of now, and this could change once all the players arrive for the World Cup, the ball itself is going to be sensored for those offsides calls. So I think it's important to note that this stuff came a lot from competition and all the debates we have around line calls, and safe or out, or offsides. In soccer and hockey, offsides is a big deal, and a tough call to make when the game is moving so quickly. And because all of those cameras were put into stadiums, scientists like Steve were able to go in and say, hey, we can do even better, and we can actually look at these athletes in a very microscopic way, at their joint angles. So I think, depending on where your interest in sports is, there's a lot going on on both sides, competition and health. I'm really looking forward to it. I'm not the world's biggest soccer fan, or football fan, I guess, if we're talking about the World Cup, which we should probably call football to appease everyone else in the world besides Americans on that term. But I think it's going to be really interesting to see how the fans react to having offsides called at, like, a millimeter. And does that really go to the spirit of the game? I think those conversations are so interesting, from how the game is played to just the watercooler talk the next day after the game.
That being said, because those cameras will be installed in all the World Cup stadiums that they're playing at in December, all of their joint angles and all of that possible injury information or health information will also be available to at least FIFA. Maybe researchers eventually.
Don Brown
Well, let's explore that for just a second. It's not just joint and health information from the standpoint of, well, injury prevention and things like that. You can actually make inferences about the performance of the athlete, can't you, based upon all the information you're getting? So talk to me a little bit about, I will say, the privacy and ethical issues for the athlete, because now you can actually judge performance. You can say something about how the athlete has done in various situations, can't you, with this technology?
Natalie Kupperman
Yeah, I think it's easiest to speak about this at the professional level, mostly because all of the data is brokered in their collective bargaining agreements. So at the professional level, their data is very secure. If we're using the MLS, to stay on the soccer example, the MLS might be collecting data. But if the MLS is collecting it, then only the MLS has access to it, and the data that the teams collect, the teams have access to. And then I know in the NFL, they're doing a lot of work with injuries. And not even the NFL has access to that data. Only their research teams can look at the data from the injury levels. So there are a lot of levels of protection. And I'm just grazing the surface on all the technical nuances of these contracts. But I will say I do feel like most professional sports in the US, or the collective bargaining agreements, I should say, have done a good job on brokering how that data is secured. I think as we move down levels, there isn't that interface between the people who are paying for the technology and collecting the data and the people that data is being collected on. Without an interface there to have collective bargaining agreements on, that's going to vary from institution to institution on how that data is collected and secured, and possibly used, which we could probably do a whole podcast on. And I'm sure Steve has something to add to that, too.
Stephen Baek
Yeah, adding on to the data security issue, another thing I want to mention is AI being used in the decision-making process. I'm pretty sure you've watched the movie called Minority Report. Spoiler alert, what that movie describes is a future where there is an artificial intelligence which can predict crimes, so that even before the crime is committed, the AI can tell who's going to commit a crime, when and where. So what the movie is telling us about is this kind of dystopian future where people get arrested even before committing a crime. I don't know if this analogy is appropriate, but I'm also concerned about this possibility of AI being used in a decision-making process that is not totally fair to the players, if you will. What I'm trying to say here is, AI algorithms are never perfect. They are not 100% accurate. But people tend to have this biased perception of data science and all this AI technology where, oh, the data says X, and then that must be true. That's a bias. And I think one thing that the data science community will need to talk about is how, for example, injury prediction data gets used in the decision-making process, or how performance prediction data gets used in the decision-making process, given that those prediction algorithms are not 100% accurate, right? So I think that poses a very interesting ethical dilemma that the whole data science community has to tackle.
Natalie Kupperman
And on the prediction side, when I go in and talk to coaches and I'm like, how could we help? If we could solve any problem for you, what's the problem you want to solve? They're like, we want to predict injury. And I get that. I would also love to be able to know that the next day an athlete's going to be injured, so we can somehow help them not get injured. However, that's not possible. We can't predict all injury in sport. So I always try to temper my words, and that's why I try to use words more like injury risk mitigation. So what if we think that someone is trending towards a possible injury? What are steps we can take beforehand to either change their workload, change their equipment, something like that, that doesn't inhibit their performance or their chances of being on the field, but can help keep them healthier, for whatever injury we think they're headed towards?
Don Brown
You guys have raised two very interesting points, at least. And I want to come back to the prediction, particularly of, I'll say, the strategy and tactics in a sport. But let's follow up, Natalie, on this idea of injury prediction. It seems to me that what you need for injury prediction is not just what's on the field. You need all this stuff before the game. And I know that we've talked a lot about cameras, but basically, the athletes are sensored up. You're talking about the sensoring on the ball, the puck, you know, everything is getting sensored. And they're really sensored up in the sense that you're measuring, what, heart rate?
Natalie Kupperman
Heart rate is actually quite challenging. We get a lot of dropout.
Don Brown
What are we measuring?
Natalie Kupperman
In a lot of sports, athletes will wear a sensor either on their back in between their shoulder blades, or there are ones that fit kind of in the waistband of their athletic shorts or pants. Those sensors have a 3D accelerometer, gyroscope, and magnetometer. So we're getting directional movement, we're getting impact motion, we're getting accelerations and decelerations. If they're playing outside, those sensors also pick up GPS, so it's using the satellites in the sky. It's captured at a lower rate, so the accuracy isn't the best, but it does give us a fairly good picture of at least how many miles they ran, and how many times they accelerated or decelerated, and at what intensity. Deceleration tends to be more of a concern when we think about injury. So we have all of that. I do think one of our next steps, and probably people are working on this now, and we'll probably work on it here at Virginia at some point, is understanding how the camera data and the sensor data interact with each other, and how they are similar or different, because the camera systems are very expensive, and our athletes typically don't practice and compete in the same places. But their sensors are always with them. So if the camera data is giving us this really amazing information to work with, how can we amalgamate that with the sensor data so that our sensor data becomes even better? Because the sensor data can be extremely noisy. However, it's leaps and bounds over guesstimating how many miles they ran in practice. So that's one of them. Some teams do use heart rate. It's just very noisy, and there's a lot of dropout. So with the teams I work with, I tend to stay away from looking at heart rate too much, and I have not found a lot of uses for it. Most athletes that we study are very healthy, and their heart rates don't really give us that much to go off of, frankly.
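The kinds of summaries mentioned here (distance covered, and how many times an athlete accelerated or decelerated hard) can be sketched from a speed series sampled at a fixed rate. The speed trace, the 1 Hz sampling rate, and the 2 m/s² threshold below are all hypothetical values chosen for the example; real systems use their own sampling rates, filtering, and thresholds.

```python
import numpy as np

def count_bouts(mask):
    """Count separate runs of consecutive True values in a boolean series."""
    mask = np.asarray(mask, dtype=bool)
    if mask.size == 0:
        return 0
    # A bout starts at index 0 if True there, or wherever False flips to True.
    return int(mask[0]) + int(np.sum(mask[1:] & ~mask[:-1]))

def summarize_session(speed_mps, hz, threshold=2.0):
    """Distance (m) plus counts of hard acceleration and deceleration efforts."""
    speed = np.asarray(speed_mps, dtype=float)
    dt = 1.0 / hz
    # Trapezoid rule: average speed over each sample interval times its duration.
    distance_m = float(np.sum((speed[1:] + speed[:-1]) / 2.0) * dt)
    accel = np.diff(speed) / dt
    return {
        "distance_m": distance_m,
        "accelerations": count_bouts(accel >= threshold),
        "decelerations": count_bouts(accel <= -threshold),
    }

# Hypothetical 1 Hz speed trace in m/s: one sprint up, hold, then one hard stop.
session = summarize_session([0, 3, 6, 6, 6, 3, 0, 0], hz=1)
```

Because the raw signal is noisy, a production version would typically smooth the speed trace before differentiating it; that step is omitted here to keep the sketch short.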
Don Brown
Yeah, the data integration, you know, that's been an area of research for me for a long time across a bunch of domains. That clearly seems to me to be another dimension to where we can go, because it's not just video. It's all the other sensors that are available. But I guess I want to go back to the question I was asking you earlier about the injuries. It seems to me that to predict an injury, don't you need to monitor them before they're playing the game? In other words, when they're working out? Ironically, I was speaking to a former Penn Stater who went on to play in the NFL. He has two Super Bowl rings. I just happened to talk to him this weekend, and he was talking to me about a number of these issues. He said that at Penn State they have, without a doubt in his mind, the toughest workout, that the workout is way tougher than the game. He said Joe Paterno was fanatic about getting people to really hit hard during workouts and training sessions. So I'm wondering, monitoring those workouts and training sessions, it seems to me, is at least as important for injury prevention as the game. It's 100%.
Natalie Kupperman
And we know from epidemiology research that more injuries happen in practice, for multiple reasons. There are more people participating in practice, so there are just more chances for injury. Oftentimes, practices are longer than games, so our exposure is longer. So yes, we definitely need to be looking at practices. And they always have the sensors on during practice. So we can look at their, we use this all-encompassing term called workload. We can look at their workload leading up to games, and how their workload trends over weeks and within a week, compared to their teammates and to themselves. I'm sure a lot of people are familiar with tapering. In sports, we tend to have season-long tapers, we also have weekly tapers, and the taper depends on what part of the season you're in. So we can look at those trends. It's also really important that we not only look at the team practices. Many athletes will do individual work, and oftentimes that individual work is one on one. So it can be more intense, as in they're getting more motion within a shorter amount of time. That intensity can lead us to believe that certain types of tendon injuries might be more probable or are coming up. So making sure that we're monitoring them in those individual sessions is also really important, and then putting that whole picture together. When we talk about college athletes, too, we also have to think about, and this is any athlete, but I think with college athletes the most, where we are in the academic year, and what's going on outside of sport that could be adding stress that could also lead to injury. We know that when people are stressed or anxious, they tend to get less sleep, and how does that fatigue play into injuries? I will say the injuries that we're probably the best at mitigating with this technology are overuse injuries.
I don't think we'll get to the point where I can tell you four days ahead that someone's probably going to break their finger in the next game. There are just injuries that happen because sports happen. When I think about this technology, I'm really thinking about a lot of lower extremity or low back pain type injuries, the tendon injuries, tendinitis, Achilles tendinitis, stress fractures, those types of injuries where we can maybe modify things like workload, or add in different recovery modalities or rehab therapies, to help them mitigate that if we're seeing trends in their data.
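Looking at how workload trends over weeks can be sketched as weekly totals and week-over-week change. Everything specific below is an assumption for illustration: the daily load numbers are invented, "load" stands in for whatever workload measure a team uses, and the 25% flag is an arbitrary example threshold, not a clinical guideline. The sketch also assumes no week has a total of zero.

```python
import numpy as np

def weekly_totals(daily_load):
    """Sum daily workload into complete 7-day weeks (any trailing partial week is dropped)."""
    daily = np.asarray(daily_load, dtype=float)
    n_weeks = len(daily) // 7
    return daily[: n_weeks * 7].reshape(n_weeks, 7).sum(axis=1)

def week_over_week_change(weekly):
    """Fractional change of each week's total relative to the week before."""
    weekly = np.asarray(weekly, dtype=float)
    return (weekly[1:] - weekly[:-1]) / weekly[:-1]

# Hypothetical daily loads for one athlete over three weeks (arbitrary units).
daily = [100] * 7 + [110] * 7 + [150] * 7
totals = weekly_totals(daily)
change = week_over_week_change(totals)

# Illustrative only: flag weeks whose load jumped more than 25% over the prior week.
flagged = change > 0.25
```

The same functions can be run per athlete and per team, which gives the two comparisons mentioned here: an athlete against teammates and an athlete against their own history.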
Don Brown
Yeah. This is a great segue back to what I was asking you about the heart rate. It's actually not the heart rate itself. It's heart rate variability. Yes, the heart rate variability is very predictive of
Natalie Kupperman
Heart rate variability is phenomenal, but it's very hard to get during a game. Before the games we can. For a while, teams were using something called Omegawave to get HRV. It tends to be a little obtrusive, and it's hard to get people in that relaxed state. Now we're seeing more and more teams use the Oura Ring. Athletes can wear the Oura Ring all the time or just at night, and its heart rate variability data is very, very good. And so I'm really excited to see what we can do with that. That's something we're exploring here at UVA.
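For readers unfamiliar with how HRV is quantified, one common time-domain measure is RMSSD, the root mean square of successive differences between heartbeat intervals. The sketch below assumes a list of beat-to-beat (RR) intervals in milliseconds is already available; whether the devices mentioned here report exactly this measure is not stated in the conversation.

```python
from math import sqrt

def rmssd(rr_intervals_ms):
    """Root mean square of successive differences between RR intervals (ms)."""
    if len(rr_intervals_ms) < 2:
        raise ValueError("need at least two RR intervals")
    # Difference between each interval and the one before it.
    diffs = [b - a for a, b in zip(rr_intervals_ms, rr_intervals_ms[1:])]
    return sqrt(sum(d * d for d in diffs) / len(diffs))
```

A perfectly regular heartbeat gives an RMSSD of zero, and larger values mean more beat-to-beat variation.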
Don Brown
Yeah, just another little side note on what we're talking about. For those who are listening, heart rate variability is actually a good thing for just a normal person, not only an elite athlete, to use. People are always surprised when we talk about this to know that heart rate variability is good when it's high. That's a healthy indicator; low variability is bad. So it's like a lot of things in, frankly, physiology: the greater the variability, the better it is. What about you, Steve, have you used computer vision around these training kinds of things? Or has it all been on the games themselves?
Stephen Baek
No, I think training is where computer vision is most frequently used in the current state. Because, like what Natalie said, computer vision is particularly good at detecting those overuse injuries and things like that. Those unfortunate accidents that happen on the field, those are something that we cannot really predict. But what we can predict are things like a repetitive pattern of, you know, a shoulder exposed to some high or abnormal amount of deceleration. Those kinds of things are what we can capture and measure. So a lot of teams and organizations actually use computer vision to monitor their practice sessions, so that they can accumulate data on what kind of movement patterns repeatedly emerge during practice for each player. That's kind of the use case scenario. People also use computer vision systems for a sort of screening test: what is your range of motion, how far can you bend your shoulder, your elbow, those kinds of things. With modern computer vision based motion capture systems, you can kind of automate that process, so that it becomes easier to collect those data, meaning that you can monitor people more precisely with more frequent measurement. So I would say those are the use cases of computer vision in training.
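The range-of-motion screening mentioned above comes down to measuring a joint angle from tracked body points. As a rough sketch, assuming a pose estimation system has already produced 2D keypoints (the coordinates and the function name here are hypothetical), the angle at a joint can be computed from three points:

```python
from math import acos, degrees, hypot

def joint_angle(a, b, c):
    """Angle in degrees at point b, between segments b->a and b->c.

    For an elbow, a, b, c would be shoulder, elbow, and wrist keypoints.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1, n2 = hypot(*v1), hypot(*v2)
    if n1 == 0 or n2 == 0:
        raise ValueError("keypoints must be distinct from the joint")
    cos_theta = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against floating-point values slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return degrees(acos(cos_theta))
```

Tracking the largest and smallest angle across the frames of a movement would give a simple range-of-motion estimate, under the assumption that the camera view is roughly side-on to the joint.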
Don Brown
Yeah, that's great. I want to go back to your Minority Report example, and this idea of prediction, particularly around tactics and elsewhere in the game. I'm trying to think, it must have been at least four years ago now, I had an undergraduate who went to work for the summer for a professional baseball team. And he built a system that was pretty cool. It basically gave them a way to predict how to set up their infield defense, so where the second baseman and the shortstop position themselves based upon the current pitcher and the current batter. As soon as the batter came up, they would adjust. It was looking just at the data and the way in which this particular pitcher and this particular batter were likely to interact, so that if the batter would hit in a certain direction, the infield defense was adjusted accordingly. It turned out that worked real well for that team. But I guess, you know, if that's four years ago, now the state of the art is obviously much better. Can you talk a little bit about the tactical side of things? In other words, let's just play out the American football analogy. Let's say you're at third down and four or something, and you're facing a certain team. Can you see a point where we can run a predictive set of models against that situation and give an outcome that's relatively accurate based upon the current setup of the people on the field?
Stephen Baek
I think for that, Natalie is a better person to answer.
Natalie Kupperman
Yeah, I think we'll get there, especially if we have historical data on teams and we know how they typically react to certain situations on the field. And I wouldn't be surprised if this is already happening on individual teams. We don't often get to know what individual teams are doing. It's proprietary information that they don't always like to share with researchers. This also brings up some conversations I've had with another professor here, Alex Gates, in network analysis: when you have certain groups of players on the field, do they interact differently, and adapt to situations differently, than other groupings of players? I think that is really interesting, because one of the things we've kind of danced around in this conversation is that sports has a huge human psychology aspect. We can have all the historical data and all the models we want, but humans are humans, and they make decisions based on in-the-moment feelings. And that might be who their teammates on the field are and who they feel comfortable with. Who do they kind of have this second sense about, knowing that I'm going to catch this ball, even if it's not the play we talked about? And I think that's some really interesting work to get into with Alex and his network analysis in sports. I was also thinking about this while watching the basketball game last night. I think UVA held NC Central in the first half to like 29% shooting or something, it was somewhere around there. And I was thinking, wow, it would be really cool for computer vision to tell us how much of that, because the announcer said "UVA held" as if UVA blocked every single one of those shots, how much of that was UVA's defensive strategies, versus how much of it was more of an unforced error type issue? And how does that look on the court, when you put it into a computer vision world?
And I think those are some really cool questions that we'll be able to start answering that I think could really change how coaches, you know, think about the sports and how they set up strategies and who they put on the court, and what combinations of players they put on the court to achieve certain outcomes.
Don Brown
Yeah. Well, I mean, the psychology, though, is detectable, isn't it? Or at least aspects are detectable by computer vision, aren't they, Steve? Or are they not?
Stephen Baek
There are certainly attempts to detect psychology using things like facial recognition. But psychology is something that is more difficult for computers to process, because it is all very subjective. In terms of training an algorithm on how human psychology works, all these data points are very subjective, meaning that it's very hard to find deterministic rules like, oh, if somebody's face looks like this, this means anger, or if somebody's face looks like this, this means depression, or something like that. It's very hard to define that linkage, and that is kind of the challenge. So psychology is probably something that data science models are not particularly good at at the current stage. But who knows, there is a lot of research going on in terms of analyzing the brainwaves of people, trying to find some quantitative evidence of human psychology, and then trying to incorporate that into the decision-making process in sports science.
Natalie Kupperman
And then just add a helmet when the computer is trying to do facial recognition. That's a whole other challenge. Yeah. And they have their faces obstructed by a mask.
Don Brown
Yeah. But of course, in basketball, you've got their full face. Yeah. So in the example that you were just giving, you can actually detect, I would think, certain characteristics, but
Natalie Kupperman
I think so. Yeah, it was. It's funny, when Steve was talking about pose estimation in the beginning: one of the first talks I went to about pose estimation was about basketball. They were talking about how they track the same player throughout the game. The first thing was the number, which is the most obvious. Usually you can only see part of it, and the computer can figure out, oh, that's 21. The next thing that they said worked for their model was nose features. You know how some people have such differently shaped noses, and the computer is like, oh, I can use that person's nose to identify them if I can't see the number. And so it's very close to facial recognition between athletes.
Don Brown
The next would be the ear, no?
Natalie Kupperman
Probably. Probably better than the arm. I'm sure all the arms look alike to the computers. So yeah, there are going to be some interesting avenues of AI there. With human psychology, there's a reason there's a whole field dedicated to trying to understand human emotion and the mind. And we talk about it all the time in sports.
Don Brown
Can you talk a little bit about what you think are the current really fascinating challenges that we have right now?
Natalie Kupperman
Yeah, from the research lens, I sometimes wear both hats of working with individual teams and also trying to think about these larger research questions. It's really getting all this data together. I think a lot about how we go through this entire data science pipeline. We'll use the U-shaped pipeline, where we're going from the sport back to the sport. How do we collect that data so it comes into our system really nicely? How do we then put all of those data points together? Because oftentimes, and it's maybe not the right answer for this podcast, our best answers come from a really good query. If we have that data really well designed and laid out, we can run a quick query and get most of the answers that we need quickly. But that doesn't happen if all of your data isn't together and working seamlessly. Then there's also the challenge of how you get video into those common databases. And then how do we stream that back out to a human-readable side for our stakeholders: coaches, athletes, fans? They're not necessarily going to understand the sensor data or the camera data or the medical data that we brought in. So I really think of this more as a design problem of putting all our data in, making it work really well together and at scale so we can continually add data in quickly, and then being able to seamlessly put it out into reports, dashboards, and fan engagement platforms. That is probably the thing I think about the most. That's on kind of the individual sports side. But that leads to the larger research side of trying to get more data from other schools together. And the biggest issue is that no one's data is stored the same. We don't all collect our data the same.
So how do we create processes to still bring all of that data together? We're definitely working towards it. People don't like to share their sports data. I always try to tell people that, especially in college sports, we have so much turnover. If I'm asking for data from two years ago, there's nothing I'm gonna be able to predict competition-wise on your team this coming year. And we're also looking at it from a health perspective, which usually gets people more engaged. But this is something that we've been talking about and working on at UVA for years. And we're still just barely inching towards getting more of this data together to really solve the problem of how we keep our college athletes healthy, because many of these athletes don't go on to play professionally. We just want them to be healthy, active humans in our population. And so how do we keep them that way in college sports?
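The integration problem described above, where no two schools store their data the same way but a good query over well-organized data answers most questions, can be sketched with a small example. Everything here is hypothetical: the field names, the two schools' export formats, and the assumption that their workload values are on a comparable scale (in practice, different sensors often are not).

```python
import sqlite3

# Hypothetical exports from two schools that name their fields differently.
school_a = [{"athlete": "A1", "date": "2022-10-03", "player_load": 410.0}]
school_b = [{"id": "B7", "session_day": "2022-10-03", "load_au": 385.0}]

# Per-school mapping from a shared schema to each school's local field names.
MAP_A = {"athlete_id": "athlete", "session_date": "date", "workload": "player_load"}
MAP_B = {"athlete_id": "id", "session_date": "session_day", "workload": "load_au"}

def normalize(record, mapping, school):
    """Rename one school's fields into the shared schema."""
    row = {common: record[local] for common, local in mapping.items()}
    row["school"] = school
    return row

def build_db(rows):
    """Load normalized rows into an in-memory SQLite table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions "
        "(school TEXT, athlete_id TEXT, session_date TEXT, workload REAL)"
    )
    conn.executemany(
        "INSERT INTO sessions VALUES "
        "(:school, :athlete_id, :session_date, :workload)",
        rows,
    )
    return conn

rows = ([normalize(r, MAP_A, "school_a") for r in school_a]
        + [normalize(r, MAP_B, "school_b") for r in school_b])
conn = build_db(rows)

# Once the data share one layout, a single query spans both schools.
avg_by_school = conn.execute(
    "SELECT school, AVG(workload) FROM sessions GROUP BY school ORDER BY school"
).fetchall()
```

The per-school mapping is the part that has to be agreed on and maintained; the query itself stays short once that work is done.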
Don Brown
Okay, that's great.
Stephen Baek
Yeah, just to echo what Natalie just said, nowadays we have all these cool technologies to produce all sorts of interesting data, but when it comes to collecting them and making use of them wisely, I think we're still kind of behind, especially in terms of collecting and integrating all this data. Different teams use different data collection methods, different teams use different sensors, and so on and so forth. So how to collect those data and then integrate them together for some productive use is, I think, a remaining challenge. Another thing I want to add is that machine learning models and data science algorithms are getting more and more accurate. But on the flip side, they're becoming more and more sophisticated to analyze. So analyzing the decision process of those data science models is becoming more and more challenging. Let's say there's a machine learning model that can predict the probability of somebody getting injured. Then the question is why: what are the things that the machine learning model saw in the player's data that made it predict a certain probability of injury? We don't have a lot of technology to understand, visualize, and explain what's happening inside of machine learning models.
So I think that's another very important challenge, because we need that kind of interpretation to communicate with people who are not necessarily data scientists. We need to communicate the machine learning prediction to coaches, players, and other stakeholders, right? We need to communicate that clearly, and we need to have some human-interpretable justification of why that machine learning model has made that prediction. So I think that's another big challenge that we need to address.
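To make the idea of a human-interpretable justification concrete, here is a minimal sketch using the simplest case: a logistic model, where each input's contribution to the log-odds is just its weight times its value. The feature names, weights, and bias are invented for illustration and are not fitted to any data. The more sophisticated models discussed above do not decompose this easily, which is exactly the challenge being described.

```python
from math import exp

# Hypothetical weights for illustration only; not fitted to any real data.
WEIGHTS = {"workload_z": 0.8, "sleep_hours_deficit": 0.5, "hrv_drop": 0.6}
BIAS = -2.0

def explain_risk(features):
    """Return an injury-risk probability and per-feature contributions.

    In a logistic model the log-odds are an additive sum, so each feature's
    share of the prediction can be read off directly.
    """
    contributions = {name: WEIGHTS[name] * features.get(name, 0.0)
                     for name in WEIGHTS}
    logit = BIAS + sum(contributions.values())
    probability = 1.0 / (1.0 + exp(-logit))
    # Largest absolute contribution first, so the main driver is listed first.
    ranked = sorted(contributions.items(),
                    key=lambda item: abs(item[1]), reverse=True)
    return probability, ranked
```

A report for a coach could then say which factor contributed most to a player's score rather than presenting the probability alone.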
Natalie Kupperman
Stephen said something very important there. Especially on the sports medicine side, there's still a lot of just experimental, causal things that we need to solve. One of them: the sensors I was talking about that they wear between their shoulder blades, that's full body load, or full body accelerometry. We don't exactly know what that means for a knee tendon. If you get 500 workload units, does that change any of the structure of your knee tendon over time? Those aren't big data problems; that's controlling player load and measuring tendon changes over time. But those are very important pieces of this larger puzzle that can help us bring some interpretability to the machine learning. That also hits on a point that we made earlier, that the more human interpretability we can put in those models, the more protected the athletes are from decisions being made without a lot of guidance from humans, solely relying on computers to make those decisions for athletes. Remember, for professional sports, this is their livelihood; in college sports, this has the potential to be their livelihood, and now it is some of their livelihoods, really. And so we can't make decisions based solely on what a computer is telling us without some human interpretability for them. And I think all of our athletes and coaches deserve at least that much from us.
Don Brown
Yeah, that's great. So I guess, to conclude, can you guys think of applications outside of sports that some of the techniques you've been working on in sports analytics can be used for?
Natalie Kupperman
Ah, yes. I mean, human motion: every human moves in some capacity. And so I think that in itself is interesting. Staying in kind of the sports medicine or medicine realm, when we look at rehab throughout the lifespan, having some kind of sensor or computer vision to look at those human motions is incredible, especially when we think about remote therapy. Not everyone has access to high quality physical therapists in their hometowns. And so how can we use this technology to bring high quality rehab and physical activity to people who don't have direct access?
Stephen Baek
No, I can think of a lot of applications outside of sports. I think what Natalie just said, rehabilitation and precision medicine, is going to be one particular direction where this can go. From what I can tell from my own experience, I know that, for example, the U.S. Marine Corps is using this kind of sports analytics methodology to train their soldiers better and things like that. I also know that the manufacturing industry nowadays is paying a lot of attention to repetitive use injuries of manufacturing workers. You lift a heavy object, and then you repeat that motion over and over again, or you lift your shoulder in a certain way to reach a part on a shelf, and things like that. And there are companies in the industry that are developing computer vision and sensor technologies to monitor and detect all this repetitive use of motions in manufacturing. I also have a collaborator in the pediatrics department at UVA who's using computer vision to analyze the movement of infants. There's an interesting tendency where infants at risk of cerebral palsy tend to have very monotonic, repetitive movement, whereas normal babies tend to just move in various different ways. So technology similar to what is being used in the sports industry could also be applied to automate the process of predicting the future risk of cerebral palsy and that sort of thing. So I can see a lot of applications like that outside of the sports world that could benefit from this new technology.
Don Brown
Yeah, I would second that. From what I've seen, the various techniques that have been used for elite athletes to measure performance have direct applicability for normal people to improve their physical fitness and exercise mode. So I think this is all good.
Natalie Kupperman
And all of us have a sensor on us called our cell phone. It actually does pick up a lot of what we do every day, and a lot of people wear smartwatches or have their own Oura Rings that give great HRV and can tell us a lot about how we ourselves are doing as humans.
Stephen Baek
Yeah, you know, I recently came across this app which analyzed my golf swing, and it tells me why my swing sucks. So yes, I think I can see all this high technology kind of becoming more, how do I say, democratised, so that a broader group of people can start to benefit from those technologies. Which is pretty exciting.
Don Brown
Right? Right. So when we do sports, we get full benefit out of it. That's the goal,
Natalie Kupperman
Which is our golf game.
Don Brown
All right. Well, thank you very much. This has been wonderful. I've enjoyed talking to both of you.
Natalie Kupperman
Thank you.
Stephen Baek
Thanks, it was fun.
Monica Manney
Thanks for checking out this week's episode. And thanks for listening to the first season of UVA data points. We really enjoyed making this podcast and none of it would happen without listeners like you. We'll be back with a bonus episode in mid-December, which features a live discussion at our recent Datapalooza event. While this current episode concludes our four-part series on the 4 + 1 model of data science, we'll continue to release monthly episodes while we work on the second season of UVA data points. The subjects of these episodes will vary but still remain in the domain of data science. If there are any topics you're interested in hearing about, feel free to contact us at [email protected]. Thanks again for coming on this journey with us. We'll see you next time.