February 24, 2026

00:33:49

Defensibility in Human Trafficking

UVA Data Points


Show Notes

Would you be able to recognize the subtle red flags that someone is being controlled, exploited, or groomed?

In this conversation, we will dive into the complexities of understanding human trafficking and the role AI is playing to help law enforcement identify traffickers and their victims.

Our guests are Kimberly Adams, who leads the strategic architecture of AINA Tech, and Shweta Jain, AINA’s Co-Founder and Technical Architect, whose background in digital forensics and cybersecurity shapes the system’s design.

The conversation is led by Adam Tashman, Associate Professor of Data Science at UVA. Together, they discuss designing AI for defensibility, integrity, and institutional trust.

Adam Tashman is an associate professor of data science, Director of the Data Science Capstone Program, and former Director of the Online M.S. in Data Science Program. Courses taught include reinforcement learning, distributed computing, programming for data science, mathematical finance, actuarial statistics, probability and statistics, and survival analysis. Research interests include AI in personalized medicine, digital health, computer vision, large language models, and quantitative finance.

Kimberly Adams leads the strategic framing and execution architecture of AINA. Her work focuses on building AI systems that can withstand legal and institutional scrutiny, particularly in high-stakes environments such as human trafficking investigations. She has worked alongside DOJ-funded task forces and engaged with federal stakeholders to translate governance, procurement, and evidentiary requirements into system design constraints. Through programs such as NSF I-Corps and collaborations with academic partners, she structures how AINA retires institutional risk before deployment.

Shweta Jain leads the technical architecture of AINA, focusing on defensibility, constrained inference, and system integrity. Her background in digital forensics and cybersecurity informs the development of AI systems designed to operate under evidentiary standards. She oversees the rigor, feasibility, and long-term survivability of AINA’s core architecture. She is Chair of the Department of Mathematics and Computer Science at John Jay College, an NSA-designated Center of Academic Excellence in Cyber Defense.


Episode Transcript

[00:00:02] Margaux: Welcome to UVA Data Points. I'm your host, Margaux Jacks. Would you be able to recognize the subtle red flags that someone is being controlled, exploited, or groomed? In this conversation, we will dive into the complexities of understanding human trafficking and the role AI is playing to help law enforcement identify traffickers and their victims. Our guests are Kimberly Adams, who leads the strategic architecture of AINA Tech, and Shweta Jain, AINA's co-founder and technical architect, whose background in digital forensics and cybersecurity shapes the system's design. The conversation is led by Adam Tashman, associate professor of data science at UVA. Together they discuss designing AI for defensibility, integrity, and institutional trust. [00:00:57] Adam: Hi, I'm Adam Tashman. I'm an associate professor with the UVA School of Data Science. I direct the capstone program. I'm here today to talk with Shweta and Kim from AINA Tech about human trafficking and about the capstone projects that we've been running. So we're now in our fourth cycle of projects. We're very fortunate to have them as sponsors. The students have just learned a lot, and it brings together, you know, the different areas. Right. We have a four-plus-one model here, and they're touching on all the areas and really thinking about things from a system perspective. We've been very fortunate they've been able to work on pieces of the system for them. And with that, I was just planning to ask them both a few questions about the company and about the problem space. And Kim, this one's for you. If you could just tell the brief backstory of AINA Tech: what was the problem that you wanted to solve? [00:01:54] Kim: Hi, Adam. It's really nice to be here with you. Okay, well, AINA Tech started because we observed that high-stakes AI is useless if its logic can't survive a legal cross-examination. So we decided to use human trafficking as the proving ground.
And we are building a system where every recommendation is transparent, reliable, and ready for the witness stand. We call it the defensibility architecture, in which the inference engine, the AI, is embedded. From the beginning, our question wasn't how do we build a smarter model. It was how do we build a system that can withstand interrogation? [00:02:46] Adam: What motivated the desire to partner with academia on this? [00:02:49] Kim: Okay, so we partnered with UVA. We've partnered with several universities. However, we have continued our partnership with UVA because we needed independent rigor for what we were trying to research, build, and stress-test, and I'll get more into that. But we're working in a domain, which again is human trafficking, where performance metrics alone are not enough. We needed disciplined experimentation, benchmarking, and critical challenge. Right. We couldn't just internally validate ourselves here at AINA. So the capstone gave us structured, independent teams who could test our assumptions, question our outputs, and stress the system from multiple angles, which was very important. It also created a bridge between applied AI and academic integrity, which is really essential for the work we're doing. [00:03:56] Adam: Well, so human trafficking is, you know, a topic that has been coming up in the news, and many times people think, well, the news sensationalizes things. So then, you know, the casual observer may not realize, you know, how large a problem this is. [00:04:10] Kim: Yes, thanks for the question, Adam, because I think that there are so many misconceptions out there, and there are misconceptions that cause harm. So your Starbucks barista, the student in seventh grade, the neighbor who's walking about freely and is training for a marathon, et cetera, et cetera, could all be human trafficking victims.
Now, when we talk about vulnerability groups, yes, there are groups that tend to be more vulnerable to human traffickers, but I think the key word is vulnerable. And how the field has actually evolved in its communications, and unfortunately the perception still sticks, is that it began with posters and awareness campaigns showing people chained to a radiator. So when people observe someone who might be timid or shy, or is always with someone who is much older than them but not related, there are many things that might seem like they're probably nothing but could point to a human trafficking victim. So if you're at the Starbucks and you see someone who has gang tattoos or black eyes, I mean, that's very extreme obviously, but I'm trying to hit home the point that they are integrated into society. So that's the number one misconception people have. And that misconception is so ingrained that even after training, with law enforcement, Joe Q. Public, whoever it might be, it's still there: well, they are free to leave and live a life, so they must not be victims of human trafficking. In the United States, we also have the misconception that it doesn't happen here. It is the poor rice farmer from Malaysia who's selling his daughter for money into sexual slavery. That is false. Now, you may recall I mentioned earlier about labor trafficking victims, and a lot of them are seasonal workers and people from abroad. But in the United States, the majority of victims of sex trafficking are U.S. citizens living here. They were not sold from a poor country into the rich country of America. The other misconception that people have with human trafficking, and I think we're starting to see that this is not true, is that it is gang activity, or it is criminal network activity from Colombia, et cetera, et cetera.
Well, people who participate as traffickers, or the recipients of a trafficked human being, can be wealthy, they can be poor, they can be middle class, they can be anyone. Again, like the victim, it doesn't really have a stereotype. And again, we won't get into detail, but I think we've all seen recently that the people who participate in human trafficking networks, and/or the victims of human trafficking, are not your stereotypical Colombian gang lord or your East LA gang lord. The other misconception, and the reason I'm bringing up the misconceptions, is because this all lends itself to why human trafficking is so hard to prove. The criminality itself is not terribly difficult. But the perceptions, the confidence in what you're seeing, and, again, to mention the overlapping of other crimes, is where it becomes extremely difficult to prove in a court of law. And the fact, again, I repeat, is that the victims don't see themselves as victims most of the time. In the United States, we're thinking, oh, it's only in poor rural communities. And the fact of the matter is, sadly, I'm not going to exaggerate and say that, you know, everybody is a victim of human trafficking and it's everywhere. But what I am going to say is that there is no stereotype that you can rely on to signal to you that you have not or have encountered a victim of human trafficking, or a trafficker, or a buyer of a victim that's been trafficked. [00:09:00] Adam: It certainly is not something that starts and ends with demographics. Right. Profiling in that way actually gets into trouble and misses important signals. So there are some very subtle signals, and these need to really be handled with care. Okay, great. Thanks. Shweta, this question's for you. So what data science and AI skills are being engaged in the student work? [00:09:24] Shweta: Sure. Thank you for the question. So data science and AI skills are not the reason that we started this. We started these projects with a problem looking for an appropriate solution.
So the goal was not to tell the students that, hey, you're going to use this particular data science skill, but for the students to critically analyze the problem and come up with an appropriate strategy to solve it. So overall, students in every single capstone have been going through the entire data science cycle, from data curation to data cleaning, model training, analysis, and reporting on their findings, including the performance metrics as well as the limitations. Natural language processing and image processing are the two umbrella terms that I can use for the projects that they have done. Students were looking into image identification, and for that they first went out to gather a data set that was labeled already, and then a test data set which they would use to test their algorithms on. They were also augmenting the data so that it represents the real-life presentation of the data. So for example, if the image is bent or folded, that's the data augmentation. Then they did feature extraction and performance analysis over a large number of images. And so that was an image processing type of project. They also did natural language processing. So again, data curation and data cleaning, and then recognizing context and keywords within the data to extract certain indicators, and then ranking the documents themselves based on the indicators they found. So they followed the entire NLP pipeline here, from data curation and cleaning to sometimes even labeling the data and annotating it. Basically, to be able to explain the outputs they get, they were supposed to perform small tests and have input from subject matter experts so that they could validate the outputs independently. The current project is also going to require the whole cycle.
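At its very simplest, the indicator-extraction and document-ranking step described above amounts to keyword scoring. The sketch below is purely illustrative: the indicator terms, function names, and scoring rule are placeholder assumptions for this transcript, not AINA's actual pipeline, which uses expert-curated indicators and contextual models rather than a flat word list.

```python
# Minimal sketch, assuming a toy indicator lexicon: score documents by
# counts of indicator keywords, then rank highest-scoring first.
import re
from collections import Counter

# Hypothetical indicator terms (placeholders, not a real vetted lexicon).
INDICATORS = {"controlled", "debt", "confiscated", "isolated"}

def score_document(text: str) -> int:
    """Count occurrences of indicator terms after simple tokenization."""
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter(tokens)
    return sum(counts[term] for term in INDICATORS)

def rank_documents(docs: dict[str, str]) -> list[tuple[str, int]]:
    """Return (doc_id, score) pairs, highest-scoring document first."""
    scored = [(doc_id, score_document(text)) for doc_id, text in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

docs = {
    "report_a": "Her documents were confiscated and a debt was imposed.",
    "report_b": "Weekend weather forecast for the region.",
}
print(rank_documents(docs))  # report_a ranks above report_b
```

A real system would go well beyond this, with context-aware models, annotation, and the subject-matter-expert validation Shweta describes, but the curation-extraction-ranking shape of the pipeline is the same.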
Students will use models to get some outputs, and then those outputs will be independently verified by subject matter experts, which is very important to build explainable AI solutions. So I think that they have been getting trained in the entire data science life cycle. [00:12:48] Adam: Yeah, thanks a lot. I see it as being very valuable. Right. Because in the classes students might be learning the skills piece by piece, but this system really needs to be robust. Right. And requires thinking at the system level. So when you're talking about things like, you know, having to look at low-quality images and things like that, that would come up in real life, I imagine, quite a bit. [00:13:11] Shweta: Exactly. Yes. It's taking the entire knowledge that they have learned over the years in their data science classrooms and putting it to the test in a real-life project that ends up being integrated into a product. That is an experience that the students are getting. [00:13:31] Adam: Back to Kim. Kim, can you maybe discuss a little bit about what defensibility is and why it's so important to the product and your firm? [00:13:41] Kim: Okay. So most AI conversations focus on scale and efficiency. So when we look at the large language models, which are your ChatGPTs, Geminis, essentially you prompt and the system is designed to provide you with answers to your question as quickly as possible. They are not being built for high accuracy or defensibility. And by defensibility, what I mean is that we can trace the output, which we'll call the responses that a ChatGPT, for example, or a model built on top of a large language model, gives. They're not built to provide the user with the entire chain of custody. When I say chain of custody, I mean from the original question being posed and the response, which is frozen in time and cannot be tampered with, to why they got that answer, how the system has been programmed to get that answer, and how it relates to their answer.
So that I, as an end user, for example, can defend my position of why I used that response in my thinking, right? So I'm still very much involved, and they allow me to be involved. However, where it breaks down, and where the defensibility comes in, is that if, Adam, you asked me, well, why did you use that piece of information in your decision or the action you took, I need to be able to explain to you, as a reasonable person would, that this was the original context of the situation, and this was the information I received on how the system gave me a response, and why, with my training and other knowledge and experience that I had, I justified this decision or action. And that is really important when you're dealing in national security and public safety. A law enforcement officer has to justify to the court, to the chain, be it an investigator or prosecutor or a defense attorney, why they took the decision they took. [00:16:11] Adam: I mean, is it fair to say this system is kind of the eyes and ears, right? It's providing these signals to a law enforcement officer, and then they're kind of using that as input in their decision-making process. Right. [00:16:26] Kim: I wouldn't go so far as to say they're the eyes and ears. I would go as far as to say: a law enforcement officer, for example, that's one use case, is performing their duty. And in performing their duty, they hear certain things, and they may not pick up on what they're hearing, or associate what they're hearing with anything, which is a missed opportunity, because perhaps what was being said is a signal that this could be probable cause or reasonable suspicion to ask a few more questions. So what we're trying to do, or what we will be doing, is giving that officer, in real time, that signal. But that isn't enough. If you can't defend that signal, then it's just information.
And law enforcement, for example, in this use case, can get themselves in trouble; it will get thrown out. You can't act on information that hasn't been verified and validated. And that is the crux of the problem. When you're dealing with high-stakes environments, they need that information to be verified as coming from a credible source, along with the reasoning behind why it was considered a signal worth telling the officer about. And they also have to make sure that they can prove it was never tampered with, that the original conversation and the response and the signals that they received weren't tampered with. And again, that's why, when we look at law enforcement, the courts and people are very nervous, because it's like, well, if this AI gives this signal and then the officer just reacts to that, did they violate a human right, a social right? Did they have reasonable suspicion to detain that person? That may be the foundation for building a case. [00:18:33] Adam: Where do you see the field going? [00:18:35] Kim: I think we've already started to see where the direction is going. We're starting to see the regulations. [00:18:41] Shweta: Right. [00:18:43] Kim: So currently what we've seen a lot of is the defensibility part being an added feature, or one algorithm talking to another. Right. When we're looking at regulations and mandates for national security organizations, law enforcement, or businesses looking for illicit activities in their supply chain, or wherever it may be, the laws are getting stricter. They have mandates: yes, you can license this, yes, you can use this, but this thing you're using, for example, an AI tool, needs to pass certain, what we call, proof gates. And so where it's going is the proof gates are getting more stringent. Defensibility can't just be an afterthought or a tack-on, and information can't just be given and used to help make decisions that can't be defended anymore.
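The tamper-evidence requirement described here, proving that the original question, response, and signals were never altered, is commonly met with cryptographic hashing. The sketch below shows one generic technique, a hash chain; it is an illustrative assumption, not a description of AINA's actual chain-of-custody mechanism, which the episode does not detail.

```python
# Minimal hash-chain sketch: each record stores the hash of its
# predecessor, so editing any earlier record breaks every later link.
import hashlib
import json

GENESIS = "0" * 64  # placeholder hash for the first record

def append_record(chain: list[dict], payload: dict) -> list[dict]:
    """Append a payload linked to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": hashlib.sha256(body.encode()).hexdigest()})
    return chain

def verify_chain(chain: list[dict]) -> bool:
    """Recompute every hash; any tampering makes verification fail."""
    prev_hash = GENESIS
    for rec in chain:
        body = json.dumps({"prev": prev_hash, "payload": rec["payload"]},
                          sort_keys=True)
        if rec["prev"] != prev_hash:
            return False
        if rec["hash"] != hashlib.sha256(body.encode()).hexdigest():
            return False
        prev_hash = rec["hash"]
    return True

chain: list[dict] = []
append_record(chain, {"question": "status?", "signal": "possible indicator"})
append_record(chain, {"officer_action": "asked follow-up questions"})
assert verify_chain(chain)          # intact chain verifies
chain[0]["payload"]["signal"] = "edited"
assert not verify_chain(chain)      # tampering is detected
```

Production systems typically add trusted timestamps, digital signatures, and append-only storage on top of this basic linking idea.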
The liability is too high, from the court, from public opinion, et cetera. So where I think it's going, and that was a long-winded answer to say it: I think what we're going to see is that defensibility, like we've seen in the healthcare industry, is going to play a larger role, and there can't be as many gray lines or gray uses behind it. [00:20:11] Adam: Thanks a lot. I know that you selected human trafficking as the first use case. Can you talk about why that came first? [00:20:19] Kim: Yeah. Human trafficking is an egregious crime that affects public safety and national security, and actually liability in companies. Aside from that obvious reason, that it's good to make any impact on that activity that you can, we selected it because, when you look at the use of AI and the power behind AI, human trafficking is a perfect use case, a perfect domain in which, if done well, to use AI to make a real impact, and I'm going to say, in the lives of victims or survivors and those that are trying to help them, either through housing, or through reintegration, or through prosecuting their trafficker. So human trafficking, because it is a very complex, misunderstood crime, needs to go beyond the wonderful training work that's being done now. It's very complicated, and the training doesn't really go far enough. It ends at: okay, if I have overlapping crimes like domestic violence, prostitution, labor and wage violations, et cetera, how do I distinguish between human trafficking and these other crimes? And we see in the courts, or in prosecuted cases, convictions for prostitution, or cases brought up for prostitution, that are not earmarked as human trafficking but actually are human trafficking. So we see this throughout the criminal system, things you would not even associate with the exploitation of people. However, human trafficking played a huge role.
So let me back up and say that the two categories of human trafficking I'm referencing are labor trafficking, which in the United States is predominantly practiced by businesses using labor from outside of the country. Not to say we don't use domestic, I'll use the word domestic, people who are here, U.S. citizens, green cards, et cetera, but those that are coming in predominantly and primarily for seasonal work. And we have done very little with labor trafficking, because you have to understand, you have victims, both in labor and sex, I digress a little bit, that don't understand that they are victims of a crime. So especially with labor trafficking, many times you see the victims thinking, well, their life is better off in these slave labor conditions than it was back home. But it is a crime nonetheless. And not to get too in the weeds, but if you think of two competitors, and one is not paying their workers and another is, well, who's going to have the competitive advantage in pricing? And so it also reflects a very inaccurate economic picture, an unfair, again, competitive advantage. The other category, the one that people usually think of first, is sex trafficking. And sex trafficking is confused with prostitution. And again, as I mentioned previously with training, the training stops short of being able to really see the granular, nuanced difference between prostitution and sex trafficking. So I will give you just a quick "what is sex trafficking" under the law. Basically, if you're under 18 and you are used or exploited for sex in any way, that is sex trafficking. A person under 18 cannot be a prostitute, full stop, under the law across the United States. So when you see a 14-year-old being charged with prostitution, that means the investigators, the courts, did not understand the legal definition of human trafficking. And believe it or not, we see it all the time. That's number two.
There's a difference in the law for adults with sex trafficking, because with minors, there does not actually have to be the act; it can be the intent of exploiting the young person. So with an adult, there has to be evidence, proof of the exploitation of sex in exchange for drugs or money or whatever the asset may be. Whereas in the case of someone under 18, they do not need to have actually gone through the sex act, but there needed to be the intent of it. So, say, the transaction of money, and then the child did not have to actually do the act, because someone may have stepped in or what have you. So that was kind of in the weeds. In the United States, you have labor, you have sex. We have a lot of training, but the training does not get down enough to actually be able to detect it when a law enforcement officer encounters, perhaps, a victim of human trafficking. So they either may miss it altogether, or they'll misclassify the crime. So where AI, a well-trained model that can hold up in the justice system, is particularly useful with human trafficking is that it can pick up that nuance; it can pick up the subtle signals, or the obvious ones, that, hey, this looks like it's human trafficking. So I'll give you an example of a prosecutor. This was for labor trafficking. And what came across their desk was a marijuana plantation that violated the law. It was either too big, or they weren't licensed, or whatever the actual law was that the case was built against. When the prosecutor, who happened to be a specialist in labor trafficking and had just transferred over to a different department, was looking at it, she realized, actually, this is a labor trafficking case. And so mistakes like that, again, the misclassification of the actual crime, happen all the time, and it works against human trafficking. What I mean by that is human trafficking is usually not the crime detected.
So we wanted to apply the power of AI in a way that law enforcement, or business, or national security could actually detect it when they are confronted with the context and the language and the possibility of a human trafficking case that they're looking right at, in plain sight. [00:27:42] Adam: So Kim mentioned that the system is kind of bringing the awareness, whether it's to the patrol officers or others. And for it to have that awareness, of course, there needs to have been this data that was fed into it and curated and cleansed, et cetera. And that sounds like a lot of work, right? A lot of important, valuable work that AINA Tech is bringing to this problem. Now, of course, I won't get into detail, but at a high level, could you kind of, I don't know, say something about how you're approaching that problem? [00:28:21] Shweta: I would say that in order to train a model to recognize these nuances that Kim pointed out, we always need a human in the loop. So at every step we are involving subject matter experts. So the data that we use to train the model is curated with a lot of care. It should be representative, it should be diverse, so that it captures the personas and language of a large number of people of different cultures. There will always be a human in the loop to validate the data, to validate the labels and the annotations that are created by the AI, using the best practices in explainable and defensible AI. We are basically working in small steps to make sure that the data that we use to train the models is of high quality and has an accurate provenance. [00:29:35] Adam: So the next question is for Shweta. Can you talk about where you see the field going? [00:29:41] Shweta: Yeah. So as you know, there is a lot of debate about AI in general, AI in education, how AI is taking the critical thinking away from people.
So a lot of people are very worried about the harmful effects of AI, and we are too. I think the future is going to be for technologies that are making more responsible use of AI, that are intentionally training the system using the right type of data. Maybe we'll be moving towards more explainability for every single case where AI is used. In fact, I would quote a recent article in Communications of the ACM magazine where they were talking about training AI by giving it a sense of "I." So one of the examples that article gave was, instead of reading Moby Dick to the AI, they would say, "I am going to read Moby Dick," and then it goes on to read it. And so the AI gets a sense of what it is, what its persona is, in a better way than when we just give it all kinds of articles and try to train it with this, you know, wild data set. So I think the technology is going towards being more responsible by being more intentional about what data we are training it on. [00:31:11] Kim: And what we're seeing or hearing a lot about, for those of us who are not techie, is prompting, right? And the agentic. And I think there's going to be more space there, because the problem, as we all know, or maybe we don't, is that the responses you get, I mean, setting the defensibility part aside, the responses you get have so much to do with how you've prompted. And that's why we're excited, because in certain fields it would take too much time, and it runs too high of a risk, to prompt the system to give you the response that's useful. And then with the agentic, that is really when they're doing it for you. And clearly in our case, we cannot have a computer decide reasonable suspicion, for example. So I think the field will start to also evolve in how the interaction looks, by taking into consideration the limitations that certain inputs from the human bring to the level of accuracy and reliability of the answer. So that also, you know, just to.
To add on to Shweta and the tech part, I do think we might be seeing a little more interactions that aren't as prompt-driven. And I think we're starting to see that with the regulations. For example, we're using EU regulations because we want to go beyond the gold standard, because regulations can change. They might get stricter, they might be more liberal, but regulations also help any industry. And that's why I brought up the earlier point of where it's going, because we are seeing regulations for so many aspects of technology, around AI in particular for this conversation. And so I think we'll start to see people developing it with different constraints than they have been. [00:33:29] Margaux: Thanks for listening to this episode of Data Points. More information can be found at datascience.virginia.edu. If you're enjoying UVA Data Points, be sure to give us a rating and review wherever you listen to podcasts. We'll be back soon with another conversation about the world of data science.
