SMU creating human trafficking data 'warehouse'

Funded by DOJ grant, project provides law enforcement, researchers and policymakers with national data on victims, crimes and perpetrators. Video gaming and machine learning 'clean' and augment the data.

DALLAS (SMU) – SMU (Southern Methodist University) is creating a federally funded data warehouse to centralize data collection and support research into human trafficking in the United States.

An estimated 27.6 million people worldwide are forced at any given time to perform labor or engage in commercial sex acts. The SMU Human Trafficking Data Warehouse will give law enforcement agencies, researchers and policymakers a single, secure place where they can quickly analyze nationwide human trafficking data. Sections of the warehouse are set to be live this fall.

“Human trafficking data exists, but datasets are often siloed by individuals and organizations, which decreases the usability of the data,” said Beth Wheaton-Páramo, the economist leading the SMU human trafficking research team. “Many of these datasets are based on case studies, meaning they describe a very specific geographic or socioeconomic area of human trafficking that may not provide a full picture of what is happening across the nation.”

With the help of SMU’s high-performance computing capabilities, law enforcement and others devoted to anti-trafficking work will be able to use the datasets housed in the data warehouse to analyze trends in forced labor and sex trafficking far faster than a single investigator, who could need months to gather the same information from press releases, case studies and other sources. Data warehouse users will have credentialed access to a secure online dashboard, letting them identify trends in human trafficking across datasets in a way that was previously impossible.

M3 in the SMU Data Center. Credit: Ian Aberle


SMU offers the advantage of neutrality in storing the data, breaking down the potential for rivalry that can interfere with data sharing, said Wheaton-Páramo.

Funded by a $1.187 million grant from the U.S. Department of Justice’s National Institute of Justice, the project includes a study of the relationship between economics and human trafficking, for both victims and perpetrators, and the projected cost of doing nothing about this crime.

Wheaton-Páramo, author of 2019’s The Economics of Human Rights, is working with Raanju Sundararajan, SMU assistant professor in statistics, on the cost analysis, as well as on an examination of spatial and temporal patterns in human trafficking data.

Dark Shadows: Video gamers will be part of the analysis team

A unique component of this project is the development of a video game to help researchers extract and “clean” relevant human trafficking information, such as perpetrators’ names and crime locations, from the U.S. Department of Justice’s press releases.

Computer scientist Corey Clark and SMU postdoctoral researcher Steph Buongiorno will improve artificial intelligence (AI) and machine learning models that extract names, locations and other information from press releases on human trafficking, saving a researcher or detective from having to manually read every press release published.

Existing AI models, no matter how well designed, make mistakes. For instance, a model might mistake the name of a victim, “Madison,” for the name of the city where a crime took place.

Clark, deputy director of the Guildhall, SMU’s graduate program for video game design, and Buongiorno are creating a video game called “Dark Shadows,” which is based on the popular game “Fallen Shadows.” In “Dark Shadows,” each player acts as an investigator of an imaginary crime. By finding clues to that crime, players will, by proxy, be organizing actual human trafficking data.

As they play, video game players will teach the AI driving the program to make fewer mistakes and to connect different press releases in ways that could surface leads for an investigation. This method is known as “human-in-the-loop AI.”
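
The human-in-the-loop idea can be illustrated with a minimal Python sketch. It is not the project's actual code: the function name, the label names, the vote threshold and the sample data are all hypothetical, and the real game's mechanics are not described here. The sketch simply shows how a clear majority of player answers could override a model's label, as in the “Madison” example.

```python
from collections import Counter


def apply_player_feedback(model_labels, player_votes, min_votes=3):
    """Override a model label when a clear majority of players disagree.

    model_labels: dict mapping a word to the label the model assigned.
    player_votes: dict mapping a word to a list of labels chosen by players.
    min_votes:    hypothetical threshold; fewer votes than this are ignored.
    """
    corrected = dict(model_labels)
    for word, votes in player_votes.items():
        if len(votes) < min_votes:
            continue  # too few answers to trust
        label, count = Counter(votes).most_common(1)[0]
        if count > len(votes) / 2:  # strict majority required
            corrected[word] = label
    return corrected


# Invented example: the model has tagged a person's name as a place.
model_labels = {"Madison": "LOCATION", "Dallas": "LOCATION"}
player_votes = {
    "Madison": ["PERSON", "PERSON", "LOCATION", "PERSON"],
    "Dallas": ["LOCATION"],
}
corrected = apply_player_feedback(model_labels, player_votes)
```

In this sketch the corrected labels would then be fed back as training data, which is one common way such feedback is used; whether the SMU team does so in this form is an assumption.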

“We are improving the dataset, which improves machine learning to understand these relationships,” said Clark, who also serves as an assistant professor in computer science in SMU’s Lyle School of Engineering.

Video gamers will not be given any information that would allow them to find a human trafficking victim or perpetrator. Instead, the game mixes up the information it gives players, such as pairing a victim’s first name with a different last name, to preserve privacy.
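
One simple way to mix names so that no real first-and-last-name pair reaches a player is sketched below in Python. This is an assumption for illustration only: the function name and the sample names are invented, and the game's actual privacy method is not described here.

```python
import random


def mix_names(people, seed=None):
    """Pair each first name with a last name taken from a different record.

    people: list of (first_name, last_name) tuples.
    The last names are rotated by a random nonzero offset, so no record
    keeps its own last name (assuming the last names are distinct).
    """
    if len(people) < 2:
        raise ValueError("need at least two records to mix")
    rng = random.Random(seed)
    offset = rng.randrange(1, len(people))  # never zero
    return [
        (first, people[(i + offset) % len(people)][1])
        for i, (first, _) in enumerate(people)
    ]


# Invented placeholder names, not real data.
people = [("Ana", "Lopez"), ("Ben", "Nguyen"), ("Cara", "Smith")]
mixed = mix_names(people, seed=0)
```

A rotation is used rather than a plain shuffle because a shuffle can leave some names in their original positions.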

And Clark stressed that the game doesn’t give video game players or a computer program the ability to direct an investigation.

“The game is providing information and data, so that an investigator who is a human being can actually use that to make decisions based upon that knowledge,” Clark said. “What we’ve heard from law enforcement and others is that there’s too much data to process it all by themselves. So the game is a tool, not a replacement.”

Clark has previously used video games to enhance data-driven research, including efforts to speed up the discovery of new cancer-fighting drugs, among other applications.

SMU’s Office of Information Technology (OIT) has four people working to manage the storing and processing of information in the data warehouse: Eric Godat, team lead for research and data science; Lane Duncan, assistant director for integration services; Ganesh Vardhineedi, a data architect; and Mateo Langston Smith, a data and visualization analyst. – Monifa Thomas-Nguyen

About SMU

SMU is the nationally ranked global research university in the dynamic city of Dallas. SMU’s alumni, faculty and over 12,000 students in eight degree-granting schools demonstrate an entrepreneurial spirit as they lead change in their professions, communities and the world.