Abigail Payne already had big ideas about big data when she was appointed director of MacDATA – McMaster’s newly created big data institute.
The one-time lawyer-turned-economist has been in the data business for more than two decades, working with local educators, governments and charities, analyzing their administrative and proprietary data to understand how the services they provide affect the communities in which they operate.
And now she’s keen to put that experience to work as the founding director of the multidisciplinary institute.
“Most universities take a narrow view of big data, confining their research initiatives and training programs to a single faculty, typically business, science or engineering,” says Payne, who for the past 13 years has headed PEDAL – Public Economics Data Analysis Laboratory – a secure data laboratory based at McMaster that transforms administrative and proprietary data to study policy-relevant issues.
“But researchers in every faculty touch data, technology, and/or the tools needed to work with data. Each discipline brings an insight and an expertise yet they face the same issues around creation, collection, processing, storage and analysis. We need to find synergies that allow us to work together and learn from each other.”
The result is a big data initiative that is uniquely McMaster: A collaborative, cross-disciplinary approach that values innovation and is focused on outcomes. There will be no big data facility, and leadership of the institute will be shared with two associate directors, currently one from engineering and one from health sciences.
“I like to describe MacDATA as an institute that hovers,” says Payne. “We still want to encourage individuals to do their own thing but, at the same time, we want to get them talking to each other, sharing information and feeding off each other’s work in ways that will elevate everyone’s research.”
Moreover, she notes, while so many researchers embrace data in their work, MacDATA offers an opportunity to see the bigger picture. “MacDATA will showcase the insights that are being generated through the development and analysis of data as well as our technological advances that permit the collection of data to create a bigger picture of how things work in our society and how to understand better the issues individuals, organizations, and governments face.”
Having a social scientist lead the way is not a stretch, says Payne. “It’s not just about the data. It’s about how we’re using it. Is it just a lot of numbers or is it a meaningful indicator that can be used to improve decision making? Data scientists are really good on the tools, but their work can be complemented by researchers who want to answer questions that the tools will help address.” Similarly, she says, researchers who ask questions benefit from understanding what tools are available to support, and even speed up, their research.
For Payne, who specializes in developing high quality research data for projects that address key public sector issues, that means improving educational outcomes, helping charities deliver better results, and strengthening communities. “But,” she acknowledges, “big data means many things across the university – it could be producing a safer car, delivering better medicine, improving transportation, or predicting the success of a business tool.”
It’s an approach that’s in lockstep, she says, with where universities should be heading. “Historically, universities were at the centre of the universe in data collection. Today, data are being created everywhere by everyone. Some of that data, with the right tools and using careful analysis, can be used to enhance our understanding of the world. So the question we, as a university, must ask is what is our role and how do we give value?”
MacDATA researchers have several things in mind. One is to partner with businesses, nonprofits and governments to develop new tools that assist in data cleaning, data linking, and data analyses that fuel innovation and advances in industry, science, policy development and public services. Another is to provide McMaster students, researchers and practitioners with the skills they need to traverse the big data terrain, now and well into the future.
True to her background, Payne is determined to ensure students from every faculty and discipline have the opportunity to receive training in the acquisition, storage, processing, analysis and use of large data sets.
Just as data scientists can learn from social and health scientists, and others across the disciplines, the same holds true as to what we can learn from them, says Payne, adding that equally important is “our collective understanding of the philosophical and ethical underpinnings about how we collect and use data.”
One of the institute’s challenges will be to consider and engage the university in how we think about data security and privacy. It’s a challenge Payne has had years of experience managing as Director of PEDAL, which has just received funding from the Canada Foundation for Innovation to upgrade to a high-security facility with enhanced technology and safer administrative protocols.
In an era when pop-up ads for products you’ve just googled haunt your computer, credit cards record your purchase habits and mobile phones track your every move, it’s not easy to remain anonymous. Pregnant teens found out the hard way in 2002 when Target launched a coupon campaign based on a “pregnancy prediction algorithm” that tracked their purchase of baby products, and many a marriage is now on the rocks thanks to poorly protected passwords of clients using Ashley Madison’s cheating web site.
“Big data has had a lot of bad press,” confesses Payne. Does that mean we should not pursue the use of data for research and innovation? “Absolutely not,” she says. “Instead we should consider how best to provide a spectrum of security when working with sensitive or proprietary data, and how we ensure that everyone across the institution is engaging in best practices around the handling of data.”
One solution, which McMaster researchers are now working on, is to use machine learning algorithms that would raise a red flag whenever security is compromised. Another consideration is to think more carefully about how we structure data used in analyses. “Every minute, a billion bits of data are being created. But to answer certain research questions, we may need only a million bits,” says Payne.
More important than the amount of data may be the availability of real-time data that companies and governments can use to make important decisions that impact their customers and citizens – an area where McMaster can play a critical role.
Payne points to the City of Edmonton’s successful interactive data portal, which publishes data on the city’s performance across a wide range of services. It’s a model – a citizen’s dashboard, if you will – that she’d like to see Hamilton implement.
“There are natural synergies between our research teams and our local community that would benefit from a similar type of dashboard,” she says, using the example of improving living conditions through data gained by examining homeless shelters. “Working together and sharing our data means we’ll accomplish more. Looking at single measures of data only tells part of the story, but connecting those data points will put us on the fast-track to putting a plan in place to find solutions,” she says.
Payne is excited about MacDATA’s approach, which is already resonating with industry, government and other organizations, and is keen to have both sides of the research enterprise at the same table.
She recalls being the only social scientist at a 2014 IBM conference for data scientists and how she jumped to her feet when organizers pronounced they had “all the people we need right here in this room” to solve the issues around big data. “I said, no, you don’t. You have only half the people.”
MacDATA, she promises, will always have all of the right people in the room.