Better Data for a Better Internet

When people took to the streets across the U.K. last summer, the Prime Minister suggested restricting access to the Internet to limit protestors’ ability to organize. The resulting debate complemented speculation on the effects of social media in the Arab Spring and the widespread critique of President Mubarak’s decision to shut off the Internet and mobile phone systems in Egypt.

Decisions about when and how to regulate activities online have a profound societal impact. Debates underlying such decisions touch upon fundamental problems related to economics, free expression and privacy. Their outcomes will influence the structure of the Internet, how data can flow across it and who will pay to build and maintain it. Most striking about these debates are the paucity of data available to guide policy and the extent to which policymakers ignore the good data we do have.

The best approach is neither to make ill-informed decisions based on too little data nor to avoid state regulation simply because of the absence of decent data. Instead, we should begin a concerted push for highly reliable and publicly available forms of measurement of the Internet and how we use it. Better data would not only help the state meet its regulatory obligations, but also improve self-regulation by private sector players and empower individuals to make better decisions. In the meantime, we as researchers need to work harder to translate the data we have into terms that can directly inform policymakers.

First, we need to know more about the architecture of the network and how it is changing. For example, is the Web becoming more or less centralized over time? How much are unrelated content and services converging to common hosting within a comparative handful of cloud providers? Second, we need to know more about how information flows or stutters across the network. Where are there blockages? From what sources do they arise? And third, we need to know more about human practices in these digitally mediated environments.

We need to commit to systematic, longitudinal studies of how digitally mediated communications are changing behavior across the networked world, including how people disclose personally identifiable information. For example, debates in the U.S. over amending the Children’s Online Privacy Protection Act (COPPA), which is intended to protect children under 13 from privacy risks, are poorly informed. We have not figured out whether children are actually better off as a result of COPPA, or even how to start gathering data to answer that question. Studies by the Pew Internet and American Life Project and ethnographic work pioneered by danah boyd get too little attention in policy discussions about digital education and child safety, resulting in both over- and under-regulation. But through long-term studies, our findings can be translated into better policymaking and consumer-facing technology design.

The open and responsive nature of a new class of engaged research projects will help policymakers in government and corporate settings remain nimble and make better decisions in the fast-moving world of digital technology.

Cross-posted from http://policybythenumbers.blogspot.com. The full article was published in the December 2011 issue of Science.

Jonathan L Zittrain
