RYAN DAVIS: Good evening, and welcome to the John F. Kennedy, Jr., Forum. My name is Ryan Davis, and I'm a junior studying economics here at the College. I'm also the chair of the JFK, Jr., Forum Committee here at the Institute of Politics.
Before we begin, please note the exit doors which are located on both the park side and the JFK Street side of the Forum. In the event of an emergency, please walk to the exit closest to you and congregate in the JFK Park.
Please also take a moment now to silence your cell phones.
You can join the conversation tonight online by tweeting with the hashtag #AIPolicyForum, which is also listed in your program.
Please take your seats now and join me in welcoming our guests, Dr. Jason Matheny, and tonight's moderator, Eric Rosenbach. [applause]
ERIC ROSENBACH: Hello, everyone, good evening. Eric Rosenbach, as you heard. This is Jason Matheny, the director of IARPA, which is the Intelligence Advanced Research Activity-
JASON MATHENY: That's close enough.
ERIC ROSENBACH: So we're going to cut right to the chase here. Tonight we're going to talk about AI and public policy, which, as you all know, is a big topic, not only at the Kennedy School, but in the real world and also in the intelligence community. So we're very lucky to have Jason here. This is a smart guy. You can read his bio and see that even in the hallowed halls of Harvard, this is someone who can probably out-degree, degree for degree, almost anyone here. He also is a thoughtful guy. There's not always a strong correlation between having lots of degrees and actually being thoughtful; in this case, there is.
And Jason, also he's a good guy. So that's what we look for most when we're trying to bring a good guest, is people who are smart, who are thoughtful, and they're also good people.
And what we'd like to do, because Jason, in his position at IARPA, has a lot of exposure to AI, machine learning, and the nexus of that and national security, is have a conversation with him. I'm going to start in, ask him a couple of questions. We'll do that for the first 30 or 35 minutes. And then I'll go out to you all in the audience and let you ask some questions, too.
For starters, Jason, since I can't remember exactly what the acronym IARPA stands for, tell us a little bit about that. What is IARPA? It's not DARPA. What is it, and what are you doing on a daily basis as the director?
JASON MATHENY: We do rhyme. So we've got that down. So first, it's really such a pleasure being here. I think the work that you all are doing here makes us smarter within government. When you're critical of the kinds of policy choices that are made, when you're thoughtful about what we could be doing that we're not already doing, you make us smarter and wiser. And we're grateful for that. I really do deeply value these kinds of engagements. And it's also just a personal privilege to be up here with a national hero and somebody who really has set an example, not only for national service, but for being a good human being. That's important to us, too.
ERIC ROSENBACH: You've already got your Forum spot, so you don't have to- [laughter]
JASON MATHENY: But I want to be invited back. So IARPA is an organization that funds advanced research for national intelligence. We fund basic and applied research at over 500 universities, colleges, companies, national labs. And most of that work is done, unclassified, in the open, as level playing field competitions. We sort of run research tournaments. And we fund work as diverse as applied research in mathematics, computer science, physics, chemistry, biology, neuroscience. We fund a large amount of work in the social sciences, understanding how human analysts can make errors in judgment, how they can avoid those errors. And we fund work in political science and sociology, even anthropology.
So we have one of the most diverse portfolios I think of any federal R&D organization. But that work only succeeds because we fund places like Harvard to actually solve our hardest problems for us. We don't have any researchers in-house. In fact, if you come and visit IARPA, it's sort of a letdown because you sort of imagine that you're visiting like Q branch from the James Bond movies and-
ERIC ROSENBACH: So you're not doing the weaponized watches?
JASON MATHENY: No weaponized watches. We outsource all of that to Harvard.
ERIC ROSENBACH: It's in a deep corner of the Belfer Center.
JASON MATHENY: That's right. They're all Timexes from circa 1983 - the big calculator-watch versions.
ERIC ROSENBACH: Ash Carter has one.
JASON MATHENY: So the work that we do is really focused on trying to understand how we can make better intelligence judgments about a very complex world faster and in a way that's more accurate. And that means, then, that we need to have better methods of collecting data, better methods of analyzing data, better methods of assessing the veracity of data, which has led to fairly large investments in AI, the topic for tonight.
ERIC ROSENBACH: One of the things we want to do in the Forum is also educate people and have them actually understand what some things mean. When you hear about AI and machine learning, it's sometimes hard to understand what that actually is. So in my class, I like to make students break things down by their component parts. Break down AI and machine learning. What does that actually mean? What's the difference, and what are the component parts?
JASON MATHENY: So AI is a broader category than machine learning. AI really is a set of techniques that could be used to allow a machine to mimic or replicate some aspects of human cognition or animal cognition; so, the ability to reason, plan, act autonomously, make inferences, make judgments. AI can be separated into things like expert systems which are rule-based, sort of "if this, then do that," or robotics.
But then there's a subset of AI called machine learning, which is really focused on developing systems that learn based on available data, or the experience of a system. And there are a few different components to being able to do that kind of learning: First, you need to have an algorithm or a model. You need to have data from which to learn. And then you need to have computing. With those three elements, you're able to achieve a number of fairly powerful performance milestones.
And just recently in the last several years, there's been a lot of low-hanging fruit in machine learning that's been plucked, really owing to techniques that were developed about 25 years ago. So, a whole family of machine learning techniques called deep learning or deep neural nets has proven remarkably powerful when you have large volumes of data, when you have plentiful and cheap computing, and when you have some algorithms that really haven't changed all that much in the last 25 years, but, with some modifications, have proven quite powerful.
So a number of very hard problems in machine learning have been solved over a short period of time, just within, say, the last six years. I think really the major milestone was image classifiers being applied to something called ImageNet and proving that these deep learning techniques could achieve human levels of performance very quickly.
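[Illustrative sketch: the three ingredients described above - an algorithm or model, data to learn from, and computing to do the fitting - shown as a minimal Python example. The tiny logistic-regression model and synthetic data below are hypothetical stand-ins for illustration, not any system discussed in the Forum.]

    import numpy as np

    # 1. Data: synthetic labeled examples (a stand-in for something like ImageNet).
    rng = np.random.default_rng(0)
    n = 1000
    X = rng.normal(size=(n, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)   # ground-truth labels

    # 2. Algorithm/model: logistic regression, the simplest "learn from data" model.
    w = np.zeros(2)
    b = 0.0

    def predict(X):
        return 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid

    # 3. Computing: the training loop that fits the model to the data.
    lr = 0.1
    for step in range(500):
        p = predict(X)
        grad_w = X.T @ (p - y) / n      # gradient of the cross-entropy loss
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b

    accuracy = np.mean((predict(X) > 0.5) == y)
    print(f"training accuracy: {accuracy:.2%}")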
ERIC ROSENBACH: So if you follow one line of a philosophical debate, it would be whether it's the algorithm that matters more and having a great algorithm, or having great data. And what you just said about the idea of the algorithms themselves not actually evolving that much over the last 25 years, maybe some of the same models that some of the Kennedy School students use, is it that the data and the availability of data has become more abundant? Or that the algorithms are getting better? And which really matters more? If you have to make a final call on it.
JASON MATHENY: Yeah, and I think computer scientists will probably be irritated by my saying that we haven't made so many fundamental advances in algorithms over the last two decades. I mean, there are some important classes of new work in generative adversarial networks and reinforcement learning that really are fundamental. But at the same time, I think we've seen the greatest gains of improvement in the availability of high-quality labeled data and in computing.
Computing especially. We've seen what's sometimes called Moore's Law, this improvement in performance per unit cost that tends to double every one to two years, and has historically for quite a long time. But I think if one were to break down the most significant applications of machine learning - say, in image classification, speech recognition, machine translation, navigation - it's a pretty even tradeoff between data and computing.
ERIC ROSENBACH: So he decided not to give a clear answer. [laughter] That's okay.
JASON MATHENY: Is that a time-honored tradition up here?
ERIC ROSENBACH: If you were my student, I would nail you down even further, but we'll get down to some of the hard policy questions here. When you're the director of IARPA, who's your boss? How do you decide what investments you're going to make? And in AI in particular, do you follow the lead of the Director of National Intelligence or the Secretary of Defense? How do you decide what you're going to do?
JASON MATHENY: So we're given quite a lot of latitude-
ERIC ROSENBACH: Specifically in AI.
JASON MATHENY: Yeah. We're given quite a lot of latitude to be thinking over the horizon about challenges that may not yet be a national intelligence priority, that may not be a crisis yet, but would represent a deep challenge, say, in ten years to national intelligence, or an opportunity in ten years, because research that we fund takes at least five years to pay off, and very often as much as ten years.
So for example, we're funding a lot of work in quantum computing, for which we're probably the largest funder of academic research. We fund a lot of work in superconducting computing, which has applications that will really pay off probably in five to ten years. And I think many of the investments that we're making in machine learning will really be paying off on the ten-year time scale. For instance, we have a program called MICrONS that's aiming to understand how animal brains actually learn from data, which is very different from the way that machines, currently using machine learning, learn from data. We know that there are these repeated circuits within the brains of animals, within the neocortex, that learn from very small data sets, which is very different from the way that most machine learning is used today, where you require thousands of examples.
And yet, most toddlers do not need to see a thousand examples of a chair to recognize what a chair is. How is it that they're able to be so efficient? That's not a crisis, right? We're not suffering from a problem of achieving toddler parity in machine learning. But it's a fundamental problem that, for us to make progress in machine learning, we really need to take more inspiration from how animals learn.
ERIC ROSENBACH: So it's not an operational type organization where you may get a call one day from the White House national security staff and they say, "We need an answer in a month on how we can use AI to conduct better surveillance of video cameras across the United States"?
JASON MATHENY: No. We're the place where you would call if you want to understand what does the research landscape look like, what's the state of technology in the laboratory. We help on that side so that decision makers can get wiser about what's actually happening in science and technology.
But our programs take so long to run that if somebody wants a solution next month to a problem, it's going to take us a month just to award the grant to Harvard to start working on that problem.
ERIC ROSENBACH: Although we're the fastest, I'm sure.
JASON MATHENY: Oh, yeah, I'm sure, definitely.
ERIC ROSENBACH: There's always a question about when you're doing work on AI, and in your case in particular, whether you're trying to find something that would replace a human or augment a human in making a decision. And in the intelligence context this is a really big question, too. I remember even when I was an Army intelligence officer - and this is a long time ago - I was in charge of a signals intelligence unit that collected telephone calls and email. We were both doing it for Yugoslavia, but also trying to collect on bin Laden back then. The linguists were led to believe that within two years they would no longer have a job because there was some AI mechanism that would go through all these voice cuts, transcribe them, and immediately spit out a report. That obviously didn't happen because that was almost 20 years ago.
Talk to me, Jason, about the focus now in the intel community, whether analysts, linguists, data - is the objective to replace intelligence analysts and operators in some way, or to augment them?
JASON MATHENY: The focus is definitely on augmenting. I think we're a long way from replacing human analysts. So really the goal is to reduce the amount of data that an analyst needs to look at in order to make a judgment. And one of the steps to data reduction is to be constantly scouring the data environment for signs that might be important, and then performing analytic triage for the analysts, so that you present the things that are sort of balanced between "this could be important" and "it might not be important," and tee them up.
To give one example, we have a program called Open Source Indicators that looked at many different kinds of unclassified data for indicators of disease outbreaks. And one of the real challenges for detecting outbreaks overseas is medical reporting isn't very good; it's spotty. So we were looking for ways of detecting earlier whether there was a hemorrhagic fever outbreak like Ebola, and some of the early indicators are people staying at home rather than going to work and back; so, absenteeism. One way to detect absenteeism is if mobile phones stay at home rather than going to work and back. And you don't need personal level data, you just need population level data of phones staying in one place during a work day.
But you could also look at whether there was an increase in crowding at certain pharmacies or health clinics from commercial overhead imagery. You could look at whether there was an increase in Web search queries for symptoms, from people trying to understand what they might be suffering from. Or at social media messages where people were posting their symptoms. The volume of data that was involved in making those kinds of judgments was trillions of bits, too much for any human analyst to inspect.
So we trained systems that could automatically look through these data. We trained them on historical disease outbreaks in order to identify those patterns of behavior that indicated whether an outbreak was taking place. And we could then get outbreak detection that happened weeks faster than traditional medical surveillance.
So that's the kind of problem that I think machine learning is really well suited for, is taking the ocean of data and bringing it down into something about the size of a kiddie pool that an analyst can make sense of, actually really dig down into in order to understand whether there's a real event happening or whether this is an artifact from noisy data.
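[Illustrative sketch: a minimal version of the kind of triage described above - scoring aggregate, population-level signals against a historical baseline and flagging weeks that look anomalous. The signal values and the three-sigma threshold are hypothetical, chosen only for illustration.]

    import numpy as np

    # Hypothetical weekly, population-level signals for one region
    # (e.g., share of phones not commuting, or symptom-related search volume).
    rng = np.random.default_rng(1)
    baseline = rng.normal(loc=0.10, scale=0.01, size=52)   # a "normal" year
    recent = np.array([0.10, 0.11, 0.14, 0.18])            # the last four weeks

    mu, sigma = baseline.mean(), baseline.std()

    # Flag weeks whose signal sits far outside the historical baseline, so an
    # analyst only has to dig into the flagged weeks rather than the whole ocean.
    z = (recent - mu) / sigma
    for week, score in enumerate(z, start=1):
        flag = "ALERT" if score > 3 else "ok"
        print(f"week {week}: z = {score:+.1f}  {flag}")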
ERIC ROSENBACH: What are the areas that the intel community itself is most interested in gaining the leverage and the assistance of AI-enabled tools?
JASON MATHENY: I think imagery. We have a program called Finder that takes an image that might be from, say, a cell phone camera or from a ground-based camera, and even if that image is not natively geotagged by like a GPS with a latitude or longitude, the system automatically detects where on earth the picture was taken just by looking at features like the skyline, the architectural features, geologic features, botanical features. So, automated geolocation, in a way that would be impossible for human analysts to look at a picture of a random spot in Afghanistan and figure out where that was taken. And that's led to some really important operational successes.
Another example is machine translation and speech recognition in any language. So most of the big Internet companies that invest in machine translation or speech-to-text are really interested in languages where there are lots of consumers, because that's where the market is. In intelligence, we have to worry about all those other languages, too. So we have a program called Babel and one called Material that work on speech recognition and machine translation for any language, bringing training time down to a week from the nine to 12 months that was the norm before these programs. So, getting about a two-order-of-magnitude improvement in performance. And that's because there are these highly conserved features of language that we can use to train models.
A fourth one I would say is being able to infer from satellite imagery what the function of a building is. So, can you tell whether a particular building is a bomb factory or an automotive plant? Can you label the features of the world in ways that are useful to intelligence analysts? It's impossible to do manually just because there are so many buildings on earth, and monitoring the patterns of activity that can tell you whether something is a weapons plant or not is too subtle, really, for a human analyst to keep eyeballs on all the time.
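[Illustrative sketch: one common way to frame automated geolocation is as retrieval - compare a feature vector extracted from the untagged photo against a gallery of reference images with known coordinates and return the best match. The random feature vectors and coordinates below are hypothetical; a real system would use learned image embeddings, and this is not a description of Finder.]

    import numpy as np

    # Hypothetical reference gallery: feature vectors (skyline, terrain, vegetation
    # descriptors) for images whose latitude/longitude are already known.
    rng = np.random.default_rng(2)
    gallery = rng.normal(size=(10_000, 128))
    gallery /= np.linalg.norm(gallery, axis=1, keepdims=True)
    coords = rng.uniform([-90, -180], [90, 180], size=(10_000, 2))

    # Query: a feature vector extracted from an untagged photo (here, a noisy
    # copy of one gallery entry so the example has a known right answer).
    query = gallery[1234] + 0.05 * rng.normal(size=128)
    query /= np.linalg.norm(query)

    # Nearest-neighbor search by cosine similarity: the best match's coordinates
    # become the estimated location of the query photo.
    scores = gallery @ query
    best = int(np.argmax(scores))
    print("estimated location:", coords[best], "similarity:", round(float(scores[best]), 3))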
ERIC ROSENBACH: How good is the intel community at using these tools now? All the examples you're giving, most are in use right now, I expect. You're not divulging classified information, so it's not that. But how good are they? How much does it actually help? How many false positives? And then, I'm going to follow up on some of the public policy issues of all that data that you just talked about and ask a little bit about that.
JASON MATHENY: So the biggest challenge is in picking a target for a research program that's sufficiently hard that it would be valuable and not so hard that it would be impossible. And what we found is that usually we underestimate just how much the performance gains can be; that is, we pick a number that we think is just on the threshold of being impossible and then we exceed it with some of these techniques.
So for example, in some of the speech recognition work, you're looking at a 100-fold improvement over the state-of-the-art in existing speech recognition systems. In our geolocation work, you're looking at about a 100,000-fold improvement over comparable techniques. And doing those head-to-head comparisons - so you sort of have a randomized controlled trial, and you test your methods against either manual practice or against the preexisting state-of-the-art, so that you really do have a level playing field tournament - I think that's really key to getting analysts to see the value of these tools.
The other challenge though is not so much technical, but cultural - how do you make analysts comfortable in using these tools? And for that, we found one of the biggest difficulties is baking into machine learning systems the need for explainability. Because it's not enough just to spit out a result; you actually need to say, "Here's the evidence for this result that's come out of an algorithm."
We have a program called Mercury, which is a program that looks at raw SIGINT - so, raw intercepted communications - for indicators of military mobilization or terrorism activity. These are really high consequence events, high consequence intelligence judgments. The analysts will not trust the outputs unless they can see what the system found that led to an alert, that led to a forecast of this activity.
So we do two things. One is, we actually run a forecasting tournament in which the research teams are predicting real events before they occur, which is a lot harder than predicting history, we found. We actually started these forecasting tournaments because we kept getting these business pitches basically, of people with PowerPoint slides saying, "We could have predicted 9/11, trust us. Here's the PowerPoint slide to prove it. We ran our model backwards and here's a big blip on 9/11."
So we said that's not going to work because we actually need to tell whether these systems work against events that aren't already in our data set. So we run a forecasting tournament to see, can these systems actually predict real military mobilization, real terrorism events from intercepted communications. And they can. And for analysts to trust those forecasts, we need them to be able to see the audit or evidence trail. I think that's just as critical as making accurate forecasts - being able to explain them.
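[Illustrative sketch: the scoring step of a forecasting tournament of the kind described above. Teams submit probabilities before events resolve, and forecasts are then scored against what actually happened using the Brier score (the mean squared error of the probabilities; lower is better). The team names, probabilities, and outcomes below are hypothetical.]

    import numpy as np

    # Hypothetical tournament: each team submits, before the fact, a probability
    # that a given event (e.g., "military mobilization in country X by date Y")
    # will occur. Outcomes are recorded afterward as 1 (occurred) or 0 (did not).
    forecasts = {
        "team_a":   np.array([0.8, 0.2, 0.6, 0.1, 0.9]),
        "team_b":   np.array([0.5, 0.5, 0.5, 0.5, 0.5]),
        "baseline": np.array([0.3, 0.3, 0.3, 0.3, 0.3]),  # historical base rate
    }
    outcomes = np.array([1, 0, 1, 0, 1])

    # Brier score: mean squared error of the probabilities; lower is better.
    for team, p in forecasts.items():
        brier = np.mean((p - outcomes) ** 2)
        print(f"{team:9s} Brier score = {brier:.3f}")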
ERIC ROSENBACH: I think it's fair to say, having spent almost every day for two-and-a-half years with the Secretary of Defense, that any intel analyst who would come to Ash Carter and say, "My machine just told me that the Russians are mobilizing on the western border of Poland, or going into the Crimea," without being able to explain the data, why they think that, where it all came from, would be a very uncomfortable situation for that person.
JASON MATHENY: That's right.
ERIC ROSENBACH: So you can see it helping, but you still would want the human, I think, to be able to explain why and make sure it's there.
So let's talk about another aspect of this. Back when I was an intel officer and when I was doing oversight at NSA and CYBERCOM, the amount of data collected is enormous. But in those organizations, there are very strict rules on the collection of US person data for certain, and even ally data. And a lot of training goes into that.
When you talk about what you just did in terms of, I think you said a thousand-fold increase or even higher for geolocational data and doing terrain, or voice recognition, what's baked into the algorithm to try to protect against collection on a US person? And how do you deal with that even when you're talking about research assignments for someone up in Cambridge, Massachusetts, and they're trying to design that?
JASON MATHENY: Yeah, so maybe the best example of this is, we had a program that looked at a lot of social media, and we didn't want to inadvertently include social media from US persons. So first, we removed any social media that was in English. And then we removed any social media that was geolocated in the United States. And we lost some performance probably because of that, because there were some foreign social media messages that were in English, but we erred on the side of removing it rather than including it.
With our research, we do a civil liberties and privacy protection review for any research program that we fund, in part to avoid things that really don't have a place in research that ultimately is going to be based on foreign intelligence.
The other thing that we do is we invest a lot in privacy-enhancing technologies. So we take privacy really seriously, not only as a policy issue, but as a technology issue. We have a program called HECTOR, which is focused on a new form of encryption, homomorphic encryption, which can allow you to run queries that are encrypted against encrypted databases so that you wouldn't have to give up personal details when all you need to know is how many people in this list have the flu, or how many people in this list have encephalitis. You don't need to know the names of the people; you just want the number of cases. So this is a technology that could allow us to, as a society, balance privacy with a lot of other security and public policy goals.
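[Illustrative sketch of the idea behind querying encrypted data: a toy Paillier cryptosystem, which is additively homomorphic, so an aggregator can add up encrypted 0/1 flags - "this record matches the query" - and learn only the total count, never the individual records. The tiny primes and the clinic scenario are for illustration only; this is not secure and is not a description of HECTOR.]

    import math
    import random

    # Toy Paillier cryptosystem: multiplying ciphertexts adds the plaintexts.
    p, q = 1789, 1861                                   # tiny primes, demo only
    n = p * q
    n2 = n * n
    g = n + 1
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)   # lcm(p-1, q-1)
    mu = pow(lam, -1, n)                                # valid because g = n + 1

    def encrypt(m):
        r = random.randrange(1, n)
        while math.gcd(r, n) != 1:
            r = random.randrange(1, n)
        return (pow(g, m, n2) * pow(r, n, n2)) % n2

    def decrypt(c):
        x = pow(c, lam, n2)
        return (((x - 1) // n) * mu) % n

    # Each clinic reports an encrypted 0/1 flag: "this patient record matches the
    # query (e.g., flu diagnosis)". The aggregator multiplies the ciphertexts,
    # which adds the flags under encryption - it never sees which records matched.
    flags = [1, 0, 1, 1, 0, 0, 1, 0]
    ciphertexts = [encrypt(f) for f in flags]

    aggregate = 1
    for c in ciphertexts:
        aggregate = (aggregate * c) % n2

    print("decrypted count of matching records:", decrypt(aggregate))  # -> 4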
The other thing I think is an example is, we had a program called Aladdin, which goes through posted video. So it could be like videos on people creating IEDs, like how-to videos, or martyrdom videos. It turns out that when people post these kinds of videos as part of a terrorist activity, they don't tag their videos, they don't say "this is a martyrdom video," they don't tag it saying "this is an IED how-to video." Instead, they post the video, they send the URL out to their colleagues. We wanted to be able to detect the posting of these videos automatically. There are not enough human eyeballs on earth to monitor all of the posted video, so we had to develop tools that can automatically characterize that video and tell whether this is a video that's about a terrorist activity.
As part of that though, we wanted to ensure that we weren't inadvertently collecting on US persons. So we made sure again that the origin of these videos was overseas, and that there wasn't personal content in the videos. Very often it's costly to do that kind of scrubbing, but it's worth it.
ERIC ROSENBACH: Could a skeptic look at the research you're doing, in particular on the algorithm - like we talked about in the beginning: there's the algorithm, the data and the computing power running it - and say, Well, you've got the algorithm, you've got the computing power; what you chose to control was the data set. Someone not as nice as you could take those two other things, put them up against a data set that's running on the United States, and do something more nefarious. Do you worry about that?
JASON MATHENY: I do. I think all of us who work in research recognize that these technologies that we create are all double-edged swords. One of the questions that we ask ourselves before we fund any new research program is, how could this technology be used against us, be used maliciously? And what can we do to prevent misuse? Can we insert intrinsic safeguards? Can we insert security measures that prevent theft or reverse engineering? And then we ask the question, under what conditions would you regret having created this technology?
I think this is something we in the research community need to be thinking about, especially in things like AI and machine learning, and also in the biosciences, because there are risks. So how can you make investments that are safe, that are reliable? Are there investments that you can make that are more defensive, that enjoy an asymmetric advantage in being used defensively rather than offensively, such that you can steer us towards a state in which defense has the upper hand? And we really try to align our research portfolio with that goal.
ERIC ROSENBACH: So surveillance in AI is one tough public policy issue - maintaining privacy, Fourth Amendment rights, all of those things that are a core part of the United States. But when I was in the Department of Defense, we started to get a lot of questions about whether or not we were going to use AI-enabled tools to allow the use of lethal force, literally to take out a terrorist in a combat zone. What are your thoughts about that? First in terms of the state of the technology that would enable that. And then philosophically, what should Americans and democracies all around the world, what should they think about a question like that?
JASON MATHENY: It's an important public policy debate. Some of the advocates for introducing more autonomy into weapon systems note that humans, particularly under stress, can make bad decisions about the use of lethal force. Opponents of autonomy or greater autonomy in weapon systems note though that there's a need for meaningful human control over weapon systems. And I would say particularly given the state of machine learning, that is a real need. The tools that exist are generally not robust to various kinds of spoofing, to misinformation. There's a class of tricks that can be played against machine learning systems called adversarial examples. And it's now a kind of favorite parlor trick among computer science undergrads - with how little effort can I fool this state-of-the-art image classifier into thinking that this picture of a tank is actually a picture of a school bus. And it doesn't take much work.
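[Illustrative sketch of the adversarial-example parlor trick just described, in miniature. A toy linear model stands in for a state-of-the-art image classifier; the gradient-sign perturbation below is one standard recipe (the fast gradient sign method) for flipping a classifier's label with only a small per-pixel change. The weights, inputs, and class names are hypothetical.]

    import numpy as np

    rng = np.random.default_rng(3)

    # Toy stand-in for an image classifier: a linear model over 784 "pixels"
    # (think 28x28), with weights faked here purely for illustration.
    w = rng.normal(size=784)
    b = 0.0

    def classify(x):
        score = x @ w + b
        return ("school bus" if score > 0 else "tank"), score

    # A "tank" image: any input the model currently classifies as the tank class.
    x = rng.normal(size=784)
    if x @ w + b > 0:
        x = -x                      # make sure we start from a confident "tank"

    label, score = classify(x)
    print("original:  ", label, round(score, 2))

    # Gradient-sign attack: nudge every pixel a tiny amount in the direction that
    # raises the "school bus" score. For a linear model, that gradient is just w.
    epsilon = 0.2
    x_adv = x + epsilon * np.sign(w)

    label, score = classify(x_adv)
    print("perturbed: ", label, round(score, 2))
    print("max per-pixel change:", epsilon)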
So I think we're a long ways off from where I at least would be comfortable having a significant degree of automation in, say, targeting or weapons decisions. I think our Defense Department has been pretty wise about establishing policies that prevent the automation within that targeting decision. And I think much of that owes to you during the period that you were helping to formulate policy. So if you don't mind, I'll turn around the question just a bit and ask, how do you think about this? And where do you think the world is likely to converge on an equilibrium, if it does, with respect to lethal autonomy?
ERIC ROSENBACH: That's the classic Socratic method, right? You turn it back on the prof when there's a hard question. When we were working on this in the Pentagon, believe it or not, to me this was not a hard question at all. Maybe this will sound overly simplistic, but I can barely get Siri to work to tell me where the closest hamburger joint is, much less rely on some fancy AI algorithm to make an informed decision about when to use lethal force. In particular because when you're at the top of the pyramid in the Pentagon and you're making really sensitive decisions about when to do a strike in the times that it would matter most, the process for that is so rigorous. It's only approved by Secretary Carter; in some cases only by the President. And that's after a series of four different briefings in which you ask 100 hard questions. And only then you would give the approval.
The idea that there's going to be a machine and data and a fancy algorithm that does that based on facial recognition, geolocation and terrain and say, "This is the right person, this is the right time, and the collateral damage estimate is correct, too, I'm going to fire without approval," seems very farfetched. For the United States anyway; I can see how others would do it.
So I think for us, it was an easy decision. We put that in policy more than anything just to put at ease the people who were very concerned that we were programming drones with AI algorithms to start going out and bombing people.
But let's look at it from another perspective. We're at the Kennedy School, and it's a very international place. Do you think all other nations would have that same approach? You see a lot about what others are doing, too. What are things that concern you from what you see the rest of the world doing?
JASON MATHENY: I'm worried. I think that other countries haven't demonstrated the same degree of reservation about inserting autonomy into some very high-stakes military decisions. So one public example is the Perimeter system that's used by Russia for nuclear command-and-control, which involves a high degree of automation. And if there's ever a place where you would not want automation, it's in nuclear command-and-control. So the notion that you could have a nuclear war that's started due to a series of errors - of computer errors, of sensor errors - is, I think, one that makes me anxious.
ERIC ROSENBACH: Explain that a little bit more, just that system and the way it works. This is the unclassified stuff; again, he's far too wise to disclose classified information. From what we know in reading in the unclassified world, how does this work?
JASON MATHENY: I'll just quote Wikipedia. [laughter] And there's a whole book, by the way, on the Perimeter system called The Dead Hand, by David Hoffman. The notion originally for developing an autonomous nuclear command-and-control system within Russia was that if decision makers felt like they had a very short time in which to make a decision about a second strike, a retaliatory strike, if they felt as though an attack were under way, they might make really bad decisions. They might make decisions under uncertainty, under time pressure that in fact were based on bad information.
So the original intent was, let's relieve the time pressure such that even if the Kremlin were destroyed in a nuclear attack, there would be a retaliation that would be certain to occur. So through a set of ground sensors to detect a nuclear attack in Russia, there would then be an automated retaliatory strike. Well, there's any number of ways that one could imagine that going wrong, whether it's a terrorist nuclear detonation, or sensor error, or computer error. All of this is sort of the fodder for half of the science fiction movies I've seen. And those movies usually don't end well.
So I think the notion that we would have autonomy in such critical systems seems like one that's worth more attention. I'm actually surprised by the degree to which this doesn't come up more frequently. Not just thinking about what would we wish that other countries would avoid including into their lethal autonomous weapon systems, but also, what are the stakes of those lethal autonomous weapon systems? There's certainly a continuum. And I would say nuclear command-and-control is at the rightmost part of that continuum where the most attention is needed.
ERIC ROSENBACH: So this fall, in connection with the Russian angle, President Putin said, "Artificial intelligence is the future not only of Russia, but of all mankind and whoever becomes the leader in this sphere will become the ruler of the world." To which Elon Musk responded, "Competition for AI superiority at the national level is the most likely cause for World War III." And he said, IMO - he was tweeting - "in his modest opinion," I think. To which Mark Zuckerberg responded, "Musk's AI doomsday rhetoric is pretty irresponsible."
What do you think about all that? Who's right? [laughter] Zuckerberg? Elon Musk? Or Vladimir Putin?
JASON MATHENY: Thanks for the softball. Is there a fourth option?
ERIC ROSENBACH: Nope! [laughter] When you run the chair, you get a whole bunch of bad options to make people pick from. Is that a framework to think about this? And what's your perspective on all this?
JASON MATHENY: I think that there are reasons to be concerned about the vulnerabilities of machine learning systems. Without getting too exotic, I worry less about Skynet and Terminator than I do about digital Flubber - machines that are programmed to do something and they're poorly specified because the programmers made errors or because the sensor data going into the system was errorful.
So I think including autonomy in fairly high stakes systems is something we should be very careful about. And not just in the area of weapon systems; also in financial systems, also in power systems. We have examples of algorithmic traders run amok in our financial system. We have instances of power grids run amok due to automated forms of control.
So we do, I think, as a society need to become a bit wiser about what are the vulnerabilities of these systems before we deploy them into networks that have high stakes. I think that the long-term trajectory of these technologies offers enormous promise - benefits to healthcare and being able to do more accurate and earlier diagnosis of disease; benefits to accelerating the rate of scientific innovation by automatically generating hypotheses and testing them to see which have more explanatory power over data, data at volumes that human scientists may not be able to fully analyze; improvements in material science and biochemistry; in autonomous transport. All of this has enormous upside potential for humanity.
But in order to navigate the various speed bumps along the way, I think being mindful of some of the safety and vulnerability and reliability questions is something that is worth paying attention to. The thing that really actually occupies a large part of where IARPA is investing in machine learning reliability is making sure that our systems are robust to human error, to intentional attack - so, feeding misinformation to a machine learning classifier; a poisoning attack that tries to compromise the data that goes into a machine learning system; and then the various kinds of challenges in cybersecurity where you have different forms of automation that are responding to cyberattacks.
So we have a program called CAUSE, which automatically detects and even forecasts cyberattacks based on chatter in hacker forums, the market prices of malware traded on the black market, and patterns of help desk tickets across an enterprise when computers are acting wonky. So looking at those kinds of trends, you want to be able to make the best possible judgment. But there is an arms race, in that the cyber actors will be trying to mistrain your classifiers by throwing out a whole bunch of malware that has certain kinds of features, so that your classifiers start to train on those features, and then sending a different kind of malware - the one that really has the payload - that doesn't have that feature.
So it's an extraordinary kind of machine deception arms race that we're in. And I think this convergence of policy, of cybersecurity and machine learning computer science research will get very busy over the next decade.
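[Illustrative sketch of the mistraining scenario just described: an attacker floods the training feed with malware that carries a conspicuous decoy feature, a simple detector learns to key on that decoy, and the real payload - which lacks the decoy - slips through. The two features and the nearest-centroid detector are hypothetical simplifications, not a description of CAUSE.]

    import numpy as np

    # Training data: feature 0 = "carries the decoy artifact the attacker floods
    # the feed with", feature 1 = "exhibits the genuinely malicious behavior".
    benign = np.tile([0.0, 0.0], (500, 1))
    flood = np.tile([1.0, 0.0], (500, 1))        # attacker-submitted decoy malware
    genuine = np.tile([0.0, 1.0], (10, 1))       # the few real past payloads

    X = np.vstack([benign, flood, genuine])
    y = np.array([0] * 500 + [1] * 500 + [1] * 10)   # 0 = benign, 1 = malicious

    # A deliberately simple detector: nearest-centroid classification.
    centroid_benign = X[y == 0].mean(axis=0)
    centroid_malicious = X[y == 1].mean(axis=0)

    def detect(sample):
        d_benign = np.linalg.norm(sample - centroid_benign)
        d_malicious = np.linalg.norm(sample - centroid_malicious)
        return "malicious" if d_malicious < d_benign else "benign"

    print("malicious centroid:", centroid_malicious.round(2))       # dominated by the decoy
    print("decoy-only sample:", detect(np.array([1.0, 0.0])))       # flagged, as the attacker intends
    print("real payload, no decoy:", detect(np.array([0.0, 1.0])))  # slips through as "benign"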
ERIC ROSENBACH: That's a great way to transition to going to audience questions here in a second, just that area of cyber and AI is a very rich field in and of itself. Hopefully, some students here tonight get motivated by what you just talked about and the idea of deception for AI, which is the old, classic military feint so that you then go in with the real cyber attack. I can see a whole thesis and line of books just written on that from a cyberstrategy perspective.
We do have a lot of really bright, motivated, hardworking Kennedy School students here and students from around Cambridge, too. What's your advice to them about how to get into this field? Is it in the national security space? In the private sector? Where do you think it's most interesting? And where are the most interesting public policy questions?
JASON MATHENY: My main advice is to try to find opportunities to work at the intersection of policy and machine learning, because right now there is nothing close to the pipeline of expertise at that intersection that we'll need in the coming decades. So if you're looking for a future in policy and you're interested in machine learning, I think you'll have lots of opportunities, whether it's in government, in places like the Office of Science and Technology Policy, or in Congress, or in OMB.
Or it's in industry, helping industry to navigate some of these very challenging policy issues. I think I'm really encouraged that there are companies that have realized that this is a priority and have started to hire people who are at that intersection of computer science and policy studies.
And then lastly in academia, for think tanks like Belfer, to really start building that pipeline, that generation of expertise that we'll need to make us smarter and navigate some of the policy challenges that we'll have in the future.
So whether you're a lawyer or a public policy scholar or a computer scientist, or you're all three, we need you.
ERIC ROSENBACH: That should get a lot of you motivated out there. So why don't we do this? I'd like to turn to questions from the audience. For those of you who've been in the Forum, you know how this works. If you could start by lining up at the microphones, I'll call on questions as people come there. If you could, please, state your name, tell us who you are very briefly, just so we know. And remember, what we're looking for here is a question, not a speech. A question is an interrogative that ends with a question mark and usually is one sentence. If you need to set it up with one sentence, that's okay. But we're just trying to get as much interaction here as possible.
So I'm looking around, and yes, sir, we'll go to you first.
Q: Hi, I'm Elliott. I'm a math student here. You made a point earlier about how there's a danger involving AI running nuclear systems because of possible radar errors or mistakes leading to an unnecessary strike. How is that much different from the status quo where, if I'm not mistaken, there have been such close calls in the past?
JASON MATHENY: Yeah, there are an alarming number of close calls, strategic miscalculation, nuclear accidents. And I think if you read a book like The Limits of Safety, by Scott Sagan, that sort of goes through them. The one thing that saved us every time was a human being looking at the data and saying, "This doesn't make sense. This is not real." And that's what I want to ensure is always in the loop, is a human being.
ERIC ROSENBACH: I would just say something very quickly, too. When I was assistant secretary, one of the things I was responsible for was nuclear command-and-control. It is not easy to launch a nuclear weapon. It is not that the President, no matter who it is, just has the idea that he or she can launch a nuke and it happens. There are hundreds of people in a chain that make something like that happen, which is very different, I think, from an AI-enabled decision-making process, for sure. Which should make some of you feel better, too. Yes, sir, go ahead.
Q: Hi, my name is Alex. I'm a software engineer. And you mentioned earlier the idea of some defensive technologies, technologies that are easier to use to protect than to do harm. I was wondering if you could elaborate on what sort of things those are, and where one can find them.
JASON MATHENY: Great question. We're really interested in the idea of safety engineering or intrinsic security within machine learning systems. I think one really interesting area for research is machine learning on encrypted data. So Microsoft had a piece on CryptoNets, which is on the arXiv. The idea is, if you could train a classifier on encrypted data, not only would that allow it to be privacy-preserving, but also it would prevent these kinds of data poisoning attacks that we're concerned about, or the ability to reverse engineer a data set from a model. These are sometimes called model inversion attacks, which is another challenge. If you're training a classifier on some classified data, the classifier then has features of that data, which then really means that the classifier itself is classified. That's awkward. I mean, that's awkward if you want to get maximum use out of these tools.
So secure machine learning I think in general is a really exciting area for a lot of technical work.
ERIC ROSENBACH: Thank you. Yes, sir?
Q: Hi, my name is Iskander. I'm coming from Kazakhstan. I have a question with regard to the rise of China in this sphere of innovation. According to the latest data from the National Science Foundation, China is, in five to ten years, going to become the superpower in key areas of science and innovation. How do you view future US/China relations in that regard? And what is your opinion on the latest issue with regard to [47:51] Silicon Valley, where Chinese companies are trying to invest?
ERIC ROSENBACH: That's a great question, thank you.
JASON MATHENY: So China has a well-organized plan for AI in particular; has an AI development plan that spans multiple years. It has specific milestones. In fact, the most recent policy guidance from China includes things like, "We want a speech recognition system that's 98% accurate, and we want a machine translation system that's 95% accurate, and image classifiers that are 99% accurate." So it almost reads like an IARPA or DARPA program, like there are these specific milestones, specific schedules. I think China has the ability to mobilize industry, government, academia all in the same direction. Rather than imitating that, I think, in the words of Eric Schmidt, we should be more like us.
So the United States leads in AI, in part because we have a heterogeneous community of folks who are exploring lots of things that we would not centralize. We wouldn't be giving orders to universities to all do the same thing, we wouldn't be giving orders to all of our companies to do the same thing. We have an extraordinary innovation ecosystem that's largely founded on universities. We have the world's best universities in machine learning and computer science. We should be leveraging that. But we also have extraordinarily innovative companies that have decided to put a large part of their money in basic and fundamental research, which I think is another advantage that we have.
We are getting outpublished in machine learning by China. But if you quality-control those publications, if you remove the self-citations, the US leads. I think in order to maintain that lead, though, we need to continue to invest in what makes us so unique globally, which is the universities as a basis for fundamental research.
ERIC ROSENBACH: Great, thank you, that was a good question. Yes, sir?
Q: Gene Freuder. I'm an AI researcher. About a year ago, Harvard Business Review published an article entitled, "The Obama Administration's Road Map for AI Policy," which discussed some reports that President Obama's executive office published that laid out his plans for the future of AI. So my question is, do those documents, will they continue to have any influence on public policy? Or are they dead because President Obama's name is attached to them?
JASON MATHENY: They continue to have an influence. As one of the co-authors of that report, I'm happy to see the influence. So the National AI R&D Strategic Plan continues to inform investment decisions at the agency level. I was also happy to see that the recent policy budget guidance from the White House has a whole section on AI and machine learning. The national security strategy, I think for the first time, describes the importance of machine learning.
So I think there is a commitment in the Office of Science and Technology Policy in particular at the White House to really lead in machine learning and to push investments where they matter most. Michael Kratsios at the OSTP within the White House just had a great interview with the New York Times, talking about the importance of fundamental research. And I think if you look at the NSF investments and IARPA and DARPA and NIST, we're increasing the level of basic and applied research.
ERIC ROSENBACH: And I'd say, both from when I was in the Department of Defense and from what I see of the trajectory, that's only increasing. So some of that, of course, is for purely military-type research, but as with a lot of things - GPS, for example, voice recognition technology, maybe the Internet - there can be spinoffs of that that I think can be helpful, too.
JASON MATHENY: That's right.
ERIC ROSENBACH: Yes, sir, please.
Q: I'm Charlie Freifeld, associated with the Graduate School of Arts and Sciences. I work in quantitative management, investment management, at the moment. I want to follow up on some of the things that were already said, and ask a question. Go into the future, say 10 or 15 years, and imagine that we have developed artificial intelligence to a tremendous extent, way beyond what we have today. So now you're in the Defense Department and you have a computer that has just analyzed the situation, the military situation. It's taken into account all of the knowledge that everybody has, including, let's say, the Chinese, and it tells you that there's a 73% probability that China is going to attack in the next week, and that right now the best thing to do is attack China first, right now. When you query it and say, "Wait a minute, I want to find out how you got that result," it says, "I did 100 trillion simulations and that's the net result." And you don't have any ability to do 100 trillion simulations and go through that.
So it's an old question that Norbert Wiener raised 70 years ago. What happens when the machine is so smart you have no idea how it came to its conclusion? Do you rely on its conclusion? What do you do then?
JASON MATHENY: Thank you. I think we've actually had that dilemma for the last several decades. I mean, I at least can't understand all of the parameters that, say, a linear regression is simplifying for me. And we have had systems that were designed to try to provide forecasts of military tension based on regression models. And yet, we didn't abdicate our decision making and say, okay, we should leave it to the model. I think it's very unlikely. It's hard for me at least to imagine a period in which human decision makers and, say, nuclear command-and-control within the United States would defer to a computer model in making a decision about, say, the most consequential military actions that they have to contemplate. What do you think, Eric?
ERIC ROSENBACH: I would totally agree. Even in the hypothetical fact pattern that you gave there - which is pretty compelling; unlikely but compelling - I can't imagine any Secretary of Defense would ever make a decision based on that. Or it would be no more than one of 100 data points that that person would take in. There are some things about this that are inherently protective against AI-enabled decisions. First of all, there's widespread mistrust of intelligence reporting in general, in the Pentagon and among policymakers. No offense, but it is what it is. Very often you'll get an intel report in the President's daily brief and they can't even explain the sourcing for something that came from a human, and you may not rely on it - much less on a fancy algorithm. And it seems very unlikely that someone would make a decision as consequential as starting a war with China based on a 73% algorithm.
Q: So you hope the Chinese don't go ahead and do it as a result of their program.
ERIC ROSENBACH: I can't speak that well for the Chinese, but I bet even the Chinese and the Russians would think twice about starting a war based on an algorithm.
Yes, sir, go ahead.
Q: Hi, my name is Oliver Wrigley, and I'm a biomedical engineer in the pharma space. My question is, if you're aware of any research into defending against social engineering attacks augmented by AI or machine learning. I think we saw in the last election that our democracy's kind of vulnerable to these asymmetrical information warfare tactics, like targeting swing voters with Facebook ads. I'm wondering how we're planning on addressing that.
JASON MATHENY: Thank you. So we funded research on some aspects of this topic. One, for example, was looking at how disinformation can propagate within social media networks. Can you predict when something is going to take off? Another project that we funded looked at detecting disinformation edits within Wikipedia - could you tell when somebody was trying to manipulate a Wikipedia page? And then we also do some work understanding censorship patterns or manipulation patterns in news reporting. DARPA has funded a program called Social Media in Strategic Communication, looking at detection of chatbots in social media. And there's a fair amount of research sort of at the basic research level of just understanding, can you detect when somebody has a disinformation campaign that's centralized.
But I think much more research is going to be needed, and this will be a moving target because the disinformation campaigns will get increasingly sophisticated.
ERIC ROSENBACH: That's a great question. Thank you very much. Yes, sir?
Q: Hi there. My name's Alex. I'm a recent alum of MIT's technology and policy program. And my question's very similar to the gentleman's down there, maybe a bit more generalized. You talked about how the explainability of machine learning needs to be key in the military context. But at the same time, deep nets are generally not that explainable. So how do you, in the military context, balance this tradeoff between explainability and perhaps performance, given that deep nets are currently the state-of-the-art, both in the context of funding research and in deciding what to build in terms of models?
JASON MATHENY: At IARPA, we've had several programs where we require explainability, and we're willing to suffer the decrease in performance in order to get it. Because a tool that's not used by an analyst is non-performing. So even if the tool performs well in a technical sense, if it's not trusted by an analyst, it's not going to get used. In those cases, maybe you suffer, then, a 10% decrease in performance. But if it's explainable, it's going to be much more valuable to the analyst.
And I do think there are some really good research efforts now aimed at explaining the behavior of deep neural nets; for example, the Explainable AI program by our colleagues in DARPA, and then some work at IARPA that's looking at explaining different classifier outputs, in part to detect manipulation.
ERIC ROSENBACH: Go ahead, please.
Q: Hi, my name's Dan. I'm an undergrad at Tufts. I study Russian and computational linguistics, and I also work on some IARPA-related projects. My question relates to education in artificial intelligence, because I see that as being a very large roadblock for people like me and other undergrads who are hoping to, like you said, eventually get into this intersection of public policy and artificial intelligence. And right now, at least as far as things seem, there's been a huge push of people at universities migrating towards the private sector, where there are just larger financial incentives. There's a lot of talk about how there's very high demand for deep learning and artificial intelligence researchers. However, the supply's fairly minimal given that, like you said, things only started six years ago.
And maybe this is more of an education policy-oriented question, but how do we incentivize people to, I guess, stick around in the public policy sector so that we can continue to foster this sort of artificial intelligence education for future generations? And what sort of role, I guess, does IARPA play in that?
JASON MATHENY: It can often be hard to compete on salary. We try to appeal to social service: the ability to influence where humanity's long-term welfare can be improved, the ability to shape technology such that we're safer and healthier and happier, as opposed to improving ad hits by 1%. And that's not to put down industry jobs; I think there's lots of important work that can be done within industry to ensure that technologies are used for the good of humanity. But there's nothing like working within NGOs or government agencies or think tanks that help advise the public and government about these important technology issues. The satisfaction that you gain from that, I think, beats the salaries.
ERIC ROSENBACH: That's a great question, thank you. I'd just say, having seen the Department of Defense and NSA struggle to recruit really high-end cyber talent, the market dynamics are very similar - pay far outstrips the government. But in that field, when you know you can legally go and hack the Iranians all day long, or, in this context, you can take on all of these hard questions that we've been grappling with here, you only do that in government. So come, you all, you can do the right thing, you can contribute to the country, to the world, too. You can go work for Facebook later. But make a difference now. That's your little pep talk.
Yes, ma'am, could you go ahead? This will be our last question because I think I'm getting the hook. So we'll just go right here to you.
Q: So much pressure. Hi, I work here, and I do research primarily on how cities are using AI in predictive analytics. So one of the things that we're really seeing in terms of policy challenge in the field is issues around intellectual property. And I was really curious- basically private companies owning algorithms that are deployed by cities and then sometimes the public advocating for algorithmic transparency. And so, my question for you, I guess, is, what does the economy of this look like in terms of national security and in terms of intellectual property?
JASON MATHENY: We fund mostly open source software just because it's easier to defend - with more eyes, all your bugs are shallow, that kind of idea. I do think that most of the companies that are attracting really the world class talent in machine learning have also realized that to attract them, they typically need to free up the models and the algorithms. The companies treat their data as being proprietary and sort of the secret sauce, but are publishing on the algorithm work. And I think that's something that in general we also follow.
So for the intelligence community, the data is the secret sauce, but we would like for more people to be reviewing the algorithms and improving those algorithms so that they're less errorful.
ERIC ROSENBACH: Before we wrap up, there's one other person I want to recognize. This is Charles Carithers. Stand up here, Charles. This is a real live Kennedy School grad who also works in IARPA. He's the associate deputy director at IARPA helping out Jason here, too. So we've got to give him a round of applause [applause] because he, just like the question posed, he could be making a lot of money someplace else, but he's doing public service and we appreciate that. Thank you very much.
JASON MATHENY: He's done you all proud. And every time he goes to a meeting and people find out that he's from the Kennedy School, they're like, "Okay, I'm going to go there." [laughter]
ERIC ROSENBACH: So thank you all very much for coming, giving the time. Jason, thank you so much.
JASON MATHENY: Thank you, Eric. Thank you, all.
[applause]
END