The Software behind the PRISM Intelligence-Gathering Program
In the past week, news about the National Security Agency’s no-longer-secret PRISM intelligence-gathering program has poured out with no sign of slowing down anytime soon. At the center of this explosive story is a sophisticated computer system that can sift through enormous amounts of data and extrapolate meaning, giving the NSA a possible way to track millions of people and predict their behavior. I guess big data is more than just hype.
To give a brief rundown of how the PRISM story got started: Edward Snowden, a former technical assistant for the CIA in IT security and a current employee of Booz Allen Hamilton, leaked confidential documents (PowerPoint slides) detailing “the breadth of U.S. electronic surveillance capabilities,” as the Washington Post reports. A media firestorm ensued, and the government is pushing back hard (with a few notable exceptions).
Now, on to the program. Through PRISM, members of the US intelligence community reportedly have “access to the servers of nine Internet companies for a wide range of digital data.” The companies include Internet heavyweights Microsoft, Google, Facebook, and Apple, and the data include (depending on the provider) everything from email and videos to file transfers and video-conferencing information.
Wired describes PRISM as “some kind of API to automate the process of submitting court orders to the Internet companies and receiving their responses and data,” although who exactly is getting monitored is unknown at this time. US Director of National Intelligence James R. Clapper says only foreign suspects outside the United States are being targeted, as “authorized by Section 702 of the Foreign Intelligence Surveillance Act.”
Mashable chatted with independent privacy researcher Ashkan Soltani, who said PRISM is “basically a data-ingestion API”:
Soltani speculated that based on what we know now, PRISM is a “streamlined way” to submit Section 702 orders to the companies for them to review the requests, and it gives the NSA the ability to handle and process the response “in an automated fashion,” just like an app like TripIt, which automatically parses information from your flight reservations.
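To make the "data-ingestion API" analogy concrete, here is a purely hypothetical sketch of what automated parsing of a structured provider response could look like. Nothing here reflects PRISM's actual interfaces; the field names, record shape, and JSON format are all assumptions made for illustration, in the same spirit as TripIt turning a flight-confirmation email into structured fields.

```python
import json
from dataclasses import dataclass

@dataclass
class Record:
    """A normalized record extracted from a provider response.
    All field names here are hypothetical, chosen only for illustration."""
    provider: str
    account: str
    kind: str      # e.g. "email", "video", "file-transfer"
    payload: str

def ingest_response(raw_json: str) -> list[Record]:
    # Parse a (hypothetical) JSON response into uniform records, so that
    # downstream processing can happen "in an automated fashion" rather
    # than by hand — the essence of a data-ingestion API.
    doc = json.loads(raw_json)
    return [
        Record(doc["provider"], item["account"], item["kind"], item["payload"])
        for item in doc["items"]
    ]

sample = json.dumps({
    "provider": "example-provider",
    "items": [
        {"account": "user@example.com", "kind": "email", "payload": "..."},
    ],
})
records = ingest_response(sample)
```

The point of such a layer is simply that heterogeneous responses from many companies get reduced to one uniform record type before any analysis happens.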
While the exact mechanisms of how PRISM works remain unknown (as of now), it’s clear that the NSA’s data-mining capabilities are “now far greater than most outsiders believed,” The New York Times reports:
The government has poured billions of dollars into the agency over the last decade, building a one-million-square-foot fortress in the mountains of Utah, apparently to store huge volumes of personal data indefinitely. It created intercept stations across the country, according to former industry and intelligence officials, and helped build one of the world’s fastest computers to crack the codes that protect information.
The Wall Street Journal posted its own story detailing the technology behind the PRISM collection program; it should come as no surprise that Hadoop plays a big role.
From the Wall Street Journal:
The NSA also became an early adopter. At a 2009 conference on so-called cloud computing, an NSA official said the agency was developing a new system by linking its various databases and using Hadoop software to analyze them, according to comments reported by the trade publication InformationWeek.
The system would hold “essentially every kind of data there is,” said Randy Garrett, who was then director of technology for the NSA’s integrated intelligence program. “The object is to do things that were essentially impossible before.”
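For readers unfamiliar with Hadoop, its core idea is the MapReduce model: a map step emits key/value pairs from each record, a shuffle step groups pairs by key, and a reduce step aggregates each group. The toy in-process version below illustrates only the model; the sample data and keys are invented, and a real Hadoop job would run the same two functions distributed across a cluster (for example via Hadoop Streaming).

```python
from collections import defaultdict

def mapper(record):
    # Map step: emit a (key, 1) pair per record — here, counting
    # records per (made-up) data type.
    yield (record["type"], 1)

def reducer(key, values):
    # Reduce step: aggregate all values that share a key.
    return (key, sum(values))

def run_mapreduce(records):
    # Shuffle step: group mapper output by key, as Hadoop does
    # between the map and reduce stages.
    grouped = defaultdict(list)
    for rec in records:
        for key, value in mapper(rec):
            grouped[key].append(value)
    return dict(reducer(k, v) for k, v in grouped.items())

data = [{"type": "email"}, {"type": "video"}, {"type": "email"}]
counts = run_mapreduce(data)
# counts == {"email": 2, "video": 1}
```

Because map and reduce operate on independent keys, the same logic scales from three records on a laptop to petabytes across linked databases — which is why Hadoop fits the kind of system Garrett describes.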