Customs and Border Protection

Customs and Border Protection (CBP) is the arm of DHS charged primarily with securing the nation’s borders. CBP uses social media information as part of its review of applications to enter the United States. Social media information is also part of CBP’s preflight risk assessments and watch list screening and is used to develop broader intelligence analysis products. CBP’s reliance on social media to perform these critically important functions is misplaced. DHS’s own pilot programs show that social media information is rarely a reliable basis for making judgments. And the vague standards used to assess social media invite discrimination against certain individuals, such as those involved in protest and activism and Muslim travelers. Unreliable social media information is easily shared within and beyond DHS, exposing personal information to a range of actors and increasing the risk that the data will be used out of context.

1. Visa Vetting

A. Visa Waivers (ESTA Program)

DHS, in consultation with the State Department, administers the Visa Waiver Program, through which citizens of 38 mainly Western European countries can travel to the United States for business or tourism without obtaining a visa. In fiscal year 2017, more than 23 million travelers came to the United States through the program. Travelers from these countries who wish to obtain a visa waiver must complete a mandatory online application on the Electronic System for Travel Authorization (ESTA). The information provided through ESTA is vetted against security and law enforcement databases to determine whether applicants are eligible to travel under the program and to ensure they do not pose a law enforcement or security risk. These travelers are also continually screened in real time.

Social media information is increasingly being used in this process to vet for national security concerns, although only one American was killed in a terrorist attack by a traveler on the Visa Waiver Program between 1975 and 2017, according to a study by the Cato Institute. While social media checks were previously used by CBP, the agency added a new question to the forms in December 2016, asking all applicants to voluntarily provide their social media identifiers, such as any usernames and platforms used. If applicants choose to provide identifying information, officers may use it to locate their profiles and accounts when the initial screening indicates “possible information of concern” or “a need to further validate information.”

Regardless of whether ESTA applicants have chosen to provide their social media identifiers, CBP officers may still choose to manually check their accounts; it does not appear that the officer must first make a finding of “possible information of concern” or “a need to further validate information” in order to do so. In such instances, in addition to the interpretive issues identified above, it is unclear how CBP officials confirm that they have correctly connected the applicant to the right social media accounts. This was a recurring problem in the pilot programs discussed previously.

Publicly available documents do not indicate what types of postings on social media would be considered by CBP to be indicative of a national security threat. But the vagueness of the standards creates the risk that innocuous social media activity will be used as a means of excluding people of certain political or religious beliefs. In a nod to this risk, CBP documents state that information from social media “will not” be the sole basis upon which CBP denies someone entry to the United States. But this restriction may not be particularly effective because CBP could combine one questionable or weak social media “find” with virtually any other information to deny a visa waiver. For example, CBP and other arms of DHS are not permitted to use ethnicity as the sole basis for suspecting an individual is undocumented, but ethnicity combined with other factors — such as appearing nervous — has been used to stop people on suspicion of undocumented status.

The social media check can also extend to associates who posted on or interacted with an applicant on their social media profile, which could include Americans and other contacts living in the United States if “relevant to making an ESTA determination.” In addition, CBP uses “link analysis” to proactively identify contacts of applicants (e.g., friends, followers, or “likes”), as well as the applicant’s secondary and tertiary contacts who might “pose a potential risk to the homeland” or “demonstrate a nefarious affiliation on the part of the applicant.” CBP has no qualms about drawing adverse conclusions from things that third parties have posted — rather, it “presumes” that at least some of the information posted on the applicant’s site, including from third parties, is accurate because “individuals generally have some degree of control over what is posted on their sites.”
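The "link analysis" described above, which reaches an applicant's secondary and tertiary contacts, is conceptually a bounded traversal of a contact graph. The sketch below is purely illustrative: every account name is invented, and the actual CBP tooling is not public. It shows how quickly a check anchored on one applicant can sweep in people several steps removed.

```python
from collections import deque

# Hypothetical contact graph: each account maps to the accounts it
# follows or interacts with. All names are invented for illustration;
# real systems would build this graph from platform data.
CONTACTS = {
    "applicant": ["friend_a", "friend_b"],
    "friend_a": ["friend_c"],
    "friend_b": ["friend_c", "friend_d"],
    "friend_c": ["friend_e"],
    "friend_d": [],
    "friend_e": [],
}

def contacts_within(graph, start, max_hops):
    """Collect every account reachable from `start` within `max_hops`
    hops -- primary, secondary, and tertiary contacts when max_hops
    is 3. Returns a mapping of account -> degree of separation."""
    seen = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if seen[node] == max_hops:
            continue  # do not expand beyond the hop limit
        for neighbor in graph.get(node, []):
            if neighbor not in seen:
                seen[neighbor] = seen[node] + 1
                queue.append(neighbor)
    seen.pop(start)
    return seen

# contacts_within(CONTACTS, "applicant", 3) reaches every account in
# this toy graph, including "friend_e", who is three hops removed and
# may be a total stranger to the applicant.
```

Even with this tiny graph, a three-hop traversal touches everyone; at the scale of real social networks, the number of people swept in grows explosively with each additional hop.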

Thus, even if nothing posted by the applicant suggests he or she poses a risk, CBP could still potentially deny a visa waiver based in part on concerns related to a tweet posted by a “friend” or follower, who could easily be someone the applicant does not even know. Unfortunately, unlike some other DHS programs, there is no opportunity for the applicant to address or explain the inferences that CBP draws from their social media.

DHS rules require officers to collect only the minimum personally identifiable information “necessary for the proper performance of their authorized duties.” But according to the 2017 privacy audit of ESTA, DHS’s Privacy Office could not verify whether CBP was adhering to this requirement. Other significant controls — that DHS officers are limited to reviewing publicly available information and must use official DHS accounts to conduct such checks — can be circumvented using a technique called “masked monitoring.” But the circumstances triggering such monitoring and the applicable rules are not publicly available.

All social media information about those applying for visa waivers (and potentially about their friends and contacts), as well as other data from ESTA applications and related paperwork, is stored in CBP’s Automated Targeting System (ATS). CBP agents use the information in ATS to assign risk assessments to travelers, which can impact their vetting and questioning at the border. ATS risk assessments and other analyses also feed into a number of watch lists, such as the FBI’s Terrorist Screening Database and TSA Watch Lists, as well as analytical products on trends and threats. In other words, what a person says on social media, which is often context-specific and ambiguous to outsiders, feeds into every aspect of CBP’s work and that of DHS more broadly.

ESTA information — about applicants and their friends and families — is also disseminated widely to a broad range of entities, including the Departments of Justice and State. As of December 2018, the National Vetting Center (NVC), a presidentially created clearinghouse and coordination center for vetting information, has been involved in ESTA’s work. CBP is required to regularly share ESTA application data with a number of agencies involved in the NVC, including the CIA and the Department of Defense, to be compared against the holdings of those agencies. Beyond the bulk sharing with the NVC, ESTA information sharing with other agencies is not confined to situations in which there is an indication that the traveler has violated the law. Rather, it can take place simply when DHS determines that the information “would assist in the enforcement of civil or criminal matters.” In addition, DHS and the National Counterterrorism Center (NCTC), which is charged with collecting counterterrorism intelligence, have entered into a memorandum of understanding allowing DHS to disclose the entire ESTA data set to the NCTC. This data set would go far beyond information about individuals suspected of any connection to terrorism and would include information gathered during routine interactions with the public (e.g., screening travelers, reviewing immigration benefit applications, issuing immigration benefits).

In sum, the ESTA program demonstrates that CBP collects highly personal information available on social media about those applying for visa waivers and the people in their networks. CBP uses this information, which is highly contextual and subject to interpretation, to decide whether an individual poses an undefined “security risk.” All of this information is stored in DHS databases for years and potentially used for a range of purposes, often far removed from the purpose of the initial collection. The information is shared in bulk with the NCTC, and with other law enforcement agencies as long as it could be of “assistance” to them, creating risks to privacy and freedom of speech and association.

B. Visa Applications

The State Department has ramped up its collection of social media information from people applying for visas, which it shares with DHS to be vetted using ATS. In May 2017, the State Department began requiring some categories of visa applicants — estimated at 65,000 per year — to provide the identifiers they used on all social media platforms within the previous five years. It seems likely that this move was aimed primarily at travelers from the Muslim ban countries; the Federal Register notice announcing the rule change indicated that it was being implemented as part of the Muslim ban, and the notice’s estimate of the number of travelers who would be affected by the change approximately matched those affected by the overall ban.

In March 2018, the State Department sought to vastly expand the collection of social media identifiers to the approximately 15 million people who apply for visas each year. The Office of Management and Budget (OMB) approved the proposal in April 2019, which means the State Department will begin collecting from nearly all visa applicants their social media identifiers associated with any of 20 listed social media platforms, more than half of which are based in the United States (Facebook, Flickr, Google+, Instagram, LinkedIn, Myspace, Pinterest, Reddit, Tumblr, Twitter, Vine, and YouTube). The other platforms are based in China (Douban, QQ, Sina Weibo, Tencent Weibo, and Youku), Russia (Vkontakte), Belgium (Twoo), and Latvia (Ask.fm). Applicants will also have the option of providing identifiers for platforms not included on the list.

As with the DHS social media collection programs described throughout this paper, there is limited information on what the State Department’s review of applicants’ social media activity will entail. We only know that it is meant to enable consular officers to confirm applicants’ identity and adjudicate their eligibility for a visa under the Immigration and Nationality Act. While the notice does state that “the collection of social media platforms and identifiers will not be used to deny visas based on applicants’ race, religion, ethnicity, national origin, political views, gender, or sexual orientation,” this restriction is easily circumvented: a social media post revealing an applicant’s religious or political affiliation may not alone justify denial, but other information in his or her file could easily be used as a pretext, particularly given the broad discretion exercised by consular officials. According to the statement supporting the notice, consular officers will also be directed not to request passwords, violate the applicant’s privacy settings or the platforms’ terms of service, or engage with the applicant on social media; to comply with State Department guidance limiting the use of social media; and to avoid collecting third-party information.

The State Department’s expected trove of information will likely be used for a variety of purposes beyond visa vetting. Social media identifiers collected by the State Department will be stored in the Consolidated Consular Database, which is ingested into ATS and becomes available to DHS personnel. Further, that information will be used in coordination with other department officials and partner U.S. government agencies. Indeed, numerous other agencies have access to the visa records system in which applicants’ social media information will be stored, and — along with foreign governments — can obtain information from the system.

In sum, the State Department’s collection of social media information, which already includes 65,000 visa applicants (likely those targeted by Trump’s Muslim ban), is on track to begin creating a registry that will include 15 million people after the first year alone. Not only will this information be used by the State Department in undefined ways to make visa determinations, but it will be yet another source of personal information that is funneled into DHS’s many interconnected and far-reaching systems.

2. Warrantless Border Searches

CBP conducts warrantless searches at the border on a wide variety of electronic devices, such as phones, laptops, and tablets, many of which are likely to result in the collection of social media information. According to CBP, these searches are meant to help uncover evidence concerning terrorism and other national security matters, criminal activity like child pornography and smuggling, and information about financial and commercial crimes. However, CBP documents also describe these searches as “integral” to determining an individual’s “intentions upon entry” and to providing other information regarding admissibility.

While some of these searches are conducted manually, CBP also has technical tools for extracting information from these devices, potentially including information stored remotely. It has purchased powerful handheld Universal Forensic Extraction Devices (UFEDs), developed by the Israeli company Cellebrite, which can be plugged into phones and laptops to extract in a matter of seconds the entirety of a device’s memory, including all data from social media applications both on the device and from cloud-based accounts like Facebook, Gmail, iCloud, and WhatsApp.

Searches by CBP of travelers’ electronic devices at ports of entry have increased dramatically over the past several years. In fiscal year 2015, 8,503 people had their devices searched. By fiscal year 2017, the number had reached 30,200 — an increase of over three and a half times. According to CBP, these searches do not require a warrant, due to “a reduced expectation of privacy associated with international travel.” Both American and foreign travelers are subjected to these warrantless searches. In 2017, 10 U.S. citizens and one green card holder filed suit challenging warrantless searches of electronic devices at the border. The complaint highlights the intrusiveness of these searches, both for the person being searched and for the traveler’s family, friends, and acquaintances, given the many contact lists, email messages, texts, social media postings, and voicemails that cellphones and laptops often contain. In November 2019, the U.S. District Court in Massachusetts ruled in the case that CBP’s and ICE’s suspicionless searches of electronic devices at ports of entry violate the Fourth Amendment and that these searches require reasonable suspicion that devices contain contraband.

Under a January 2018 directive, CBP is permitted to conduct two types of searches: “basic” and “advanced,” both of which would allow collection of information from social media. The 2018 directive changed CBP’s previous, more permissive rule, likely as a partial and belated response to a 2013 federal court decision, United States v. Cotterman. In that case, a federal court of appeals held that the fact that a device was seized at a border did “not justify unfettered crime-fighting searches or an unregulated assault on citizens’ private information,” and required that officers have reasonable suspicion of criminal activity to conduct forensic searches of electronic devices. CBP is permitted to refer travelers to ICE at any stage of the inspection process, at which point ICE’s rules would apply; while ICE also issued a 2018 policy barring the use of advanced searches without reasonable suspicion, it is not yet known how personnel are being directed to implement this policy, meaning that ICE’s searches may in practice be more permissive than CBP’s.

Under CBP’s new rules, a basic search permits an agent to view information that “would ordinarily be visible by scrolling through the phone manually.” No suspicion of criminal wrongdoing or national security risk is required for basic searches. For either type of search, agents are prohibited from “intentionally” accessing data that is “solely stored remotely”; only information that is “resident on the device and accessible through the device’s operating system or through other software, tools, or applications” may be viewed. CBP officers are supposed to disable network connectivity or request that the traveler do so (e.g., by switching to airplane mode) prior to the search; they are also supposed to conduct the search in the presence of the traveler in most circumstances, though the individual will not always observe the actual search.

Despite these new guidelines, CBP agents will probably still be able to access social media information during a search. If a traveler has social media data downloaded onto his or her device or cached in some way, it is likely accessible even if connectivity is turned off. For example, if a traveler was scrolling through a Twitter or Facebook feed prior to being selected for a search, any loaded data, such as his or her newsfeed, would be accessible on the user’s phone or laptop.

The officer may also request that the traveler provide any passcodes needed to access the contents of a device. Although a traveler can refuse to provide a code, CBP may then keep the device in order to try to access its contents by other means. U.S. citizens must be admitted to the country even if they do not provide passcodes, though their phones may still be held for five days or longer. Noncitizens, however, including visa holders and tourists from visa waiver countries, may be denied entry entirely.

An advanced search occurs when an officer connects an electronic device to external equipment, via a wired or wireless connection, to review, copy, or analyze its contents. Advanced searches are highly intrusive, and the tools that CBP has purchased allow it to capture all files and information on the device, including password-protected or encrypted data.

Officers are authorized to perform advanced searches if there is reasonable suspicion that one of the laws enforced or administered by CBP has been violated or if there is a “national security concern.” In creating an exception for “national security concerns,” DHS policy departs from the Cotterman decision, which required reasonable suspicion for all forensic searches. While DHS does not define what constitutes a national security concern, national security is an expansive term that could easily swallow up the requirement of suspicion for these highly intrusive searches. The examples listed in the 2018 privacy impact assessment suggest that national security searches will be based on watch lists. However, this category includes not just lists kept by the government — primarily the FBI and DHS — but other lists as well, such as unspecified “government-vetted” watch lists and a “national security-related lookout in combination with other articulable factors as appropriate.” And, of course, these examples are not exhaustive, leaving open the possibility that agents will use the cover of national security to undertake forensic searches even when there is no relevant watch list.

Following both basic and advanced searches, the officer enters notes about the interaction, including “a record of any electronic devices searched,” into TECS, CBP’s primary law enforcement system. This typically includes device details, the type of search performed (basic or advanced), and the “officer’s remarks of the inspection.” CBP may detain a device, or copies of the information it contains, for up to five days, although it can keep a device longer when there are unspecified “extenuating circumstances.” If there is no probable cause to seize and retain a device or the information it contains, the device must be returned to the traveler and any copies destroyed. However, CBP may retain without probable cause any information “relating to immigration, customs, and other enforcement matters,” which seems to allow it to essentially circumvent the probable cause requirement. For instance, information that could be considered useful for determining whether an individual may be permitted to travel to the United States could be stored in the individual’s Alien File, which is retained until 100 years after the individual’s date of birth.

Any information that is copied directly from an electronic device during an advanced search (presumably based on probable cause) is stored in ATS, which allows agents to further analyze information collected by comparing it against various pools of data and applying ATS’s analytic and machine learning tools to recognize trends and patterns. CBP may disclose information from electronic device searches to other agencies, both within and outside DHS, if it is evidence of violation of a law or rule that those agencies are charged with enforcing.

Notably, a December 2018 DHS inspector general report concluded that CBP had not been following its own standard operating procedures prior to the implementation of the new rules. The report, which was based on a review of CBP’s electronic device searches at ports of entry from April 2016 to July 2017, found that officers frequently did not document searches properly, that they consistently failed to disable network connection prior to search (specifically for cell phones), and that the systems used and data collected during searches were in many cases not adequately managed and secured. For instance, officers often failed to delete travelers’ information stored on the thumb drives used to transfer data to ATS during advanced searches. The report also found that CBP had no performance measures in place to assess the effectiveness of its forensic searches of electronic devices.

The 2018 directive instructed CBP to develop and periodically administer an auditing mechanism to ensure that border searches of electronic devices were complying with its requirements. However, the agency has published neither the requirements nor the results of the audits. In February 2019, the Electronic Privacy Information Center (EPIC) sued for the release of this information.

Even if the rules are operating as intended, they may also be applied discriminatorily. For instance, Muslim travelers have long been singled out for additional scrutiny because of their faith, which President Trump and his administration have repeatedly and inaccurately connected with “terrorism.” Just months after the new policy was issued, the Council on American-Islamic Relations (CAIR) sued CBP on behalf of a Muslim American woman whose iPhone was seized and its contents imaged when she came home from Zurich. She was also questioned about her travel history and whether she had ever been a refugee. The lawsuit asked CBP to explain what suspicion warranted the forensic search and demanded deletion of the information seized. The government quickly settled, agreeing to delete the data it had seized.

In sum, CBP is increasingly deploying its claimed warrantless border search authority to search the electronic devices of both visitors and American travelers. Basic searches conducted without any suspicion of wrongdoing can result in the scrutiny of travelers’ social media information. Advanced searches will result in the collection of huge amounts of personal information, including from social media, about both the person whose device is being searched and that person’s contacts. CBP has stated that it has this broad authority in order to help uncover information related to terrorism and criminal activity and to determine admissibility. But there is little indication in public documents as to what type of content officers should be looking for, especially in deciding whether a traveler can enter the country, allowing for unfocused fishing expeditions. And these searches are not subject to even minimal safeguards — such as an instruction to avoid making decisions based solely on social media or a prohibition on profiling. And the search is just the start. CBP is permitted to retain information relating to immigration, customs, or other enforcement matters it finds useful, including a copy of the contents of phones and laptops; as discussed further below, the agency may also further analyze the information using unknown tools and algorithms.

3. Searches Pursuant to Warrant, Consent, or Abandonment

CBP also collects information from electronic devices in three other situations:

When it has a warrant authorized by a judge or magistrate based on probable cause;

When an officer finds an abandoned device that he or she suspects “might be associated with a criminal act” or was found in “unusual circumstances” (such as between points of entry in the “border zone,” the area within 100 miles of any U.S. boundary in which Border Patrol claims authority to conduct immigration checks); and

When the owner has consented.

According to CBP, once the information is determined to be “accurate and reliable,” it is used to support the agency’s border enforcement operations and criminal investigations. DHS materials note that such information is “typically” used only to corroborate evidence already in the agency’s possession.

Agents are explicitly allowed to collect information stored in the cloud when spelled out in a warrant or when the owner consents, but it is not clear whether cloud data can be accessed from abandoned devices. A CBP officer or agent can submit devices found in one of the aforementioned scenarios for digital forensic analysis, which is usually undertaken by a team of agents at the intelligence unit for the relevant Border Patrol sector.

If the CBP agent determines after conducting one of these examinations that an electronic device holds information that is “relevant” to the agency’s law enforcement authorities, the agent may load all information into a standalone information technology system for analysis. This is the rare database that “may not be connected to a CBP or DHS network.” The tools built into these stand-alone systems allow CBP to perform various analyses on the collected information. One system, ADACS4, is used to analyze data from electronic devices in order to discover “connections, patterns, and trends” relating to “terrorism” and the smuggling of people and drugs, as well as other activities that threaten border security.

CBP retains information associated with arrests, detentions, and removals, including data obtained from electronic devices, for up to 75 years. Even information that does not lead to the arrest, detention, or removal of an individual — and that may be completely irrelevant to DHS’s duties — may be stored for 20 years “after the matter is closed.”

The information collected by CBP from electronic devices is frequently disseminated within DHS and to other federal agencies or state and local law enforcement agencies with a need to know, and less frequently to foreign law enforcement partners. In addition to sharing with agencies investigating or prosecuting a violation of law, CBP may also share information for unspecified counterterrorism and intelligence reasons.

The CBP search authorities detailed above allow the collection of social media information. While the warrant and consent authorities seem reasonably cabined, the authority to search abandoned devices is quite expansive, especially if it is read to apply to all devices found within 100 miles of U.S. land or coastal borders, where two-thirds of Americans live. It is not clear why the information from these categories of devices is held in a separate database, unconnected to other DHS systems. As with other collection programs, CBP uses the social media information it collects to conduct trend or pattern analyses and shares it with other agencies, raising concerns about how potential misinterpretations and out-of-context information are deployed.

4. Analytical Tools and Databases

After CBP personnel collect social media information — including from ESTA and visa applications, from electronic devices searched under their claimed border search authority, and from numerous other sources — the data is provided to analysts who conduct one or more of three main types of analyses:

A. Assigning individual risk assessments: comparing an individual’s personally identifiable information against DHS-held sources to assess his or her level of risk, such as whether the individual or her associates may present a security threat, in order to determine what level of inspection she is required to undergo and whether to allow her to enter the country;

B. Trend, pattern, and predictive analysis: identifying patterns, anomalies, and subtle relationships in data to guide operational strategy or predict future outcomes; and

C. Link and network analysis: identifying possible associations among data points, people, groups, entities, events, and investigations.

These analytical capabilities are interrelated and interdependent and serve as the backbone of CBP intelligence work. Because the ways in which CBP conducts these analyses and draws conclusions from data depend heavily on interactions among the agency’s various data systems, this section will provide an overview of the key systems and their analytical functions. It shows that the social media information in each of these databases is amassed on the basis of overbroad criteria and without accuracy requirements, shared widely with few or no restrictions, analyzed using opaque algorithms and tools, and often retained longer than the approved retention schedules.

A. Assigning Individual Risk Assessments

The primary system CBP uses for combining and analyzing data, including for assigning risk assessments, is the Automated Targeting System (ATS). There is scant publicly available information regarding the foundation, accuracy, or relevance of these risk assessments; nor do we know whether the factors used in assessments are non-discriminatory. We do know, however, that social media is likely a common source in formulating risk assessments. ATS contains copies of numerous databases and data sets that include social media information, such as CBP’s ESTA, the FBI’s Terrorist Screening Database (TSDB), and data from electronic devices collected during CBP border searches. ATS also appears to ingest social media information directly from commercial vendors. CBP agents use secret analytic tools to combine the information gathered from these various sources, including from social media, to assign risk assessments to travelers, including Americans flying domestically. These assessments may get a person placed on a watch list like the TSDB, and determine whether the person gets a boarding pass or if additional screening is necessary.

To be clear, the individuals who are subjected to these measures are not necessarily suspected of a crime or a link to criminal activity. Rather, an individual’s risk level is determined by a profile, which can be influenced by social media information contained in ATS or other databases, as well as ad hoc queries of information on the internet, including queries of social media platforms. Notably, DHS exempted ATS from accuracy requirements under the Privacy Act, so the information that goes into one’s risk assessment need not be correct, relevant, or complete.

ATS’s individual risk assessment capabilities are also leveraged by ICE in its enforcement activities against people who have overstayed their visas. ATS receives the names of potential overstays from CBP’s arrivals and departures management system, and ATS automatically vets each name against its records to create a prioritized list based on individuals’ “associated risk patterns.” The prioritized list is then sent to ICE’s lead management system, LeadTrac (discussed further in the ICE Visa Overstay Enforcement section below).
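Public documents do not disclose how ATS weighs its “associated risk patterns,” but the general shape of automated vetting against a rule set to produce a prioritized list can be sketched as follows. This is a minimal illustration, not the actual DHS logic: every rule, weight, and field name here is an invented assumption.

```python
# Hypothetical sketch of rule-based risk prioritization, loosely modeled on
# public descriptions of ATS overstay vetting. All rules, weights, and field
# names are illustrative assumptions, not actual DHS criteria.

def risk_score(record, rules):
    """Sum the weights of every rule the record matches."""
    return sum(weight for predicate, weight in rules if predicate(record))

def prioritize(records, rules):
    """Return records ordered from highest to lowest risk score."""
    return sorted(records, key=lambda r: risk_score(r, rules), reverse=True)

# Illustrative rules: each entry is a (predicate, weight) pair.
rules = [
    (lambda r: r["days_overstayed"] > 90, 3),
    (lambda r: r["prior_encounters"] > 0, 2),
    (lambda r: r["watchlist_hit"], 5),
]

records = [
    {"name": "A", "days_overstayed": 10, "prior_encounters": 0, "watchlist_hit": False},
    {"name": "B", "days_overstayed": 120, "prior_encounters": 1, "watchlist_hit": False},
    {"name": "C", "days_overstayed": 5, "prior_encounters": 0, "watchlist_hit": True},
]

ranked = prioritize(records, rules)
# "B" and "C" (score 5 each) outrank "A" (score 0)
```

The point of the sketch is that the output ranking is entirely a function of whatever rules and weights the agency chooses; if those inputs encode discriminatory proxies, the “prioritized list” inherits them automatically.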

It is not clear what standard is used in determining “risk” in these profiles or how exactly social media information is weighted. But it seems likely that ATS’s data mining toolkit, which includes “social network analysis” capabilities that may rely on social media information, is an important part of formulating risk assessments.

Risk assessments and other records in ATS are retained for 15 years, unless the information is “linked to active law enforcement lookout records . . . or other defined sets of circumstances,” in which case the information is retained for “the life of the law enforcement matter.” Notably, the most recent ATS privacy impact assessment admits that the system fails to “consistently follow source system retention periods, but instead relies on the ATS-specific retention period of 15 years,” often retaining data for a period that exceeds the data retention requirements of the system from which it originated (for instance, three years for sources from ESTA). Therefore, ATS passes information to partners long after it has been corrected or deleted from other databases.
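The retention mismatch the privacy impact assessment describes is mechanically simple: a record copied into an aggregating system is judged against the aggregator’s clock, not its source system’s. The following sketch, with invented dates and names, shows how a record expired under a three-year source schedule (such as ESTA’s) can remain live under a fifteen-year schedule (such as ATS’s).

```python
from datetime import date, timedelta

# Hypothetical sketch of the retention mismatch described above. The systems,
# periods, and dates are taken from the report's figures but the code itself
# is purely illustrative.

SOURCE_RETENTION = timedelta(days=3 * 365)       # e.g., ESTA's schedule
AGGREGATOR_RETENTION = timedelta(days=15 * 365)  # e.g., ATS's schedule

def expired(collected_on, retention, today):
    """True if the record is past its retention period as of `today`."""
    return today - collected_on > retention

collected_on = date(2010, 1, 1)
today = date(2016, 1, 1)  # six years after collection

# Deleted (or due for deletion) in the source system...
assert expired(collected_on, SOURCE_RETENTION, today)
# ...but still retained, and shareable, in the aggregator.
assert not expired(collected_on, AGGREGATOR_RETENTION, today)
```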

ATS information, including personally identifiable information, is disseminated broadly within DHS and to other federal agencies, and many DHS officers have direct access to ATS. It is unclear, however, whether risk assessments and the underlying social media data on which they are based may be disseminated beyond ATS.

B. Trend, Pattern, and Predictive Analysis

Essential to the process of assigning risk assessments are the CBP-formulated “rules,” or “patterns” identified as “requiring additional scrutiny,” that CBP personnel use to vet information in ATS in order to evaluate an individual’s risk level. These patterns are based on trend analyses of suspicious activity and raw intelligence, as well as CBP officer experience and law enforcement cases. In addition to assigning risk assessments, ATS is used as a vetting tool by both USCIS (for refugees and applicants for certain immigration benefits) and the Department of State (for visa applicants) and to analyze device data obtained at the border. For each of these functions, CBP agents use ATS to compare incoming information against ATS holdings and apply ATS’s analytic and machine learning tools to recognize trends and patterns.

CBP agents also use ATS for preflight screenings (which will be discussed in more detail in the TSA section) to identify individuals who, though not on any watch list, “exhibit high risk indicators or travel patterns.” ATS’s analytic capabilities likely underpin its determinations of “high risk” patterns.

ATS is also central to a DHS-wide “big data” effort, the DHS Data Framework. Similar to ATS in structure and purpose but wider in scope, the Data Framework is an information technology system with various analytic capabilities, including tools to create maps and time lines and analyze trends and patterns.

The Data Framework ingests and analyzes huge amounts of data from across the department and from other agencies. Originally the Data Framework was meant to import data sets directly from dozens of source systems and categorize the data in order to abide by retention limits and access restriction policies and to ensure that only particular data sets are subject to certain analytical processes. However, as of April 2015, data sets started being pulled straight from ATS instead of from the source systems, and the Data Framework stopped tagging and categorizing data before running analytics. DHS said this change was merely an “interim process” of mass data transfer in order to expedite its ability to identify individuals “supporting the terrorist activities” in the Middle East. The interim process was originally established to last for 180 days, with the possibility of extensions in 90-day increments. However, the interim period continued for at least three and a half years (April 2015–October 2018), and it is unclear whether it is still ongoing.

The Data Framework’s interim process and its extraction of data directly from ATS are troubling in part because ATS does not comply with the retention schedules of different source systems but rather tends to rely on its own 15-year retention period. By bypassing source systems and extracting information directly from ATS, the interim process creates a risk that outdated or incorrect information, or information that was deleted from its source system many years earlier, will be input into the Data Framework’s classified repository. Hence, information collected from an individual for one purpose — such as screening for the Visa Waiver Program — not only is retained longer than it should be, but is channeled into larger and larger analytical systems for unknown and unrelated purposes.

According to DHS senior leadership, the Data Framework also incorporates “tone” analysis. Purveyors of tone analysis software have made dubious claims about its ability to predict emotional states and aspects of people’s personality on the basis of social media data. These claims, however, have been thoroughly debunked by empirical studies. The unreliability of such software increases dramatically for non-English content, especially when people use slang or shorthand, which is often the case with social media interactions.
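One reason tone analysis is so brittle is visible even in the simplest form of the technique, lexicon-based scoring, where each word carries a fixed sentiment value. The sketch below is an invented toy, not any vendor’s product, but it shows the two characteristic failure modes: slang that inverts a word’s meaning, and shorthand that falls outside the lexicon entirely.

```python
# A minimal lexicon-based sentiment scorer, illustrating why tone analysis
# is brittle on social media text. The lexicon and examples are purely
# illustrative; no real tone-analysis product is reproduced here.

LEXICON = {"great": 1, "love": 1, "terrible": -1, "sick": -1, "hate": -1}

def tone(text):
    """Sum the lexicon scores of the words in `text`; unknown words score 0."""
    return sum(LEXICON.get(word, 0) for word in text.lower().split())

tone("that show was terrible")  # -1: a plausible result
tone("that show was sick")      # -1: wrong; here "sick" is slang for "great"
tone("ngl that slapped fr")     # 0: shorthand escapes the lexicon entirely
```

Commercial tools are more elaborate than this, but the underlying problem, that meaning depends on context the model does not have, is the same one the empirical studies cited above document.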

The Data Framework and its analytical results are used extensively throughout DHS, including by CBP, DHS’s Office of Intelligence and Analysis, TSA’s Office of Intelligence and Analysis, and the DHS Counterintelligence Mission Center. DHS uses the Data Framework’s classified data repository to disseminate information externally, including “bulk information sharing” with U.S. government partners.

C. Link and Network Analysis

A central element of CBP network analysis capabilities is the collection of information on a huge number of individuals in order to draw connections among people, organizations, and data. For this purpose, CBP agents use the CBP Intelligence Records System (CIRS) to gather information about a wide variety of individuals, including many who are not suspected of any criminal activity or seeking any type of immigration benefit, such as people who report suspicious activities; individuals appearing in U.S. visa, border, immigration, and naturalization benefit data who could be associates of people seeking visas or naturalization, including Americans; and individuals identified in public news reports. The system stores a broad range of information, including raw intelligence collected by CBP’s Office of Intelligence, data collected by CBP pursuant to its immigration and customs authorities (e.g., processing foreign nationals and cargo at U.S. ports of entry), commercial data, and information from public sources such as social media, news media outlets, and the internet. Notably, the system is exempt from a number of requirements of the Privacy Act that aim to ensure the accuracy of records. Accordingly, it appears that information in CIRS may be ingested, stored, and shared regardless of whether it is accurate, complete, relevant, or necessary for an investigation. There is no public guidance on quality controls for information eligible for inclusion in CIRS.

Huge swaths of data from CIRS, ATS, and other systems, including social media information, are then ingested by another database, the Analytical Framework for Intelligence (AFI). AFI provides a range of analytical tools that allow DHS to conduct network analysis, such as identifying links or “non-obvious relationships” between individuals or entities based on addresses, travel-related information, Social Security numbers, or other information, including social media data.
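The core of this kind of link analysis, connecting records that share an attribute value such as an address or phone number, can be sketched in a few lines. This is an illustrative reconstruction of the general technique, not AFI’s implementation; all field names and records are invented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical sketch of shared-attribute link analysis of the kind described
# above: two records are "linked" when they share a value in a designated
# field. Data and field names are illustrative.

def find_links(records, fields):
    """Yield (id_a, id_b, field, value) for every shared attribute value."""
    index = defaultdict(list)
    for rec in records:
        for field in fields:
            value = rec.get(field)
            if value:
                index[(field, value)].append(rec["id"])
    for (field, value), ids in index.items():
        for a, b in combinations(sorted(ids), 2):
            yield (a, b, field, value)

records = [
    {"id": "p1", "address": "12 Oak St", "phone": "555-0101"},
    {"id": "p2", "address": "12 Oak St", "phone": "555-0202"},
    {"id": "p3", "address": "9 Elm Ave", "phone": "555-0101"},
]

links = list(find_links(records, ["address", "phone"]))
# p1 and p2 share an address; p1 and p3 share a phone number
```

Note what the sketch makes plain: the technique links people through incidental facts (a shared building, a reused number), which is why the “non-obvious relationships” it surfaces may carry no investigative significance at all.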

It is possible that ATS risk assessments are among the unspecified data transferred from ATS to AFI. In addition, AFI users may upload internet sources and other public and commercial data, such as social media, on an ad hoc basis. The data need only be relevant, a fairly low standard, and the rules allow data of “unclear” accuracy to be uploaded. CBP agents use AFI to search and analyze databases from various sources, including Department of State and FBI databases and commercial data aggregators. Social media information in AFI can be used in ongoing projects and finished intelligence products, which can be disseminated broadly within DHS and to external partners.

The data mining firm Palantir — a longtime government contracting partner that helped facilitate one of the National Security Agency’s most sweeping surveillance programs — is intimately involved in AFI’s operation. Documents obtained by the Electronic Privacy Information Center (EPIC) through a Freedom of Information Act (FOIA) request refer to joint “AFI and Palantir data” and state that “data from AFI and Palantir can be shared with other stakeholder[s] and agencies” in compliance with AFI rules. “Palantir data” may refer to personal information about people that Palantir ingests from disparate sources — such as airline reservations, cell phone records, financial documents, and social media — and combines into a colorful graphic that purports to show software-generated linkages between crimes and people.

According to an investigation by Bloomberg News, law enforcement agencies may use this “digital dragnet” to identify people who are only very tangentially related to criminal activity: “People and objects pop up on the Palantir screen inside boxes connected to other boxes by radiating lines labeled with the relationship: ‘Colleague of,’ ‘Lives with,’ ‘Operator of [cell number],’ ‘Owner of [vehicle],’ ‘Sibling of,’ even ‘Lover of.’” The value of discovering such linkages in investigations, while much hyped, is open to debate. And as the volume of information grows, so does the risk of error. Given that the information in AFI is not required to be accurate, it is likely that the data from Palantir is similarly unverified. Palantir also supplies AFI’s analytical platform and works extensively with ICE, as discussed later.

Since 2015, CBP has awarded contracts worth about $3.2 million to Babel Street, an open-source and social media intelligence company, for software licenses and maintenance for the CBP unit that manages AFI, the Targeting and Analysis Systems Program Directorate. According to the company’s website, Babel Street technologies provide access to millions of data sources in more than 200 languages; a number of analytic capabilities, including sentiment analysis in 18 languages; and link analysis. Users can also export data to integrate with Palantir analytic software. CBP likely uses Babel Street’s web-based application, Babel X, which is a multilingual text-analytics platform that has access to more than 25 social media sites, including Facebook, Instagram, and Twitter. There are few details about how Babel Street software is used by CBP and what sorts of social media data it may provide for AFI.

Additionally, ATS and the DHS Data Framework both have their own link and “social network” analysis capabilities, though little is known about how those capabilities function.

In sum, while we know that CBP undertakes extensive analyses of social media information, from assessing risk level to predictive and trend analysis to “social network analysis,” we know almost nothing about the validity of these techniques or whether they are using discriminatory proxies. Partnerships with data mining companies such as Palantir raise additional concerns about the incorporation of large pools of unverified data into DHS systems, as well as privacy concerns about allowing a private company access to sensitive personal data. The increasing consolidation of data into CBP’s expansive intelligence-gathering databases, as well as into the DHS Data Framework, further compounds the issues created by DHS’s vague, overbroad, and opaque standards for collection of social media data and its tendency to recycle that data for unknown and potentially discriminatory ends.