Princeton privacy experts are warning that advertising and analytics firms can secretly extract site usernames from browsers using hidden login fields and tie non-authenticated users visiting a site to their profiles or emails on that domain.

This type of abuse is possible because of a design flaw in the login managers included with all browsers. Login managers let a browser remember a user's username and password for a specific site and automatically insert them into login fields when the user visits that site again.

Experts say that web trackers can embed hidden login forms on sites where the tracking scripts are loaded. Because of the way login managers work, the browser will fill these fields with the user's login information, such as usernames and passwords.
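The mechanism can be illustrated with a toy model. The sketch below is not browser code: `ToyLoginManager`, `FormField`, and `third_party_script` are hypothetical names invented for this illustration, and the only assumption it encodes is the one described above, namely that saved credentials are matched on the site's origin and inserted regardless of whether the field is visible.

```python
from dataclasses import dataclass


@dataclass
class FormField:
    name: str          # e.g. "email" or "password"
    visible: bool      # False models a field a script has hidden with CSS
    value: str = ""


class ToyLoginManager:
    """Toy stand-in for a browser login manager (not any real browser's code)."""

    def __init__(self):
        self._saved = {}  # origin -> {field name: stored value}

    def save(self, origin, username, password):
        self._saved[origin] = {"email": username, "password": password}

    def autofill(self, origin, fields):
        # The modelled flaw: credentials are matched on the origin only,
        # so fields are filled whether or not the user can see them.
        creds = self._saved.get(origin, {})
        for f in fields:
            if f.name in creds:
                f.value = creds[f.name]


def third_party_script(manager, origin):
    """Models an embedded script: inject an invisible form, then read it back."""
    hidden_form = [FormField("email", visible=False),
                   FormField("password", visible=False)]
    # Modelling shortcut: in a real browser the login manager fills the
    # injected form on its own; here the call is made explicitly.
    manager.autofill(origin, hidden_form)
    return {f.name: f.value for f in hidden_form}


manager = ToyLoginManager()
manager.save("https://example.com", "alice@example.com", "hunter2")

harvested = third_party_script(manager, "https://example.com")
print(harvested["email"])  # alice@example.com
```

In this model the user never sees or touches the form, yet the script ends up holding whatever the login manager had stored for that origin.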

Old browser design flaw. New abuse.

The trick is an old one, known for more than a decade [1, 2, 3, 4, 5], but until now it's only been used by hackers trying to collect login information during XSS (cross-site scripting) attacks.

Princeton researchers say they recently found two web tracking services that utilize hidden login forms to collect login information.

Fortunately, neither of the two services collected passwords, only the user's username or email address, depending on what each domain uses for the login process.

The two services are Adthink (audienceinsights.net) and OnAudience (behavioralengine.com). Princeton researchers said they identified scripts from these two services collecting login information on 1,110 sites from the Alexa Top 1 Million list.

Stealing user login data improves ad tracking profiles

In this particular case, the two companies were extracting the username/email from the login field, creating a hash, and tying that hash to the site visitor's existing advertising profile.

The Princeton researchers explain why this matters: "Email addresses are unique and persistent, and thus the hash of an email address is an excellent tracking identifier. A user’s email address will almost never change: clearing cookies, using private browsing mode, or switching devices won’t prevent tracking. The hash of an email address can be used to connect the pieces of an online profile scattered across different browsers, devices, and mobile apps. It can also serve as a link between browsing history profiles before and after cookie clears. In a previous blog post on email tracking, we described in detail why a hashed email address is not an anonymous identifier."
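A short sketch shows why a hashed email address works as a persistent identifier and why it is not anonymous. The article does not say which hash or normalization the trackers used, so SHA-256 over the trimmed, lowercased address is an assumption here, and `email_id` and the sample records are hypothetical.

```python
import hashlib


def email_id(email: str) -> str:
    # Assumption: trim and lowercase, then SHA-256. The article only says "a hash".
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Two sightings of the same person: different devices, unrelated cookies.
laptop_sighting = {"cookie": "abc123", "email_hash": email_id("Alice@Example.com")}
phone_sighting = {"cookie": "zzz999", "email_hash": email_id(" alice@example.com ")}

# The cookies differ, but the hash is identical, so the profiles can be joined.
same_person = laptop_sighting["email_hash"] == phone_sighting["email_hash"]
print(same_person)  # True

# Anyone holding a list of addresses can hash them and look the identifier up.
known_addresses = ["bob@example.com", "alice@example.com"]
lookup = {email_id(a): a for a in known_addresses}
print(lookup.get(laptop_sighting["email_hash"]))  # alice@example.com
```

Because the hash is deterministic, clearing cookies changes nothing: the next time the address is captured, the same identifier comes back.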

Researchers from the Princeton Center for Information Technology Policy (CITP) also created a demo page that users can try with fake credentials to see whether their browser's login manager fills in the hidden field.

At the time of writing, all major browsers except Brave appear to be vulnerable to this type of attack, coughing up usernames and passwords to hidden login fields.

Only Chromium-based browsers delay the disclosure of the user's password until the user has interacted with the page via a click. This is not a very secure way of protecting users, as most of them end up clicking somewhere on a page.

GDPR problems in the EU

But the problems that arise from this type of secret user data harvesting aren't limited to an individual user's privacy alone.

According to Dr. Lukasz Olejnik, an independent cybersecurity and privacy researcher and an affiliate of Princeton’s CITP, web trackers engaging in such practices are complicating matters for the sites where their scripts are loaded.

Many site owners are probably unaware of this questionable tracking tactic, which means they are also unaware that they may be in clear violation of the EU's upcoming General Data Protection Regulation (GDPR).

Direct violation of #GDPR? Website publishers may be facing fines. Best news? They probably don’t even know about it. https://t.co/UxMf6FeMTS — Lukasz Olejnik (@lukOlejnik) December 28, 2017

The simplest way to prevent such attacks would be for browsers to autofill login fields only when the user interacts with the actual login fields. If the fields are hidden, the user won't interact with them, and the attack would become ineffective.
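The proposed mitigation can be sketched as an autofill policy that fills a field only after the user has interacted with it. This is a toy model, not how any browser implements it: `FormField`, its `user_focused` flag, and `autofill_on_interaction` are hypothetical, and a real browser would also have to tell genuine user gestures apart from focus events triggered by scripts.

```python
from dataclasses import dataclass


@dataclass
class FormField:
    name: str
    visible: bool
    user_focused: bool = False  # hypothetical flag: the user clicked or typed here
    value: str = ""


def autofill_on_interaction(saved, fields):
    """Fill only the fields the user has actually interacted with."""
    for f in fields:
        if f.user_focused and f.name in saved:
            f.value = saved[f.name]


saved = {"email": "alice@example.com", "password": "hunter2"}

# A hidden field injected by a script: the user cannot see it or click it.
hidden = [FormField("email", visible=False)]
# The site's real login field, which the user has clicked.
real = [FormField("email", visible=True, user_focused=True)]

autofill_on_interaction(saved, hidden)
autofill_on_interaction(saved, real)
print(repr(hidden[0].value), repr(real[0].value))  # '' 'alice@example.com'
```

Under this policy the legitimate login flow still works, while the invisible form stays empty because nothing the user does ever touches it.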

UPDATE [December 30, 2017]: Following the publishing of the Princeton research, an OnAudience spokesperson has reached out to members of the press with the following statement explaining the company's position:

"As a Big Data company, we do our best not only to collect sufficient amount of data about internet users but also to protect their privacy and security. As it is clearly visible in our scripts we are not gathering e-mail addresses or passwords. In fact we collect anonymous e-mail shortcuts generated by well-known and widely used hashing algorithm. This method is commonly used in modern marketing automation platforms and is supported by the leading ad technology providers. We used them for the sole purpose of e-mail retargeting using double opt-in mailing lists on behalf of our customers. In this case the script was gathering data for our legacy platform BehavioralEngine.com.

Our DMP OnAudience.com is a completely different technology and uses other methods to gather information. Moreover, there is no exchange of data between BehavioralEngine and OnAudience. All data gathered by our DMP is automatically anonymised and processed in real time by its machine learning algorithms to ensure the highest precision in ad targeting and other marketing activities carried out for our clients. Digital information available in our data warehouse is never combined with any data, that would allow crackers to identify people online. Since we started our activity there has never been any incident of that sort although we process over 9 billion anonymous profiles of Internet users from around the globe" - Piotr Prajsnar, CEO at Cloud Technologies