“Don’t post photos of yourself smoking on social-media sites. Do post photos of yourself running.” These two suggestions appear in a recent Wall Street Journal article about New York state’s new rules for how life insurance companies can use public data to help set premiums. Such tips — under the heading “what you pay for life insurance could depend on your next Instagram post” — seem ominous, portending a surveilled future where tweeting about rock climbing could hurt your wallet and services exist to curate photos that appeal to insurance companies.

In reality, it has long been the case that what you pay for life insurance could be affected by your next Instagram post. It is already legal, and increasingly common, for life insurers to use so-called “nontraditional” sources of public data — including credit scores, court documents, and motor vehicle records — to inform insurance underwriting decisions, though few use actual social media data.

New York is simply the first state to release guidelines around this practice, and its ruling is that nontraditional data is okay so long as a company doesn’t discriminate by factors like race, religion, and sexual orientation. Other states are likely to watch and follow suit. (The New York State Department of Financial Services, which released the guidance, declined to make a spokesperson available for comment.)

Life insurance companies want to update their methods and make their businesses more efficient. Consumers fear their public information being misused in discriminatory ways. The nature of the industry doesn’t do anything to alleviate those fears, either, because life insurance inherently differentiates between people; different factors cause people to pay different premiums. Government regulators want to balance the interests of both customers and businesses, but it’s not going to be simple.

At its simplest, life insurance is an attempt to financially protect others in the event of an unexpected death. You pay a premium, and if you die within a certain amount of time, the insurance company pays survivors. If not, the insurance company keeps that money. The process of setting premium rates can be slow and invasive. (It also varies by company, since underwriting methods are considered trade secrets.) Typically, a client will fill out an application that includes medical history and questions about smoking and other lifestyle habits. In some cases, they will also undergo an examination that can include an electrocardiogram and analysis of blood and urine samples. Underwriters with experience in actuarial science take all of this information to calculate levels of risk and set a rate.

Algorithms speed up this process — though there aren’t many cases where a decision is entirely automated — and can make it more precise. Sometimes, the algorithm will greenlight a person so they don’t have to go through the invasive medical tests. The convenience of immediately receiving a policy is appealing to those who don’t want to wait weeks for a doctor’s appointment, and that can lead to more life insurance policies being purchased. And while life insurance sales have traditionally been face-to-face interactions with agents, that mode is quickly falling out of favor, making algorithmic processes a better fit for online sales.

Nontraditional data comes into play in two different ways. First, bulk, de-identified data is used to train those algorithms so they learn, for example, that a credit score of 450 corresponds to a 20 percent higher risk of death. This data comes from the many vendors of consumer data that collect, build, and sell catalogs of this information. Then, when Jane Doe goes to buy life insurance, a separate program will search the web for her existing public records to feed to the algorithm.
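A rough sketch of that first step might look like the following, where bulk, de-identified records are used to learn a risk relationship of the kind described above. The records, the bucketing scheme, and the specific numbers are all invented for illustration; real underwriting models are proprietary and far more complex.

```python
# Illustrative only: learn a relative mortality risk from hypothetical
# de-identified vendor records of (credit_score, died_within_term) pairs.
from statistics import mean

records = [
    (450, 1), (460, 0), (470, 1), (455, 1),
    (720, 0), (710, 0), (705, 1), (730, 0),
]

def mortality_rate(rows):
    """Fraction of records in this bucket where the person died in term."""
    return mean(died for _, died in rows)

# Bucket by credit score (the 600 cutoff is an arbitrary illustration)
low  = [r for r in records if r[0] < 600]
high = [r for r in records if r[0] >= 600]

relative = mortality_rate(low) / mortality_rate(high)
print(f"low-score group mortality is {relative:.1f}x the high-score group")
```

A real training pipeline would fit a statistical model over many such features at once rather than comparing two buckets, but the principle — inferring risk multipliers from population-level data, then applying them to an individual applicant — is the same.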

However, using social media data specifically is rare, according to Aite Group senior life insurance analyst Samantha Chow. Out of 160 insurers investigated by New York state, only one used social media and other internet activities in underwriting, according to the Journal, although some vendors did pitch data based on such details as “condition or type of an applicant’s electronic devices” and “how the consumer appears in a photograph.”

And when social media is used, it’s usually to reduce fraud. Tools like Carpe Data use names, emails, and birthdays to look for information on the internet that might show whether someone lied on their application about smoking or drug use. The results won’t be used to decline an applicant, but they can be used to move someone into a riskier rating class with higher premiums, says Chow.

All this is okay, in theory, says New York state, but there is “strong potential to mask the forms of discrimination” that are prohibited by law. As a result, it’s the insurer’s job to make sure the process isn’t discriminatory, which is easier said than done.

It’s simple for an algorithm to not explicitly use racial factors like “Asian” or “black” or any of the protected classes the New York guidance mentions. But if the model includes whether someone streamed Crazy Rich Asians or Black Panther, “you have a proxy for race in your model,” says Madeleine Udell, a professor of operations research at Cornell University. Plus, when models get complicated, it can be hard to pinpoint which exact factor caused a certain outcome.

One way to test for discrimination is to look at the results. Can you predict the outcomes using a protected attribute like race? “If, instead of using the original data to make a life insurance decision, I can predict the premium just by using race, that’s a sign my model is too correlated to protected attributes,” says Udell. “If [a protected attribute like race] doesn’t help me to predict the outcomes, then maybe the model is not so discriminatory.”

That suggestion still doesn’t solve another problem, which isn’t technological at all. While the law might protect against discrimination on the basis of religion or national origin, it won’t protect people from any number of other types of discrimination that an algorithm might determine relevant. Because all insurance pricing is discriminatory, the important thing is to figure out when that actually matters, says Rick Swedloff, a law professor at Rutgers University and an expert in insurance and big data. In the context of life insurance, it’s clear that people who are older will pay more than people who are younger, but few care about that judgment. So is it okay to discriminate for smoking, which already happens? Is it okay to discriminate against people who visit Pinterest if we know they tend to die younger?

“I think saying that you have to prove something is ‘not discriminatory’ punts on the hard normative question of ‘what is discriminatory?’ and ‘what are the justified ways to use data to make judgments about people?’” says Andrew Selbst, an attorney and postdoctoral scholar at Data & Society Research Institute. “I don’t know if we as a society have the right answers to that yet.”

Though The Wall Street Journal’s tips may not be necessary just yet, technology does move quickly. Experts worry that if this social media surveillance world does come to pass, opaque algorithms and decontextualized data could lead to unfairly higher premiums or ding people who don’t know how to effectively signal health. “It may be that you’re healthy and work out a lot, but you don’t have the capacity or knowledge or resources to know how to represent yourself on social media in the right way,” says Karen Levy, a surveillance scholar and sociologist at Cornell University. The classic example of this is SAT prep classes, which are taken by more affluent students. The student taking the course isn’t necessarily smarter to begin with, but they know what to do and how to prepare to show that they’re smart.

Apart from the risk of higher premiums, being watched — or believing you’re being watched — changes people. For example, Wikipedia users modified their searches after the Edward Snowden revelations. People adjust their behavior under observation, and there is a real cost to always thinking about (and fearing) how actions in one area of our life will affect a seemingly unrelated area. “If we say that we’re going to judge you based on who you associate with, that necessarily impedes who you associate with or what you’re willing to reveal,” says Levy. “And that’s a core Constitutional right, to speak and to associate. If we start to infringe on that by creating fear, that’s a huge decision.”

Udell agrees. “If your concerns are that the information you’re consuming is going to be held against you by insurance companies and used to increase the prices they charge you, maybe that’s going to constrain the types of information you consume,” she says. “If you think they’re going to know you joined a mental health support group on Facebook, maybe you won’t join that mental health support group, and that would be very bad.”

In the absence of broad laws restricting the use of this public information, the usual solution is to call for transparency. From a regulatory perspective, it’s important for companies to tell consumers when they’re using new algorithms and new data sets, and how that might affect things, according to a spokesperson from the National Association of Insurance Commissioners.

Some, like the Massachusetts Mutual Life Insurance Company, are trying to take heed. MassMutual recently created a consumer tool that teaches customers a little about how different pieces of data influence life insurance risk. But because of trade secrets and intellectual property, organizations are never going to fully share the details of their underwriting process, according to Charlotte Tschider, a law professor at DePaul University. “They don’t tell you exactly what went into the calculations,” she says, “and I’m not sure that a detailed disclosure of how the algorithm works is useful.”

Most consumers won’t understand the technology, and there’s a limit to what transparency can achieve when it doesn’t come with power. Some bargains are what Tschider calls “adhesion contracts,” where one side has a lot more power than the other. We can’t negotiate with privacy policies (which nobody reads), and the most transparent policy in the world doesn’t help if we really need to use the service.

In New York, “everyone is doing their due diligence” to understand what it means to not be discriminatory, according to Diane Stuto, managing director for legislative and regulatory affairs at the Life Insurance Council of New York. It will be easier for some to comply than others, and the result may be that certain companies decide not to offer algorithmic underwriting in New York anymore. “We want to be able to offer these programs because we think they’re the future, so we’re grappling with details and trying to figure out what this means,” Stuto says.

One option could be to do an algorithmic impact assessment and run tests similar to the one that Udell described. Even private companies could be required to do these assessments, which means asking questions like: What kinds of data does a company use, and why? What is it testing with and without? “It’s not enough to share the code,” says Selbst. “They need to be able to show that they tested for bias, and what kinds of considerations went into it.”
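One concrete form the “with and without” question could take is an ablation check: re-score applicants with a suspect feature removed and measure how much the output shifts. The scoring function, the feature, and the applicants below are all invented for illustration; the point is only the shape of the test.

```python
# Illustrative ablation check for an impact assessment: how much does a
# suspect feature move the model's output? (Model and data are invented.)

def risk_score(applicant, use_web_history=True):
    """Toy underwriting score: higher means riskier (illustrative only)."""
    score = applicant["age"] * 0.5
    if applicant["smoker"]:
        score += 20
    if use_web_history:
        score += applicant["web_history_flag"] * 15  # suspect feature
    return score

applicants = [
    {"age": 30, "smoker": False, "web_history_flag": 1},
    {"age": 45, "smoker": True,  "web_history_flag": 0},
]

for a in applicants:
    with_f = risk_score(a, use_web_history=True)
    without_f = risk_score(a, use_web_history=False)
    print(f"with: {with_f:.1f}  without: {without_f:.1f}  "
          f"shift: {with_f - without_f:.1f}")
```

An assessment would then ask whether the shift the feature causes is justified by evidence, and whether it falls disproportionately on a protected group — the kind of documented reasoning Selbst says companies should have to show.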

Both Selbst and Swedloff agree that requiring companies to examine their practices is the first step to coming to terms with when it’s okay to charge some groups more. “The most important thing from impact assessments is understanding the rationales that companies go through and making sure they are actually thinking through and doing their homework the best they can,” Selbst continues. Part of the reason we don’t agree on when it is okay to discriminate and when it isn’t is that we don’t have full information about what’s going on. “We don’t understand what the decisions are that lead to these algorithms,” he adds. “Once the public understands that, we can have more reasoned debates.”