I recently took a job in the NYC Mayor’s Office as an unpaid consultant. It’s an interesting time to be working for the Mayor, to be sure – everyone’s waiting to see what happens this week with the election, and all sorts of things are up in the air. Planning essentially stops at December 31st.

I’m working in a data group which deals with social service agency data. That means Child Services, Homeless Services, and the like: any agency where there is direct contact with lots of people and their data. The idea is for me to help them out with a project that, if successful, I might be able to take to another city as a product. I’m still working full-time at the same job.

Specifically, my goal is to figure out a way to use data to help the people involved – the homeless, for example – get connected to better services. As a side effect I think this should make the agency more efficient. Far too many data studies only care about efficiency – how to make do with fewer police or fewer ambulances – with no thought or care about how the people experiencing the services are affected. I want to start with the people, and hope for efficiency gains, which I believe will come.

One thing that has already amazed me about this job, which I’ve just started, is the conversations people have about the ethics of data privacy.

It is a well-known fact that, as you link more and more data about people together, you can predict their behavior better. So for example, you could theoretically link all the different agency data for a given person into a profile, including crime data, health data, education and the like.

This might help you profile that person, and that might help you offer them better services. But it also might not be what that person wants you to do, especially if you start adding social media information. There’s a tension between the best model and reasonable limits of privacy and decency, even when the model is intended to be used in a primarily helpful manner. It’s more obvious when you’re attempting something insidious like predictive policing, of course.

Now, it shouldn’t shock me to have such conversations, because after all we are talking about some of the most vulnerable populations here. But even so, it does.

In all my time as a predictive modeler, I’ve never been in that kind of conversation, about the malicious things people could do with such-and-such profile information, or with this or that model, unless I started it myself.

When you work as a quant in finance, the data you work with is utterly sanitized to the point where, although it eventually trickles down to humans, you are asked to think of it as generated by some kind of machine, which we call “the market.”

Similarly, when you work in ad tech or other internet modeling, you think of users as the targets of your predatory goals: click on this, user, or buy that, user! They are prey, and the more we know about them the better our aim will be. If we can buy their profiles from Acxiom, all the better for our purposes.

This is the opposite of all of that. It’s super interesting, and I’m glad I am being given this opportunity.