Podcast AI, Sounds Like a Takeover… Season 1, Episode 3

Host: Somer Simpson, Responsible Advertising Advocate
Guest: Patrick Hall, Principal Scientist at Bnh.ai
Guest: Sarah K. Luger, PhD, Senior Director of AI, NLP, & ML at Orange Silicon Valley

Episode Description

In this episode, your host, Somer Simpson, is joined by Dr. Sarah Luger, Senior Director of AI, NLP, and Machine Learning at Orange Silicon Valley, and Patrick Hall, Principal Scientist at Bnh.ai. They unpack the role of ethics in AI and technology, diving into questions like: How can we trust the decisions our AI models are making? What are the responsibility and role of explainable AI techniques? And how does bias affect your AI?


Transcript:

Somer Simpson

This is What the Adtech: Let’s Talk Responsible Advertising. Over the past few years, consumers have started holding marketers’ feet to the fire, forcing them to be more conscious about ethics in advertising, and intentional about the content they use, the teams behind the campaigns, and overall investments in media. I’m Somer Simpson, and I’ll be having thought-provoking, honest, and raw discussions with some of today’s top marketing minds about the future of ethics in advertising, and what it means for both marketers and consumers today.

And I’m joined today by Patrick Hall, who is the Principal Scientist at Bnh.ai, and Sarah Luger, Senior Director of AI, NLP, and ML, at Orange Silicon Valley.

Sarah Luger

Thanks again for having me. My name is Sarah Luger, and I work as an expert in AI at Orange Silicon Valley, the wholly owned, San Francisco-based subsidiary of the telco Orange, which is headquartered in France but operates across Europe and West Africa. I do low-resource language machine translation, which includes native African languages, and responsible AI, amongst other topics touching NLP, such as call center technology, customer support, and data analytics.

Somer Simpson

Patrick, you want to do a quick intro yourself?

Patrick Hall

I work at a law firm that I co-founded. I’m not a lawyer, and I should say that upfront, my expertise is more technical. I help lawyers work through difficult issues around the use of artificial intelligence in the industry and in government. In addition to that, I teach machine learning at the graduate level at George Washington University and get to teach some data ethics classes and some responsible machine learning classes and things like that. Just very happy to be here.

Somer Simpson

Today we’re talking about two little letters that are very, very packed with a lot of stuff: AI, artificial intelligence. So welcome to you both. Thanks for joining me today.

Patrick Hall

Great to be here, and great to be here with Sarah, who I did a panel with I guess about a year ago.

Sarah Luger

Great to be here. Thank you so much for the invitation. Even though a year has passed, this is still a resonating, important theme that I expect us to be speaking about for some time.

Patrick Hall

For years to come.

Sarah Luger

For years. I hope that we don’t view this as, “Oh, we talked about it; we can move on.” This is an ongoing engagement and communication. So thank you, again.

Somer Simpson

Absolutely. It’s very, very complex, constantly changing, and it’s something that we’ve got to keep up on. AI is a hot topic. I think if you look at Google trends, you see searches for it or machine learning or natural language processing have trended up over the past couple of years. But let’s start first by clarifying what AI is. Sarah, let’s start with you and then we’ll pass it over. Like, just kind of give us your TLDR on AI.

Sarah Luger

Artificial intelligence, broadly, is the use of machines to mimic human intelligence. For adtech, that would mean the use of our decisions online, in store, multimodal. So on our phones, other digital advertisement engagement, that then will predict our purchasing and ad serving behavior in the future. Broadly, we are taking the observations that we know as humans about each other and our choices, and we’re speeding them up; we’re making them more accurate, and we’re mimicking that intelligence with machines.

Patrick Hall

What has ended up being one important lesson in our work – and this is just a long-learned lesson in risk management – is that you need to define things. Your organization likely needs its own definition of AI and machine learning that you can work with, one that makes sense to you and is applied consistently internally. One way you could know you’ve picked a bad definition is if AI means the same thing as data mining, big data, and any kind of commercial endeavor involving data. I think we’d like to be a little more narrow and focused than that. In this pop sort of way, we tend to mix terms like AI, machine learning, data mining, and data science. If your definition of AI is so broad that you’re able to do that, then it’s likely not the best definition.

Sarah Luger

That’s a great point. So artificial intelligence versus natural intelligence, which is what we call human intelligence.

Somer Simpson

One of the things we hear a lot when people talk about artificial intelligence in terms of responsible media and responsible advertising is the term ‘bias.’ Talk to me a little bit about: when people say bias in machine learning, what is it that they’re talking about?

Patrick Hall

It’s been a great privilege to work with NIST, the National Institute of Standards and Technology, on some of their proto-standard-setting around bias and AI, which is a very large and difficult topic and highly impactful to people’s lives. According to this rather exhaustive study – though there have been other exhaustive studies, and people can have other definitions – we tend to talk about systemic biases, human biases, and technological or statistical biases. We taxonomize something like 40 to 60 different types of individual biases and roll those up into human biases, having to do with human cognition and behavioral psychology; systemic biases, so institutional or historical biases; and then statistical biases, so things around data sampling and data labeling. It’s a broad subject.

Sarah Luger

I think something really important in what Patrick said is human bias. With AI and the discussion of bias, we’re all talking about it a little bit more, because it’s a forcing function for us to say, “Hey, computers didn’t introduce bias – we have bias in how we make decisions.” But the systemic part – how we’re building systems, what we’re inputting into those systems, whose voices are in the room, how we’re considering a problem to be contextualized and then solved – is very important. As is looking, as Patrick said, at statistical sampling. Are we mathematically, algorithmically, making the most correct decisions to protect protected groups?

Somer Simpson

That’s actually a good segue. Let’s apply this, so people have a good example and understand. One of the most common things that we do in advertising is we define the audience that we’re trying to reach. There’s lots of different ways to do that. I can buy data segments from third-party vendors, where they’ve predefined these things. I can look at what people are reading, contextual, lots of different things. How does bias appear when people are defining the audiences they’re trying to reach or can it?

Patrick Hall

I’ll jump in here first. It’s a very difficult question to answer, but I don’t think it’s helpful for your listeners, or really almost anyone, to listen to experts dance around a difficult topic. So just to be as direct as possible, I’d say it very often arises from training data in machine learning. That would either mean that your data is not representative of the population it’s going to be used on, which is pretty typical, right? Just because of all these systemic biases, we oftentimes end up getting more information from men, and more information from white men. Then we go and use the system that’s been trained on that data on people who aren’t white men, and bias can arise. Another way this happens is through otherwise valid data. A credit score is generally considered a valid input into a machine learning system because it will predict what it’s supposed to predict: a credit score helps us predict whether someone will pay their bills. Unfortunately, it’s correlated with race – in the case of credit score, quite dramatically. So when you bring credit score into a model, you’re bringing racial information and systemic bias into your model. The thing to do if you’ve used something like credit score, or some other proxy for race, gender, or other demographic information, is to then test the output to help decide whether it is encoding bias. I would be very remiss if I didn’t add the caveat, before I turn it over to Sarah, that bias comes from all kinds of places that aren’t data. Bias comes from human decision-making, bias comes from a lack of governance, bias comes from a lack of talking to your customers. It’s not just about data and algorithms. But for listeners trying to get the best bang for their buck here: very often it comes from training data.
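[Editor’s note: Patrick’s advice to “test the output” can be made concrete with a simple disparity check. Below is a minimal sketch of one common measure, the adverse impact ratio; the outcomes, group labels, and numbers are invented for illustration and are not from any system discussed in the episode.]

```python
# Illustrative only: the outcomes and groups below are invented for this
# sketch; they are not from any real system discussed in the episode.

def adverse_impact_ratio(outcomes, groups, protected, reference):
    """Ratio of the favorable-outcome rate for a protected group to the
    rate for a reference group. Results below ~0.8 are commonly treated
    as a warning sign (the "four-fifths rule")."""
    def rate(g):
        selected = [o for o, grp in zip(outcomes, groups) if grp == g]
        return sum(selected) / len(selected)
    return rate(protected) / rate(reference)

# 1 = received the favorable outcome (e.g., shown the high-value offer)
outcomes = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
groups   = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

air = adverse_impact_ratio(outcomes, groups, protected="b", reference="a")
print(round(air, 2))  # 0.2 / 0.6 ≈ 0.33 – far below 0.8, worth investigating
```

The same check applies whether the “outcome” is an ad impression, an offer, or a score above a threshold; the point is simply to compare favorable-outcome rates across groups after the model runs.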

Sarah Luger

I agree, Patrick. I think, broadly, looking at advertising technology, we have to think about what the goals are. As technologists, we want to use the best technology possible to provide a great service to our customers. Cool, intuitive, felicitous – something that makes you happy and gives you a sense of being catered to is a great customer experience. Now, there is a lot of data out there. As Patrick said, data is not the only place where bias hangs out. But it hangs out quite insidiously, because historically data has been very expensive. We’ve collected it from a lot of different places, including information such as zip codes and gender; and, as he said, our credit scores are laden with correlated data that isn’t as fair as it should be. What happens when we buy this data and bring it into our systems, where we use it for segmenting and gathering more context on our users, is that it carries with it the biases of the people who created it. So as consumers, and as folks who care about creating a great consumer experience, I want to make sure that I am being catered to, but in a way that doesn’t violate my or other folks’ data privacy. So how do we stop that? Somer, you raised an actionable, really great point. We can talk about this, but how is it actionable? Well, we have to look at the data ecosystem; we have to understand what goes into our system, what comes out of our system, and that “garbage in, garbage out” nuance in our entire pipeline. And I think that association with trust, and transparency around data, is something that is going to be very aligned with brand trust in the future. It isn’t just us who are using it right now – it’s the folks we are buying from and the folks we are selling to. This is going to become an ecosystem of trust. I’ll move on so other folks can jump in, because I believe this is a key point that we all agree on.

Somer Simpson

I want to dig in a little bit more on something that you said, Sarah – the privacy piece and transparency. When you talk about economic models, you can’t have a market that is fair unless you have transparency. Without transparency, whoever has the most of something is going to rule the roost when it comes to that market. We see it all the time in the stock market. That’s why insider trading is illegal. Talk to me a little bit more about what you said about privacy. Because with things like GDPR, where there’s more restrictions and protections on consumer data privacy, that makes the job that we do of getting good data, unbiased data in building unbiased models so that we can reach people with ads, a lot harder.

Sarah Luger

It makes it a lot harder, but as someone who works for a European-based company, we take GDPR very seriously. I think it makes us more rigorous. I think it produces better products and makes a lot of very smart software developers and very engaged product leaders understand more fully the breadth of how their product touches data, and the aspects of possible product weakness. I think it’s hard, but I think it’s important. You also raised the stock market. These are areas that have a lot of governmental oversight because there are challenges with information, access, and monopolies. Fairness matters, not just in adtech but in a lot of places. We’re in the cutting-edge, early-adoption, state-of-the-art – I’ll use all the buzzwords – times. So the innovation is there; we have a lot of great ideas; but the responsible checkpoints are being filtered in more slowly. This is just how technology grows. We didn’t have seatbelts and stoplights until decades after we’d invented the combustion engine and automobiles; these things come. But I think it’s important to note that GDPR has pushed us to make better products.

Patrick Hall

I’d love to chime in on a couple of really good points there. Let’s talk about the intersection of privacy and fairness first. Just because you haven’t thought about it doesn’t mean you don’t have to obey both very complex data privacy obligations and very complex non-discrimination obligations in the US. And this is quite difficult. No AI vendor out there saying “our AI is great” adds, “and think about all the legal bills and difficulty you’re going to have complying with both non-discrimination and data privacy laws” – which, by the way, in the US vary by vertical and by state. But that’s a real issue for people. And so, if you’re smart, like Sarah pointed out – I think there’s no pleasant future in which AI is less regulated. If AI is unregulated, then we’re going to live in a very bad world, in my opinion. AI is going to get more regulated, and as it does, I think you really want to be on the side of that fight where the regulation, like Sarah said, is driving you to make better products. And so, if you’re really smart right now in the market, you’re at least starting to think, “Wow, regulators are starting to pay attention to fairness; wow, regulators are starting to pay attention to privacy.” It can be complex to find the solution to both of those at the same time, to say the least. So if you’re smart, I think you’re going to start thinking about it. And if you’re smart, it’s going to make your products better.

Sarah Luger

I agree. Privacy is something that, as individuals, we are coming to terms with in the post-Facebook social media era. We’ve enjoyed a lot of value from trading our privacy. But there’s a big difference between chatting with a high school friend online or starting your small business, and being chosen or not chosen for a mortgage based on the information and decisions that are out there. There are cases where your friends online – the creditworthiness of your high school buddies – affect your creditworthiness, regardless of how close you live to them, just because they’re in your network. Privacy is also very culturally divergent. Maybe we should define what this means for this group.

Somer Simpson

How do you define it? I have my definition: there’s a line, and maybe it’s a personal line. For me, the line is that I’m willing to allow myself to be tracked – non-personalized advertising is a horrifying experience – and in exchange I get free access to content. Where I’m not okay is people selling my phone number and my email address so that I end up getting spammed all the time. But that’s my line. Where do you draw the line?

Patrick Hall

I think it’s going to take too long for me to define that. For the purposes of the podcast, I’d say your definition is great, and it reflects a level of education and access that, quite simply, a lot of people won’t have. All I will say to that – again, trying to make comments that are useful to the audience here – is that bias in your system and unfairness in your system can arise from things like the digital divide. Not everyone sits in these offices with fast laptops and fast internet access and all that kind of stuff. There is a difference in the way that different populations are able to access your products, and that alone can drive bias. Hearing a pretty educated definition of what privacy means in this day and age, coming from Somer, just made me think about that. It was a point I wanted to bring up anyway. So I’m going to punt on defining privacy for myself and make this other point that I think is also important.

Sarah Luger

Sounds good. I’ll jump in. I think privacy for me is an understanding of the value of my data and a personal arbitrage with external systems about what it can get me.

Patrick Hall

I like that, too.

Sarah Luger

I think it’s very different for different folks in different environments. But I also know that if I came from a community whose tastes and choices an ad company was very curious about, then my data would be more valuable than if it were more readily available. And I think that if we can educate our consumers to understand that their tastes, opinions, and choices matter, they might gain a little more confidence in terms of what they would be willing to trade their insights for.

Somer Simpson

That makes a lot of sense. It kind of interestingly leads into the next question a little bit. If there were an upvote or downvote on every single ad that you saw, or you answered questions to deliver the ads to you – to me, that feels a little bit like your newsfeed in social. Let’s say that we stopped all activity, from now moving forward, on further regulating AI and further regulating privacy. Let’s play that out logically. What kind of world are we living in in 10 years?

Patrick Hall

In the US, we’re living under an extremely high degree of commercial surveillance. We’ve already talked about how fairness and privacy are things that could be different in different cultures and different for different people, but I think it is important to point out that, internationally, there are reports of authoritarian regimes already using facial recognition and other types of AI explicitly for surveillance, targeting, and profiling in ways that I think most Americans would find pretty horribly objectionable. I don’t think we’re headed in that direction, as long as we stay a democracy – fingers crossed. But I think without regulation, we’re headed into a pretty intolerable state of commercial surveillance, which some people may feel we’re already in.

Sarah Luger

I think your points are really crucial to understanding the threat. With AI, there’s the promise of good, of opportunity, of innovative change. But there’s also the peril of a dystopian future where the robots are in control. I’m being a little aggressive with my phrasing, but I think it matters to understand our data’s value, and to understand that there are some things about us that we want to have reflected in our choices – and in a capitalist society, that means the things we buy, the way we decorate our homes, the newspapers we read; that is very culturally nuanced for other communities. But if we push for more oversight and reflection on the individual’s role, I think it would actually help innovation, because right now I am not 100% happy with the ads I’m served. I think the biggest challenge, as a human, is creativity – innovation around what is the item I want to see next. Amazon made some progress on that a decade ago, but I still get served ads for couches even though I’ve bought a couch. I’m in a one-couch home, maybe two, and the lifespan of couches means I probably won’t be interested for a few more years. We know this as humans. This is something our systems could be forced to learn.

Somer Simpson

I’m interested in how you approach that. I remember years ago, on Amazon I would buy a baby gift for somebody who was having a shower, and for the next week, I got nothing but kids’ toys and baby clothes in my recommendations feed. It’s gotten better since then; Amazon’s made some investment in it. What other kinds of investments do people have to put in so that what you just described is true?

Patrick Hall

I’m gonna jump in here and say human intelligence. I hope we’re at peak “pour your bad data into a bad blackbox algorithm and get scores out.” Nothing against my professors or my schools, but that’s basically how I was taught to do data mining. Now I teach my students not to do that. I hope we’re getting to the end of this “pour bad data into a bad blackbox algorithm and blindly trust the scores that come out.” And, by the way, if you’re doing that, or you think the people at your company or your organization are doing that, then the scores probably aren’t very good, and you’re probably not getting as much bang for your buck as you could. So I think we want to step back. I think these problems now are fixed by human intelligence: better design; better constraints on models; post-processing of modeling results with business rules, or, if you want to sound fancy, model assertions; and governance – governance structures, governance of the people who make the models. If someone doesn’t do a good job making a model, they don’t get a big bonus that year, right? We’re not rating people on “how many models can they deploy?” We’re rating people on “did the model work in the real world or not?” I think it’s back to human intelligence for a while. AI has seen some great leaps forward in the last few years, and honestly I think it’s plateauing; it’s time to get back to human intelligence, is my take there.

Sarah Luger

I think one way to incorporate Patrick’s ideas is to think about evaluation – human in the loop. This is how humans fit into these pipelines. How do we ascertain whether or not this was a good ad? With Somer’s idea of the feedback loop – thumbs up, or “serve me more ads like this” – we can create digital proxies for the things we like. We’re still in this reductionist era where we are complicated humans, with tastes and opinions, whom we’re trying to fit into boxes that put the best options in front of us. So, holistically, how do we include more human opinion and more human smarts?

Somer Simpson

Where does something like that go wrong, though?

Patrick Hall

There is the risk of what is called confirmation bias, but I would actually say the risk of confirmation bias is worse when we’re using blackbox models and bad data whose usefulness we would have a hard time evaluating anyway. People who are much smarter, richer, and more famous than me would tell me I was crazy, but today I see more risk on the side of less human involvement than more human involvement. But I think we forget the reason we all went down this path anyway. Why did I spend so long in school? Why did I choose this for a career? The promise is better decisions, right? The promise is more objective decisions. We’re just a long way from that now. But just because we can’t do that now doesn’t mean we can’t do it in the future. Again, trying to bring this home for the listeners, I think what this means is: it’s okay to apply human common sense and your business domain knowledge to what your data scientists are doing and the AI tools you’re buying from vendors. It’s okay to ask competence questions; it’s okay to post-process results to reflect your domain knowledge; it’s okay not to deploy a model if you don’t think it makes any sense. So I’ll leave my comments there. It’s a deep subject.

Sarah Luger

Well, Patrick, when you mentioned the promise, and the deep mathematical and predictive background to a lot of these decisions, you really nailed it by saying this is working – to a degree. We get much faster results; computers can do this extraordinarily quickly, so we are also getting some bad results very quickly. We just need to go through our systems with humans to ask, “Is this expected? Is this an outlier?” But our human-in-the-loop crowds, or our in-house teams, also need to be diverse; they need to reflect broader audiences than just a couple of software engineers or a couple of folks who are being served this data in an outsourced manner. We’re using humans in the loop in a lot of the AI systems Patrick raised – some of which could be used in a more nefarious way than others. But it’s really important that we treat every aspect of these pipelines, human and computer, with the appropriate reflection to de-bias. Who are these people? Why are they involved? Do they reflect our customers? Can they help us make better decisions?

Patrick Hall

Talk to your customers, if that’s at all possible. A major leading bias mitigant from the NIST report was what’s called human-centered design, which boils down to talking to people: talking to a diverse set of stakeholders and users when you design the system, keeping their input in mind when you implement the system, and going back to them routinely to make sure the system is continuing to work the way it was supposed to and that they’re not experiencing any bias or other issues. Again, just very common-sense ideas like human-centered design and governance – don’t give big bonuses to people who deploy bad models. In banking, this is how it’s done. Not to put banks on a pedestal, but they’re good at model governance. Generally speaking, your average big bank is much better at governing its machine learning models than any other type of company. They don’t give bonuses to, and don’t promote, people who turn out crap Python code.

Somer Simpson

Yeah, it’s about outcomes over output. Right? It’s not how many models do you deliver? It’s the quality of it.

Patrick Hall

Yes, human outcomes. Listeners, if you want to do AI right, it’s about human outcomes, not tech.

Sarah Luger

Exactly. Is it improving your product? Is it making it faster, better? Can you measure it? It’s not magic. Avoid magic.

Patrick Hall

And if you think it’s magic, then something’s wrong. If somebody’s telling you it’s magic, they’re wrong. There’s no magic here.

Somer Simpson

So those are good best practices for tech companies. What are some best practices or advice for people who are customers of the tech who are making these purchasing decisions?

Sarah Luger

I think it’s about walking into discussions with a little bit of research and education. Figure out, holistically – I know I keep using this term – the big picture of what your customer is seeking and what their success criteria are. Then look at the ways the person selling you the services is couching it. If you push back and say, “Can you tell me more about how you’re using our data, our customers’ data, or other people’s data?” they should be able to give you concrete sentences – real facts about your or someone else’s future data. But you should also be able to measure the results. What is better? That means walking in the door with a really good sense of what a successful product is. And number one – Bloomberg says 60% of companies that say they have AI under the hood do not. But the question isn’t what the measure of AI is; it’s not a salve that comes out of a tube. It’s actually about your customers’ experience and better products. If you can measure that, then you can figure out how much this product that says it has AI is worth – what the value is to you.

Patrick Hall

Just to add on to some of Sarah’s comments: AI and machine learning have this strange characteristic where, in certain parts of the economy, they’re highly developed – they’ve been used for decades – but in most of the economy, they’re very undeveloped. It’s a very immature, very frothy market. So I’m going to focus on that part of the market, because that’s likely where most listeners would be interacting with it, assuming you’re not using AI the way DARPA or some incredibly rich hedge fund would be using AI. There are snake oil actors in this immature market, right? That’s just a point of fact. So go into an interaction with a potential vendor knowing that there are snake oil vendors out there, just because it’s an immature, frothy new market. Things I look for when interacting with AI vendors are documentation: can they give you a professionally formatted little booklet that says how their code works? They get bonus points if that booklet has a bibliography that cites textbooks and papers. Another thing to look for is release schedules. Now, this is getting a little trickier as companies move into continuous integration and continuous deployment, or continuous release schedules. But most AI companies aren’t actually there yet, and that should be a little bit of an eye-opening fact. So if a company has a really erratic release schedule, that’s also typically a bad sign. The company should be able to tell you, “Hey, a new big release is coming in four months; we can get your feedback into that release in four months; here’s the nice professional documentation for the current product.” Those are just some basic rules of the road that I use when interacting with AI vendors.

Sarah Luger

Patrick, I think that really resonates. As someone who works at a European company, we care a lot about how my colleagues – the 140,000 people I work with – feel about the products we build. This isn’t just about our customers, but our internal customers. We spend a great deal of time working on trust and on thought leadership, and we purchase products from companies that similarly focus on trust and thought leadership. The amount of money that can be made from a great predictive analytics product that uses artificial intelligence wisely is tremendous. So getting there and earning the trust can happen in a bunch of different ways: white papers; having folks on staff who know what they’re doing and are very happy to get on a call with you and do a technical deep dive – that’s key; technical sales that are transparently accessible, speaking in a language you understand. You are the consumer: it doesn’t matter if they say they know more than you; you are the one making the purchasing decision, and it should be in language you understand.

Somer Simpson

For the folks who maybe didn’t know, didn’t do their due diligence, and they end up leveraging technology from one of those snake oil vendors, what are the risks? What are the risks to them, to their company?

Patrick Hall

Let’s focus on the risk to the consumer first; the risk to the consumer is going to cause the risk to the company. I’m sure that’s actually what you meant – I don’t want to put words in your mouth or anything like that. But it is really easy in these discussions, when we all work for the company, to think about the company. We should really be focused on the consumer of the technology first. It could cause some kind of bad outcome: an offensive ad, or just wrong ads – that’s probably the most likely thing, just being wrong. The second most likely thing, I nearly guarantee you, is being wrong in a way that’s biased and offensive. So then customers are offended; maybe they talk to reporters; maybe it goes on social media. So I would say that primarily the risks are reputational: varying levels of harm for the consumer, leading to reputational risk for the company. I’ll bring up two sort of extreme examples that I do think companies should be aware of. On the extreme side of the risk for the company, there are serious risks. If you were somehow to violate some kind of federal or state non-discrimination statute, you could face regulatory risk. That seems a little distant in this space, though not entirely impossible. Another thing that doesn’t seem entirely impossible is that you could somehow trigger what’s known as UDAAP: unfair, deceptive, or abusive acts or practices. The Federal Trade Commission has shut down three AI systems over the past few years for being unfair or deceptive. So if you’re doing something really flagrant, you can get into legal trouble. That’s something to be aware of.

Sarah Luger

I think it’s also important to remember that your customers are your biggest advocates. When you have something like a data breach, or your system has allowed their private information to be turned into spam calls, or exposed their data in a way that not only makes them turn away from your offering but makes them actively vocal about what a terrible experience it was – that’s a bridge that’s hard to rebuild.

Somer Simpson

For folks that want to address bias and do bias testing and that sort of thing, any response to people who say things like, “I can’t deal with bias in my AI because of consumer data privacy?”

Patrick Hall

Just recently there was a letter from some Massachusetts senators on this issue to a very specific set of companies, at least claiming that data privacy obligations do not exempt you from your non-discrimination obligations. Again, I’m not a lawyer, and I don’t ever want to pretend to be; these things are complicated. The best companies are going to start figuring out how to do both. It is going to be difficult: data privacy does make it harder to collect the data you need to test these algorithms for bias – that’s just a fact. But there are ways to infer the data; there are ways to be careful and obey both sets of laws. And the larger the company, the greater the expectation that it will be able to do both.

Sarah Luger

I think if someone says they can’t deal with the bias, it’s too complicated – that’s a red flag that says perhaps this is not the company you want to partner with. That flag can come up in a bunch of different ways. I like how Patrick couched some of the decision-making around being an advocate for your customers. But when you look at the market that’s grown up around AI data, ad technology, and consumer advocacy in this realm, there are options. When someone says that, please consider another option.

Patrick Hall

What I would add here, just to give a tangible takeaway: a lot of my hard lessons were learned in consumer finance. A hard lesson learned there is that they have very strict data collection requirements. They’re oftentimes not allowed to collect the data they need to do bias testing, and yet they do it. There’s a process known as BISG (Bayesian Improved Surname Geocoding): it’s not perfect; it’s very far from perfect. But it’s essentially an example of a technical solution that consumer finance organizations came up with to say, “Data privacy prevents us from collecting the data we need to test, yet we still have to follow the testing stipulations set out in acts of Congress and the accompanying regulations.” And so they found a way around it, right? I think as we all become more serious about AI, we’re going to find ourselves having to do things like that. There is a cost to using this technology; people would have you believe it’s free, cheap, and easy. None of that is true. There’s a cost to using this technology, just like everything else. There’s no magic. If it starts to feel like magic, things have gone wrong.
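[Editor’s note: for readers curious what BISG looks like mechanically, here is a minimal sketch. BISG combines a race/ethnicity distribution conditioned on surname with one conditioned on geography via Bayes’ rule; real implementations use U.S. Census surname tables and block-group demographics. Every number and group name below is invented purely for illustration.]

```python
# Toy BISG (Bayesian Improved Surname Geocoding) sketch.
# Assumption: surname and geography are independent given group membership,
# so P(group | surname, geo) is proportional to
# P(group | surname) * P(group | geo) / P(group).

def bisg_posterior(p_given_surname, p_given_geo, p_overall):
    """Combine surname- and geography-based distributions into one posterior."""
    unnormalized = {
        g: p_given_surname[g] * p_given_geo[g] / p_overall[g]
        for g in p_overall
    }
    total = sum(unnormalized.values())
    return {g: p / total for g, p in unnormalized.items()}

# Hypothetical two-group example (all probabilities made up):
posterior = bisg_posterior(
    p_given_surname={"group_a": 0.7, "group_b": 0.3},  # from a surname table
    p_given_geo={"group_a": 0.2, "group_b": 0.8},      # from a census block group
    p_overall={"group_a": 0.5, "group_b": 0.5},        # population prior
)
```

The resulting posterior is then used as a probabilistic stand-in for the protected attribute when running bias tests, which is exactly the “infer the data” workaround Patrick describes.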

Sarah Luger

When we talk about costs, one of the arguments used by thought leaders in responsible AI is that if the product is free, you are the product. Perhaps that seems less dire when it’s just blue couch versus red couch, but there are things about us being tracked that are more integral to our identity: facial images, fingerprints, our voice. Our voice is very identifiable. Many of us use it in a fun way with smart devices. We also use it in ways that Patrick has alluded to: in finance, my voice is my password. And there are other products that should make our lives easier but could also be used in inauspicious ways, like a portal with a doctor.

Patrick Hall

I would just say again, trying to help the audience members think through what could be higher risk: anything having to do with biometric data is higher risk. You can do great things with biometrics; you can also do horrible, terrifying, scary, unfair things with biometrics. So given the two sides of that coin, I’d say almost anything involving biometric data is higher risk. If people are using biometric data for anything, that’s a yellow flag to start proceeding with caution and thinking about the risk.

Somer Simpson

Yup, cool.

Sarah Luger

Well put.

Somer Simpson

So let’s leave this on a positive note. I’m gonna put you both on the spot here, no prep on this question. Over the next 12 months, what are you hearing right now in the world of AI that you’re really excited about? It could be anything, doesn’t have to be ad tech. What is something that makes you say, “Oh my God, that’s really cool – this is going to have this kind of impact”?

Patrick Hall

I’ll give two things; they’re related. For one, I think we’re just going to see more and more regulatory attention here. The EU may pass a GDPR-like act, the EU AI Act, that could have far-reaching effects in the US before we ever regulate here. Whether there is actual congressional movement or just more regulatory attention, I’d expect more on the regulatory-attention side and less on the congressional side. What’s cool is that researchers, practitioners, and all kinds of smart people are making tools and frameworks that will help with this. I’m a big proponent of interpretable models, causal modeling, and using human constraints in models. And as I look out into the interwebs, I just see more and more tools for injecting human knowledge into machine learning models directly and modifying machine learning models with human knowledge. So I think we’re going to see an uptick in regulatory attention, and I’m happy to say I’m seeing an uptick in the tools that will help us deal with it as well.
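[Editor’s note: one concrete example of the human constraints Patrick mentions is monotonicity, e.g. a domain expert asserting that a risk score must never decrease as delinquencies increase. The pool-adjacent-violators algorithm (PAVA), the core of isotonic regression, enforces exactly such a constraint. This tiny implementation and its data are illustrative only, not any specific product or tool from the conversation.]

```python
# Pool-adjacent-violators (PAVA): fit the closest non-decreasing sequence
# to a list of observed values. This hard-codes the human constraint
# "this relationship must be monotonic" into the fitted output.

def pava(values):
    # Each block holds [sum, count]; merge backwards whenever the
    # non-decreasing constraint between adjacent block means is violated.
    blocks = []
    for v in values:
        blocks.append([float(v), 1])
        while len(blocks) > 1 and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]:
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    fitted = []
    for s, c in blocks:
        fitted.extend([s / c] * c)  # every point in a block gets the block mean
    return fitted

# A noisy score that an expert says must never decrease:
smoothed = pava([1.0, 3.0, 2.0, 4.0])
```

Gradient-boosting libraries expose the same idea as monotone-constraint options on individual features, so the constraint is applied during training rather than after.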

Sarah Luger

I really liked your point about bringing humans into the model building. I think there’s some really interesting work…

Patrick Hall

It’s all about constraints.

Sarah Luger

…that I’ve been doing with some colleagues at RIT on including human annotations farther along in the machine learning pipeline. With the advances in speed, accuracy, and memory in model building, we don’t have to be so reductionist so early in the annotation-to-prediction pipeline. And so this is fantastic: if we view systems as they are, as reductionist representations of our multifaceted, incredibly complex world, we’re making real strides toward including voices farther along in that pipeline. In the same vein, there are really transcendent developments using NLP technology, like word embeddings, in all different parts of search. Google recently showed its image-and-text search capabilities. How do we search? We’ve all had to change our queries into ones that smart devices understand, the same way we’ve had to change our natural queries into ones that fit in a Google text box. OpenAI is doing some very interesting things with their DALL-E system – I believe it’s called DALL-E 2; I’m not getting a percentage from them – but I do think this is really interesting: image plus text. We use gesture, too, and that’s something we’re all becoming a lot more cognizant of in a remote-communication world. I wonder if I’m gesturing more or less than I used to, because the feedback is just different. Finally, I’d like to echo Patrick’s comments about oversight and government regulation. I do think there’s incredible potential, but we need deeper understanding in our government and in our communities about how AI can be used and effectively implemented in a safe manner. These are tools for us; they have to improve our lives. If they are fomenting concern and fear, then we’re doing something wrong. So trust, and using community involvement and education to get to that trust, is really key. I’m upbeat as well. Thank you for encouraging us to end on an upbeat note. As Patrick and I know, after the last two years, very few of us could get through day to day if we did not see the glass as more than half full.
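[Editor’s note: to make “word embeddings in search” slightly more concrete, embedding-based search represents queries and documents as vectors and ranks documents by cosine similarity to the query. The 2-D vectors and document names below are made-up toys; real systems use learned embeddings with hundreds of dimensions.]

```python
import math

# Hypothetical document "embeddings" (invented 2-D vectors for illustration).
DOC_VECTORS = {
    "red couch ad": [0.9, 0.1],
    "blue couch ad": [0.8, 0.2],
    "stock ticker widget": [0.1, 0.9],
}

def cosine(a, b):
    """Cosine similarity: angle-based closeness of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(query_vector, docs):
    # Rank documents by similarity to the query embedding, best match first.
    return sorted(docs, key=lambda d: cosine(query_vector, docs[d]), reverse=True)

# A query whose (invented) embedding points toward the couch ads:
ranked = search([0.85, 0.15], DOC_VECTORS)
```

The same ranking step works whether the vectors come from text, images, or both, which is what makes the multimodal search Sarah mentions possible.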

Patrick Hall

Just really quickly, I know we need to wrap up here, but I think it’s important to say: the draft NIST AI Risk Management Framework, which will no doubt be highly influential on the use of AI in coming decades, much like the NIST Privacy Framework and the NIST Cybersecurity Framework, is open for public comment now, and you should comment on it. If these matters are important to you and you’re an American, comment on it; they want comments. So, again, trying to end on a positive note: if there are things that are important to you here that you want to see reflected in policy, now’s your chance. Jump in.

Sarah Luger

Exactly. If you are working at a company that wants to gain trust, think about joining some of these consortiums. LF AI, the Linux Foundation’s AI foundation, is one that Orange has been a member of. There are a lot of accessible educational resources that are also advocacy; it goes hand in hand, and it’s a great way not to outsource your knowledge but to gain knowledge through community involvement.

Somer Simpson

Excellent. So get educated and get involved, right? Have a voice.

Patrick Hall

Yes.

Somer Simpson

Just like humans have to have a voice in AI.

Sarah Luger

Don’t say the computer’s got it, no. Any time you can insert the word ‘magic,’ no. These are tools.

Somer Simpson

Excellent. Thank you both, Patrick and Sarah. This has been a great conversation. I’ve learned a lot, and I hope our audience has as well.

Patrick Hall

Oh, it’s been my pleasure. Thank you so much, Somer. And I think it’s important to note that we’re actually just starting to talk about these things. This is just the beginning, so thanks for having me. Hopefully I said some useful things for your listeners.

Sarah Luger

I think that I’m not putting words in Patrick’s mouth by saying if anyone listening today has follow-up questions or ideas that they’d like to reach out to us about, please do. We’re accessible. The information will be in the podcast header, but it’s really important that this doesn’t become someone in-the-know versus not-in-the-know. This is really about a fair marketplace: us all rising to the occasion and producing great products. So thank you again for having me. And I look forward to working with Patrick and Quantcast again at a future date. We might just be a package deal. Thank you, Somer.

Somer Simpson

Great, thank you both. This podcast is brought to you by Quantcast. Our mission is to radically simplify advertising on the open internet. We are the creators of a new and innovative intelligent audience platform that empowers brands, agencies, and publishers to know and grow their audiences online. The Quantcast Platform, powered by our patented AI and machine learning engine, delivers automated insights, marketing performance, and results at scale to drive business growth responsibly. Our solutions are leveling the playing field for our customers when it comes to effectively reaching audiences online and helping them power a thriving free and open internet for everyone. Connect with us today at quantcast.com.