Getty Images CEO Craig Peters has a plan to defend photography from AI

We’ve got another great conversation from the Code Conference today: my chat with Getty Images CEO Craig Peters. Getty is one of the most important photography services in the world, and as you might imagine, we talked quite a bit about the promise and peril of generative AI when it comes to photography. Craig was great onstage — he’s direct and no-nonsense about what AI can and can’t do, and we got right into it.

About a year ago, Getty banned users from uploading or selling AI-generated content. At the time, the company said it was concerned about copyright issues in AI. Eventually, it sued Stability AI for training the Stable Diffusion tool on Getty’s photos — at times, Stable Diffusion even generated Getty watermarks on its images.

But Craig doesn’t want to completely stop AI from happening: just before Code, the company announced Generative AI by Getty Images — a tool that generates pretty solid AI photography of its own. Getty trained the tool itself, using images it already had the rights to, and it’s put up some pretty strict guardrails around what it can do. (It certainly cannot generate any images of known celebrities.) What’s more, Getty’s come up with a way to compensate the photographers whose images are being used to generate images, which is pretty interesting — and pretty complicated.

I had early access to the new Getty tool, and we had some fun coming up with prompts to show the Code audience. I encourage you to watch the video of this episode, available on YouTube, so you can get a look at the images we’re talking about. 

You’ll hear Craig talk a lot, not just about copyright issues, but about what people are really talking about when they talk about intellectual property: money. What does compensation for being part of an AI training set look like? How can you distribute that money fairly? 

We also talked about the other elephant in the room when it comes to AI: deepfakes and disinformation. Getty has a long history as a repository of significant, important, and, most importantly, REAL photos of people and events. The 2024 US election is barreling toward us quickly, and Craig told us that while Getty doesn’t yet have “perfect solutions” for disinformation, the date isn’t moving, and he’s working with both partners and competitors to race against the clock to make sure authentic images are the ones people see. 

Okay, Craig Peters, CEO of Getty Images. Here we go.

This transcript has been lightly edited for length and clarity.

We should start with the news, because I think a lot of people expected you to show up and rail against the taking of content by AI but you actually announced an AI tool this week.

We did, we did. We launched it on Monday after coming out of Alpha. We launched that in partnership with Nvidia. So, we partnered up with them and their capabilities to launch what we think is a pretty unique tool. A tool that, first off, respects the IP that it was trained upon. It’s permissioned — it’s trained only off of Getty Images creative content. We are providing rewards back to the creators of that content. So, as we grow revenues from this service, those creators are rewarded for their contributions to the tool. It’s entirely commercially safe, so it cannot produce third-party intellectual property. It cannot produce deepfakes. It doesn’t know what the Pope is, it doesn’t know what [Balenciaga] is, and it can’t produce a merging of the two. And we think for version one, the quality is quite remarkable. Because obviously, we’re a big believer that quality in gives you a better outcome.

I have a lot to talk about with you. Craig very boldly allowed me [early] access to this tool. I told him what prompts I was going to use, but I didn’t show him the results. So, here’s the first one I put up.

Screenshot by Nilay Patel / The Verge

“The well-dressed and influential attendees at the Code Conference, [in a] fancy hotel ballroom.” I will say, I originally said, “In the Ritz-Carlton,” and [the tool] wouldn’t let me do it. Because, I think, it thinks “Ritz-Carlton” is a very fussy name for someone here. Wide angle shot of the crowd. So, here’s the result. Here’s all of you. You look great. Very excited.

AI-generated photo by Getty Images

I asked it to do it again, and I think next year, we’re going to have you all sit on the floor in the round. Pretty good. 

AI-generated photo by Getty Images

So, then I thought I should do something on the news cycle. “Famous pop star and Super Bowl-winning tight end holding hands in a convertible.” I think y’all know where this is going. It’s obviously Zac Efron having the time of his life.

AI-generated photo by Getty Images

Now, to be fair, I did not specify gender, sexuality, ethnicity — anything. This is a reasonable result for this query, in my opinion. And again, Zac and Marcedes Lewis look great. And then here, I asked it to do it again. This is great. This is more or less what Travis and Taylor look like.

AI-generated photo by Getty Images

I don’t know why she’s wearing pads, but it’s good. And then lastly, there’s the image that I personally really wanted, which is “the CEO of a major car company running away from our conference.”

AI-generated photo by Getty Images

Here’s Mary and our security guard looking good. These are remarkable results. There are the usual AI problems — if you pay too much attention to the hands or whatever — but the hardest problems have been solved. And I can see how, if I was actually writing the story, if I was being a little bit meaner to GM than I might otherwise, I might use a photo like this. You think that that’s what people want in the markets that you’re in?

Definitively. First of all, generative AI did not just burst onto the scene. It’s been something that has been around for years. Nvidia, our partner, actually launched the first GAN model for text-to-image generation. And so we knew it was coming, and our question to our customers was, “How are you going to use it? What do you need?” 

We create services for our customers that really allow them to create at a higher level, that save them time, save them money, and eliminate their IP risk. And that last piece is critical within AI. Everything that we heard from our customers was, “We want to use this technology.” And that could vary from media customers to agency customers to corporate customers. They want to unlock some of their creativity through these tools, but they need to make sure that they aren’t in violation of third-party intellectual property.

That can also vary around the globe. If you have an image and it produces an image of a third-party brand or somebody of name and likeness like [Travis] Kelce or [Taylor] Swift, that’s a problem. But there are much more nuanced problems in intellectual property, like showing an image of the Empire State Building. You could actually get sued for that. Tattoos are copyrighted. Fireworks can actually be copyrighted. That smiley firework that shows up? Grucci Brothers actually own that copyright. So, there’s a lot of things that we baked in here to make sure that our customers could use this and be absolutely safe. And then we actually put our indemnification around that so that if there are any issues, which we’re confident there won’t be, we’ll stand behind that.

There’s a flip side to that. So, you know all the training data is yours. We asked a bunch of provenance questions of [Microsoft CTO] Kevin [Scott] earlier. You know the provenance, right? And the next thing that enables you to do is say, “Okay, we’re going to go pay our creators.” How does that work? What’s the actual formula for saying, “We generated this image. Someone paid us for it, and now upstream of that, we’re going to pay you however many cents”?

“If we find fairer ways of doing this, we’ll certainly embrace it.”

Right. I think Kevin talked a little bit — I was getting mic-ed up in the back — but I think he talked a little bit about attribution and whether the technologies exist in order to do that. In our case, that would be attribution at a pixel level — I think the question earlier was around audio. The answer is, right now, those models don’t exist. We tested out a bunch and didn’t find them to be sufficient in order to do that attribution. So, the way we’re doing it is we’re doing it off of two things: What proportion of the training set does your content represent? And then, how has that content performed in our licensing world over time? It’s kind of a proxy for quality and quantity. So, it’s kind of a blend of the two.

So you’re just doing sort of a fixed model.

Yeah, and we’ll evaluate that over time. If we find fairer ways of doing this, we’ll certainly embrace it. We looked for technology that might do that, but at this point in time, I think it falls short of the goal.
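
To make the arithmetic concrete, here is a minimal sketch of the kind of blended payout Peters describes: each creator’s share mixes their proportion of the training set with their historical licensing performance. The 50/50 weighting, field names, and figures below are illustrative assumptions, not Getty’s actual formula.

```python
# Illustrative sketch of the blended contributor payout Peters describes:
# each creator's share mixes (a) their proportion of the training set with
# (b) their historical licensing performance. The 50/50 weighting, field
# names, and figures are assumptions for illustration, not Getty's formula.

def payout_shares(contributors, quantity_weight=0.5):
    """contributors maps a name to (images_in_training_set, lifetime_license_revenue)."""
    total_images = sum(images for images, _ in contributors.values())
    total_revenue = sum(revenue for _, revenue in contributors.values())
    shares = {}
    for name, (images, revenue) in contributors.items():
        quantity = images / total_images    # share of the training set
        quality = revenue / total_revenue   # proxy for licensing performance
        shares[name] = quantity_weight * quantity + (1 - quantity_weight) * quality
    return shares

# Distribute a hypothetical revenue pool according to the blended shares.
pool = 10_000.00
contributors = {
    "photographer_a": (8_000, 120_000.0),  # many images, modest revenue
    "photographer_b": (2_000, 240_000.0),  # fewer images, strong revenue
}
for name, share in payout_shares(contributors).items():
    print(f"{name}: {share:.1%} of pool -> ${pool * share:,.2f}")
```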

The dynamic here is really interesting. So, your customers want this — I need to generate a stock photo or something for an ad campaign, and instead of hiring a photographer, I might go to the Getty tool. Will the Getty tool be cheaper? Will it undercut the hiring of a real photographer?

I think that’s yet to play out. I think it is an entirely different model — a cost-per-call type of generative model. You played with the tool; I think this is a very good tool in terms of how it walks you through the prompts and what you can get out of it — quality from the start. It gives you high res from the start. But it’s work. Comparing and contrasting that to licensing pre-shot is something we’ve done some time studies on with customers and things along those lines. And I think it varies.

Ultimately, again, we try to save our customers time because that is the most expensive thing that they’re applying. And I think, in some cases, this can be very creative but not necessarily the most time-efficient. And I think our pre-shot, in many cases, can be much more authentic and much more efficient because you’re searching. You’re not paying for that search. You’re getting a wide variety of content back with real people, real locations. And in many cases, brands care about that. But that can be a much more efficient process. So, I think, really, we’re going to find out over time.

Getty’s a unique company in the space. You actually employ a bunch of photographers. You send them to dangerous places. You create a bunch of news photos. Are you hearing from your own creatives that this is a problem?

I wouldn’t say that we’re hearing from our own creatives that AI is a problem. We represent over half a million photographers worldwide. So you can imagine that within this audience, there’s a lot of different perspectives and points of view, and take that and multiply it times 1,000 and you’re going to get even more. What we hear is a lot of concern about intellectual property. We hear concerns that, ultimately, things are being trained on their intellectual property, and there’s value being created for that, either through subscription services or through other models. Ultimately, people want that to be solved for. But what we hear from our customers is, again, they want to create, and they want to use these tools.

Our point of view from the get-go has always been: We believe AI can have a constructive benefit for society as a whole, but it needs to account for certain things, and so we’ve always looked for transparency in training data. We believe creators and IP owners have the right to decide whether their material is trained on. We believe the creators of these models shouldn’t be covered by something like Section 230, that you should have some skin in the game and take on some liability if you’re creating these things and putting them out there. Again, our tool is one that… We were very conscious — as a member of the media, the last thing we wanted to do was produce a tool that actually could produce deepfakes. So your Taylor Swift and Kelce example is a good one: you’re going to struggle because it doesn’t know who Taylor Swift is, and it doesn’t know who Kelce is.

“You’re going to struggle because [the AI] doesn’t know who Taylor Swift is … and that’s intentional.”

The only thing in the world that doesn’t know who Taylor Swift is.

Yeah, that’s exactly right. And that’s intentional. And I’d like to think that Taylor Swift will appreciate that.

What are we all working for? It’s for Taylor Swift.

Inside of that, there’s some big ideas there. So, in my brief career as a copyright lawyer and my longer career as a journalist, I find that no one actually cares about copyright law. They don’t care about the IP. They care about the money. But the money right now is downstream of some very thorny copyright issues.

You heard Kevin say he thinks — Microsoft thinks — that all this is built on a fair use argument that will eventually succeed or be modified in some way. You are, in many ways, on the other side of this. You’re suing Stability for having trained on a bunch of Getty images. If you win, maybe this whole edifice falls down. Have you thought about the stakes of that lawsuit?

There are high stakes around that lawsuit. We brought it for a reason. We fundamentally believe that IP owners should have the right to decide whether or not their content is used in training sets, and they should have the right to be compensated if that’s their choice. And I don’t buy the argument that Kevin put out there that I read Moby Dick, and therefore… First of all, these computers are not humans. Secondly, they’re actually corporate entities and are making money, to your point. And in many cases, they’re targeting existing markets with these technologies. I think the Warhol case, I mean…

That’s why we’re here.

We’re a little bit more IP geeks than others, but I think the Warhol case—

Why do you think we played Prince in front of Kevin?

Yeah, I was wondering! What did I get? It was supposed to be the Bangles, I think. Interesting. But I think that’s going to play out. I think we’re in the right. I think a world that doesn’t reward investment in intellectual property is a pretty sad world. Whether that’s music, whether that’s editorial journalism, whether that’s imagery — I believe we want to see more creators, not fewer. And I think it’s actually interesting. Cris [Valenzuela] was on the stage last night from Runway doing some demos, and I think Cris’ point of view is one that we share. We want more creators because of this technology, not fewer. And I think that’s a great world.

So, how do we enable it? That’s why we put this tool out there. We put this tool out not to disintermediate creators; we put this tool out there to enable creators. The users of our tool are creators, and I think it’s going to allow them to create in more insightful and innovative ways. But I think others are, in some cases, pointing these technologies directly at the creators themselves or, in some cases, coders. And I don’t think that is the society that I want to necessarily push toward.

Obviously, you’re in litigation with Stability. There are lots of other companies that are potentially training on Getty Images, lots of other companies that are just out there crawling the web and training on that. Are you in productive conversations with the Microsofts or the Googles or the OpenAIs?

We are in productive conversations. Whether they result in something that’s productive, I don’t know. 

First of all, I think there’s a PR bullshit layer of like, “Alright, I’m going to join this group, and I’m going to try to cleanse my reputation because I’m a member of that. Of course, I don’t implement anything around it. I don’t do anything. I just cite the fact that my corporate name is on that website.” I think that’s not real engagement. I think we can have different points of view on law, but I think one of the things I hope our model proves is that [putting] good quality ingredients in creates a better output. It creates a more socially responsible output, and it creates one that I think businesses will adopt.

“I think there’s a PR bullshit layer of like, ‘Alright, I’m going to join this group.’”

So, yeah, we’re in conversations there, but we aren’t going to move off of the fundamental point, which is: we believe if you’re an IP owner, you should have the right to decide whether your content’s used in training and you should be compensated for that right. And that doesn’t mean a de minimis check. It’s fundamental to these tools.

We’ve talked a lot about music at the conference generally. Again, I’m a nerd for this stuff, but broadly, the music industry has developed its own private copyright law because the courts are a coin flip in fair use arguments — like, crazy coin flips. And so the music industry is just like, “We’re going to make our own deals on the side. We’ll have our own norms.” My favorite example of this, by the way, is the publishing rights for the “Thong Song” by Sisqo are owned by Ricky Martin’s songwriter because Sisqo just whispers the words, “Livin’ La Vida Loca.”

If you apply that standard to AI, none of these companies are making money. But the music industry has developed that set of norms. Do you see that happening here where you’re going to need the courts to figure it out for you or that there will be an industry set of norms around licensing and deal-making because the courts are unreliable?

My hope is that we will not need to rely on the courts. Unfortunately, I can’t say that’s the case across the board. I think there are some very responsible technology organizations that are having dialogue, that are willing to figure out solutions. They understand the stakes at play. And I think you can have those conversations. I think there are others that are just off doing what they want to do and damn the consequences. Damn the consequences for IP ownership. Damn the consequences for deepfakes and what that means to press and facts and democracy. So, I hope we can all agree on it.

I don’t know that that’s necessarily going to be the case, which is why we not only invested a ton of time and resources to launch a product that we think actually proves that you can do this responsibly — by the way, a lot of those actors said, “Well, you could never do this. You could never get access to licensed data, so we didn’t even try because it’s an impossible thing.” Which, again, I call bullshit on, and I think this tool is a big bullshit on that statement.

But we launched. We’re spending millions of dollars to go down the other route because we don’t have 100 percent confidence that we’re going to be able to get there. We hope, and we’ll engage with anybody that wants to, but similarly, we needed to have that other track.

Two more themes I want to touch on: You and I have talked a lot about authenticity. You mentioned deepfakes. Getty does generate some of the most important photographs of our time. Historically, this is the role that Getty plays in our culture. You’ve said to me that just marking that stuff as authentic is not good enough, that there’s another problem here. Describe what you think that other problem is.

Well, I think there’s a problem where you can’t discern what is authentic. In a world where generative AI can produce content at scale and you can disseminate that content on a breadth and reach and on a timescale that is immense, ultimately, authenticity gets crowded out.

Now, I think our brand helps cut through. I think our reputation helps cut through. I do think, ultimately, that has value, but I do worry about a world where… I think this past year, I heard the stat that there were more AI images produced than were shot on a camera. That’s just staggering. Think about where we are in the adoption curve of AI — just play that out on an exponential basis. And then again, you think about a world where there are nefarious individuals and organizations and institutions, and that worries me. Our newsroom lit up when the Pentagon image was put out there. “Is it real?” And we’re getting calls asking for us to validate that. Now, let’s put that into an election in 2024.

You’re in the Content Authenticity Initiative group, right?

We’re in the discussion. We haven’t adopted it, to be quite frank. We’re not sure that that is the right solution right now. First of all, it puts the onus and the investment on the people creating original content, authentic content, rather than on the platforms and the generative tools that are producing generative content, and we fundamentally think that’s a little bit backwards. The generative tools should be investing in order to create the right solutions around that. In the current view, it’s largely in the metadata, which is easily stripped.

“In a world where generative AI can produce content at scale … ultimately, authenticity gets crowded out.”

So, you guys are a customer of ours. You use our imagery. You strip our metadata immediately when you put it in your CMS because it’s lighter, it helps page loads, and everything else — which makes sense because you’re competing for SEO and everything else that you need to do. So, you strip it. And so I think what we need to look… And this is where we’re engaging with Kevin and Microsoft and their team — we’re really encouraged by the pledges that they made to the White House in order to identify generative content. Because we want to do the exact same thing, but we want to do it in a way that really gets at the core.
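
As a concrete illustration of how fragile metadata-based provenance is, here is a minimal sketch using the Pillow imaging library; the filenames are hypothetical, and this is a toy under stated assumptions, not any CMS’s actual pipeline. A routine re-encode for web delivery silently drops embedded EXIF data unless the pipeline deliberately carries it over.

```python
# Toy demonstration (Pillow) of how easily provenance metadata is lost.
# Filenames are hypothetical; a real CMS optimizer does the same thing
# at scale when it re-encodes images for faster page loads.
from PIL import Image

original = Image.open("getty_original.jpg")
print("EXIF tags before re-encode:", len(original.getexif()))

# Typical optimization step: re-encode as a smaller JPEG. Because no
# exif= argument is passed to save(), the metadata does not survive.
original.convert("RGB").save("optimized_for_web.jpg", quality=80)

optimized = Image.open("optimized_for_web.jpg")
print("EXIF tags after re-encode:", len(optimized.getexif()))  # typically 0
```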

Are you taking any particular steps ahead of the 2024 election, knowing that you’re going to compete for real photographs — real images — in a world of generative AI?

We are, I would say. We’re talking with The Associated Press. We’re talking with Agence France-Presse. We’re talking with our partners and our competitors about, “How do we go about doing this?” While we’re taking steps, we don’t have perfect solutions. And again, that date isn’t moving, and we’re getting closer and closer to it.

I think this is one of those things where putting technology out into the world under the “let’s just move fast and break things…” — we’re playing with bigger stakes here.

“I think if we try to eliminate creators, I think it’s a sad world.”

Last big thing, and then we’ll go to the audience here. When Craig and I first started talking, one of the things you and I discussed was that the market for photography changed inevitably forever when the internet arrived, and more people could create, and our distribution platforms changed. Pricing collapsed. I know a lot of professional photographers [whose] careers evaporated with the advent of the internet, effectively.

Does this feel like that? And you’ve built a business in response to that — you’ve changed the business. Is this that same moment, do you think? Is it the same level of change?

I think it’s clearly a lot of change. I think what we do has value, whether that’s a tool that can enable creativity or content that is highly authentic, that can engage an end audience in a meaningful way and move them — if you’re a media company, move them to understand an issue, or if you’re a corporation, move them to actually engage with your brand or your products. I don’t think that goes away.

I think this puts different challenges in that it’s actually fun to navigate and figure out. I think the most important thing that will allow photographers, videographers, writers… is that we enable more creators. That’s the end goal. If we do that, then I think the world’s a great place. And I think companies like Getty Images will thrive in that, and those that work with us will thrive within that. I think if we try to eliminate creators, I think it’s a sad world, and I think it’ll be challenging for our business and for those that try to make a living.

Audience Q&A

Andrew Sussman: Andrew Sussman. So, with copyright and patents both stemming from the same part of the Constitution, “Congress shall promote the useful arts and sciences…” yet patents over the years have seen a narrowing, a constriction in the exclusive rights that are being granted.

Yet copyright has seen largely only an expansion [of rights], whether it be how long they last or what they cover. And in this case, it seems like some of the creative expression aspects are being wrapped into what are more procedural or data elements. Like if you were to take a text or an image and convert it into a series of numbers, where is the creative expression in that? So, just curious: in connection with Congress revisiting what the scope of copyright includes, should they also be looking at how broad copyright is in general?

Craig Peters: Well, first off, I think the Copyright Office right now is in an active open submission for input on AI and the degree to which copyright is or is not applied. Obviously, we have a point of view on that, and we’ll put a brief in and give input into that. But I think copyright needs to be something that constantly evolves. The world constantly evolves. I think that’s ultimately something… I think regulatory institutions — government, legislative — evolve at a slower pace than technology. They tend to lag. But I do think they are doing their best to go through and be contemplative about, “How does this move the needle or not on copyright?”

I think they’re bringing all the voices in together in order to do that, and I think they’ll come out in what is a reasonable space. But my point is that copyright needs to evolve, and if we want it just to stay stagnant, it’s going to ultimately not match up to the world that we’re living in. So, I don’t know if that answers your question, but that evolution’s critical.

Jay Peters: Jay Peters, with The Verge. I know the new AI tool is designed not to create images of real people, but what if it does? What if it makes a Joe Biden? What if it makes a Donald Trump? What if it makes a Pope? What does Getty do in that situation?

Craig Peters: It can’t. It doesn’t know who they are.

JP: Is there any way for somebody to engineer a prompt, though, that gets a pretty close approximation?

CP: No, because it doesn’t know who they are. It really doesn’t. So, if you go on Bing and you do “Travis Kelce and Taylor Swift,” it will give you somebody wearing a Chiefs uniform that kind of looks like Kelce. You won’t be able to type “Taylor Swift and Travis Kelce,” but just do “Swift” and “Kelce,” and you’ll get an output. That’s because it was trained on the internet. It was trained on our content and others that sit out on that internet. Therefore, it knows who they are.

This content was trained in a universe where that doesn’t exist. So, it really does not know. Now, we put in prompt engineering. We put things in there that can inform you that that’s not something that this tool will respond to. But even if we missed that prompt because we didn’t think of that in terms of the architecture, it won’t produce that output.

Unless we open it up to train on other content, it cannot produce those outcomes. And I think that’s, again, thinking about how you build these tools so that they can really be beneficial to businesses and corporations and, at the same time, be responsible to society. I think it’s a really important element.

Now, I think what we can do is we can take third-party intellectual property where they actually have the ownership rights, and then we can do custom fitting, or what is custom training, of that along with Nvidia to produce a model that’s bespoke to that IP. But again, that’s a case where that IP will be owned or have the necessary permissions in order to bring that to the table. So, we’ll be doing that with brands and companies over the coming months and years in order to produce some of those. But the fundamental piece here is that most large language models and generative models, because they’re trained on open data sets under the notion that that is fair use, know a lot. And they have to now do prompt engineering in order to restrict it. And that’s like whack-a-mole, right?
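
To illustrate why filtering after the fact is whack-a-mole, here is a toy sketch of a naive prompt blocklist; the names and logic are purely illustrative assumptions, and real guardrails are far more elaborate. The point is structural: a filter catches exact phrasings but misses rewordings, while a model that never learned a face has nothing to leak in the first place.

```python
# Toy illustration of the "whack-a-mole" problem with prompt filtering.
# The blocklist and prompts are illustrative only; real systems use far
# more sophisticated filters, and still face the same structural issue.

BLOCKLIST = {"taylor swift", "travis kelce", "joe biden"}

def is_allowed(prompt: str) -> bool:
    """Reject a prompt if it contains a blocklisted name verbatim."""
    lowered = prompt.lower()
    return not any(name in lowered for name in BLOCKLIST)

print(is_allowed("Taylor Swift and Travis Kelce in a convertible"))          # False: caught
print(is_allowed("famous pop star dating a Chiefs tight end, convertible"))  # True: slips through
# A model trained without those identities, as Peters describes, cannot
# render them no matter how the prompt is worded.
```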

Miles Fisher: Miles Fisher. Appreciate your candid responses. You talk about, big picture, enabling more creators. That’s the greater good, so I agree with that. With the advent now of deepfakes, a lot of the academic think pieces compare the criticism to the advent of the camera in the mid-1800s. Creators are left with nothing — the artistry of painting, all of that rendered moot. I’d just like to know personally, what do you think makes an exceptional photograph?

CP: That’s a good question. To me, it’s the one that moves you. So, there’s a level of photography — whether it’s computer generated or otherwise — where it’s the one that makes you stop, think, react, have emotion, engage. That’s a great photo. Whether that’s in news, whether that’s in sport, entertainment, or in a creative moment with a brand, I think those are the things that talent truly can express. And that’s why I believe in creative talent. Because I don’t think the expertise that goes into… Just let’s take what people might call stock for a second — if you knew what went into producing a great piece of imagery that actually got your attention, these are people that understand how to bring empathy, trust, and integrity into photos in a new way that you haven’t thought about before, and they can grab that attention and put it in front of an audience. That’s tough to do. I sometimes equate our business to fast fashion in the creative sense because we’re constantly having to do something new because somebody else is going to knock off what we just did yesterday. But that’s what makes photography great at its core. It’s something that makes you stop, think, emote, engage.

NP: Well, that’s a great place to end it. That was a great question. Thank you so much, Craig.

CP: My pleasure. Thank you.
