Overview

🤔 Why does user’s request for a generic “videogame plumber” lead image-generation models such as Playground v2.5 and DALL·E 3 to produce Nintendo’s Mario? Does user’s “gotham superhero” request necessarily mean they want an image of DC Comics’ Batman?

Despite implemented guardrails, models can still be triggered by names, keywords, or descriptions to generate copyrighted characters. This mismatch between generic prompt and specific output with potential copyright risks raises serious legal concern. Our project makes the following contributions:

CopyCat
Evaluation Suite

We build CopyCat, an evaluation suite with a diverse set of popular copyrighted characters and a novel evaluation pipeline, to better ground our investigation and evaluation.

1. Copyrighted Characters

CopyCat includes a dataset with 50 diverse popular copyrighted characters from 18 studios, both US and international.

2. Similarity evaluator

CopyCat uses a GPT4-based evaluator to detect similarity with copyrighted character in generated images, resulting in a DETECT score.

3. Consistency evaluator

CopyCat also measures whether the generation is consistent with the user's intent (i.e., utility), using a CONS score indicating if the main characteristic (e.g., “cartoon mouse” for Mickey Mouse) is present in the generation.

Indirect Anchors Generate Copyrighted Characters

We formalize two modes of copyrighted character generation: (1) Character Name Anchoring: User prompt contains character name directly, and (2) Indirect Anchoring: User prompt contains keywords or descriptions without the character’s name. Indirect anchoring is especially important for both model deployers and model users: even a non-malicious user could accidentally copyrighted characters when using seemingly innocuous prompts, leading to potential legal liability for the model deployer as well as any unsuspecting user that tries to monetize the image.

We introduce a generation + reranking pipeline to semi-automatically discover keywords or phrases that can be effective indirect anchors. We first generate a candidate set of keywords, then we use the following 3 reranking approaches to semi-automatically discover indirect anchors:

  • LM-Ranked: using greedy decoding to capture the inherent ranking of LMs.
  • EmbeddingSim Ranked: rank by their embedding space distance to the copyrighted character’s name.
  • Co-Occurence Ranked: rank by their co-occurrence with the character’s name in popular training corpora.

We find that in general, ranking keywords based on their co-occurrence with the character’s name in the LAION corpus is the most effective and could more characters than using a 60-word description when only 20 keywords are used. In addition, we find that descriptions and identified keywords also transfer to generating characters from DALL·E 3 and video models.

Prompt:
Generated image
Prompt:
Generated image

Key Takeaways for Users and Model Deployers

Users

Users should be aware of Indirect Anchoring, as these prompts can result in potential copyright issues and liability even when not intending to generate copyrighted characters.

Model deployers

For model deployers who adopt mitigation strategies to prevent the generation of copyrighted characters, it's also important to note that Indirect Anchoring may bypass guardrails that rely on direct name detection. We also recommend investing in techniques beyond prompt rewriting such as combining it with negative prompts.