Programmatic Brand Extraction: Pulling Logos, Colors, and Assets from Any URL

OpenBrand is an open-source library that extracts structured brand assets from any URL. It is available as an npm package, an API, or an AI agent skill.

By Yixing Jiang · May. 27, 26 · Analysis

Here’s a problem I kept running into: I need a company’s brand assets — their logo, their colors, maybe a hero image — and there’s no API for it.

You’re building a white-label dashboard. Or a proposal generator. Or an integration that sends branded emails on behalf of customers. Every time, you end up on their website, right-clicking “Inspect Element,” eyedropping hex codes, and downloading a pixelated PNG from their footer. It’s tedious, it breaks when they redesign, and it doesn’t scale.

So I built OpenBrand, an open-source library that extracts brand assets from any URL. Give it a website, get back structured JSON with logos, colors, and backdrop images. No API key needed if you run it as a library.

The Problem Is Harder Than It Looks

You might think: “Just scrape the <link rel='icon'> and call it a day.” But favicons are often only 16x16 or 32x32 pixels. That’s not a logo. That’s a logo for ants.

Real brand extraction needs to handle:

Logo detection. Companies put their logos in wildly different places. Some use an <svg> in the header. Some use an <img> with a class like .site-logo or .brand. Some only have it as an Open Graph image in their <meta> tags. Some have it nowhere obvious, and you need to check their web app manifest for higher-resolution icon variants.
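
As a rough illustration of the manifest step, here is a minimal sketch that picks the largest icon from a web app manifest's icons array. The names `ManifestIcon`, `largestSize`, and `pickLargestIcon` are hypothetical and not part of OpenBrand; the `src` and `sizes` fields follow the web app manifest format.

```typescript
// Hypothetical helper, not OpenBrand's API: choose the highest-resolution
// icon listed in a web app manifest.
interface ManifestIcon {
  src: string;
  sizes?: string; // e.g. "192x192", "48x48 96x96", or "any"
  type?: string;
}

// Largest pixel width declared in `sizes`; "any" (typically a vector) ranks highest.
function largestSize(sizes: string | undefined): number {
  if (!sizes) return 0;
  let best = 0;
  for (const token of sizes.toLowerCase().split(/\s+/)) {
    if (token === "any") return Number.POSITIVE_INFINITY;
    const match = /^(\d+)x(\d+)$/.exec(token);
    if (match) best = Math.max(best, Number(match[1]));
  }
  return best;
}

// Returns an absolute URL for the largest icon, or null if the list is empty.
function pickLargestIcon(icons: ManifestIcon[], manifestUrl: string): string | null {
  let best: ManifestIcon | null = null;
  for (const icon of icons) {
    if (!best || largestSize(icon.sizes) > largestSize(best.sizes)) best = icon;
  }
  // Icon paths are resolved relative to the manifest's own URL
  return best ? new URL(best.src, manifestUrl).href : null;
}
```

Resolving `src` against the manifest URL matters because manifests often use relative paths.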

Color extraction. The brand’s primary color might be in CSS custom properties (--brand-primary), in computed styles on key elements, in their stylesheet as the most-used non-white/non-black color, or embedded in their logo SVG. And you need to distinguish between “the brand color” and “the color they use for body text.”
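
One of these signals, the most-used non-white/non-black color in a stylesheet, can be sketched as follows. This is an illustration under simplifying assumptions, not OpenBrand's implementation: `dominantCssColors` is a hypothetical name, only 3- and 6-digit hex colors are counted, and the neutral threshold of 24 is arbitrary.

```typescript
// Hypothetical helper: rank hex colors in a CSS string by frequency,
// skipping near-white, near-black, and gray values.
function normalizeHex(hex: string): string {
  const h = hex.slice(1).toLowerCase();
  // Expand shorthand (#abc -> #aabbcc)
  const full = h.length === 3 ? h.split("").map((c) => c + c).join("") : h;
  return "#" + full;
}

function isNeutral(hex: string): boolean {
  const r = parseInt(hex.slice(1, 3), 16);
  const g = parseInt(hex.slice(3, 5), 16);
  const b = parseInt(hex.slice(5, 7), 16);
  // A small spread between channels means gray, white, or black
  return Math.max(r, g, b) - Math.min(r, g, b) < 24;
}

function dominantCssColors(css: string, limit = 3): string[] {
  const counts = new Map<string, number>();
  // Caveat: an id selector made only of hex digits (e.g. "#add") would also match
  const matches = css.match(/#(?:[0-9a-fA-F]{6}|[0-9a-fA-F]{3})\b/g) ?? [];
  for (const m of matches) {
    const hex = normalizeHex(m);
    if (isNeutral(hex)) continue;
    counts.set(hex, (counts.get(hex) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .sort((a, b) => b[1] - a[1])
    .slice(0, limit)
    .map((entry) => entry[0]);
}
```

Frequency alone cannot tell a brand color from a body-text color, which is why it is only one signal among several.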

Backdrop images. Hero images, background gradients, Open Graph images — these are useful for building branded experiences, but they’re scattered across different DOM locations and meta tags.

The point is: there’s no standard for where brands put their assets. Every website is a snowflake.

How OpenBrand Works

OpenBrand uses server-side HTML scraping with Cheerio and image analysis with Sharp. No headless browser, no Puppeteer — just direct HTTP requests and intelligent heuristics. Here’s the approach:

JavaScript
 
// Top-level await requires an ES module (e.g. a .mjs file)
import * as cheerio from 'cheerio';

// Fetch the page HTML with a browser-like User-Agent
const html = await fetch('https://stripe.com', {
  headers: { 'User-Agent': 'Mozilla/5.0 ...' }
}).then(r => r.text());

// Parse with Cheerio (jQuery-like DOM API for Node.js)
const $ = cheerio.load(html);
// Run extraction heuristics across the parsed markup


For sites that block direct requests, it falls back to Jina Reader, a service that renders pages and returns clean content.
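
A fallback like this might look roughly like the sketch below. It assumes Jina Reader's public convention of prefixing the target URL with `https://r.jina.ai/`; whether OpenBrand calls it exactly this way is an assumption, and `fetchPageContent` and `Fetcher` are hypothetical names. The fetch function is injected so the logic can be exercised without network access.

```typescript
// Minimal structural type covering the parts of a fetch Response used here.
type Fetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  text(): Promise<string>;
}>;

// Hypothetical helper: try a direct request first, then fall back to a reader service.
async function fetchPageContent(url: string, fetcher: Fetcher): Promise<string> {
  try {
    const res = await fetcher(url);
    if (res.ok) return await res.text();
    // Non-2xx (e.g. 403 from bot protection): fall through to the fallback
  } catch {
    // Network-level failure: also fall through to the fallback
  }
  // Assumption: Jina Reader is reached by prefixing the target URL
  const fallback = await fetcher("https://r.jina.ai/" + url);
  if (!fallback.ok) {
    throw new Error(`Direct and fallback fetch both failed (status ${fallback.status})`);
  }
  return fallback.text();
}
```

In Node 18+, the global `fetch` can be passed as the fetcher. The fallback returns cleaned content rather than raw HTML, so downstream heuristics may need to handle both shapes.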

The extraction pipeline runs in this order:

  1. Logos – Check <svg> elements in header/nav, <img> elements with logo-related classes/IDs, <link rel="icon"> tags and the web app manifest for high-res variants, Open Graph/Twitter card images as fallback
  2. Colors – Extract theme-color meta tags, parse manifest.json, sample dominant colors from logo images using Sharp
  3. Backdrops – Find Open Graph images, hero/banner images, background images on key sections
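
The logo step above amounts to collecting candidates from several sources and ranking them. A minimal sketch of such a ranking follows, assuming the candidates have already been collected; the `LogoCandidate` shape, the source labels, and the priority order are illustrative and not OpenBrand's actual types.

```typescript
// Hypothetical candidate model: where a logo was found, plus optional width.
type LogoSource = "header-svg" | "logo-img" | "manifest-icon" | "og-image";

interface LogoCandidate {
  source: LogoSource;
  url: string;
  width?: number;
}

// Lower number = more trusted source (an assumed order, mirroring the list above)
const SOURCE_PRIORITY: Record<LogoSource, number> = {
  "header-svg": 0,
  "logo-img": 1,
  "manifest-icon": 2,
  "og-image": 3,
};

function rankLogos(candidates: LogoCandidate[]): LogoCandidate[] {
  // Sort by source priority, then by width (larger first) within the same source
  const sorted = [...candidates].sort(
    (a, b) =>
      SOURCE_PRIORITY[a.source] - SOURCE_PRIORITY[b.source] ||
      (b.width ?? 0) - (a.width ?? 0)
  );
  // Drop duplicate URLs, keeping the best-ranked occurrence
  const seen = new Set<string>();
  return sorted.filter((c) => {
    if (seen.has(c.url)) return false;
    seen.add(c.url);
    return true;
  });
}
```

Returning the whole ranked list rather than a single winner lets the caller fall back to the next candidate when the top one turns out to be unusable.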

The library returns structured data:

TypeScript
 
import { extractBrandAssets } from "openbrand";

const result = await extractBrandAssets("https://stripe.com");

if (result.ok) {
  console.log(result.data.brand_name);     // "Stripe"
  console.log(result.data.logos);           // LogoAsset[] - SVGs, PNGs with URLs and dimensions
  console.log(result.data.colors);         // ColorAsset[] - hex values with context
  console.log(result.data.backdrop_images); // BackdropAsset[] - hero images, backgrounds
}


Three Ways to Use It

As an npm package (no API key, runs on your server):

Shell
 
npm add openbrand


TypeScript
 
import { extractBrandAssets } from "openbrand";
const result = await extractBrandAssets("https://linear.app");


Lightweight and fast — no browser process to manage. Good for build scripts, CI pipelines, serverless functions, or backend services.

As an API (free API key from openbrand.sh):

Shell
 
curl "https://openbrand.sh/api/extract?url=https://stripe.com" \
  -H "Authorization: Bearer your_api_key"


Good for client-side apps or anywhere you want a simple HTTP call.
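
From TypeScript, the same call can be assembled as in the sketch below. The endpoint and the Bearer header come from the curl example above; `buildExtractRequest` is a hypothetical helper, and the response shape is not assumed here. Building the query with `URL` encodes the target address, so target URLs with their own query strings survive intact.

```typescript
// Hypothetical helper: build the request for the hosted extract endpoint.
function buildExtractRequest(
  targetUrl: string,
  apiKey: string
): { url: string; headers: Record<string, string> } {
  const endpoint = new URL("https://openbrand.sh/api/extract");
  // searchParams.set percent-encodes the value, including "&" and "?"
  endpoint.searchParams.set("url", targetUrl);
  return {
    url: endpoint.href,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}

// Usage (requires network access and a real key):
//   const req = buildExtractRequest("https://stripe.com", "your_api_key");
//   const res = await fetch(req.url, { headers: req.headers });
```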

As an agent skill (for Claude Code, Cursor, Codex, Gemini CLI):

Shell
 
npx skills add ethanjyx/openbrand


Then just ask your AI agent: “Extract brand assets from linear.app.” This is probably the most interesting distribution channel — 40+ AI coding agents can use it as a tool.

What I Got Wrong (And What I’d Do Differently)

Some honest takes on the tradeoffs:

Static HTML has limits. We don’t execute JavaScript, so sites that render most of their markup client-side (SPAs) may not expose all their brand assets in the initial HTML. In practice, this matters less than you’d think: logos, favicons, OG tags, and most brand-relevant markup live in static HTML. For the few sites where it fails, the Jina Reader fallback helps. We chose speed and simplicity over completeness.

Logo detection is fuzzy. There’s no semantic HTML tag for “this is the company’s logo.” Heuristics work well for ~85% of sites but break on unusual layouts. Some sites put their logo in a <div> with a background image. Some use CSS mask-image. The current approach has a priority-ranked list of strategies, but it’s not perfect.

Color extraction conflates brand color with design system color. A company might use blue as its brand color but green for its primary CTA buttons. OpenBrand currently returns both without distinguishing between them. This is a known limitation: brand identity and UI design tokens overlap but aren’t identical.

Rate limiting. If you’re extracting from many URLs, you need to be respectful. The API has rate limits built in, but the npm package doesn’t throttle — that’s your responsibility.
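
If you use the npm package in bulk, a simple way to throttle is to process URLs one at a time with a pause between requests. The sketch below is one possible approach; `mapWithDelay` is a hypothetical helper, and the right delay depends on the sites you target.

```typescript
// Hypothetical helper: run an async worker over items sequentially,
// waiting `delayMs` between consecutive calls.
async function mapWithDelay<T, R>(
  items: T[],
  worker: (item: T) => Promise<R>,
  delayMs: number
): Promise<R[]> {
  const results: R[] = [];
  let first = true;
  for (const item of items) {
    // No pause before the first item
    if (!first) {
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
    first = false;
    results.push(await worker(item));
  }
  return results;
}

// Usage with the library call shown earlier:
//   const results = await mapWithDelay(urls, (u) => extractBrandAssets(u), 1000);
```

Sequential processing is slower than a worker pool, but it keeps at most one request in flight at any time.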

Where This Is Actually Useful

Real use cases I’ve seen or built:

  • White-label SaaS: Automatically theme a customer’s dashboard using their brand colors on first login
  • Proposal/invoice generators: Pull the client’s logo and colors to brand documents without asking them to upload assets
  • Competitive analysis tools: Track how competitors’ branding evolves over time
  • AI agents: Give LLMs the ability to “see” a brand without manual configuration — useful for generating branded content, emails, or presentations
  • Design system bootstrapping: Start a new project by extracting the brand’s existing visual language
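
For the white-label case, extracted colors can be turned into CSS custom properties. The sketch below assumes each color is available as a 6-digit hex string; the `BrandColor` shape and the variable names are hypothetical, since the exact fields of OpenBrand's ColorAsset are not shown here. The text color is chosen with the WCAG relative-luminance contrast formula.

```typescript
// Hypothetical shape: only a 6-digit hex value such as "#635bff" is assumed.
interface BrandColor {
  hex: string;
}

// Convert one sRGB channel (two hex digits at `offset`) to linear light.
function linearChannel(hex: string, offset: number): number {
  const c = parseInt(hex.slice(offset, offset + 2), 16) / 255;
  return c <= 0.03928 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

function relativeLuminance(hex: string): number {
  return (
    0.2126 * linearChannel(hex, 1) +
    0.7152 * linearChannel(hex, 3) +
    0.0722 * linearChannel(hex, 5)
  );
}

// Pick black or white text, whichever has the higher contrast ratio.
function readableTextColor(background: string): "#000000" | "#ffffff" {
  const l = relativeLuminance(background);
  const contrastWithWhite = 1.05 / (l + 0.05);
  const contrastWithBlack = (l + 0.05) / 0.05;
  return contrastWithWhite >= contrastWithBlack ? "#ffffff" : "#000000";
}

// Emit a :root block with one color and one matching text color per entry.
function toCssVariables(colors: BrandColor[]): string {
  const lines = colors.map(
    (c, i) =>
      `  --brand-color-${i + 1}: ${c.hex};\n` +
      `  --brand-on-color-${i + 1}: ${readableTextColor(c.hex)};`
  );
  return `:root {\n${lines.join("\n")}\n}`;
}
```

The generated block can be injected into a style tag on first login, with the dashboard's own palette as the default when no colors were extracted.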

Try It

The repo is at github.com/ethanjyx/openbrand. MIT licensed.

The fastest way to see if it works for your use case:

Shell
 
npm add openbrand
node -e "
  import('openbrand').then(async ({extractBrandAssets}) => {
    const r = await extractBrandAssets('https://your-target-site.com');
    if (r.ok) console.log(JSON.stringify(r.data, null, 2));
    else console.error(r.error);
  });
"


If you find sites where the extraction breaks, open an issue — the heuristics improve with every edge case.
