Extract URLs from text
Paste logs, an email, an HTML dump, a CSV, or a code file and get a clean, deduplicated list of URLs. Extraction combines regex matching with HTML-aware parsing (<a href>, src, background, srcset, CSS url()).
How to use
- Paste any text. Files up to ~50 MB work fine.
- Pick Mode — Plain text (regex), HTML (parses attributes), Markdown (parses link syntax), Auto (tries all three).
- The output list is automatically deduplicated. Toggle Show duplicates if you need them.
- Filter options: Host only, Filter by domain, Only http/https, Sort A→Z.
- Download as .txt or .csv (with a count column).
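The deduplication, filtering, and CSV export steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the tool's actual code: the function names `summarize` and `to_csv`, the `url,count` header, and the subdomain-matching rule are all hypothetical.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit


def summarize(urls, domain=None, http_only=True, sort=True):
    """Deduplicate URLs and count how often each one appeared."""
    counts = Counter()
    for url in urls:
        parts = urlsplit(url)
        # "Only http/https" filter.
        if http_only and parts.scheme not in ("http", "https"):
            continue
        host = parts.hostname or ""
        # "Filter by domain": assumed to match the domain and its subdomains.
        if domain and not (host == domain or host.endswith("." + domain)):
            continue
        counts[url] += 1
    rows = list(counts.items())  # first-seen order
    if sort:
        rows.sort(key=lambda row: row[0])  # "Sort A→Z"
    return rows


def to_csv(rows):
    """Render (url, count) rows as CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["url", "count"])
    writer.writerows(rows)
    return buf.getvalue()


found = [
    "https://example.com/b",
    "https://example.com/a",
    "https://example.com/b",
    "ftp://example.com/file",
    "https://cdn.example.org/x.png",
]
rows = summarize(found, domain="example.com")
print(to_csv(rows))
```

Keeping the counts alongside the unique URLs means the same data can back both the deduplicated view and the count column in the CSV.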
When to use which mode
- Auto: handles most inputs. Slower than pure regex but catches HTML attribute URLs that regex misses.
- Plain text: fastest, finds URLs by regex. Misses attribute values in HTML.
- HTML: parses with DOM, pulls from href, src, srcset, style="background-image: url(...)".
- Markdown: extracts the URL from [text](url) link syntax.
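To make the HTML and Markdown modes concrete, here is a rough Python sketch of attribute-based and link-syntax extraction. It uses the standard library's `html.parser` rather than a browser DOM, and the names `UrlCollector`, `extract_html_urls`, and `extract_markdown_urls` are hypothetical; the real tool may handle more cases.

```python
import re
from html.parser import HTMLParser

# url(...) inside a style attribute, with optional quotes.
CSS_URL = re.compile(r"url\(\s*['\"]?([^'\")]+?)['\"]?\s*\)", re.IGNORECASE)
# [text](url) or [text](url "title"); URLs containing ")" are cut short.
MD_LINK = re.compile(r"\[[^\]]*\]\(\s*([^\s)]+)")


class UrlCollector(HTMLParser):
    """Collects URLs from href, src, background, srcset, and style."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value:
                continue
            if name in ("href", "src", "background"):
                self.urls.append(value.strip())
            elif name == "srcset":
                # Naive split: each candidate is "URL [descriptor]".
                # URLs that themselves contain commas are not handled.
                for candidate in value.split(","):
                    parts = candidate.split()
                    if parts:
                        self.urls.append(parts[0])
            elif name == "style":
                self.urls.extend(CSS_URL.findall(value))


def extract_html_urls(html):
    collector = UrlCollector()
    collector.feed(html)
    collector.close()
    return collector.urls


def extract_markdown_urls(text):
    return MD_LINK.findall(text)


sample = (
    '<a href="https://example.com/page">link</a>\n'
    '<img src="https://example.com/a.png" '
    'srcset="https://example.com/a-2x.png 2x, https://example.com/a-3x.png 3x">\n'
    "<div style=\"background-image: url('https://example.com/bg.png')\"></div>"
)
print(extract_html_urls(sample))
print(extract_markdown_urls('See [docs](https://example.com/docs "Docs").'))
```

An Auto mode could run both functions plus a plain regex pass and merge the results, which is one plausible reason it is slower than Plain text mode.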
FAQ
Why does Auto mode return more URLs than Plain?
Plain regex won't follow HTML quoting reliably — it'll match https://example.com">link as one URL with junk on the end. Auto mode parses HTML properly.
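The over-matching is easy to reproduce. The snippet below is a small illustration with an assumed, deliberately simple pattern (`NAIVE`); the tool's actual regex is not shown on this page. A second assumed pattern (`STRICTER`) stops at quotes and angle brackets, which helps but is still no substitute for parsing the markup.

```python
import re

# Assumed stand-in for a simple plain-text pattern: scheme, then non-whitespace.
NAIVE = re.compile(r"https?://\S+")
# Slightly stricter variant that also stops at quotes and angle brackets.
STRICTER = re.compile(r"https?://[^\s\"'<>]+")

snippet = '<a href="https://example.com">link</a>'

print(NAIVE.findall(snippet))     # ['https://example.com">link</a>']
print(STRICTER.findall(snippet))  # ['https://example.com']
```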
Does it follow links to find more?
No — extraction is local, on the text you paste. Crawling needs a server.
How is trailing punctuation (commas, periods) handled?
Stripped automatically: the text "...visit https://example.com." yields the URL https://example.com, with no trailing period.
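One way to get this behavior is to match greedily and then trim punctuation from the end of each match. The sketch below assumes a trimmed character set of `.,;:!?` and a hypothetical function name `extract_urls`; the tool's exact rules may differ.

```python
import re

URL = re.compile(r"https?://[^\s\"'<>]+")
# Assumed set of characters treated as sentence punctuation, not URL content.
TRAILING = ".,;:!?"


def extract_urls(text):
    """Find URLs, then strip punctuation that trails each match."""
    return [match.group().rstrip(TRAILING) for match in URL.finditer(text)]


print(extract_urls("Please visit https://example.com. Thanks!"))
print(extract_urls("See https://example.com/a, https://example.com/b."))
```

Trimming only from the end leaves punctuation inside the URL untouched, so paths such as /v1.2/file.txt survive intact.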
What about scheme-relative URLs (//cdn.example.com/x)?
Currently skipped — they require a base URL to be meaningful. Add the protocol manually if you need them included.
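If you do want scheme-relative URLs in the list, they can be resolved before pasting. The sketch below shows two options in Python: prefixing a scheme directly, or resolving against a base URL with the standard library's `urljoin`. The helper name `add_scheme` and the base URL are placeholders chosen for illustration.

```python
from urllib.parse import urljoin


def add_scheme(url, scheme="https"):
    """Prefix a scheme onto a scheme-relative URL; leave others unchanged."""
    if url.startswith("//"):
        return f"{scheme}:{url}"
    return url


# Option 1: add the protocol manually.
print(add_scheme("//cdn.example.com/x"))

# Option 2: resolve against the page the text came from (placeholder base).
base = "http://example.com/index.html"
print(urljoin(base, "//cdn.example.com/x"))
```

With `urljoin`, the result inherits the scheme of the base URL, which is why a base is needed for these references to be meaningful.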