WHOIS / RDAP · Tranco graph

Registration data for ~1M Tranco-listed domains

Structured registry fields for domains drawn from the Tranco list, shipped as a gzip-compressed tar archive containing a single CSV, with checksums for reproducible research.

Download archive

Archive: whois_results.csv.tar.gz · inside: whois_results.csv · extract: tar -xzf whois_results.csv.tar.gz
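
For scripted pipelines, the same extraction can be done with Python's standard library instead of the tar command. The sketch below assumes the CSV is stored in the archive under exactly the member name whois_results.csv; if the archive stores it with a leading ./ or a directory prefix, the lookup needs adjusting. The function name extract_csv and the destination directory are illustrative, not part of the release.

```python
import shutil
import tarfile
from pathlib import Path


def extract_csv(archive_path, dest_dir, member_name="whois_results.csv"):
    """Copy one regular-file member out of a gzip tar archive; return its path."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Write under the member's base name only, never a path taken from the archive.
    target = dest / Path(member_name).name

    with tarfile.open(archive_path, mode="r:gz") as archive:
        member = archive.getmember(member_name)  # raises KeyError if absent
        if not member.isfile():
            raise ValueError(f"{member_name} is not a regular file in the archive")
        with archive.extractfile(member) as source, open(target, "wb") as out:
            shutil.copyfileobj(source, out)  # streamed copy, no full read into memory
    return target
```

Pulling out a single named member, rather than calling extractall, keeps the script from writing any unexpected paths that an archive might contain.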

Registry responses captured as a linked, point-in-time snapshot, not live ranking or traffic data.

Why this dataset exists

This project is built for researchers, analysts, and security teams who need more than a raw WHOIS dump. Public registry responses vary widely across TLDs, registrars, and RDAP implementations, so practical analysis usually starts with a normalization step: align fields, record collection outcomes, preserve timestamps, and make each snapshot citable. Doing that normalization once, and publishing the result, is the main purpose of this site.

The dataset is therefore not positioned as a generic domain list. It is a point-in-time research artifact for questions such as registrar concentration, nameserver usage, expiration windows, registration churn, and failure patterns in registry metadata collection. The value comes from the packaging, repeatability, and documentation around the snapshot, not just from redistributing rows.

Snapshot

Rows

Built (UTC)

CSV SHA-256

Dataset manifest (checksums and URLs).

What makes this release useful

Normalized shape

Registry and registrar responses are collected into a stable CSV schema, so repeated studies do not need to reinvent parsing or field naming conventions.

Reproducible snapshot

Each build carries timestamps and checksums, making it easier to cite exactly which release was used in a paper, report, or internal investigation.
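
A minimal sketch of checking a downloaded or extracted file against a published SHA-256 value. It assumes you have copied the expected hex digest from the manifest by hand; the manifest's file format is not assumed here, and the function names are illustrative.

```python
import hashlib


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to bound memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, expected_hex):
    """Compare a file's digest with an expected hex string, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Recording the digest you verified alongside your analysis outputs makes it straightforward to state later which build the results came from.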

Collection visibility

Status fields and error fields are preserved instead of silently dropped, which helps distinguish missing data from collection failures or policy redaction.
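
As a rough illustration of why those fields matter, the sketch below sorts rows into four buckets for one field of interest. The column names registrant_name and error, and the list of redaction markers, are assumptions made for this example; check the column documentation for the actual schema before adapting it.

```python
import csv
import io
from collections import Counter

# Illustrative substrings only; real redaction wording varies by registry.
REDACTION_MARKERS = ("redacted", "privacy", "not disclosed")


def classify_field(row, field="registrant_name", error_col="error"):
    """Label one row: collection_failure, missing, redacted, or present."""
    if (row.get(error_col) or "").strip():
        return "collection_failure"  # the lookup itself failed
    value = (row.get(field) or "").strip()
    if not value:
        return "missing"  # lookup succeeded but the field was empty
    if any(marker in value.lower() for marker in REDACTION_MARKERS):
        return "redacted"  # field present but withheld by policy
    return "present"


def summarize(csv_text, field="registrant_name", error_col="error"):
    """Count classifications over CSV text that includes a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(classify_field(row, field, error_col) for row in reader)
```

Checking the error column first is a deliberate choice: an empty field on a failed lookup says nothing about what the registry would have disclosed.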

Research-oriented scope

The Tranco-aligned domain set makes the snapshot more consistent for Internet measurement than ad hoc lists assembled from unrelated sources.

How to use it well

Registrar and registry studies

Measure market concentration, compare TLD disclosure patterns, and identify where registration metadata is heavily redacted versus richly exposed.
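
One common way to put a number on market concentration is the Herfindahl-Hirschman index, the sum of squared market shares. The sketch below is an example of that approach, not the project's own method. It assumes rows are dictionaries (for example from csv.DictReader) with a column named registrar, which is a hypothetical name, and it only lowercases and trims registrar names; real data will likely need stronger normalization of spelling variants.

```python
from collections import Counter


def registrar_shares(rows, registrar_col="registrar"):
    """Return each registrar's share of rows that have a non-empty registrar."""
    counts = Counter()
    for row in rows:
        name = (row.get(registrar_col) or "").strip().lower()
        if name:
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {name: count / total for name, count in counts.items()}


def herfindahl_index(shares):
    """Sum of squared shares: 1.0 for a single registrar, near 0 when highly fragmented."""
    return sum(share * share for share in shares.values())
```

Rows with an empty registrar are excluded from the denominator here, so the index describes only domains where a registrar was observed; report the excluded count alongside it.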

Security enrichment

Join WHOIS fields with DNS, hosting, or abuse datasets to investigate nameserver reuse, bulk registration behavior, and lifecycle signals such as creation and expiry dates.
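
Nameserver reuse, for instance, can be explored by inverting the domain-to-nameserver mapping. The sketch below assumes rows are dictionaries with a domain column and a name_servers column holding a semicolon-separated list; both column names and the separator are hypothetical and should be replaced with whatever the column documentation specifies.

```python
from collections import defaultdict


def domains_by_nameserver(rows, domain_col="domain", ns_col="name_servers", sep=";"):
    """Build a mapping from normalized nameserver hostname to the set of domains using it."""
    index = defaultdict(set)
    for row in rows:
        domain = (row.get(domain_col) or "").strip().lower()
        if not domain:
            continue
        for raw in (row.get(ns_col) or "").split(sep):
            # Lowercase and drop a trailing root dot so equivalent names compare equal.
            nameserver = raw.strip().lower().rstrip(".")
            if nameserver:
                index[nameserver].add(domain)
    return index


def shared_nameservers(index, min_domains=2):
    """Keep only nameservers used by at least min_domains distinct domains."""
    return {
        nameserver: sorted(domains)
        for nameserver, domains in index.items()
        if len(domains) >= min_domains
    }
```

The resulting nameserver keys can then serve as join keys against DNS or hosting datasets. Shared nameservers on their own usually indicate a common DNS provider rather than common ownership.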

Methodology testing

Use the sample and column documentation to prototype parsers, filters, and SQL workflows before running heavier analysis on the full archive.

If you are evaluating whether this is useful for your workflow, start with Sample, then read Methodology before downloading the full archive.

Editorial analysis

This site now includes a longer editorial page on how to interpret sparse metadata, source differences, and failure patterns in WHOIS-style snapshots. That page is meant to help readers understand what can and cannot be inferred from registration data before they build analyses on top of it.

Read: Coverage and failure patterns in a WHOIS snapshot

What you should know before using it

Scope

This site publishes a UTF-8 CSV snapshot of WHOIS and RDAP registration fields for domains aligned with the Tranco list. The release is packaged as a gzip-compressed tar archive so that large snapshots are easier to distribute and verify.

Good for

Security research, Internet measurement, registrar studies, longitudinal metadata tracking, and teaching workflows that need a documented example dataset.

Not a substitute for

Traffic rankings, live DNS answers, malware verdicts, or content crawling. The dataset records registration metadata as observed at collection time, and registry disclosure policies differ substantially across TLDs.

Citation

When citing a release, include the build time and checksum from the manifest. Treat each build as a point-in-time release, not a live service.

For a fuller project description, see About, Methodology, Analysis, and FAQ.