How to Build an RIA List: The Practical Playbook

An RIA list built from scratch is one of those projects that looks straightforward and turns out to be a six-week engineering effort if you do it well. We've built and rebuilt this pipeline many times. Here's the playbook with the gotchas at every step.

Step 1: Decide What "RIA" Means for Your List

Sounds obvious. It's not. "RIA" can mean:

SEC-registered investment advisers only (generally firms with over $100M in AUM)
SEC + state-registered (every regulated investment adviser)
Pure-play fee-only RIAs (excluding hybrids with BD affiliation)
Hybrids (firms with both RIA and BD affiliation)
Specific advisor reps (IARs), not the firms themselves

Each definition produces a different list. The SEC-only universe is about 15,000 firms. SEC + state is closer to 35,000 firms. IARs (individual reps) number in the hundreds of thousands.

Most outreach use cases want SEC + selected state-registered firms over a minimum AUM threshold. Pick before you start.

Step 2: Pull the Universe from IAPD

IAPD (Investment Adviser Public Disclosure) is the SEC's free public portal for investment adviser data. The full dataset is downloadable as Form ADV bulk files. You want:

The Investment Adviser Firm Summary file (firm-level data)
The Investment Adviser Representative file (IAR data)
The Form ADV Part 1A and Schedule A/B files (filing details)

These are large XML or CSV files. Plan for a few gigabytes uncompressed. Parse with care. The schema changes occasionally and historical filings have different formats than current ones.

The IAPD bulk download covers SEC-registered firms thoroughly. State-only firms file with their state regulator and not always with the SEC. If you need state-registered firms, you have to pull from each state separately, which is its own project.
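A minimal sketch of streaming through a bulk XML file with Python's standard library, so a multi-gigabyte file never has to be loaded in one piece. The element and attribute names (Firm, Info, FirmCrdNb, BusNm) and the inline sample are assumptions for illustration. Check them against the schema of the file you actually download.

```python
import io
import xml.etree.ElementTree as ET


def iter_firms(xml_source, firm_tag="Firm"):
    """Yield one small dict per firm element, clearing each element after use."""
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != firm_tag:
            continue
        info = elem.find("Info")  # hypothetical child element holding identifiers
        yield {
            "crd": info.get("FirmCrdNb") if info is not None else None,
            "name": info.get("BusNm") if info is not None else None,
        }
        elem.clear()  # release the parsed subtree to keep memory use low


# Tiny stand-in for a real bulk file (assumed structure).
SAMPLE = b"""<Firms>
  <Firm><Info FirmCrdNb="1001" BusNm="Alpha Advisors"/></Firm>
  <Firm><Info FirmCrdNb="1002" BusNm="Beta Wealth"/></Firm>
</Firms>"""

firms = list(iter_firms(io.BytesIO(SAMPLE)))
```

Passing a file path instead of the BytesIO object works the same way. Because the schema changes occasionally, it is worth failing loudly (or logging) when an expected element is missing rather than silently emitting None.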

Step 3: Cross-Reference FINRA BrokerCheck

If your list includes broker-dealer activity or hybrid registration data, FINRA BrokerCheck is the parallel source. BrokerCheck doesn't have a clean bulk download (their API is restrictive and rate-limited), but you can pull individual CRD records and aggregate. For hybrid firm identification, you need both ADV and BD registration data linked by CRD number.

FINRA also publishes the BD Database (Form BD filings) which is the parallel of ADV for broker-dealers. If you need any BD data this is your source.
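Linking the two sources by CRD number can be sketched as a set comparison. This assumes you already hold one collection of CRD numbers from ADV data and one from BD data. The function names and sample values are illustrative, and real hybrid identification may also need affiliate relationships that a plain CRD match will not catch.

```python
def normalize_crd(value):
    """Strip whitespace and leading zeros so '001002' and 1002 compare equal."""
    return str(int(str(value).strip()))


def classify_channel(adv_crds, bd_crds):
    """Split firms into hybrid, RIA-only, and BD-only by CRD overlap."""
    adv = {normalize_crd(c) for c in adv_crds}
    bd = {normalize_crd(c) for c in bd_crds}
    return {
        "hybrid": adv & bd,    # registered on both sides
        "ria_only": adv - bd,  # ADV registration only
        "bd_only": bd - adv,   # BD registration only
    }


channels = classify_channel(["1001", "1002 ", 1003], ["001002", "2001"])
```

Normalizing before comparing matters more than it looks: CRD numbers pulled from different files often differ in padding or type, and a string-versus-integer mismatch silently produces an empty hybrid set.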

Step 4: Pull Form ADV Detail

Once you have the firm universe, pull the rich Form ADV detail for each firm. The key items:

Item 1 (identifying info, address, contact email)
Item 5 (employees, AUM, client types, services)
Item 7 (affiliations)
Item 9 (custody)
Item 11 (disciplinary)
Schedule A (direct owners and officers)
Schedule B (indirect owners)

Schedule A is the goldmine. It names the principal officers, their ownership band (reported as a code covering a percentage range, not an exact percentage), their CRD number, and the date they acquired their title or status. This is how you identify the controlling principal at every firm.
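One way to pick a likely controlling principal from Schedule A rows is to rank individuals by ownership band. The sketch below assumes the lettered ownership codes (NA below 5%, then A through E in ascending bands up to 75% or more). The dict keys and sample owners are hypothetical stand-ins for however your parser represents a Schedule A row.

```python
from datetime import date

# Assumed ordering of ownership codes, lowest band to highest.
OWNERSHIP_RANK = {"NA": 0, "A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def controlling_principal(owners):
    """Return the individual in the highest ownership band.

    Ties go to the person with the earliest start date. Entity owners are
    skipped here; tracing them means following Schedule B instead.
    """
    people = [o for o in owners if o["is_individual"]]
    if not people:
        return None
    return min(
        people,
        key=lambda o: (-OWNERSHIP_RANK.get(o["ownership_code"], 0), o["start_date"]),
    )


owners = [
    {"name": "Jane Doe", "is_individual": True, "ownership_code": "D", "start_date": date(2010, 1, 1)},
    {"name": "John Roe", "is_individual": True, "ownership_code": "B", "start_date": date(2008, 6, 1)},
    {"name": "Holdco LLC", "is_individual": False, "ownership_code": "E", "start_date": date(2005, 1, 1)},
]
principal = controlling_principal(owners)
```

When the top owner is an entity, as with the holding company in the sample, the named individual is a best guess and worth flagging for manual review.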

Step 5: Apply Your Criteria

With the full firm-and-officer dataset loaded, apply your filters:

AUM band (Item 5.F)
Client type composition (Item 5.D)
Affiliation status (Item 7)
Custody model (Item 9)
Disciplinary history (Item 11)
Geographic location (Item 1)
Firm age and growth trajectory (year-over-year AUM)

A typical campaign filter (SEC-registered RIAs, $250M to $1B AUM, no BD affiliation, principal office in target states, no significant disciplinary disclosures) cuts the 15,000-firm SEC universe to 1,500 to 2,500 firms.
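The typical campaign filter above can be sketched as a single predicate over a firm record. The Firm fields, the target states, and the simplification of "no significant disciplinary disclosures" to a disclosure count of zero are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Firm:
    crd: str
    sec_registered: bool
    aum_usd: float
    has_bd_affiliation: bool
    state: str             # state of the principal office
    disclosure_count: int  # simplification: any disclosure counts as significant


def matches_campaign(firm, min_aum=250e6, max_aum=1e9,
                     states=frozenset({"CA", "NY"}), max_disclosures=0):
    """True when the firm passes every campaign criterion."""
    return (
        firm.sec_registered
        and min_aum <= firm.aum_usd <= max_aum
        and not firm.has_bd_affiliation
        and firm.state in states
        and firm.disclosure_count <= max_disclosures
    )


universe = [
    Firm("1001", True, 400e6, False, "CA", 0),  # passes
    Firm("1002", True, 2.5e9, False, "NY", 0),  # AUM above band
    Firm("1003", True, 300e6, True, "CA", 0),   # BD affiliation
    Firm("1004", True, 600e6, False, "TX", 0),  # outside target states
]
targets = [f.crd for f in universe if matches_campaign(f)]
```

Keeping the criteria as named parameters makes it easy to log exactly which filter produced a given list, which helps when a count looks off later.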

Step 6: Identify the Right Person

Schedule A gives you the principal officers. The right outreach target depends on your product:

For tech and platform decisions: the COO, Director of Operations, or CTO. At small firms this is the controlling principal.
For investment-related decisions: the CIO or head of investments.
For compliance products: the CCO.
For M&A or financing: the controlling principal or CEO.

Form ADV lists officers but doesn't classify their functional role beyond title. You often need to enrich with LinkedIn or firm-website data to identify the right functional contact.
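A rough first pass at functional classification can be done on the title string alone before paying for enrichment. The keyword lists and role names below are assumptions, not a complete taxonomy. Titles that fall through to "unknown" are the ones worth sending to LinkedIn or website enrichment.

```python
import re

# Checked in order; the first matching role wins. Keywords are illustrative.
ROLE_KEYWORDS = [
    ("compliance", ("chief compliance", "cco", "compliance")),
    ("investments", ("chief investment", "cio", "portfolio")),
    ("operations", ("chief operating", "coo", "operations", "chief technology", "cto")),
    ("executive", ("chief executive", "ceo", "president", "managing member", "founder", "principal")),
]


def classify_title(title):
    """Map a free-text officer title to a coarse functional role."""
    text = title.lower()
    tokens = set(re.split(r"[^a-z]+", text))
    for role, keywords in ROLE_KEYWORDS:
        for kw in keywords:
            # Multi-word keywords match as phrases, single words as whole tokens,
            # so "cco" does not fire on words that merely contain those letters.
            if (" " in kw and kw in text) or kw in tokens:
                return role
    return "unknown"


roles = [classify_title(t) for t in (
    "Chief Compliance Officer",
    "Director of Operations",
    "Managing Member",
    "Office Manager",
)]
```

Combined titles such as "CEO and CCO" resolve to whichever role appears first in the keyword table, so order the table to match the product you are selling.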

Step 7: Verified Contact Enrichment

This is the step most teams underestimate. The ADV gives you the firm address and the compliance email. It doesn't give you the principal's direct email or mobile. Enrichment options:

Generic B2B databases (ZoomInfo, Apollo, Cognism) for email + phone, with deliverability checking
LinkedIn-derived data via dedicated tools (RocketReach, ContactOut)
Manual research on firm websites to find named officer email patterns, then validation
Specialist advisor-data vendors (us, AdvizorPro) that bundle enrichment

Whatever your method, run email validation before sending. We use a multi-step verification including MX lookup, SMTP check, and recent-activity validation. Expect 80% to 90% deliverability for well-enriched principal-level contacts. If your rate is lower than that, the bounces will damage your sending domain.
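A multi-step verification can be structured as an ordered cascade that stops at the first failing check. The sketch below implements only a syntax check itself. The MX and SMTP checks are passed in as callables, and the stub shown stands in for whatever resolver or vendor you use. It is an assumption, not a real lookup.

```python
import re

EMAIL_RE = re.compile(r"^[A-Za-z0-9._%+-]+@[A-Za-z0-9-]+(\.[A-Za-z0-9-]+)+$")


def syntax_ok(email):
    """Cheap first gate: does the address look structurally valid?"""
    return bool(EMAIL_RE.match(email))


def verify(email, checks):
    """Run checks in order; return the name of the first failure, or None."""
    for name, check in checks:
        if not check(email):
            return name
    return None


def deliverable_rate(emails, checks):
    """Share of addresses that pass every check."""
    if not emails:
        return 0.0
    return sum(1 for e in emails if verify(e, checks) is None) / len(emails)


def stub_has_mx(email):
    # Placeholder for a real MX lookup (hypothetical rule for the demo only).
    return email.split("@")[1] != "dead.example"


CHECKS = [("syntax", syntax_ok), ("mx", stub_has_mx)]
emails = ["jane@alpha.example", "not-an-email", "bob@dead.example", "ann@beta.example"]
failures = {e: verify(e, CHECKS) for e in emails}
rate = deliverable_rate(emails, CHECKS)
```

Ordering the cheapest checks first keeps vendor costs down, and recording which step failed tells you whether the problem is the enrichment source or the firm's mail setup.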

Step 8: Validate and Sample-Test

Before launching a full campaign, manually check 25 to 50 records from your list. Confirm:

The firm is who you think it is (correct channel, AUM band, services)
The named person is currently with the firm (LinkedIn check)
The email format matches the firm's other public email patterns
The phone, if direct-dial, is reachable

If your manual-check error rate is over 10%, fix the pipeline before going broader. This step is non-negotiable.
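The sample-test gate is simple enough to encode so it cannot be skipped. This sketch draws a reproducible random sample and applies the 10% threshold to the reviewer's pass/fail results. The function names and the fixed seed are illustrative choices.

```python
import random


def draw_sample(records, k=50, seed=0):
    """Pick up to k records at random; the seed makes the draw repeatable."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def pipeline_passes(review_results, max_error_rate=0.10):
    """review_results holds one boolean per checked record (True = correct).

    Returns (passed, error_rate). An error rate above the threshold fails.
    """
    if not review_results:
        raise ValueError("no reviewed records")
    error_rate = review_results.count(False) / len(review_results)
    return error_rate <= max_error_rate, error_rate


sample = draw_sample(list(range(1000)), k=50)
ok, err = pipeline_passes([True] * 44 + [False] * 6)
```

Sampling at random rather than taking the first 50 rows matters: the top of an export is often sorted by AUM or name and tends to be cleaner than the tail.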

Step 9: Refresh Cadence

RIA data decays. Principals leave, firms merge, AUM changes, registrations lapse. Annual ADV amendments rebuild much of the universe every spring. Plan a refresh cadence:

Quarterly: rebuild AUM and firm-status data
Monthly: re-check email deliverability on stale records
Per-campaign: validate the actual list you're about to email
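The cadence above can be sketched as a small scheduler that reports which refresh tasks are due. The task names and the 90-day and 30-day approximations of "quarterly" and "monthly" are assumptions. Per-campaign validation is left out because it runs every time regardless of dates.

```python
from datetime import date, timedelta

# Approximate intervals for the recurring tasks (assumed values).
CADENCE_DAYS = {"firm_and_aum": 90, "email_deliverability": 30}


def tasks_due(last_run, today):
    """Return the sorted names of tasks never run or past their interval."""
    return sorted(
        task
        for task, days in CADENCE_DAYS.items()
        if task not in last_run or today - last_run[task] >= timedelta(days=days)
    )


due = tasks_due(
    {"firm_and_aum": date(2024, 1, 1), "email_deliverability": date(2024, 3, 20)},
    today=date(2024, 4, 5),
)
```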

When to Outsource

If you're a wealthtech or fintech company running quarterly campaigns and don't have dedicated data engineering, outsource. The pipeline above takes 4 to 8 engineering weeks to build, plus ongoing maintenance. We've already built it and can deliver a list in 3 to 5 business days.

If you have data engineering and run continuous campaigns, build it in-house. The economics flip above about 200,000 records per year of usage.

Pipeline Architecture Patterns

Three patterns we've seen work for sustained RIA-list pipelines.

The batch refresh pattern. Pull bulk IAPD and ADV data monthly. Normalize and de-duplicate. Apply current filter criteria. Run enrichment against a contact-data vendor. Output a fresh CSV monthly. Simple, reliable, and good enough for most quarterly outreach motions.

The streaming pattern. Subscribe to ADV amendment feeds (if your vendor exposes them) or poll the IAPD search API. Detect changes (new registrations, amendment filings, AUM changes) and trigger enrichment workflows for changed firms only. Higher complexity but better data freshness. Used by teams that monitor watchlists.

The hybrid pattern. Maintain a base list refreshed quarterly. Run streaming detection on a watchlist subset (top 500 to 2,000 target accounts). Trigger high-value alerts on watchlist firms while keeping the broader universe fresh enough for cold outreach. This is what most enterprise wealthtech teams settle on.
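The change-detection step that the streaming and hybrid patterns share can be sketched as a diff between two snapshots keyed by CRD number. The snapshot shape, the event names, and the 10% AUM-change threshold are assumptions for illustration.

```python
def diff_snapshots(previous, current, aum_change_threshold=0.10):
    """Compare two {crd: {"aum_usd": float}} snapshots and list change events."""
    events = []
    for crd, firm in current.items():
        old = previous.get(crd)
        if old is None:
            events.append((crd, "new_registration"))
        elif old["aum_usd"] > 0:
            change = abs(firm["aum_usd"] - old["aum_usd"]) / old["aum_usd"]
            if change >= aum_change_threshold:
                events.append((crd, "aum_change"))
    for crd in previous.keys() - current.keys():
        events.append((crd, "dropped"))
    return sorted(events)


def watchlist_alerts(events, watchlist):
    """Hybrid pattern: keep only events for firms on the watchlist."""
    return [e for e in events if e[0] in watchlist]


previous = {"1001": {"aum_usd": 500e6}, "1002": {"aum_usd": 300e6}, "1003": {"aum_usd": 900e6}}
current = {"1001": {"aum_usd": 600e6}, "1002": {"aum_usd": 305e6}, "1004": {"aum_usd": 150e6}}
events = diff_snapshots(previous, current)
alerts = watchlist_alerts(events, watchlist={"1001", "1003"})
```

Only the firms that appear in the event list need to go back through enrichment, which is where the streaming pattern saves money relative to a full monthly rebuild.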

The Cost of Bad Data

If you skip any of the steps above, the cost shows up downstream. Specific failure modes we've seen:

Sending to outdated emails: a 25% bounce rate damages your sender reputation and depresses inbox placement for 4 to 6 weeks afterward.
Targeting wrong-channel firms: pitching RIA software to bank-trust-department contacts wastes AE cycles and frustrates good prospects who get the wrong pitch.
Mis-classified AUM: pitching enterprise pricing to a $200M emerging RIA kills the deal in the first call.
Wrong functional contact: hitting the office manager instead of the CIO loses an entire outreach cycle and resets the firm's interest level.

The pipeline above seems heavy because it is. Each step exists because we've watched teams skip it and lose campaigns. The good news is that once the pipeline is built, the marginal cost per list is low.

Tools That Help

If you're building in-house, a few tools are worth knowing.

For ADV bulk parsing: Python with lxml handles the XML well, and Polars (or pandas) handles the CSV files. The SEC's documentation is sparse but the schema is documented in the Form ADV instructions.

For FINRA CRD lookups: there's no open bulk API, and what access exists is restrictive and rate-limited. Most teams scrape (carefully, with rate limiting) or buy access through a data vendor.

For contact enrichment: Apollo, ZoomInfo, and Cognism for breadth. NeverBounce, ZeroBounce, or Kickbox for email validation. We use a multi-vendor cascade with our own quality scoring on top.

For LinkedIn-derived data: RocketReach, ContactOut, and Hunter cover most needs. Direct LinkedIn scraping violates ToS and is risky.

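For ADV-amendment monitoring: Form ADV is filed through IARD and published via IAPD rather than EDGAR, so the practical approach is to diff successive IAPD bulk files and flag the firms whose filings changed. Or use a vendor that's already done this.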

If you'd rather not build the pipeline, the standard build above is roughly what we run for every list. Send us the criteria and the list shows up in 3 to 5 business days.

Sanity-Checking Your Pipeline

A few sanity checks to run on any RIA list, whether you built it or bought it.

Total count. SEC-registered RIAs number around 15,000. SEC plus state-registered is closer to 35,000. If your full-universe count is wildly different, you have a problem.

AUM distribution. The distribution is heavily right-skewed. Most firms are sub-$500M. A handful of firms are over $50B. Median is well below mean. If your distribution looks symmetric (bell-shaped) instead, something's wrong with your filter or your data.

Geographic distribution. California, New York, Texas, Florida, and Illinois dominate. If your list over-indexes another state by an unexpected margin, check for filter errors.

Channel mix. Pure RIAs are roughly 30% of advisors but a much higher share of total AUM. Hybrid firms are growing. If your channel mix doesn't reflect your filter intent, recheck.

These sanity checks catch most pipeline bugs before they hit a campaign. They take 10 minutes to run and save weeks of recovery from a bad campaign.
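The count, AUM-skew, and geography checks can be scripted so they run on every list. In this sketch the acceptable count range is an assumed tolerance around the 15,000 figure, the record keys are hypothetical, and the channel-mix check is omitted because it depends on how you label channels.

```python
from collections import Counter
from statistics import mean, median

EXPECTED_TOP_STATES = {"CA", "NY", "TX", "FL", "IL"}


def sanity_report(firms, count_range=(12_000, 18_000)):
    """Run quick plausibility checks on a list of firm dicts."""
    if not firms:
        raise ValueError("empty list")
    aums = [f["aum_usd"] for f in firms]
    top_states = [s for s, _ in Counter(f["state"] for f in firms).most_common(5)]
    return {
        "count_ok": count_range[0] <= len(firms) <= count_range[1],
        "right_skewed": median(aums) < mean(aums),  # median well below mean
        "unexpected_top_states": sorted(set(top_states) - EXPECTED_TOP_STATES),
    }


# Toy data; the narrow count_range is only so the demo list passes.
demo = [
    {"aum_usd": 100e6, "state": "CA"},
    {"aum_usd": 150e6, "state": "CA"},
    {"aum_usd": 200e6, "state": "NY"},
    {"aum_usd": 300e6, "state": "TX"},
    {"aum_usd": 400e6, "state": "FL"},
    {"aum_usd": 60e9, "state": "OH"},
]
report = sanity_report(demo, count_range=(1, 10))
```

An unexpected state in the top five is not automatically a bug, since a deliberately regional campaign will show one, but it should always have an explanation.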

Frequently Asked Questions

How long does it take to build an RIA list from scratch?

A first version takes 4 to 8 engineering weeks if you're starting from raw IAPD and FINRA data. Ongoing maintenance is another 0.25 to 0.5 FTE depending on freshness requirements.

Is IAPD data free?

Yes. The SEC publishes Form ADV bulk data as a free download. FINRA BrokerCheck data is also public but has more access friction and no clean bulk export.

What's the most common mistake in building an RIA list?

Treating the ADV-listed compliance email as the right outreach target. It's not. You need person-level enrichment for principals, COOs, CIOs, or CCOs depending on your product.

How often should I refresh an RIA list?

Firm and AUM data: quarterly. Contact deliverability: monthly. Per-campaign validation: every time. Annual rebuilds align with Form ADV annual amendment season (spring).

Can I build a state-registered RIA list the same way?

Partially. State-registered firms file with their state regulator, not always with the SEC. Each state's data quality and access varies. NASAA aggregates some data, but state-by-state pulls are often required.

