Understanding how richness is calculated helps project managers gauge data quality.

Richness measures how many positive-coded documents exist relative to the total set. It's a simple ratio that reveals how much of the data matches the criteria your team is coding for. This helps project managers see whether the collection lines up with project goals, decide where to focus review effort, and report progress to the team.

Richness in your dataset isn’t a flashy metric. It’s a pragmatic compass. If you’re looking at a Relativity project, you’ll hear about richness more often than you might expect. It’s the simple idea that helps you understand how much of your file collection actually carries the meaning you care about. And yes, it’s surprisingly revealing once you see it in action.

What is richness, really?

Let me spell it out clearly. Richness is the proportion of positive-coded documents in relation to the total number of documents. In plain terms: how many documents have been tagged as positive (or relevant, or responsive) out of the entire set you’re examining.

Formula, in the simplest terms:

Richness = (number of positive-coded documents) ÷ (total number of documents)

If you’re comfortable with quick math, you’ll recognize this as a straightforward ratio. If not, think of it as a slice of the whole pie—the bigger that slice, the richer the dataset in terms of the things you’re looking for.

What does “positive-coded” mean here?

In a Relativity workflow, teams tag documents with various codes to flag relevance, responsiveness, privilege, or other criteria. A "positive-coded" document is one that has earned a tag indicating it matches the target criteria. It’s not a judgment about quality by itself; it’s a signal: this document contains something that matters for your project goals.

Why richness matters in project work

Here’s the thing: the raw number of relevant documents isn’t the only thing you want. Richness tells you how much signal there is in the dataset compared to the noise. A dataset with very few positive codes might still be perfectly valid for certain goals, but it changes how you allocate review resources, how you design sampling, and how you plan timelines.

Resource planning: If richness is high, you’ve got a solid core of relevant material to work with. If it’s low, you may need to refine search terms, adjust coding guidelines, or broaden your initial scope.
Milestone planning: Richness informs risk. A dataset where most documents aren’t positive might require more adjudication rounds, more reviewer time, or a tighter definition of “positivity.”
Stakeholder communication: It gives a concise, numbers-based way to summarize how focused or broad your data is, which helps when you’re aligning expectations with teams or clients.

A quick, tangible example

Let’s walk through a simple scenario. Suppose you’re analyzing 1,200 documents for a project. Your coding team flags 180 documents as positive.

Positive-coded documents: 180
Total documents: 1,200

Richness = 180 ÷ 1,200 = 0.15

That’s 15% richness. What does that tell you? On average, 15 of every 100 documents carry the target signal and the other 85 do not. Whether that counts as rich or sparse depends on your objective, so check the figure against your goals and timelines before you commit review resources.
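The arithmetic above can be reproduced in a few lines of Python. This is a minimal sketch rather than anything Relativity provides; the function name `richness` and its input checks are assumptions made for illustration.

```python
def richness(positive_count: int, total_count: int) -> float:
    """Return the share of positive-coded documents in a set."""
    if total_count <= 0:
        raise ValueError("total_count must be greater than zero")
    if not 0 <= positive_count <= total_count:
        raise ValueError("positive_count must be between 0 and total_count")
    return positive_count / total_count


# Figures from the example: 180 positives out of 1,200 documents.
value = richness(180, 1200)
print(f"Richness: {value:.2f} ({value:.0%})")  # Richness: 0.15 (15%)
```

Rejecting a positive count larger than the total catches a common data-entry slip, such as counting positives across the full workspace while counting the total in a filtered subset.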

Connecting richness to practical decisions

Richness isn’t a clipboard statistic you stare at and forget. It should guide concrete steps.

If richness is higher than expected: you can push faster on review cycles, knowing the corpus is rich in relevance. It can free up time to tackle more nuanced questions, like privilege or attorney-work-product considerations.
If richness is lower than expected: you might revisit search queries, consider adding or refining criteria, or performing a quick sampling to confirm the coding rules are catching the right material.
If richness changes over time: track it as you code more documents. A rising richness can signal that your search strategy is converging on the right topics; a dip might mean you’re pulling in more boilerplate material or noise.
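One way to watch richness change as coding progresses is to recompute it cumulatively after each review batch. The sketch below assumes you can export per-batch counts; the batch figures and the name `cumulative_richness` are hypothetical.

```python
# Hypothetical review batches: (documents coded, of which positive).
batches = [(200, 20), (200, 26), (200, 34), (200, 40)]


def cumulative_richness(batches):
    """Return the running richness after each batch (None while nothing is coded)."""
    running_coded = 0
    running_positive = 0
    trend = []
    for coded, positive in batches:
        running_coded += coded
        running_positive += positive
        trend.append(running_positive / running_coded if running_coded else None)
    return trend


for batch_number, value in enumerate(cumulative_richness(batches), start=1):
    label = "n/a" if value is None else f"{value:.1%}"
    print(f"After batch {batch_number}: {label}")
```

With these made-up numbers the running figure climbs from batch to batch, which is the kind of rising trend described above.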

How to calculate in a Relativity workflow

Here’s a practical way to think about the steps, without turning this into a math class.

Define what counts as positive

Agree on the codes that represent “positive” for your current objective. It’s tempting to keep adding codes, but a focused set helps keep richness meaningful.

Count the positives

Run a quick tally of how many documents carry those positive codes.

Count the total

Take the total number of documents in the dataset or the subset you’re examining.

Compute the ratio

Divide positives by total. That’s your richness.

Interpret in context

Put the number back into your project’s frame: what does this mean for coverage, risk, and planning?
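The first four steps can be sketched in Python, assuming the coded documents have been exported as simple records. The code names, document IDs, and the function `measure_richness` are all hypothetical; interpretation (the fifth step) stays with the project team.

```python
# Step 1: define what counts as positive (hypothetical code names).
POSITIVE_CODES = {"Responsive", "Hot"}

# Hypothetical export: each document lists the codes applied to it.
documents = [
    {"id": "DOC-001", "codes": {"Responsive"}},
    {"id": "DOC-002", "codes": {"Not Responsive"}},
    {"id": "DOC-003", "codes": {"Hot", "Privileged"}},
    {"id": "DOC-004", "codes": set()},
]


def measure_richness(documents, positive_codes):
    """Return positives / total, or None when there are no documents."""
    # Step 2: count documents carrying at least one positive code.
    positives = sum(1 for doc in documents if doc["codes"] & positive_codes)
    # Step 3: count the total.
    total = len(documents)
    if total == 0:
        return None
    # Step 4: compute the ratio.
    return positives / total


print(measure_richness(documents, POSITIVE_CODES))  # 0.5
```

A document counts once no matter how many positive codes it carries, which keeps the numerator in the same unit as the denominator: documents, not tags.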

A few practical tips to keep the number trustworthy

Consistency is everything: Make sure coding guidelines are clear and followed. Inconsistent tagging can inflate or deflate richness in ways that mislead decisions.
Check for duplicates: Duplicates can distort both the numerator and denominator. Clean up duplicates before you measure.
Use sampling for validation: If you’re unsure whether the positive codes are being applied consistently, do a quick spot check. A small sanity check can catch big drift.
Track changes: If you’re adding new reviewers or revising criteria, richness will shift. Track those changes so stakeholders can interpret the trend accurately.
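Two of the tips above, removing duplicates and spot-checking a sample, are easy to illustrate. The sketch below assumes each exported record carries a content hash; the field names and helper functions are hypothetical.

```python
import random

# Hypothetical export: "hash" identifies identical content.
documents = [
    {"id": "DOC-001", "hash": "a1", "positive": True},
    {"id": "DOC-002", "hash": "b2", "positive": False},
    {"id": "DOC-003", "hash": "a1", "positive": True},  # duplicate of DOC-001
    {"id": "DOC-004", "hash": "c3", "positive": False},
    {"id": "DOC-005", "hash": "d4", "positive": False},
]


def deduplicate(documents):
    """Keep the first document seen for each content hash."""
    seen = set()
    unique = []
    for doc in documents:
        if doc["hash"] not in seen:
            seen.add(doc["hash"])
            unique.append(doc)
    return unique


def richness_of(documents):
    """Return the share of positive documents, or None for an empty list."""
    if not documents:
        return None
    return sum(doc["positive"] for doc in documents) / len(documents)


def spot_check_sample(documents, size, seed=0):
    """Draw a reproducible random sample for a manual coding check."""
    rng = random.Random(seed)
    return rng.sample(documents, min(size, len(documents)))


unique_documents = deduplicate(documents)
print(richness_of(documents))         # 0.4 with the duplicate counted
print(richness_of(unique_documents))  # 0.25 after de-duplication
print([doc["id"] for doc in spot_check_sample(unique_documents, size=2)])
```

In this toy set a single duplicated positive moves richness from 25% to 40%, which shows why cleaning up first matters. The fixed seed only makes the sample repeatable for audit purposes.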

Relativity’s role in measuring richness

Relativity gives you a robust toolkit to manage the data and capture those codes. You can:

Create a defined set of positive codes and apply them across documents.
Run counts and generate quick reports showing the ratio of positives to total documents.
Build dashboards that plot richness over time or across different data sources, making it easy to compare datasets side by side.
Use filtering to examine richness within subsets—by custodian, by topic, by date range—so you can spot where signal concentrates or fades.
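Outside the platform, the same subset view can be approximated on exported data. The following sketch groups hypothetical records by custodian; it does not call any Relativity API, and the custodian names and the function `richness_by` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical export with one record per document.
documents = [
    {"custodian": "Alvarez", "positive": True},
    {"custodian": "Alvarez", "positive": False},
    {"custodian": "Alvarez", "positive": True},
    {"custodian": "Chen", "positive": False},
    {"custodian": "Chen", "positive": False},
    {"custodian": "Chen", "positive": True},
    {"custodian": "Chen", "positive": False},
]


def richness_by(documents, field):
    """Return {field value: richness} for each value of the chosen field."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for doc in documents:
        bucket = counts[doc[field]]
        bucket[0] += int(doc["positive"])
        bucket[1] += 1
    return {key: positives / total for key, (positives, total) in counts.items()}


for custodian, value in richness_by(documents, "custodian").items():
    print(f"{custodian}: {value:.0%}")
```

Passing a different field name, such as a topic or a date bucket, gives the corresponding breakdown as long as the export includes that field.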

A broader view: richness in the larger data-story

Richness is one piece of the data storytelling puzzle. Think of it as a compass that orients your project toward meaningful material. It pairs nicely with other metrics—like sampling precision, inter-rater reliability, or review throughput—to give you a fuller picture of how well your data aligns with your goals.

Treading carefully: common pitfalls

Denominator zero: If your total document count is zero or if you’ve filtered to a subset with no documents, richness isn’t defined. Handle these cases explicitly in your dashboards.
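Redefining positive codes: If you expand or narrow what counts as positive midway without re-baselining, earlier and later richness figures are no longer comparable and the trend is distorted. Recalculate under the new definition and annotate when the change happened.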
Ignoring context: A high richness in a small, highly curated dataset doesn’t automatically translate to a broader project truth. Always pair richness with context about scope and sampling.
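The zero-denominator pitfall is the easiest of these to guard against in code. Below is a minimal sketch, assuming a dashboard that can display a placeholder when the filtered subset is empty; `safe_richness` and `format_for_dashboard` are hypothetical helper names.

```python
from typing import Optional


def safe_richness(positive_count: int, total_count: int) -> Optional[float]:
    """Return richness, or None when the set is empty and the ratio is undefined."""
    if total_count == 0:
        return None
    return positive_count / total_count


def format_for_dashboard(value: Optional[float]) -> str:
    """Show a placeholder instead of a misleading 0% for empty subsets."""
    return "n/a" if value is None else f"{value:.1%}"


print(format_for_dashboard(safe_richness(180, 1200)))  # 15.0%
print(format_for_dashboard(safe_richness(0, 0)))       # n/a
```

Returning `None` rather than 0 keeps an empty subset visibly different from a subset that was reviewed and contained no positives.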

A few analogies to make it stick

Richness is like the spice level in a dish. A little spice adds flavor (signal), but too much or too little can throw off the balance. The goal is just-right richness that supports your intended outcome.
Think of a library search: richness tells you what share of the books on the shelf actually matter for your question. It’s not about finding more books; it’s about knowing how much of what you have is worth reading.

A closing thought

If you’ve ever tried to herd a dataset into shape, you know the challenge isn’t just about volume. It’s about signal. Richness gives you a clean, interpretable way to gauge how much of your data is truly relevant. With a clear definition of positive codes, a straightforward calculation, and mindful interpretation, you’ll have a reliable metric that guides decisions rather than cluttering the dashboard.

So next time you pull a collection into Relativity and start tagging, pause for a moment and ask: how rich is this dataset? The answer won’t just be a number. It’ll be a guiding line that helps you plan, prioritize, and communicate with confidence. And that kind of clarity—that’s the heart of smarter project work.
