Relativity sets a default cap of 12 million documents per index to keep queries fast and maintenance simple. Large indices can slow searches, so teams turn to sharding or separate indices when volumes grow. While configuration tweaks exist, the default serves as a planning guideline for data governance and reliable performance.

The 12-Million Rule: Why a Relativity Index Has a Natural Ceiling

If you’ve ever built an index in Relativity, you’ve probably noticed numbers popping up in the documentation or in admin dashboards. One number, in particular, stands out: 12 million. That’s the default cap on how many documents you can pack into a single index. It’s not a flashy rumor or a random limit pulled from nowhere. It’s a deliberately chosen threshold designed to keep things snappy, predictable, and manageable. Let me explain why that number exists and what it means for real-world work.

What does “12 million per index by default” actually mean?

Think of an index as a searchable container. You drop in documents and their metadata, and the system builds a rapid-access map so you can find things quickly. The 12 million cap is the upper limit on how many documents live in one of those containers under typical configurations. If your collection grows beyond that, you won’t automatically squeeze everything into one index. Instead, you’ll hit a boundary that nudges you to rethink structure before performance slips.

This cap isn’t about keeping you from storing data forever. It’s about keeping searches fast and queries predictable. When you’re hunting for a single keyword across millions of documents, latency can drift upward as the amount of data scales. The threshold of 12 million helps the system stay responsive, so you don’t feel the drag during crucial moments of discovery or review.
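The rollover idea above can be sketched in a few lines. This is a hypothetical illustration, not a Relativity API: the function name `assign_to_indices` and the ingestion model are assumptions made only to show how a fixed cap forces documents into successive containers.

```python
# Hypothetical sketch: grouping documents into sub-indexes so that no
# single index exceeds a fixed cap. `assign_to_indices` is illustrative
# only and does not correspond to any real Relativity function.

INDEX_CAP = 12_000_000  # the default ceiling discussed in this article


def assign_to_indices(doc_ids, cap=INDEX_CAP):
    """Group document IDs into consecutive buckets of at most `cap` docs."""
    buckets = []
    for i, doc_id in enumerate(doc_ids):
        if i % cap == 0:          # current bucket is full (or this is the first doc)
            buckets.append([])
        buckets[-1].append(doc_id)
    return buckets


# With a toy cap of 3, ten documents need four buckets: 3 + 3 + 3 + 1.
groups = assign_to_indices(range(10), cap=3)
print([len(g) for g in groups])   # → [3, 3, 3, 1]
```

The same logic scales to the real cap: a 30-million-document collection would land in three containers, which is exactly the "split into multiple indices" strategy discussed below.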

Why does the threshold exist in the first place?

In data environments, speed matters. People rely on search results to be timely, accurate, and consistent. Large, monolithic indices can become unwieldy: longer indexing times, heavier resource usage, and sometimes unpredictable query performance. A ceiling gives teams a clear boundary to work within and a built-in incentive to design smarter data structures.

Another way to think about it: manageability. If an index grows without bound, it becomes harder to maintain, back up, and reproduce in different environments. By keeping a sensible cap, administrators can plan more effectively, establish cleaner data boundaries, and avoid cascading performance issues that ripple into dashboards, workflows, and downstream integrations.

What if you end up needing more than 12 million?

Here’s the practical truth: sometimes a single index isn’t enough for a large, active project. When that happens, you don’t have to abandon your indexing strategy. You can adopt a few established approaches that keep things fast without turning the system into a tangled web.

  • Split into multiple indices: Create separate indices that each stay under the limit. Each index can cover a different matter, custodian group, date range, or data source. Relativity and similar platforms often support cross-index search, so you can run queries across several indices without losing cohesion.

  • Shard the data logically: Rather than loading everything into one mega index, shard by a meaningful boundary. For example, one shard per matter, or per time period, or per data source. Sharding lets you parallelize search workloads and keep per-shard performance high.

  • Archive or prune older data: If older materials aren’t needed for day-to-day discovery, consider archiving them or moving them into a separate, less-active index. That keeps current workflows nimble while preserving access to older content if needed.

  • Use specialized indexing strategies: Some environments support configurable indexing patterns, like fielded indices or summarized indices for certain searches. This lets you tailor performance for common queries without loading every document into a single container.

  • Plan for configuration in your environment: In some setups, the default cap isn’t set in stone. You may be able to adjust thresholds through configuration or via platform-specific settings. If your project falls into a high-volume category, talk to your system administrator about the practical implications of changing those limits.
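The "shard by a meaningful boundary" approach can be made concrete with a small routing sketch. Everything here is an assumption for illustration: the `shard_key` and `route_to_shards` helpers model the strategy in plain Python and are not part of Relativity or any other platform.

```python
# Illustrative sketch only: routing documents to logical shards by a
# partition key (here, the year of the document date). Per-matter or
# per-custodian keys work the same way -- just change shard_key().

from collections import defaultdict
from datetime import date


def shard_key(doc):
    """Partition boundary: one shard per calendar year."""
    return doc["date"].year


def route_to_shards(docs):
    """Group document IDs by their shard key."""
    shards = defaultdict(list)
    for doc in docs:
        shards[shard_key(doc)].append(doc["id"])
    return dict(shards)


docs = [
    {"id": 1, "date": date(2021, 5, 1)},
    {"id": 2, "date": date(2022, 3, 9)},
    {"id": 3, "date": date(2021, 11, 20)},
]
print(route_to_shards(docs))  # → {2021: [1, 3], 2022: [2]}
```

Because each shard stays well under the cap, searches can run against one shard (fast) or fan out across all of them (cross-index search), which is the trade-off the next section discusses.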

A gentle caveat: more indices mean more moving parts

Mixing in multiple indices or shards isn’t a magic fix. It introduces extra steps in management, such as ensuring consistent metadata mappings, keeping permissions aligned across shards, and coordinating searches that span several containers. There’s a trade-off between simplicity and speed. The goal isn’t to complicate things, but to keep discovery efficient and reliable as data grows.

How to plan around the 12-million cap in real-world projects

Even if you never hit the limit, it’s smart to design with the boundary in mind. Here are a few practical guidelines that can help teams stay ahead of performance concerns without turning indexing into a guessing game.

  • Start with a thoughtful partitioning strategy. Decide early how you’ll separate data by matter, by custodian, or by date. Clear boundaries make it easier to locate, retrieve, and manage documents later.

  • Establish retention and archiving policies. Not every item needs to stay in the most active index forever. A well-tuned policy keeps the newest and most relevant content front and center, while older stuff can be moved to a long-term archive.

  • Map your most frequent queries. If your daily searches tend to skim by date ranges or by specific fields (like document type or author), design indices that optimize those patterns. This reduces the load on any single container and speeds up common searches.

  • Keep an eye on growth trends. Regular data growth reviews help you spot when it’s time to re-balance, add a shard, or create a new index. A little preventive planning beats surprise slowdowns.

  • Test performance under realistic workloads. If you can simulate peak search scenarios, you’ll catch bottlenecks before they affect production work. It’s not about guessing; it’s about verifiable reliability.

  • Document your structure. A clear, written map of which data lives where—and why—helps new teammates onboard quickly and reduces the risk of misplacing items during ingestion or review.
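The "keep an eye on growth trends" guideline lends itself to a back-of-the-envelope calculation. This is an assumption-laden sketch, not a Relativity feature: `months_until_cap` is a made-up helper that simply projects linear growth against the 12-million ceiling.

```python
# Hypothetical growth check: given the current document count and an
# average monthly intake, estimate how many months remain before an
# index reaches the 12M cap. Linear growth is assumed for simplicity.

import math


def months_until_cap(current_docs, docs_per_month, cap=12_000_000):
    """Months until `cap` is reached, or None if the index isn't growing."""
    if docs_per_month <= 0:
        return None  # no growth: the cap is never reached
    remaining = cap - current_docs
    return max(0, math.ceil(remaining / docs_per_month))


# An index at 9M documents growing by 250k per month hits the ceiling
# in about a year -- plenty of notice to plan a new shard or index.
print(months_until_cap(9_000_000, 250_000))  # → 12
```

Running a projection like this during regular data-growth reviews turns "surprise slowdowns" into a scheduled re-balancing task.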

A few analogies to keep it memorable

  • Think of an index like a bookshelf. A single shelf can hold a lot of books, but if you stuff it with thousands of large volumes, finding the right book becomes a hunt. Splitting the collection into multiple shelves—each one well organized—feels more efficient and far less frustrating.

  • Or picture a city’s highway system. A single lane might handle light traffic, but as volumes swell, you add more lanes, or create ring roads to keep journeys smooth. The same logic applies to data: distribute the load so searches don’t bottleneck.

  • Consider a library’s quiet reading rooms. If one room is crowded with visitors, the atmosphere becomes hot and noisy. Multiple rooms create little oases of calm where readers can focus. In data terms, multiple indices are multiple rooms for queries to breathe.

What this means for teams working with Relativity-like environments

For teams building out discovery workflows, the 12-million cap isn’t a roadblock so much as a parameter to design around. It’s a reminder that performance is a product of both data structure and tooling, not just the number of documents you store. When you align your indexing strategy with your typical use cases—search patterns, review processes, and reporting needs—you’ll notice the difference in speed and reliability.

It’s also a reminder to stay curious about the tools you’re using. Software platforms evolve, and the exact limits can shift with updates or different editions. In some environments, administrators can adjust thresholds, or new features can offer alternative ways to search across large datasets without piling everything into a single index. Keeping an eye on release notes and platform guidance helps you plan for future scalability without sacrificing current performance.

A quick, friendly recap

  • The default maximum number of documents per index is 12 million. This cap helps ensure fast, predictable searches and straightforward management.

  • If you approach or hit the limit, you don’t have to scrap your structure. Splitting into multiple indices or shards, archiving older data, and tailoring indexing strategies are common, sensible paths.

  • Planning ahead matters. Think about partitioning, retention, and the kinds of queries you run most often. Regular reviews and testing keep your setup robust as data grows.

  • In some environments, there may be knobs you can turn to adjust thresholds, but such changes should be weighed against complexity and long-term maintainability.

A final thought—because it helps to keep things human

Data systems are built by people who want to move fast and keep accuracy intact. The 12-million cap embodies that balance: enough room to handle everyday workloads, but a ceiling that nudges teams to design with purpose rather than to stumble into performance problems later. If you’re coordinating discovery work on Relativity or similar platforms, you’ll likely feel the benefits when your indexing strategy is clear, modular, and aligned with how you actually search and review.

If you’re curious to explore more about how indexing, search performance, and data organization interact in real-world projects, you’ll find a lot of useful, practical perspectives in industry discussions, vendor documentation, and community insights. The goal isn’t to chase a perfect number, but to keep your workflow smooth, your findings accessible, and your team confident in the path forward. After all, good structure makes the work feel less like a sprint and more like a well-tuned collaboration.
