Demystifying Data Storage for Nonprofits

Most organizations are awash in data. They spend significant amounts of time and money attempting to harness and use vast amounts of information, yet many still find it nearly impossible to identify what data is actually important, what’s missing, and what connects to what (and how).

Data management is critical for membership organizations — nonprofits in particular, but also industry associations, unions, and others — as they look to remain competitive in a modern marketplace where success increasingly depends on data-driven decision making. They need to be able to facilitate the orderly, secure, and intelligent use of their data, and proper storage is the key.

But for some, fully wrapping their heads around what data storage even means — much less understanding which of the many available options is right for them — can make the solution seem just as complex as the problem.

It doesn’t have to be. This article offers a breakdown of the different types of data storage available, helping member-driven organizations decide which is right for them.

Different types of data, different types of storage

Data is typically characterized in one of two ways: either raw (unstructured) or processed (structured). Raw data is stored just as it was collected, while processed data has been cleaned and formatted for a specific purpose.

A list of volunteers would be classified as raw data. So would survey results. But if those results were paired with the names of individual volunteers who completed the survey, and if those volunteers were then matched to giving histories and collated into a spreadsheet to determine whether donors responded differently than non-donors — well, that would be characterized as processed data.
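To make the distinction concrete, here is a minimal Python sketch of that same scenario. Everything in it is hypothetical: the names, the `satisfaction` score, the gift amounts, and the field names are invented for illustration. The three lists at the top are the raw data; the joined rows and the donor versus non-donor summary are the processed data.

```python
from statistics import mean

# Hypothetical raw inputs: three separate, unconnected sets of records.
volunteers = [
    {"volunteer_id": 1, "name": "Ana"},
    {"volunteer_id": 2, "name": "Ben"},
    {"volunteer_id": 3, "name": "Chi"},
    {"volunteer_id": 4, "name": "Dev"},
]
survey_results = [
    {"volunteer_id": 1, "satisfaction": 5},
    {"volunteer_id": 2, "satisfaction": 3},
    {"volunteer_id": 3, "satisfaction": 4},
    {"volunteer_id": 4, "satisfaction": 2},
]
giving_history = [
    {"volunteer_id": 1, "amount": 50.0},
    {"volunteer_id": 1, "amount": 25.0},
    {"volunteer_id": 3, "amount": 100.0},
]


def build_processed_rows(volunteers, survey_results, giving_history):
    """Join the three raw sources into one row per survey respondent."""
    names = {v["volunteer_id"]: v["name"] for v in volunteers}
    total_given = {}
    for gift in giving_history:
        vid = gift["volunteer_id"]
        total_given[vid] = total_given.get(vid, 0.0) + gift["amount"]

    rows = []
    for response in survey_results:
        vid = response["volunteer_id"]
        rows.append({
            "name": names.get(vid, "unknown"),
            "satisfaction": response["satisfaction"],
            "total_given": total_given.get(vid, 0.0),
            "is_donor": vid in total_given,
        })
    return rows


def average_by_donor_status(rows):
    """Average the survey score separately for donors and non-donors."""
    groups = {True: [], False: []}
    for row in rows:
        groups[row["is_donor"]].append(row["satisfaction"])
    return {
        "donors": mean(groups[True]) if groups[True] else None,
        "non_donors": mean(groups[False]) if groups[False] else None,
    }


processed = build_processed_rows(volunteers, survey_results, giving_history)
summary = average_by_donor_status(processed)
print(summary)
```

None of the raw lists changes along the way. The processed rows are a new, purpose-built view layered on top of them.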

Just as there are different types of data, there are different types of storage options better suited to each:

Data Lakes

Data lakes house raw data. A data lake (a low-cost repository that holds large volumes of diverse data, both structured and unstructured, from an array of sources) allows giving histories, demographic information, email engagement data, transaction history, and event RSVP lists to all float together, even if there’s no specific purpose or apparent connection.

There are many reasons why an organization might pick a data lake. It may be contemplating a future use for its data, or may simply wish to ensure certain information is captured and stored somewhere. While this option typically requires an incredibly large storage capacity, the data remains flexible, and can be used and reused as often as the organization wishes.

Further, because the data isn’t structured, a data lake is well-suited to machine learning and AI pattern recognition, which can quickly transform vast amounts of seemingly disconnected data into discernible relationships. A bar association, for example, could store data from social media campaigns directed at its member lawyers alongside streaming data from its continuing legal education seminars in a single data lake to support real-time analysis.
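As a rough illustration of what "floating together" can look like in practice, the sketch below lands three differently formatted exports in one folder tree without changing them. It is a toy under stated assumptions: a local temporary directory stands in for low-cost cloud storage, and the `land_raw` helper, the source names, and the source/date folder layout are all hypothetical choices, not a prescribed design.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def land_raw(lake_root, source, filename, payload, day):
    """Store a payload unchanged under <lake_root>/<source>/<YYYY-MM-DD>/."""
    target_dir = Path(lake_root) / source / day.isoformat()
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / filename
    target.write_bytes(payload)  # no parsing, cleaning, or reformatting
    return target


# A temporary directory stands in for the lake's storage (assumption).
lake_root = Path(tempfile.mkdtemp(prefix="lake_"))
day = date(2024, 1, 15)  # fixed date so the example is reproducible

# Three hypothetical sources in three different formats, stored side by side.
land_raw(lake_root, "email_engagement", "opens.csv",
         b"member_id,opened\n17,1\n", day)
land_raw(lake_root, "event_rsvps", "gala.json",
         json.dumps([{"member_id": 17, "rsvp": "yes"}]).encode("utf-8"), day)
land_raw(lake_root, "transactions", "export.txt",
         b"17|2024-01-02|50.00\n", day)

# List everything in the lake, relative to its root.
stored = sorted(
    p.relative_to(lake_root).as_posix()
    for p in lake_root.rglob("*")
    if p.is_file()
)
print(stored)
```

Nothing here decides how the files relate to one another. That decision is deferred until someone has a question to ask, which is where both the flexibility and the navigation difficulty come from.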

Pros: Flexibility and cost-effectiveness

Cons: Complicated to navigate; usually best accessed by highly skilled data scientists with specialized tools

Data Warehouses

Data warehouses organize processed data. If a data lake is waves of data crashing onto the shore, then a data warehouse — a digital storage system that connects large volumes of data to power business intelligence, reporting, and analytics — is a shelving system with aisles of data in precise formation.

Data warehouses store only data that will be used, thus requiring far less storage capacity than data lakes. Data in a warehouse has been extracted from multiple sources and then cleaned, filtered, organized, and arranged for a specific purpose (i.e., turned into processed data to enable easy reporting, querying, and other analyses).
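The extract, clean, filter, and organize sequence described above can be sketched in a few lines of Python. This is only an illustration under assumptions: an in-memory SQLite database stands in for a real warehouse, and the two exports, their column names, and the cleaning rules are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw exports from two different systems.
EMAIL_EXPORT = """email,opens
 Ana@Example.org ,4
ben@example.org,0
,7
"""
GIFTS_EXPORT = """email,amount
ana@example.org,50
ANA@example.org,25
chi@example.org,not available
"""


def extract(text):
    """Parse a CSV export into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def clean_email(value):
    """Standardize emails so records from both systems can be matched."""
    return value.strip().lower()


def transform_opens(rows):
    """Keep rows with an email; convert the open count to an integer."""
    return [
        (clean_email(r["email"]), int(r["opens"]))
        for r in rows
        if clean_email(r["email"])
    ]


def transform_gifts(rows):
    """Keep rows with an email and a numeric amount; drop the rest."""
    cleaned = []
    for r in rows:
        email = clean_email(r["email"])
        try:
            amount = float(r["amount"])
        except ValueError:
            continue  # filter out unusable amounts
        if email:
            cleaned.append((email, amount))
    return cleaned


# Load the cleaned rows into typed tables (SQLite as a stand-in warehouse).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE email_opens (email TEXT PRIMARY KEY, opens INTEGER NOT NULL)")
conn.execute("CREATE TABLE gifts (email TEXT NOT NULL, amount REAL NOT NULL)")
conn.executemany("INSERT INTO email_opens VALUES (?, ?)",
                 transform_opens(extract(EMAIL_EXPORT)))
conn.executemany("INSERT INTO gifts VALUES (?, ?)",
                 transform_gifts(extract(GIFTS_EXPORT)))

# Reporting is now a single query across both sources.
report = conn.execute("""
    SELECT o.email, o.opens, COALESCE(SUM(g.amount), 0) AS total_given
    FROM email_opens AS o
    LEFT JOIN gifts AS g ON g.email = o.email
    GROUP BY o.email, o.opens
    ORDER BY o.email
""").fetchall()
print(report)
```

The cleaning work happens once, before loading, so every later report starts from data in a standard format. The trade-off is that the table definitions lock in decisions made at setup time.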

Data warehouses capture data across an entire organization, rather than being limited to one segment. They are particularly well-suited for users like nonprofits, which have data spanning multiple functional units, including their programmatic work, major gifts, direct mail, email, and advertising programs.

Pros: Speed; enhanced business intelligence; data organized into a standard format

Cons: Once the warehouse has been created, it is difficult to change that initial setup

Data Lakehouses

Data lakehouses combine signature aspects of data lakes and data warehouses in one place.

A data lakehouse pairs the structure and data management features of a data warehouse with the low-cost storage and flexibility of a data lake. It provides a single place to store all of an organization’s data — unstructured, structured, and semi-structured — and to use both machine learning/AI and available business intelligence tools.
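A toy Python sketch can show the basic idea, though real lakehouse platforms do far more than this. Every name here is an assumption made for illustration: the `CATALOG` dictionary plays the role of the data management layer, JSON-lines files in a temporary directory stand in for low-cost lake storage, and an in-memory SQLite database stands in for the query engine a business intelligence tool would use.

```python
import json
import sqlite3
import tempfile
from pathlib import Path

# Hypothetical catalog: table name -> required columns and their types.
CATALOG = {"gifts": {"member_id": int, "amount": float}}


def write_raw(lake_root, table, records):
    """Append records as JSON lines; the file keeps every record as given."""
    path = Path(lake_root) / f"{table}.jsonl"
    with path.open("a", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")
    return path


def read_governed(lake_root, table):
    """Yield only the rows that satisfy the catalog schema for this table."""
    schema = CATALOG[table]
    path = Path(lake_root) / f"{table}.jsonl"
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            record = json.loads(line)
            if all(isinstance(record.get(col), typ) for col, typ in schema.items()):
                yield tuple(record[col] for col in schema)


lake_root = tempfile.mkdtemp(prefix="lakehouse_")
raw_path = write_raw(lake_root, "gifts", [
    {"member_id": 17, "amount": 50.0, "note": "gala"},  # extra field stays in the file
    {"member_id": 17, "amount": 25.0},
    {"member_id": "unknown", "amount": 10.0},  # fails the schema check on read
])

# The same files can now be queried with SQL through the governed view.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE gifts (member_id INTEGER, amount REAL)")
conn.executemany("INSERT INTO gifts VALUES (?, ?)", read_governed(lake_root, "gifts"))
totals = conn.execute(
    "SELECT member_id, SUM(amount) FROM gifts GROUP BY member_id"
).fetchall()
print(totals)
```

The raw file still contains all three records, including the extra field and the malformed row, so it remains available for exploratory or machine learning work. The governed view exposes only what matches the schema.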

Pros: Reduced data redundancy; supports direct access via business intelligence tools; applies data governance rules; can effectively contain costs

Cons: Still relatively new and far less mature than the other options

How they work together

Not only can data lakes and warehouses coexist, but their complementary features actually form two parts of a truly strategic whole for IT. Integrating both is increasingly central to a cohesive data-storage strategy, one where technology is more manageable, scaling is easier, and workloads can run without dedicated servers to maintain.

We’re now seeing the line between storage (lakes) and computation (warehouses) increasingly blur. Rather than drawing hard distinctions based on where and in what format data is stored, organizations can give end users access to data regardless of the underlying infrastructure: any authorized stakeholder, using any tool, can analyze and process the data that is critical to the organization’s continued growth and success.

The first and most critical step for a membership-driven organization is to choose the data storage strategy that’s best for its specific needs. Once it does, a world of possibility opens up. The right approach will give the organization a better sense of what data is truly important, what data is missing, and how seemingly disorganized data can be reconfigured and connected in order to measure outcomes, improve performance, and enable the data-informed decision making that is so important to its ultimate goal: furthering mission and purpose.
