Updated: Oct 10
The term “dark data” is tossed around often in the corporate sphere, but does anyone really know what it is? Dark data encompasses all data that organizations create and store as a part of regular business operations and processes, but never use. The data is usually unstructured, uncategorized, scattered, and as a result, it is often unknown to the company or the users. Some common examples of dark data include files from previous employees, ignored downloaded email attachments, and drafts of work stored on personal laptops. The list of places where dark data may reside is endless, which can leave anyone charged with information governance and/or IT within organizations feeling overwhelmed and unsure of what steps are needed to truly secure their data environments.
It is important to get a handle on dark data because, although it is ‘dark’, that does not mean that the data is useless. In fact, a good amount of dark data may be valuable and could be integrated into a corporation’s critical data set by using proper extraction and analytical tools.
Many times, dark data flies under the radar because a specific employee forgets about it or dubs it unimportant to their department, even though it might hold value for another department. Being unable to tap into the additional value of these data assets could result in lost income or missed business opportunities. Organizations need to take charge, devise a plan, and learn where their dark data resides. Taking the following steps can greatly help with dark data management.
1. Categorize all company data.
It is crucial to consider all the types of data that employees, vendors, and customers may generate and then create a data classification system composed of information tiers. Data classification systems should be an integral component of an organization’s information governance program because categorized data is easier to find and retrieve. It is imperative to consider every scenario in which someone might generate data on behalf of the corporation, and then where that information could be hiding. The team handling data categorization should also label which data is meaningful, sensitive, private, or stale.
Using this type of labeling system will help organizations identify, retrieve, classify, and purge stale dark data. The benefits of having a predictable data classification system include improved time management, cost savings, better risk management, and the ability to gain new business intelligence.
2. Implement a framework around designated data categories.
This step is even more important than categorization, as it provides insight into the types of data that fit into each category. Each category should include data definitions that denote its risk level. In addition to standard policies around using the data classification system, it is also important to create internal policies on dark data classification and retention. Lastly, to safeguard confidential data, an organization’s information governance plan should account for standard security protocols and any special measures needed to protect dark data.
Including the risk level for each category of data will help determine which data needs additional protection.
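As a rough illustration of a category framework with risk levels, the sketch below records a definition, a risk level, and a retention period for each category, then lists the categories that call for extra protection. The category names, definitions, and retention periods are hypothetical placeholders, not recommendations; an actual framework would come from the organization's own information governance policy.

```python
from dataclasses import dataclass
from enum import IntEnum


class RiskLevel(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class DataCategory:
    name: str
    definition: str
    risk: RiskLevel
    retention_days: int  # hypothetical retention period, for illustration only


# Hypothetical categories; real tiers depend on the organization's policies.
CATEGORIES = [
    DataCategory("public", "Material cleared for external release", RiskLevel.LOW, 365),
    DataCategory("internal", "Routine working documents", RiskLevel.MEDIUM, 730),
    DataCategory("confidential", "Client, personal, or privileged data", RiskLevel.HIGH, 2555),
]


def needs_extra_protection(categories, threshold=RiskLevel.HIGH):
    """Return the names of categories at or above the given risk threshold."""
    return [c.name for c in categories if c.risk >= threshold]
```

Calling `needs_extra_protection(CATEGORIES)` returns only the high-risk category names, while lowering the threshold to `RiskLevel.MEDIUM` widens the list.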
3. Identify and capture dark data.
Figuring out where employees and other business partners house corporate data, and then capturing it, is one of the bigger obstacles in the management process. At this step, organizations should be able to identify where the data resides. However, capturing dark data can be a huge undertaking if it relies only on manual effort, since so much of the data is unstructured. Using technological aids can help streamline and simplify this process.
Organizations must still be careful about what methods they deploy to capture dark data to ensure nothing interferes with their internal systems and business operations. It is important to thoroughly vet any third-party vendors assisting in the process in order to uphold company policies and maintain appropriate data security during the captures.
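One simple way to start identifying candidates is to look for files that have not been modified in a long time. The sketch below is a minimal, read-only example that assumes the data sits on a file system the script can reach; the function name `find_stale_files` and the 365-day default are illustrative assumptions, and modification time is only a rough proxy for whether data is actually unused.

```python
import time
from pathlib import Path


def find_stale_files(root, max_age_days=365, now=None):
    """Yield (path, age_in_days) for files under root not modified within max_age_days.

    Only file metadata is read; nothing is moved, changed, or deleted.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    for path in Path(root).rglob("*"):
        if path.is_file():
            mtime = path.stat().st_mtime
            if mtime < cutoff:
                yield path, int((now - mtime) // 86400)
```

Because the scan only reads metadata, it should not alter the files it inspects, although a large scan can still place load on shared storage and is best scheduled with that in mind.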
4. Organize the captured dark data.
After identifying and extracting dark data, organizations will need to analyze and place the data into the requisite categories. Consider using an auto-classification tool like Microsoft 365 to help with this step. Using these resources can simplify this process with minimal intervention, allowing the data team to focus on analyzing the dark data to derive the potential benefits.
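To make the idea of auto-classification concrete, here is a deliberately simple, rule-based sketch that assigns a label to a piece of text by keyword matching. It is not how Microsoft 365 or any commercial tool works; the labels, keywords, and the `classify` function are hypothetical and would need to be replaced with rules drawn from the organization's own classification system.

```python
import re

# Hypothetical rules, checked in order; the first match wins.
RULES = [
    ("private", re.compile(r"\b(passport|date of birth|home address)\b", re.I)),
    ("sensitive", re.compile(r"\b(confidential|privileged|salary)\b", re.I)),
]


def classify(text, default="unclassified"):
    """Return the label of the first rule whose pattern appears in the text."""
    for label, pattern in RULES:
        if pattern.search(text):
            return label
    return default
```

Ordering the rules from most to least restrictive means a document that matches several rules receives the stricter label, and anything that matches no rule is left as `unclassified` for a person to review.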
Being able to harmonize unstructured and structured data is key to improving operational and investment decisions. Unfortunately, many organizations fail to tap into their dark data and lose out on these insights. By going through the steps above, organizations can start to get a handle on their dark data. While the initial costs and time investment may seem high, the resulting benefits can far outweigh them. Just having an internal conversation about where dark data resides can help jumpstart the process and lead to improved information governance and business intelligence. Ultimately, organizations that make use of their dark data can make smarter decisions and generate better outcomes.
For more information on how to handle your dark data, download our latest whitepaper: Lighting up dark data: How law firms extract value from hidden information
For more information, please contact:
Caroline Woodman, Senior Vice President and Managing Director, Asia, Epiq