Unstructured Data Growth and AI Give Rise to Data Services

Data services involve the management of data throughout its lifecycle. Here are the benefits of taking a data services approach.

Managing storage technologies has always been a heavy load for IT teams. When all data was created and kept within the walls of an organization, often on multiple storage appliances and sometimes from different vendors, the IT manager responsible for data storage was historically a hardware professional. The focus was on device uptime and installation, upgrades, procurement, and integrations.

Today's world of data storage is vastly different:

  • Gartner predicts that by 2026, large enterprises will triple their unstructured data capacity across their on-premises, edge, and public cloud locations, compared to 2023.

  • Hybrid cloud and edge computing are now the predominant models for IT workloads, pushing data storage outside of the corporate data center.

  • Shadow IT brought on by the cloud has complicated matters: How much data does the organization own and where does it reside?

  • Finally, IT executives are seeing the imperative of efficiently curating the right data sets across their petabytes of increasingly hybrid, multicloud storage to feed new AI tools.

IT infrastructure leaders are no longer just managing storage: they are now responsible for managing data and for delivering data services. These data services include protecting data, managing compliance requirements, archiving data, managing data lifecycle and costs, and, yes, even deleting data when it's no longer needed. Responsibility for data services also means making the right data easily available to end users and tools (such as cloud AI) while keeping costs optimized.

Defining Data Services in Data Management and Storage

Data services is a broad term that describes a range of activities typically provided by enterprise IT, such as: data processing, data integration, data security, data reduction, data protection, data storage, and unstructured data management.

As it relates to data storage and unstructured data management, data services involve the management of data throughout its lifecycle. Beyond primary storage, they cover analysis and reporting on data storage growth and costs (including departmental showback), data usage, and self-service file search and tagging, along with data mobility use cases such as data migration, data tiering, replication, and deletion. This new approach requires the ability to understand data usage and manage data independently of storage.

A data services approach can deliver the following benefits:

  • Holistic visibility and granular search across multiple storage systems and clouds;

  • Analytics and insights on data types and usage for more accurate storage decisions;

  • Automated, policy-driven actions based on that analysis;

  • Reduced security and compliance risks;

  • Full use of data wherever it is stored, especially in the cloud;

  • User self-service access to support departmental and research needs for data storage, management, and AI workflows;

  • Greater flexibility to adopt new storage, backup, and DR technologies because data is managed independently of any vendor technology.

Data Services in Action

To better understand the potential of data services, here are examples across higher education, biotechnology, energy, media & entertainment, and retail.

Midsize university: A storage administrator can look across all the shares in the university to search for anomalies that pose risk: files belonging to people no longer with the university, sensitive files that aren't being stored in the right location, and old video files from the website team that are taking up a lot of capacity and are no longer needed. 

Medical device maker: A company with regulated products must regularly answer questions from auditors about its data: What is it, who owns it, and how is it used? Ensuring that data is being stored and protected according to various regulations such as HIPAA is imperative to avoid large fines and penalties. The company is also working to bring in "shadow" data from remote sites so that IT can ensure permissions and other protections are up to date. Using a solution that indexes all unstructured data is the key to accomplishing these critical compliance tasks.

Life sciences: The central IT team at a global pharmaceutical firm has been tiering cold data to the cloud but now wants to give its research teams the ability to identify and tag their project files for later use. This approach flips the dynamic from IT having to police data storage to providing an analytics-based data service to the business.

Energy: A global oil and gas services provider with data centers around the world is modernizing and optimizing its infrastructure, moving workloads and storage to multiple cloud service providers to shrink data center space and reduce capital expense (CAPEX) spending. The company uses the cloud to deliver digital services to its customers, such as managing data generated from equipment at the bottom of the ocean. By adopting a data services approach and toolset, the energy company tiered 85% of its data to far cheaper archival storage in the cloud. A centralized data services strategy and tools help the company easily move data from one platform to another, rather than using various point migration tools. Data services are now central to how IT operates, delivering cost savings, risk management, and flexibility.

Retail: A national retail conglomerate went through a divestiture, shuttering a few of its brands and standardizing its IT infrastructure. The process resulted in a large quantity of zombie files from technologies no longer in use, such as Microsoft .pst files. A data services toolset allowed the organization to quickly find the unwanted files across all storage and delete them. With a simpler, more streamlined data environment, the company is in a great position to grow in its next stage of evolution.

Entertainment: The storage director at a large Hollywood studio was investigating why certain shares were taking an excessively long time to back up. By running analysis on the data, the director discovered that many old files belonged to a handful of users. Once the users were made aware of the situation, they deleted the files, which improved backup cycles. The IT organization plans to give departmental data managers access to the data management solution. That way, they can view their own data and tag groups of folders or shares for data management actions such as tiering to cold storage or deletion altogether. This puts data owners in control of their data, while also helping IT meet its objectives.

A Roadmap to Data Services

There's no single right way to transition to data services, but analysis is at the heart of the matter. Using data analytics and data management to understand data usage, data growth, and data costs across storage and cloud environments is a good start. Other core requirements include allowing data teams to search and tag data based on share-based access permissions. These tags can then inform central IT to execute automated policies, such as deleting project files that are more than three years old.
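
As a rough illustration of a tag-driven policy like the one described above, the following Python sketch selects files that carry a given tag and are older than three years. The `FileRecord` type, the `select_for_deletion` function, the tag names, and the sample paths are all hypothetical; a real data management product would expose its own index and policy engine, and a production policy would typically add review and approval steps before anything is deleted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class FileRecord:
    """Hypothetical index entry for one file (not any vendor's schema)."""
    path: str
    tags: frozenset          # tags applied by data owners, e.g. {"project"}
    last_modified: datetime


def select_for_deletion(records, tag, max_age_days, now):
    """Return paths of files that carry `tag` and were last modified
    more than `max_age_days` days before `now`."""
    cutoff = now - timedelta(days=max_age_days)
    return [r.path for r in records
            if tag in r.tags and r.last_modified < cutoff]


# Illustrative data only.
now = datetime(2024, 6, 1)
records = [
    FileRecord("/proj/a/report.docx", frozenset({"project"}), datetime(2020, 1, 15)),
    FileRecord("/proj/b/data.csv", frozenset({"project"}), datetime(2023, 9, 1)),
    FileRecord("/hr/policy.pdf", frozenset({"hr"}), datetime(2019, 3, 2)),
]

# Policy: "project" files more than three years old become deletion candidates.
candidates = select_for_deletion(records, "project", 3 * 365, now)
print(candidates)
```

The sketch only produces a candidate list; keeping selection separate from the destructive action makes it easier to report on a policy's impact before enforcing it.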

New metrics for IT infrastructure teams guide the way. Whereas traditional metrics measured things like latency, IOPS, network throughput, uptime and downtime per year, and backup windows, new metrics give insights on the data.

These include but aren't limited to:

  • top data owners

  • percentage of "cold" files that haven't been accessed in a year or more

  • most common file size and type

  • storage costs per department

  • storage costs per vendor per TB

  • percentage of backups reduced

  • rate of data growth
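
A few of these metrics can be sketched in Python over a simple file inventory. Everything here is an assumption for illustration: the row layout, the names `top_owners`, `cold_percentage`, and `cost_per_department`, and the flat rate of 2 cents per GB are placeholders rather than figures or APIs from any real tool.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Hypothetical inventory rows: (owner, department, size_gb, last_accessed)
inventory = [
    ("alice", "research", 400, datetime(2022, 1, 10)),
    ("bob", "research", 100, datetime(2024, 5, 1)),
    ("alice", "marketing", 300, datetime(2021, 7, 4)),
    ("carol", "marketing", 200, datetime(2024, 4, 20)),
]
NOW = datetime(2024, 6, 1)
COST_CENTS_PER_GB = 2  # placeholder rate, not a real price


def top_owners(rows, n):
    """Owners ranked by total capacity consumed."""
    totals = Counter()
    for owner, _dept, size_gb, _accessed in rows:
        totals[owner] += size_gb
    return totals.most_common(n)


def cold_percentage(rows, now, days=365):
    """Share of files (by count) not accessed within `days` days."""
    if not rows:
        return 0.0
    cutoff = now - timedelta(days=days)
    cold = sum(1 for *_rest, accessed in rows if accessed < cutoff)
    return 100.0 * cold / len(rows)


def cost_per_department(rows, cents_per_gb):
    """Simple showback: capacity per department times a flat rate."""
    totals = defaultdict(int)
    for _owner, dept, size_gb, _accessed in rows:
        totals[dept] += size_gb
    return {dept: gb * cents_per_gb for dept, gb in totals.items()}


print(top_owners(inventory, 2))
print(cold_percentage(inventory, NOW))
print(cost_per_department(inventory, COST_CENTS_PER_GB))
```

Measuring cold data by file count is one choice among several; measuring it by capacity would often be more useful for tiering decisions.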

It is now also vital to monitor data for anomalies, such as the access or movement of many files from one location to another or a massive deletion of files. Both could indicate a security incident. As organizations invest more in AI, tools and processes to track and document the inputs and outputs of data from these technologies will help minimize privacy, security, legal, and IP risks.
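
One very simple way to surface this kind of anomaly is to count file operations per user per hour and flag any count above a threshold. The sketch below assumes a hypothetical audit log of `(user, action, timestamp)` events and an arbitrary fixed threshold. Real monitoring tools usually compare activity against a learned baseline rather than a hard-coded number.

```python
from collections import Counter
from datetime import datetime


def flag_bulk_activity(events, threshold):
    """Flag (user, action, hour) combinations with more than `threshold` events.

    `events` is an iterable of (user, action, timestamp) tuples taken from a
    hypothetical audit log. Timestamps are bucketed to the start of the hour.
    """
    counts = Counter()
    for user, action, ts in events:
        bucket = ts.replace(minute=0, second=0, microsecond=0)
        counts[(user, action, bucket)] += 1
    return sorted(key for key, n in counts.items() if n > threshold)


# Illustrative data: one user deletes five files within the same hour.
events = [("mallory", "delete", datetime(2024, 6, 1, 9, m)) for m in range(5)]
events.append(("bob", "delete", datetime(2024, 6, 1, 9, 30)))

flagged = flag_bulk_activity(events, threshold=3)
print(flagged)
```

A flagged entry is only a prompt for investigation: a large deletion may be a routine cleanup, so the output would normally feed an alert or review queue rather than trigger an automatic response.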

Above all, data management and storage infrastructure experts will need to shift their thinking and practices from managing storage technologies to understanding and managing data for a variety of purposes. A data storage and data management infrastructure that supports flexibility and agility to shift with organizational data needs will allow IT to make the shift faster and with better outcomes for all.
