Data storage capacity requirements are set to surge by 150% in the next couple of years, new research from Hitachi Vantara has revealed.
While the average large organization currently holds 150 petabytes (PB) of data, organizations are expected to be storing over 300PB of data by the end of 2026.
This is matched by an uptick in investment, with organizations’ financial commitments to data storage expected to increase by 224% over the same period. Investments in AI and processing power, meanwhile, are also set to more than double.
According to the study, 31% of IT leaders said data storage capacity was a serious concern as the rate of growth is placing strain on critical assets, while some also complained of the increasing complexity of data.
More than three-quarters (76%) of those surveyed said more than half of their data is now unstructured.
While the demand for larger, more complex data storage is rising, the storage industry isn’t ready for this increase, Seagate CCO B.S. Teh said.
“As AI matures and scales, the value of data will increase, leading us to store more data for longer,” Teh told ITPro.
“However, the storage install base is forecasted to have a 17% CAGR – therefore at a significantly slower pace than the growth in data generated. And it takes a whole year to build a hard drive,” he added.
This growth rate disparity will disrupt global storage supply, Teh said, meaning businesses will need to build long-term capacity plans to secure supply as generative AI use becomes more strategic.
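As a rough illustration of the gap Teh describes, the sketch below projects organizational data demand against an installed base growing at a 17% CAGR, using the figures cited above. The starting capacity, the derived ~42% demand growth rate, and the five-year horizon are illustrative assumptions rather than figures from the research.

```python
# Back-of-envelope projection of the gap between data growth and the
# installed storage base, using the figures cited in the article.
# Starting capacity, growth rates, and horizon are illustrative assumptions.

def project(start_pb: float, annual_growth: float, years: int) -> list[float]:
    """Year-by-year capacity assuming a constant annual growth rate."""
    return [start_pb * (1 + annual_growth) ** y for y in range(years + 1)]

# ~150PB today rising to just over 300PB by the end of 2026 implies roughly
# 42% compound annual growth over two years (1.42 ** 2 is about 2.02).
data_demand = project(start_pb=150, annual_growth=0.42, years=5)

# The installed storage base is forecast to grow at a 17% CAGR (per Seagate).
installed_base = project(start_pb=150, annual_growth=0.17, years=5)

for year, (need, have) in enumerate(zip(data_demand, installed_base)):
    print(f"Year {year}: demand ~{need:,.0f}PB, installed ~{have:,.0f}PB, "
          f"shortfall ~{need - have:,.0f}PB")
```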
The AI data problem
The driving force behind this looming data problem is AI. As a technology, it demands not just access to larger quantities of data, but also lengthy storage periods.
“Whether it’s from capturing training checkpoints to saving source data sets, the more data we retain during the process, the more we can validate AI as trustworthy,” Seagate CEO Dave Mosley told ITPro.
“That data therefore needs to be available long term, not just to comply with evolving legal requirements but also to ensure that inference is explainable,” he added.
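Retention of this kind can be handled with fairly lightweight bookkeeping. The sketch below, a minimal Python example using only the standard library, records a training checkpoint alongside a hash of its source dataset so the exact inputs behind a model can be verified later; the file layout and field names are illustrative assumptions, not any particular vendor’s format.

```python
# Minimal sketch of retaining a training checkpoint with lineage metadata,
# so a model's outputs can later be traced back to the data that produced it.
# File layout and field names are illustrative assumptions.
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path

def fingerprint(path: Path) -> str:
    """Hash a source dataset so the exact training input can be verified later."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def record_checkpoint(checkpoint: Path, dataset: Path, model_version: str,
                      registry: Path = Path("lineage.jsonl")) -> None:
    """Append a lineage record for one training checkpoint."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model_version": model_version,
        "checkpoint": str(checkpoint),
        "dataset": str(dataset),
        "dataset_sha256": fingerprint(dataset),
    }
    with registry.open("a", encoding="utf-8") as f:
        f.write(json.dumps(entry) + "\n")
```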
This ever-growing volume of data can cause different problems for organizations, depending on what route they choose for storage.
“While outsourcing storage comes with its own security, compliance, and potential risk, holding the data for AI internally can become eye-wateringly expensive,” Paul Mackay, RVP of cloud in EMEA and APAC at Cloudera, told ITPro.
What can businesses do?
To avoid feeling the sting of increasing data storage requirements, businesses will have to focus more keenly on data management strategies and data architectures, experts have told ITPro.
Chris Harris, VP of global field engineering at Couchbase, told ITPro that for organizations to fully unlock value from generative AI, infrastructure modernization projects must run in parallel with the development of a robust data management strategy.
“To capitalize on AI investments now and in the future, organizations must ensure they can control data storage, access, and usage, enable real-time data sharing, and maintain a consolidated database infrastructure to prevent multiple versions of data,” he said.
As AI is increasingly integrated into applications, Couchbase’s VP of AI and Edge Mohan Varthakavi said, data architectures will also need to be redesigned to support new workloads.
“Companies will implement new data architectures that go beyond simple record storage to capture the ‘intelligence history’ and thought processes of AI systems,” Varthakavi told ITPro.
“They will need to simplify complex architectures, including consolidation of platforms, and eliminate data silos to create trustworthy data,” he added.
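To make the idea of an ‘intelligence history’ concrete, the sketch below shows one way such a record might look: each AI interaction is logged with its prompt, response, model version, and the source documents it drew on, so inference can be audited after the fact. The schema, field names, and JSONL storage are illustrative assumptions; in practice these records would land in a consolidated data platform rather than a local file.

```python
# Sketch of an "intelligence history" record: one row per AI interaction,
# capturing enough context to audit how an answer was produced.
# Schema and storage format are illustrative assumptions.
import json
import uuid
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

@dataclass
class InferenceRecord:
    model_version: str
    prompt: str
    response: str
    source_documents: list[str]  # IDs of the records the answer drew on
    record_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat())

def append_record(record: InferenceRecord,
                  path: str = "intelligence_history.jsonl") -> None:
    """Persist the interaction so inference remains explainable later."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")

# Hypothetical usage with made-up identifiers:
append_record(InferenceRecord(
    model_version="assistant-v1",
    prompt="Summarise Q3 storage spend",
    response="Spend rose 24% quarter on quarter...",
    source_documents=["finance/q3-report", "finance/q2-report"],
))
```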