92% of respondents state that unstructured data issues are impacting the success of their GenAI initiatives.
STAMFORD, Conn., Nov. 14, 2024 /PRNewswire-PRWeb/ -- Global AI spending will exceed $631 billion by 2028, but at least 30% of initiatives will be abandoned after proof-of-concept, according to industry firms IDC and Gartner. A new survey highlights the number one reason why ambitious GenAI investments are falling flat: relying on poor-quality data to feed algorithms.
The survey from ViB Research was commissioned by Shelf, a next-generation knowledge management platform, and included more than 300 IT and data management leaders. It reveals that while there is pressure and excitement to launch GenAI projects, there are glaring holes in the process that threaten their success.
Problems arise because GenAI tools rely not only on structured data that has historically been housed in data platforms, but also on unstructured data that is stored across file repositories and knowledge bases. Unstructured data represents 90% of an organization's data and exists outside of quality control frameworks. This unstructured data is an essential element of GenAI. The largest share of these unstructured data files is on Microsoft SharePoint, which 67% of respondents identify as the primary source of the unstructured data they use in their GenAI initiatives. Unstructured data can also come from Microsoft 365 documents, Microsoft OneDrive, knowledge bases, Dropbox, PDFs, emails, and social media. It's this data that gives a GenAI tool so much of its detail, style, and personality, and determines how accurate or successful it may be.
The survey indicated that business leaders are aware of how difficult it is to organize this data:
- 92% of participants indicated that unstructured data issues had an impact on their GenAI initiatives, with 30% saying those issues are "large" or of "significant impact."
- 68% state that over half of their files have at least one issue within them.
- 85% of respondents reported that their organization has over 1 million documents and files to manage, with 51% of respondents saying their companies have more than 10 million files.
- Significantly, 66% of respondents say they don't have a standard process for prioritizing GenAI use case implementations across the organization.
- Despite the vast majority of participants acknowledging that their unstructured data had issues, 74% of respondents still planned on leveraging it in GenAI use cases.
"Without addressing underlying data quality issues, especially in unstructured data, any investment in Generative AI is like building on quicksand," said Sedarius Tekara Perrotta, Shelf CEO. "We keep hearing that GenAI initiatives are failing as quickly as they are adopted, and the technology is being blamed for being too risky or too nascent. In reality, most failures can be traced back to data quality issues that have nothing to do with the complexity or sophistication of GenAI capabilities. It's extremely important for leaders to understand the source of the challenges to fix them, not only for individual companies and leaders investing in AI, but for broader AI adoption across the enterprise. Data quality is make or break within all AI initiatives."
The survey findings underscore the critical importance of data quality for GenAI success, and serve as a guide toward a solution for those currently struggling with GenAI implementation.
For full report details, visit us here.
About Shelf:
Shelf is a next-generation knowledge management platform that helps businesses unlock the full potential of generative AI by addressing the critical challenge of unstructured data. By enabling organizations to identify and eliminate bad data within documents and files, Shelf ensures trusted and accurate GenAI answers, paving the way for successful AI adoption across industries. Companies like Amazon, Nespresso and Herbalife rely on Shelf to tackle complex structured and unstructured data needs at scale, and prepare them to effectively invest in a generative AI future.
Media Contact
LaunchSquad for Shelf, Shelf, 415-625-8555, [email protected], https://shelf.io
SOURCE Shelf
