
Monday, 20 January 2025

Data quality in focus for 2025

The quality of data will become a key concern in 2025 for multiple reasons, chief among them that low-quality data can lead to inaccurate generative AI (gen AI) output.

Grant Bourzikas, Chief Security Officer, Cloudflare, explained: "In 2025, disinformation will transcend the Internet and social media, and move to poison and taint AI models. Information sharing exists at an order of magnitude faster, and more efficient than ever before. And in the world of AI, data is the only currency and organisations that have the most will win – but quantity doesn’t always equal quality.

"AI on its own will not solve the world’s most critical problems. The successful implementation and use of AI depends on data. But as disinformation continues to plague society, it will begin to trickle into AI models that are critical to making decisions – e.g., calculating goods needed to restock grocery store shelves, diagnosing sick patients, or analysing market trends to share financial risks with bankers." 

Praveen Thakur, SVP for Asia, Teradata, noted that a highly sophisticated gen AI model with bad data can only deliver poor results. "AI models and data inherently also have biases, which means inaccurate information can be delivered without indicating biases present. For example, we are seeing that AI bias is starting to creep into company strategies in Asia when analysis comes from an AI system that was trained on a slanted collection of data," he said.

"It is thus critical that AI users in Asia start to understand the lineage of the data behind the AI systems they use, especially now that we are in an AI-driven explosion of data."

Thakur said that by 2025, there will be 120 zettabytes of data created, captured, or used. "However, this ocean of data is mostly unusable as it is generally unrefined, duplicate, or inaccessible. Finding the valuable information in the sea of raw data means filtering through a lot of unusable or polluted content. While technology can do some of this, we still need human discernment to find the rivers of clean and reliable data that should inform AI training models," he said.

The situation has led to explainable AI becoming even more urgent and necessary, Thakur added. He said: "Organisations can enable explainable AI by offering complete visibility into the data behind decisions, including how a model uses data and complies with all the proper regulations. It should also be clear to employees and users why the decision is both fair and equitable, and this could be done by ensuring the data sources are validated and proved trustworthy before any AI implementation occurs, and the reasoning behind the model’s output must be digestible and accountable to a human decision-maker.

"This level of governance must also be respected throughout an organisation’s ecosystem, including the technology tools being integrated into the partner companies with whom the organisation interacts."

"High-quality, trusted data is the foundation of successful AI that can be applied across economies and industries, and we believe that empowering three core principles of trust – a focus on people, transparency, and value-creation – will help drive a new era of creativity and confidence in business decision-making across Asia."

"Data quality is one of the most important things we must grapple with, but it’s getting harder to prove. Efforts like the EU AI Act can help, but more is needed. The current focus is on how a model was created and trained, but we need to be able to signal whether a model can be trusted," noted Qlik in a list of 2025 predictions. 

"An AI Trust Score acts as a filter through which all data should go — establishing provenance, lineage, and ultimately, overall trust. Data profiling markers will become important, particularly discoverability, accuracy, consumability, timeliness, security, and diversity."
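Qlik's predictions do not say how such a score would be calculated. As a purely illustrative sketch, the six profiling markers named in the quote could each be rated between 0 and 1 and averaged, with a dataset admitted only above a cut-off. The class and function names, the equal weighting, and the 0.8 threshold below are all assumptions made for illustration, not Qlik's method.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProfilingMarkers:
    """One rating between 0.0 and 1.0 for each marker named in the quote."""
    discoverability: float
    accuracy: float
    consumability: float
    timeliness: float
    security: float
    diversity: float


def trust_score(markers: ProfilingMarkers) -> float:
    """Hypothetical aggregate: the unweighted mean of the six markers."""
    values = [getattr(markers, f.name) for f in fields(markers)]
    if any(not 0.0 <= v <= 1.0 for v in values):
        raise ValueError("each marker must lie between 0.0 and 1.0")
    return sum(values) / len(values)


def passes_filter(markers: ProfilingMarkers, threshold: float = 0.8) -> bool:
    """Assumed policy: admit a dataset only if its score meets the threshold."""
    return trust_score(markers) >= threshold
```

A real scheme would likely weight the markers differently by use case; the simple mean is used here only to show the filtering idea.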

As demand for authentic, high-quality data skyrockets, private data will increase in value, Qlik added, though it will need to be exchanged in a marketplace. "Vetted domain-specific, ethical, quality-controlled data will be worth gold. It’s a win-win for companies – you improve access to new, trusted sources, and set yourself up for a future of internally and externally gaining from the unique data you are producing," the company stated.

Unifying data silos will also be a priority, said NetApp. 

Source: NetApp Data Complexity report. Chart by country and investment. Every country is increasing its data management or infrastructure investments except India, reflecting India’s previous investments, which account for the country’s lead over its peers in AI adoption.

"Data unification is emerging as a critical driver of AI success, with 79% of global tech executives recognising the importance of unifying data to achieve optimal AI outcomes. Companies that prioritise unifying data are more likely to reach their AI goals in 2025, with only 23% of companies that prioritise unifying data saying they won’t reach their goals, versus 30% of companies that don’t prioritise unifying data," the company said during the launch of its Data Complexity report.

"Investing in data management and infrastructure has become the top priority for organisations, with executives emphasising it twice as much as other AI-related initiatives – a trend set to grow. Looking to the future, organisations that embrace data unification will be better positioned to fully harness the transformative power of AI, ensuring they stay ahead in an increasingly competitive landscape."

"In APAC, 85% of tech executives recognised the importance of unifying data to achieve optimal AI outcomes in 2025. Every country is increasing their data management or infrastructure investments except India, reflecting India’s previous investments and lead in AI adoption. In India, 44% of tech executives see data management or infrastructure as their current top priority, with a lower 37% seeing it as a future top priority. In the other parts of APAC, tech executives in Japan (42%), Singapore (49%) and A/NZ (43%) see data management and infrastructure investments as their top future investment priority – which are higher than their current top priority."

The way businesses manage data must change as large language models (LLMs) increasingly level the playing field, said Twilio. "What will set brands apart going forward is how they leverage their customer data to deliver truly seamless, superior experiences," said Liz Adeniji, Area VP, Asia Pacific & Japan, at Twilio Segment.

"Smarter practices around data collection and management will take centre stage in 2025. One emerging trend is the shift towards data expiration, as brands recognise the risks of holding excess data while customers grow more protective of their information. Moving forward, brands will focus on collecting only essential data – such as email, phone number, and name – while letting temporary or unnecessary data expire.

"The importance of having clean and consistent contextual data cannot be overstated as brands increasingly integrate AI agents and conversational AI into the customer journey. More brands will invest in customer data platforms (CDPs) and scalable data validation frameworks to ensure data reliability and trustworthiness at scale."
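The data expiration idea Adeniji describes can be sketched in a few lines. The example below assumes a customer record stored as a dictionary, a per-field collection timestamp, and a 30-day retention window for non-essential fields. The field names, the `expire_fields` helper, and the window itself are hypothetical and are not drawn from Twilio Segment's products.

```python
from datetime import datetime, timedelta, timezone

# Essential fields follow the examples in the quote; the exact keys are assumed.
ESSENTIAL_FIELDS = {"email", "phone_number", "name"}

# Assumed retention window for everything that is not essential.
DEFAULT_TTL = timedelta(days=30)


def expire_fields(
    record: dict,
    collected_at: dict,
    now: datetime,
    ttl: timedelta = DEFAULT_TTL,
) -> dict:
    """Return a copy of the record with stale non-essential fields dropped."""
    kept = {}
    for field, value in record.items():
        if field in ESSENTIAL_FIELDS:
            kept[field] = value  # essential data never expires in this sketch
            continue
        timestamp = collected_at.get(field)
        # Fields with no known collection time are treated as expired.
        if timestamp is not None and now - timestamp < ttl:
            kept[field] = value
    return kept


now = datetime(2025, 1, 20, tzinfo=timezone.utc)
record = {"email": "a@example.com", "last_page": "/pricing", "cart": ["sku-1"]}
collected_at = {
    "last_page": now - timedelta(days=60),  # older than the window
    "cart": now - timedelta(days=1),        # still fresh
}
cleaned = expire_fields(record, collected_at, now)
```

Here `cleaned` keeps the email address and the recent cart but drops the 60-day-old browsing field.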

According to NetApp's Data Complexity report, two-thirds of companies worldwide report that their data is either fully or mostly optimised for AI. Despite this progress, 2025 will still demand investment in AI and data management, the company said.

"In fact, 40% of global technology executives believe that unprecedented investment in AI and data management will be required for their companies in 2025. While companies have made strides in optimising data for AI, achieving future breakthroughs will demand even greater commitment and resources," NetApp noted.
