If you're planning your big data work for next year, here are some goals to set for 2021.
Copyright by www.techrepublic.com
In 2021, corporate big data leaders will be looking to improve data quality and turnaround of big data projects, as well as performance in meeting business objectives. While 2020 hasn’t been a normal year for anyone, you still have to plan for the future and get ready for what may come.
1. Manage data better
Big data continues to enter corporate networks at torrential rates, and the poor-quality data that companies obtain or use costs the US economy an estimated $3.1 trillion annually. More effort needs to be made to screen data as it comes in, and to properly clean and prepare data before it is added to corporate data repositories.
At IBM Research Switzerland, researchers had help plowing through reams of scientific papers and journals in a search for relevant information pertaining to a molecular drug design. They recognized that much of the worldwide information they would be reviewing would have no relevance to the problem they were trying to address. The company decided to stop importing data from non-relevant sources upfront. This saved hours of time, gave the researchers a highly relevant set of data, and eliminated data storage waste.
Once the data passes incoming criteria, it should also be cleaned and properly prepared before it is uploaded into a data repository. This means checking for incomplete, duplicate, and inaccurate data, and also normalizing data so it can be blended with other source data for analytics.
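The checks described above can be sketched in a few lines of Python. This is only an illustration, not a method from the article: the field names (`customer_id`, `email`, `country`), the one-"@" plausibility rule, and the choice of duplicate key are all assumptions made for the example, and a real pipeline would substitute its own schema and validation rules.

```python
# Hypothetical schema: these field names are assumptions for illustration only.
REQUIRED_FIELDS = ("customer_id", "email", "country")


def normalize(record):
    """Trim and lowercase string values so records from different sources compare equal."""
    return {
        key: value.strip().lower() if isinstance(value, str) else value
        for key, value in record.items()
    }


def is_complete(record):
    """A record is complete when every required field is present and non-empty."""
    return all(record.get(field) not in (None, "") for field in REQUIRED_FIELDS)


def is_plausible(record):
    """Crude accuracy check (an assumption): an email must contain exactly one '@'."""
    return str(record["email"]).count("@") == 1


def clean(records):
    """Return normalized records with incomplete, implausible, and duplicate rows removed."""
    seen = set()
    cleaned = []
    for raw in records:
        record = normalize(raw)
        if not is_complete(record) or not is_plausible(record):
            continue
        key = (record["customer_id"], record["email"])  # assumed duplicate key
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned
```

Normalizing before the duplicate check matters here: two rows that differ only in case or stray whitespace are treated as the same record, which is usually what is wanted when blending data from several sources.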
2. Speed and monitor the process
By now, most organizations are well underway with an iterative, DevOps-style development approach for big data and analytics. Now it’s time to formalize the process so users and IT/data science know when a big data analytics model is mature enough to be placed into and maintained in production.
The benchmark for corporate readiness is that big data analytics results must reach a threshold of 95% accuracy and must consistently deliver this level of performance. Since business and outside conditions change over time, it’s possible that a big data application in production can start falling below 95% accuracy.
IT and data science should establish a maintenance policy that remeasures each app's accuracy every year to ensure that it is still delivering accurate results.
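A periodic accuracy check of this kind might look like the following sketch. It assumes that accuracy simply means the share of predictions that match observed outcomes, which the article does not specify; the function names are hypothetical, and only the 95% threshold comes from the text above.

```python
ACCURACY_THRESHOLD = 0.95  # the 95% benchmark cited above


def accuracy(predictions, actuals):
    """Fraction of predictions that match the observed outcomes."""
    if len(predictions) != len(actuals):
        raise ValueError("predictions and actuals must have the same length")
    if not predictions:
        raise ValueError("at least one prediction is required")
    correct = sum(p == a for p, a in zip(predictions, actuals))
    return correct / len(predictions)


def needs_review(predictions, actuals, threshold=ACCURACY_THRESHOLD):
    """True when measured accuracy has dropped below the production threshold."""
    return accuracy(predictions, actuals) < threshold
```

Running `needs_review` on a fresh sample of production predictions at each scheduled review gives a simple yes-or-no signal for whether a model should be retrained or pulled back for rework.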
3. Formalize a hybrid architecture for big data and analytics
IT, data science, and end users have all budgeted for and independently developed big data and analytics applications. Some of these systems run on premises, while others run on public and private cloud platforms.
As the need grows to pull together more data from disparate sources, an overarching hybrid architecture that includes both cloud and on-prem platforms should be formalized, and enterprise security and governance should be applied uniformly throughout. Few organizations have formalized this hybrid architecture for big data. 2021 is the year to do so.
4. Build bridges between IT, data science, and users
As more vendors simplify solutions, there has been growth in citizen development, where business units develop their own big data applications. Later, when users want to train these apps and integrate them with other company data and platforms, they need IT and data science departments to help them. […]