Managing Data for AI: From Development to Production
AI-driven applications have exploded, and we are transitioning into a world where every application will apply AI techniques to some degree. This is enabled by vast amounts of organized and unorganized data, cheap compute, and algorithmic breakthroughs such as deep learning.
The promise of AI, of course, depends on the ability to access, harness, and operationalize data at scale. While algorithms and models may get all the hype and glamor, nearly 80% of the technical effort actually goes into enabling the underlying data. Easy experimentation with data during development and robust data pipelines in production are must-haves. Today, in most organizations, Data Science teams have to put together pipelines on their own and spend too much time manually managing data, because data engineering teams are overwhelmed with requests.
This paper describes the role of data in building AI applications. What readers can expect from this paper:
- Technology and business leaders will get a data-centric overview of the AI development process. An understanding of the process and its challenges is essential for making resource allocation decisions.
- Data Engineers will get a view into the expectations that AI development places on data engineering.
- Data Scientists and Analysts will see a lot of familiar information, systematically organized and supplemented with information gathered from across organizations.