As an organization explores opportunities to use AI, it needs a formal approach for keeping track of those ideas: testing the possibilities, capturing what works and maintaining an “idea graveyard” for concepts that have been tested and determined to be untenable. That might sound simple enough, but the potential quantity of ideas, and the nuances among them, can quickly become overwhelming. To manage that complexity, firms should design and implement an automated idea management process for tracking and managing the life cycle of ideas and experimentation. Doing so helps in tracking idea performance and ensuring the quality of ideas. There are also efficiencies to be gained by providing team-wide visibility into successful ideas and by reducing duplicate work and potential conflicts.
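As a minimal sketch of what such an automated process might record, consider the small Python structure below. The `IdeaStatus` life cycle, `Idea` record and `IdeaRegistry` class are hypothetical illustrations of the concept, not a reference to any particular tool.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum


class IdeaStatus(Enum):
    """Life-cycle stages, including a terminal 'graveyard' state."""
    PROPOSED = "proposed"
    IN_EXPERIMENT = "in_experiment"
    VALIDATED = "validated"
    GRAVEYARD = "graveyard"  # tested and determined untenable


@dataclass
class Idea:
    title: str
    owner: str
    status: IdeaStatus = IdeaStatus.PROPOSED
    findings: list[str] = field(default_factory=list)
    opened: date = field(default_factory=date.today)


class IdeaRegistry:
    """Team-wide registry giving visibility into every idea's fate."""

    def __init__(self) -> None:
        self._ideas: dict[str, Idea] = {}

    def submit(self, idea: Idea) -> None:
        # Surface potential duplicates before new work begins.
        if idea.title.lower() in self._ideas:
            raise ValueError(f"Possible duplicate: {idea.title!r}")
        self._ideas[idea.title.lower()] = idea

    def bury(self, title: str, reason: str) -> None:
        # Move an untenable idea to the graveyard, preserving the record
        # so no one retests it unknowingly.
        idea = self._ideas[title.lower()]
        idea.status = IdeaStatus.GRAVEYARD
        idea.findings.append(reason)
```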
A similar approach can be applied to managing models. Building real-world machine-learning algorithms is complex and highly iterative. An AI scientist may build tens or even hundreds of models before arriving at one that meets some acceptance criteria. Now, imagine being that AI scientist without a formal process or tool for managing those work products.
A formal process for model management alleviates that pain for both individuals and the organization. It makes it possible for AI scientists to track their work in detail, giving them a record of their experiments. Such a process also enables them to capture important insights along the way, from how normalization affected results to how granular features appear to affect performance for a certain subset of data.
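One lightweight way to produce such a record is with an experiment-tracking tool; the sketch below uses the open-source MLflow tracking API as one possible backend. The experiment name, parameter values and metrics are illustrative.

```python
import mlflow

mlflow.set_experiment("churn-model")  # illustrative experiment name

with mlflow.start_run():
    # Record the choices that produced this candidate model...
    mlflow.log_param("normalization", "z-score")
    mlflow.log_param("learning_rate", 0.01)

    # ...and the outcomes, so every experiment leaves a queryable trace.
    mlflow.log_metric("auc", 0.87)
    mlflow.log_metric("auc_high_value_segment", 0.79)

    # Free-form insights can ride along as tags.
    mlflow.set_tag("note", "z-score normalization lifted AUC on the "
                           "high-value customer segment")
```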
Across an organization, sound model management empowers data scientists to review, revise and build on each other’s work, helping accelerate progress and avoid wasted time. It also enables the organization to conduct meta-analyses across models to answer broader questions (e.g., “What hyperparameter settings work best for these features?”).
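Once runs are logged to a shared store, that meta-analysis becomes a simple query. Continuing the MLflow sketch above (the `params.` and `metrics.` column names follow MLflow’s convention; the experiment and hyperparameter are illustrative):

```python
import mlflow

# search_runs returns a pandas DataFrame with one row per logged run.
runs = mlflow.search_runs(experiment_names=["churn-model"])

# Which learning-rate settings have worked best for these features?
best_by_lr = (
    runs.groupby("params.learning_rate")["metrics.auc"]
        .mean()
        .sort_values(ascending=False)
)
print(best_by_lr.head())
```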
To succeed at an enterprise scale, an organization must be able to store, track, version and index models as well as data pipelines. Traditional model management should be expanded to include configuration management. Logging each model, its parameters and its data pipelines enables models to be queried, reproduced, analyzed and shared. Consider, for example, that model management will track the hyperparameters that have been tested and record which were eventually used for deployment. However, model management alone will not capture which features were tested and discarded, what modifications were made to data pipelines or what compute resources were made available to support sufficient training, to name just a few key activities. Together with model management data, tracking that kind of configuration information can accelerate the deployment of AI services while reducing duplicate work. An organization will never achieve that level of visibility and analysis when managing models via spreadsheets.
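As a hypothetical sketch of what that additional configuration record might look like when logged beside each model, again using MLflow as the backend (the `RunConfiguration` schema and its field values are illustrative, not a standard):

```python
from dataclasses import dataclass, asdict

import mlflow


@dataclass
class RunConfiguration:
    """Configuration context that plain model management misses."""
    pipeline_version: str          # which data-pipeline revision fed training
    features_discarded: list[str]  # features tested and dropped along the way
    compute: str                   # resources made available for training


with mlflow.start_run():
    config = RunConfiguration(
        pipeline_version="pipelines/customer@4f2a9c1",
        features_discarded=["raw_zip_code", "account_age_days"],
        compute="4x A100, 48h budget",
    )
    # Store the configuration as a queryable artifact next to the model,
    # so every run can be reproduced and compared later.
    mlflow.log_dict(asdict(config), "run_configuration.json")
```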