An Integrated Framework for End-to-End Data Analytics and Machine Learning
Over the past few years, machine learning (ML) research
has produced numerous effective tools and techniques for powering applications with data.
Even so, ML applications still require a great deal of customization,
and ML practitioners spend an inordinate amount of time discovering,
through trial and error, the precise recipe for their applications.
From raw data to a learned model, perfecting a machine learning pipeline
involves many iterations with incremental modifications over
both the data preparation and the model training components.
Helix offers an integrated framework for effortlessly developing and iterating over
end-to-end machine learning pipelines. Through intelligent materialization of intermediate results
and fine-grained data provenance across data preparation and model learning,
Helix significantly reduces the manual overhead for incremental modification
and shortens the iteration cycle. The declarative programming interface allows
the data scientist to focus on data intelligence rather than system details.
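
To make the idea of reusing materialized intermediate results concrete, here is a minimal Python sketch. It is not the Helix API: the Pipeline class, its add, replace, and run methods, and the hand-assigned version numbers are all hypothetical names chosen for illustration. The sketch also caches every step unconditionally, whereas Helix decides what to materialize, and a chained hash stands in for real data provenance tracking.

```python
import hashlib


class Pipeline:
    """Toy linear pipeline that caches every step's output.

    Hypothetical illustration only; this is not the Helix interface.
    """

    def __init__(self):
        self.steps = []     # ordered list of (name, version, function)
        self.cache = {}     # signature -> materialized intermediate result
        self.executed = []  # names of the steps actually run by the last call

    def add(self, name, version, fn):
        self.steps.append((name, version, fn))

    def replace(self, name, version, fn):
        # Swap in a modified step; the new version number marks the change.
        self.steps = [
            (n, version, fn) if n == name else (n, v, f)
            for n, v, f in self.steps
        ]

    def run(self, raw):
        self.executed = []
        value = raw
        signature = hashlib.sha256(repr(raw).encode()).hexdigest()
        for name, version, fn in self.steps:
            # A step's signature covers the raw input, every upstream step,
            # and the step itself, so any upstream change invalidates it.
            chained = f"{signature}|{name}|{version}".encode()
            signature = hashlib.sha256(chained).hexdigest()
            if signature in self.cache:
                value = self.cache[signature]  # reuse the materialized result
            else:
                value = fn(value)
                self.cache[signature] = value
                self.executed.append(name)
        return value


pipe = Pipeline()
pipe.add("clean", 1, lambda rows: [r.strip().lower() for r in rows])
pipe.add("featurize", 1, lambda rows: [len(r) for r in rows])
pipe.add("train", 1, lambda xs: sum(xs) / len(xs))  # stand-in "model": the mean

raw = [" Alpha ", "Be ", " gamma"]
first = pipe.run(raw)
first_executed = list(pipe.executed)

# Iterate on the training step only; data preparation is left unchanged.
pipe.replace("train", 2, lambda xs: max(xs))
second = pipe.run(raw)
second_executed = list(pipe.executed)
```

In this sketch the first run executes all three steps, while the second run recomputes only the modified train step and reads the outputs of clean and featurize from the cache. That is the kind of saving an incremental modification is meant to produce.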
Helix is currently in private beta; a public beta is planned for release by Fall 2018.
The ongoing development of Helix is led by Doris Xin (dorx0 @ illinois.edu), Stephen Macke (smacke @ illinois.edu), Litian Ma (litianm2 @ illinois.edu), Jialin Liu (jialin2 @ illinois.edu), and Shuchen Song (ssong18 @ illinois.edu) under the guidance of Prof. Aditya Parameswaran (adityagp @ illinois.edu) at the University of Illinois at Urbana-Champaign.