The first step in building a thriving AutoML research community is ensuring that enough high-quality datasets are available to it. This corpus contains a large number of datasets collected and developed under the umbrella of DARPA's Data-Driven Discovery of Models (D3M) program. Each dataset was painstakingly curated and annotated with extensive metadata, so that the AutoML community is presented with challenging datasets that go beyond simple tabular data and cover a rich set of problem types and data types. The problem and data types covered include classification (binary, multiclass, and multilabel) and regression (univariate and multivariate) over tabular, text, image, video, and audio data; time series forecasting; object detection; graph problems such as link prediction, vertex nomination, community detection, and collaborative filtering; multitable relational data; and multiple-instance learning. This corpus aims to unite researchers in discovering the new frontiers of AutoML research.