Supporting and Sustaining Apache AsterixDB for the CISE Research Community

Contact Person:

Michael J. Carey

Other PIs/Investigators/PhD students:

Vassilis Tsotras
Ahmed Eldawy

University of California, Riverside

Funding Agency:

NSF CCRI: ENS: Collaborative Research

Project Summary:

The origins of this work go back a decade, to 2009, when a team of database researchers from three UC campuses (UCI, UCR and UCSD) embarked on the NSF-funded ASTERIX research project. Their goal was to improve database storage and querying by bringing parallel database technology to bear on the then-emerging world of “Big Data.” The result, now an Apache project, is the only open-source parallel NoSQL database system available today.

Apache AsterixDB is a highly scalable Big Data Management System (BDMS) that stores, indexes and manages large volumes of structured and/or semi-structured data. At the same time, it supports a full query language with the expressiveness of SQL and more. This project continues Apache AsterixDB’s development as a resource for the NSF Computer and Information Science and Engineering (CISE) research community through a variety of enhancements, including improved text handling and query processing, additional standards-based geospatial data support, new user-defined function support for user-provided logic, and enhanced system storage and indexing capabilities.

The planned improvements will “benefit the broader public by providing a general-purpose foundation for extracting high-value insights from high-volume, low-value big data in areas such as public safety and health.”