The History of Apache Hadoop and Big Data

Abstract

Apache Hadoop started in 2006 as an open source implementation of Google's file system and MapReduce execution engine. It quickly became a significant part of the Big Data phenomenon. Many other tools joined the Hadoop ecosystem, such as Apache Hive, Spark, and Kafka, bringing SQL, programmatic data processing, and streaming. More recently, machine learning, AI, and IoT have joined the fray. And of course the tectonic shift to the cloud has changed how these tools are deployed and used.

In this talk, Alan Gates, co-founder of Hortonworks (now Cloudera) and a committer on Apache big data projects since 2007, will cover the history of big data and the current state of the art, and offer some thoughts on where it is headed in the near future.

Bio

Alan Gates is one of the founders of Hortonworks (now Cloudera). He has been developing inside databases since the 1990s. In 2007 he joined the Hadoop team at Yahoo! and helped bring Pig to Apache. Since then he has been heavily involved in Apache Hive as well as mentoring many other Apache projects. He is currently a member of the architecture team at Cloudera. Alan has a BS in Mathematics from Oregon State University and an MA in Theology from Fuller Theological Seminary. He is the author of the book Programming Pig from O’Reilly Press.

Talk Time and Place

February 26, 2019 @ 1:20PM in MH 225