Data Lakes
1st Edition, March 2020
244 Pages, Hardcover
Wiley & Sons Ltd
The concept of a data lake is less than 10 years old, but data lakes are already widely implemented within large companies. Their goal is to deal efficiently with ever-growing volumes of heterogeneous data while also meeting varied and sophisticated user needs. However, defining and building a data lake is still a challenge, as no consensus has been reached so far.
Data Lakes presents recent outcomes and trends in the field of data repositories. The main topics discussed are the data-driven architecture of a data lake; the management of metadata supplying key information about the stored data, master data and reference data; the roles of linked data and fog computing in a data lake ecosystem; and how gravity principles apply in the context of data lakes.
A variety of case studies are also presented, thus providing the reader with practical examples of data lake management.
Dominique Laurent is Emeritus Professor at Cergy-Pontoise University, France. He is a member of the ETIS-CNRS laboratory and his main research interests include database theory, database updates, data mining and data warehousing.
Cedrine Madera is an Executive Information Architect at IBM, France. She holds a doctorate in Data Science and, in close collaboration with academia, works on the evolution of information systems.