Apache: Big Data Europe 2016
Tuesday, November 15 • 12:00 - 12:50
Create a Hadoop Cluster and Migrate 39PB Data Plus 150000 Jobs/Day - Stuart Pook, Criteo


Criteo had a Hadoop cluster with 39 PB of raw storage, 13,404 CPUs, 105 TB of RAM, 40 TB of data imported per day, and more than 100,000 jobs per day. This cluster was critical for both storage and compute, yet had no backups. This talk describes:
0/ the options considered when deciding how to protect our data and compute capacity
1/ the criteria established for the 800 new computers, and the comparison tests between suppliers' hardware
2/ the non-blocking network infrastructure with 10 Gb/s endpoints, scalable to 5,000 machines
3/ the installation and configuration, using Chef, of a cluster on the new hardware
4/ the problems encountered in moving our jobs and data from the old CDH4 cluster to the new CDH5 cluster 600 km away (see the sketch after this list)
5/ running the two clusters in parallel and feeding both with data
6/ failover plans
7/ operational issues
8/ the performance of the 16,800-core, 200 TB RAM, 60 PB disk CDH5 cluster.
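Point 4/ involves a bulk copy between clusters whose native RPC protocols are incompatible. The standard tool for HDFS-to-HDFS copies is Hadoop's distcp, and when source and destination versions differ (as between CDH4 and CDH5), the usual workaround is to read the source over WebHDFS. The sketch below only illustrates that pattern and is not Criteo's actual tooling: the hostnames, ports, paths, map count, and bandwidth cap are all hypothetical placeholders.

#!/usr/bin/env python3
"""Illustrative sketch: mirror a directory tree from an old CDH4 cluster to a
new CDH5 cluster with distcp. All hostnames, ports, paths, and tuning values
are hypothetical; run this on a host with the CDH5 Hadoop client installed."""
import subprocess

# Read over WebHDFS because the CDH4 and CDH5 native RPC protocols differ.
SRC = "webhdfs://cdh4-nn.example.com:50070/user/warehouse"
DST = "hdfs://cdh5-nn.example.com:8020/user/warehouse"

def mirror(src: str, dst: str, maps: int = 200, mb_per_sec_per_map: int = 50) -> None:
    """Run distcp in -update mode so repeated runs copy only new or changed
    files; cap the map count and per-map bandwidth so the long-distance link
    between the two sites is not saturated."""
    cmd = [
        "hadoop", "distcp",
        "-update",                              # skip files already copied and unchanged
        "-m", str(maps),                        # number of parallel copy tasks
        "-bandwidth", str(mb_per_sec_per_map),  # MB/s allowed per map
        src, dst,
    ]
    subprocess.run(cmd, check=True)

if __name__ == "__main__":
    mirror(SRC, DST)

Because -update makes the copy idempotent, a migration like this can be re-run on a schedule while both clusters stay live (point 5/), with a final catch-up pass just before cut-over.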

Speakers

Stuart Pook

Senior DevOps Engineer, Criteo
Stuart loves storage (130 PB at Criteo) and is part of Criteo's Lake team, which runs some small clusters and two rather large Hadoop clusters. He also loves automation with Chef, because configuring more than 2200 Hadoop nodes by hand is just too slow. Before discovering Hadoop he developed...



Tuesday November 15, 2016 12:00 - 12:50 CET
Giralda VI/VII