The Netflix data architecture reflects the design patterns many organizations are now considering.
Over the next two years we will see a blending of SQL and NoSQL databases. The Stinger project (Hive optimization and Tez) has brought interactive query capability to Hadoop's batch processing environment, which means the way organizations use Hadoop is changing quickly as well. Real-time query and ACID capabilities are next on the list of customer requests. As data lakes come to define the modern data architecture platform and more and more data is stored in Hadoop, organizations want to use that data in many different ways.
Successful big data projects share consistent patterns (the secret sauce). Technical infrastructure teams can work with vendors to get the right hardware, stand up big data platforms and maintain them. However, big data projects can easily become science projects if the following are not addressed:
- Thought leadership that creates cultural change so an organization can innovate successfully. Big data is about making better business decisions faster with higher degrees of accuracy. A sense of urgency needs to exist.
- An environment of collaboration and teamwork with everyone believing in a vision. The modern data lake helps to eliminate a lot of the technology and data silos that exist across different platforms and business units. Successful big data project environments eliminate the social, territorial and political silos that often exist in traditional teams.
- A strong emphasis on data/schema design and ETL reference architectures. It's still all about the data. :)
- The ability to build a plane while flying it. Big data technologies, environments, frameworks and methodologies are evolving quickly. Organizations need to be able to adapt and learn fast.