Cloudera – Big Data Revolution Will Transform The World!

During a recent keynote, Mike Olson of Cloudera made the case that Big Data will help us take on the challenges we face as a planet. He also outlined customer use cases, discussed the future of Hadoop and affirmed his belief that Hadoop is now enterprise ready, backing that up by pointing to the many production deployments now running around the world.

This blog covers the first part of Mike's Cloudera Sessions keynote, which took place on October 8th at the Allianz Arena in Munich. Some of this blog paraphrases from my notes, while other parts are specific quotes I noted as Mike spoke. A follow-up blog shares a little on where Cloudera stands today and the new initiatives Mike spoke about in that talk, including news around the One Platform Initiative, RecordService and Kudu.

Massive Challenges Ahead

After a generic history of data and analytics that brought us to the present day, Mike moved on to something he is clearly passionate about. He described it as one of the reasons Cloudera exists and is so excited about what it is doing.

As a planet we are facing massive challenges such as global warming, as well as producing and distributing energy and water for a growing population (7 billion today, heading to 9 billion by 2050).

“The only way to meet them is by understanding the world and acting in a more intelligent way than ever before. Big Data is going to be an important factor in solving those problems”

Big Data is going to change the world - @mikeolson #bigdata

The Big Data Revolution

Mike backed this up by pointing to the Industrial Revolution, which happened in the 1800s. His argument was that up until that point economic output had remained largely the same, with the same quality of life, for a long period of time. The assertion was that until then the only work you could do was what you could physically handle as a person. There was no power and there were no tools to assist. Once steam power arrived a lot changed and industries were transformed. The examples he gave were transportation, which moved from horseback to trains, and people using power and machinery to do more than ever before. This changed the economy and the world, hence the term Industrial Revolution.

Mike asserted that the Big Data revolution, which he believes we are now on the cusp of, will drive an equivalent transformation. This transformation will not be in mechanical labour but in intellectual labour. He used examples such as doctors being aided by machines that help them make quicker and better diagnoses while recommending the best course of patient care. He mentioned that machines will help lawyers and accountants understand and work with the law better, through recommendations from systems that understand their field and can help them make better decisions. He then went on to say that EVERY professional will likely see some transformation in the way they work as this revolution takes hold.

“I believe, as a Knowledge economy, it will make us much more productive and generate enormous new economic value” – Mike Olson

Mike then linked this to business. His assertion is that the higher-level change we will see around the planet will also apply within businesses. Every business Cloudera is speaking to is a data-driven business and has been for a long time. He essentially said that no matter which department someone is in, they will have used data to do their job and make better decisions.

.@Cloudera wants to help people use their data to do their job better than ever before - @mikeolson #Hadoop #bigdata

Next, Mike focused on how using more data than before can help organizations. He stated that it can help them better understand how customers think, which drives new revenue and makes customers more satisfied so they come back more regularly and sooner.

After that, Mike switched to how data is driving entire industries. Cloudera is seeing new opportunities for using Big Data across all industries. His examples were churn in telco and equities portfolio risk in financial services, where you can tune your trading based on all the information you have from the market. His point was that while these appear to be very different problems, they both rely on data about people and how they behave, data about markets and transactions, and data about the future (obtained via analytics).

He went on to say that most typical big data use cases rely on:

  • Large scale data collection and storage
  • Machine learning and predictive analytics

Customer Use Cases

Mike shared three customer cases on a slide. All were from North America. The one I took notes on was eVariant, so I share that below.

eVariant: Manages patient healthcare data to drive better outcomes for patients. There is lots of simple data (birthdate, age, weight, blood pressure etc.), and then there are the doctor's notes and information about your prior health history. A doctor today has to take all of that data on board. By using Cloudera to bring it all together and make recommendations to doctors and insurance companies, they claim to be able to deliver better outcomes in hospital and clinical settings.

During the day, use cases were also presented by Otto and F. Hoffmann-La Roche AG. Andreas Bitterer, from industry analyst BARC, also presented a few cases that I will write up in a separate post.

New Skills Required

Next up was a discussion about new skills requirements. Of course, almost every organization would love to use all of its data and pervasive analytics. The issue is that to do so many will need new skills in the organization. Organizations need to develop the right team to collaborate. The bottom line is that business and IT need to work very closely together in the new world. That collaborative team needs to focus on use cases and valuable projects it can launch that will make a business impact in the near term. At the same time, it needs to plan how to roll those projects out.

This collaborative team needs to consider:

  • What data is available?
  • How are you going to get it?
  • Where are you going to store it to start with?
  • How will you manage it?
  • How will you secure it?
  • What analytic techniques should you use?
  • Which tools are available?
  • Which partner technologies, or third party SIs, might aid in understanding and using that data better and drive value?

Big Data technology isn't interesting unless you are getting value from it - @mikeolson #bigdata

The Critical Piece – Using Insights!

Having done everything needed to find new insights, the effort is pretty useless unless you can use those insights to change business behaviour.

“The combination of technology, data and business is critical to making that happen” – Mike Olson

The Right Platform

Lastly, Mike focused on the need for the right platform. He dwelled a little on his own history, saying he was part of organizations that built data platforms which were outstanding at dealing with the business issues of the 80s and 90s. Today those systems struggle to handle diverse data sets. They also struggle to combine these new datasets with the more traditional older ones. The older systems also fail, in Mike's opinion, to offer the data processing capabilities you need today.

Mike then outlined that, in his opinion, the Enterprise Data Hub, the Cloudera Hadoop offering, brings together massive numbers of open source components to solve those new problems in an enterprise-ready environment. He asserted that it gives technical people, business people and the data team the tools they need to do their jobs.

He shared that Hadoop had come a long way since its origins, which drew on ideas first published by Google. It is no longer just the batch processing it was excellent at initially. There are now SQL and search tools, and great new analytics frameworks such as Apache Spark. He also stated that you can even serve data in real time using HBase, not just push it to and read it from HDFS. The platform has come a long way and it is going to keep moving forwards!

That “moving forwards” part was the focus of the second part of Mike's keynote, which I will cover in a separate blog.

In Conclusion

There is no denying that Mike believes Big Data is going to fundamentally alter the day-to-day work life of most people, disrupt and transform industries and become a core asset to every organization. He also believes the technology is not only ready for the enterprise but is already heavily deployed and in production within the enterprise.

In my opinion some of what is being discussed, such as churn and portfolio risk management, has been a target for data driven companies for a long time. What is really different is:

  • The granularity level of data we can now operate at;
    • Individual account level, individual person level rather than segments or one aggregation level higher.
  • The amount of history we can use in developing models;
    • Why use sampling when you do not need to? This is especially important when seeking out rare events such as fraud.
  • The breadth of data we can use in building out models;
    • Enables the addition of unstructured data, open data and more to develop more accurate models.
  • The speed at which we can build models;
    • Encourages more experimentation, so you end up with the best possible model rather than the one you could build on a slow, hamstrung platform where you have to dumb down the math, reduce the number of variables or avoid high-risk areas of experimentation (even though the unknown is often what delivers the breakthrough).
  • The regularity with which we can execute models on incoming data to get insights faster;
    • Enables you to run more analytics in more places on more data. This is when analytics becomes pervasive and organizations can start to automate processes.
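
As a rough illustration of the sampling point above, here is a minimal Python sketch. It is not from the keynote: the population size, number of fraud cases and sample size are made-up numbers, and the function names are purely illustrative. It works out how likely a simple random sample is to contain none of the rare records at all.

```python
def _check(population: int, rare: int, sample_size: int) -> None:
    """Validate the (hypothetical) inputs shared by both helpers."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= rare <= population:
        raise ValueError("rare must be between 0 and population")
    if not 0 <= sample_size <= population:
        raise ValueError("sample_size must be between 0 and population")


def expected_rare_in_sample(population: int, rare: int, sample_size: int) -> float:
    """Expected number of rare records in a simple random sample."""
    _check(population, rare, sample_size)
    return rare * sample_size / population


def prob_sample_misses_all(population: int, rare: int, sample_size: int) -> float:
    """Probability that a sample drawn without replacement contains
    none of the rare records (hypergeometric probability of zero hits)."""
    _check(population, rare, sample_size)
    # If there are fewer non-rare records than sample slots,
    # the sample must contain at least one rare record.
    if rare > population - sample_size:
        return 0.0
    prob = 1.0
    # Each rare record in turn has to fall outside the sample.
    for i in range(rare):
        prob *= (population - sample_size - i) / (population - i)
    return prob


if __name__ == "__main__":
    # Made-up numbers: 1,000,000 transactions, 20 of them fraudulent.
    population, fraud_cases = 1_000_000, 20
    for sample_size in (population // 100, population // 10, population):
        expected = expected_rare_in_sample(population, fraud_cases, sample_size)
        missed = prob_sample_misses_all(population, fraud_cases, sample_size)
        print(f"sample={sample_size:>9,} expected fraud cases={expected:5.1f} "
              f"P(no fraud in sample)={missed:.3f}")
```

With these made-up numbers, a 1% sample is expected to contain only 0.2 fraud cases and misses every one of them roughly 82% of the time, whereas scanning the full population cannot miss any.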

All of this is enabled, in my opinion, by the combination of an economically viable storage and compute platform with technologies that scale across it and fully leverage it. For now I still believe this sits as part of an overall data architecture, but time will tell if something like the Hadoop ecosystem will mature fully to own it all!

In my next post I share a little about the three recent initiatives announced by Cloudera at Strata, which were covered by Mike at the event, plus a few other facts he shared around Cloudera.
