
Just finished reading: Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data

I have just finished reading this book. I was already excited about the IBM offering and the concepts around big data at IDUG, and after reading the book I want to find a project I can try this out on. The book can be downloaded from here: Understanding Big Data: Analytics for Enterprise Class Hadoop and Streaming Data.

The book is in two parts, Part 1: Big Data from the business perspective and Part 2: Big Data from the technology perspective. The first part, as its title suggests, does not touch on the technical aspects of big data, only the benefits to businesses and how we are all already part of the Big Data world. The second part explains at a high level all the different parts of a Hadoop cluster, how you get data in and out, and how you process data in there. It also explains the IBM offering in this marketplace, in the form of IBM InfoSphere BigInsights and Streams.

As a high-level description, the first part introduces the concept of the three V’s of big data (Volume, Velocity and Variety) and shows these V’s at work in a number of different scenarios, all of which are very interesting. I can easily see how they would bring you competitive advantage (probably the point of the case studies). The second part is for the techies, explaining what Hadoop is and the parts that make it up: MapReduce, the common components and the file system. It also explains the other technologies surrounding Big Data, such as Hive, Flume and Jaql.
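
To make the MapReduce idea a little more concrete, here is a minimal single-machine sketch in plain Python of the map, shuffle and reduce phases, using the classic word count. It is not taken from the book and it is not Hadoop code; the function names (map_phase, shuffle, reduce_phase, word_count) are my own, purely for illustration.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run the three phases in sequence over a list of text lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle(pairs))


if __name__ == "__main__":
    print(word_count(["big data is big", "data in Hadoop"]))
```

On a real cluster the map and reduce steps run in parallel on different nodes and the shuffle moves data between them; here everything happens in one process just to show the shape of the idea.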

So this is just a very light overview of the book, which is well worth a read. I read it on my Kindle; the text size sometimes varies from page to page as it gets resized, but overall it was fine.

IDUG – EMEA – 14th – First real day

As a first-time attendee and a goody two shoes, I went to the orientation session after my breakfast, which was nice again, although the battered cauliflower was a little strange. The first session that I went to was:

Pro Active DBA – Michael Tiefenbacher

This was good and very informative, and I can’t wait to take the bull by the horns when I get back: be proactive and set my own KPIs that I can monitor and get hold of, as opposed to something set by my managers that I can’t monitor easily. I found this session easy to take notes in, as the pace and content were good; it was a lot like being back at university. I learnt from this session:

  1. That although I do not have licences for Workload Manager (WLM), I can still use some of the monitors for free after I have activated them and the database
  2. There is a LAG function in DB2, an OLAP function that returns a value from an earlier row, so it can be used to compare two timestamps
  3. There is an automated scheduler that “replaces” the Task Center, which has been deprecated, but it will only run one stored procedure at a time
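
As a rough illustration of what a LAG-style window function does, here is a small sketch that pairs each row with the value from the row before it and works out the elapsed time between snapshots. This is plain Python rather than DB2 SQL, the helper names (lag, elapsed_seconds) are my own, and the snapshot times are made up.

```python
from datetime import datetime


def lag(values, offset=1, default=None):
    """For each position, return the value `offset` rows earlier, or `default`."""
    return [values[i - offset] if i >= offset else default
            for i in range(len(values))]


def elapsed_seconds(timestamps):
    """Seconds between each timestamp and the previous one (None for the first row)."""
    previous = lag(timestamps)
    return [None if prev is None else (cur - prev).total_seconds()
            for cur, prev in zip(timestamps, previous)]


if __name__ == "__main__":
    # Hypothetical monitoring snapshot times, for illustration only.
    snapshots = [
        datetime(2011, 11, 14, 9, 0, 0),
        datetime(2011, 11, 14, 9, 5, 0),
        datetime(2011, 11, 14, 9, 15, 0),
    ]
    print(elapsed_seconds(snapshots))
```

The same pattern is what makes LAG handy for KPIs: most monitor counters are cumulative, so the interesting number is usually the difference between this snapshot and the last one.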

The second session I went to was:

Eliminating Performance Bottlenecks in DB2 9.7 and PureScale - Steve Rees

This session was presented at a much faster pace than the last, with quite large code samples that I am sure will be useful. It concentrated on “time spent” metrics and the use of the MON_* table functions to drill down to a bottleneck. From this I learnt:

  1. There are three levels of monitors: System, Data Object and Activity (SQL statements).
  2. db2 +c disables auto commit
  3. Some of the MON_* functions return XML that can be interrogated to get an even greater level of detail
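
The “time spent” idea can be sketched like this: take the total request time, break out the wait components, and rank them by their share of the total. This is a minimal Python sketch, not code from the session. The metric names mirror columns that I believe the MON_GET_* table functions return (TOTAL_RQST_TIME, LOCK_WAIT_TIME and so on), but treat both the names and the numbers as illustrative assumptions rather than output from a real system.

```python
def rank_wait_times(metrics, total_key="TOTAL_RQST_TIME"):
    """Return (metric, percent of total) pairs, largest share first."""
    total = metrics[total_key]
    if total <= 0:
        return []  # nothing has run yet, so there is nothing to rank
    shares = [(name, 100.0 * value / total)
              for name, value in metrics.items()
              if name != total_key]
    return sorted(shares, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up millisecond values, standing in for one row of monitor output.
    sample = {
        "TOTAL_RQST_TIME": 10000,
        "LOCK_WAIT_TIME": 1500,
        "POOL_READ_TIME": 4000,
        "LOG_DISK_WAIT_TIME": 500,
    }
    for name, percent in rank_wait_times(sample):
        print(f"{name:<20} {percent:5.1f}%")
```

Whatever comes out on top is where you dig next, moving from the system level down to data objects and then individual activities.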

Then there were the “Vendor Solution Presentations”. It was not well advertised what these were actually about, unless you were in the know or it was obvious from the title. I decided to skip these and do an exam instead. A good plan in theory, but unfortunately, when I spoke to the IBM rep on the exam desk, it seemed the system was “broken”!! I did get a free book out of it though, on Big Data and the IBM offerings in this area, so it should be a good read.

The final session that I went to was:

Sneak peek at the future of DB2 application development - Leon Katsnelson 

This was an amazing and eye-opening session, and I definitely do not want to be part of a company that sticks its head in the sand about the changes going on in the marketplace, with social, mobile, cloud, the consumerization of IT and big data all becoming very important. The session also came back on topic with the announcement that DB2 Express-C 9.7.5 was being made available today. From this session I learnt:

  1. IBM has a big data offering with a basic (Express-C) version and an enterprise version
  2. InfoSphere BigInsights will connect to DB2 - woot. Here’s the good bit: you can query the unstructured data from DB2 SQL!!!!
  3. Russia has 151% mobile penetration, meaning roughly every other person in Russia has more than one mobile!!

The evening was great, with Triton hosting the drinks reception (with nibbles) in the posh restaurant in the hotel. It was good to meet some other DB2’ers and discuss the challenges that we all face. All the iPads have gone now: someone managed to do the Layar quiz that was set, and I did not win the lucky dip, so they have all gone. I was also forced to finish 5 puddings at the end as there were many left! Thank you Triton, and Julian, James and Iqbal, for the free beer and food.

Tomorrow looks like another busy day, starting at 0830 with the first talk, so I had better get to bed soon. Tomorrow I want to go to:

Tuesday, November 15, 2011

08:30 AM – 09:30 AM: Session 4
09:45 AM – 10:45 AM: Tuesday VSP
11:00 AM – 12:00 PM: Session 5
12:00 PM – 01:00 PM: Tuesday Lunch
01:00 PM – 02:00 PM: Session 6
02:15 PM – 03:15 PM: Session 7
03:15 PM – 03:45 PM: Tuesday Coffee Break
03:45 PM – 04:45 PM: Session 8

Although I may change the last session to D8: Stuffed with great enhancements.

So I may see some of you bright and early for breakfast. Good night.