Jan 13, 2013

Data Scientist- Critical Attributes for BI - 1 (ver Jan 2013)

I have received many emails on this topic and met some who claimed to be data scientists.

Here is an executive brief of what it takes to make one in the SAP BI environment.

Just like IT, BI and HANA, data science is not a technical-only skill. A data scientist keeps 'Meet Business Expectations' at the center of this universe.
Read the independent statements in level 0.4 and get aligned in your thinking. This singularity could save your company millions of dollars.

Here is a brief of what you need to know and understand to become one, or to claim to be one.


1. Understanding that Business Intelligence has been and is about two factors. The first being business and the second intelligence.

2. Resonating with Gartner's 2010 comment: 'Without business in business intelligence, BI is dead'

3. Working on a 'Business First and Business Last' focus in all things business

4. Knowing that
   4.1 'Less than 50% of BI projects will meet business expectations' Gartner in 2003
   4.2 'Less than 30% of BI projects will meet business expectations by 2014' Gartner 2012
   4.3 '98% of BI projects declare success in week 1 of their go-live, yet less than 50% remain so by week 10' BI Valuenomics 2010

1. Conceptual: As a data scientist it is critical that your conceptual understanding of Business Intelligence is solid, not only from a technology standpoint but also from the business side of BI.
2. Technical: It is critical to understand the technology that one is dealing with, along with its capabilities and limitations. It is also critical to understand the alternatives for the business need.
3. Judgmental: Understand Data Architecture, TDQM, and the data impact from metadata to master data. Understand the various statistical methodologies and have the ability to build algorithms.


SAP BI: Understand the fundamentals of SAP BI. This includes a deep understanding of SAP BW, BW Accelerator, BusinessObjects, BO Explorer and the recent HANA. It also includes understanding how each of these applications manages data, along with the strengths and weaknesses of each application area.
Global BI Architecture: Understanding Data Marts and Enterprise Data Warehouses, with the singular purpose of building global FEDW environments (Federated Enterprise DWs) that allow global standards and 100% local independence.
Automated Modeling optimization: Just as most of us cannot solve a Rubik’s Cube in any optimized time, we need to accept that we certainly cannot manually model a cube with 10 dimensions and 60 characteristics under any circumstance. Use automated modeling tools.
Data Flow optimization: A clear understanding of the impact of differential data flows, the maintenance of Data Quality within each data element, and the impact it has on the entire data warehouse.


Basic Understanding:
0. Matching Algorithms: Work in this field won the 2012 Nobel Prize in economics. Stable matching, Optimal pairing, Incentive compatibility, one and two sided matching, medical markets, experimental evidence, market designs, etc.
1. Mathematical basics like an understanding of exponentials, logs, distribution types, continuous and random variations of data sources and elements
2. Econometrics and Modeling: The economics of language based system commands, descriptive statistics, Brownian motion in data, ARCH/GARCH modeling, Monte Carlo Simulations, Autoregressive modeling, etc.
3. Mean variance optimization: Quadratic optimization, Tracing out efficient frontiers, Covariance or combinations of portfolios, and other portfolio analytics
4. Textual data management: Extracting information from news and blogs; framework of textual data management, word count classifiers, vector distance classifiers, confusion matrix, accuracy, etc.
5. Bayesian modeling: joint probability administration, correlated default applications, Bayes net, Accounting fraud, etc.
6. Predictive modeling: Predicting growth in markets, product and services, Bass modeling, Peak growth calibration, artificial intelligence algorithms, organic growth modeling, etc.
7. Large data extractions: Discriminative analysis, Eigen systems, Factor analysis
8. Auctions financial models: Auctions methodology, Theory of auctions, Auction and bidder types, Optimization of bids, Discriminating pricing, Collusions in auctions, Advertising by auctions, Next price auctions, etc.
9. Network financial modeling: Graph theory, Strongly connected components, Shortest path algorithms, VC Web, Centrality, etc.
10. Financial Neural networks: Non linear regression, Perceptrons, Squashing functions, Feedback/back-propagation, neural nets, etc.
11. Mathematical Speculation: Gambling, Odds, Edge, Book makers, Kelly criterion, Entropy analysis, Casino games, day trading
12. Cluster analysis and Prediction trees: K-means clustering, Hierarchical clustering, Prediction classification and regression trees, etc.
13. Storage and speed in big data: Distributed computing from Hadoop, Map-reduce concepts, Parallel processing engines, Prototyping, advanced language usage, etc.
14. Misc: Dynamic programming, Fourier analysis, artificial intelligence clustering, stable matching, optimal pairing, Incentive compatibility, etc.
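
To make one item on this list concrete, here is a minimal Python sketch of stable matching (item 0) using the Gale-Shapley deferred-acceptance procedure. It is an illustration only and not tied to any SAP tool. The analyst and project names are hypothetical, and the sketch assumes both sides are the same size and that every participant ranks everyone on the other side.

```python
from collections import deque


def stable_matching(proposer_prefs, receiver_prefs):
    """Gale-Shapley deferred acceptance. Returns {proposer: receiver}.

    Assumes equal-sized sides and complete preference lists.
    """
    # rank[r][p] = position of proposer p in receiver r's list (lower is better)
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}  # next receiver index to try
    engaged_to = {}  # receiver -> proposer currently held
    free = deque(proposer_prefs)  # proposers without a partner

    while free:
        p = free.popleft()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged_to.get(r)
        if current is None:
            engaged_to[r] = p  # receiver was unmatched, accept
        elif rank[r][p] < rank[r][current]:
            engaged_to[r] = p  # receiver prefers the new proposer
            free.append(current)
        else:
            free.append(p)  # rejected, p will try the next choice

    return {p: r for r, p in engaged_to.items()}


# Hypothetical example: three analysts and three BI projects.
analysts = {
    "ana": ["crm", "erp", "hr"],
    "ben": ["crm", "hr", "erp"],
    "cho": ["erp", "crm", "hr"],
}
projects = {
    "crm": ["ben", "ana", "cho"],
    "erp": ["ana", "cho", "ben"],
    "hr": ["cho", "ben", "ana"],
}
matching = stable_matching(analysts, projects)
print(matching)
```

The result is stable in the sense that no analyst and project would both prefer each other over the partners they were given. The side that proposes gets its best achievable stable outcome, so which side proposes is a design choice.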

So welcome to the world of the data scientist in the new world of too much data and very little information.

From Big Baloney to Big Opportunity in Big Data

In early 2012 big data was the most hyped concept in the tech industry, with a lot of technocratic pundits wondering what the need for any big data was in the first place. Now it is becoming more than evident that big data is only getting started.

The amount of data we create as a planet is growing exponentially. Data is bubbling forth from many sources including cameras, sensors, signals, blogs, social media, tweets, the browsing trails we leave behind, and the Google searches we conduct. Add to this mix e-commerce transactions, product complaints via emails and texts, medical records, DNA details, drug contraindications, and GPS locations of various smart devices and vehicles, and we seem to be just scratching the surface.

The world has seen nothing yet of what big data informatics can actually yield. Proactive companies are already starting to invest in the new 360 degree informatics: analytics that mash up external and enterprise data in an efficient manner.
Predictive analytics allows you to know the weather before you start your day, or before you head for the hills for a skiing holiday. Our GPS tells us the traffic 10 to 50 miles ahead and provides actionable predictive solutions by offering alternative routes when traffic is very heavy. It also offers alternative routes when the information consumer takes a wrong turn, for example. Knowing what time the next train will arrive is now so routine that we don’t even think of it as predictive analytics.

Take it one step further and we can today predict the next hotspot for a battle in Afghanistan simply by analyzing social, text and behavioral patterns, to the point where the nascent algorithms could predict altercations up to 24 hours before they happened in 46% of cases.
Google Now shows us the first glimpse of the predictive power of NOW.

Two things to remember

First, start seeing patterns. As we move forward and merge human genome data with weather, health, chemical contamination and where each person lives, we will certainly be able to predict the impact of genome and lifestyle on disease and life expectancy.

Second, big data is the biggest data set you require to meet your analytics reach. For an enterprise it could be 20 to 100 terabytes. For Google it could be 20 to 100 exabytes, and for the NSA it could be a few yottabytes.
Not everyone needs to start thinking about yottabytes of data management right now. Focus on your needs only.
Within a few years, expect to be able to do Google-like searches to learn what diseases those with similar genetics have had and what medications worked for them. Eventually, when researchers combine the medical records of a hundred million people with their genome data, work habits, and other information, such as weather and pollution data based on where they live, they will all but certainly be able to determine the correlation between genome, disease and lifestyle.

Jan 3, 2013

HP's SAP HANA 2013 Vendor selection Benchmark

Welcome to the world of 'The Scientific Principles of Decision Enhancement'

Here is the HANA update from an HP perspective. Use this as your benchmark for selecting your 2013 HANA Partner.
    HW ATTRIBUTE                                    Jan 2013 Facts
01. SAP Certified Global HANA Partner               Yes
02. No. of HANA Installations to Date               Over 280 installations (Nov 2012)
03. What does this represent                        47% of global HANA Installations
04. SAP Certified in 'Disaster Recovery'            Yes
05. SAP Certified in 'Scale-Out' HW                 Yes
06. Current HANA Scale Out Capabilities             42 TB (first record)
07. SAP Certified for 'High Availability'           Yes
08. SAP Certified HANA HW Partner                   Yes
09. SAP Certified HANA RDS Partner                  Yes
10. Has their own HANA RDS Solutions                Yes (Fast Track)
11. SAP BI and HANA Hosting Partner                 Yes
12. SAP Cloud Solution Partner                      Yes
13. # of HANA Certified Consultants                 Over 80 (Dec 2012)
14. # of HANA Certified Consultants (Dec 2013)      Plan for 500
15. Certified for SAP BW Upgrade for HANA           Yes
CRITICAL NOTE: This is just 1 of 5 checklists, the Hardware Partner checklist, that each executive stakeholder needs to run in order to ensure they take strategically professional decisions.


2. BVA ATTRIBUTE                     Jan 2013 Facts
3. Methodology ATTRIBUTE             Jan 2013 Facts
4. HANA SI ATTRIBUTE                 Jan 2013 Facts
5. Current SAP BI State Check        Jan 2013 Facts

How 2 - ensure success in SAP HANA

By Jan 2013, as an executive, you already have way too much on your plate. As a Business Intelligence Leader, CIO and business owner you have a lot more on your plate. Staying focused on the technology alternatives can be a tough ask, with Gartner reporting that “..less than 30% of BI projects will meet business expectations..” in the 2011-14 period. Staying focused can be tough when your business users are clamoring that they do not have access to their decision based analytics. Staying focused can be difficult if you just finished a $3 or $5 million BI project and your business users can hardly use the delivered reports, i.e. user satisfaction is low.

Is it the same for HANA projects?

The simple answer for January 2013 is that the same rules that have applied for your traditional BI deployments apply to a very large degree to your HANA BI Appliances.

Now, as we enter the new world of HANA, we need to minimize defects and thereby increase the probability of success. Proactive executives need to make sure that we do not fall down the Albert Einstein chute of ‘Doing the same things and this time expecting different results’. This simply means that if you continue to implement your HANA BI initiative as a technology-only solution, then your HANA results could be quite similar to your BI results of the past. As one CIO put it, “Our BI project was a technical success, but a total business failure.”

So the advice for 2013 BI implementations and HANA initiatives comes down to 3 critical points:

1. Brutally Honest Partners and advisors

2. Without business in business intelligence BI is dead (Gartner 2010)

3. Take the 2x2 hour training for business stakeholders prior to moving ahead with your HANA initiatives. It is full of checklists, totally vendor agnostic and absolutely scientific in design.

Our brains are finely attuned to distraction, and today's digital environment makes it especially hard to focus.

Uno: Brutally Honest HANA Advisors

1.1 Undertake a brutally honest ‘Current BI State Strategic Workshop’. Find out where you currently are.

1.2 Understand where your business needs to be. Document the business's expectations and then design to meet them.

1.3 Get brutally honest findings with minimal interpretations, for unless we accept our current state of reality we may simply step into another fog.

Dos: ‘Without business in business intelligence, BI is dead’ Gartner 2010

This statement was made by Gartner in 2010 and stands true from 1004 till date. Decision enhancement and operational performance measures are not technical solutions but business ones.

The critical difference is between ‘Value’ and ‘BVA’, or Business Value Attainment.

The former is an external definition of value and of what your business needs. It normally consists of generic measures that at best leave all companies using the exact same method of measuring their business and its performance. No company will have any unique competitive differentiators.

BVA, on the other hand, is an internal and professional method of understanding your business expectations and your unique competitive advantages.

Tres: Take HP’s 2x2 hour Executive Stakeholder training for HANA

Understand your weaknesses: We are used to working on our strengths. However, in BI it is critical to understand that the CIO, CFO, VP Sales and VP Procurement, i.e. your Key Stakeholders, know little or nothing of SAP HANA or how to undertake a professional selection. Use the 2 x 2 hour HANA executive workshop.

Strengthen your weaknesses: The workshop starts with a 2 hour session that comes with checklists and documents that non-IT stakeholders can use to verify the selection process. The second 2 hour session is to actually help the team fill in the checklists in a professional manner. It empowers stakeholders to take strategically professional decisions.