Evaluating BI tools – Don’t get left hanging
Evaluating BI tools used to be easy.
You either took a stack approach, picking a vendor first and then selecting the modules you wanted to use from their product catalogue.
Or you picked one thing you wanted to do extremely well and selected a best-of-breed product/vendor that best met this requirement. When evaluating using this best-of-breed approach, you had a list of functional requirements that typically only one or two products could meet.
Easy peasy.
But the BI tool world has changed
As I have blogged earlier, Gartner has redefined its Magic Quadrant for BI and Analytics tools. And, whether you agree with their new definitions or not, it is true to say that the redefinition reflects a shift in what they are seeing customers buy.
These changes leave you with a massive risk of buying something and being left hanging, especially if you use the old tried-and-true feature-and-function approach to evaluation.
For example, all BI tools now provide “drill down” capability, but the way they provide it may be markedly different: from an OLAP (Online Analytical Processing) style pre-defined hierarchy, to daisy-chaining summary dashboards with tabular and detailed reports, to in-memory associative engines that build the drill-down relationships on the fly. So asking for drill-down capability no longer enables you to differentiate products or help with evaluation scoring.
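To make the first of those options concrete, here is a minimal, vendor-neutral sketch in Python/pandas (the data and column names are invented) of what a pre-defined hierarchy drill down boils down to: the same data re-aggregated one level lower.

```python
# Toy, vendor-neutral illustration: a pre-defined hierarchy drill down is the
# same data re-aggregated at the next level down. All names are invented.
import pandas as pd

sales = pd.DataFrame({
    "region": ["North", "North", "South", "South"],
    "city":   ["Auckland", "Hamilton", "Wellington", "Christchurch"],
    "amount": [100, 250, 175, 90],
})

# Top level of the hierarchy: totals by region
by_region = sales.groupby("region", as_index=False)["amount"].sum()

# "Drilling down" on North: the next level of the pre-defined region -> city hierarchy
north_detail = (
    sales[sales["region"] == "North"]
    .groupby(["region", "city"], as_index=False)["amount"].sum()
)

print(by_region)
print(north_detail)
```

An associative engine effectively discovers those grouping relationships for you at query time, rather than relying on a hierarchy someone has defined up front.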
Gartner has issued a report titled “Critical Capabilities for Business Intelligence and Analytics Platforms”, which you can access for free courtesy of Birst.
The key for me here is that they are focusing on critical capabilities, which is one of the three areas we focus on when assisting a customer with their BI platform evaluation.
The three-step BI evaluation waltz
When helping an organisation evaluate BI platforms we focus on three key areas in the evaluation process:
- Capability Delivery Approach
What delivery approach, team, and processes will they use to deliver data and content to users. This answers the question of “who will do what”.
- Data Warehouse and Business Intelligence Approach
What data warehousing methodologies and business intelligence styles will they use to deliver data and content to users. This answers the question of “how will they do it”.
- Core Capabilities
Informed by the previous two approaches, we determine what technologies and platforms provide the best capabilities to enable these approaches to be successful. This answers the question of “what will they do it with”.
Capability Delivery Approach
The first step is to understand how data and content will flow through the organisation, answering the “who will do what.”
In the past, I have seen a number of BI platform evaluations result in a mass of shelfware. There is a raft of reasons for this, from vendors “bundling” in capability the customer will never need, to the customer's inability to accurately define what capabilities they actually need.
One of the key reasons for shelfware is that the organisation is unable to embed the selected BI platform into their delivery process, because they did not define that delivery process upfront.
Often there is a belief that the vendor will deliver “best practice” processes on how their platforms should be implemented and used. Typically they don't, nor should they be expected to: every organisation and industry is different.
But neither should the customer be left on their own to reinvent the wheel on how to deliver data warehousing, analytical models and business intelligence content with these tools. And if they have to define this process upfront, before they have even selected the BI platform, it is twice as difficult.
So now we have a chicken and egg situation.
Luckily there are some process patterns which can be leveraged to make it easier.
These are changing fairly rapidly at the moment as the BI platform capabilities change and as new approaches such as AgileBI are being refined. The examples I currently use are:
- A pattern to implement a centralised capability, where requests are logged by users and a centralised team acquires and transforms the data, then creates the content to fulfil the request and publishes it to those users
- A second pattern is to split these roles, so the centralised team capability is focussed on acquiring data and transforming it to provide reusable data structures and known business rules. Users would then access this data from a central repository via self-service reporting tools and create their own content
- A third pattern is to load all the data into an unstructured data lake and allow technically savvy users to create their own code, which applies both structure and rules to the data
- A fourth pattern is to have a governed production environment and a semi-governed sandpit environment, which run in parallel
An example visualisation of the fourth pattern is:
This diagram was defined in conjunction with the Tertiary Education Commission and published to vendors to document their requirements for evaluating business intelligence capabilities. It shows the process of a governed production environment (on the right) and a semi-governed sandpit environment (on the left) working in parallel.
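As a purely illustrative example of how the fourth pattern might be mechanised, the sketch below (in Python, with invented paths, checks and helper names) shows content being promoted from the semi-governed sandpit into the governed production area only once it passes some agreed checks:

```python
# Illustrative only: promoting content from a semi-governed sandpit to a
# governed production area. Paths, checks, and helper names are invented.
import shutil
from pathlib import Path

import pandas as pd

SANDPIT = Path("sandpit/content")        # semi-governed: analysts publish freely here
PRODUCTION = Path("production/content")  # governed: content only arrives via promotion


def passes_governance_checks(dataset: Path) -> bool:
    """Example gate: the dataset must have rows and properly named columns."""
    df = pd.read_csv(dataset)
    has_rows = len(df) > 0
    named_columns = not any(str(col).startswith("Unnamed") for col in df.columns)
    return has_rows and named_columns


def promote(dataset_name: str) -> None:
    """Copy a sandpit dataset into production once it passes the agreed checks."""
    source = SANDPIT / dataset_name
    if not passes_governance_checks(source):
        raise ValueError(f"{dataset_name} failed governance checks")
    PRODUCTION.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, PRODUCTION / dataset_name)
```

The real checks and promotion process would be whatever the organisation agrees; the point is that the two environments run in parallel, with a defined gate between them.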
One key point to note is that during the evaluation process we give this capability approach to vendors as context, not as a set of requirements that are set in stone. By framing it as a way of defining the intent of what is to be delivered, we allow the vendors to show how their BI platform can deliver this intent via their unique platform capabilities, rather than forcing them to bend those capabilities to fit a rigid process.
Data Warehouse, Analytics and Business Intelligence Approach
The second area is to identify the preferred business intelligence, analytics and data warehouse approach, answering the question “how will they do it”.
If we look at the data warehouse approach, there are many views on what the optimal structure for a data warehouse should be: from the Kimball data mart focussed approach, to Inmon's centralised models, to the Data Vault integration approach, through to the relatively new data lakes.
Each of these approaches has positive and negative points. In our experience what is important is to agree on the different layers that will and will not be implemented in the data warehouse approach.
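As a purely hypothetical illustration of the first of these, a Kimball-style presentation layer, reduced to a toy Python/pandas example with invented table and column names, is simply facts joined to conformed dimensions:

```python
# A toy, hypothetical Kimball-style presentation layer: a fact table joined
# to conformed dimensions. Table and column names are invented.
import pandas as pd

dim_student = pd.DataFrame({
    "student_key": [1, 2],
    "student_name": ["Alice", "Bob"],
    "home_region": ["Wellington", "Auckland"],
})

dim_course = pd.DataFrame({
    "course_key": [10, 20],
    "course_name": ["Statistics 101", "Data Modelling 201"],
})

fact_enrolment = pd.DataFrame({
    "student_key": [1, 1, 2],
    "course_key": [10, 20, 10],
    "fees_paid": [500, 650, 500],
})

# The "star join": facts gain their business context from the dimensions
report = (
    fact_enrolment
    .merge(dim_student, on="student_key")
    .merge(dim_course, on="course_key")
    .groupby(["home_region", "course_name"], as_index=False)["fees_paid"].sum()
)
print(report)
```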
The layered diagram we use drives the questions that need to be answered to define the business intelligence and data warehouse approach. Some of these questions are:
- Multiple source systems exist; how will we acquire the data from these systems?
- Where will extracted data reside before it is loaded; what is the archiving approach for this staged data?
- Will a near real-time Operational Data Store (ODS) be required for near real-time operational reporting?
- Will time series data be required to allow detailed data mining of behavioural data?
- What data modelling approach will be used to store business context against the data and to apply business rules that transform and infer data?
- Are event rules required to infer business processes that the current source applications do not adequately capture?
- How will data need to be presented to enable easy access by users and analysts?
- What BI tools will be used to access the data, create and disseminate interactive visualisations and content?
This diagram displays the data layers that need to be defined for the data warehouse, analytics, and business intelligence approach, in conjunction with the Capability Delivery.
For example, if we are using pattern three where the data is being loaded into a time series staging area (data lake), then the business context and rules are all contained within the BI tool and Access Layer, not in the data itself.
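A minimal sketch of what that looks like in practice, assuming Python/pandas as the access layer (the file names and the “active customer” rule are invented): structure and business logic are applied when the data is read, not when it is loaded.

```python
# Illustrative sketch of pattern three: loosely structured files sit in a
# time series staging area (data lake), and structure plus business rules are
# applied only when the data is read at the access layer. File names and the
# "active customer" rule are invented.
import pandas as pd


def read_customers_from_lake(path: str = "lake/customers/2016-05-01.csv") -> pd.DataFrame:
    """Apply structure (types, date parsing) at read time rather than at load time."""
    return pd.read_csv(
        path,
        dtype={"customer_id": "string"},
        parse_dates=["last_order_date"],
    )


def active_customers(customers: pd.DataFrame) -> pd.DataFrame:
    """Business rule lives in the access layer: active = ordered in the last 90 days."""
    cutoff = pd.Timestamp.today() - pd.Timedelta(days=90)
    return customers[customers["last_order_date"] >= cutoff]


# Users (or the BI tool) compose these at query time, rather than relying on
# rules baked into warehouse tables:
# report = active_customers(read_customers_from_lake())
```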
Core Capabilities
Once we have defined the team and process approach, and the data warehouse, analytics and business intelligence approach, we can then refine the technical and functional capabilities required to enable these to be successfully implemented. This answers the question of “what will they do it with”.
Business Intelligence capabilities can currently be defined into three broad categories:
- Enterprise Content Dissemination
- Visual Discovery
- Predictive Modelling
Analytics capabilities can currently be defined into three broad categories:
- Code centric, predictive modelling focussed
- Graphical User Interface (GUI) centric, predictive modelling focussed
- Machine Learning focussed
Data Warehouse capabilities can currently be defined into three broad categories:
- ETL flow and Dimensional model focussed
- Data Integration and DW automation focussed
- Big Data and Data Lake focussed
What is really important in defining this list of capabilities is what is in and out of scope, as this will markedly impact what you select as the preferred platform. The classic example I deal with at the moment is the need to publish BI content to external users.
The ability to publish content to members of the public (i.e. users who are not required to log in) has a massive impact on the platform. It impacts:
- Ease of use of accessing and exploring the BI content
- Ability to restrict features and functions within the tools by user type and role
- The data and content security requirements
- Environment architecture, navigating DMZs and firewalls
- The publishing and approval process
As you can see, one little capability circle of “public access to content” can have a disproportionate effect on the evaluation criteria and on which platform will best meet them.
So providing a clear list of the most important capabilities is crucial when evaluating BI platforms. One template we use to help refine the areas to focus on when evaluating technology options is the following diagram:
This version of the diagram was defined in conjunction with the Tertiary Education Commission to document their requirements for evaluating business intelligence capabilities. They already had a mature data warehouse environment, so this was not included in the requirements.
This diagram uses large circles to document the major business intelligence domains required, in this case Data Blending, Visual Discovery, Performance Management, Predictive Analytics and Content Dissemination. The smaller circles document the highest priority capabilities required to be delivered and the domains we believe they fit within. Where the smaller circles cluster at the intersection of the large circles, we have identified requirements that could be met via products from multiple domains.
Evaluation and Consensus
Armed with the details garnered via these three areas, you are then able to undertake a market evaluation and select the best technologies to meet these requirements.
There are some different options for undertaking these evaluations, and they are typically driven by the procurement governance within your organisation.
One of the key areas we focus on is how to manage the varied and often conflicting opinions on the best solution for the requirements. We, of course, have an approach we use to assist in getting consensus across a disparate group of people, but that is the subject of a future blog.
Change, learn or fade away, it’s your choice – Shane
Shane blogs about all of the things data and business intelligence.
You can read Gartner Data Integration Magic Quadrant 2016 – Behind with the times or all of Shane’s blogs here.
We run regular business intelligence courses in both Wellington and Auckland. Find out more here.