Expert Opinion

Your next productivity gain: the valorization of 'Small Data'

  • #Data Intelligence
  • #Business Intelligence

Business Intelligence (BI) enables data from operational Information Systems (IS) to be valorized. Big Data, in turn, promises to exploit massive, multi-structured data, whether internal or external, to enrich insights and support even better decision-making. The automation of data processing is ever more extensive and effective. Yet too many business teams still spend considerable time manually collecting key data that is kept and used outside the company IS: "Small Data". These are generally small volumes of data with a complex and highly changeable structure, characteristics that make them practically impossible to integrate using classic HMI[1]- or ETL[2]-type approaches. "Small Data" are therefore often neglected by the enterprise's IT systems. Yet their collection is a real productivity issue, and their integration into Business Intelligence environments is a key factor in effective performance management and in valorizing traditional BI and Big Data investments.


Imagine the following textbook case: an analyst has to produce a dashboard with one row per factory, one column for actual production, and another for forecast production. Normally, the result is this very simple table:

Unfortunately, factory C is a recent acquisition and has not yet migrated to the company ERP[3]. Factory D, for its part, is too small ever to be considered for migration to the ERP: consequently, the production data for these two factories are not available in the company Datawarehouse. Moreover, the forecast data are available neither in the company IS nor in the Datawarehouse.

In the end, the Business Intelligence environment contains only the data to fill in the two top-left boxes, in other words just one quarter of the table:

The rest of the table will have to be filled in using data absent from the IS: being declarative data, they are most often collected, stored and analyzed in Excel, directly by the business teams.
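The gap can be sketched in a few lines of Python. All figures, factory names and the "warehouse vs. collected" split are purely illustrative of the scenario above, not real data:

```python
# The Datawarehouse only covers actual production for factories A and B;
# the remaining cells must come from declaratively collected "Small Data"
# (all values below are made up for illustration).
warehouse = {("A", "actual"): 1200, ("B", "actual"): 950}
small_data = {("C", "actual"): 430, ("D", "actual"): 120,
              ("A", "forecast"): 1150, ("B", "forecast"): 1000,
              ("C", "forecast"): 450, ("D", "forecast"): 130}

def cell(factory, measure):
    # The BI environment alone fills only one quarter of the table;
    # the rest has to fall back on the manually collected figures.
    return warehouse.get((factory, measure),
                         small_data.get((factory, measure), "n/a"))

for f in "ABCD":
    print(f, cell(f, "actual"), cell(f, "forecast"))
```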


All enterprises find themselves faced with situations similar to the above case. For these data neglected by the enterprise's IT, the ad hoc solution adopted by business functions is generally Excel: the spreadsheet makes it possible to devise collection forms, retain the collected data, consolidate them and analyze the results. Excel is simple to implement and requires no particular skills; it is the "Swiss Army knife" of business teams.

Unfortunately, using Excel in this context poses two problems. On the one hand, when the number of contributors or data points to be collected grows, the mechanics of Excel become terribly time-consuming. Visual Basic macros or Access can improve productivity, but they require more technical skills and prove difficult to maintain: consider how many VB macros or Access databases become simply unusable barely one or two years after being put in place.

On the other hand, data collected via Excel generally cannot be reintegrated into the Business Intelligence environment. To cross-reference them with other data, that other data must be extracted from the IS and the relevant analyses and reports produced directly in Excel. Not only is the use of Excel thus doubly time-consuming (for collection and for analysis), but it also diverts users away from the reporting and analysis tools deployed in the enterprise, in most cases at great cost.

Many business teams find themselves victims of their own success: having demonstrated the value of the data collected via Excel, they are condemned to "produce" them on a regular basis, inevitably tying up their team members in mechanical and unrewarding tasks…

At the same time, the enterprise's classic BI investments end up under-utilized: built on an incomplete data model, they cannot satisfy all reporting needs; users consequently turn away from them, and their profitability is called into question.


Faced with this problem, the traditional response of IT departments is generally to propose developing an ad hoc Web HMI. This approach, however, runs into a major difficulty: it does not allow the enterprise to capitalize on the existing Excel set-up, even though that set-up represents a significant, operational investment. The traditional response amounts to "scrapping everything to rebuild everything". It involves significant costs and lead times that can be reduced only by trading off functionalities: business users are then reluctant to accept a technology transfer that remains costly yet loses functionality. All the more so given that such solutions can never evolve as easily as Excel, and that this type of project carries a significant "tunnel effect" (long periods without visible deliverables).

Yet other solutions exist. They make it possible to achieve numerous objectives that would be unimaginable with Excel (even supplemented by VB or Access):

  • Automation of the collection process, eliminating all manual tasks (preparation of forms from existing business repositories, distribution, monitoring of replies, reminders, consolidation…),
  • More reliable data, thanks to complex checks at data entry and/or audit trails,
  • Automatic integration of the collected data into a structured database that can be interconnected with the rest of the IS.
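The last two objectives can be sketched together: validate each contribution at entry, log rejections as a rudimentary audit trail, and load the valid rows into a structured database. This is a minimal illustration with a hypothetical schema and hypothetical business rules, not any vendor's implementation:

```python
import sqlite3

# Hypothetical rows parsed from returned collection forms
# (factory names, fields and values are illustrative).
collected = [
    {"factory": "C", "period": "2024-01", "actual": 1250, "forecast": 1300},
    {"factory": "D", "period": "2024-01", "actual": 480, "forecast": 500},
]

def validate(row):
    """Entry-time checks of the kind described above."""
    errors = []
    if row["factory"] not in {"A", "B", "C", "D"}:  # business-repository check
        errors.append("unknown factory")
    for field in ("actual", "forecast"):
        if not isinstance(row[field], (int, float)) or row[field] < 0:
            errors.append(f"{field} must be a non-negative number")
    return errors

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE production (
    factory TEXT, period TEXT, actual REAL, forecast REAL,
    PRIMARY KEY (factory, period))""")

for row in collected:
    problems = validate(row)
    if problems:
        print(f"rejected {row['factory']}: {problems}")  # audit-trail entry
    else:
        conn.execute("INSERT INTO production VALUES (?,?,?,?)",
                     (row["factory"], row["period"],
                      row["actual"], row["forecast"]))
conn.commit()
```

Once in a structured table, the collected figures can be joined with the rest of the Datawarehouse rather than living in isolated workbooks.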

In addition to these three major benefits, the success of these technologies is down to several factors:

  • Their low deployment cost, thanks notably to their minimal technical coupling with the rest of the IS and to their capacity to build on the existing Excel set-up (which generally embodies genuine business expertise),
  • Their low operating cost, based both on low pricing and straightforward maintenance,
  • The simplicity of their implementation, allowing the business teams to retain control over the collection processes and their development,
  • Their favorable upgradability and their long-term stability.

For these technologies, cost is obviously paramount: most often, the financial gains expected from industrializing the process are too small to justify the cost of a traditional Web HMI-type solution. Simplicity of implementation and upgradability also matter greatly to business teams, who are generally wary of heavy project workloads and of losing control of their key processes.


Let us begin by noting that Microsoft's BI platform enables Excel to be integrated within a complete BI architecture, broadly designed around the spreadsheet but integrating all data-management functionalities: ETL, DBMS, and management of business repositories. It is therefore a natural BI choice for "Excel-intensive" organizations.

Let us continue by considering the "niche" business expertise solutions. These are solutions dedicated to a specific business function. All the same, there has to be a certain critical mass and a relative homogeneity of processes to allow a software package to emerge for a specific business function. It is for these reasons that these "business expertise solutions" mainly concern the two departments that can be found in all enterprises: the Human Resources Department (strategic workforce planning, social audit, etc.) and the Finance Department (cash management, budgetary monitoring, etc.). By way of some (non-exhaustive) examples, there are BPC (SAP), IBM TM1, SAS Financial Management, Tagetik or Anaplan for budgetary planning, IBM Varicent for variable remuneration management and Enablon for Corporate Social Responsibility.

Finally, for more specific needs, when no turn-key business expertise solution exists, general-purpose solutions enable processes to be automated at little cost. The solutions offered by the market share the following characteristics:

  • No pre-imposed data model: for each business case, the model receiving the collected data is freely defined according to the specific needs. This allows the implementation to be very flexible and facilitates the sharing of data with the rest of the IS.
  • The possibility of collecting not only data in the traditional sense but also metadata, comments, and attachments of all sorts.
  • Traceability of contributions and collection/validation workflow.
  • Very simple deployment of collection forms since they do not require the installation of a thick client on the computers of contributors.
  • A significant level of autonomy for business teams in the management of the solution, including in the updating of business repositories.
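The "no pre-imposed data model" characteristic can be illustrated with a generic store in which every contribution is a (campaign, contributor, field, value) record, carrying a free comment and a timestamp for traceability. The schema and names below are illustrative assumptions, not any product's actual model:

```python
import sqlite3
from datetime import datetime, timezone

# One generic table serves any business case: the receiving model is not
# fixed in advance, and each record keeps who contributed what, and when.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE contribution (
    campaign TEXT, contributor TEXT, field TEXT,
    value TEXT, comment TEXT, submitted_at TEXT)""")

def submit(campaign, contributor, fields, comment=""):
    ts = datetime.now(timezone.utc).isoformat()  # traceability timestamp
    for field, value in fields.items():
        conn.execute("INSERT INTO contribution VALUES (?,?,?,?,?,?)",
                     (campaign, contributor, field, str(value), comment, ts))
    conn.commit()

# A production campaign with a comment attached to the figures…
submit("budget-2024", "factory_C", {"actual": 430, "forecast": 450},
       comment="pre-ERP figures, manually reconciled")
# …and an unrelated CSR survey, with no schema change needed.
submit("csr-survey", "site_paris", {"energy_kwh": 18250})
```

Because the store is an ordinary relational table, the collected data can be queried and shared with the rest of the IS directly.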

By way of an illustration, we can mention two solutions that each have their specific strongpoints:

Referential Data Administration (or RDA) by Vision BI[4]: this solution enables a business-function administrator to create, in just a few clicks, a web interface for entering referential data. The tool can be likened to a very simple MDM[5] and enables simple consistency checks to be put in place. The key aspects of this solution are:

  • The full autonomy of business function teams in the creation of new forms;
  • The immediate way in which a new form can be implemented and deployed;
  • The ability to address several data sources simultaneously from one and the same portal; this enables the collected data to be positioned directly at the heart of the IS.

Gathering Tools (or GT) by Calame Software: this solution makes it possible to create secure data-entry forms that exactly replicate the appearance and functionalities of even the most complex Excel workbooks. The key aspects of the solution are:

  • The GT forms are created directly from the Excel workbooks; that makes it possible to capitalize on what already exists, and notably on the business rules and existing ergonomics. This enables project cycles to be very short and change management extremely simple for the business function;
  • The GT forms offer functions for checking data entry and consistency that are much more advanced than Excel: this helps greatly in improving the quality of the data collected;
  • The workflow system, which makes it possible to automate the management of collection campaigns (pre-completion of forms and their personalization for each recipient, monitoring of responses, sending of reminders, integration of responses); campaigns involving several hundred contributors can be managed in a few clicks.
  • The possibility of working in online or offline mode.
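The campaign mechanics described above (personalized pre-completed forms, monitoring of responses, automatic reminders) can be sketched in a toy form. Names and structures are illustrative only, not Gathering Tools' actual API:

```python
# Each recipient gets a form pre-filled with their own repository data.
recipients = {"factory_A": {"prefill": {"factory": "A"}},
              "factory_C": {"prefill": {"factory": "C"}},
              "factory_D": {"prefill": {"factory": "D"}}}
responses = {"factory_A": {"actual": 1200}}  # only one reply so far

def campaign_status():
    # Monitoring: who has answered, who still needs chasing.
    pending = sorted(set(recipients) - set(responses))
    return {"answered": sorted(responses), "pending": pending}

def reminders():
    # One personalized reminder per missing contributor.
    return [f"Reminder to {r}: please return your production form "
            f"(pre-filled for factory {recipients[r]['prefill']['factory']})."
            for r in campaign_status()["pending"]]

for message in reminders():
    print(message)
```

With the bookkeeping automated like this, scaling a campaign from three contributors to several hundred changes nothing for the campaign manager.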

Both solutions automatically integrate the collected data into a structured database that can be used directly for reporting: these data are thus immediately consumable by the IS's other applications, notably the BI tools. As a result, these tools are popular with business functions and IT alike.

In terms of methodology, the solutions presented are entirely accessible to users who are not IT experts, so business teams can retain control over their processes. That said, this type of project must not be pursued "in opposition" to IT, but rather in close collaboration with it, to ensure that the collected data are duly valorized across the enterprise's entire Business Intelligence environment. This coordination, along with choosing the right tool and upskilling the teams, may require initial assistance.

In conclusion, even as Big Data emerges, it is important not to lose sight of the fact that very significant gains in productivity, quality and performance can still be had by optimizing the processes for collecting "Small Data". New tools now make it possible to achieve these gains with a very favorable ROI, while valorizing the investments already made by the business teams. This type of project is both a real lever for improving the enterprise's performance and a way of valorizing the business team members themselves, who can refocus on assignments delivering greater added value.

By Fakhreddine AMARA, Director, Consulting and Integration at Keyrus, and Sébastien PREAU, Manager, BI Consulting at Keyrus.


[1] HMI (Human Machine Interface): an application enabling a business user to enter data into an information system.

[2] ETL (Extract, Transform, Load): an IT tool enabling data to be transformed in order to transfer them from one information system to another.

[3] ERP: Enterprise Resource Planning.

[4] Vision BI is a subsidiary of the Keyrus Group.

[5] MDM (Master Data Management) is a field of IT concerned with the synchronized management of the enterprise's key business repositories (repositories of customers, third parties, products, structure centers, etc.). MDM generally relies on dedicated software packages that keep the various sources and uses of those repositories synchronized.