What are data silos?

“The goal is to turn data into information, and information into insight.” – Carly Fiorina (former executive, president, and chair of Hewlett-Packard Co.)

Data is the lifeblood of an organisation. It is constantly converted into insights. But when data has not been used in day-to-day operations for a while, the repository that holds it becomes siloed, i.e., isolated.

PC Magazine defines siloed data as “a separate database or set of data files that are not part of an organization’s enterprise-wide data administration”. However, a silo is not always a wholly separate database. Siloed information also exists where management systems do not integrate seamlessly or where file transfers are subject to synchronisation problems.

Data silos may occur where:

a) various packaged applications use different database management systems, and integration between them does not occur consistently

b) departments build applications on their own separate databases using agile methodologies, and the department heads who control those databases do not make the data available to other applications.

Why are data silos considered an IT pest?

# isolation of data is a barrier to productivity

# isolation reduces flexibility and disrupts the seamless flow of operations

# there is an increased risk of data getting overwritten with outdated data

# where there is more than one silo or self-contained repository of data, confusion arises about which siloed copy is legitimate or more recent. Data silos thus raise concerns about data integrity.
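The last point, working out which siloed copy is more recent, can be illustrated with a small Python sketch. It assumes that each silo stores a last-modified timestamp alongside every record. The silo names, record layout and `updated_at` field are hypothetical and chosen only for illustration.

```python
from datetime import datetime

# Hypothetical copies of the same customer records held in two separate silos.
sales_silo = {
    "C001": {"email": "ann@example.com", "updated_at": datetime(2023, 1, 10)},
    "C002": {"email": "bob@example.com", "updated_at": datetime(2023, 3, 5)},
}
support_silo = {
    "C001": {"email": "ann.new@example.com", "updated_at": datetime(2023, 6, 1)},
    "C002": {"email": "bob@example.com", "updated_at": datetime(2023, 2, 1)},
}

def find_conflicts(silo_a, silo_b, field):
    """Return the IDs held in both silos whose `field` values disagree."""
    shared_ids = silo_a.keys() & silo_b.keys()
    return sorted(i for i in shared_ids if silo_a[i][field] != silo_b[i][field])

def most_recent(silo_a, silo_b, record_id):
    """Pick the copy of a record with the later `updated_at` timestamp."""
    return max(silo_a[record_id], silo_b[record_id],
               key=lambda record: record["updated_at"])

conflicts = find_conflicts(sales_silo, support_silo, "email")
resolved = {i: most_recent(sales_silo, support_silo, i) for i in conflicts}
```

Without a trustworthy timestamp in every silo, this kind of reconciliation is not possible, which is one reason siloed copies are hard to trust.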

Ways to resolve or prevent the occurrence of data silos

Using the cloud for backing up and archiving data – Data that is not accessed regularly can be kept in a single cloud archive, ensuring data integration across the organization.

Implementing a Service Oriented Architecture (SOA) – This enables seamless communication between internal and external business operations, resulting in reduced IT costs.

An SOA with a data services infrastructure maintains data consistency and integrity “through synchronization, data agility through semantic mapping, and data access optimization through caching”.
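As a rough illustration of two of these ideas, the Python sketch below shows semantic mapping as a translation of department-specific field names into one shared vocabulary, and caching as a memoised lookup. It is a toy under stated assumptions, not how any particular SOA product works. The field names, the `BACKENDS` stand-in and the `get_customer` function are all hypothetical.

```python
from functools import lru_cache

# Hypothetical mappings from each department's field names to a shared vocabulary.
FIELD_MAP = {
    "sales": {"cust_id": "customer_id", "mail": "email"},
    "support": {"client_no": "customer_id", "email_addr": "email"},
}

def to_canonical(source, record):
    """Rename a record's fields to the shared vocabulary (semantic mapping)."""
    mapping = FIELD_MAP[source]
    return {mapping.get(name, name): value for name, value in record.items()}

# Stand-in for the departmental databases a real data service would query.
BACKENDS = {
    "sales": {"42": {"cust_id": "42", "mail": "ann@example.com"}},
}
fetch_log = []  # records every lookup that actually reaches a backend

@lru_cache(maxsize=128)
def get_customer(source, customer_id):
    """Fetch and translate a record; repeat calls are served from the cache."""
    fetch_log.append((source, customer_id))
    return to_canonical(source, BACKENDS[source][customer_id])

first = get_customer("sales", "42")
second = get_customer("sales", "42")  # served from the cache, no second fetch
```

Because the cached value is a shared dictionary, callers in this sketch should treat it as read-only.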

Following a mapped route to data integration – This involves a clear-cut vision of business needs and a well-planned IT infrastructure in which different applications running on multiple relational databases are synchronized for consistency.
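A first step towards keeping relational databases synchronized is detecting where they have drifted apart. The sketch below is one minimal way to do that, using Python's built-in `sqlite3` module with two in-memory databases. The `customers` table, its columns and the sample rows are assumptions made for the example. A real deployment would connect to the actual database systems involved.

```python
import sqlite3

def make_db(rows):
    """Create an in-memory database with a hypothetical customers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (id TEXT PRIMARY KEY, email TEXT)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", rows)
    conn.commit()
    return conn

def load(conn):
    """Read the table into a dict keyed by customer id."""
    return dict(conn.execute("SELECT id, email FROM customers"))

def diff(conn_a, conn_b):
    """Report rows missing from either database and rows whose values differ."""
    rows_a, rows_b = load(conn_a), load(conn_b)
    shared = rows_a.keys() & rows_b.keys()
    return {
        "only_in_a": sorted(rows_a.keys() - rows_b.keys()),
        "only_in_b": sorted(rows_b.keys() - rows_a.keys()),
        "mismatched": sorted(k for k in shared if rows_a[k] != rows_b[k]),
    }

crm = make_db([("1", "ann@example.com"), ("2", "bob@example.com")])
billing = make_db([("1", "ann@example.com"), ("2", "robert@example.com"),
                   ("3", "cy@example.com")])
report = diff(crm, billing)
```

A report like this only flags inconsistencies. Deciding which database is authoritative for each field is part of the business vision described above.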

Bottom line: If you are a data scientist or another professional who deals with data every day, data silos are a problem you need to look out for. Backing up unused data to the cloud and using semantic mapping for data synchronization are ways to prevent such isolation of data.
