In 2013, TED (Technology, Entertainment, Design) speaker and bestselling author Dan Ariely referred to big data as the “crude oil” of the new millennium – hugely valuable, but useless if unrefined.
If big data really is this valuable, where and how should we be archiving the binary stockpile of black gold? This question is being asked more frequently by people in charge of production at manufacturing companies who have terabytes (TB) of information to store. Here, Johannes Petrowisch, Partner Account Manager at industrial automation software expert COPA-DATA, seeks to shed some light on big data archiving.
There are two main reasons why manufacturing companies want to archive large amounts of data on a long-term basis, and they are virtually the same across all industries. The first is compliance.
Traceability and seamless documentation of product history – including information on creation, quality and quantity – give a company the records it needs for accountability. Most importantly, the firm then has documented proof of its adherence to legal requirements.
The second reason, and the focus of this piece, is that knowledge is power.
In the midst of the current fourth industrial revolution, or Industry 4.0 as it is more commonly known, the more data you collate and analyze today, the stronger and more accurate your predictions will be tomorrow. Whether your motivation is quality management, predictive maintenance or simply staying ahead of the curve in innovation, there is a strong belief that the more data points archived, the better.
So, with companies now hoarding their data like precious fossil fuels, where’s best to keep this sensitive material?
Data, big or small, needs to be kept safe and readily accessible for analysis or there’s little point saving it in the first place.
Our experience at COPA-DATA has highlighted a recurring trend: the cheaper the storage medium, the slower and more cumbersome it is to re-read the data.
One less costly, and somewhat outdated, storage method is moving data to external media such as magnetic tape. This makes searching and extracting information cumbersome and time consuming, and it increases the risk of data loss or theft. The storage capacity is also often insufficient for the gargantuan amounts of data recorded in industrial processes today.
Alternatively, the data can be saved in a database. This keeps the data easily accessible, but is more expensive than other methods. For example, large quantities of information would need to be split into separate database shards, which increases running costs and the complexity of operations and maintenance, and turns each shard into a potential single point of failure.
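To make the sharding idea concrete, here is a minimal sketch of hash-based sharding – one common way archived records end up spread across separate database instances. All names (`shard_for`, `NUM_SHARDS`, the record keys) are illustrative assumptions, not any particular vendor's API:

```python
# Hypothetical sketch: distributing archived measurements across
# database shards by hashing a record key.
import hashlib

NUM_SHARDS = 4  # each shard is a separate database instance to run and maintain

def shard_for(record_key: str) -> int:
    """Map a record key (e.g. 'plant-7/line-2/day-3') to a shard index."""
    digest = hashlib.sha256(record_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % NUM_SHARDS

# Every shard the archive depends on must stay online, backed up and
# patched - which is where the extra running cost and failure risk come from.
keys = [f"plant-7/line-2/day-{d}" for d in range(10)]
placement = {k: shard_for(k) for k in keys}
```

The hash makes placement deterministic, so any client can locate a record without a central lookup table, but re-sharding (changing `NUM_SHARDS`) forces most keys to move.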
So where should the masses of data go?
Many are now turning towards the cloud for their archiving needs. Indeed, COPA-DATA estimates that updating to a big data cloud solution could reduce a company’s total cost of ownership (TCO) for storage by 40-60 percent.
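To illustrate what that estimate means in practice, a short worked example. The 40-60 percent range is COPA-DATA's figure from the text; the baseline cost is an invented number purely for illustration:

```python
# Hypothetical arithmetic only: the 40-60 % reduction is the estimate
# quoted in the article; the 100,000 baseline is an invented figure.
baseline_tco = 100_000  # annual storage TCO before migrating (currency units)

low, high = 0.40, 0.60            # estimated reduction range
savings_low = baseline_tco * low   # smallest estimated saving
savings_high = baseline_tco * high # largest estimated saving

# Post-migration TCO would land somewhere in this range:
cloud_tco_range = (baseline_tco - savings_high, baseline_tco - savings_low)
```

On these assumed numbers, annual storage TCO would fall from 100,000 to somewhere between 40,000 and 60,000.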
When collecting plant data, large quantities need to be archived on a daily basis. The elasticity of the cloud makes it ideal for such scenarios. Cloud-based storage is capable of rapidly processing large volumes of unstructured and often heterogeneous data to identify patterns that, in turn, can be used to improve business strategies – even in real time.
Unsurprisingly, big data environments require big supporting structures. Clusters of servers are used to support the tools necessary for processing large volumes of information. The added benefit of cloud storage is that it already runs on pools of server, storage and network resources, which can be scaled up and down as needed.
Security in industry has become of paramount importance in recent years. With firms compiling and archiving ever larger amounts of historical data, the need for increased security has become apparent. As previously mentioned, archiving data on external physical media, such as magnetic tape, increases the risk of the information being lost or stolen.
However, with the cost of cyber attacks on critical infrastructure now measuring into billions of dollars’ worth of damage, and the recent infamous celebrity cloud hacking scandal, you’d be forgiven for thinking that these forms of storage were just as vulnerable.
In reality, the celebrity scandal was less a hack of cloud storage itself and more the abuse of personal information that was already available online.
Cloud archiving is actually one of the safest methods of big data storage, ensuring against both theft and loss.
Automatic backup, redundancy, disaster recovery and hardware encryption protect against both data loss and unauthorized access. Another security advantage is that metadata is only saved in the local runtime application.
Conventional native archiving technology – consisting of aggregated archives, dynamic re-readability and trend evaluations – ensures that data points are not saved locally on the panel or PC, but on a hardware appliance in the internal network, the CiS. This dynamic storage gateway, with a current capacity of 120 TB per device, moves the data to Azure cloud storage and safely archives it there.
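The tiering pattern described above – buffer data on a local appliance, then move it to cloud storage – can be sketched in a few lines. This is a simplified illustration under assumptions, not the actual CiS or Azure API: a plain dictionary stands in for the cloud archive, and all names are hypothetical. A real gateway would write to a service such as Azure Blob Storage:

```python
# Minimal sketch of a tiered storage gateway: records accumulate in a
# local buffer and are flushed to a (simulated) cloud archive once the
# buffer reaches capacity. All names are hypothetical.

class StorageGateway:
    def __init__(self, capacity: int):
        self.capacity = capacity   # local buffer size that triggers a flush
        self.local_buffer = []     # stands in for the on-premises appliance
        self.cloud_archive = {}    # stands in for cloud storage
        self._batch = 0

    def write(self, record):
        self.local_buffer.append(record)
        if len(self.local_buffer) >= self.capacity:
            self.flush()

    def flush(self):
        """Move buffered records to the cloud archive and clear the buffer."""
        if self.local_buffer:
            self.cloud_archive[f"batch-{self._batch}"] = list(self.local_buffer)
            self._batch += 1
            self.local_buffer.clear()

gw = StorageGateway(capacity=3)
for value in range(7):
    gw.write(value)
# After seven writes: two full batches archived, one record still buffered.
```

The design choice mirrors the article's point: the panel or PC never holds the archive itself; the gateway decides when data moves off-site.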
For these reasons, COPA-DATA believes it makes sense for companies to look to cloud computing for their analytical and archiving needs. For today’s businesses, properly mining and refining data – turning an artefact into an asset – drives innovation and creates competitive advantage. A cloud-based big data archiving system lets a business analyze at scale without security or cost worries.