[RECAP] “Avoid the Data Swamp: Creating a Clear IIoT Data Lake Using Standards Such as MTConnect”

On June 8, 2020, AMT and MTConnect hosted Lockheed Martin Fellow Jan de Nijs for the online webinar, “Avoid the Data Swamp: Creating a Clear IIoT Data Lake Using Standards Such as MTConnect.” Jan, a mechanical engineer and expert in measurement and inspection, has been with Lockheed Martin since 2004. He currently serves as LM’s Corporate IIoT Agile Build Lead and was recently named a Research Fellow at the company. His presentation centered on LM’s strategy for managing the tremendous amounts of data gathered from its machines and applications, and on how the company puts that data to use.

Jan began by introducing the concepts of a “data lake” and “data swamp.” A data lake is a common tool for housing and managing large datasets, but without the right controls, cleanup procedures, and organization, it can turn into a murky, difficult-to-use mess – a data swamp. Modern machine tools allow users to gather all types of data – temperature, speed, accuracy, etc. The same is true of sensors, sensor controllers, other production equipment, and a variety of other factory hardware. Ideally, users would send this data into cloud-based storage, a data lake, where they could access it for analysis, to maximize efficiency, and for other uses. However, because so many machines are built upon different protocols by different manufacturers, they “speak” different languages. Data collected as a short string of text may require experts to translate. Does that number in the string mean a temperature? If so, one machine may report it in Celsius while another uses Fahrenheit. Additionally, is the data tagged to define its origin? These issues compound over time and as more equipment is added. The data lake then turns into a murky data swamp of uncertainty and wasted resources.
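The Celsius-versus-Fahrenheit problem described above can be made concrete with a small sketch. It was not part of Jan's presentation; the `Reading` type, its field names, and the machine name are hypothetical and exist only for illustration. The point is that a bare number is ambiguous, while a value that carries its source, quantity, and unit can be converted safely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    """A measurement that carries its own context (hypothetical structure)."""
    source: str    # which machine produced the value
    quantity: str  # what the number measures, e.g. "temperature"
    value: float
    unit: str      # e.g. "celsius" or "fahrenheit"


def to_celsius(reading: Reading) -> Reading:
    """Return an equivalent temperature reading expressed in Celsius."""
    if reading.quantity != "temperature":
        raise ValueError(f"not a temperature: {reading.quantity}")
    if reading.unit == "celsius":
        return reading
    if reading.unit == "fahrenheit":
        celsius = (reading.value - 32.0) * 5.0 / 9.0
        return Reading(reading.source, reading.quantity, celsius, "celsius")
    # Refuse to guess: an unknown unit is exactly how a swamp starts.
    raise ValueError(f"unknown unit: {reading.unit}")


# Example: a reading from an illustrative machine name.
boiling = to_celsius(Reading("mill-1", "temperature", 212.0, "fahrenheit"))
```

Without the `unit` and `source` fields, the value `212.0` could not be interpreted at all, which is the situation the paragraph above describes.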

According to Jan, the MTConnect standard offered Lockheed Martin a semantic vocabulary for manufacturing equipment to provide structured, contextualized data in a non-proprietary format. In other words, the MTConnect standard offered a practical pathway to turn data swamps into crystal clear data lakes. By applying the semantic definitions in MTConnect to raw, untranslated data gleaned from an overabundance of industrial protocols such as Profinet, Profibus, OPC UA, OPC DA, and many others, developers and integrators are able to bring all these different protocols to the same page. With uniform data, these users could focus on useful, productive manufacturing applications rather than translation. Applications that consume MTConnect data provide more efficient operations, improved production optimization, and increased productivity. Currently, Jan noted, LM’s data scientists working with operational technology data spend 90% of their time cleaning up data.
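As a rough illustration of what "bringing different protocols to the same page" means, the sketch below maps source-specific tag names onto one shared vocabulary before anything is stored. It is an assumption-laden toy, not Lockheed Martin's workflow and not the MTConnect data model: the source names, raw tag strings, and shared names in `TAG_MAP` are all invented for the example.

```python
# Hypothetical lookup table: (source, raw tag) -> (shared name, unit).
# None of these identifiers come from the MTConnect specification.
TAG_MAP = {
    ("opcua-lathe", "ns=2;s=SpindleTemp"): ("spindle_temperature", "celsius"),
    ("profinet-mill", "DB10.T1"): ("spindle_temperature", "celsius"),
}


def contextualize(source: str, raw_tag: str, value: float) -> dict:
    """Translate one raw sample into a record that uses the shared vocabulary."""
    key = (source, raw_tag)
    if key not in TAG_MAP:
        # Reject unmapped tags instead of storing data nobody can interpret.
        raise KeyError(f"no mapping for {raw_tag!r} from {source!r}")
    name, unit = TAG_MAP[key]
    return {"source": source, "name": name, "unit": unit, "value": value}


# Two machines with different native tags produce records with the same name.
lathe = contextualize("opcua-lathe", "ns=2;s=SpindleTemp", 41.2)
mill = contextualize("profinet-mill", "DB10.T1", 39.8)
```

Once every record shares the same names and units, a downstream application can query `spindle_temperature` across machines without knowing which protocol each value came from.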

Jan continued his presentation by describing Lockheed Martin’s two approaches to standardizing their data into a clear data lake. The first approach was a workflow that included legacy machines and other machines that are not MTConnect-ready, and the second approach assumed MTConnect-ready items and skipped some of the first approach’s steps. Once a data lake was in place for storage and proper information management, users would then be able to access and utilize the data as needed; Jan discussed several use cases that LM is exploring with such data.

Jan closed his presentation with a call to users and standards development organizations to work together to purposefully direct and shape the standards that will harmonize the industry’s data. He argued that competitors on regional and global levels should cooperate to ensure the production of acceptable standards, as doing so later would become too expensive. The audience asked several questions in the Q&A session that followed, including: At what frequency are you saving machine data to the cloud? What is the sampling rate at which you’re pulling data off the machine? What is the volume of data you are pulling? At what point is contextualization added to the data? What has been the hardest part of the IIoT journey? And many more. 

If you found the MTConnect webinar “Avoid the Data Swamp: Creating a Clear IIoT Data Lake Using Standards Such as MTConnect” helpful or would like to attend seminars on standards in the future, check out the upcoming Open Industrial Digital Ecosystem Summit. This two-day virtual workshop will be held June 30 – July 1, 2020, and will bring together industry experts, users, and standards development organizations to discuss the topic “Enabling Vendor-neutral, Standards-based Interoperability.” Join us to give your insights and help shape the future of standards development.