Knowledge Graphs for a Circular Economy

Hello! My name is Lillian Jirousek, and I am a rising junior at Northwestern majoring in Math, Physics, and Integrated Science. This summer I have been working with Dr. Chaudhuri in order to find a pathway to reach a circular economy. I am collecting data on the production and uses of cellulose in order to model its life cycle in the creation of bioplastics and the various elements (temperature, microbes, usage etc.) that may alter that life cycle over time.

Goals of the Circular Economy

What most of us are used to in the present day is what’s known as a linear economy. Materials are gathered from one location, taken to another place to be manufactured, and then taken somewhere else to be distributed and eventually disposed of in a landfill. This generates enormous amounts of material waste and adds to CO2 and other greenhouse gas emissions. Creating items doomed to become waste in landfills or the ocean, as is currently practiced across the world, is a high-carbon-footprint approach. A far more desirable alternative is a circular economy, which typically involves people using locally sourced materials that they subsequently reuse and repurpose, greatly reducing both material waste and CO2 emissions.

In order to reach this circular economy, many different products must be produced, such as feedstocks and metals, but here we will focus on bioplastics. These bioplastics can come from renewable sources of biomass and may eventually degrade and return nutrients to the soil. However, biodegradation is not guaranteed in bioplastics, as many of the monomers used come from petroleum-based sources. One of the key components of most bioplastics is cellulose, which is the focus of my project.

Use of Graph Theory

Describing something as complicated as the life cycle of a bioplastic, from the making of cellulose in a plant to its degradation via composting much later, is quite a daunting task. Because of this, we use something known as a knowledge graph. A knowledge graph consists of various nodes, representing different entities, connected by edges that describe the relationships between those entities. A common example could be a group of nodes representing genes in a plant, with edges connecting genes that appear to be coexpressed and thus likely implicated in the same steps of cellulose production.
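
To make the idea concrete, here is a minimal sketch of such a graph in Python using the networkx library; the gene names and relationships are placeholders for illustration, not data from our actual graph.

import networkx as nx

# Toy knowledge graph: nodes are entities, edges carry the relationship type.
kg = nx.Graph()
kg.add_node("CESA1", kind="gene")
kg.add_node("CESA3", kind="gene")
kg.add_node("cellulose synthesis", kind="process")

kg.add_edge("CESA1", "CESA3", relation="coexpressed_with")
kg.add_edge("CESA1", "cellulose synthesis", relation="implicated_in")
kg.add_edge("CESA3", "cellulose synthesis", relation="implicated_in")

print(list(kg.edges(data=True)))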

Components of the Knowledge Graph

We have divided my project into a few key portions of the cellulose/bioplastic life cycle that will go into our knowledge graph. These include: the genes that are commonly expressed to produce cellulose in a plant, the process of making cellulose, differences in cellulose production across species, the structure of cellulose and cellulose nanocrystals, the bioplastics that can be made and their properties, the process of composting, and the microbes that control decomposition. Much of this information is scattered across various databases and papers, or is not known at all. Because a circular economy generally operates on a local scale, one needs to know specifics about how the locally grown plants produce cellulose, the soil composition in the area, and the properties of the bioplastics that could be made. All of these variables could drastically change the cellulose life cycle in a given area. Thus, we may not actually arrive at a method to manifest a circular economy, but we can certainly show how we would and what information we may need in the future.

With the information that we have, we may create a graph neural network to describe the entire problem. In order to relate seemingly unrelated things (e.g., genes that are expressed to make cellulose, the uses of materials such as appliance parts or plastic bags, and the microbes that decompose bioplastics), we will use an embedding matrix to place all of our items in a distributed representation that identifies each item by a distinct set of properties. We will then use machine learning methods to determine one of the best possible paths to circularity.
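
As a rough illustration of the embedding idea (not our actual model), the sketch below uses PyTorch to give each entity, whatever its type, a learned vector of the same length; the entity names and dimension are invented.

import torch
import torch.nn as nn

# Invented toy vocabulary mixing genes, material uses, and microbes
entities = ["CESA1", "plastic_bag", "appliance_part", "compost_microbe"]
index = {name: i for i, name in enumerate(entities)}

# Every entity is mapped to a 16-dimensional distributed representation
embedding = nn.Embedding(num_embeddings=len(entities), embedding_dim=16)

vectors = embedding(torch.tensor([index["CESA1"], index["compost_microbe"]]))
print(vectors.shape)  # torch.Size([2, 16])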

Conclusion

The linear economy that we are all used to is simply not tenable in the long term. The abundance of potential bioplastics makes it inexcusable not to reach toward a more sustainable future. We thus aim to propose a framework for developing a circular economy, provided enough information becomes available in the future.

Understanding the Environmental Impacts of COVID-19 Plastic Waste for Bioplastic Replacement

Hello there! My name is Ting Ting Li, and I am a rising sophomore at Northwestern University studying Biomedical Engineering. This summer, I have been working with 16 other students under Margaret MacDonell and the Responsible Innovation for bioPlastics in the Environment (RIPE) initiative in the Environmental Science Division (EVS) of Argonne National Laboratory. I have been focused on understanding the environmental impacts of plastic wastes that increased during the COVID-19 pandemic, notably personal protective equipment (PPE), as well as the potential for bioplastic replacements for these petroleum-based products, all written into a literature review.

Independent of the review publication, the purpose of RIPE is to develop a database and an environmental reference framework for plastics that allows developers to consider the end-of-life impacts of their material selections and promotes bioplastic replacement. To do so, current literature was reviewed, with relevant data extracted and synthesized for the database. My work on COVID-19 plastic wastes informed one of several potential opportunities for the RIPE database to support bioplastic replacements for conventional plastics, so that it will be available for future pandemics, when PPE use and waste generation will dramatically increase again.

Sources of COVID-19 Plastics

During the pandemic, there has been increased consumption of single-use plastics in various sectors, including healthcare and the general public. These increases can be attributed to both governmental mandates and consumer fears. The implementation of mask mandates and lockdowns worldwide has led to an increase in PPE use, as well as an increase in deliveries using single-use food packaging and utensils. Furthermore, as variants with higher transmissibility develop, consumer fears have led to a phenomenon of “functional fear,” including increased preventative behaviors such as wearing face masks. The main plastics that have experienced these increases are depicted in Figure 1 below, although this is not an exhaustive list. Note that limitations in the current literature led me to focus on the impacts of face masks and shields, which are composed mainly of polypropylene (PP), polyethylene (PE), and polyvinyl chloride (PVC).

Figure 1: Main Sources of COVID-19 Plastic Wastes 

COVID-19 Plastic Waste Management Crisis 

Current literature uses different methods to estimate how much PPE waste has been generated, relying on parameters such as face mask acceptance rates and economically active populations, to name a few. Regardless of the method, estimates indicate that daily waste generation from face masks has reached 1 million tons worldwide. With 32% of plastic waste mismanaged and leaked into the environment even before the pandemic, according to the Ellen MacArthur Foundation, the sheer scale of plastic waste now poses a significant threat to the environment. To make matters worse, fears about outbreaks and transmission of the virus have led to the shutdowns of recycling facilities worldwide. As a result, most of the COVID-19 plastic waste generated is sent to landfills or incinerated, with residues remaining in the ash. In addition, a substantial amount of PPE is simply discarded (or leaked) as litter. These disposition paths can harm the environment, including through the release of chemicals that contaminate water and soil and enter the food chain.

Environmental Impact of COVID-19 Plastic Waste 

Depending on how long a waste plastic has been in the environment, it may exist as macroplastic or as microplastic, the latter having been present longer and having undergone more degradation. Figure 2 details some of the environmental impacts of PPE macroplastics and microplastics. These waste plastics can potentially harm both ecological and human health. Most face masks are made from nonbiodegradable plastics, and in addition to producing microplastics over time, plastic products can contain chemical additives that are toxic at certain exposure levels.

Figure 2: Environmental Impacts of Macroplastic and Microplastic PPE 

In addition to these environmental impacts, there are additional concerns about climate change emissions and fossil fuel depletion since most single-use plastics are derived from fossil fuels. The opportunity exists to replace petroleum-based plastics with bioplastics, and further to pursue plastics that are biodegradable at their end of life.  

Conclusion 

The next steps for this research include exploring current endeavors toward bioplastic replacements for PPE, as well as investigating potential bioplastic materials. Several considerations must be addressed in developing bioplastic face masks, including standards, breathability, and electrospinning capability. From here, I am eager to investigate the exciting applications of the RIPE database in promoting a circular economy that reduces the generation of plastic waste.

Designing a Soft Robotic Gripper with Closed-Loop Control

Hi! My name is Sophia Schiffer, and I am a rising sophomore studying Mechanical Engineering at Northwestern University. Earlier this summer, I assisted in the design process and created a demonstration for the soft robotic gripper developed by my mentor Dr. Jie Xu’s research team. Then, I transitioned to working on the control system and interfacing of the gripper under the guidance of Jie Xu and Dr. Chengshi Wang. I will continue to work on this objective for the remainder of the summer. The goal of the soft gripper project is to construct a “hand” that will provide general assistance in many applications where people and workplaces can benefit from the delicate touch and handling capabilities of soft robots.

Soft Robotics
The concept of the soft robot emerged recently, only in the last decade, and it represents a major shift in the field of robotics. In the past, robots were constructed almost exclusively from rigid materials such as steel, aluminum, and plastic. These hard components are usually electronically actuated, which adds failure modes in some applications, such as marine robotics, where all electronic components must stay dry for the robot to function. Additionally, rigid grippers are very precise: with intensive machine learning, they can acquire the skill of grasping a specific object. However, when the object changes, the robot must relearn how to pick up the new object from scratch.

Soft grippers revolutionize robotics for packaging and pick-and-place applications because they only have two states: open and closed. These grippers are made from soft, compliant materials, often silicone, which will not damage items even at full grip strength. This allows the gripper to close around an object of any shape by simply squeezing as hard as it can and letting itself adapt to the shape of the item. The use of softer materials in robotics also helps minimize injury in shared human-robot workspaces.

Soft Gripper at CNM
The team I am working with under Jie Xu at the Center for Nanoscale Materials (CNM) is developing a soft gripper that goes beyond the capabilities of the basic grippers widely used in packaging and similar industries today. The fingers on this gripper are unique in that they can not only bend but also extend, becoming longer and narrower.

Figure 1 – Cross-sectional view of air channels within one finger. Air is pumped into blue channels

For a soft finger to bend, air is pumped into one side of the finger, the side opposite the bending direction, while the other side remains unchanged. When one side of the finger expands to be longer than the other, this naturally forces a curvature in the finger. Our team’s design has three channels, two of which are filled with air to bend the finger (see Figure 1). Stretching the fingers can be accomplished by pressing air into all three channels simultaneously. The way to realize this expansion is to make sure that all three air channels only expand in the direction in which the finger points, not laterally. At this point, the team is still in the early stages of testing non-expanding, bend only fingers. Once these fingers are fully finalized, I look forward to trying and iterating the design of the expanding fingers until they are fully capable of squeezing into narrow openings and stretching to impressive lengths.

Figure 2 – CAD designs by Louis Wong. Design #1 (left), Design #2 (right)

The Demo
During the first four weeks of my internship, I was tasked with constructing a demonstration for the first iteration of the gripper, Design #1 (see Figure 2). The demonstration I designed highlights the advantages of soft fingers which can stretch and squeeze into narrow openings. Performing this demonstration with a hand-shaped gripper body presented many challenges, however. Among these were making the palm shrink as well as the fingers, or alternatively designing the thumb such that it could stretch even further than the other fingers so that the palm would not have to fit through a narrow tube or opening. In anticipation of these obstacles, I proposed an octopus-inspired gripper design. Based on this idea, another NAISE intern Louis Wong crafted Design #2 (see Figure 2). With the new design, I constructed a demonstration in which the gripper’s soft fingers must fit into a narrow opening. Once inside, the fingers touch unknown objects with multichannel soft sensors to categorize the mystery items. This is done entirely without visual aid, showing how the extendable fingers accomplish the goal of assisting humans by going beyond human capabilities.

Haptic Identification and Closed-Loop Control
My demonstration highlights the advantage of a robot capable of identifying objects without relying on visual data. Soft grippers can identify objects haptically by applying pressure to the objects’ surfaces. Sensors embedded on the surface of the soft fingers measure the amount of pressure required to create a slight deformation in the object’s surface, indicating the softness of the object. My team’s grippers will implement multichannel sensors, which include pressure sensing. Working with Chengshi Wang, I will develop the Python code for the closed-loop control system for Design #1. Most soft grippers use an open-loop system: the computer tells the gripper to perform a simple task, such as open or close, and receives no feedback from the gripper itself. In a closed-loop system, the code specifies how much air should be pumped into the channels within the fingers. As a result, the soft fingers apply a measurable amount of pressure to the object they are gripping. The sensors send this pressure data back to the computer, which then adjusts the amount of air pumped into the channels based on a desired value for surface pressure. For now, this feedback system will allow our gripper to apply just the right amount of pressure to pick up and manipulate objects with a reduced risk of damaging them. After further development, the multichannel sensors will also assist in haptic object identification.

The initial control system will use Arduino hardware, wired as shown in the diagram below (see Figure 3). To correctly regulate pressure in the soft fingers, I will use the Arduino IDE software in conjunction with the pySerial module for Python. Using Python enables collecting data from the fingers’ sensors, adjusting pressure setpoints, and sending these new inputs to the pressure regulators, all within the same program.
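
As a sketch of how that loop might look on the Python side (the port name, message format, and gain are placeholders; the real Arduino sketch and regulator define the actual protocol):

import time
import serial  # pySerial

PORT = "/dev/ttyACM0"     # hypothetical Arduino port
TARGET_PRESSURE = 5.0     # desired surface pressure reported by the finger sensor
GAIN = 0.1                # proportional gain for adjusting the regulator setpoint

with serial.Serial(PORT, 9600, timeout=1) as ser:
    setpoint = 0.0
    while True:
        line = ser.readline().decode().strip()    # sensor reading sent by the Arduino
        if not line:
            continue
        pressure = float(line)
        # Nudge the air-pressure setpoint toward the desired surface pressure
        setpoint += GAIN * (TARGET_PRESSURE - pressure)
        ser.write(f"{setpoint:.3f}\n".encode())    # new setpoint for the regulator
        time.sleep(0.05)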

Figure 3 – Closed-loop circuit for air pressure regulator

Next Steps

During the rest of my ten-week internship with NAISE, I will be focusing on coding the control system for one of the fingers in Design #1 (bending only). Before my final presentation, Louis Wong’s goal and mine is to have one finger printed, functional, and programmed such that it can sense an object by touching it. To demonstrate this, I will code the finger to retract from the object once it “feels” it. Once the code for a single finger is complete, the team will be able to apply it to the other four fingers so they may work together to pick up and place delicate objects. Following the internship, I will work on improving Design #2, researching inspirations from nature. I will contribute to the coding of the rest of the fingers in Design #1 and hopefully also work on the control system for the second gripper, which will be used in the demonstration that I created. I am excited to continue working on designing and developing code for both of the team’s grippers. I am very grateful for the mentorship of Dr. Jie Xu and the additional guidance I received from Dr. Chengshi Wang, which allowed me to learn and gain valuable experience in my time with NAISE so far. The potential of soft robotic technology far surpasses what has already been explored. I am hopeful that the technology I have been working on this summer will improve the quality of life for some, and the workplace for others, in the near future.


Machine Learning for Biomaterials Property Predictions

Hi folks, my name is Selin Cetin and I’m a rising senior studying materials science and engineering at Northwestern University. I’ve been working with Dr. Santanu Chaudhuri in the Applied Materials Division this summer on a machine learning project. The goal of this project, broadly, is to predict the properties of plant-based building blocks from their structures. The original focus of the project was on cellulose, with the aim of predicting the properties of cellulosic materials from their structures, but due to the complexity of cellulose’s structure, this turned out to be infeasible. We then shifted our focus to a more fundamental version of this task: learning the properties of “plant-based building blocks.” This encompasses a range of small carbohydrate molecules found in plants, including glucose, fructose, starch, and small portions of cellulose.

Why Biomaterials?

As climate concerns mount, the pressure to move toward sustainable manufacturing practices grows ever greater. Products containing fossil-fuel-based plastics attract particular scrutiny because of the significant amount of greenhouse gases released during their life cycle. A materials solution to alleviate this issue is to replace components in these products, if not the entire product itself, with biodegradable materials. Another sustainable research area of interest is that of the circular economy, in which product components are continuously reused and recycled to suit a selection of uses. To achieve these two goals of increased biomaterial usage in traditionally plastic products and development of circular economies surrounding biomaterial-based products, the “tunability” of materials must be unearthed. For example, if adding a certain functional group to a material decreases the tensile strength but increases the solubility, these are important relationships to uncover so that materials may effectively be recycled to suit different applications. 

The Database(s)

To implement machine learning, one must have a database on which to learn. The creation of this database comprised the first part of my project. The property we are targeting is the materials’ melting point, chosen because it is the property most frequently reported for the molecules of interest. Database size was a barrier: it was difficult to procure a large structure-property database containing only plant-based carbohydrates, so melting point was chosen to obtain the largest possible dataset. Future work may build on this project and target a different property if the necessary amount of data becomes available. To create the database, I used a web API to import a structure representation and the melting point of plant-based carbohydrates from PubChem [1]. The structure representation came in the form of SMILES strings, which are text-based representations of a molecule. An example of a SMILES string, for the molecule glucose, can be seen in Figure 1 [2]. This dataset was quite small, so I supplemented it with data from the Jean-Claude Bradley Open Melting Point dataset [3] and data from ChemSpider [4], which had to be added by hand.

C([C@@H]1[C@H]([C@@H]([C@H](C(O1)O)O)O)O)O
Figure 1: Glucose and SMILES string representation
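
For reference, here is a minimal sketch of the kind of PubChem lookup used to build the database, via the PUG REST web API; only the SMILES retrieval is shown, since the melting points came from PubChem annotations and the other sources listed above.

import requests

def fetch_smiles(compound_name):
    """Return the canonical SMILES for a compound, looked up by name on PubChem."""
    url = (
        "https://pubchem.ncbi.nlm.nih.gov/rest/pug/compound/name/"
        f"{compound_name}/property/CanonicalSMILES/JSON"
    )
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()["PropertyTable"]["Properties"][0]["CanonicalSMILES"]

print(fetch_smiles("glucose"))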

For the training of the model, I will also be using the ChEMBL dataset [5], a much larger dataset that contains drug-like molecules with bioactive properties. This will be done to improve the model’s ability to isolate important features from the molecules.

Machine Learning

We are able to use a larger dataset for much of the training process because of the role of feature extraction in machine learning. The structures of the molecules will first be featurized using either an Extended-Connectivity Fingerprint (ECFP) representation or a graph-based representation. The extraction process for an ECFP representation is shown in Figure 2 [6].

Figure 2: ECFP Feature Extraction
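
As an example of what featurization looks like in practice, this short sketch computes a Morgan/ECFP-style fingerprint for glucose with the RDKit library (the bit length of 2048 is a common but arbitrary choice):

from rdkit import Chem
from rdkit.Chem import AllChem

# Glucose SMILES string from Figure 1
smiles = "C([C@@H]1[C@H]([C@@H]([C@H](C(O1)O)O)O)O)O"
mol = Chem.MolFromSmiles(smiles)

# ECFP4 corresponds to a Morgan fingerprint with radius 2
fingerprint = AllChem.GetMorganFingerprintAsBitVect(mol, radius=2, nBits=2048)
features = list(fingerprint)   # 0/1 feature vector that a model can learn from
print(sum(features), "bits set out of", len(features))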

These representations can be produced from the SMILES strings contained in the database created from PubChem. The purpose of doing this is to format the data in a way that is easier for a machine learning model to learn. Then, the model can learn to connect the featurized structure of a molecule to the property of interest. It is important to note that machine learning does not elucidate the underlying physics of the structure-property relationships, but instead reveals patterns to researchers who may then investigate further.

Discussion with Dr. Prasanna Balaprakash revealed that much of the work done within a molecular machine learning model is simply identifying which features are important and which are not. This is something that can be generalized beyond the scope of plant-based carbohydrates and their melting points. It is likely that the features that matter for property prediction remain largely constant across most organic molecules; what changes is how those features influence the property of interest. Because of this, the final few layers of a model can be optimized for a particular dataset, which in this case is the melting point dataset of plant-based carbohydrates, while the bulk of the model, which may have been trained on a larger dataset for the prediction of a different property, remains the same. A visualization of this can be seen in Figure 3 [7].

Figure 3: Transfer learning: training a model to predict Cv and repurposing to predict Cp

This approach is called transfer learning, and it has been very successful in the field of image processing. Transfer learning has been successfully applied to images of molecules and raw SMILES strings, as seen in the paper ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction [8]. It has been used with varying degrees of success with other representations of molecules; a paper by Hu et al. found that pre-training on graph representations needed to be done at both the node level and full graph level to be effective [9]. I currently intend to attempt transfer learning with ECFP molecule representations.
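
The sketch below shows the general shape of that idea in PyTorch; the architecture, layer sizes, and training details are invented for illustration and are not the final model.

import torch
import torch.nn as nn

class PropertyNet(nn.Module):
    """A feature 'trunk' pretrained on the large dataset plus a small task head."""
    def __init__(self, n_bits=2048):
        super().__init__()
        self.trunk = nn.Sequential(nn.Linear(n_bits, 512), nn.ReLU(),
                                   nn.Linear(512, 128), nn.ReLU())
        self.head = nn.Linear(128, 1)   # predicts one property value

    def forward(self, x):
        return self.head(self.trunk(x))

model = PropertyNet()
# ... pretrain on the large dataset here ...

# Transfer: freeze the trunk, swap in a fresh head, fine-tune on melting points only
for param in model.trunk.parameters():
    param.requires_grad = False
model.head = nn.Linear(128, 1)
optimizer = torch.optim.Adam(model.head.parameters(), lr=1e-3)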

I’ve enjoyed my time in the NAISE program this summer, and would like to thank Dr. Santanu Chaudhuri, Dr. Prasanna Balaprakash, Xiaoli Yan, and Jennifer Dunn for all their guidance. I look forward to digging into the machine learning model for the remainder of the month, and hope that it will be a useful tool that can be expanded and explored in the future by others. 

  1. PubChem. (n.d.). PubChem. Retrieved August 11, 2021, from https://pubchem.ncbi.nlm.nih.gov/
  2. PubChem. (n.d.). D-Glucose. Retrieved August 6, 2021, from https://pubchem.ncbi.nlm.nih.gov/compound/5793
  3. Bradley, J.-C., Williams, A., & Lang, A. (2014). Jean-Claude Bradley Open Melting Point Dataset (p. 2225265 Bytes) [Data set]. figshare. https://doi.org/10.6084/M9.FIGSHARE.1031637
  4. ChemSpider | Search and share chemistry. (n.d.). Retrieved August 11, 2021, from http://www.chemspider.com/
  5. ChEMBL Database. (n.d.). Retrieved August 11, 2021, from https://www.ebi.ac.uk/chembl/
  6. Tilbec, H. (2018, April 24). Cheminformatics—ECFP & Neural Graph Fingerprint. Medium. https://medium.com/@hacertilbec/cheminformatics-ecfp-neural-graph-fingerprint-c98a98e12b04
  7. Yamada, H., Liu, C., Wu, S., Koyama, Y., Ju, S., Shiomi, J., Morikawa, J., & Yoshida, R. (2019). Predicting Materials Properties with Little Data Using Shotgun Transfer Learning. ACS Central Science, 5(10), 1717–1730. https://doi.org/10.1021/acscentsci.9b00804
  8. Goh, G. B., Siegel, C., Vishnu, A., & Hodas, N. O. (2017). ChemNet: A Transferable and Generalizable Deep Neural Network for Small-Molecule Property Prediction. https://www.arxiv-vanity.com/papers/1712.02734/
  9. Hu, W., Liu, B., Gomes, J., Zitnik, M., Liang, P., Pande, V., & Leskovec, J. (2020). Strategies for Pre-training Graph Neural Networks. ArXiv:1905.12265 [Cs, Stat]. http://arxiv.org/abs/1905.12265

Automating Active Learning Workflows for Computational Design of Li-ion Cathodes

Hi! I’m Alex Tai and I’m a rising Junior at Northwestern studying Materials Science and Engineering with a minor in Computer Science. This summer at Argonne, I’ve been working with Dr. Noah Paulson and Dr. Joshua Gabriel in the Applied Materials Division on the simulation of solid state cathode materials using a range of computational techniques.  

Background 

Koerver, R., et al., Energy Environ. Sci. 2018, 11 (8), 2142–2158

The charge/discharge cycle of a solid state battery involves lithium entering and leaving the crystal structure of the cathode material. The removal of lithium from the structure is called delithiation. As cathode materials are delithiated, they undergo a volume change and concomitant structural collapse. This destabilizes the interface between the cathode and the electrolyte, compromising the performance of the battery. The search for a cathode material that maintains stability under delithiation spans a vast composition and configuration space. The problem is intractable with a purely experimental approach, and even computational simulations can become impractically time intensive. Thus, a machine learning workflow is employed that attempts to minimize computational cost while still accurately calculating materials properties.  

The Workflow 

There are two main materials simulation methods involved. One is more accurate but more expensive, while the other is less expensive but less accurate. The specifics can vary, but in our case the more expensive method is density functional theory (DFT) and the less expensive method is machine learning force fields (MLFF).  

DFT uses functionals of the electron density to find an approximate solution of Schrödinger’s equation. Depending on the level of theory and the complexity of the system, calculations can sometimes take multiple weeks to converge. MLFFs are trained on the DFT data and use neural networks to represent a potential energy surface (PES). These neural network potentials (NNPs) are orders of magnitude faster than DFT and excel at interpolating between points of the PES that the MLFF was initially trained on. The goal is to use MLFFs to explore more configurations and compositions, with quantified uncertainties on their predictions, in less time than DFT would take. The active learning workflow incrementally improves the accuracy of the MLFF whenever the uncertainty of its prediction is too high.

The determination of whether DFT calculations are necessary is informed by uncertainty quantification (UQ) calculations. When the MLFF predictions have high uncertainty for a structure, that structure needs to be evaluated with the more physically rigorous first principles DFT approach. The workflow is shown in the figure below.  

Every place you see an arrow in the diagram, a human researcher has to log onto the supercomputer (these computations are too intensive for personal devices), perhaps look at the output, move some files around, and submit the next calculation in the workflow. That might not sound particularly hard, but it becomes time consuming (not to mention very annoying) when several of these workflows are running at once. Another consideration is that a human can’t always move the calculation on to the next step as soon as it finishes: we might be off work, or busy with something else, and simply forget for a few days. Implementing a program to automate the “babysitting” of the workflow would save researchers time and effort and increase the efficiency of exploring the space.

Tools for automation 

Colmena is an open-source Python library built for automating simulation workflows on high performance computers. The main components of a Colmena application are the Thinker and the Doer. The Thinker does what the researcher would typically do: submit calculations, read results, and even make decisions on what calculations to perform next based on those results. The Doer is made up of the functions that actually run the calculations. Colmena also allows us to manage the computational resources used by different functions. The challenge is to structure the Colmena app so all of the parts of the workflow communicate properly. Adaptations to the already existing pieces are necessary for compatibility. 

https://github.com/exalearn/colmena

The Doers in our application include a molecular dynamics simulation to generate structures, a function to evaluate the uncertainty of a structure, a function to evaluate a structure with DFT, and a function to retrain NNPs. The Thinker should read the result of the uncertainty evaluation to decide whether a structure needs to be evaluated with DFT and added to the NN training set. It automatically creates files and submits calculations accordingly.  
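
The decision logic the Thinker implements can be summarized with the plain-Python sketch below; this is not the actual Colmena API, and the function names and threshold are placeholders.

UNCERTAINTY_THRESHOLD = 0.05   # placeholder value; the real cutoff is still being decided

def thinker_step(structure, run_md, evaluate_uncertainty, run_dft, retrain_nnp, training_set):
    candidate = run_md(structure)                  # Doer: generate a new structure
    uncertainty = evaluate_uncertainty(candidate)  # Doer: ensemble disagreement
    if uncertainty > UNCERTAINTY_THRESHOLD:
        ground_truth = run_dft(candidate)          # Doer: expensive first-principles label
        training_set.append((candidate, ground_truth))
        retrain_nnp(training_set)                  # Doer: improve the MLFF
    return candidate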

How is uncertainty quantified? 

The key to the efficiency of the workflow is uncertainty quantification. In our model, we have several NNPs that form an ensemble of predictors. These potentials differ from each other because they are trained on different subsets of the data. A structure is fed into all of the potentials, and each potential returns a prediction of the energy and forces. We use these predictions to construct a 95% confidence interval with Student’s t-distribution, and the width of that confidence interval is taken as the uncertainty. If the uncertainty is high, the models disagree, so at least one of them must be wrong, and we should find the ground truth with DFT. The threshold for what counts as “high uncertainty” is something we are still considering, but it should become clearer once more structures are evaluated.
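
A minimal sketch of that calculation, assuming we already have the ensemble’s energy predictions for one structure (the values below are made up):

import numpy as np
from scipy import stats

def ensemble_uncertainty(predictions, confidence=0.95):
    """Width of the Student's t confidence interval over an ensemble of predictions."""
    predictions = np.asarray(predictions)
    n = len(predictions)
    sem = predictions.std(ddof=1) / np.sqrt(n)           # standard error of the mean
    t_crit = stats.t.ppf(0.5 + confidence / 2, df=n - 1)
    return 2 * t_crit * sem                              # full interval width

# Five potentials disagreeing slightly on the energy of one structure
print(ensemble_uncertainty([-3.41, -3.39, -3.45, -3.40, -3.43]))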

Training neural networks 

I am also exploring the training step of the neural networks. NNs are trained by defining a loss function, which serves as a metric for how well the model predicts a dataset. The loss function defined in the DeePMD-kit software package that we use to train the network has the form: 
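
Reconstructed here in simplified form (the published loss also includes a virial term, which we do not tune), the energy-and-force part of the loss is roughly

L = p_E \, (\Delta E)^2 + \frac{p_F}{3N} \sum_{i=1}^{N} |\Delta F_i|^2

with N the number of atoms in the configuration and the Δ terms the errors between predictions and the reference data.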

Han Wang, Linfeng Zhang, Jiequn Han, and Weinan E. “DeePMD-kit: A deep learning package for many-body potential energy representation and molecular dynamics.” Computer Physics Communications 228 (2018): 178-184 

E denotes energy and F forces, and the p values are tunable parameters, also called prefactors. The total loss is essentially a weighted sum of the distance from the model’s predictions for energy and forces to the ground truth. Depending on how the parameters are set, the model will have different priorities. The parameters are tunable and we are exploring the best combination of energy and force prefactors.  

It is common in machine learning to plot the loss and error over the course of training. While the DeePMD-kit produces a file with evaluations of loss and error on the training and test datasets at regular intervals during training, it does not produce the actual plots. I wrote a Python script to parse that output file and generate plots from it. An example is shown on the left. On the right is the correlation between the model predictions and the ground truth for forces on some testing configurations. The predictions are accurate despite the model never having “seen” the testing data. Thus, it can be generalized to predict the properties of configurations within the space spanned by the dataset.
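
A condensed version of that parsing script is sketched below; the column names are my assumption about the lcurve.out header and should be checked against the file DeePMD-kit actually writes.

import numpy as np
import matplotlib.pyplot as plt

# lcurve.out is whitespace-delimited with a commented header line
data = np.genfromtxt("lcurve.out", names=True)

steps = data["step"]
for column, label in [("rmse_e_trn", "energy RMSE (train)"),
                      ("rmse_e_val", "energy RMSE (test)"),
                      ("rmse_f_trn", "force RMSE (train)"),
                      ("rmse_f_val", "force RMSE (test)")]:
    plt.plot(steps, data[column], label=label)

plt.xlabel("training step")
plt.ylabel("RMSE")
plt.yscale("log")
plt.legend()
plt.savefig("training_curves.png", dpi=200)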

Next Steps 

We will continue to work on writing and testing the Colmena application. I am also now working on exploring the composition space (the automated workflow, as it stands, only explores configuration space) by substituting Co, Mn, and other dopants for Ni in the structure using the pymatgen Python library.
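
As a sketch of what that substitution step might look like with pymatgen (the input file and doping fraction are placeholders):

from pymatgen.core import Structure

structure = Structure.from_file("LiNiO2.cif")   # hypothetical starting structure

# Replace a quarter of the Ni occupancy with Co to sample a new composition
structure.replace_species({"Ni": {"Ni": 0.75, "Co": 0.25}})
print(structure.composition)
structure.to(filename="LiNi0.75Co0.25O2.cif")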

Adding Energy-Harvesting Sensors to the Waggle Platform with LoRa Technology

Introduction

Hey everyone! My name is Lance Go, and I am a rising junior at Northwestern University studying computer engineering and economics. Over these past few weeks, I have been working with the Sage/Waggle group at Argonne National Laboratory (ANL) and Northwestern University with guidance from Dr. Rajesh Sankaran and Dr. Jennifer Dunn. Dr. Josiah Hester and Dr. Branden Ghena from Northwestern University provided additional guidance and support for my research and development work. My project revolves around developing a framework to integrate energy-harvesting sensors into the existing Waggle architecture.

What is Sage/Waggle and what is energy-harvesting?

The Waggle project is a multi-year initiative at ANL to bring edge computing to scientific sensing. Sage, a cyberinfrastructure built on Waggle, is an NSF-funded project to bring Waggle infrastructure to various urban and rural scientific research activities. Waggle is a network of devices known as nodes, each loaded with many environmental sensors and a computer. This allows data to be collected and computed on at the source, in a process known as “edge computing.” The edge-computing capabilities of the nodes allow them to excel in a wide variety of settings and environments; however, the platform can be energy- and bandwidth-intensive. Scenarios where energy or space is limited call for smaller energy-harvesting sensors.

An energy-harvesting device is any device that gets its power from naturally occurring phenomena or sources in its environment (i.e., not from a battery). This can be solar power, thermal energy, kinetic energy, or piezoelectric energy, among others. Since these devices require no physical connection to power and are typically very small, they fill the gaps in the current Waggle platform quite nicely.

What is LoRa and how can we use it?

The main approach to combining energy-harvesting devices and Waggle nodes is to place them in close vicinity of the node and have them wirelessly communicate the data they collect to the node. With this approach, we can have a robust system that collects and processes a large volume of data without needing many high-energy nodes, with the Waggle node also acting as a data aggregator. However, this poses a significant question: how can these sensors communicate with the main Waggle node? The answer is LoRa.

LoRa is a low-power long-range radio modulation technique. Compared to more familiar network protocols like Wi-Fi or Bluetooth, LoRa offers very low bandwidth, which means it cannot quickly transmit a lot of data. However, what LoRa lacks in bandwidth, it makes up for in excellent range and reduced energy consumption; this makes it the ideal communication protocol for energy-harvesting sensors.

Implementing LoRa on the Waggle platform

A simple LoRa network contains two fundamental parts: the end nodes and the gateway. The end nodes are the devices in the network that collect and send data. The gateways collect the data sent by the end nodes and backhaul this data to be processed and archived. In the Waggle platform, the energy-harvesting devices are the end nodes and the Waggle node itself is the gateway. Besides the end nodes and the gateways, there are two popular approaches to implementing LoRa: public LoRaWAN and point-to-point communication. Both of these approaches have upsides and downsides.

Public LoRaWAN is the standard network protocol built on top of LoRa. Typically used in IoT applications, LoRaWAN allows gateways to backhaul data received from end nodes to a server managed by a private company. From there, the data can be managed and processed by an adjacent public application server (see Figure 1). Public LoRaWAN networks are strictly governed by rules laid out by the LoRa Alliance and require every gateway and end node to be registered on the network. This is ideal for IoT since it allows anyone to use any gateway registered on the network for their own IoT projects. Additionally, LoRaWAN has many established features that aid in managing data, such as encryption and multi-channel gateway support. Despite these conveniences, the dependence on private entities for data backhaul, along with the lack of consistent public LoRaWAN availability in all areas where Sage/Waggle is deployed, makes it less suitable for our application. A network like this also does not take advantage of the Waggle node’s existing hardware, nor does it integrate easily with the Sage CI architecture.

Figure 1: Typical private LoRaWAN network architecture

Unlike public LoRaWAN, point-to-point (P2P) communication allows us to build our architecture entirely on the Waggle platform. P2P does not need any type of server and directly sends data received by the gateway to the Waggle node to be processed. This is the simplest approach but is held back by its limited scalability. For a few sensors, a single-channel gateway can manage most of the work, but it starts to falter when there are upwards of fifteen to twenty sensors, in part due to the lack of multi-channel gateway support in P2P connections. Furthermore, any features natively included in the LoRaWAN architecture would need to be developed on our end.

Figure 2: Point-to-point style network

My proposed method is a private LoRaWAN server, which lies somewhere between the two extremes. Using ChirpStack, an open-source toolkit for building private LoRaWAN servers, we can retrofit the onboard computing of the Waggle node into a network and application server. This cuts out the extra step of using a public LoRaWAN server managed by another company, so we can take advantage of all of the LoRaWAN features while still using our own architecture within the Waggle/Sage platform. The hardware setup of this network is also only slightly more complicated than P2P, requiring only that the single-channel LoRa gateway be swapped for a multi-channel gateway. Figure 3 shows a typical private LoRaWAN server setup with two gateways targeting end nodes at different ranges. Increasing the number of gateways on a network is a simple way to increase the number of end nodes that can be supported.
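
To show how this fits together on the software side, here is a sketch of how the Waggle node could consume uplinks from a local ChirpStack server over MQTT using the paho-mqtt library; the broker address and topic pattern are assumptions based on ChirpStack’s default uplink topics.

import json
import paho.mqtt.client as mqtt

def on_message(client, userdata, msg):
    uplink = json.loads(msg.payload)               # decoded uplink event from ChirpStack
    print(msg.topic, uplink.get("deviceName"), uplink.get("object"))

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)                    # MQTT broker running on the Waggle node
client.subscribe("application/+/device/+/event/up")  # all device uplink events
client.loop_forever()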

Figure 3: Private LoRaWAN network with two gateways

Next Steps

As my internship winds down, my main goal is to gain a better understanding of using private LoRaWAN servers. Currently, I only have single-channel gateways in my possession and have only physically implemented a P2P-type network. To explore private LoRaWAN gateways further, two multi-channel gateways are being shipped to me, one of which (Discovery Kit 2) has a Raspberry Pi 4 built in. With these two gateways, I want to test two approaches to implementing a private LoRaWAN server. The first is the traditional approach, where both gateways are connected to the Waggle node and the computer onboard the Waggle node hosts the network and application server. The second involves backhauling all data from both gateways onto the Raspberry Pi on the Discovery Kit; the Raspberry Pi will host the servers and then forward the data to the Waggle node to be processed.

Designing a UI for a Heat Table that Gathers Data from Sage Sensors

Hello! My name is Alanda Zong, and I am a rising sophomore at Northwestern University majoring in computer science. This summer, I have been working with Sage under my mentor, Neal Conrad, to create a web visualization tool using D3.js that makes the data easy to understand not only for the scientists at Sage, but for anyone who looks at the Sage website.

What is D3.js??

I didn’t know what D3.js was at first, so I feel that I should explain it for those who don’t know. D3 stands for Data-Driven Documents. It is a JavaScript library that enables the user to create custom, interactive data visualizations in the web browser using HTML and CSS. So, in layman’s terms, D3.js allows the user to create graphs and charts very easily.

How will the Heat Table work?

The Heat Table will take the data gathered by the sensors and organize it into a JSON format. I am using Sean Shahkarami’s code to aggregate the data from the sensors. The table will also show whether data is available or not. By using D3.js to create the heat table, I can easily organize the table by plugins/nodes. There will also be an option to see whether there is data per day, per week, or per month.
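
As an illustration of that aggregation step (this is not Sean Shahkarami’s actual code; the node and plugin names are made up), a small Python snippet could count measurements per node, plugin, and day and dump the result as JSON for D3 to read:

import json
import pandas as pd

records = pd.DataFrame({
    "timestamp": pd.to_datetime(["2021-08-01 03:00", "2021-08-01 14:00", "2021-08-02 10:00"]),
    "node": ["W023", "W023", "W026"],
    "plugin": ["iio-rain", "iio-rain", "image-sampler"],
})

daily = (records
         .groupby(["node", "plugin", pd.Grouper(key="timestamp", freq="D")])
         .size()
         .reset_index(name="count"))
daily["timestamp"] = daily["timestamp"].dt.strftime("%Y-%m-%d")

print(json.dumps(daily.to_dict(orient="records"), indent=2))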

The Heat Table includes a hover feature, so when the user hovers over a certain rectangle, it shows that data point’s value. It also has a drop-down option that allows the user to switch the view between daily, weekly, and monthly data. Here is what the UI for the table looks like:


Expanding the Processing Parameters in Capacitive Deionization using LabVIEW

Hey everyone, my name is Steven Ma, and I am a rising junior in Materials Science and Engineering at Northwestern University. This summer I have been working in Dr. Lauren Valentino’s lab with the Bioprocessing and Reactive Separations group in the Advanced Materials Division, with guidance from Dr. Jennifer Dunn and Brittani Williams, to investigate and develop a LabVIEW program that will allow us to study the effects of flow rate on separation processes using capacitive deionization (CDI).

Capacitive Deionization (CDI)

Capacitive deionization (CDI) has been successfully integrated into various industrial water treatment processes and has shown selectivity for organic acids using redox-active electrodes. Dr. Valentino’s lab is working on developing a scalable purification system that uses CDI with activated carbon electrodes to recover sodium butyrate (which can later be converted to biofuel) from fermentation broths while minimizing the energy consumption and cost of the process (a simple schematic of CDI is shown in Figure 1 below). A variety of parameters are being investigated in the separation process, including voltage, cycle time, and conductivity.

Figure 1: Simple CDI process

LabVIEW software is being used to control the operating parameters of the separation processes. While we currently have a LabVIEW program that can control the potentiostat and read the conductivity probe, the flow rate is held constant in the experimental setup. The literature indicates that flow rate is a parameter that affects the adsorption rate in CDI processes.

Furthermore, the flow rates must be modified manually, which is tedious and time-consuming and reduces the efficacy of data collection and the scalability potential of the current system. In my project, I am expanding the current LabVIEW program to incorporate an automated pump that can be used to control the flow rate in the system. I am also developing Python code that calculates the energy consumption of these processes.

During the first few weeks, I learned LabVIEW and developed basic LabVIEW code to control the pumps remotely over a computer connection. LabVIEW consists of two interfaces: a front panel and a block diagram. In the code below, we can use the toggles and input parameters to set the flow rates of the different processes, as well as how long each flow rate lasts. The block diagram shows the actual code controlling the pump. The three blocks below handle initiation, pump control, and termination.

Figure 2: LabVIEW interface and LabVIEW code

The pump was delivered in week 4. The setup for my experiments is shown below [Figure 3]; it features a flow-rate-controllable peristaltic pump that transfers water between two beakers.

Figure 3: Pump Set up

Thus far, the pump control has worked perfectly, and I am excited to see this code incorporated into the LabVIEW program currently used in the CDI experiments. Over the remaining weeks of my internship, I will be troubleshooting with the team at Argonne, who will test this code, and writing Python code to wrangle and analyze the energy consumption from the output files of the current LabVIEW programs.
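
A rough sketch of that energy calculation in Python is below; the file and column names are assumptions, since the exact format of the LabVIEW output files still needs to be confirmed.

import numpy as np
import pandas as pd

df = pd.read_csv("cdi_run_01.csv")           # assumed columns: time_s, voltage_V, current_A

power_W = df["voltage_V"] * df["current_A"]  # instantaneous power
energy_J = np.trapz(power_W, df["time_s"])   # integrate power over time

print(f"Energy consumed: {energy_J:.1f} J ({energy_J / 3600:.4f} Wh)")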

I would like to thank everyone who has made this remote internship possible, Dr. Valentino for assigning me such an interesting project, and special thanks to Professor Richards for allowing us to use his lab space at Northwestern. I hope this work will make an impact in this field!


Advanced Photon Source Upgrade (APS-U) Project

Introduction 

Hi! My name is Charles Cheng, and I am a rising sophomore studying Mechanical Engineering at Northwestern University. This summer (2021) I held an internship at Argonne National Laboratory, working on the Advanced Photon Source Upgrade (APS-U) Project with my mentor, John Quintana. The entirety of the internship was conducted remotely due to Argonne’s COVID-19 protocols. For the most part, working remotely did not impede my ability to work but introduced the anticipated challenges of communicating through email and video conferences. 

First, I will give some background on the project and its importance. The Advanced Photon Source (APS) is a national research facility that produces ultra-bright, high-energy x-ray beams. Research teams across the globe use the APS to conduct research across numerous scientific disciplines. The APS-U Project aims to replace the current storage ring with 3,000 tons of new accelerator components. The project requires a variety of documentation in order to swiftly assemble the new storage ring; it is in the interest of researchers to minimize the amount of time the facility is out of service during construction. Engineers will rely on this documentation to ensure the inventory is accurate and accessible.

My objective this summer is to create deliverables for the APS-U project, provide logistical support as needed, and review drawings and engineering models to ensure the fidelity of APS’s databases for future operations. Over the past few weeks, I have met with a number of engineers and logistics professionals, devising solutions to problems ranging from data inaccessibility to computational algorithms. In the following sections, I will briefly discuss my work creating those solutions and detail what I was able to accomplish.

Keeping Track of (Literally) Tons of Inventory 

One of the issues that arises when upgrading a facility with hundreds of magnet modules and countless other components is the disorganization of data. What kind of data? Documentation, drawings, models, images, and reports for components in the old storage ring all need to be accounted for in a way that is not only easy to access but also standardized across all of APS’s 35 bending magnets and insertion devices. Before I began working on this issue, the pertinent data about the inventory was stored in one of Argonne’s databases, called the Component Database, or CDB for short. For beamline technicians and engineers, accessing this data from the CDB was inefficient, so a better way to represent the inventory was needed.

Figure 1. Spreadsheet Categorically Listing Components from Bending Magnet 1

The solution I worked on extracted the data from the CDB and organized it onto a spreadsheet. I established a standardized hierarchy for the components’ sequence numbers to make it easier to locate each component and all the data attached to them. An example of the spreadsheets for one of the bending magnets is shown in Figure 1. The difficulty of this task lay in the number of components and modules I had to process, transferring hundreds of lines of data from the CDB to Excel as quickly as possible without making errors. Although tedious at times, I managed to successfully work my way through this within the first few weeks. 

More Inaccessible Data: The Downside of Reusing Old Components 

Next, I moved on to a similar problem, one that, again, dealt with data manipulation. A few APS engineers were planning to “harvest,” or reuse, some of the components from the bending magnet front ends. The data was stored in layout drawings on the Integrated Content Management System (ICMS), another of Argonne’s centralized databases for managing documents. The layout drawings contained links to web pages with valuable information, such as the alignment reports, which technicians would need to access when they began the upgrade. We needed a way to list the harvested components with their appropriate links and documentation.

Figure 2. Bending Magnet Front Ends Spreadsheet

Similar to the solution devised for the previous problem, a spreadsheet was created to organize the data into clear categories. For each bending magnet, I listed all the links from its layout drawing and provided descriptions of every component, including their location relative to the centerline. After this was completed, a QR code was assigned to each component. So now, instead of opening the layout drawing, searching for the component, and clicking on the associated link, the technicians just have to scan a QR code attached to each component and will be presented with the component’s data without unnecessary hassle. An example of the spreadsheets for one of the bending magnets is shown in Figure 2.

Safety First! Creating Models for Simulating Beam Strikes 

Albeit essential to the APS-U project, data management is not the most exciting work. Thankfully, I was soon introduced to another problem, this one from a physicist tasked with evaluating the safety of researchers under the new storage ring design. To give some background, electrons travel through a series of bending magnets around the APS storage ring. In certain parts of the storage ring, magnets cause the electrons to change paths and emit x-rays, which are used to conduct research through various methodologies. The electrons are supposed to continue traveling down the storage ring, while the x-rays diverge into a separate pipe where the researchers conduct their experiments. The x-ray pipe is designed to shield the researchers from x-rays but not from electrons traveling through the storage ring. The physicist’s objective was to determine whether shielding from electrons is required, by simulating beam strikes and observing whether electrons could somehow travel down the x-ray pipe.

Figure 3. OpenSCAD Model of a Section of the Storage Ring 

The issue here was that the models needed to simulate beam strikes didn’t exist. However, a model of the storage ring in stereolithography (STL) format was available to me. I attempted to implement an algorithm to incrementally slice the STL model and output closed-contour polylines. However, I ran into a problem: the provided model contained improper intersections and malformed triangles. The algorithm was able to perform the slicing but could not form the polylines correctly, so I set out to try another approach. Using the plots of the horizontal and vertical slices generated from the code we had written, I constructed a model of the storage ring in OpenSCAD. An image of the work-in-progress model is shown in Figure 3. As of now, this approach is working, but a few challenges with extruding sections of the vacuum chamber along a path are still being worked out. Once I have completed the model, the plan is to run it through the slicing algorithm and verify that it closely matches the model I was provided.
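
For context, a slicing pass of the kind described above can be sketched with the trimesh library (this is not the exact script we used; the file name and slice spacing are placeholders):

import numpy as np
import trimesh

mesh = trimesh.load("storage_ring_section.stl")

# Step a horizontal cutting plane through the model and collect the cross-sections
z_values = np.arange(mesh.bounds[0][2], mesh.bounds[1][2], 5.0)
for z in z_values:
    section = mesh.section(plane_origin=[0, 0, z], plane_normal=[0, 0, 1])
    if section is None:
        continue                        # the plane missed the geometry
    planar, _ = section.to_planar()     # project the 3D cut into 2D polylines
    print(z, len(planar.polygons_full), "closed contours")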

A Conventional Solution to Crooked Cabinets and Magnets in Disarray?

While I was working on modeling the storage ring, I was given yet another opportunity to work on an interesting problem: aligning magnets. When the magnet modules are assembled, very small errors in positioning cause the magnet centers to be out of line. To fix this, the magnets are shimmed, much like cabinets are shimmed when the ground is not level. The catch is that the errors are on the order of microns, and the engineers only have shims of certain sizes, such as 25 microns. The engineers would like to calculate the shim sizes needed to best align the magnet centers given their limited size options. In other words, they want to shim the magnets along one axis so that the resulting set of points best fits a trendline.

Figure 4. Raw and Shimmed X Error vs. Z for DLMA-1010 Module

The most straightforward solution I thought of was writing an algorithm that tests all possible permutations of shim sizes for the magnet centers in a module and chooses the permutation with the smallest standard error. I performed this calculation for the DLMA-1010 module, but even using a multiprocessing package, the runtime was over a couple of hours. I spent a few days optimizing the code and decreased my search range by making an initial guess: I offset the magnet positions using the given shim sizes until they were just below the largest displacement, and then examined only the permutations within a range of that offset. This proved successful, and I was able to compute results in a matter of seconds. A graph of the original x-axis errors and the shimmed errors versus the magnets’ positions along the z-axis is shown in Figure 4.
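
The core of the search can be sketched as follows; the data here are made up, and a single 25-micron shim size is assumed for simplicity.

import numpy as np
from itertools import product

SHIM = 25.0  # shim thickness in microns

def best_shims(z, x_err, window=2):
    """Pick an integer shim count per magnet so the shimmed errors best fit a line."""
    # Initial guess: shim each magnet up to just below the largest error
    guess = np.floor((x_err.max() - x_err) / SHIM).astype(int)
    ranges = [range(max(g - window, 0), g + window + 1) for g in guess]
    best_se, best_counts = np.inf, None
    for counts in product(*ranges):          # only combinations near the guess
        shimmed = x_err + SHIM * np.array(counts)
        slope, intercept = np.polyfit(z, shimmed, 1)
        residuals = shimmed - (slope * z + intercept)
        se = residuals.std(ddof=2)
        if se < best_se:
            best_se, best_counts = se, counts
    return best_counts, best_se

z = np.linspace(0.0, 2.0, 8)                                 # magnet positions (m)
x_err = np.random.default_rng(0).normal(0.0, 40.0, size=8)   # fake alignment errors (microns)
print(best_shims(z, x_err))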

Next Steps 

Last week, I held a meeting with the clients of the spreadsheet of harvested components for the bending magnet front ends. Based on their feedback, I need to extract the data for another bending magnet whose layout drawing was not uploaded to the ICMS; otherwise, I was told that everything looked good. As for the OpenSCAD models, I am still working on extruding the complex sections of the storage ring but anticipate achieving results soon, which I can verify using the slicer algorithm. Lastly, for the magnet alignment problem, the clients have informed me that they would like the algorithm to account for the total magnitude of the shims being used as well as the standard error. They would also like the process of extracting the coordinate points for the magnet centers to be automated.

With great enthusiasm, I believe I will be able to accomplish the tasks illustrated above within the last weeks of my internship at Argonne. So far, I have deeply enjoyed the work I have performed whether it be managing data or writing code. From this experience, I have learned a great deal from John and would like to take the time to express my gratitude for his mentorship throughout this internship. I would also like to give my thanks to the engineers and professionals I have had the pleasure to work with.