Knowledge Graphs for a Circular Economy

Hello! My name is Lillian Jirousek, and I am a rising junior at Northwestern majoring in Math, Physics, and Integrated Science. This summer I have been working with Dr. Chaudhuri to find a pathway toward a circular economy. I am collecting data on the production and uses of cellulose in order to model its life cycle in the creation of bioplastics, along with the various factors (temperature, microbes, usage, etc.) that may alter that life cycle over time.

Goals of the Circular Economy

What most of us are used to today is known as a linear economy. Materials are gathered in one location, taken to another to be manufactured, and then taken somewhere else to be distributed and eventually disposed of in a landfill. This both creates tons of material waste and adds to emissions of CO2 and other greenhouse gases. Creating items doomed to become waste in landfills or the ocean, as is currently practiced across the world, is a high-carbon-footprint approach. A far more desirable alternative is a circular economy. This typically involves people using locally sourced materials, which they subsequently reuse and repurpose, thus greatly reducing both material waste and CO2 emissions.

Reaching this circular economy requires many different products, such as feedstocks and metals, but we will focus on bioplastics. These bioplastics can come from renewable sources of biomass and may eventually degrade and return nutrients to the soil. However, biodegradation is not guaranteed, as many of the monomers used come from petroleum-based sources. One of the key ingredients in many bioplastics is cellulose, which is the focus of my project.

Use of Graph Theory

Describing something as complicated as the life cycle of a bioplastic, from the making of cellulose in a plant to its degradation via composting much later, is a daunting task. Because of this, we use something known as a knowledge graph. A knowledge graph consists of nodes, which represent different entities, connected by edges, which describe the relationships between those entities. For example, a group of nodes could represent genes in a plant, with edges connecting genes that appear to be coexpressed and are thus likely involved in the same steps of cellulose production.
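To make the idea of nodes and edges concrete, here is a minimal sketch of a knowledge graph stored as (subject, relation, object) triples, with a breadth-first search that follows edges from a gene to a decomposing microbe. Everything in it is an assumption for illustration: the entity names (gene_A, bioplastic_X, microbe_Y), the relation labels, and the helper functions build_graph and find_path are hypothetical and are not taken from the project's actual data or code.

```python
from collections import defaultdict, deque

# Hypothetical (subject, relation, object) triples. Every name here is a
# placeholder for illustration, not data from the project.
TRIPLES = [
    ("gene_A", "coexpressed_with", "gene_B"),
    ("gene_A", "involved_in", "cellulose_synthesis"),
    ("gene_B", "involved_in", "cellulose_synthesis"),
    ("cellulose_synthesis", "produces", "cellulose"),
    ("cellulose", "ingredient_of", "bioplastic_X"),
    ("bioplastic_X", "degraded_by", "microbe_Y"),
]


def build_graph(triples):
    """Index directed edges by source node: node -> [(relation, target), ...]."""
    graph = defaultdict(list)
    for subject, relation, obj in triples:
        graph[subject].append((relation, obj))
    return graph


def find_path(graph, start, goal):
    """Breadth-first search; returns the shortest node path or None."""
    parents = {start: None}  # also serves as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk back through the parent links to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for _relation, neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None


if __name__ == "__main__":
    graph = build_graph(TRIPLES)
    print(" -> ".join(find_path(graph, "gene_A", "microbe_Y")))
```

Storing relationships as triples keeps the graph easy to extend: adding a new database or paper only means appending more triples, without changing the search code.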

Components of the Knowledge Graph

We have divided the project into a few key portions of the cellulose/bioplastic life cycle that will go into our knowledge graph. These include: the genes that are commonly expressed to produce cellulose in a plant, the process of making cellulose, differences in cellulose production among various species, the structure of cellulose and cellulose nanocrystals, the bioplastics that can be made and their properties, the process of composting, and the microbes that control decomposition. Much of this information is scattered across various databases and papers, or is not yet known. Because a circular economy generally operates on a more local scale, one needs to know specifics about how the locally grown plants produce cellulose, the soil composition in the area, and the properties of the bioplastics that could be made. All these variables could drastically change the cellulose life cycle in a given area. Thus, we may not actually develop a method to bring about a circular economy, but we can show how we would approach it and what information we would need in the future.

With the information that we have, we may create a graph neural network to describe the entire problem. To place seemingly unrelated things (e.g. genes that are expressed to make cellulose, the uses of materials such as parts of appliances or plastic bags, and the microbes that decompose bioplastics) in a common framework, we will use an embedding matrix that gives every item a distributed representation, so that each item is identified by a distinct set of properties. We will then use machine learning methods to determine one of the best possible paths to circularity.
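As a rough illustration of what an embedding matrix does, the sketch below assigns each entity a vector of the same length, compares two entities with cosine similarity, and applies one round of neighbor averaging. It rests on several assumptions: the entity names, the dimension of 4, and the functions init_embeddings, cosine, and aggregate are all hypothetical; the vectors are random rather than learned; and the averaging step is only a simplified stand-in for a graph neural network layer, which would normally also apply trained weights.

```python
import math
import random

# Hypothetical entities of very different kinds (placeholder names).
ENTITIES = ["gene_A", "cellulose", "plastic_bag", "microbe_Y"]
DIM = 4  # assumed embedding size, chosen only for illustration


def init_embeddings(entities, dim, seed=0):
    """One row per entity. Values are random here; in practice they are learned."""
    rng = random.Random(seed)
    return {name: [rng.gauss(0.0, 1.0) for _ in range(dim)] for name in entities}


def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def aggregate(embeddings, edges):
    """One round of mean aggregation over undirected edges.

    Each node's new vector is the average of its own vector and those of
    its neighbors. No learned weights are applied (a simplification).
    """
    groups = {name: [name] for name in embeddings}
    for a, b in edges:
        groups[a].append(b)
        groups[b].append(a)
    dim = len(next(iter(embeddings.values())))
    return {
        name: [sum(embeddings[n][i] for n in group) / len(group) for i in range(dim)]
        for name, group in groups.items()
    }


if __name__ == "__main__":
    emb = init_embeddings(ENTITIES, DIM)
    updated = aggregate(emb, [("gene_A", "cellulose"), ("cellulose", "plastic_bag")])
    print(cosine(updated["gene_A"], updated["plastic_bag"]))
```

Because every entity ends up as a vector in the same space, a gene, a product, and a microbe can be compared directly, which is what allows downstream machine learning methods to treat them uniformly.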


The linear economy that we are all used to is simply not tenable in the long term. The abundance of potential bioplastics makes it inexcusable for us not to reach toward a more sustainable future. We thus aim to propose a framework for developing a circular economy, provided that enough information becomes available in the future.