Octavio Gonzalez-Lugo

Summary

This post describes how to create a data visualization of a protein with Blender, focusing on the principal Covid-19 protease (PDB identifier 6LU7).

Abstract

The post explains how to create a data visualization for the principal Covid-19 protease using Blender. It begins by discussing the virus's infection process and the role of proteases in making new copies of the virus. The visualization is composed of individual elements, each with its own central location used as a local coordinate system. The process involves relocating the camera, changing the horizon color, and switching to the Cycles engine. The protein is represented by its amino acid sequence, which is added as a text block in Blender. The sequence is then represented as a graph, with nodes for each amino acid and edges for the bonds between them. The frequency of each amino acid in the sequence is measured to create a compact representation of the protein. A bar plot is added to show the frequencies of each amino acid. The structure of the protein, determined through crystallization, is represented as a ribbon in Blender. Finally, materials are added to each element in the scene. The visualization aims to represent different aspects of the protein, including its sequence, structure, and amino acid frequencies, to help identify meaningful characteristics and patterns.

Bullet points

  • The visualization is created for the principal Covid-19 protease with the PDB identifier 6LU7.
  • The visualization is composed of individual elements, each with its own central location used as a local coordinate system.
  • The process involves relocating the camera, changing the horizon color, and switching to the Cycles engine.
  • The protein is represented by its amino acid sequence, which is added as a text block in Blender.
  • The sequence is represented as a graph, with nodes for each amino acid and edges for the bonds between them.
  • The frequency of each amino acid in the sequence is measured to create a compact representation of the protein.
  • A bar plot is added to show the frequencies of each amino acid.
  • The structure of the protein, determined through crystallization, is represented as a ribbon in Blender.
  • Materials are added to each element in the scene.
  • The visualization aims to represent different aspects of the protein to help identify meaningful characteristics and patterns.

Protein dashboard visualization with Blender

Viruses such as SARS-CoV-2, the virus that causes Covid-19, are biological entities that infect cells to reproduce themselves. Once the virus introduces its genetic material into the host, it hijacks the host machinery to make new copies of itself. The virus components are synthesized by the molecular machinery inside the cell, usually as polyproteins, and the newly synthesized polyproteins need to be processed to become functional. The virus uses two proteases to cut the polyproteins into functional pieces. Of those two, the principal protease has been crystallized, and its structure is available under the PDB identifier 6LU7.

In this post, I will show how to create a data visualization with the protease information. The visualization contains individual elements that can be arranged in different ways, and each one is independent of the others. Each element needs a central location that serves as its local coordinate system. First, we relocate the camera, change the horizon color, and switch the render engine to Cycles.
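The scene setup could look roughly like the sketch below. It assumes Blender 2.8 or later, where the background color lives on the world's "Background" shader node, and it assumes the scene already has a camera and a world, as the default startup scene does. The function name `configure_scene`, the camera position, and the white background are illustrative choices, not values from the original post.

```python
try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None


def configure_scene(scene, camera_location=(0.0, -20.0, 5.0),
                    horizon_rgb=(1.0, 1.0, 1.0)):
    """Move the camera, set the background color, and select Cycles.

    The default location and color are placeholder values.
    """
    scene.camera.location = camera_location
    scene.render.engine = "CYCLES"

    # Assumption: Blender 2.8+ world with a default "Background" node.
    world = scene.world
    world.use_nodes = True
    background = world.node_tree.nodes.get("Background")
    if background is not None:
        background.inputs["Color"].default_value = (*horizon_rgb, 1.0)


if bpy is not None:
    configure_scene(bpy.context.scene)
```

Passing the scene as an argument keeps the function independent of the global context, so the same call works from the scripting tab or from a background render.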

Each protein synthesized by any biological entity can be represented by its amino acid sequence. That sequence contains all the amino acids needed to synthesize the protein. We are going to add a text block that represents the protease as a sequence of characters. To do so, we insert a text element, delete its default text, and write each amino acid using the standard one-letter code.
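A minimal sketch of the text panel follows. Instead of deleting the default text in edit mode, it overwrites the `body` of the new text object, which has the same visible result. `TOY_SEQUENCE` is a made-up placeholder, not the 6LU7 sequence, and `wrap_sequence` and the line width of 30 characters are assumptions made for the example.

```python
try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None

# Placeholder string for illustration only; NOT the real 6LU7 sequence.
TOY_SEQUENCE = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"


def wrap_sequence(sequence, width=30):
    """Break a one-letter amino acid string into fixed-width lines."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines)


if bpy is not None:
    # Add a text object at the panel's central location.
    bpy.ops.object.text_add(location=(0.0, 0.0, 0.0))
    text_object = bpy.context.object
    # Replacing the body removes the default "Text" content.
    text_object.data.body = wrap_sequence(TOY_SEQUENCE)
```

Wrapping the sequence into fixed-width lines keeps the text block roughly rectangular, which makes it easier to arrange next to the other panels.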

Sequences can be represented as graphs, where each node represents an amino acid and each edge represents the bond between the nth amino acid and the (n+1)th amino acid. We then add the graph representation to the scene.
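One way to build such a graph is sketched below. It assumes one node per amino acid type (20 nodes), with a directed edge for every pair of consecutive residues, and it places the nodes on a circle. The helper names, the circular layout, and the sphere scale are illustrative assumptions. Only the nodes are drawn here; the edges are returned as data.

```python
import math

try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard one-letter codes


def sequence_edges(sequence):
    """Return one (source, target) pair per consecutive residue pair."""
    return list(zip(sequence, sequence[1:]))


def incoming_degree(edges):
    """Count how many edges arrive at each node."""
    counts = {}
    for _, target in edges:
        counts[target] = counts.get(target, 0) + 1
    return counts


def circular_layout(labels, radius=5.0):
    """Place each label at an evenly spaced (x, y) point on a circle."""
    step = 2 * math.pi / len(labels)
    return {label: (radius * math.cos(i * step), radius * math.sin(i * step))
            for i, label in enumerate(labels)}


if bpy is not None:
    # Draw one small sphere per node.
    for x, y in circular_layout(AMINO_ACIDS).values():
        bpy.ops.mesh.primitive_uv_sphere_add(location=(x, y, 0.0))
        bpy.context.object.scale = (0.3, 0.3, 0.3)
```

The incoming-edge count depends only on the sequence, not on where the nodes are placed, so it stays the same under any layout.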

A compact representation of the protein can be drawn by measuring the frequency of each amino acid in the sequence. To measure the frequencies, we split the sequence into a list of characters, then iterate through it and update a vector of frequencies as each amino acid is found.
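The counting step can be sketched in plain Python as follows. The function names and the fixed alphabet ordering are assumptions made for the example, and characters outside the 20 standard one-letter codes are simply skipped.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # standard one-letter codes


def amino_acid_counts(sequence, alphabet=AMINO_ACIDS):
    """Count each amino acid, returning a vector ordered like `alphabet`."""
    index = {letter: i for i, letter in enumerate(alphabet)}
    counts = [0] * len(alphabet)
    for residue in list(sequence):  # split into a list of characters
        if residue in index:        # skip non-standard symbols
            counts[index[residue]] += 1
    return counts


def normalize(counts):
    """Turn raw counts into relative frequencies that sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [count / total for count in counts]
```

Keeping the vector in a fixed alphabet order means that position i always refers to the same amino acid, which makes vectors from different proteins directly comparable.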

With the measured frequencies, we add a basic bar plot that shows the frequency of each amino acid.
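A bar plot can be assembled from scaled cubes, as in the sketch below. It relies on the default Blender cube spanning from -1 to 1 on each axis, so the scale is half the desired dimension. The helper `bar_geometry`, the bar width, spacing, maximum height, and the example values are all assumptions chosen for illustration.

```python
try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None


def bar_geometry(values, width=0.8, spacing=1.0, max_height=5.0,
                 min_height=0.01):
    """Return a (location, scale) pair for each bar.

    Heights are rescaled so the largest value reaches `max_height`.
    """
    peak = max(values) if values and max(values) > 0 else 1.0
    bars = []
    for i, value in enumerate(values):
        # A small floor avoids zero-height, degenerate cubes.
        height = max(max_height * value / peak, min_height)
        location = (i * spacing, 0.0, height / 2)  # bar rests on z = 0
        scale = (width / 2, width / 2, height / 2)
        bars.append((location, scale))
    return bars


if bpy is not None:
    example_frequencies = [0.1, 0.3, 0.6]  # placeholder values
    for location, scale in bar_geometry(example_frequencies):
        bpy.ops.mesh.primitive_cube_add(location=location)
        bpy.context.object.scale = scale
```

Separating the geometry calculation from the Blender calls makes it possible to check bar positions and heights without opening Blender.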

Protein crystallization, followed by X-ray diffraction, is one of the most accurate methods for determining the structure of a protein. That structure can be used to develop new drugs and to elucidate how the protein works. We can add a ribbon representation of the protein by importing an OBJ file of the structure into Blender.
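The import step might look like the sketch below. It assumes the ribbon has already been exported to an OBJ file by a molecular viewer, and the file name `ribbon.obj` is hypothetical. Blender 4.0 removed the older `import_scene.obj` operator in favor of `wm.obj_import`, so the helper (a name invented for this example) picks the operator from the running version.

```python
import os

try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None

RIBBON_PATH = "ribbon.obj"  # hypothetical path to the exported ribbon


def import_ribbon(bpy_module, filepath):
    """Import an OBJ file and return the objects it created."""
    if bpy_module.app.version >= (4, 0, 0):
        bpy_module.ops.wm.obj_import(filepath=filepath)
    else:
        # Legacy importer used by older Blender releases.
        bpy_module.ops.import_scene.obj(filepath=filepath)
    # Assumption: the importer leaves the new objects selected.
    return list(bpy_module.context.selected_objects)


if bpy is not None and os.path.exists(RIBBON_PATH):
    ribbon_objects = import_ribbon(bpy, RIBBON_PATH)
```

Returning the imported objects makes it straightforward to move the ribbon to its panel location or to assign it a material afterwards.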

Finally, we add materials to each element in the scene.
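A simple way to color every element is sketched below, assuming Blender 2.8 or later, where `diffuse_color` takes four components and new node-based materials include a "Principled BSDF" node. The helper `apply_color`, the material names, and the single blue color are illustrative assumptions. The original post may well use a different material per panel.

```python
try:
    import bpy  # only available inside Blender's bundled Python
except ImportError:
    bpy = None


def apply_color(obj, material, rgba):
    """Give `material` a flat color and attach it to `obj`."""
    material.diffuse_color = rgba  # color shown in solid viewport mode
    material.use_nodes = True
    # Assumption: the default node tree has a "Principled BSDF" node.
    shader = material.node_tree.nodes.get("Principled BSDF")
    if shader is not None:
        shader.inputs["Base Color"].default_value = rgba
    obj.data.materials.append(material)


if bpy is not None:
    for i, obj in enumerate(bpy.context.scene.objects):
        # Cameras and lights have no material slots, so skip them.
        if obj.type in {"MESH", "FONT", "CURVE"}:
            material = bpy.data.materials.new(name=f"panel_material_{i}")
            apply_color(obj, material, (0.1, 0.4, 0.8, 1.0))
```

Setting both the viewport color and the shader color keeps the preview and the Cycles render visually consistent.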

Each panel in the data visualization represents the same biological entity. However, every panel is different and tries to capture a different aspect of the same object. The sequence represents the protein as text. It might be hard to find any useful pattern in that representation, but it turns an almost invisible entity into something tangible. The graph representation tries to put some order into the messy sequence. Some nodes have a higher density of incoming edges, which gives the impression that those nodes, or amino acids, could be important, and that higher density of edges is conserved regardless of the graph layout. Representing the protein as the frequency of its amino acids removes some of the ordering in the sequence, but makes it easier to find amino acids with high and low frequency. Finally, the 3D ribbon representation shows some pockets where potential drugs could interact to suppress the action of the protein.

Representing the same object in a variety of forms is a useful exercise that can help us create new and easier-to-read visualizations. It also increases the chances of finding meaningful characteristics and patterns in the data. Now you know how to create a simple dashboard visualization with Blender. The complete code for this post can be found on my GitHub. See you in the next one.

Data Science
Data Visualization
Programming
Bioinformatics
Covid-19