Coding

Part:BBa_K3627009:Design

Designed by: Sachin Jajoo   Group: iGEM20_UIUC_Illinois   (2020-10-27)


Third light chain for SARS-CoV-2 antibody


Assembly Compatibility:
  • 10
    INCOMPATIBLE WITH RFC[10]
    Illegal PstI site found at 238
    Illegal PstI site found at 385
  • 12
    INCOMPATIBLE WITH RFC[12]
    Illegal PstI site found at 238
    Illegal PstI site found at 385
  • 21
    COMPATIBLE WITH RFC[21]
  • 23
    INCOMPATIBLE WITH RFC[23]
    Illegal PstI site found at 238
    Illegal PstI site found at 385
  • 25
    INCOMPATIBLE WITH RFC[25]
    Illegal PstI site found at 238
    Illegal PstI site found at 385
    Illegal NgoMIV site found at 18
  • 1000
    COMPATIBLE WITH RFC[1000]
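
The illegal-site reports above can be reproduced in spirit with a simple motif search. The sketch below is only an illustration: `find_sites` is a hypothetical helper, the example DNA string is made up (it is not the sequence of this part), and reporting 1-based positions is an assumption about the registry's convention. The recognition sequences for PstI (CTGCAG) and NgoMIV (GCCGGC) are palindromic, so scanning one strand is enough.

```python
# Recognition sequences of the two enzymes flagged in the report above.
SITES = {"PstI": "CTGCAG", "NgoMIV": "GCCGGC"}

def find_sites(dna, sites=SITES):
    """Return (enzyme, 1-based position) for every motif occurrence in dna."""
    dna = dna.upper()
    hits = []
    for name, motif in sites.items():
        start = dna.find(motif)
        while start != -1:
            hits.append((name, start + 1))  # assumed 1-based reporting
            start = dna.find(motif, start + 1)
    return sorted(hits, key=lambda hit: hit[1])

# Made-up sequence for illustration only.
example = "AAGCCGGCTTCTGCAGAA"
for enzyme, position in find_sites(example):
    print(f"Illegal {enzyme} site found at {position}")
```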


Design Notes

This is the most ambitious aspect of our project. We started from S309, a previously characterized neutralizing antibody that binds the SARS-CoV-2 spike protein (PDB ID 6WPT), and generated a large set of variant sequences through random point mutations.

Initially, we designed a Gaussian process regression model that we hoped would predict improved sequences from the binding energies of the antibody to various spike protein mutants. This did not work, because the model could not generate properly mutated sequences from numerical binding-energy values alone. After several discussions with experts, we turned to the genetic algorithm, a computational simulation of natural selection.

We then designed a simple genetic algorithm that applies random mutations at several positions in the antibody sequence. The algorithm was run on both the heavy and light chain sequences to generate new mutated sequences. PDB files were generated for these sequences and tested in PyRosetta for binding to the spike protein. After each model is scored in Rosetta Energy Units (REU), the dominant sequences, those that bind the spike protein, are kept in the population. A flow chart briefly describing the process is attached below. To reduce the time spent on folding, we applied the following optimizations:
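
The mutate, score, and select loop described above can be sketched roughly as follows. This is a minimal illustration, not the team's implementation: `mutate`, `evolve`, and `toy_score` are hypothetical names, the population sizes are arbitrary, and `toy_score` is a placeholder for the PyRosetta binding score, whose details the source does not give. Lower scores are treated as better, in line with the usual REU convention.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def mutate(seq, positions, rng):
    """Return a copy of seq with one random point mutation at an allowed position."""
    pos = rng.choice(positions)
    choices = [aa for aa in AMINO_ACIDS if aa != seq[pos]]
    return seq[:pos] + rng.choice(choices) + seq[pos + 1:]

def evolve(parent, positions, score, generations=10, pop_size=20, keep=5, seed=0):
    """Toy genetic algorithm: mutate, score, keep the lowest-scoring sequences."""
    rng = random.Random(seed)
    population = [parent]
    for _ in range(generations):
        children = [mutate(rng.choice(population), positions, rng)
                    for _ in range(pop_size)]
        # Parents compete with children, so the best score never gets worse.
        candidates = set(population + children)
        ranked = sorted(candidates, key=lambda s: (score(s), s))
        population = ranked[:keep]
    return population

def toy_score(seq):
    """Placeholder score (assumption): stands in for a PyRosetta binding energy."""
    return -seq.count("W")

if __name__ == "__main__":
    survivors = evolve("AAAAAAAAAA", positions=[1, 4, 7], score=toy_score)
    print(survivors)
```

In a real run, `toy_score` would be replaced by a function that builds the model and returns its binding score; the rest of the loop does not depend on how the score is computed.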

  • Any sequence with more than 4 mutations is discarded, because too many mutations greatly reduce the quality of the protein model. Only key mutations are kept.
  • The distance between mutations is kept as large as possible to reduce interference between them.
  • Before a protein is folded, a stability test is run to determine at what quality the protein should be folded.
  • Higher-scoring sequences are kept as the majority of the population, so that mutations do not have to be duplicated.
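
The first two rules above lend themselves to a short check. The sketch below is illustrative only: the function names are hypothetical, and measuring the distance between mutations as a difference in sequence index (rather than a distance in the folded structure) is an assumption, since the source does not say which was used.

```python
def mutated_positions(parent, variant):
    """Return the 0-based indices where variant differs from parent."""
    if len(parent) != len(variant):
        raise ValueError("sequences must have the same length")
    return [i for i, (a, b) in enumerate(zip(parent, variant)) if a != b]

def within_mutation_limit(parent, variant, max_mutations=4):
    """True if the variant carries at most max_mutations point mutations."""
    return len(mutated_positions(parent, variant)) <= max_mutations

def min_spacing(parent, variant):
    """Smallest gap between neighbouring mutations, measured in sequence index.

    Returns the sequence length when there are fewer than two mutations,
    so that such variants rank as maximally spread out.
    """
    positions = mutated_positions(parent, variant)
    if len(positions) < 2:
        return len(parent)
    return min(b - a for a, b in zip(positions, positions[1:]))

# Made-up sequences for illustration only.
parent = "ACDEFGHIKL"
variants = ["WCDEWGHIWL", "WWDEFGHIKL", "WWWWWGHIKL"]
kept = [v for v in variants if within_mutation_limit(parent, v)]
# Prefer variants whose mutations are farthest apart.
kept.sort(key=lambda v: min_spacing(parent, v), reverse=True)
print(kept)
```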

Mutation scans were also run on the antibody and spike protein files to show which amino acid sites merit further research and which spike protein mutations are likely to cause problems. The heatmaps are analyzed in the results section. Finally, 3 light chain and heavy chain sequences were picked from the population for having the best binding scores against the different variants in the spike protein mutation scan. We uploaded them as coding parts and composite parts for future iGEM teams to test as a neutralizing agent against COVID-19. The NEGEM team also uses our antibody in their project design.
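
The final selection step can be pictured as ranking candidates by their scores across spike variants. The sketch below is a hedged illustration: `pick_best` is a hypothetical helper, averaging the scores is an assumption (the source does not state how scores against different variants were combined), and the candidate names, variant names, and numbers are invented placeholders rather than project data. Lower scores are treated as better.

```python
from statistics import mean

def pick_best(scores, n=3):
    """Return the n candidates with the lowest mean score across variants.

    scores maps candidate name -> {spike variant name -> score}.
    Ties are broken by candidate name so the result is deterministic.
    """
    ranked = sorted(scores, key=lambda name: (mean(scores[name].values()), name))
    return ranked[:n]

# Invented placeholder values, for illustration only.
example_scores = {
    "light_A": {"variant_1": -10.0, "variant_2": -12.0},
    "light_B": {"variant_1": -15.0, "variant_2": -9.0},
    "light_C": {"variant_1": -8.0, "variant_2": -8.0},
    "light_D": {"variant_1": -20.0, "variant_2": -18.0},
}
print(pick_best(example_scores))
```

Other aggregation rules, such as ranking by the worst score across variants, would fit the same structure by swapping `mean` for `max`.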



Source

This part was created by mutating a wild-type section of the genome. The mutations were introduced to improve binding to the SARS-CoV-2 spike protein.

References