
Dr Justin Ashworth


I am an experimental, computational and systems biologist focused on the complexity, operation and design of living systems, including microbes and marine microeukaryotes (phytoplankton). At UTS I am a member of the Climate Change Cluster (C3), supported by a Discovery Early Career Researcher Award (DECRA) and a Chancellor's Postdoctoral Fellowship for research on marine microbial biology and biotechnology.

Recently our teams have discovered and characterized novel, environmentally important processes in marine microalgae by combining hypothesis-driven research, experimental systems biology ('omics), data-driven machine learning and laboratory genetics. This work investigates new molecular features in diatoms that may help to explain their dominance and importance in key marine ecosystems and biogeochemical processes. A more detailed understanding of these molecular processes also opens the way to new green and sustainable algal biotechnologies.

I was raised in Hawaii, began my scientific career in the Pacific Northwest of the United States and, prior to joining UTS, held a Postdoctoral Fellowship at the Institute for Systems Biology, Seattle.

Chancellor's Postdoctoral Research Fellow, Climate Change Cluster
Core Member, Climate Change Cluster
+61 2 9514 3211

Research Interests

Marine Microeukaryotes [Diatoms]
Discovery and elucidation of molecular, genetic, and evolutionary processes by which marine phytoplankton (diatoms) regulate and adapt their physiology in response to changing ocean conditions; conditional regulation of photosynthesis, growth and production in algae; integrative meta-analyses of large genome and transcriptome datasets.
Microbial Systems Biology and Gene Regulation
"Reverse engineering" of gene regulatory networks that control critical processes in existing and new species; comparative genomics; experimental systems biology.
Molecular Function and Evolution, Bioengineering, Synthetic Biology
Rational modeling and experimental manipulation of molecular variation in proteins, enzymes, gene regulatory processes and microbial systems.
Can supervise: Yes

Journal articles

Stittrich, A.B., Ashworth, J., Shi, M., Robinson, M., Mauldin, D., Brunkow, M.E., Biswas, S., Kim, J.M., Kwon, K.S., Jung, J.U., Galas, D., Serikawa, K., Duerr, R.H., Guthery, S.L., Peschon, J., Hood, L., Roach, J.C. & Glusman, G. 2016, 'Genomic architecture of inflammatory bowel disease in five families with multiple affected individuals.', Human genome variation, vol. 3, p. 15060.
Currently, the best clinical predictor for inflammatory bowel disease (IBD) is family history. Over 163 sequence variants have been associated with IBD in genome-wide association studies, but they have weak effects and explain only a fraction of the observed heritability. It is expected that additional variants contribute to the genomic architecture of IBD, possibly including rare variants with effect sizes larger than the identified common variants. Here we applied a family study design and sequenced 38 individuals from five families, under the hypothesis that families with multiple IBD-affected individuals harbor one or more risk variants that (i) are shared among affected family members, (ii) are rare and (iii) have substantial effect on disease development. Our analysis revealed not only novel candidate risk variants but also high polygenic risk scores for common known risk variants in four out of the five families. Functional analysis of our top novel variant in the remaining family, a rare missense mutation in the ubiquitin ligase TRIM11, suggests that it leads to increased nuclear factor of kappa light chain enhancer in B-cells (NF-κB) signaling. We conclude that an accumulation of common weak-effect variants accounts for the high incidence of IBD in most, but not all families we analyzed and that a family study design can identify novel rare variants conferring risk for IBD with potentially large effect size, such as the TRIM11 p.H414Y mutation.
Ashworth, J., Turkarslan, S., Harris, M., Orellana, M.V. & Baliga, N.S. 2016, 'Pan-transcriptomic analysis identifies coordinated and orthologous functional modules in the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum.', Marine genomics, vol. 26, pp. 21-28.
Diatoms are important primary producers in the ocean that thrive in diverse and dynamic environments. Their survival and success over changing conditions depend on the complex coordination of gene regulatory processes. Here we present an integrated analysis of all publicly available microarray data for the diatoms Thalassiosira pseudonana and Phaeodactylum tricornutum. This resource includes shared expression patterns, gene functions, and cis-regulatory DNA sequence motifs in each species that are statistically coordinated over many experiments. These data illustrate the coordination of transcriptional responses in diatoms over changing environmental conditions. Responses to silicic acid depletion segregate into multiple distinctly regulated groups of genes, regulation by heat shock transcription factors (HSFs) is implicated in the response to nitrate stress, and distinctly coordinated carbon concentrating, CO2 and pH-related responses are apparent. Fundamental features of diatom physiology are similarly coordinated between two distantly related diatom species, including the regulation of photosynthesis, cellular growth functions and lipid metabolism. These integrated data and analyses can be explored publicly (http://networks.systemsbiology.net/diatom-portal/).
Ament, S.A., Szelinger, S., Glusman, G., Ashworth, J., Hou, L., Akula, N., Shekhtman, T., Badner, J.A., Brunkow, M.E., Mauldin, D.E., Stittrich, A.B., Rouleau, K., Detera-Wadleigh, S.D., Nurnberger, J.I., Edenberg, H.J., Gershon, E.S., Schork, N., Price, N.D., Gelinas, R., Hood, L., Craig, D., McMahon, F.J., Kelsoe, J.R. & Roach, J.C. 2015, 'Rare variants in neuronal excitability genes influence risk for bipolar disorder.', Proceedings of the National Academy of Sciences of the United States of America, vol. 112, no. 11, pp. 3576-3581.
We sequenced the genomes of 200 individuals from 41 families multiply affected with bipolar disorder (BD) to identify contributions of rare variants to genetic risk. We initially focused on 3,087 candidate genes with known synaptic functions or prior evidence from genome-wide association studies. BD pedigrees had an increased burden of rare variants in genes encoding neuronal ion channels, including subunits of GABA(A) receptors and voltage-gated calcium channels. Four uncommon coding and regulatory variants also showed significant association, including a missense variant in GABRA6. Targeted sequencing of 26 of these candidate genes in an additional 3,014 cases and 1,717 controls confirmed rare variant associations in ANK3, CACNA1B, CACNA1C, CACNA1D, CACNG2, CAMK2A, and NGF. Variants in promoters and 5' and 3' UTRs contributed more strongly than coding variants to risk for BD, both in pedigrees and in the case-control cohort. The genes and pathways identified in this study regulate diverse aspects of neuronal excitability. We conclude that rare variants in neuronal excitability genes contribute to risk for BD.
Hennon, G.M.M., Ashworth, J., Groussman, R.D., Berthiaume, C., Morales, R.L., Baliga, N.S., Orellana, M.V. & Armbrust, E.V. 2015, 'Diatom acclimation to elevated CO2 via cAMP signalling and coordinated gene expression', Nature Climate Change, vol. 5, no. 8, pp. 761-765.
Plaisier, C.L., Lo, F.Y., Ashworth, J., Brooks, A.N., Beer, K.D., Kaur, A., Pan, M., Reiss, D.J., Facciotti, M.T. & Baliga, N.S. 2014, 'Evolution of context dependent regulation by expansion of feast/famine regulatory proteins.', BMC systems biology, vol. 8, p. 122.
BACKGROUND: Expansion of transcription factors is believed to have played a crucial role in evolution of all organisms by enabling them to deal with dynamic environments and colonize new environments. We investigated how the expansion of the Feast/Famine Regulatory Protein (FFRP) or Lrp-like proteins into an eight-member family in Halobacterium salinarum NRC-1 has aided in niche-adaptation of this archaeon to a complex and dynamically changing hypersaline environment. RESULTS: We mapped genome-wide binding locations for all eight FFRPs, investigated their preference for binding different effector molecules, and identified the contexts in which they act by analyzing transcriptional responses across 35 growth conditions that mimic different environmental and nutritional conditions this organism is likely to encounter in the wild. Integrative analysis of these data constructed an FFRP regulatory network with conditionally active states that reveal how interrelated variations in DNA-binding domains, effector-molecule preferences, and binding sites in target gene promoters have tuned the functions of each FFRP to the environments in which they act. We demonstrate how conditional regulation of similar genes by two FFRPs, AsnC (an activator) and VNG1237C (a repressor), have striking environment-specific fitness consequences for oxidative stress management and growth, respectively. CONCLUSIONS: This study provides a systems perspective into the evolutionary process by which gene duplication within a transcription factor family contributes to environment-specific adaptation of an organism.
Ashworth, J., Bernard, B., Reynolds, S., Plaisier, C.L., Shmulevich, I. & Baliga, N.S. 2014, 'Structure-based predictions broadly link transcription factor mutations to gene expression changes in cancers.', Nucleic acids research, vol. 42, no. 21, pp. 12973-12983.
Thousands of unique mutations in transcription factors (TFs) arise in cancers, and the functional and biological roles of relatively few of these have been characterized. Here, we used structure-based methods developed specifically for DNA-binding proteins to systematically predict the consequences of mutations in several TFs that are frequently mutated in cancers. The explicit consideration of protein-DNA interactions was crucial to explain the roles and prevalence of mutations in TP53 and RUNX1 in cancers, and resulted in a higher specificity of detection for known p53-regulated genes among genetic associations between TP53 genotypes and genome-wide expression in The Cancer Genome Atlas, compared to existing methods of mutation assessment. Biophysical predictions also indicated that the relative prevalence of TP53 missense mutations in cancer is proportional to their thermodynamic impacts on protein stability and DNA binding, which is consistent with the selection for the loss of p53 transcriptional function in cancers. Structure and thermodynamics-based predictions of the impacts of missense mutations that focus on specific molecular functions may be increasingly useful for the precise and large-scale inference of aberrant molecular phenotypes in cancer and other complex diseases.
Ashworth, J., Plaisier, C.L., Lo, F.Y., Reiss, D.J. & Baliga, N.S. 2014, 'Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction.', PloS one, vol. 9, no. 9, p. e107863.
Widespread microbial genome sequencing presents an opportunity to understand the gene regulatory networks of non-model organisms. This requires knowledge of the binding sites for transcription factors whose DNA-binding properties are unknown or difficult to infer. We adapted a protein structure-based method to predict the specificities and putative regulons of homologous transcription factors across diverse species. As a proof-of-concept we predicted the specificities and transcriptional target genes of divergent archaeal feast/famine regulatory proteins, several of which are encoded in the genome of Halobacterium salinarum. This was validated by comparison to experimentally determined specificities for transcription factors in distantly related extremophiles, chromatin immunoprecipitation experiments, and cis-regulatory sequence conservation across eighteen related species of halobacteria. Through this analysis we were able to infer that Halobacterium salinarum employs a divergent local trans-regulatory strategy to regulate genes (carA and carB) involved in arginine and pyrimidine metabolism, whereas Escherichia coli employs an operon. The prediction of gene regulatory binding sites using structure-based methods is useful for the inference of gene regulatory relationships in new species that are otherwise difficult to infer.
Stittrich, A.B., Lehman, A., Bodian, D.L., Ashworth, J., Zong, Z., Li, H., Lam, P., Khromykh, A., Iyer, R.K., Vockley, J.G., Baveja, R., Silva, E.S., Dixon, J., Leon, E.L., Solomon, B.D., Glusman, G., Niederhuber, J.E., Roach, J.C. & Patel, M.S. 2014, 'Mutations in NOTCH1 cause Adams-Oliver syndrome.', American journal of human genetics, vol. 95, no. 3, pp. 275-284.
Notch signaling determines and reinforces cell fate in bilaterally symmetric multicellular eukaryotes. Despite the involvement of Notch in many key developmental systems, human mutations in Notch signaling components have mainly been described in disorders with vascular and bone effects. Here, we report five heterozygous NOTCH1 variants in unrelated individuals with Adams-Oliver syndrome (AOS), a rare disease with major features of aplasia cutis of the scalp and terminal transverse limb defects. Using whole-genome sequencing in a cohort of 11 families lacking mutations in the four genes with known roles in AOS pathology (ARHGAP31, RBPJ, DOCK6, and EOGT), we found a heterozygous de novo 85 kb deletion spanning the NOTCH1 5' region and three coding variants (c.1285T>C [p.Cys429Arg], c.4487G>A [p.Cys1496Tyr], and c.5965G>A [p.Asp1989Asn]), two of which are de novo, in four unrelated probands. In a fifth family, we identified a heterozygous canonical splice-site variant (c.743-1 G>T) in an affected father and daughter. These variants were not present in 5,077 in-house control genomes or in public databases. In keeping with the prominent developmental role described for Notch1 in mouse vasculature, we observed cardiac and multiple vascular defects in four of the five families. We propose that the limb and scalp defects might also be due to a vasculopathy in NOTCH1-related AOS. Our results suggest that mutations in NOTCH1 are the most common cause of AOS and add to a growing list of human diseases that have a vascular and/or bony component and are caused by alterations in the Notch signaling pathway.
Thyme, S.B., Boissel, S.J., Arshiya Quadri, S., Nolan, T., Baker, D.A., Park, R.U., Kusak, L., Ashworth, J. & Baker, D. 2014, 'Reprogramming homing endonuclease specificity through computational design and directed evolution.', Nucleic acids research, vol. 42, no. 4, pp. 2564-2576.
Homing endonucleases (HEs) can be used to induce targeted genome modification to reduce the fitness of pathogen vectors such as the malaria-transmitting Anopheles gambiae and to correct deleterious mutations in genetic diseases. We describe the creation of an extensive set of HE variants with novel DNA cleavage specificities using an integrated experimental and computational approach. Using computational modeling and an improved selection strategy, which optimizes specificity in addition to activity, we engineered an endonuclease to cleave in a gene associated with Anopheles sterility and another to cleave near a mutation that causes pyruvate kinase deficiency. In the course of this work we observed unanticipated context-dependence between bases which will need to be mechanistically understood for reprogramming of specificity to succeed more generally.
Ashworth, J., Plaisier, C.L., Lo, F.Y., Reiss, D.J. & Baliga, N.S. 2014, 'Correction: Inference of expanded Lrp-like feast/famine transcription factor targets in a non-model organism using protein structure-based prediction', PLoS ONE, vol. 9, no. 11.
Ashworth, J., Coesel, S., Lee, A., Armbrust, E.V., Orellana, M.V. & Baliga, N.S. 2013, 'Genome-wide diel growth state transitions in the diatom Thalassiosira pseudonana.', Proceedings of the National Academy of Sciences of the United States of America, vol. 110, no. 18, pp. 7518-7523.
Marine diatoms are important primary producers that thrive in diverse and dynamic environments. They do so, in theory, by sensing changing conditions and adapting their physiology accordingly. Using the model species Thalassiosira pseudonana, we conducted a detailed physiological and transcriptomic survey to measure the recurrent transcriptional changes that characterize typical diatom growth in batch culture. Roughly 40% of the transcriptome varied significantly and recurrently, reflecting large, reproducible cell-state transitions between four principal states: (i) "dawn," following 12 h of darkness; (ii) "dusk," following 12 h of light; (iii) exponential growth and nutrient repletion; and (iv) stationary phase and nutrient depletion. Increases in expression of thousands of genes at the end of the reoccurring dark periods (dawn), including those involved in photosynthesis (e.g., ribulose-1,5-bisphosphate carboxylase oxygenase genes rbcS and rbcL), imply large-scale anticipatory circadian mechanisms at the level of gene regulation. Repeated shifts in the transcript levels of hundreds of genes encoding sensory, signaling, and regulatory functions accompanied the four cell-state transitions, providing a preliminary map of the highly coordinated gene regulatory program under varying conditions. Several putative light sensing and signaling proteins were associated with recurrent diel transitions, suggesting that these genes may be involved in light-sensitive and circadian regulation of cell state. These results begin to explain, in comprehensive detail, how the diatom gene regulatory program operates under varying environmental conditions. Detailed knowledge of this dynamic molecular process will be invaluable for new hypothesis generation and the interpretation of genetic, environmental, and metatranscriptomic data from field studies.
Ashworth, J., Wurtmann, E.J. & Baliga, N.S. 2012, 'Reverse engineering systems models of regulation: discovery, prediction and mechanisms.', Current opinion in biotechnology, vol. 23, no. 4, pp. 598-603.
Biological systems can now be understood in comprehensive and quantitative detail using systems biology approaches. Putative genome-scale models can be built rapidly based upon biological inventories and strategic system-wide molecular measurements. Current models combine statistical associations, causative abstractions, and known molecular mechanisms to explain and predict quantitative and complex phenotypes. This top-down 'reverse engineering' approach generates useful organism-scale models despite noise and incompleteness in data and knowledge. Here we review and discuss the reverse engineering of biological systems using top-down data-driven approaches, in order to improve discovery, hypothesis generation, and the inference of biological properties.
Fleishman, S.J., Leaver-Fay, A., Corn, J.E., Strauch, E.M., Khare, S.D., Koga, N., Ashworth, J., Murphy, P., Richter, F., Lemmon, G., Meiler, J. & Baker, D. 2011, 'RosettaScripts: a scripting language interface to the Rosetta macromolecular modeling suite.', PloS one, vol. 6, no. 6, p. e20161.
Macromolecular modeling and design are increasingly useful in basic research, biotechnology, and teaching. However, the absence of a user-friendly modeling framework that provides access to a wide range of modeling capabilities is hampering the wider adoption of computational methods by non-experts. RosettaScripts is an XML-like language for specifying modeling tasks in the Rosetta framework. RosettaScripts provides access to protocol-level functionalities, such as rigid-body docking and sequence redesign, and allows fast testing and deployment of complex protocols without need for modifying or recompiling the underlying C++ code. We illustrate these capabilities with RosettaScripts protocols for the stabilization of proteins, the generation of computationally constrained libraries for experimental selection of higher-affinity binding proteins, loop remodeling, small-molecule ligand docking, design of ligand-binding proteins, and specificity redesign in DNA-binding proteins.
Ashworth, J., Taylor, G.K., Havranek, J.J., Quadri, S.A., Stoddard, B.L. & Baker, D. 2010, 'Computational reprogramming of homing endonuclease specificity at multiple adjacent base pairs.', Nucleic acids research, vol. 38, no. 16, pp. 5601-5608.
Site-specific homing endonucleases are capable of inducing gene conversion via homologous recombination. Reprogramming their cleavage specificities allows the targeting of specific biological sites for gene correction or conversion. We used computational protein design to alter the cleavage specificity of I-MsoI for three contiguous base pair substitutions, resulting in an endonuclease whose activity and specificity for its new site rival that of wild-type I-MsoI for the original site. Concerted design for all simultaneous substitutions was more successful than a modular approach against individual substitutions, highlighting the importance of context-dependent redesign and optimization of protein-DNA interactions. We then used computational design based on the crystal structure of the designed complex, which revealed significant unanticipated shifts in DNA conformation, to create an endonuclease that specifically cleaves a site with four contiguous base pair substitutions. Our results demonstrate that specificity switches for multiple concerted base pair substitutions can be computationally designed, and that iteration between design and structure determination provides a route to large scale reprogramming of specificity.
Thyme, S.B., Jarjour, J., Takeuchi, R., Havranek, J.J., Ashworth, J., Scharenberg, A.M., Stoddard, B.L. & Baker, D. 2009, 'Exploitation of binding energy for catalysis and design.', Nature, vol. 461, no. 7268, pp. 1300-1304.
Enzymes use substrate-binding energy both to promote ground-state association and to stabilize the reaction transition state selectively. The monomeric homing endonuclease I-AniI cleaves with high sequence specificity in the centre of a 20-base-pair (bp) DNA target site, with the amino (N)-terminal domain of the enzyme making extensive binding interactions with the left (-) side of the target site and the similarly structured carboxy (C)-terminal domain interacting with the right (+) side. Here we show that, despite the approximate twofold symmetry of the enzyme-DNA complex, there is almost complete segregation of interactions responsible for substrate binding to the (-) side of the interface and interactions responsible for transition-state stabilization to the (+) side. Although single base-pair substitutions throughout the entire DNA target site reduce catalytic efficiency, mutations in the (-) DNA half-site almost exclusively increase the dissociation constant (K(D)) and the Michaelis constant under single-turnover conditions (K(M)*), and those in the (+) half-site primarily decrease the turnover number (k(cat)*). The reduction of activity produced by mutations on the (-) side, but not mutations on the (+) side, can be suppressed by tethering the substrate to the endonuclease displayed on the surface of yeast. This dramatic asymmetry in the use of enzyme-substrate binding energy for catalysis has direct relevance to the redesign of endonucleases to cleave genomic target sites for gene therapy and other applications. Computationally redesigned enzymes that achieve new specificities on the (-) side do so by modulating K(M)*, whereas redesigns with altered specificities on the (+) side modulate k(cat)*. Our results illustrate how classical enzymology and modern protein design can each inform the other.
Ashworth, J. & Baker, D. 2009, 'Assessment of the optimization of affinity and specificity at protein-DNA interfaces.', Nucleic acids research, vol. 37, no. 10, p. e73.
The biological functions of DNA-binding proteins often require that they interact with their targets with high affinity and/or high specificity. Here, we describe a computational method that estimates the extent of optimization for affinity and specificity of amino acids at a protein-DNA interface based on the crystal structure of the complex, by modeling the changes in binding-free energy associated with all individual amino acid and base substitutions at the interface. The extent to which residues are predicted to be optimal for specificity versus affinity varies within a given protein-DNA interface and between different complexes, and in many cases recapitulates previous experimental observations. The approach provides a complement to traditional methods of mutational analysis, and should be useful for rapidly formulating hypotheses about the roles of amino acid residues in protein-DNA interfaces.
Eastberg, J.H., McConnell Smith, A., Zhao, L., Ashworth, J., Shen, B.W. & Stoddard, B.L. 2007, 'Thermodynamics of DNA target site recognition by homing endonucleases.', Nucleic acids research, vol. 35, no. 21, pp. 7209-7221.
The thermodynamic profiles of target site recognition have been surveyed for homing endonucleases from various structural families. Similar to DNA-binding proteins that recognize shorter target sites, homing endonucleases display a narrow range of binding free energies and affinities, mediated by structural interactions that balance the magnitude of enthalpic and entropic forces. While the balance of ΔH and TΔS are not strongly correlated with the overall extent of DNA bending, unfavorable ΔH(binding) is associated with unstacking of individual base steps in the target site. The effects of deleterious basepair substitutions in the optimal target sites of two LAGLIDADG homing endonucleases, and the subsequent effect of redesigning one of those endonucleases to accommodate that DNA sequence change, were also measured. The substitution of base-specific hydrogen bonds in a wild-type endonuclease/DNA complex with hydrophobic van der Waals contacts in a redesigned complex reduced the ability to discriminate between sites, due to nonspecific ΔS(binding).
Bateman, R.L., Ashworth, J., Witte, J.F., Baker, L.J., Bhanumoorthy, P., Timm, D.E., Hurley, T.D., Grompe, M. & McClard, R.W. 2007, 'Slow-onset inhibition of fumarylacetoacetate hydrolase by phosphinate mimics of the tetrahedral intermediate: kinetics, crystal structure and pharmacokinetics.', The Biochemical journal, vol. 402, no. 2, pp. 251-260.
FAH (fumarylacetoacetate hydrolase) catalyses the final step of tyrosine catabolism to produce fumarate and acetoacetate. HT1 (hereditary tyrosinaemia type 1) results from deficiency of this enzyme. Previously, we prepared a partial mimic of the putative tetrahedral intermediate in the reaction catalysed by FAH co-crystallized with the enzyme to reveal details of the mechanism [Bateman, Bhanumoorthy, Witte, McClard, Grompe and Timm (2001) J. Biol. Chem. 276, 15284-15291]. We have now successfully synthesized complete mimics CEHPOBA {4-[(2-carboxyethyl)-hydroxyphosphinyl]-3-oxobutyrate} and COPHPAA {3-[(3-carboxy-2-oxopropyl)hydroxyphosphinyl]acrylate}, which inhibit FAH in slow-onset tight-binding mode with K(i) values of 41 and 12 nM respectively. A high-resolution (1.35 Å; 1 Å = 0.1 nm) crystal structure of the FAH·CEHPOBA complex was solved to reveal the affinity determinants for these compounds and to provide further insight into the mechanism of FAH catalysis. These compounds are active in vivo, and CEHPOBA demonstrated a notable dose-dependent increase in SA (succinylacetone; a metabolite seen in patients with HT1) in mouse serum after repeated injections, and, following a single injection (1 μmol/g; intraperitoneal), only a modest regain of FAH enzyme activity was detected in liver protein isolates after 24 h. These potent inhibitors provide a means to chemically phenocopy the metabolic defects of either HT1 or FAH knockout mice and promise future pharmacological utility for hepatocyte transplantation.
Ashworth, J., Havranek, J.J., Duarte, C.M., Sussman, D., Monnat, R.J., Stoddard, B.L. & Baker, D. 2006, 'Computational redesign of endonuclease DNA binding and cleavage specificity.', Nature, vol. 441, no. 7093, pp. 656-659.
The reprogramming of DNA-binding specificity is an important challenge for computational protein design that tests current understanding of protein-DNA recognition, and has considerable practical relevance for biotechnology and medicine. Here we describe the computational redesign of the cleavage specificity of the intron-encoded homing endonuclease I-MsoI using a physically realistic atomic-level forcefield. Using an in silico screen, we identified single base-pair substitutions predicted to disrupt binding by the wild-type enzyme, and then optimized the identities and conformations of clusters of amino acids around each of these unfavourable substitutions using Monte Carlo sampling. A redesigned enzyme that was predicted to display altered target site specificity, while maintaining wild-type binding affinity, was experimentally characterized. The redesigned enzyme binds and cleaves the redesigned recognition site approximately 10,000 times more effectively than does the wild-type enzyme, with a level of target discrimination comparable to the original endonuclease. Determination of the structure of the redesigned nuclease-recognition site complex by X-ray crystallography confirms the accuracy of the computationally predicted interface. These results suggest that computational protein design methods can have an important role in the creation of novel highly specific endonucleases for gene therapy and other applications.