A microbial array chip with collagen gel spots entrapping living __Escherichia coli__ (__E. coli__) DH5α was applied for the screening of recombinant protein solubilities. The α‐fragment of β‐galactosidase (βGal) was fused to the target protein, namely, maltose‐binding protein (MBP), to
Prediction of protein solubility in Escherichia coli using logistic regression
✍ Scribed by Armando A. Diaz; Emanuele Tomba; Reese Lennarson; Rex Richard; Miguel J. Bagajewicz; Roger G. Harrison
- Publisher: John Wiley and Sons
- Year: 2010
- Language: English
- File size: 254 KB
- Volume: 105
- Category: Article
- ISSN: 0006-3592
✦ Synopsis
Abstract
In this article we present a new and more accurate model for the prediction of the solubility of proteins overexpressed in the bacterium Escherichia coli. The model uses the statistical technique of logistic regression. To build this model, 32 parameters that could potentially correlate well with solubility were used. In addition, the protein database was expanded compared to those used previously. We tested several different implementations of logistic regression with varied results. The best implementation, which is the one we report, exhibits excellent overall prediction accuracies: 94% for the model and 87% by cross‐validation. For comparison, we also tested discriminant analysis using the same parameters, and we obtained a less accurate prediction (69% cross‐validation accuracy for the stepwise forward plus interactions model). Biotechnol. Bioeng. 2010; 105: 374–383. © 2009 Wiley Periodicals, Inc.
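As a rough illustration of the technique the abstract names, the sketch below fits a logistic regression classifier and estimates its accuracy both on the training data and by k-fold cross-validation. It is not the authors' model: the two features, the synthetic data, the gradient-descent fitting routine, and all function names (`fit_logistic`, `cross_val_accuracy`, and so on) are assumptions made for this example, whereas the paper starts from 32 candidate parameters computed for real proteins.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Fit weights (bias first) by batch gradient descent on the log-loss."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return 1 ("soluble") if the predicted probability is at least 0.5."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0


def accuracy(w, X, y):
    """Fraction of samples whose predicted class matches the label."""
    return sum(predict(w, xi) == yi for xi, yi in zip(X, y)) / len(y)


def cross_val_accuracy(X, y, k=5):
    """Mean held-out accuracy over k interleaved folds."""
    idx = list(range(len(X)))
    scores = []
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        X_train = [X[i] for i in idx if i not in held]
        y_train = [y[i] for i in idx if i not in held]
        w = fit_logistic(X_train, y_train)
        scores.append(accuracy(w, [X[i] for i in fold], [y[i] for i in fold]))
    return sum(scores) / k


# Synthetic stand-in data (hypothetical): two features, noisy linear boundary.
rng = random.Random(0)
X, y = [], []
for _ in range(200):
    x = [rng.gauss(0, 1), rng.gauss(0, 1)]
    X.append(x)
    y.append(1 if x[0] - x[1] + rng.gauss(0, 0.3) > 0 else 0)

w = fit_logistic(X, y)
train_acc = accuracy(w, X, y)
cv_acc = cross_val_accuracy(X, y)
print(f"training accuracy: {train_acc:.2f}, cross-validated accuracy: {cv_acc:.2f}")
```

The gap between the two printed figures mirrors the distinction the abstract draws between model accuracy and cross-validation accuracy: the first is measured on data the model has seen, the second on held-out folds.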
📜 SIMILAR VOLUMES
Many enzymes or fluorescent proteins produced in __Escherichia coli__ are enzymatically active or fluorescent respectively when deposited as inclusion bodies. The occurrence of insoluble but functional protein species with native‐like secondary structure indicates that solubility and co
We have constructed three plasmid vectors for the expression of green fluorescent protein (GFP) fusion proteins using the following motif: (His) 6 -GFP-EK-X, where X represents chloramphenicol acetyl-transferase (CAT), human interleukin-2 (hIL-2), and organophosphorous hydrolase (OPH), respectively,