What is a protein?
A protein is a chain of amino acids in which each amino acid is joined to the next by a peptide bond. Twenty different amino acids are used in the majority of proteins, and each has a different functional group on its side chain. These functional groups differ chemically and can interact with the functional groups of other amino acids, either in the same protein molecule or in other protein molecules.
The result of these interactions often defines the shape and structure of individual protein molecules or of complexes of protein molecules. In addition, under certain conditions (for example, different pH values or the presence of enzymes) the functional groups can be modified. Modifications can occur naturally in the cell or organism, or as an artifact of sample preparation and handling. The side chain functional groups can also be modified deliberately during sample preparation to “label” the group, for example by adding biotin to the primary amine on the side chain of lysine.
What is a peptide?
A peptide is a short chain of amino acids, typically 4-50 amino acids in length. Peptides are most often generated by proteolytic cleavage, or digestion, of larger proteins. In current proteomics methodologies, proteins are digested with proteolytic enzymes that cut only at specific amino acids, because mass spectrometry analysis of peptides is more sensitive and specific than analysis of intact proteins. Trypsin, the most widely used proteolytic enzyme in proteomics, cuts proteins and peptides on the C-terminal (or carboxyl) side of the basic amino acids lysine and arginine.
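The cleavage rule described above can be illustrated with a short in silico digest. This is a minimal sketch, not a full model of trypsin: it assumes cleavage after every lysine (K) and arginine (R) with no missed cleavages and no exceptions, and it keeps only peptides within the 4-50 residue range mentioned above. The function name `tryptic_digest`, its parameters and the example sequence are hypothetical and chosen purely for illustration.

```python
def tryptic_digest(sequence, min_len=4, max_len=50):
    """Cut a one-letter amino acid sequence after every K or R.

    Simplified rule (an assumption of this sketch): every K or R is
    cleaved on its C-terminal side, with no missed cleavages.
    Peptides outside [min_len, max_len] residues are discarded.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            # The K or R stays at the C-terminus of the peptide it ends.
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Whatever follows the last cleavage site is the final peptide.
        peptides.append(sequence[start:])
    return [p for p in peptides if min_len <= len(p) <= max_len]


# Arbitrary example string, used only to exercise the rule.
example = "MKWVTFISLLLLFSSAYSRGVFRRDTHK"
peptides = tryptic_digest(example)
print(peptides)  # ['WVTFISLLLLFSSAYSR', 'GVFR', 'DTHK']
```

The fragments "MK" and "R" are also produced by the cut but fall below the length filter, which shows why very short cleavage products contribute little to a peptide-based analysis.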
What is proteomics?
Proteomics is the study of the complete set of proteins (proteome) that is expressed at a given time in a cell, tissue, organ or organism.
The term “proteomics” can essentially be applied to any work on proteins, whether that work involves completely cataloguing the expressed proteome of a cell, characterising the post-translational modifications of proteins at the single isoform or whole proteome level, or characterising protein-protein interactions and protein complexes, to name but a few.
While the genome of an organism is rather constant, the proteome can differ from cell to cell and changes constantly as the cell interacts with its own genome and the environment. In addition, proteins undergo post-translational modifications, and their interactions with other proteins affect cells in ways not predicted by the genome.
The Human Genome Project found that there are far fewer protein-coding genes in the human genome than proteins in the human proteome (20,000 to 25,000 genes vs. more than 500,000 proteins). This protein diversity is thought to be due to alternative splicing and post-translational modification of proteins. The discrepancy implies that protein diversity cannot be fully explained by genomics, so proteomics is necessary to determine which proteins are expressed and how they are regulated in different states of the cell.
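The size of this discrepancy can be made concrete with a little arithmetic. The sketch below uses only the figures quoted above and treats 500,000 as a lower bound on the number of proteins; the variable and function names are hypothetical.

```python
# Figures quoted in the text above.
genes_low, genes_high = 20_000, 25_000
proteins_min = 500_000  # lower bound ("more than 500,000")


def proteins_per_gene(proteins, genes):
    """Average number of distinct proteins per protein-coding gene."""
    return proteins / genes


ratio_if_25k_genes = proteins_per_gene(proteins_min, genes_high)
ratio_if_20k_genes = proteins_per_gene(proteins_min, genes_low)
print(ratio_if_25k_genes, ratio_if_20k_genes)  # 20.0 25.0
```

On these numbers each gene corresponds, on average, to at least 20 to 25 distinct proteins, which is the gap that alternative splicing and post-translational modification are thought to account for.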
The main tools of proteomic projects are two-dimensional gel electrophoresis and multi-dimensional chromatography techniques, both of which are coupled to mass spectrometry.
Planning a proteomics experiment
With every new sample, the researcher needs to ask a simple question: what information do I want from this sample?
This seemingly simple question leads to further, sometimes open-ended questions that together form an elaborate and detailed decision tree. The answers can be numerous and raise more questions, such as:
- What is the sample, where does it come from and how much have I got?
- What type of sample fractionation do I need to perform to reduce the complexity and dynamic concentration range of the sample?
- Am I cataloguing the proteome?
- Am I studying protein complexes or interactions?
- Do I need to resolve protein isoforms?
- Am I trying to quantitate the abundance of the proteins?
Careful consideration of these questions will then determine the techniques adopted to solubilise and fractionate the sample and to reduce its complexity. Quite simply, if a protein (or its peptides) cannot be solubilised, then the protein cannot be analysed. The selection of these techniques will in turn determine what can be done to the sample chemically and in what order the fractionation steps must be carried out to ensure good fractionation and high recovery of the fractionated proteins.
The chemistry of the sample must be at the forefront of initial experimental planning to ensure adequate removal of non-protein contaminants, such as salts, lipids, polysaccharides and nucleic acids, while either maintaining protein solubility or solubilising membrane proteins.