This blog post examines, from an omics research perspective, how protein combinations change as cells differentiate and how these changes relate to the functions of biological systems and the onset of disease.
One of the core keywords in modern life sciences is omics. Past life science research focused on analyzing the function and structure of individual genes and proteins, whereas omics studies entire sets of molecules at once. It encompasses fields such as genomics (studying the entire set of genes, or genome, in an organism or cell), transcriptomics (studying the entire set of RNA, or transcriptome), and proteomics (studying the entire set of proteins, or proteome).
According to molecular biology theory, only a portion of the genetic information contained in DNA is transcribed into RNA, and only a portion of that RNA is translated into protein. The genome of a specific biological system, such as an organism or a cell, holds the complete genetic information for all functions that system is capable of performing. The genome of a human as a whole and the genome of a smaller system within that human, such as a liver cell, contain the same information. However, the genome of a human liver cell and that of a mouse liver cell contain distinct information. The transcriptome, meanwhile, reflects which parts of the genomic information are currently being read out, and therefore which functions are likely to be active, while the proteome, produced from part of the transcriptome, reflects the functions actually being performed. The substances that directly perform essential ‘work’ in living organisms, such as catalyzing biochemical reactions, are the proteins that make up the proteome.
Humans possess over 20,000 distinct proteins, and human cells, depending on their type, possess different combinations of these proteins. That is, while some proteins are commonly found in skin cells, nerve cells, muscle cells, etc., other proteins are found only in specific cell types. Cells undergo a process called differentiation, where one cell type transforms into another in response to external stimuli or an inherent program. When cells change through differentiation, the combination of proteins they possess also changes. While cell differentiation is prominently observed during individual development, the process by which normal cells transform into cancer cells can also be understood as a differentiation process.
Consider a case where proteomics-based research is applied to a patient’s cancer cells and normal cells. Comparing the proteomes of the two cell types allows the identification of proteins whose levels have changed in cancer cells relative to normal cells. Scientists treat these proteins as potential new therapeutic targets for cancer treatment and pursue research on them. Proteins whose levels are increased in cancer cells can be candidate oncoproteins (the products of oncogenes), while proteins whose levels are decreased in cancer cells can be candidate tumor suppressor proteins.
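The comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protein names and abundance values are made up, the function name `compare_proteomes` is hypothetical, and a log2 fold change of at least 1 (a twofold difference) is an arbitrary cutoff chosen for the example. Real studies also use replicate measurements and statistical tests, which are left out here.

```python
import math


def compare_proteomes(cancer, normal, threshold=1.0):
    """Split shared proteins into increased and decreased groups.

    cancer, normal: dicts mapping protein name -> measured abundance.
    threshold: minimum absolute log2 fold change (assumed cutoff).
    """
    increased, decreased = {}, {}
    # Only proteins measured in both samples can be compared as a ratio.
    for protein in cancer.keys() & normal.keys():
        if cancer[protein] <= 0 or normal[protein] <= 0:
            continue  # a ratio is undefined for zero or negative values
        log2_fc = math.log2(cancer[protein] / normal[protein])
        if log2_fc >= threshold:
            increased[protein] = log2_fc
        elif log2_fc <= -threshold:
            decreased[protein] = log2_fc
    return increased, decreased


# Hypothetical abundances, in arbitrary units, for illustration only.
cancer_cells = {"protein_A": 800.0, "protein_B": 50.0, "protein_C": 210.0}
normal_cells = {"protein_A": 100.0, "protein_B": 400.0, "protein_C": 200.0}

up, down = compare_proteomes(cancer_cells, normal_cells)
print("Increased in cancer cells:", up)
print("Decreased in cancer cells:", down)
```

In this made-up data set, protein_A would be flagged as increased and protein_B as decreased, while protein_C changes too little to pass the cutoff.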
How, then, do researchers determine which of the more than 20,000 human proteins each of these discovered proteins is? Proteins consist of 20 types of amino acids linked in a linear sequence, with each protein averaging about 500 amino acids. Since different proteins have different amino acid sequences, knowing the amino acid sequence of a specific protein allows its identity to be determined.
Several experimental methods exist to identify a protein from its amino acid sequence, one of which is peptide molecular weight analysis, often called peptide mass fingerprinting. This involves treating an unknown protein with trypsin to cleave it into peptides, which are fragments averaging about 10 amino acids, and then measuring the molecular weight of each peptide. Trypsin cleaves on the carboxyl side of two specific amino acids, lysine and arginine, so it is possible to predict from a sequence where the cleavage will occur. Proteomic analysis data is accordingly presented numerically as peptide molecular weight values and the relative abundance of peptides. Because the amino acid sequences of human proteins are already catalogued, the molecular weights of the peptides each protein should yield can be calculated in advance. The measured peptide weights obtained by treating the proteomes of cancer cells and normal cells with trypsin can then be matched against these calculated values to identify candidate therapeutic target proteins.
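To make the matching step concrete, here is a minimal Python sketch of the idea. It assumes the commonly used rule that trypsin cleaves after lysine (K) or arginine (R) unless the next residue is proline (P), and it uses standard monoisotopic residue masses. The two candidate sequences, the observed masses, the 0.01 Da tolerance, and the function names are all hypothetical choices for illustration. Real search tools also handle missed cleavages, chemical modifications, and ion charge states.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free ends


def tryptic_digest(sequence):
    """Cut after K or R, except when the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal remainder
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def score_candidates(observed_masses, database, tolerance=0.01):
    """Count, per protein, how many observed masses match a predicted peptide."""
    scores = {}
    for name, sequence in database.items():
        predicted = [peptide_mass(p) for p in tryptic_digest(sequence)]
        scores[name] = sum(
            any(abs(obs - mass) <= tolerance for mass in predicted)
            for obs in observed_masses
        )
    return scores


# Hypothetical sequences and measurements, for illustration only.
database = {"candidate_1": "AAKGGRPLLKMM", "candidate_2": "GGKAAAR"}
observed = [288.18, 739.47, 280.09]

scores = score_candidates(observed, database)
print(max(scores, key=scores.get), scores)
```

The protein whose predicted peptides explain the most observed masses is the best candidate. Note that leucine and isoleucine have identical masses, one reason a match on weight alone suggests an identity rather than proving it.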