Bioinfo | 001. Reverse Complement
Quick blog to document some progress on learning bioinfo analysis.
What I did today:
Found some notes on bioinformatics basics on Kaggle, covering tools such as Biopython and Bioconductor.
Understood what a reverse complement is.
Reverse complement is an operation on DNA sequences (it applies to RNA as well, with uracil in place of thymine). To understand it, we need to know that DNA is built from four kinds of nucleotides, each named after its base:
- Adenine (A)
- Thymine (T)
- Cytosine (C)
- Guanine (G)
In the DNA double helix, bases pair with each other in a specific way:
- A pairs with T
- C pairs with G
Why is Reverse Complement Possible?
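DNA is double-stranded, and each strand has a complementary strand. Complementary means that each base sits opposite its pairing base. The two strands also run in opposite directions (they are antiparallel), and a sequence is conventionally written from its 5' end to its 3' end, so reading the partner strand in the usual direction means complementing the bases and reversing their order. Taking the reverse complement therefore involves two steps: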
- Complement: replace each base with its complementary base.
- Reverse: reverse the complemented sequence.
For example, take the DNA sequence “ATGC”. First, complement each base:
- A complements T
- T complements A
- G complements C
- C complements G
So the complementary sequence is “TACG”. Then reverse it: “TACG” read backwards is “GCAT”, which is the reverse complement of “ATGC”.
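The two steps above can be sketched in plain Python. This is only a minimal illustration, not a library implementation: the names `COMPLEMENT`, `complement`, and `reverse_complement` are my own, and the code assumes the input contains only the four bases A, T, C, and G.

```python
# Pairing rules for DNA: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(seq: str) -> str:
    """Step 1: replace each base with its pairing base."""
    # Raises KeyError for any character that is not A, T, C, or G.
    return "".join(COMPLEMENT[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Step 2: reverse the complemented sequence."""
    return complement(seq)[::-1]


print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

Applying `reverse_complement` twice gives back the original sequence, which is a handy sanity check. For real work, Biopython's `Seq` object has a `reverse_complement()` method, so a hand-written version like this is mostly useful for learning.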
Why is Reverse Complement Useful?
In biology, the reverse complement sequence is useful in many situations. For example, when scientists design primers (short DNA fragments used to amplify DNA segments), the reverse primer has to bind to the strand that is written down, so its sequence is the reverse complement of the end of the target region.
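As a rough sketch of the primer case, the snippet below takes the last few bases of a template and returns their reverse complement. The helper name `reverse_primer`, the template sequence, and the primer length are all made up for illustration. Real primer design also considers length (usually around 18 to 25 bases), melting temperature, and GC content, none of which is handled here.

```python
# Translation table mapping each base to its pairing base.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the result."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def reverse_primer(template: str, length: int) -> str:
    """Hypothetical helper: reverse complement of the last `length` bases."""
    if not 0 < length <= len(template):
        raise ValueError("length must be between 1 and len(template)")
    return reverse_complement(template[-length:])


template = "ATGCGTACGTTAGC"  # made-up sequence, for illustration only
print(reverse_primer(template, 5))  # GCTAA
```

The last five bases of the template are “TTAGC”; complementing gives “AATCG”, and reversing gives “GCTAA”.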