What is DNA? What does it look like, and how is it sequenced? DNA, or deoxyribonucleic acid, is a large molecule that carries genetic material and is found in all living organisms. Biological information is stored in DNA as a code made up of four chemical bases: adenine (A), guanine (G), cytosine (C), and thymine (T). The sequence of these four bases encodes the information for building and maintaining an organism. Human DNA contains over three billion bases, and a copy is found in almost every cell in the human body.

DNA bases pair up with each other in a specific way, A with T and C with G, to form units called base pairs. Each base is also attached to a sugar molecule and a phosphate molecule. Together, a base, sugar, and phosphate are called a nucleotide. Nucleotides are arranged in two long strands that wind around each other in a shape called a double helix. The double helix looks like a ladder twisted into a spiral. The base pairs form the ladder's rungs, and the sugar and phosphate molecules alternate to form the vertical sides of the ladder, often referred to as the "sugar-phosphate backbone". The two strands of DNA run in opposite directions to each other and are thus anti-parallel. Because each base pairs only with its partner, the order of bases on one DNA strand, or side of the ladder, determines the bases on the other side. Hence DNA sequences are usually written as if DNA were single-stranded, and scientists only need to sequence one DNA strand in order to know the sequence of both strands.

Genome sequencing is the process of determining the precise order of nucleotides, or bases, in a genome. This means the order of the A's, C's, G's, and T's that make up an organism's DNA. It is the order of these nucleotides along a single strand that forms the genetic code. Today, genome or DNA sequencing is done by sequencing machines. These machines are high-tech, but they cannot sequence the whole genome in one go. Instead, they sequence the DNA in short pieces around 150 letters long.
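To make the pairing rule and the short pieces concrete, here is a minimal Python sketch. The function names `reverse_complement` and `split_into_reads` and the example sequences are made up for illustration. Cutting a sequence into neat consecutive pieces is also a simplification: it only shows what "short pieces" means and is not how a sequencing machine actually fragments DNA.

```python
from typing import List

# Pairing rules from the text: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand: str) -> str:
    """Return the opposite strand, written in its own reading direction."""
    # The strands are anti-parallel, so walk the input backwards
    # and swap each base for its pairing partner.
    return "".join(PAIRS[base] for base in reversed(strand.upper()))


def split_into_reads(sequence: str, read_length: int = 150) -> List[str]:
    """Cut a sequence into consecutive pieces of at most read_length bases."""
    return [
        sequence[start:start + read_length]
        for start in range(0, len(sequence), read_length)
    ]


forward = "ATGCGT"  # hypothetical example strand
print(reverse_complement(forward))                    # ACGCAT
print(split_into_reads("ATGCGTACGT", read_length=4))  # ['ATGC', 'GTAC', 'GT']
```

Applying `reverse_complement` twice gives back the original strand, which is another way of saying that one strand is enough to know both.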
There are different methods and machines that can sequence genomes. The first widely used DNA sequencing technology, called Sanger sequencing, in its automated form uses a technique known as capillary electrophoresis, which separates fragments of DNA by size and then reads the sequence by detecting the final fluorescent base on each fragment. This technology was a breakthrough, and it was used to sequence the DNA for the Human Genome Project. It is a highly accurate method, but it is time-consuming and expensive. Nowadays, we have the option of a faster and lower-cost technology known as next-generation sequencing, or high-throughput sequencing. With this technology, many strands of DNA can be sequenced in parallel, which significantly reduces the time from experiment to data. However, it also generates more data per instrument run, which requires more data storage and the ability to analyze and manipulate larger data sets.
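A rough back-of-the-envelope calculation gives a feel for the data volumes involved. The sketch below uses the figures mentioned earlier, about three billion bases and reads of about 150 letters. The `coverage` value (how many times each base is read on average) is an assumption added for illustration, and the size estimate counts only the sequence letters at one byte each, ignoring quality scores, read names, and compression.

```python
GENOME_BASES = 3_000_000_000  # "over three billion bases", rounded down
READ_LENGTH = 150             # approximate read length in letters


def reads_needed(coverage: int,
                 genome_bases: int = GENOME_BASES,
                 read_length: int = READ_LENGTH) -> int:
    """Reads required so each base is read `coverage` times on average."""
    total_bases = coverage * genome_bases
    # Integer ceiling division: a partial read still counts as one read.
    return (total_bases + read_length - 1) // read_length


def raw_sequence_gigabytes(coverage: int,
                           genome_bases: int = GENOME_BASES) -> float:
    """Size of the letters alone at one byte per base, in units of 10**9 bytes."""
    return coverage * genome_bases / 1e9


print(reads_needed(1))             # 20000000
print(reads_needed(30))            # 600000000 (30 is an illustrative coverage)
print(raw_sequence_gigabytes(30))  # 90.0
```

Even reading every base just once takes around twenty million short reads, which is why sequencing many strands in parallel matters and why storage and analysis capacity have to grow with it.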