SoylentNews is people

posted by Fnord666 on Sunday March 26 2017, @05:58PM
from the encyclopedia-homo-sapiens dept.

Scientists have used a new technique called 3D genome assembly to sequence the genome of a mosquito that can carry the Zika virus:

A team spanning Baylor College of Medicine, Rice University, Texas Children's Hospital and the Broad Institute of MIT and Harvard has developed a new way to sequence genomes, which can assemble the genome of an organism, entirely from scratch, dramatically cheaper and faster. While there is much excitement about the so-called "$1000 genome" in medicine, when a doctor orders the DNA sequence of a patient, the test merely compares fragments of DNA from the patient to a reference genome. The task of generating a reference genome from scratch is an entirely different matter; for instance, the original human genome project took 10 years and cost $4 billion. The ability to quickly and easily generate a reference genome from scratch would open the door to creating reference genomes for everything from patients to tumors to all species on earth. Today in Science, the multi-institutional team reports a method -- called 3D genome assembly -- that can create a human reference genome, entirely from scratch, for less than $10,000.

To illustrate the power of 3D genome assembly, the researchers have assembled the 1.2 billion letter genome of the Aedes aegypti mosquito, which carries the Zika virus, producing the first end-to-end assembly of each of its three chromosomes. The new genome will enable scientists to better combat the Zika outbreak by identifying vulnerabilities in the mosquito that the virus uses to spread.

[...] "Our method is quite different from traditional genome assembly," said Olga Dudchenko, a postdoctoral fellow at the Center for Genome Architecture at Baylor College of Medicine, who led the research. "Several years ago, our team developed an experimental approach that allows us to determine how the 2-meter-long human genome folds up to fit inside the nucleus of a human cell. In this new study, we show that, just as these folding maps trace the contour of the genome as it folds inside the nucleus, they can also guide us through the sequence itself."

By carefully tracing the genome as it folds, the team found that they could stitch together hundreds of millions of short DNA reads into the sequences of entire chromosomes. Since the method only uses short reads, it dramatically reduces the cost of de novo genome assembly, which is likely to accelerate the use of de novo genomes in the clinic. "Sequencing a patient's genome from scratch using 3D assembly is so inexpensive that it's comparable in cost to an MRI," said Dudchenko, who also is a fellow at Rice University's Center for Theoretical Biological Physics. "Generating a de novo genome for a sick patient has become realistic."

De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds (DOI: 10.1126/science.aal3327) (DX)



Related Stories

Human Genome Sequenced With MinION Nanopore Sequencer 3 comments

Pocket-Size Nanopore Device Sequences Entire Human Genome

Researchers have assembled the entire human genome using a nanopore sequencer, according to a study published today (January 29) in Nature Biotechnology [open, DOI: 10.1038/nbt.4060] [DX]. Using a pocket-size device, dubbed MinION, the team was able to fill 12 gaps in the sequenced human genome by achieving reads of DNA sequences nearly one million bases in length—the longest to date.

Also at BBC.

Nanopore sequencing and assembly of a human genome with ultra-long reads (linked above)

We report the sequencing and assembly of a reference genome for the human GM12878 Utah/Ceph cell line using the MinION (Oxford Nanopore Technologies) nanopore sequencer. 91.2 Gb of sequence data, representing ∼30× theoretical coverage, were produced. Reference-based alignment enabled detection of large structural variants and epigenetic modifications. De novo assembly of nanopore reads alone yielded a contiguous assembly (NG50 ∼3 Mb). We developed a protocol to generate ultra-long reads (N50 > 100 kb, read lengths up to 882 kb). Incorporating an additional 5× coverage of these ultra-long reads more than doubled the assembly contiguity (NG50 ∼6.4 Mb). The final assembled genome was 2,867 million bases in size, covering 85.8% of the reference. Assembly accuracy, after incorporating complementary short-read sequencing data, exceeded 99.8%. Ultra-long reads enabled assembly and phasing of the 4-Mb major histocompatibility complex (MHC) locus in its entirety, measurement of telomere repeat length, and closure of gaps in the reference human genome assembly GRCh38.
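The N50 and NG50 figures in the abstract are contiguity statistics. As a minimal sketch (the function name and the toy contig lengths are hypothetical, not from the paper): N50 is the contig length at which the longest contigs, added up in descending order, first reach half of the total assembly size; NG50 is the same calculation measured against an assumed genome size instead.

```python
def n50(lengths, genome_size=None):
    """Return N50 of the contig lengths, or NG50 if genome_size is given.

    Walk the contigs from longest to shortest and return the length of the
    contig at which the running total first reaches half of the target size.
    """
    total = genome_size if genome_size is not None else sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    return None  # the contigs never reach half of the assumed genome size


# Toy contig lengths, purely illustrative.
contigs = [80, 70, 50, 40, 30, 20]
print(n50(contigs))                    # N50 against the assembly size (290)
print(n50(contigs, genome_size=400))   # NG50 against an assumed genome size
```

Because NG50 is tied to the genome size rather than to whatever was assembled, it stays comparable between assemblies of different completeness.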

Previously: The MinION - Genome Sequencing in a Handheld Device
A MARC in the Silicon: Sequencing E. coli with the MinION
Update: Sequencing That Stimulates the Sensors, and MinION Q&A Responses

Related: 3D Genome Assembly Could Create a Human Reference Genome for Under $10,000



This discussion has been archived. No new comments can be posted.
  • (Score: 2) by SubiculumHammer on Sunday March 26 2017, @06:23PM (1 child)

    by SubiculumHammer (5191) on Sunday March 26 2017, @06:23PM (#484402)

    MRI costs researchers about $600/hour. I imagine most medical MRIs take less than an hour. $300-600 is not like $10,000 for regular folk.

  • (Score: 2) by gringer on Sunday March 26 2017, @06:29PM (2 children)

    by gringer (962) on Sunday March 26 2017, @06:29PM (#484403)

    Give me $10,000 USD, and I reckon I'd be able to make a "human reference genome" [whatever that means] on the MinION. That's full commercial value and includes the purchase cost of the sequencing device (which, admittedly, has a net cost of nothing after the original complimentary flow cells are accounted for).

    --
    Ask me about Sequencing DNA in front of Linus Torvalds [youtube.com]
    • (Score: 2) by takyon on Sunday March 26 2017, @09:03PM

      by takyon (881) <takyonNO@SPAMsoylentnews.org> on Sunday March 26 2017, @09:03PM (#484434) Journal

      I was looking to hear your take on it.

      --
      [SIG] 10/28/2017: Soylent Upgrade v14 [soylentnews.org]
    • (Score: 0) by Anonymous Coward on Monday March 27 2017, @07:16PM

      by Anonymous Coward on Monday March 27 2017, @07:16PM (#484800)

      Really? I thought the MinION would have too much trouble in the highly-repetitive regions due to its relatively short read length.

      I guess it is debatable if those regions are important enough for a "human reference genome", but that is the only drawback of MinION that comes to my mind.

  • (Score: 0) by Anonymous Coward on Sunday March 26 2017, @09:10PM

    by Anonymous Coward on Sunday March 26 2017, @09:10PM (#484435)

    ...with using DNA to store data. [arstechnica.com] A human genetic sequence is about 4 GB of information. Imagine, someday people could carry their entire genetic sequence around with them in a tiny, even microscopic object! Or carry 4 GB of arbitrary data in such an object, and it could be read for less than $10,000!

  • (Score: 2, Interesting) by Anonymous Coward on Monday March 27 2017, @12:40AM (3 children)

    by Anonymous Coward on Monday March 27 2017, @12:40AM (#484486)

    I wonder how good gene sequencing really is.
    How do you test it?

    Perhaps sequence the same organism multiple ways and cross check.
    Do they actually do this?

    • (Score: 3, Informative) by gringer on Monday March 27 2017, @04:28AM (2 children)

      by gringer (962) on Monday March 27 2017, @04:28AM (#484522)

      The base-call accuracy of Illumina sequencing is very high, typically 1 in 1000 [mostly random] errors. Illumina doesn't have an accuracy problem, but it does have issues with read length. These length issues are mostly dealt with by Hi-C, but not entirely.

      A few structures remain that are impossible to assemble precisely using short-read technology regardless of the depth or sequence quality. One example is highly-repetitive sequence within the centromeric region of chromosomes. These regions have repeated sequences where the unit of repeat is longer than the read length of the sequencer, which means that assembly results in hundreds of back-to-back repeats of a single sequence collapsing into a single unit.

      Imagine doing a jigsaw puzzle of a picket fence or a corrugated iron roof, but the dimensions of the puzzle are not known, and there are lots of duplicated pieces. That's the problem that Hi-C has difficulty dealing with.
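
      The collapse described above can be sketched in a few lines. This is a toy model, not an assembler: the repeat unit, flanking sequences, and function names are all hypothetical, and it ignores coverage depth and Hi-C contact data. It only shows that when reads are shorter than the repeat unit, 5 copies and 50 copies of the unit produce exactly the same set of distinct reads.

```python
def read_set(genome, read_len):
    """Every distinct read of length read_len that the genome can produce."""
    return {genome[i:i + read_len] for i in range(len(genome) - read_len + 1)}


def build_region(copies, unit="ACGTTGCA", left="GGGG", right="CCCC"):
    """Hypothetical region: unique flanks around a tandem repeat."""
    return left + unit * copies + right


READ_LEN = 5  # shorter than the 8-base repeat unit

few = read_set(build_region(5), READ_LEN)
many = read_set(build_region(50), READ_LEN)

# Same distinct reads either way, so the copy number cannot be recovered
# from read content alone.
print(few == many)
```

      With reads long enough to span the whole repeat array plus both flanks, the two regions would yield different reads and could be told apart.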

      --
      Ask me about Sequencing DNA in front of Linus Torvalds [youtube.com]
      • (Score: 0) by Anonymous Coward on Monday March 27 2017, @12:50PM (1 child)

        by Anonymous Coward on Monday March 27 2017, @12:50PM (#484585)

        Thanks,

        Given the number of base pairs in a human, 10**-3, while amazing, does not sound like a reliable tool yet.

        In a diagnostic situation, it seems like an incorrect read might head a diagnosis down a false path.

        On the other hand, it does provide insight into an area of near cluelessness.

        • (Score: 0) by Anonymous Coward on Monday March 27 2017, @07:52PM

          by Anonymous Coward on Monday March 27 2017, @07:52PM (#484827)

          does not sound like a reliable tool yet

          That is for a single read. You can fix that with increased "coverage" (multiple reads of the same sequence).

          Simple example:

          The quick brown fax
          The quikk brown fox
          The quick rrown fox
          The quick brown foc
          Thq quick brown fox
          The quick brown fox

          https://en.wikipedia.org/wiki/Coverage_(genetics) [wikipedia.org]
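
          The example above can be turned into code as a per-position majority vote. This is only a sketch: it assumes the reads are already aligned and of equal length, which real reads are not, and the function name is made up for illustration.

```python
from collections import Counter


def consensus(reads):
    """Majority vote at each position over equal-length, pre-aligned reads."""
    if len({len(read) for read in reads}) != 1:
        raise ValueError("reads must all have the same length")
    # zip(*reads) yields one column of characters per position.
    return "".join(Counter(column).most_common(1)[0][0] for column in zip(*reads))


reads = [
    "The quick brown fax",
    "The quikk brown fox",
    "The quick rrown fox",
    "The quick brown foc",
    "Thq quick brown fox",
    "The quick brown fox",
]
print(consensus(reads))  # The quick brown fox
```

          Each error sits at a different position, so every column has at least five correct votes out of six and the mistakes are outvoted.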

  • (Score: 2) by sbgen on Monday March 27 2017, @04:01AM

    by sbgen (1302) on Monday March 27 2017, @04:01AM (#484519)

    I think the summary got the article's claim slightly wrong. I do not have access to the article taxpayers already paid for, so I will have to make do with the abstract for now. The full article will have to wait until tomorrow. Here we go:

    The authors used existing draft assemblies in combination with their own short-read sequences to construct the whole genome. In addition, they generated Hi-C data, which provides a map of how chromosomes fold. Hi-C experiments are complex and I won't bore you with details today. By combining the Hi-C data and the genome assembly (with the help of *existing draft assemblies*), the authors have generated a complete 3-D map of the genome of those organisms --> a map showing the sequence + how it is topographically arranged (folding-map). They do not claim to have generated a genome assembly *de novo*. That would be possible with long-read sequencing technologies. Only PacBio and Oxford Nanopore (MinION) have instruments for that purpose. As gringer mentioned, MinION can generate a de novo assembly for far less expense than mentioned in TFS.

    I swear I posted this earlier but it appears to have vanished. If this is a duplicate my apologies.

    --
    Warning: Not a computer expert, but got to use it. Yes, my kind does exist.