Examples of org.broadinstitute.gatk.utils.activeregion.ActiveRegion

Package org.broadinstitute.gatk.utils.activeregion

Examples of org.broadinstitute.gatk.utils.activeregion.ActiveRegion

org.broadinstitute.gatk.utils.activeregion.ActiveRegion
Represents a single active region created by the Active Region Traversal for processing An active region is a single contiguous span of bases on the genome that should be operated on as a single unit for the active region traversal. The action may contains a list of reads that overlap the region (may because there may be no reads in the region). The region is tagged as being either active or inactive, depending on the probabilities provided by the isActiveProb results from the ART walker. Each region carries with it the exact span of the region (bases which are the core of the isActiveProbs from the walker) as well as an extended size, that includes the ART walker's extension size. Reads in the region provided by ART include all reads overlapping the extended span, not the raw span. User: rpoplin Date: 1/4/12

            return referenceModelForNoVariation(originalActiveRegion,false);


        final AssemblyResultSet assemblyResult =
                trimmingResult.needsTrimming() ? untrimmedAssemblyResult.trimTo(trimmingResult.getCallableRegion()) : untrimmedAssemblyResult;


        final ActiveRegion regionForGenotyping = assemblyResult.getRegionForGenotyping();


        // filter out reads from genotyping which fail mapping quality based criteria
        //TODO - why don't do this before any assembly is done? Why not just once at the beginning of this method
        //TODO - on the originalActiveRegion?
        //TODO - if you move this up you might have to consider to change referenceModelForNoVariation
        //TODO - that does also filter reads.
        final Collection<GATKSAMRecord> filteredReads = filterNonPassingReads( regionForGenotyping );
        final Map<String, List<GATKSAMRecord>> perSampleFilteredReadList = splitReadsBySample( filteredReads );


        // abort early if something is out of the acceptable range
        // TODO is this ever true at this point??? perhaps GGA. Need to check.
        if( ! assemblyResult.isVariationPresent() && ! disableOptimizations)
            return referenceModelForNoVariation(originalActiveRegion, false);


        // For sure this is not true if gVCF is on.
        if (dontGenotype) return NO_CALLS; // user requested we not proceed




        // TODO is this ever true at this point??? perhaps GGA. Need to check.
        if( regionForGenotyping.size() == 0 && ! disableOptimizations) {
            // no reads remain after filtering so nothing else to do!
            return referenceModelForNoVariation(originalActiveRegion, false);
        }


        // evaluate each sample's reads against all haplotypes
        //logger.info("Computing read likelihoods with " + assemblyResult.regionForGenotyping.size() + " reads");
        final List<Haplotype> haplotypes = assemblyResult.getHaplotypeList();
        final Map<String,List<GATKSAMRecord>> reads = splitReadsBySample( regionForGenotyping.getReads() );


        // Calculate the likelihoods: CPU intensive part.
        final ReadLikelihoods<Haplotype> readLikelihoods =
                likelihoodCalculationEngine.computeReadLikelihoods(assemblyResult,samplesList,reads);


        // Realign reads to their best haplotype.
        final Map<GATKSAMRecord,GATKSAMRecord> readRealignments = realignReadsToTheirBestHaplotype(readLikelihoods, assemblyResult.getPaddedReferenceLoc());
        readLikelihoods.changeReads(readRealignments);


        // Note: we used to subset down at this point to only the "best" haplotypes in all samples for genotyping, but there
        //  was a bad interaction between that selection and the marginalization that happens over each event when computing
        //  GLs.  In particular, for samples that are heterozygous non-reference (B/C) the marginalization for B treats the
        //  haplotype containing C as reference (and vice versa).  Now this is fine if all possible haplotypes are included
        //  in the genotyping, but we lose information if we select down to a few haplotypes.  [EB]


        final HaplotypeCallerGenotypingEngine.CalledHaplotypes calledHaplotypes = genotypingEngine.assignGenotypeLikelihoods(
                haplotypes,
                readLikelihoods,
                perSampleFilteredReadList,
                assemblyResult.getFullReferenceWithPadding(),
                assemblyResult.getPaddedReferenceLoc(),
                regionForGenotyping.getLocation(),
                getToolkit().getGenomeLocParser(),
                metaDataTracker,
                (consensusMode ? Collections.<VariantContext>emptyList() : givenAlleles),
                emitReferenceConfidence());

View Full Code Here

            this.ref = ref;
            this.extension = extension;


            refLoc = parser.createGenomeLoc("chr1", getStart(), getEnd());
            paddedRefLoc = parser.createGenomeLoc("chr1", getStart() - extension, getEnd() + extension);
            region = new ActiveRegion(getRefLoc(), parser, extension);
            final String pad = Utils.dupString("N", extension);
            refHap = ReferenceConfidenceModel.createReferenceHaplotype(getActiveRegion(), (pad + ref + pad).getBytes(), getPaddedRefLoc());
        }

View Full Code Here

            subject.add(entry.getKey(),entry.getValue());
        subject.setRegionForGenotyping(original);
        final GenomeLoc originalLocation = original.getExtendedLoc();
        final int length = originalLocation.size();
        final GenomeLoc newLocation = originalLocation.setStop(originalLocation.setStart(originalLocation,originalLocation.getStart() + length / 2),originalLocation.getStop() - length / 2);
        final ActiveRegion newRegion = original.trim(newLocation);


        final Map<Haplotype,Haplotype> originalHaplotypesByTrimmed = new HashMap<>(haplotypesAndResultSets.size());
        for (final Haplotype h : haplotypesAndResultSets.keySet())
            originalHaplotypesByTrimmed.put(h.trim(newRegion.getExtendedLoc()), h);


        final AssemblyResultSet trimmed = subject.trimTo(newRegion);


        Assert.assertFalse(subject.wasTrimmed());
        Assert.assertTrue(trimmed.wasTrimmed());

View Full Code Here

        }
    }


    @DataProvider(name="trimmingData")
    public Iterator<Object[]> trimmingData() {
        final ActiveRegion activeRegion = new ActiveRegion(genomeLocParser.createGenomeLoc("chr1",1000,1100),genomeLocParser,25);
        final int length = activeRegion.getExtendedLoc().size();
        final RandomDNA rnd = new RandomDNA(13); // keep it prepoducible by fixing the seed to lucky 13.
        final ActiveRegionTestDataSet actd = new ActiveRegionTestDataSet(10,new String(rnd.nextBases(length)),new String[] {
                "Civar:*1T*" }, new String[0], new byte[0], new byte[0], new byte[0]);


        final List<Haplotype> haplotypes = actd.haplotypeList();
        for (final Haplotype h : haplotypes)
            h.setGenomeLocation(activeRegion.getExtendedLoc());


        final ReadThreadingGraph rtg = new ReadThreadingGraph(10);
        for (final Haplotype h : haplotypes)
            rtg.addSequence("seq-" + Math.abs(h.hashCode()), h.getBases(), h.isReference());
        final SeqGraph seqGraph = rtg.convertToSequenceGraph();

View Full Code Here

        final Haplotype refHaplotype = new Haplotype(refBases, true);
        final Cigar c = new Cigar();
        c.add(new CigarElement(refHaplotype.getBases().length, CigarOperator.M));
        refHaplotype.setCigar(c);


        final ActiveRegion activeRegion = new ActiveRegion(loc, null, true, genomeLocParser, 0);
        activeRegion.addAll(reads);
//        logger.warn("Assembling " + activeRegion + " with " + engine);
        final AssemblyResultSet assemblyResultSet =  assembler.runLocalAssembly(activeRegion, refHaplotype, refBases, loc, Collections.<VariantContext>emptyList(), null);
        return assemblyResultSet.getHaplotypeList();
    }

View Full Code Here

    protected void loadPresetRegionsForContigToWorkQueue(final String contig) {
        if ( ! walkerHasPresetRegions ) throw new IllegalStateException("only appropriate to call when walker has preset regions");


        final GenomeLoc contigSpan = engine.getGenomeLocParser().createOverEntireContig(contig);
        for ( final GenomeLoc loc : this.walker.getPresetActiveRegions().getOverlapping(contigSpan) ) {
            workQueue.add(new ActiveRegion(loc, null, true, engine.getGenomeLocParser(), getActiveRegionExtension()));
        }
    }

View Full Code Here

        }


        // Since we've traversed sufficiently past this point (or this contig!) in the workQueue we can unload those regions and process them
        final LinkedList<MapData> readyRegions = new LinkedList<>();
        while( workQueue.peek() != null ) {
            final ActiveRegion activeRegion = workQueue.peek();
            if ( forceAllRegionsToBeActive || regionCompletelyWithinDeadZone(activeRegion) ) {
                writeActivityProfile(activeRegion.getSupportingStates());
                writeActiveRegion(activeRegion);
                readyRegions.add(prepActiveRegionForProcessing(workQueue.remove(), walker, referenceOrderedDataView));
            } else {
                break;
            }

View Full Code Here

        // shard_boundary_1_post: Primary in 1:14908-16384, Non-Primary in 1:16385-16927
        // shard_boundary_equal: Primary in 1:14908-16384, Non-Primary in 1:16385-16927
        // simple20: Primary in 20:10000-10100


        Map<GenomeLoc, ActiveRegion> activeRegions = getActiveRegions(t, walker, intervals);
        ActiveRegion region;


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1, 999));
        verifyReadMapping(region, "simple", "overlap_equal", "overlap_unequal", "extended_and_np");


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1000, 1999));

View Full Code Here

        // shard_boundary_1_post: Primary in 1:14908-16384, Non-Primary in 1:16385-16927
        // shard_boundary_equal: Primary in 1:14908-16384, Non-Primary in 1:16385-16927
        // simple20: Primary in 20:10000-10100


        Map<GenomeLoc, ActiveRegion> activeRegions = getActiveRegions(t, walker, intervals);
        ActiveRegion region;


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1, 999));
        verifyReadMapping(region, "simple", "overlap_equal", "overlap_unequal", "extended_and_np");


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1000, 1999));

View Full Code Here

        // shard_boundary_1_post: Non-Primary in 1:14908-16384, Primary in 1:16385-16927
        // shard_boundary_equal: Non-Primary in 1:14908-16384, Primary in 1:16385-16927
        // simple20: Primary in 20:10000-10100


        Map<GenomeLoc, ActiveRegion> activeRegions = getActiveRegions(t, walker, intervals);
        ActiveRegion region;


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1, 999));
        verifyReadMapping(region, "simple", "overlap_equal", "overlap_unequal", "extended_and_np");


        region = activeRegions.get(genomeLocParser.createGenomeLoc("1", 1000, 1999));

View Full Code Here

0 1

TOP

Related Classes of org.broadinstitute.gatk.utils.activeregion.ActiveRegion

org.broadinstitute.gatk.engine.traversals.TraverseActiveRegions

org.broadinstitute.gatk.engine.traversals.TraverseActiveRegionsUnitTest

org.broadinstitute.gatk.tools.walkers.haplotypecaller.AssemblyResultSetUnitTest

org.broadinstitute.gatk.tools.walkers.haplotypecaller.HaplotypeCaller

org.broadinstitute.gatk.tools.walkers.haplotypecaller.LocalAssemblyEngineUnitTest

org.broadinstitute.gatk.tools.walkers.haplotypecaller.ReferenceConfidenceModelUnitTest$RefConfData

org.broadinstitute.gatk.utils.GenomeLoc

org.broadinstitute.gatk.utils.sam.GATKSAMRecord

All source code are property of their respective owners. Java is a trademark of Sun Microsystems, Inc and owned by ORACLE Inc. Contact coftware#gmail.com.