Description of output

The most important outputs are merged_muts_drm_annotated.csv, a comma-separated values file that can be opened in any spreadsheet program, and report.md/report.pdf, a mostly comprehensive report on the sample, written in Markdown, that includes the columns from merged_muts_drm_annotated.csv. Other files are discussed below. The columns are, in order,

The report also includes a rough estimate of the subtype for the analyzed sample. This is done by aligning a subset of reads to sequences representative of different subtypes and choosing the best match. The distribution of best matches gives an idea of the most likely subtype for the sample. For HIV, a better approach is to take the sample consensus and run HIV BLAST (see below).
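The idea of tallying best matches can be illustrated with a minimal sketch. It is not the tool's actual implementation: it assumes the alignment step has already produced one best-matching subtype label per read, and the function name `subtype_distribution` and the example labels are hypothetical.

```python
from collections import Counter


def subtype_distribution(best_matches):
    """Return (subtype, fraction) pairs, most frequent subtype first.

    best_matches: iterable of subtype labels, one per aligned read
    (an assumed input format, chosen for illustration).
    """
    counts = Counter(best_matches)
    total = sum(counts.values())
    if total == 0:
        return []
    return [(subtype, n / total) for subtype, n in counts.most_common()]


# Made-up labels for illustration only, not real results.
example = ["B", "B", "C", "B", "A1", "B", "C", "B"]
for subtype, fraction in subtype_distribution(example):
    print(f"{subtype}\t{fraction:.1%}")
```

The first entry of the returned list is the most likely subtype; a flat distribution with no clear winner suggests the estimate should not be trusted.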

Other output

Sample consensus

A consensus sequence for the sample is found in cns_final.fasta. This can be used, for example, to run HIV BLAST from the Los Alamos HIV Database.
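Before submitting the consensus elsewhere, it can help to check its length and how many ambiguous positions it contains. The sketch below uses only the Python standard library; the helper names `read_fasta` and `summarize` are hypothetical, and it assumes cns_final.fasta is an ordinary FASTA file in the current directory.

```python
def read_fasta(path):
    """Parse a FASTA file into a dict mapping header -> sequence."""
    records = {}
    header = None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Start a new record; drop the leading ">".
                header = line[1:]
                records[header] = []
            elif header is not None:
                records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def summarize(path="cns_final.fasta"):
    """Print the length and number of N bases for each sequence."""
    for name, seq in read_fasta(path).items():
        n_count = seq.upper().count("N")
        print(f"{name}: {len(seq)} bases, {n_count} N")
```

Calling `summarize()` in the output directory prints one line per sequence in the file.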

Main alignment file

The alignment of (at most) 200,000 reads to the sample consensus is in the file hq_2_cons_sorted.bam. This alignment can be explored, for example, with the command

samtools tview hq_2_cons_sorted.bam cns_final.fasta