@@ -160,26 +160,32 @@ Inside the `$path/` folder, the following sub-folders will be created:
*`$path/Annotation` contains all files which are the base input for external annotation with Diamond and MEGAN6, per sample (grouped mapped/unmapped).
### External annotation with Diamond and MEGAN6
To finalize our analyzing process, the annotation steps must be performed as external analyzing. The reason is that DIAMOND needs a lot of power to generate an NCBI-database based mapping file per sample. Therefore we use the EVE-Cluster.
To finalize our analysis process, the annotation is performed externally. This allows switching to a more powerful machine, as Diamond needs a lot of computational power to generate an NCBI-database-based mapping file per sample.
The first step for annotation is to generate an nr.dmnd file based on the nr.gz (downloaded from NCBI) with Diamond. This step must be performed only at first time, after you should use for every run the same nr file:
The first step for annotation is to generate an `nr.dmnd` file with Diamond, based on `nr.gz` (downloaded from NCBI). This step only needs to be performed once:
After generation the nr.dmnd file the blastx run of diamond can started. Therefor use the diamond_daa_maske.sub inside the subScript folder of MCB-MG-Pipeline. Inside the maske.sub transform the /data/…/temp/ path like your desired path to a temp folder (generate a temp folder under your desired path). You can use this path for every run and therefor it must be transformed only on first time. After optimize the sub-file, use the following command on EVE-Cluster. NOTE: IDx must be renamed with your individual sample-IDs (the input is found under $path/Annotation/IDx/mapped_annotation/input and $path/Annotation/IDx/unmapped_annotation/input):
After generation of the `nr.dmnd` file, the blastx run of Diamond can be started:
Now the *.daa file must be transformed in a *.rma file, to use MEGAN6. There are also a diamond_daa2rma_maske.sub under the subScript folder of MCB-MG-Pipeline. Also the path must be transformed like the path of the first sub file.
NOTE: IDx must be replaced with your individual sample IDs (the input is found under `$path/Annotation/IDx/mapped_annotation/input` and `$path/Annotation/IDx/unmapped_annotation/input`).
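As a hedged sketch of the two Diamond steps described above, the script below prints the corresponding commands instead of running them (a dry run), so they can be checked before being submitted to a more powerful machine. It uses Diamond's `makedb` and `blastx` subcommands with `--outfmt 100` (DAA output). The sample IDs `ID1`/`ID2`, the query file name `reads.fasta`, the default value of `$path`, and the location and name of the `.daa` output are placeholders chosen for illustration, not names defined by the pipeline.

```shell
# Dry-run sketch: prints the Diamond commands instead of executing them.
# Assumptions (not taken from the pipeline): sample IDs, query file name
# and the name/location of the .daa output are placeholders.

path="${path:-/path/to/project}"   # same $path as used by the pipeline

# One-time step: build nr.dmnd from the NCBI nr.gz download.
makedb_cmd() {
    echo "diamond makedb --in nr.gz --db nr"
}

# Per-sample step: blastx of one query file against nr, written as DAA.
# $1 = sample ID, $2 = mapped | unmapped, $3 = query file name (hypothetical)
blastx_cmd() {
    local dir="$path/Annotation/${1}/${2}_annotation"
    echo "diamond blastx --db nr --query $dir/input/${3} --out $dir/${1}_${2}.daa --outfmt 100"
}

makedb_cmd
for id in ID1 ID2; do                  # replace with your sample IDs
    for group in mapped unmapped; do
        blastx_cmd "$id" "$group" "reads.fasta"
    done
done
```

Once the printed commands look right, they can be pasted into a job script or piped to a shell; tuning options such as thread count or temp directory are left out here on purpose.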
Now the `*.daa` files must be transformed into `*.rma` files using MEGAN6:
`-lg` means long reads, because the unmapped reads are contigs; `-pof` can be added as an additional setting for paired-end reads in one file; `-fwa` indicates that the first word is the accession.
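A minimal sketch of how the flags described above might be combined per file is shown below. It assumes MEGAN's `daa2rma` tool with its `-i` (input) and `-o` (output) options, and it only prints the command (dry run). Passing `-lg` only for contig input is an assumption based on the explanation above, the `.daa` file names are placeholders, and the taxonomy mapping option that `daa2rma` normally also needs is deliberately omitted because its file is not named here.

```shell
# Dry-run sketch: prints a daa2rma command for one .daa file.
# $1 = .daa file (placeholder name), $2 = "long" to add -lg for contigs.
# The mapping-file option is intentionally left out (not given in the text).
rma_cmd() {
    local daa="$1"
    local cmd="daa2rma -i $daa -o ${daa%.daa}.rma -fwa"
    if [ "$2" = "long" ]; then
        cmd="$cmd -lg"        # long-read mode, e.g. for unmapped contigs
    fi
    echo "$cmd"
}

rma_cmd ID1_unmapped.daa long
rma_cmd ID1_mapped.daa
```

The output name is derived by swapping the `.daa` suffix for `.rma`, so each sample keeps its ID in the file that is later opened in MEGAN6.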
After all preparation steps the *.rma file can uploaded on MEGAN6 with Open . Look at the MEGAN-Manual to find out, which possibilities of analyzing are given.
Finally, `*.rma` files can be loaded in MEGAN6 ("Open"). See the MEGAN manual for the available analysis options.
### Analyzing the whole community
Depending on the ratio of mapped to unmapped reads, the number of mapped reads can appear low, because only the first alignment of each read is taken into account. To reconstruct the whole community in the correct proportions, calculate as follows: