dRep
File naming
The dRep format is a special case that requires two files:
Cdb.csv
Wdb.csv
The files must be named exactly as shown above.
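The naming rule can be checked before any parsing starts. The following is a minimal sketch, not part of the tool itself: the helper name `missing_drep_files` is hypothetical, and it compares file names exactly (case-sensitively) against the directory listing.

```python
from pathlib import Path

# The two file names required by the dRep format, spelled exactly.
REQUIRED_FILES = ("Cdb.csv", "Wdb.csv")


def missing_drep_files(directory):
    """Return the required file names that are absent from `directory`.

    Hypothetical helper: names are compared against the directory listing,
    so `cdb.csv` does not count as `Cdb.csv`, even on a case-insensitive
    filesystem.
    """
    present = {entry.name for entry in Path(directory).iterdir() if entry.is_file()}
    return [name for name in REQUIRED_FILES if name not in present]
```

An empty list means the directory holds both files and can be used as the dRep directory path.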
File format
Tip: For more information on the dRep output files, visit the dRep documentation.
Cdb.csv
This file gives the cluster assignment of every MAG.
The file must follow the Comma Separated Values (CSV) format, as written by dRep. It must have a header and columns representing the following data, in this order:
Column name | Column obligatoriness | Data type | Data nullability |
---|---|---|---|
genome | Mandatory | String | Not nullable |
secondary_cluster | Mandatory | String | Nullable |
threshold | Optional (ignored) | N/A | N/A |
cluster_method | Optional (ignored) | N/A | N/A |
comparison_algorithm | Optional (ignored) | N/A | N/A |
primary_cluster | Optional (ignored) | N/A | N/A |
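The rules in the table above can be sketched as a small reader. This is an illustration under stated assumptions, not the tool's own parser: the function name `read_cdb` is hypothetical, the delimiter is passed explicitly by the caller, only the two mandatory header names are checked, and an empty `secondary_cluster` cell is treated as null.

```python
import csv

# Mandatory columns, in the order the table lists them.
CDB_MANDATORY = ["genome", "secondary_cluster"]


def read_cdb(handle, delimiter):
    """Yield (genome, secondary_cluster) pairs from an open Cdb file.

    Hypothetical helper. Columns after the first two are ignored, and an
    empty secondary_cluster cell is returned as None (assumed meaning of
    "nullable").
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, None)
    if header is None or header[:2] != CDB_MANDATORY:
        raise ValueError(f"unexpected Cdb header: {header!r}")
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        genome = row[0].strip()
        if not genome:
            raise ValueError(f"line {line_number}: genome must not be empty")
        cluster = row[1].strip() if len(row) > 1 else ""
        yield genome, (cluster or None)
```

Because `read_cdb` is a generator, the header is only validated once iteration begins, for example when the result is wrapped in `list(...)`.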
Wdb.csv
This file lists the "winners" (i.e. the best representatives) of each cluster.
The file must follow the Comma Separated Values (CSV) format, as written by dRep. It must have a header and columns representing the following data, in this order:
Column name | Column obligatoriness | Data type | Data nullability |
---|---|---|---|
genome | Mandatory | String | Not nullable |
score | Optional (ignored) | N/A | N/A |
cluster | Optional (ignored) | N/A | N/A |
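Since only the `genome` column is used, reading this file reduces to collecting a set of names. A minimal sketch, assuming a hypothetical helper `read_winners` and a delimiter supplied by the caller:

```python
import csv


def read_winners(handle, delimiter):
    """Return the set of genome names listed in an open Wdb file.

    Hypothetical helper. Only the first column (genome) is read; the
    remaining columns are ignored, as the table above describes.
    """
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader, None)
    if not header or header[0] != "genome":
        raise ValueError(f"unexpected Wdb header: {header!r}")
    winners = set()
    for line_number, row in enumerate(reader, start=2):
        if not row:
            continue  # skip blank lines
        genome = row[0].strip()
        if not genome:
            raise ValueError(f"line {line_number}: genome must not be empty")
        winners.add(genome)
    return winners
```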
Mapping to database
DrepDirectory
Original data | DrepDirectory field | Notes |
---|---|---|
dRep directory path | path | This is the path to the directory that contains both Cdb.csv and Wdb.csv |
DrepEntry
Original data | DrepEntry field | Notes |
---|---|---|
genome column of Wdb.csv | winner | MAGs whose names are in Wdb.csv are the winners of their clusters |
genome column of Cdb.csv | genome_name | |
secondary_cluster column of Cdb.csv | genome_cluster_name | |
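The DrepEntry mapping above can be illustrated with a plain data class. This is only a sketch: the `DrepEntry` class below is a stand-in for the real database model, `build_entries` is a hypothetical helper, and storing `winner` as a boolean is an assumption drawn from the note in the table.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DrepEntry:
    """Stand-in for the database model; field names follow the table above."""
    genome_name: str
    genome_cluster_name: Optional[str]
    winner: bool  # assumed to be a flag: True if the genome appears in Wdb.csv


def build_entries(cdb_rows, winner_names):
    """Combine (genome, secondary_cluster) pairs with the set of winner names."""
    winners = set(winner_names)
    return [
        DrepEntry(genome_name=genome, genome_cluster_name=cluster, winner=genome in winners)
        for genome, cluster in cdb_rows
    ]
```

Each row of Cdb.csv yields one entry, and Wdb.csv only decides the value of `winner`, so a genome absent from Wdb.csv still produces an entry.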