> Also, can we train this same model on regular language data so we can converse about the genomes? I suppose a normal multimodal model can talk about what it sees in images in English. Could we have a similar thing with genomes? I.e., DNA is just another modality in a multimodal model.
Yes! That is what has been done in ChatNT [1], where you can ask natural language questions like "Determine the degradation rate of the human RNA sequence @myseq.fna on a scale from -5 to 5." and ChatNT will answer with "The degradation rate for this sequence is 1.83."
> My biggest point of confusion is what type of practical things these models can do.
See for example this notebook [2], where the Nucleotide Transformer is fine-tuned to classify genomic sequences for two of the most basic kinds of genomic motifs: promoters and enhancer types.
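For a rough sense of what the model sees as input in a task like this, here is a toy sketch. It assumes non-overlapping 6-mer tokenization, which is my understanding of how the Nucleotide Transformer splits DNA. `kmer_tokenize` is a hypothetical helper written for illustration, not the library's tokenizer, and it ignores special tokens and the handling of ambiguous bases such as N.

```python
from typing import List


def kmer_tokenize(seq: str, k: int = 6) -> List[str]:
    """Split a DNA sequence into non-overlapping k-mers.

    Illustrative only (assumption): bases left over at the end that do
    not fill a whole k-mer are emitted as single-base tokens.
    """
    seq = seq.upper()
    # Length of the prefix that divides evenly into k-mers.
    n_full = len(seq) - len(seq) % k
    tokens = [seq[i:i + k] for i in range(0, n_full, k)]
    # Remaining bases become one token each.
    tokens.extend(seq[n_full:])
    return tokens


if __name__ == "__main__":
    print(kmer_tokenize("ATGCGTACGTTAGC"))
```

A classifier like the one in the notebook would then map each token to an ID and predict a label (e.g. promoter or not) from the resulting sequence; that part is left to the real tokenizer and model.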
Disclaimer: I work at InstaDeep but was not involved in either of the above projects.
[1] https://www.biorxiv.org/content/10.1101/2024.04.30.591835v2

[2] https://github.com/huggingface/notebooks/blob/main/examples/...
https://chemrxiv.org/engage/chemrxiv/public-api/documentatio...