We are a multi-omics and nutraceutical company focused on improving the health and fitness of people with Type 2 diabetes through microbiome interventions.
I am reaching out because I came across an interesting gut health biomarker paper (https://www.nature.com/articles/s41467-020-18476-8) that uses a subset of the microbiota to determine whether someone is metabolically healthy or unhealthy. We are interested in seeing whether our interventions could affect this index, and we would like to test a similar approach in our metagenomics pipeline.
To that end, I have been trying to run our fasta.tar.gz files through the MetaPhlAn2 pipeline to get a dataframe usable by https://github.com/jaeyunsung/GMHI_2020. I have tried several methods of converting the tarballs to the bz2 files that MetaPhlAn2 appears to accept, but when I run the program I get the error message: “Error! numpy python library not detected!!” I then reinstalled Python 3, NumPy, and the Bioconda packages under an updated conda, but the setup still seems to be incompatible with MetaPhlAn2.
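For context, here is a minimal sketch of one check that might help narrow this down. It assumes (and this is only a guess) that the error arises because the interpreter that runs MetaPhlAn2 is not the one where NumPy was installed. The helper name `check_module` is hypothetical and is not part of MetaPhlAn2.

```python
import importlib.util
import sys


def check_module(name):
    """Report which interpreter is running and whether `name` is importable from it."""
    spec = importlib.util.find_spec(name)  # None when the module cannot be located
    return {
        "interpreter": sys.executable,
        "python_version": "%d.%d" % sys.version_info[:2],
        "module": name,
        "found": spec is not None,
    }


if __name__ == "__main__":
    # Run this with the same interpreter (and conda environment) used to launch MetaPhlAn2.
    print(check_module("numpy"))
```

If `found` is false, or the reported interpreter path is outside the conda environment I set up, that would suggest an environment mismatch rather than a problem with the input files.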
I was hoping that you or one of your technicians might be able to walk me through these steps and help me troubleshoot. It would be greatly appreciated.
Unfortunately, we do not have much flexibility with the analyzed files that come out of our sequencing efforts: we receive them from CoreBiome/Diversigen, which uses shallow shotgun metagenomics analyzed with BURST. If you are familiar with their output file formats and know of a way to convert their processed relative abundance data into the format that MetaPhlAn2 produces, that would be another route we could take.
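To illustrate the kind of conversion I have in mind, here is a rough sketch under stated assumptions. It assumes each sample is available as a mapping from a semicolon-delimited taxonomy string (with rank prefixes such as `k__` and `s__`) to a relative abundance; that input layout and the function name `to_metaphlan_style` are my assumptions, not a documented CoreBiome/Diversigen or MetaPhlAn2 interface. It keeps only species-level rows, joins the ranks with `|`, and rescales the values to percentages that sum to 100.

```python
def to_metaphlan_style(abundances):
    """Convert one sample's {taxonomy string: abundance} mapping to
    pipe-delimited, species-level clades scaled to sum to 100.

    The input layout (semicolon-delimited ranks with 's__' marking species)
    is an assumption made for illustration.
    """
    species = {}
    for taxonomy, value in abundances.items():
        ranks = [r.strip() for r in taxonomy.split(";") if r.strip()]
        # Skip rows that are not resolved down to species level.
        if not ranks or not ranks[-1].startswith("s__"):
            continue
        clade = "|".join(ranks)
        species[clade] = species.get(clade, 0.0) + float(value)

    total = sum(species.values())
    if total == 0:
        return {}
    # Rescale so that the retained species sum to 100 percent.
    return {clade: 100.0 * v / total for clade, v in species.items()}


if __name__ == "__main__":
    example = {
        "k__Bacteria;p__Firmicutes;s__Species_A": 3,
        "k__Bacteria;p__Firmicutes": 2,  # not species level, dropped
        "k__Bacteria;p__Bacteroidetes;s__Species_B": 1,
    }
    for clade, pct in to_metaphlan_style(example).items():
        print(clade, pct)
```

I realize that the two tools rely on different reference databases, so species names may not line up even after reformatting, and I do not know whether an index computed this way would be comparable to one computed from MetaPhlAn2 output.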
I am quite new to bioinformatics/NGS and am learning on the job, so any help is greatly appreciated.
Thank you in advance for your time,