The bioBakery help forum

Merge tables error: too many columns specified

Hello,

I’m using MetaPhlAn3 and I’m getting an error merging the tables (different from the other error on the forum):
Too many columns specified: expected 4 and found 3

I used conda list to determine my MetaPhlAn version:
metaphlan 3.0 pyh5ca1d4c_4 bioconda

This is the command I used to run MetaPhlAn3 (I batch submitted to my institution’s high compute cluster):

metaphlan $file1,$file2,$file3 --bowtie2out {samplename}.bowtie2.bt2 --nproc cores --input_type fastq -o profiled_{samplename}.txt

I got some output warnings, but a final profiled table was produced. Below are the warnings:

Use of uninitialized value $bt2_args[2] in join or string at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 423.
Use of uninitialized value $bt2_args[3] in join or string at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 423.
Use of uninitialized value $bt2_args[2] in string eq at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 360.
Use of uninitialized value $bt2_args[3] in string eq at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 360.
Use of uninitialized value in exists at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 81.
Use of uninitialized value in exists at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 81.
Use of uninitialized value $bt2_args[2] in join or string at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 459.
Use of uninitialized value $bt2_args[3] in join or string at /home/saatkinson/miniconda3/envs/biobakery/bin/bowtie2 line 459.
WARNING: The metagenome profile contains clades that represent multiple species merged into a single representant.
An additional column listing the merged species is added to the MetaPhlAn output.

I then tried to merge my tables:

merge_metaphlan_tables.py /rcc/stor1/projects/CMR/NAFLD/metatranscriptomics/metaphlan3_out/profiled*.txt > ab_table_merged_all.txt

And got the below error. The message is confusing: it says too many columns were specified, but then says it expected 4 and found 3 (which would be too few). I also manually inspected a few files, and the ones I looked at definitely had 4 columns (clade_name, NCBI_taxa_ID, relative_abundance, additional_species).

Traceback (most recent call last):
  File "/home/saatkinson/miniconda3/envs/biobakery/bin/merge_metaphlan_tables.py", line 10, in <module>
    sys.exit(main())
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/metaphlan/utils/merge_metaphlan_tables.py", line 78, in main
    merge(args.aistms, sys.stdout)
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/metaphlan/utils/merge_metaphlan_tables.py", line 48, in merge
    index_col=index_col
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/pandas/io/parsers.py", line 676, in parser_f
    return _read(filepath_or_buffer, kwds)
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/pandas/io/parsers.py", line 454, in _read
    data = parser.read(nrows)
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/pandas/io/parsers.py", line 1133, in read
    ret = self._engine.read(nrows)
  File "/home/saatkinson/miniconda3/envs/biobakery/lib/python3.7/site-packages/pandas/io/parsers.py", line 2037, in read
    data = self._reader.read(nrows)
  File "pandas/_libs/parsers.pyx", line 860, in pandas._libs.parsers.TextReader.read
  File "pandas/_libs/parsers.pyx", line 875, in pandas._libs.parsers.TextReader._read_low_memory
  File "pandas/_libs/parsers.pyx", line 952, in pandas._libs.parsers.TextReader._read_rows
  File "pandas/_libs/parsers.pyx", line 1013, in pandas._libs.parsers.TextReader._convert_column_data
pandas.errors.ParserError: Too many columns specified: expected 4 and found 3
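
The traceback does not name the file that tripped the parser. One way to narrow it down is to check every profile for data rows whose tab-separated field count differs from the column header. The sketch below is not part of MetaPhlAn: the function name find_ragged_rows is hypothetical, and it assumes the column header is the last line starting with "#" before the data rows.

```python
import sys


def count_fields(line):
    """Number of tab-separated fields, ignoring the line ending."""
    return len(line.rstrip("\r\n").split("\t"))


def find_ragged_rows(lines):
    """Return (line_number, found, expected) for each data row whose
    field count differs from the header.

    Assumption: the column header is the last '#' line seen before the
    first data row. If there is no '#' line, the first data row is used.
    """
    expected = None
    last_comment = None
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("#"):
            last_comment = line
            continue
        if expected is None:
            reference = last_comment if last_comment is not None else line
            expected = count_fields(reference)
        found = count_fields(line)
        if found != expected:
            problems.append((lineno, found, expected))
    return problems


if __name__ == "__main__":
    # Usage: python check_columns.py profiled_*.txt
    for path in sys.argv[1:]:
        with open(path, encoding="utf-8") as handle:
            for lineno, found, expected in find_ragged_rows(handle):
                print(f"{path}:{lineno}: expected {expected} fields, found {found}")
```

Run over all the profiles passed to the merge script, this prints only the files and line numbers that deviate, so a single short row among many files is easy to spot.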

Any help with this error would be appreciated!
Thanks,
Samantha

Hi Samantha,
Could you please send me all the files you’re trying to merge?

Hello,

Here are the files, there should be 158.

Thanks,

Samantha

(Attachment profiled_samples.zip is missing)

Hello,

Here are the files. I tried to send them in a zipped folder, but got an email back saying that zip is not an approved extension.

Therefore, here is a link to a OneDrive folder. There should be 158 samples.

I edited the online post to remove the link to the data. Hopefully that doesn’t break anything.

~Samantha

Hi Samantha, unfortunately I cannot access the link. I am probably not allowed to see it because of the sharing policies.

Sorry this has been such a hassle! I’m working around my institution’s permissions settings and sent you the data via my google drive to the email address associated with your profile here. Please let me know if you don’t get it or can’t access the data.

~Samantha


One of the profiles is 100% UNKNOWN and is missing its last column. This was caused by a bug that has already been fixed, so you can update your MetaPhlAn installation to the latest version on conda, which includes the recent bug fixes.
In the meantime, for the affected file (profiled_stool-obctrl-R030.txt) you can either add the missing tab by hand after “100.0” or re-run MetaPhlAn on the sample using the bowtie2out file as input.
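
The manual fix described above (adding the missing trailing tab) could be scripted roughly as follows. This is only a sketch: pad_short_rows is a hypothetical helper, not a MetaPhlAn utility, and the default of 4 columns is taken from the header described earlier in this thread. It writes to a new file rather than editing the profile in place.

```python
import sys


def pad_short_rows(lines, expected=4):
    """Append empty trailing fields to data rows with fewer than
    `expected` tab-separated fields. Comment lines ('#') and blank
    lines are returned unchanged.
    """
    fixed = []
    for line in lines:
        body = line.rstrip("\r\n")
        ending = line[len(body):]  # preserve the original line ending
        if body and not body.startswith("#"):
            missing = expected - len(body.split("\t"))
            if missing > 0:
                body += "\t" * missing
        fixed.append(body + ending)
    return fixed


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python pad_profile.py broken_profile.txt fixed_profile.txt
    src, dst = sys.argv[1], sys.argv[2]
    with open(src, encoding="utf-8") as handle:
        repaired = pad_short_rows(handle.readlines())
    with open(dst, "w", encoding="utf-8") as handle:
        handle.writelines(repaired)
```

Padding only repairs the table layout of an existing profile; re-running MetaPhlAn on the sample with the updated version regenerates the profile itself.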

I’ve attached the merged profiles below.
merged_profiles.txt (335.7 KB)

Hi,

I finally went back to update my MetaPhlAn3 install to fix the bug you mentioned, but even after updating I’m getting the same error.

I ran

conda install -c bioconda metaphlan

And conda list gives me this as the version of MetaPhlAn

metaphlan 3.0.1 pyh5ca1d4c_0 bioconda

Could you tell me which version I should be running?

Thanks,
Samantha

The version is correct. Have you re-run MetaPhlAn on the sample, or did you just try to re-merge the same profiles?

This is also happening to me, the very same error message when I try to merge my files:
“Too many columns specified: expected 4 and found 3.”
A resolution would be great.

Thanks,

Cathriona

Oh jeez, I completely forgot that I was supposed to rerun MetaPhlAn for that sample. Doing so fixed the merging tables problem!

Thanks so much for your help!!
~Samantha