Researchers from the Baker IDI Heart and Diabetes Institute in Melbourne, Australia, have blamed a default autocorrect function within Microsoft Excel for errors in the supplementary files of approximately one-fifth of the genomics papers they examined.
The autocorrect function in Excel and other spreadsheet programs converts gene symbols such as SEPT2 (Septin-2) into dates, in this case “September 2”. The same happens with MARCH1 (the short name for Membrane-Associated Ring Finger (C3HC4) 1, E3 Ubiquitin Protein Ligase).
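As a rough illustration of which symbols are at risk, the sketch below flags gene symbols made up of a month name or abbreviation followed by a day number. The regular expression and the function name `looks_like_date` are assumptions made for this example; they approximate a spreadsheet's date parsing and do not reproduce it.

```python
import re

# Hypothetical heuristic: a month name or abbreviation followed by a day number.
# This only approximates a spreadsheet's own date parsing rules.
_MONTH_LIKE = re.compile(
    r"^(JAN|FEB|MAR|MARCH|APR|APRIL|MAY|JUN|JUNE|JUL|JULY|AUG|SEP|SEPT|OCT|NOV|DEC)"
    r"[-_ ]?(\d{1,2})$",
    re.IGNORECASE,
)


def looks_like_date(symbol: str) -> bool:
    """Return True if the symbol could plausibly be read as a date."""
    match = _MONTH_LIKE.match(symbol.strip())
    if not match:
        return False
    day = int(match.group(2))
    return 1 <= day <= 31


symbols = ["SEPT2", "MARCH1", "TP53", "BRCA1"]
at_risk = [s for s in symbols if looks_like_date(s)]
print(at_risk)  # ['SEPT2', 'MARCH1']
```

A check like this could be run on a gene list before it is opened in a spreadsheet, so that at-risk symbols are known in advance.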
Inadvertent gene symbol conversion is problematic because these supplementary files are an important resource to the genomics community and are frequently reused.
The research, described in Genome Biology by Mark Ziemann, Yotam Eren and Assam El-Osta, screened the supplementary data files from 3597 papers published in 18 journals between 2005 and 2015, using a combination of script-based screening and manual verification.
Despite the problem being highlighted more than a decade ago, the issue has grown.
There is, however, an easy way to identify the errors: researchers, reviewers and editorial staff need only copy and paste the column of gene names into a new sheet and then sort the column alphabetically. Any gene symbols converted to dates will appear as numbers at the top of the column.
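The same check can be sketched in script form for a column that has already been exported as text. This is a minimal sketch, not the authors' screening script: it assumes converted cells appear in a day-month form such as "2-Sep" or "1-Mar", and the helper name `find_converted` is hypothetical.

```python
import re

# Assumed shape of a converted cell when exported as text, e.g. "2-Sep" or "1-Mar".
_CONVERTED = re.compile(
    r"^\d{1,2}-(Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec)$",
    re.IGNORECASE,
)


def find_converted(cells):
    """Return (row_index, value) pairs for cells that look like dates."""
    return [(i, c) for i, c in enumerate(cells) if _CONVERTED.match(c.strip())]


column = ["TP53", "2-Sep", "BRCA1", "1-Mar"]
print(find_converted(column))  # [(1, '2-Sep'), (3, '1-Mar')]
```

Other date renderings, such as a spelled-out month or a numeric serial value, would need extra patterns, so an empty result here does not prove that a file is clean.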
To learn more, see the full paper in Genome Biology.