
SoylentNews is people

posted by janrinok on Friday August 26 2016, @08:13AM
from the not-so-bright-scientists dept.

Scientific literature often mis-names genes and boffins say Microsoft Excel is partly to blame.

"Automatic conversion of gene symbols to dates and floating-point numbers is a problematic feature of Excel software," write Mark Ziemann, Yotam Eren and Assam El-Osta of the Baker IDI Heart & Diabetes Institute in Australia in a paper titled Gene name errors are widespread in the scientific literature.

The things Excel does to gene names include changing "SEPT2", the name of a gene thought to have a role in proper formation of cell structure, to the date "2-Sep". The "MARCH1" gene becomes "1-Mar".

The paper notes that this is a problem that's been known for over a decade, but one which remains pervasive. The trio studied 35,175 Excel tables attached to 3,597 scientific papers published between 2005 and 2015 and found errors in "987 supplementary files from 704 published articles. Of the selected journals, the proportion of published articles with Excel files containing gene lists that are affected by gene name errors is 19.6 per cent."

It's not hard to change the default format of Excel cells to avoid changes of this sort: you can get it done in a click or three. Much of the problem in these papers is therefore between scientists' ears, rather than within Excel itself. The paper's silent on why genetic scientists, who The Register will assume are not short of intelligence, have been making Excel errors for years.

This article focuses on errors resulting from auto-correction of gene names; certainly other subject areas have suffered from similarly 'helpful' software. What hilarious and/or cringe-worthy 'corrections' have YOU seen?


  • (Score: 5, Informative) by zocalo on Friday August 26 2016, @09:50AM

    by zocalo (302) on Friday August 26 2016, @09:50AM (#393416)
    Although Excel definitely has some problems, this really isn't one of them. It's entirely down to the people who are creating the spreadsheet, whether by entering the data directly or importing CSV files/whatever, failing to set a suitable format (e.g. Text) on the cells in question to prevent Excel trying to interpret the data, which it is always going to do on the default "General" cell format. Excel is specifically singled out here, but other spreadsheets definitely have the same problem, although I don't think any of the alternatives go quite so far as Excel in trying to interpret data and automatically apply some cell formatting. Nor do they have the same market penetration, so it's almost inevitable that Excel is going to be involved in more documents.

    If you are going to pick a given tool for a task, you really ought to make sure that you know how to make the tool accomplish the task. If you're importing pre-formatted data from elsewhere, e.g. a CSV, which seems the most likely case in Excel: right click on the top-left corner of the sheet, "Format Cells", "Text", "OK" - four mouse clicks and it's a non-issue. Hardly a "power user function" either. PEBKAC.
    --
    UNIX? They're not even circumcised! Savages!
  • (Score: 0) by Anonymous Coward on Friday August 26 2016, @09:56AM

    by Anonymous Coward on Friday August 26 2016, @09:56AM (#393418)
    +1
    Also, catching these 'errors' is rather trivial. Apply a filter on the column with the data and then look at the filter value list. Any auto conversions as stated in the article are quickly spotted.
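The same check can be sketched outside the spreadsheet. The following is a minimal illustration, not anything taken from the paper: the function name `find_suspect_symbols`, the two regular expressions, and the sample column (including the value "2.31E+13") are all assumptions chosen for the example. It flags values shaped like "2-Sep" or like a number in scientific notation, which is what an auto-converted gene symbol would look like.

```python
import re

# Hypothetical patterns (assumptions for this sketch): values shaped like
# "2-Sep" or "2.31E+13" are unlikely to be genuine gene symbols.
MONTHS = "Jan|Feb|Mar|Apr|May|Jun|Jul|Aug|Sep|Oct|Nov|Dec"
DATE_LIKE = re.compile(rf"^\d{{1,2}}-(?:{MONTHS})$", re.IGNORECASE)
FLOAT_LIKE = re.compile(r"^\d+(?:\.\d+)?E[+-]?\d+$", re.IGNORECASE)


def find_suspect_symbols(values):
    """Return (index, value) pairs that look like dates or floats."""
    suspects = []
    for index, value in enumerate(values):
        text = value.strip()
        if DATE_LIKE.match(text) or FLOAT_LIKE.match(text):
            suspects.append((index, text))
    return suspects


# Made-up sample column mixing intact symbols with converted ones.
column = ["TP53", "2-Sep", "BRCA1", "1-Mar", "2.31E+13", "SEPT2"]
for index, text in find_suspect_symbols(column):
    print(f"row {index}: {text!r} looks auto-converted")
```

A pattern check like this can only point at suspicious cells; recovering the original symbol still needs the source data, since the conversion discards it.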
  • (Score: 1, Informative) by Anonymous Coward on Friday August 26 2016, @10:06AM

    by Anonymous Coward on Friday August 26 2016, @10:06AM (#393422)

    You can also disable all these auto corrections in the settings. Ditto with LibreOffice (which does its own nonsense of this sort).

    • (Score: 3, Insightful) by janrinok on Friday August 26 2016, @02:03PM

      by janrinok (52) Subscriber Badge on Friday August 26 2016, @02:03PM (#393474) Journal

      It's not hard to change the default format of Excel cells to avoid changes of this sort: you can get it done in a click or three.

      So, you are suggesting that they do exactly what it says you should do in TFS? Well, who would have thought that....?

      Much of the problem in these papers is therefore between scientists' ears, rather than within Excel itself. The paper's silent on why genetic scientists, who The Register will assume are not short of intelligence, have been making Excel errors for years.

      Blaming the messenger when it's a user problem - er no, I think TFS laid the blame quite squarely on the shoulders of the scientists using Excel, I even wrote an appropriate dept to make it obvious: 'not-so-bright-scientists dept'.

      I suspect someone didn't read TFS closely enough :-)

      --
      [nostyle RIP 06 May 2025]
      • (Score: 1) by kurenai.tsubasa on Friday August 26 2016, @03:23PM

        by kurenai.tsubasa (5227) on Friday August 26 2016, @03:23PM (#393526) Journal

        Did like the dept line :)

        I would heap the blame on the march towards “user friendliness.” User friendly is in the eyes of the user. It turns out that Excel really isn't that user friendly. Except if it came out of the box in a way that scientists and others who need to work with data in a rigorous manner may find more user friendly, legions of PHBs and accountants would blot out the sun with their irritated, angry helpdesk requests. “Why doesn't this stupid thing see that SEPT2 is when the next pay period is over?! What kind of autistic dweeb wrote this?!”

        One would hope that the Everybody Can Code! thing would educate people that Excel isn't the only way and often isn't the best way to work with and present data, but one would hope for too much.

        I think a lot of it mostly starts from the irrational fear of the command line. I'm certain Google, Microsoft, Apple et al do have a vested interest in the idea that it's all magick powered by waldos under the hood. (Sure, one could argue a command prompt still doesn't constitute “under the hood,” since it isn't really.) code.org doesn't really do anything to dispel that notion afaict, but I digress.
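As one small illustration of working with such data without a spreadsheet, here is a sketch using Python's standard `csv` module, which returns every field as a plain string and performs no type guessing. The CSV content and the column names `symbol` and `count` are made up for the example.

```python
import csv
import io

# Hypothetical CSV content; in practice this would come from a file.
raw = "symbol,count\nSEPT2,12\nMARCH1,7\n"

# csv.DictReader yields each row as a dict of strings. Nothing is
# reinterpreted as a date or a number unless the caller converts it.
rows = list(csv.DictReader(io.StringIO(raw)))
symbols = [row["symbol"] for row in rows]
print(symbols)
```

Any numeric conversion then has to be requested explicitly, column by column, which is the opposite of a "General" cell format.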

      • (Score: 0) by Anonymous Coward on Friday August 26 2016, @04:11PM

        by Anonymous Coward on Friday August 26 2016, @04:11PM (#393543)
        How do I submit a request for a "well duh" moderation?
  • (Score: 0) by Anonymous Coward on Friday August 26 2016, @07:11PM

    by Anonymous Coward on Friday August 26 2016, @07:11PM (#393639)

    I don't blame the users when these kinds of helpful features pop up unannounced. This automatic data formatting has been a bane of mine as well, and it is something that gets foisted upon the user and gets enabled by default. You also have to catch the error, which isn't always obvious when you reasonably expect the cell to hold the data you typed, as it did before, but now it works differently. Where I got bit in the ass with this was when entering dates: it was helping me out by reformatting them for me, assuming I wanted DD/MM/YY when I was typing in MM/DD/YY. I'm very sympathetic to users. Sure, you can go in and change this default behavior, once you've found out about it, but what about the other helpful features that are enabled by default that you haven't discovered yet? The default on these programs is "do exactly what I tell you, not do what you think I mean" and let the user decide which features to enable.

    • (Score: 0) by Anonymous Coward on Friday August 26 2016, @07:14PM

      by Anonymous Coward on Friday August 26 2016, @07:14PM (#393640)

      I meant the default should be do as I want. Left that part out in haste.
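The DD/MM/YY versus MM/DD/YY mix-up described in this sub-thread is easy to reproduce. The sketch below uses Python's standard `datetime.strptime`; the typed value "03/04/16" is a made-up example, and nothing here claims to mirror how any particular spreadsheet decides between the two readings.

```python
from datetime import datetime

# Hypothetical input: the same keystrokes, read under two conventions.
typed = "03/04/16"

as_month_first = datetime.strptime(typed, "%m/%d/%y").date()
as_day_first = datetime.strptime(typed, "%d/%m/%y").date()

print(as_month_first.isoformat())  # 2016-03-04
print(as_day_first.isoformat())    # 2016-04-03
```

Both parses succeed without any warning, which is why a silent choice of convention is hard to notice until the dates are checked against the source.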