
SoylentNews is people

posted by mrpg on Monday April 09 2018, @01:38PM
from the that-word-again dept.

The scientific paper—the actual form of it—was one of the enabling inventions of modernity. Before it was developed in the 1600s, results were communicated privately in letters, ephemerally in lectures, or all at once in books. There was no public forum for incremental advances. By making room for reports of single experiments or minor technical advances, journals made the chaos of science accretive. Scientists from that point forward became like the social insects: They made their progress steadily, as a buzzing mass.

The earliest papers were in some ways more readable than papers are today. They were less specialized, more direct, shorter, and far less formal. Calculus had only just been invented. Entire data sets could fit in a table on a single page. What little "computation" contributed to the results was done by hand and could be verified in the same way.

The more sophisticated science becomes, the harder it is to communicate results. Papers today are longer than ever and full of jargon and symbols. They depend on chains of computer programs that generate data, and clean up data, and plot data, and run statistical models on data. These programs tend to be both so sloppily written and so central to the results that it's contributed to a replication crisis, or put another way, a failure of the paper to perform its most basic task: to report what you've actually discovered, clearly enough that someone else can discover it for themselves.

Source: The Scientific Paper Is Obsolete



 
  • (Score: 2) by PiMuNu on Monday April 09 2018, @03:20PM (6 children)

    by PiMuNu (3823) on Monday April 09 2018, @03:20PM (#664485)

    Mod up - I am writing a technical report as a preliminary to a paper of a few pages. The technical report is already ~50 pages long, and that is without much text...

  • (Score: 4, Interesting) by bzipitidoo on Monday April 09 2018, @05:20PM (5 children)

    by bzipitidoo (4388) on Monday April 09 2018, @05:20PM (#664551) Journal

    At the least, scientific papers-- or, I should say "results" instead of "papers"-- could ditch the PDF format and use something better such as epub. Epub isn't the ultimate either, of course, but PDF was about the worst possible choice. Why? Because it is print oriented. It dictates fonts, font sizes, line breaks, and even the spacing between characters. None of that has anything whatsoever to do with the results. The article makes an insightful comparison between the current state of affairs and the early days of the Gutenberg Press, which was limited to duplication of calligraphy and took a long time to recognize and be free of its limitations.

    Instead, we now have this crazy system in which it has become routine for organizations to provide LaTeX templates to generate the perfect PDF from the LaTeX original. (Depends on the discipline-- I feel sure some are deeply in the MS world and provide Word templates instead of LaTeX ones.) And then, arXiv doesn't want that final PDF, though they will take it. They prefer the LaTeX source so they can use their own system to create a PDF their way. So much for PDF being such a great format, or we wouldn't have this silly dispute over the details of the final form, details that do not matter! PDF is also notoriously wasteful and bloated. PDF can be a total pain to import. Much, much better to work with the source, whether that's in LaTeX or some other format.

    I strongly suspect the evil academic publishers of perpetrating the PDF, as it fits better with their model that enriches them at the expense of the rest of us. A format such as PDF is also a way to lock work away. It appears all open, but it isn't, not in many of the ways that count. Having a PDF without the LaTeX is like having a binary without the source code. Yeah, you can run a binary-- if you have a compatible system. You can even disassemble it. But none of that is as good as having a copy of the source. Also, as the article points out, the source code of the programs used in the research can save a whole lot of time for those trying to replicate the results, and may be even more important than the write up, yet that is routinely omitted, not even considered part of the final product, the "paper". Having a paper but not the source code is like having an instruction manual without the program it is about.

    Yet another issue is the ridiculous page limit, often set to just 10 pages. The only reason I see for any such limit now is to keep authors focused. We sure don't lack for space, as long as we stick to digital media and drop the whole notion of printing results on dead trees.

    All this is just one of the many problems with academic publishing. I'd love to see control wrenched away from the traditional publishers, who have become little more than rent seeking parasitic scum, paywalling everything they can. $30 for a 10 page paper, source code not included, for which the public has already paid? Highway robbery! We don't need the likes of Elsevier. One new system that has arisen, this so-called "author pays" model, is crap. If they charged a pittance, that'd be one thing, but no, they ask not $30, but perhaps $500 of the author. That employers have been stepping up to foot those bills is nice, sort of, but it screws all the researchers who do not work for such an employer. Independent researchers are seriously disadvantaged by such a system.

    Maybe what could work is a sort of "results forum", like this site, but about research results. "Results for scientists, news that _really_ matters". If only arXiv had a forum.... Sure, there's Usenet, but accessing it isn't that easy. It's not hard, but it is harder than it need be.

    • (Score: 0) by Anonymous Coward on Monday April 09 2018, @06:32PM (1 child)

      by Anonymous Coward on Monday April 09 2018, @06:32PM (#664590)

      I would argue that "published" work, especially in research, should be in a fixed form, such that future researchers can literally see the same paper.
      Also, please note that some journals already offer epub as a download option.

      • (Score: 2) by bzipitidoo on Monday April 09 2018, @07:58PM

        by bzipitidoo (4388) on Monday April 09 2018, @07:58PM (#664638) Journal

        This forum supports that. Once a message is posted, it's fixed. No one, including the author, can edit the messages after they are posted.

    • (Score: 2) by Wootery on Tuesday April 10 2018, @09:21AM (2 children)

      by Wootery (2341) on Tuesday April 10 2018, @09:21AM (#664872)

      > something better such as epub

      We already have a standard format for publishing documents for on-screen reading: HTML.

      What's so special about research publications that they can't just use web technologies like the rest of us, and be easily viewable in the browser with no screwery?

      I seem to recall seeing a journal that does publish HTML, but I forget which.

      EPUB is essentially just HTML+images bundled into a single file (and made browser-unfriendly), right?

      • (Score: 2) by Wootery on Tuesday April 10 2018, @09:24AM (1 child)

        by Wootery (2341) on Tuesday April 10 2018, @09:24AM (#664873)

        Forgive the self-reply: HTML should be offered in addition to PDF, rather than as a replacement for it. I can see the value in PDF/EPUB to preserve exact on-paper formatting.

        • (Score: 2) by bzipitidoo on Tuesday April 10 2018, @01:24PM

          by bzipitidoo (4388) on Tuesday April 10 2018, @01:24PM (#664929) Journal

          Yes, epub is primarily html files in a zip file.
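To make the "html files in a zip file" point concrete, here is a minimal Python sketch that builds a tiny EPUB-like container in memory and lists what is inside. The file names `content.opf` and `chapter1.xhtml` and the helper names are made up for illustration, and the package file is only a stub, so the result is not expected to pass a real EPUB validator.

```python
import io
import zipfile

# Standard location and shape of the container file that points at the package document.
CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Hypothetical one-page "paper" written as ordinary XHTML.
CHAPTER_XHTML = """<?xml version="1.0" encoding="utf-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Results</title></head>
  <body><p>The result, without any page layout attached.</p></body>
</html>
"""


def build_minimal_epub() -> bytes:
    """Return the bytes of a tiny EPUB-like zip archive built in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The 'mimetype' entry comes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("META-INF/container.xml", CONTAINER_XML,
                    compress_type=zipfile.ZIP_DEFLATED)
        # Stub only: a real package document lists metadata, manifest and spine.
        zf.writestr("content.opf", "<!-- package document stub -->",
                    compress_type=zipfile.ZIP_DEFLATED)
        zf.writestr("chapter1.xhtml", CHAPTER_XHTML,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def list_entries(epub_bytes: bytes) -> list:
    """List the archive member names in the order they were written."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.namelist()


def read_entry(epub_bytes: bytes, name: str) -> str:
    """Read one member of the archive as UTF-8 text."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.read(name).decode("utf-8")


if __name__ == "__main__":
    data = build_minimal_epub()
    for entry in list_entries(data):
        print(entry)
```

Any ordinary zip tool can open such a file, which is the practical difference from PDF: the content stays plain markup that other programs can read.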

          > preserve exact on-paper formatting.

          That's the problem. Exact formatting should not matter. The information is what's important. This desire for exact formatting is basically fear that the formatting might contain valuable information that will be lost if not stored in a format that preserves it, such as pdf. A big limitation of epub is that the average epub reader might not support MathML. It is only recently (2015) that MathML was formally added to the HTML5 standard.

          Thing is, we don't ourselves know, or care that much, where things will end up on the page until the paper is finished and we generate a pdf from the final write up. The most we do is check that the conversion to pdf didn't screw things up! Too frequently, I find that a bunch of symbols didn't make it into the pdf output, because yet another font was missing the italicized version. Bit disconcerting to have your formulas completely screwed up because the parentheses didn't get copied over. One of the biggest checks is simply that the pdf did not exceed the page limit.

          We still don't have good handling of mathematical functions. Instead, I have had to become familiar with LaTeX formatting. In many ways, LaTeX is merely a means of using the much deficient ASCII character set to write formulas, since ASCII doesn't have stuff such as the Set Theory notation, or the integral symbol. I might use MATLAB or Mathematica, but as they are not free, and the math usually isn't that heavy, I use a spreadsheet, or program it in the programming language du jour, or just write it down with old fashioned pencil and paper. Then I translate it into LaTeX.

          Unicode has the math symbols that are missing from ASCII, and UTF-8 can encode all of them. We haven't yet integrated this capability into our systems, and it's not so simple as just trading out the LaTeX $\int$ for ∫, Unicode code point U+222B. I expect LaTeX simply would not understand raw Unicode math symbols, at least not without extra setup. Part of the problem is that our mathematical notation itself could use an update.
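A small Python sketch of the point about ∫: the symbol has a Unicode code point (U+222B), and UTF-8 is just one way of encoding it as bytes. The macro table and the helper names `describe` and `naive_replace` are invented for illustration, and the replacement function is deliberately naive to show why a plain text swap of LaTeX macros for symbols falls short.

```python
import unicodedata

# Hypothetical, hand-picked mapping from a few LaTeX macros to Unicode symbols.
LATEX_TO_UNICODE = {
    r"\int": "\u222b",       # integral sign
    r"\in": "\u2208",        # element of
    r"\cup": "\u222a",       # union
    r"\emptyset": "\u2205",  # empty set
}


def describe(symbol: str) -> dict:
    """Report the code point, Unicode name and UTF-8 bytes of one character."""
    return {
        "code_point": f"U+{ord(symbol):04X}",
        "name": unicodedata.name(symbol),
        "utf8_bytes": symbol.encode("utf-8"),
    }


def naive_replace(source: str) -> str:
    """Swap known macros for symbols by plain string replacement.

    Longer macros are tried first, but nothing here understands LaTeX
    syntax, so limits, spacing commands and unknown macros are left as
    markup or mangled.
    """
    for macro in sorted(LATEX_TO_UNICODE, key=len, reverse=True):
        source = source.replace(macro, LATEX_TO_UNICODE[macro])
    return source


if __name__ == "__main__":
    print(describe("\u222b"))
    # The integral sign is swapped, but the limits are still LaTeX markup.
    print(naive_replace(r"\int_0^1 x \, dx"))
    # "\infty" is not in the table, so its "\in" prefix gets clobbered.
    print(naive_replace(r"\infty"))
```

The last call shows the trap: a text-level swap turns `\infty` into "∈fty", so a real conversion needs a parser for the notation and not just a lookup table.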