posted by janrinok on Sunday May 21 2023, @01:19PM   Printer-friendly
from the sounds-more-sinister-than-DarkERNIE-I-suppose dept.

A language model trained on the fringes of the dark web... for science:

We're still early in the snowball effect unleashed by the release of Large Language Models (LLMs) like ChatGPT into the wild. Paired with the open-sourcing of other GPT (Generative Pre-Trained Transformer) models, the number of applications employing AI is exploding; and as we know, ChatGPT itself can be used to create highly advanced malware.

As time passes, the number of applied LLMs will only grow, each specializing in its own area, trained on carefully curated data for a specific purpose. And one such application just dropped: one trained on data from the dark web itself. DarkBERT, as its South Korean creators call it, has arrived; follow that link for the release paper, which also gives an overall introduction to the dark web.

DarkBERT is based on the RoBERTa architecture, an AI approach developed back in 2019. It has since seen a renaissance of sorts, with researchers discovering that it had more performance to give than was extracted from it at release. It seems the model was severely undertrained, leaving it far below its maximum efficiency.

Originally spotted on The Eponymous Pickle.

Related: People are Already Trying to Get ChatGPT to Write Malware


Original Submission

 
This discussion was created by janrinok (52) for logged-in users only, but now has been archived. No new comments can be posted.
  • (Score: 3, Insightful) by jb on Monday May 22 2023, @07:43AM

    by jb (338) on Monday May 22 2023, @07:43AM (#1307294)

    "I'm not even sure where the boundaries are, or the difference, for when something is an AI and when something is an Expert System."

    That's because expert systems (ES) *are* a form of AI, just a rather different form to the machine learning (ML) systems that are the most popular today.

    One of my pet hates is that so many people now use "AI" as nothing but a synonym for ML, which it quite clearly isn't (ML is a proper subset of AI).

    The difference you're looking for is this:

    ES are made up of a rules base (a set of propositions validated as true by a panel of experts in whatever field we're working in, hence the name), coupled with an inference engine (which takes observations as inputs, then applies the rules base to infer an answer, requesting further input if there's not enough information yet).
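    The rules-base/inference-engine split can be sketched in a few lines; this is a minimal forward-chaining toy, with the rules invented purely for illustration:

    ```python
    # Rules base: each rule says "if all antecedents are known facts,
    # conclude the consequent". These rules are made up for the example.
    RULES = [
        ({"has_fever", "has_cough"}, "possible_flu"),
        ({"possible_flu", "short_of_breath"}, "see_doctor"),
    ]

    def infer(observations):
        """Inference engine: apply rules repeatedly until no new facts emerge."""
        facts = set(observations)
        changed = True
        while changed:
            changed = False
            for antecedents, conclusion in RULES:
                if antecedents <= facts and conclusion not in facts:
                    facts.add(conclusion)
                    changed = True
        return facts
    ```

    Every conclusion it reaches can be traced back to a specific chain of rules, which is exactly the determinism and explainability being contrasted below.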

    ML involves applying (mostly) statistical methods to recognise patterns in the input based on patterns in the training data and predict answers on that basis.

    The two could not be more different. One is precise, deterministic and explainable. The other is none of those things (but as usual, hype trumps reason).
