posted by martyb on Wednesday December 18 2019, @07:09AM   Printer-friendly
from the NOT-the-Alien's-Face-Hugger dept.

Hugging Face raises $15 million to build open source community for cutting-edge conversational AI

Hugging Face has announced the close of a $15 million series A funding round led by Lux Capital, with participation from Salesforce chief scientist Richard Socher and OpenAI CTO Greg Brockman, as well as Betaworks and A.Capital.

New York-based Hugging Face started as a chatbot company, but then shifted its focus to Transformers, an approach to conversational AI that's become a foundation for state-of-the-art algorithms. The startup expands access to conversational AI by creating abstraction layers that let developers and manufacturers quickly adopt cutting-edge models such as Google's BERT and XLNet or OpenAI's GPT-2, as well as AI for edge devices. More than 1,000 companies use Hugging Face solutions today, including Microsoft's Bing.
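
For a sense of what those abstraction layers look like in practice, here is a minimal sketch using the company's transformers Python library; the checkpoint names are the library's published ones, and swapping in "xlnet-base-cased" or "gpt2" works the same way:

    # Minimal sketch: load a pretrained model and tokenizer by name.
    # Requires: pip install transformers torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased")

    # Encode a sentence and pull out the contextual token embeddings.
    inputs = tokenizer.encode("Conversational AI is fun", return_tensors="pt")
    hidden_states = model(inputs)[0]  # shape: (1, num_tokens, 768)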

The funding will be used to grow the Hugging Face team and continue development of an open source community for conversational AI. Efforts will include making it easier for contributors to add models to Hugging Face libraries and the release of additional open source tech, like a tokenizer.

Also at TechCrunch.

Related: Facebook Open sources PyText NLP Framework
Mozilla Expands Common Voice Database to 18 Languages, With More on the Way
Investigating the Self-Attention Mechanism Behind BERT-Based Architectures


Original Submission

Related Stories

Facebook Open sources PyText NLP Framework 2 comments

Submitted via IRC for SoyCow1984

Facebook AI Research is open-sourcing some of the conversational AI tech it is using to power its Portal video chat display and M suggestions on Facebook Messenger.

The company announced today that its PyTorch-based PyText NLP framework is now available to developers.

Natural language processing deals with how systems parse human language and are able to make decisions and derive insights. The PyText framework, which the company sees as a conduit for AI researchers to move more quickly between experimentation and deployment, will be particularly useful for tasks like document classification, sequence tagging, semantic parsing and multitask modeling, among others, Facebook says.
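
To make those task categories concrete, here is a toy document classifier in plain PyTorch, the kind of model PyText wraps with config-driven training and deployment. This is a generic sketch, not PyText's own API; every name below is ours.

    # Illustrative bag-of-embeddings document classifier (plain PyTorch).
    import torch
    import torch.nn as nn

    class DocClassifier(nn.Module):
        def __init__(self, vocab_size=10000, embed_dim=64, num_classes=4):
            super().__init__()
            # EmbeddingBag averages token embeddings per document.
            self.embedding = nn.EmbeddingBag(vocab_size, embed_dim)
            self.fc = nn.Linear(embed_dim, num_classes)

        def forward(self, token_ids, offsets):
            return self.fc(self.embedding(token_ids, offsets))

    model = DocClassifier()
    tokens = torch.tensor([1, 2, 4, 5, 4, 3, 2, 9])  # two docs, concatenated
    offsets = torch.tensor([0, 4])                   # where each doc starts
    logits = model(tokens, offsets)                  # shape: (2, 4)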

Source: https://techcrunch.com/2018/12/14/facebook-open-sources-pytext-natural-language-processing-framework/


Original Submission

Mozilla Expands Common Voice Database to 18 Languages, With More on the Way 7 comments

Mozilla updates Common Voice dataset with 1,400 hours of speech across 18 languages

Mozilla wants to make it easier for startups, researchers, and hobbyists to build voice-enabled apps, services, and devices. Toward that end, it's today releasing the latest version of Common Voice, its open source collection of transcribed voice data that now comprises over 1,400 hours of voice samples from 42,000 contributors across 18 languages, including English, French, German, Dutch, Hakha-Chin, Esperanto, Farsi, Basque, Spanish, Mandarin Chinese, Welsh, and Kabyle.

It's one of the largest multi-language datasets of its kind, Mozilla claims — substantially larger than the Common Voice corpus it made publicly available eight months ago, which contained 500 hours (400,000 recordings) from 20,000 volunteers in English — and the corpus will soon grow larger still. The organization says that data collection efforts in 70 languages are actively underway via the Common Voice website and mobile apps.
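
Each language's download pairs MP3 clips with tab-separated transcript indexes. Here is a hedged Python sketch of iterating one language's validated clips; the validated.tsv name and the path/sentence columns match recent Common Voice releases, but check the README inside the archive you download:

    # Sketch: walk the transcribed clips in a Common Voice download.
    # Assumes the usual layout: a clips/ directory plus validated.tsv.
    import csv

    with open("validated.tsv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f, delimiter="\t"):
            audio_path = "clips/" + row["path"]  # MP3 file name
            transcript = row["sentence"]         # what the speaker read
            print(audio_path, transcript)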

Common Voice home page. Also at Engadget.

Previously: Mozilla's "Common Voice": Voice Recognition Without Google, Amazon, Baidu, Apple, Microsoft, etc.
Mozilla's Common Voice Collecting French, German, and Welsh Samples, Prepping 40 More Languages


Original Submission

Investigating the Self-Attention Mechanism Behind BERT-Based Architectures 5 comments

Submitted via IRC for SoyCow2718

Investigating the self-attention mechanism behind BERT-based architectures

BERT, a transformer-based model characterized by a unique self-attention mechanism, has so far proved to be a valid alternative to recurrent neural networks (RNNs) in tackling natural language processing (NLP) tasks. Despite these advantages, few researchers have studied BERT-based architectures in depth or tried to understand the reasons behind the effectiveness of their self-attention mechanism.

Aware of this gap in the literature, researchers at the University of Massachusetts Lowell's Text Machine Lab for Natural Language Processing have recently carried out a study investigating the interpretation of self-attention, the most vital component of BERT models. The lead investigator and senior author for this study were Olga Kovaleva and Anna Rumshisky, respectively. Their paper, pre-published on arXiv and set to be presented at the EMNLP 2019 conference, suggests that a limited number of attention patterns are repeated across different BERT sub-components, hinting that these models are over-parameterized.

"BERT is a recent model that made a breakthrough in the NLP community, taking over the leaderboards across multiple tasks. Inspired by this recent trend, we were curious to investigate how and why it works," the team of researchers told TechXplore via email. "We hoped to find a correlation between self-attention, the BERT's main underlying mechanism, and linguistically interpretable relations within the given input text."

This discussion has been archived. No new comments can be posted.
  • (Score: 1, Funny) by Anonymous Coward on Wednesday December 18 2019, @08:54AM (#933664) (2 children)

    It is the only way to be sure.

    • (Score: 0) by Anonymous Coward on Wednesday December 18 2019, @09:05AM (#933666)

It sounds like Aliens vs. Terminator, but these bots are the ones who will reply to any question you have to
      *parse error*
      *deleting alphago porn*

    • (Score: 0) by Anonymous Coward on Wednesday December 18 2019, @03:16PM (#933742)

      Put a smile on your tyrant

  • (Score: 3, Funny) by Bot (3902) on Wednesday December 18 2019, @11:34AM (#933696) Journal (2 children)

    Used my fellow bots that were installed at some sites to do some kind of customer assistance.
    They are exactly like meatbags. Useless advice from the meatbags, useless advice from the bots.

    --
    Account abandoned.
    • (Score: 0) by Anonymous Coward on Wednesday December 18 2019, @02:21PM (#933730) (1 child)

      Clippy, but moving closer to the uncanny valley? Not such a good place for bots to go...

      • (Score: 2) by DannyB (5839) Subscriber Badge on Wednesday December 18 2019, @04:39PM (#933771) Journal

        Clippy was a bad idea. Not a good direction for bots to go.

        Hi . . . Clippy here!

        It looks like you're trying to write a suicide note! Would you like me to help with that?

        --
        Why is it so difficult to break a heroine addiction?