SilverKey Monitor

A skeptic on ChatGPT

published on 2023/04/01

Even taking the learning from human feedback into account, ChatGPT is still a parrot repeating what it saw on the internet, with additional bullshit-generation capabilities. There is no way a model trained simply on the objective of learning what language looks like can do much more than repeat information it already saw during training (rearranged to align with the user's query), as it has little to no understanding of the content. Human feedback only adds another layer of deception, giving the model information on how people like the bullshit to be served.

The fact that OpenAI allegedly tried to hire people to work through various problems and explain their solutions in extensive detail only proves that simply making a model bigger does not make it smarter. It just has more storage for memorizing the answers.

Reason Field Lab


Convoy is an open source webhooks gateway

published on 2023/04/01

Convoy is an open source, high-performance webhooks gateway used to securely ingest, persist, debug, deliver, and manage millions of events reliably, with rich features such as retries, rate limiting, static IPs, circuit breaking, rolling secrets, and more. To get started, import the OpenAPI spec into Postman or Insomnia.
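The post doesn't reproduce the API, so the snippet below is only a sketch of what publishing an event to a gateway like this typically looks like. The base URL, endpoint path, header, and payload fields are assumptions, not taken from Convoy's spec; the imported OpenAPI document is the authoritative reference.

```python
import requests

# Hypothetical sketch of ingesting an event into a Convoy deployment.
# URL, path, credentials, and field names below are placeholders/assumptions.
CONVOY_URL = "http://localhost:5005"   # assumed local deployment
API_KEY = "CO.your-api-key"            # placeholder credential
PROJECT_ID = "your-project-id"         # placeholder project

resp = requests.post(
    f"{CONVOY_URL}/api/v1/projects/{PROJECT_ID}/events",  # assumed path
    headers={"Authorization": f"Bearer {API_KEY}"},
    json={
        "endpoint_id": "your-endpoint-id",  # where the webhook gets delivered
        "event_type": "invoice.paid",       # example event name
        "data": {"invoice_id": "inv_123", "amount": 4200},
    },
    timeout=10,
)
resp.raise_for_status()
print(resp.json())  # the gateway then delivers/retries per its configured policy
```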

[Convoy](https://github.com/frain-dev/convoy)


Universal Speech Model

published on 2023/03/30
AI

Universal Speech Model (USM) is a family of state-of-the-art speech models with 2B parameters trained on 12 million hours of speech and 28 billion sentences of text, spanning 300+ languages. USM, which is for use in YouTube (e.g., for closed captions), can perform automatic speech recognition (ASR) not only on widely spoken languages like English and Mandarin, but also on languages like Punjabi, Assamese, Santhali, Balinese, Shona, Malagasy, Luganda, Luo, Bambara, Soga, Maninka, Xhosa, Akan, Lingala, Chichewa, Nkore, and Nzema, to name a few. Some of these languages are spoken by fewer than twenty million people, making it very hard to find the necessary training data.

We demonstrate that utilizing a large unlabeled multilingual dataset to pre-train the encoder of our model and fine-tuning on a smaller set of labeled data enables us to recognize these under-represented languages. Moreover, our model training process is effective for adapting to new languages and data.
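The pre-train-then-fine-tune recipe described above can be sketched in a few lines of PyTorch. Everything below (the tiny transformer encoder, the masked-reconstruction pre-training objective, the CTC fine-tuning head, all shapes and sizes) is an illustrative stand-in for the idea, not Google's implementation.

```python
import torch
import torch.nn as nn

# Toy stand-in for the recipe: pre-train an encoder on unlabeled audio
# features, then fine-tune on a small labeled set. Sizes are illustrative.
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=128, nhead=4, batch_first=True),
    num_layers=2,
)

# --- Stage 1: self-supervised pre-training on unlabeled speech ------------
# Mask random frames and train the encoder to reconstruct them
# (a simplified masked-prediction objective).
recon_head = nn.Linear(128, 128)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(recon_head.parameters()), lr=1e-4
)
unlabeled = torch.randn(8, 100, 128)        # (batch, frames, features)
mask = torch.rand(8, 100) < 0.15            # mask ~15% of frames
masked = unlabeled.masked_fill(mask.unsqueeze(-1), 0.0)

pred = recon_head(encoder(masked))
loss = nn.functional.mse_loss(pred[mask], unlabeled[mask])
loss.backward(); opt.step(); opt.zero_grad()

# --- Stage 2: supervised fine-tuning on a small labeled set ---------------
# Attach a CTC head and fine-tune on (audio, transcript) pairs.
vocab_size = 32
ctc_head = nn.Linear(128, vocab_size)
ft_opt = torch.optim.Adam(
    list(encoder.parameters()) + list(ctc_head.parameters()), lr=1e-5
)
audio = torch.randn(4, 100, 128)
targets = torch.randint(1, vocab_size, (4, 20))   # label 0 = CTC blank
log_probs = ctc_head(encoder(audio)).log_softmax(-1).transpose(0, 1)  # (T, B, V)
ctc = nn.CTCLoss(blank=0)(
    log_probs, targets,
    input_lengths=torch.full((4,), 100),
    target_lengths=torch.full((4,), 20),
)
ctc.backward(); ft_opt.step()
```

The point of the two stages is that the expensive, data-hungry part (the encoder) learns from plentiful unlabeled audio, so only a small transcribed set is needed per new language.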

Google Research


Chat Thing is a service that allows you to create a ChatGPT AI chatbot based on your own data

published on 2023/03/29

Create AI chatbots powered by your data that you can use anywhere

The easiest way to create an AI chatbot powered by ChatGPT using your existing data from Notion, uploaded files, websites and more.

https://chatthing.ai/

This looks like a very interesting service. Hat tip to Scripting News for its open discussion at https://github.com/scripting/Scripting-News/issues/255.
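Chat Thing hasn't published its internals, but services like this generally follow a retrieval-augmented pattern: embed your documents, find the pieces most relevant to a question, and hand them to ChatGPT as context. The sketch below assumes the OpenAI Python client as it existed in early 2023; it is a guess at the general pattern, not Chat Thing's code.

```python
import openai

# Hypothetical retrieval-augmented chatbot over your own documents.
# This is the generic pattern such services use, not Chat Thing's code.
openai.api_key = "sk-..."  # placeholder

documents = [
    "Our refund policy allows returns within 30 days.",
    "Support is available Monday to Friday, 9am-5pm CET.",
]

def embed(texts):
    resp = openai.Embedding.create(model="text-embedding-ada-002", input=texts)
    return [d["embedding"] for d in resp["data"]]

doc_vectors = embed(documents)

def answer(question):
    q_vec = embed([question])[0]
    # Cosine similarity reduces to a dot product on unit-length vectors;
    # ada-002 embeddings come back normalized.
    scores = [sum(a * b for a, b in zip(q_vec, d)) for d in doc_vectors]
    context = documents[max(range(len(scores)), key=scores.__getitem__)]
    chat = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=[
            {"role": "system", "content": f"Answer using this context: {context}"},
            {"role": "user", "content": question},
        ],
    )
    return chat["choices"][0]["message"]["content"]

print(answer("Can I return a product after two weeks?"))
```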


Text2Video-Zero is a Zero-Shot Video Generator

published on 2023/03/29
AI

Our method Text2Video-Zero enables zero-shot video generation using (i) a textual prompt (see rows 1, 2), (ii) a prompt combined with guidance from poses or edges (see lower right), and (iii) Video Instruct-Pix2Pix, i.e., instruction-guided video editing (see lower left). Results are temporally consistent and closely follow the guidance and textual prompts.

GitHub

The paper is on arXiv:

Recent text-to-video generation approaches rely on computationally heavy training and require large-scale video datasets. In this paper, we introduce a new task of zero-shot text-to-video generation and propose a low-cost approach (without any training or optimization) by leveraging the power of existing text-to-image synthesis methods (e.g., Stable Diffusion), making them suitable for the video domain.

Our key modifications include (i) enriching the latent codes of the generated frames with motion dynamics to keep the global scene and the background time consistent; and (ii) reprogramming frame-level self-attention using a new cross-frame attention of each frame on the first frame, to preserve the context, appearance, and identity of the foreground object.
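Both modifications are compact enough to express in code. The sketch below is a minimal PyTorch illustration of the two ideas as the abstract describes them: a shared first-frame key/value bank for the attention, and a per-frame latent shift standing in for the paper's motion warping. It is not the authors' implementation.

```python
import torch

def cross_frame_attention(q, k, v):
    """Attention where every frame attends to the FIRST frame's keys/values,
    anchoring foreground appearance and identity to frame 0.
    q, k, v: (frames, heads, tokens, dim)."""
    k0 = k[:1].expand_as(k)   # broadcast frame 0's keys to all frames
    v0 = v[:1].expand_as(v)   # ... and its values
    attn = torch.softmax(q @ k0.transpose(-2, -1) / q.shape[-1] ** 0.5, dim=-1)
    return attn @ v0

def add_motion_dynamics(latents, dx=1, dy=1):
    """Shift each frame's latent code a little further than the previous one
    so the global scene moves coherently over time. torch.roll is a crude
    stand-in for the warping the paper actually uses."""
    return torch.stack(
        [torch.roll(z, shifts=(t * dy, t * dx), dims=(-2, -1))
         for t, z in enumerate(latents)]
    )

# Toy shapes: 8 frames, 4 heads, 64 tokens of dim 32; 4x32x32 latent maps.
q = k = v = torch.randn(8, 4, 64, 32)
out = cross_frame_attention(q, k, v)
z = add_motion_dynamics(torch.randn(8, 4, 32, 32))
```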

Experiments show that this leads to low overhead, yet high-quality and remarkably consistent video generation. Moreover, our approach is not limited to text-to-video synthesis but is also applicable to other tasks such as conditional and content-specialized video generation, and Video Instruct-Pix2Pix, i.e., instruction-guided video editing.

As experiments show, our method performs comparably to, and sometimes better than, recent approaches, despite not being trained on additional video data. Our code will be open sourced on GitHub.

Arxiv.org