-
New OpenAI Update
OpenAI announced a set of changes to its model APIs. The biggest one is the addition of function calling for both GPT-3.5 and GPT-4, which lets developers connect the models to plugins and other external tools. They also released new versions of GPT-3.5 and GPT-4 that are better at following directions and a version — read more
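To make the idea of function calling concrete, here is a minimal sketch that runs without any network access. It assumes the developer describes a tool with a JSON-Schema-style definition and that the model's reply names a function and passes its arguments as a JSON string. The `get_weather` tool, the `REGISTRY` table, and the mocked reply are all hypothetical and only serve as illustration.

```python
import json

# Hypothetical tool; the name and fields are illustrative, not from the announcement.
def get_weather(city: str) -> dict:
    return {"city": city, "forecast": "sunny"}

# JSON-Schema-style description that would be sent along with the chat messages.
FUNCTIONS = [
    {
        "name": "get_weather",
        "description": "Look up the weather forecast for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
]

# Maps the function names the model may request to local Python callables.
REGISTRY = {"get_weather": get_weather}

def dispatch(function_call: dict) -> str:
    """Run the function the model asked for and return its result as JSON text."""
    func = REGISTRY[function_call["name"]]
    kwargs = json.loads(function_call["arguments"])  # arguments arrive as a JSON string
    return json.dumps(func(**kwargs))

# Mocked model reply (assumed shape): a function name plus JSON-encoded arguments.
mock_call = {"name": "get_weather", "arguments": '{"city": "Berlin"}'}
result = dispatch(mock_call)
```

In a real application the string in `result` would be sent back to the model as a follow-up message so it can phrase the final answer.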
-
Episode 8: Nvidia, Microsoft and the Development of ChatGPT
In this episode, Nico and I talk about the new announcements from Nvidia and Microsoft and give a more detailed explanation of how ChatGPT thinks and learns. More information on the Discord server https://discord.gg/3YzyeGJHth or at https://mkannen.tech/ — read more
-
Episode 7: AI Regulation and Copyright
In this episode, Florian and I talk about the EU AI Act, the US Senate hearings, and how copyright and art can work in the age of AI. If you want to hear more of Florian's music: https://open.spotify.com/artist/4CKbYO3CbkhsJe0CvexCAx?si=s2pEu5PATcSngnvVoIMKIQ More information on the Discord server https://discord.gg/3YzyeGJHth or at https://mkannen.tech/ — read more
-
Episode 6: Consciousness, Memories and Emotions
In this episode, Florian and I talk about possible forms of artificial consciousness, the purpose of emotions, and why it is difficult to align an AI with our interests. More information on the Discord server https://discord.gg/3YzyeGJHth or at https://mkannen.tech/ — read more
-
Intel Presents New Hardware
Intel just announced a new supercomputer named Aurora. It is expected to offer more than 2 exaflops of peak double-precision compute performance and is based on their new GPU series, which outperforms even the new H100 cards from NVIDIA. They are going to use Aurora to train their own LLMs with up to a trillion parameters. — read more
-
US Senate Holds an AI Hearing
Today the US Senate held a hearing on AI to discuss the risks and opportunities of AI and possible ways to regulate the sector nationally and globally. Witnesses testifying include Sam Altman, CEO of OpenAI; Gary Marcus, professor emeritus at New York University; and Christina Montgomery, vice president and chief privacy and trust officer at IBM. — read more
-
Google I/O Summary
Google I/O happened yesterday and the keynote focused heavily on AI. Some of the things that I found most important are: PaLM 2 is their new LLM. It comes in different sizes, from small enough for Pixel phones to big enough to beat GPT-3.5. It is used in Bard and many of their productivity tools. — read more
-
Claude comes with 100K context
Anthropic, the OpenAI competitor, just announced a new version of their LLM Claude. This new version has a context length of 100K tokens, which corresponds to around 75K words. It is not clear from the announcement how they implemented that and how the full context gets fed into the attention layers. OpenAI is planning to — read more
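The token-to-word conversion above implies a ratio of roughly 0.75 words per token. As a rough sketch, assuming that ratio holds on average (it varies with language and tokenizer), the estimate can be computed like this. The helper name `tokens_to_words` is only illustrative.

```python
# Ratio implied by the figures above: 75K words for 100K tokens.
# This is an average for illustration, not a property of any specific tokenizer.
WORDS_PER_TOKEN = 75_000 / 100_000

def tokens_to_words(tokens: int) -> int:
    """Estimate how many words fit into a given number of tokens."""
    return round(tokens * WORDS_PER_TOKEN)

claude_words = tokens_to_words(100_000)
```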
-
AI helps with AI Understanding
One of the main problems of LLMs is that they are black boxes: how they produce an output is not understandable for humans. Understanding what different neurons represent and how they influence the model is important to make sure the models are reliable and do not contain dangerous tendencies. OpenAI applied GPT-4 to find — read more
-
Episode 3: AI in Education and Text-to-Speech
In this episode, I talk with Florian about the various applications of AI in education. The text-to-speech examples mentioned: https://mkannen.tech/text-to-speech-is-reaching-a-critical-point/ https://github.com/suno-ai/bark For more information, visit https://mkannen.tech/ — read more
-
OpenAI Open-Sources a New Text-to-3D model
Shap-E can generate 3D assets from text or images. Unlike their earlier model Point-E, this one can directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields. It is also faster to run and open-source! Read the paper here. Just like video generation, the quality is — read more
-
Microsoft Improves Bing Chat Again
Microsoft announced that Bing Chat is now available for everyone and that it will get new features such as image search and more ways to present visual information. They are also adding the ability to summarise PDFs and other types of content. But the biggest news is that they bring plugins to — read more
-
DeepFloyd is finally here
Stability AI finally released DeepFloyd, a new text-to-image model that is capable of putting text in images and has much better spatial awareness. It was trained on a new version of the LAION-A dataset. Test it out here — read more
-
Study Extends BERT’s Context Length to 2 Million Tokens
Researchers have made a breakthrough in the field of artificial intelligence, successfully extending the context length of BERT, a Transformer-based natural language processing model, to two million tokens. The team achieved this feat by incorporating a recurrent memory into BERT using the Recurrent Memory Transformer (RMT) architecture. The researchers’ method increases the model’s effective context — read more
-
Google and DeepMind Team Up
Google and DeepMind just announced that they will unite Google Brain and DeepMind into Google DeepMind. This is a good step for both sides, since DeepMind really needs the computing power of Google to make further progress on AGI, and Google needs the manpower and knowledge of the DeepMind team to quickly catch up to — read more
-
The next open-source LLM
Stability AI finally released their own open-source language model. It is trained from scratch and can be used commercially. The first two models are 3B and 7B parameters in size, which is comparable to many other open-source models. What I am more excited about are their planned 65B and 175B parameter models, which are bigger than — read more
-
NVIDIA improves text-to-video yet again
NVIDIA’s newest model, VideoLDM, can generate videos with resolutions up to 1280 x 2048. They achieve that by training a diffusion model in a compressed latent space, introducing a temporal dimension to the latent space, and fine-tuning on encoded image sequences while temporally aligning diffusion model upsamplers. It is visibly better than previous models and — read more
-
Text-to-Speech is reaching a critical point
Today, Microsoft published a paper called “NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers”. In this paper, they show a new text-to-speech model that is able to copy not only human speech but also singing. The model uses a latent diffusion model and neural audio codec to synthesize high-quality, expressive — read more
-
MiniGPT-4 is an Open-Source Multimodal Model
MiniGPT-4 is an open-source multimodal model similar to the version of GPT-4 that was shown during OpenAI’s presentation. It combines a visual encoder with an LLM. They used Vicuna, which is a fine-tuned version of LLaMA. In the future, I hope more teams try to add new ideas to their models instead of creating more — read more