Author: Maximilian Kannen (Page 2 of 5)

US Senate Holds an AI Hearing

16/05/2023 / Maximilian Kannen / 0 Comments

Today the US Senate held a hearing on AI to discuss the risks and opportunities of AI and possible ways to regulate the sector nationally and globally.

Witnesses included Sam Altman, CEO of OpenAI; Gary Marcus, professor emeritus at New York University; and Christina Montgomery, vice president and chief privacy and trust officer at IBM.

I think the discussion was quite good and is relevant for everyone. One thing that stands out is the companies’ wish to be controlled and guided by the government. The EU AI Act was a topic and the need for a global solution was a main talking point. A critical idea was for an agency to give out licenses to companies for developing LLMs, which Sam Altman proposed.

I hope governments find a way to make sure AI is deployed so that everyone benefits, without slowing down the development of the technology or limiting it to a few people or to profit motives.

Why Humans Should Stay on Earth: The Case for Robotic Space Exploration

12/05/2023 / Maximilian Kannen / 1 Comment

Space travel has always captivated humanity. From the first rocket launched into space to modern-day space programs, the dream of exploring the cosmos has persisted. Recently, the idea of colonizing Mars has gained traction, with visionaries like Elon Musk advocating for humans to become a multi-planetary species. However, numerous reasons indicate that humans should not venture into space, and that using robots is a more viable option. In this blog post, we will delve deeper into the challenges and limitations of human space travel and argue for prioritizing robotic space exploration.

The Human Body is Ill-Suited for Space

The human body is not designed to withstand the rigors of space travel or to live on another planet for extended periods. Numerous issues arise when humans venture into space, some of which are:

Microgravity: The absence of gravity in space can cause a range of biological problems, including muscle atrophy, bone density loss, and impaired fluid regulation. These issues can make it difficult for astronauts to function effectively during missions and can result in long-term health problems after returning to Earth.
Limited resources: Space lacks the essential resources we need to survive, such as water, food, and air. Providing these resources for human space missions adds considerable complexity and expense to missions and creates additional points of potential failure.
Radiation exposure: Mars does not have the same protective shield as Earth, leaving humans vulnerable to radiation exposure. This can increase the risk of cancer and other health problems. Furthermore, Mars’s surface is bombarded by meteorites, which can cause significant damage to human habitats.
Psychological challenges: Humans may struggle with stress, isolation, and confined living conditions during long-term space missions, leading to mental health issues and decreased performance.

Robots, in contrast, do not require air, food, or water, are largely unaffected by radiation, and can withstand damage better than humans. They also do not experience the psychological challenges faced by humans in space.

The High Cost of Human Space Travel

The costs associated with solving the problems mentioned above are astronomical. Sending humans into space requires advanced life support systems, extra protection for spacecraft and habitats, and provisions such as food, water, and air. Some specific costs include:

Life support systems: Developing and maintaining advanced life support systems for human space missions is resource-intensive and can add significantly to mission costs.
Provisions: Transporting the necessary supplies for human survival, such as food, water, and air, is expensive and increases the overall weight of the spacecraft, which in turn raises fuel costs.
Mission redundancy: Human space missions require additional safety and backup systems to minimize the risks associated with equipment failures, further driving up costs.
Returning: Since there is no way for humans to live in space or on other planets indefinitely, we would have to bring them back eventually, which roughly doubles the cost.

In contrast, robotic missions like NASA’s VIPER rover can be completed for a fraction of the price of crewed missions, as they do not require complex life support systems or extensive provisions.

Superior Performance of Machines in Space

Robots possess several advantages over humans when it comes to space exploration:

Durability: Robots are not vulnerable to radiation exposure and can withstand the harsh conditions of space better than humans.
Efficiency: Robots can work tirelessly without needing rest, food, or water, making them ideal for tasks like construction and repairs.
Adaptability: Robots can be designed and programmed to perform specific tasks without the need for complex life support systems, allowing them to be more cost-effective and adaptable to various mission requirements.
Safety: Using robots for space exploration is safer than sending humans, as it reduces the risks associated with space travel while still expanding our knowledge of the cosmos.

While the idea of colonizing Mars is undoubtedly exciting, it is not a practical solution. The challenges associated with space travel, including the human body’s limitations in space, high costs, and the superior performance of machines, make it clear that using robots for space exploration is a better option. As we continue to explore the universe, we must prioritize safety, efficiency, and cost-effectiveness to ensure that space travel remains a viable and sustainable option for future generations. By focusing on robotic space exploration, we can continue to expand our understanding of the cosmos while mitigating the risks and challenges faced by human astronauts. Ultimately, this approach will allow us to make more informed decisions about the potential for human settlement on other planets and contribute to the ongoing development of space technology and knowledge.

Google IO Summary

11/05/2023 / Maximilian Kannen / 0 Comments

The entire keynote

Google IO happened yesterday and the keynote focused heavily on AI. Some of the things that I found most important are:

PaLM 2 is their new LLM. It comes in different sizes, from small enough to run on Pixel phones to big enough to beat ChatGPT (GPT-3.5). It is used in Bard and many of their productivity tools.

Gemini is a multimodal model and the product of the Google DeepMind merger. It is being trained right now and could be a contender for the strongest AI when it comes out. I am quite excited about this release since DeepMind is my personal favorite for AGI.

Moreover, they showcased their seamless integration of PaLM and other advanced generative AI tools throughout their product suite as a direct response to Microsoft’s Copilot. They applied the same innovative approach to their search functionality, incorporating PaLM to deliver a search experience reminiscent of Bing GPT. This development fills me with hope, considering their search results outperform those of Bing. It’s likely that their decision to keep PaLM smaller was driven by cost considerations, allowing for more economical operation in the realm of search.

Claude comes with 100K context

11/05/2023 / Maximilian Kannen / 0 Comments

Anthropic, the OpenAI competitor, just announced a new version of their LLM Claude. This new version has a context length of 100K tokens, which corresponds to around 75K words. It is not clear from the announcement how they implemented that or how the full context gets fed into the attention layers.

OpenAI is planning to release a 32K context version of GPT-4 soon.

Longer context means you can feed long-form content like books, reports, or entire code bases into the model and work with the entirety of the data.
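The conversion from tokens to words above can be checked with a quick calculation. This is only a rough sketch: the ratio of 0.75 words per token is an assumption implied by the 100K-to-75K figure, and the real ratio depends on the tokenizer and the text.

```python
# Assumption: about 0.75 English words per token, as implied by
# "100K tokens ... around 75K words". Real ratios vary by tokenizer and text.
WORDS_PER_TOKEN = 0.75


def tokens_to_words(tokens: int, words_per_token: float = WORDS_PER_TOKEN) -> int:
    """Estimate how many words fit into a context window of `tokens` tokens."""
    return round(tokens * words_per_token)


print(tokens_to_words(100_000))  # 75000
print(tokens_to_words(32_000))   # 24000
```

By the same rule of thumb, a 32K context would hold roughly 24K words.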

AI helps with AI Understanding

10/05/2023 / Maximilian Kannen / 0 Comments

One of the main problems of LLMs is that they are black boxes: how they produce an output is not understandable to humans. Understanding what individual neurons represent and how they influence the model is important for making sure the models are reliable and do not contain dangerous tendencies.

OpenAI applied GPT-4 to find out the different meanings of neurons in GPT-2. The methodology involves using GPT-4 to generate explanations of neuron behavior in GPT-2, simulate what a neuron that fired for the explanation would do, and then compare these simulated activations with the real activations to score the explanation’s accuracy. This process helps in understanding and could potentially help improve the model’s performance.
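The scoring step can be illustrated with a small sketch. The post only says that simulated and real activations are compared; using the Pearson correlation as the score is an assumption made here for illustration, and the activation values and function name are made up.

```python
from math import sqrt
from typing import Sequence


def correlation_score(real: Sequence[float], simulated: Sequence[float]) -> float:
    """Score an explanation by how well simulated activations track real ones.

    Assumption: the score is the Pearson correlation between the two series.
    Returns 0.0 if either series is constant (correlation is undefined).
    """
    n = len(real)
    if n != len(simulated) or n < 2:
        raise ValueError("need two activation series of equal length >= 2")
    mean_r = sum(real) / n
    mean_s = sum(simulated) / n
    cov = sum((r - mean_r) * (s - mean_s) for r, s in zip(real, simulated))
    var_r = sum((r - mean_r) ** 2 for r in real)
    var_s = sum((s - mean_s) ** 2 for s in simulated)
    if var_r == 0 or var_s == 0:
        return 0.0
    return cov / sqrt(var_r * var_s)


# Made-up per-token activations of one neuron on a five-token text.
real = [0.0, 0.1, 0.9, 0.0, 0.8]
# Simulated activations for a good and a bad explanation (also made up).
good_explanation = [0.0, 0.0, 1.0, 0.0, 1.0]
bad_explanation = [1.0, 0.0, 0.0, 1.0, 0.0]

print(correlation_score(real, good_explanation))
print(correlation_score(real, bad_explanation))
```

An explanation whose simulated activations rise and fall with the real ones scores close to 1, while one that fires on the wrong tokens scores low or negative.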

The tools and datasets used for this process are being open-sourced to encourage further research and development of better explanation generation techniques. This is part of the recent efforts in AI alignment before even more powerful models are trained. Read more about the process here and the paper here. You can also view the neurons of GPT-2 here. I recommend clicking through the network and admiring the artificial brain.

OpenAI Open-Sources a New Text-to-3D model

05/05/2023 / Maximilian Kannen / 0 Comments

Shap-E can generate 3D assets from text or images. Unlike their earlier model Point-E, this one can directly generate the parameters of implicit functions that can be rendered as both textured meshes and neural radiance fields. It is also faster to run and open-source! Read the paper here.

Just like video generation, the quality is still behind image generation. I expect this to change by the end of this year.

Microsoft Improves Bing Chat Again

04/05/2023 / Maximilian Kannen / 0 Comments

Microsoft announced that Bing Chat is now available to everyone and that it will get new features such as image search and more ways to present visual information. They are also adding the ability to summarise PDFs and other types of content.

But the biggest news is that they are bringing plugins to Bing Chat, which will work similarly to the ChatGPT plugins. I recommend reading the entire announcement yourself. This is the first step towards their promise of a copilot for the web, and I think they are doing a good job. This also puts pressure on their partner OpenAI, which is working on its own improvements to ChatGPT and now has to compete against its investor Microsoft.

DeepFloyd is finally here

02/05/2023 / Maximilian Kannen / 0 Comments

Stability AI finally released DeepFloyd, a new text-to-image model that is capable of putting text in images and has much better spatial awareness. It was trained on a new version of the LAION-A dataset.

Test it out here

Study Extends BERT’s Context Length to 2 Million Tokens

24/04/2023 / Maximilian Kannen / 0 Comments

Researchers have made a breakthrough in the field of artificial intelligence, successfully extending the context length of BERT, a Transformer-based natural language processing model, to two million tokens. The team achieved this feat by incorporating a recurrent memory into BERT using the Recurrent Memory Transformer (RMT) architecture.

The researchers’ method increases the model’s effective context length and maintains high memory retrieval accuracy. This allows the model to store and process both local and global information, improving the flow of information between different segments of an input sequence.

The study’s experiments demonstrated the effectiveness of the RMT-augmented BERT model, which can now tackle tasks on sequences up to seven times its originally designed input length (512 tokens). This breakthrough has the potential to significantly enhance long-term dependency handling in natural language understanding and generation tasks, as well as enable large-scale context processing for memory-intensive applications.
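The idea of passing a memory from one segment to the next can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the `toy_step` function stands in for the Transformer, the memory is a single boolean instead of learned memory tokens, and all names here are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

SEGMENT_LEN = 512  # BERT's original input length, as stated in the post


def split_into_segments(tokens: Sequence[int], segment_len: int = SEGMENT_LEN) -> List[Sequence[int]]:
    """Cut a long token sequence into consecutive fixed-size segments."""
    return [tokens[i:i + segment_len] for i in range(0, len(tokens), segment_len)]


def run_with_memory(
    tokens: Sequence[int],
    step: Callable[[bool, Sequence[int]], Tuple[bool, bool]],
    memory: bool,
    segment_len: int = SEGMENT_LEN,
) -> Tuple[List[bool], bool]:
    """Process segments in order, feeding each step the memory from the previous one."""
    outputs = []
    for segment in split_into_segments(tokens, segment_len):
        output, memory = step(memory, segment)
        outputs.append(output)
    return outputs, memory


NEEDLE = 42  # hypothetical token id that should be remembered


def toy_step(memory: bool, segment: Sequence[int]) -> Tuple[bool, bool]:
    """Stand-in for the model: the memory is one bit, 'needle seen so far'."""
    memory = memory or (NEEDLE in segment)
    return memory, memory


tokens = [0] * 1000 + [NEEDLE] + [0] * 1000  # 2001 tokens, needle at index 1000
outputs, final_memory = run_with_memory(tokens, toy_step, memory=False)
print(outputs)  # [False, True, True, True]
```

Each segment only ever sees 512 tokens, yet information from the second segment is still available in the last one because it travels through the memory.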

Google and DeepMind Team Up

20/04/2023 / Maximilian Kannen / 0 Comments

Google and DeepMind just announced that they will unite Google Brain and DeepMind into Google DeepMind. This is a good step for both sides, since DeepMind really needs the computing power of Google to make further progress on AGI and Google needs the manpower and knowledge of the DeepMind team to quickly catch up to OpenAI and Microsoft. This partnership could lead to a real rival for OpenAI on the way to AGI. I personally always liked that DeepMind had a different approach to AGI, and I hope they will continue to push ideas other than language models.

The next open-source LLM

19/04/2023 / Maximilian Kannen / 0 Comments

Stability AI finally released their own open-source language model. It is trained from scratch and can be used commercially. The first two models are 3B and 7B parameters in size, which is comparable to many other open-source models.

What I am more excited about are their planned 65B and 175B parameter models, which are bigger than most other recent open-source models. These models will show how close open-source models can actually get to ChatGPT and whether local AI assistants have a future.

NVIDIA improves text-to-video yet again

19/04/2023 / Maximilian Kannen / 0 Comments

NVIDIA’s newest model, VideoLDM, can generate videos with resolutions up to 1280 x 2048. They achieve that by training a diffusion model in a compressed latent space, introducing a temporal dimension to the latent space, and fine-tuning on encoded image sequences while temporally aligning diffusion model upsamplers.

It is visibly better than previous models, and it looks like my prediction for this year is coming true: we are getting video models as capable as the image models from the end of last year. Read the paper here.

Text-to-Speech is reaching a critical point

19/04/2023 / Maximilian Kannen / 0 Comments

Today, Microsoft published a paper called “NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers”. In this paper, they show a new text-to-speech model which is able to copy not only human speech but also singing. The model uses a latent diffusion model and a neural audio codec to synthesize high-quality, expressive voices with strong zero-shot ability by generating quantized latent vectors conditioned on text input.

With this model, we are reaching a critical point. Text-to-speech is now good enough to fool people and replace many jobs and positions that require speech. It also allows for better speech interfaces to language models, which makes the interaction more natural. As we are approaching a future where people have personal AI assistants, natural speech is a core technology. And even though NaturalSpeech 2 is not perfect, it is good enough to start this future.

MiniGPT-4 is an Open-Source Multimodal Model

18/04/2023 / Maximilian Kannen / 0 Comments

MiniGPT-4 is an open-source multimodal model similar to the version of GPT-4 that was shown during OpenAI’s presentation. It combines a visual encoder with an LLM, in this case Vicuna, which is a fine-tuned version of LLaMA.

In the future, I hope more teams try to add new ideas to their models instead of creating more and more small language models.

Close-Up of the Brain

18/04/2023 / Maximilian Kannen / 0 Comments

Researchers at Duke’s Center for In Vivo Microscopy, in collaboration with other institutions, have achieved a breakthrough in magnetic resonance imaging (MRI) technology, capturing the highest resolution images ever of a mouse brain. Using an incredibly powerful 9.4 Tesla magnet, 100 times stronger gradient coils than those used in clinical MRIs, and a high-performance computer, the team generated scans with voxels (cubic pixels) measuring just 5 microns, 64 million times smaller than those in a clinical MRI.
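The "64 million times smaller" figure is a ratio of voxel volumes, which can be reproduced with a short calculation. The 2 mm edge length for a clinical MRI voxel is an assumption used here for illustration; only the 5 micron figure comes from the text above.

```python
def voxel_volume_ratio(clinical_edge_um: float, research_edge_um: float) -> float:
    """Return how many research voxels fit into one clinical voxel (by volume)."""
    return (clinical_edge_um / research_edge_um) ** 3


# Assumption: a clinical MRI voxel has an edge length of about 2 mm = 2000 microns.
CLINICAL_EDGE_UM = 2000.0
RESEARCH_EDGE_UM = 5.0  # from the post

print(voxel_volume_ratio(CLINICAL_EDGE_UM, RESEARCH_EDGE_UM))  # 64000000.0
```

The edge length shrinks by a factor of 400, and because volume scales with the cube of the edge length, the voxel volume shrinks by 400 x 400 x 400 = 64 million.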

The team combined these high-resolution MRI scans with light sheet microscopy, a complementary technique that allows for specific cell labeling, to create vivid and detailed images of the entire mouse brain. These images provide unprecedented insights into brain connectivity, changes in brain structure with age, and the effects of neurodegenerative diseases such as Alzheimer’s.

The researchers believe that this breakthrough in MRI resolution will greatly enhance our understanding of diseases, leading to better insights into conditions such as Alzheimer’s, and how they may affect the human brain. The ability to visualize the brain in such microscopic detail opens up new possibilities for studying the effects of diet, drugs, and other interventions on brain health and longevity.

OpenAssistant is here

15/04/2023 / Maximilian Kannen / 0 Comments

OpenAssistant is an open-source project to build a personal assistant. They just released their first model. You can try it out here.

announcement video

While the progress on smaller models by the open-source community is impressive, there are a few things I want to mention. Many advertise these models as local alternatives to ChatGPT or even compare them to GPT-4. This is sadly not true. It is not possible to replicate the capabilities of a model like GPT-4 on a local machine, at least not yet. This does not mean that they are not good. Many of them are able to generate good answers or even use APIs, as ChatGPT does.

Zip-NeRF: the next step towards the Metaverse

14/04/2023 / Maximilian Kannen / 0 Comments

Neural Radiance Fields (NeRFs), which are used for synthesizing high-quality images of 3D scenes, are a class of generative models that learn to represent scenes as continuous volumetric functions, mapping 3D spatial coordinates to RGB colors and volumetric density. Grid-based representations of NeRFs use a discretized grid to approximate this continuous function, which allows for efficient training and rendering. However, these grid-based approaches often suffer from aliasing artifacts, such as jaggies or missing scene content, because they lack an explicit understanding of scale.
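To make the "coordinates to color and density" idea concrete, here is a minimal sketch of how one pixel is rendered from such a function by sampling along a camera ray and alpha-compositing the samples. The `toy_field` scene is a hand-written stand-in for a trained network, and all names and numbers are made up for illustration.

```python
from math import exp
from typing import Tuple

Vec3 = Tuple[float, float, float]


def toy_field(x: float, y: float, z: float) -> Tuple[Vec3, float]:
    """Hypothetical scene: a dense red sphere of radius 1 at the origin.

    A real NeRF would use a trained network here; this returns (RGB, density).
    """
    inside = x * x + y * y + z * z <= 1.0
    return (1.0, 0.0, 0.0), (10.0 if inside else 0.0)


def render_ray(origin: Vec3, direction: Vec3, near: float, far: float, n_samples: int) -> Vec3:
    """Composite color along one ray using evenly spaced samples."""
    delta = (far - near) / n_samples
    color = [0.0, 0.0, 0.0]
    transmittance = 1.0  # fraction of light not yet absorbed along the ray
    for i in range(n_samples):
        t = near + (i + 0.5) * delta
        point = tuple(o + t * d for o, d in zip(origin, direction))
        rgb, sigma = toy_field(*point)
        alpha = 1.0 - exp(-sigma * delta)  # opacity of this small ray interval
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return (color[0], color[1], color[2])


hit = render_ray((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), near=0.0, far=6.0, n_samples=64)
miss = render_ray((5.0, 0.0, -3.0), (0.0, 0.0, 1.0), near=0.0, far=6.0, n_samples=64)
print(hit)   # close to pure red
print(miss)  # (0.0, 0.0, 0.0)
```

The ray through the sphere comes out almost fully red, while the ray that misses it stays black. Sampling the scene only at isolated points like this, without accounting for the size of the pixel footprint, is where aliasing comes from.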

This new paper proposes a novel technique called Zip-NeRF that combines ideas from rendering and signal processing to address the aliasing issue in grid-based NeRFs. This allows for anti-aliasing in grid-based NeRFs, resulting in significantly lower error rates compared to previous techniques. Moreover, Zip-NeRF achieves faster training times, being 22 times faster than current approaches.

This makes NeRFs applicable for VR and AR applications and allows for high-quality 3D scenes. Next year, when the hardware improves, we will see some very high-quality VR experiences.

New Image generation approach

13/04/2023 / Maximilian Kannen / 0 Comments

OpenAI developed a new approach to image generation called consistency models. Current models, like DALL-E 2 or Stable Diffusion, produce an image by iteratively denoising it over many steps. This new approach goes straight to the final result, which makes the process much faster and cheaper. While not as good as some diffusion models yet, they will likely improve and become an alternative for scenarios where faster results are needed.

Stanford and Google let AI roleplay

11/04/2023 / Maximilian Kannen / 0 Comments

In a new research paper, Google and Stanford University created a sandbox world where they let 25 AI agents role-play. The agents are based on ChatGPT (GPT-3.5) and behave more believably than real humans. Future agents based on GPT-4 will be able to act even more realistically and intelligently. This could mean not only that we get better AI NPCs in computer games, but also that we will not be able to distinguish bots from real people. This is a great danger in a world where public opinion influences many people. As these agents become more human-like, the risk of deep emotional connections increases, especially if the person does not know that they are interacting with an AI.
