![](https://crypto4nerd.com/wp-content/uploads/2023/07/1ujQDRvUCBNDpnt6KzeCgUA-1024x683.jpeg)
Here are your weekly articles, guides, and news about NLP and AI chosen for you by NLPlanet!
- Meta releases Llama 2. Meta has released Llama 2, an open-source model with a commercial-use license that achieves performance comparable to ChatGPT. Trained on 2T tokens and available in several parameter sizes, Llama 2 was further fine-tuned with instruction tuning and reinforcement learning from human feedback (RLHF), outperforming other open-source models such as Falcon and MPT.
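Llama 2's chat variants expect prompts in a specific instruction format. Below is a minimal sketch of that template in Python; the helper name is ours, and in practice the `transformers` tokenizer's chat-template support builds this string for you.

```python
def build_llama2_prompt(system: str, user: str) -> str:
    """Wrap a system + user message in Llama 2's published chat template."""
    return f"<s>[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"

# Example: a single-turn prompt for a chat-tuned Llama 2 checkpoint.
prompt = build_llama2_prompt(
    "You are a concise assistant.",
    "Summarize attention in one sentence.",
)
```

The model's completion follows after the closing `[/INST]` marker; multi-turn conversations repeat the `[INST] … [/INST]` pattern.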
- Announcing LangSmith, a unified platform for debugging, testing, evaluating, and monitoring your LLM applications. LangChain has released LangSmith, a tool designed to improve the reliability of LLM-powered apps. By providing debugging, testing, evaluation, and monitoring features, LangSmith helps developers identify and address issues such as unexpected results, errors, and latency. It also makes it easy to experiment with new chains and prompt templates.
- Apple is testing a ChatGPT-like AI chatbot. Apple is building its own chatbot, internally dubbed “Apple GPT,” to rival Google and OpenAI. After initial security concerns, the chatbot is now more widely accessible to Apple employees for prototyping, though usage is restricted and no customer-facing features are allowed.
- Cerebras Systems signs $100 million AI supercomputer deal with UAE’s G42. Cerebras Systems has struck a $100 million deal with G42, marking the debut of AI supercomputers that could potentially challenge Nvidia’s market position. In response to chip shortages, cloud computing providers are seeking alternative solutions. To accelerate the rollout, Cerebras will construct three Condor Galaxy systems in the United States, with the first supercomputer set to go online this year, followed by two others in early 2024.
- Custom instructions for ChatGPT. OpenAI introduces personalized custom instructions for ChatGPT, allowing users to have a more tailored and adaptable experience. This feature, developed after gathering feedback from users across 22 countries, highlights the importance of customization in meeting diverse needs. Custom instructions will be gradually rolled out to all users, with beta access initially available to Plus plan subscribers.
- Wix’s new tool can create entire websites from prompts. Wix, the leading website builder, has released an AI Site Generator tool that uses AI to automatically generate websites based on user prompts. This tool goes beyond design, offering features such as e-commerce, scheduling, food ordering, and event ticketing. The CEO of Wix believes that AI is essential in simplifying website development for small businesses and helping them avoid missing out on income opportunities.
- Results of the Open Source AI Game Jam. The Open Source AI Game Jam hosted by Hugging Face showcased innovative games integrating AI models. The winning game, “Snip It,” allows players to explore a museum where objects in paintings come to life when snipped. Other impressive games included “Yabbit Attack,” which uses genetic algorithms, “Fish Dang Bot Rolling Land” with Text To Speech integration, and “Everchanging Quest” incorporating GPT-4 and Starcoder. Check out the link for the full list and explore the games’ AI features.
- Building an AI WebTV. The AI WebTV project showcases the potential of text-to-video models like Zeroscope and MusicGen in generating entertaining videos. Created using Hugging Face services, it utilizes a combination of ChatGPT, Zeroscope V2, and FILM to create high-quality video clips with accompanying music. An exciting example of advancements in AI technology for audio-visual synthesis.
- FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning. Stanford researchers have introduced FlashAttention-2, an algorithm that accelerates attention and reduces memory usage in language models. The updated version is about 2x faster than the original FlashAttention thanks to better parallelism and work partitioning; in particular, it now also parallelizes over the sequence-length dimension, which speeds up settings with small batch sizes or few heads but long sequences. This development matters for anyone looking to scale Transformer-based models.
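The core trick behind the FlashAttention family is computing softmax attention block by block with an online (running) softmax, so the full sequence-by-sequence score matrix is never materialized. Below is a toy NumPy sketch of that idea, not the fused CUDA kernel; all function names are ours.

```python
import numpy as np

def naive_attention(q, k, v):
    # Reference implementation: materializes the full score matrix.
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def blocked_attention(q, k, v, block=4):
    # Online-softmax accumulation over key/value blocks, so only a
    # (seq x block) slice of scores exists at any time.
    d = q.shape[-1]
    m = np.full((q.shape[0], 1), -np.inf)  # running row max
    l = np.zeros((q.shape[0], 1))          # running softmax normalizer
    o = np.zeros_like(q)                   # running (unnormalized) output
    for j in range(0, k.shape[0], block):
        s = q @ k[j:j + block].T / np.sqrt(d)
        m_new = np.maximum(m, s.max(axis=-1, keepdims=True))
        p = np.exp(s - m_new)
        scale = np.exp(m - m_new)          # rescale old accumulators
        l = l * scale + p.sum(axis=-1, keepdims=True)
        o = o * scale + p @ v[j:j + block]
        m = m_new
    return o / l
```

Both functions return identical results; the blocked version only changes the memory-access pattern, which is where the speedups come from once implemented as a fused GPU kernel.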
- Lost in the Middle: How Language Models Use Long Contexts. This study investigates the performance of language models in utilizing extended contexts for tasks such as question answering and retrieval. While models excel in finding relevant information at the start or end of input, their performance declines when accessing middle sections of long contexts. The study highlights the challenges of utilizing long contexts and the necessity for future improvements.
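The study's setup can be illustrated with a small helper that places a gold document at different positions among distractors and measures how answer accuracy varies with position. The helper name and the "Freedonia" fact are hypothetical.

```python
def build_context(distractors, gold_doc, position):
    """Insert the gold document at a given index among distractor passages."""
    docs = list(distractors)
    docs.insert(position, gold_doc)
    return "\n\n".join(f"Document {i + 1}: {d}" for i, d in enumerate(docs))

distractors = [f"Filler passage {i}." for i in range(9)]
gold = "The capital of Freedonia is Fredville."  # hypothetical fact to retrieve

# Probe beginning, middle, and end placements (10 documents total):
contexts = [build_context(distractors, gold, p) for p in (0, 5, 9)]
```

Querying a model with each context and the same question ("What is the capital of Freedonia?") exposes the U-shaped accuracy curve the paper reports: best at the edges, worst in the middle.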
- Learning to Retrieve In-Context Examples for Large Language Models. Researchers have developed a framework that uses dense retrievers to automatically select high-quality examples for in-context learning of LLMs. Experimental results demonstrate its effectiveness in improving LLM performance by retrieving similar and contextually relevant examples.
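The idea can be sketched with a toy retriever that embeds candidate demonstrations and keeps the ones closest to the query. Here a bag-of-words vector stands in for the trained dense encoder, and all names are ours.

```python
import numpy as np
from collections import Counter

def embed(text, vocab):
    # Toy bag-of-words vector; a real system would use a trained
    # dense encoder (e.g. a sentence-embedding model).
    counts = Counter(text.lower().split())
    return np.array([counts[w] for w in vocab], dtype=float)

def top_k_examples(query, pool, k=2):
    """Return the k pool examples most similar to the query."""
    vocab = sorted({w for t in pool + [query] for w in t.lower().split()})
    qv = embed(query, vocab)

    def cos(a, b):
        na, nb = np.linalg.norm(a), np.linalg.norm(b)
        return 0.0 if na == 0 or nb == 0 else float(a @ b / (na * nb))

    scored = sorted(pool, key=lambda t: cos(qv, embed(t, vocab)), reverse=True)
    return scored[:k]

examples = ["translate cat to french", "add two and two", "translate dog to french"]
picked = top_k_examples("translate bird to french", examples, k=2)
```

The retrieved examples are then prepended to the prompt as in-context demonstrations; the paper's contribution is training the retriever so that "similar" means "helpful for this LLM on this task."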
- How is ChatGPT’s behavior changing over time? A research study tracked the performance of GPT-3.5 and GPT-4 on various tasks over time and found significant variations in their behavior: GPT-4’s accuracy at identifying prime numbers dropped sharply between March and June 2023, and both models made more formatting mistakes in code generation.
- Brain2Music. Researchers have developed a method for reconstructing music from brain activity using fMRI. By correlating brain regions with the MusicLM model activations, they can predict and recreate music similar to what the human subjects experienced. MusicLM, an AI language model trained on diverse music, plays a key role in generating high-quality audio compositions.
- ShortGPT. ShortGPT is an AI framework that simplifies short-video content creation by automating tasks such as video creation, voiceover synthesis, footage sourcing, and editing. It supports multiple languages, automates caption generation, and sources footage from the web via the Pexels API.
- Copy Is All You Need. A new text generation approach, called Copy Is All You Need, improves quality by copying text segments from existing collections. This method utilizes contextualized text representations and efficient vector search toolkits to generate text, resulting in comparable inference efficiency to token-level autoregressive models.
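As a rough illustration of copy-based generation, the toy sketch below extends a prompt by repeatedly copying short spans from a corpus whose preceding words best match the current context. It is a naive stand-in for the paper's contextualized phrase retrieval, and all names are ours.

```python
def copy_generate(prompt, corpus, steps=3, ctx=2, span=2):
    """Extend `prompt` by copying `span`-word segments from `corpus`
    whose preceding `ctx` words best match the current context."""
    words = prompt.split()
    corp = corpus.split()
    for _ in range(steps):
        context = words[-ctx:]
        best, best_score = None, -1
        for i in range(len(corp) - span):
            # Score a candidate copy position by word overlap with context.
            score = sum(a == b for a, b in zip(corp[max(0, i - ctx):i], context))
            if score > best_score:
                best_score, best = score, corp[i:i + span]
        words += best
    return " ".join(words)

corpus = "the quick brown fox jumps over the lazy dog"
out = copy_generate("the quick", corpus, steps=3)
```

The actual method scores candidate phrases with dense contextual representations and an efficient vector index rather than exact word overlap, which is what lets it stay competitive with token-level decoding.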
- Towards A Unified Agent with Foundation Models. Researchers have found that using language models and vision-language models inside reinforcement learning agents can address key challenges in the field. By leveraging the knowledge stored in these models, agents can explore sparse-reward environments, reuse data for learning, schedule skills for novel tasks, and learn from expert observations. In simulated robotic environments, language-centric RL agents outperformed baselines on an object-stacking task, demonstrating the potential of foundation models in RL.
Thank you for reading! If you want to learn more about NLP, remember to follow NLPlanet. You can find us on LinkedIn, Twitter, Medium, and our Discord server!