当前位置: 欣欣网 > 码农

马斯克的Grok-1开源,3140亿参数目前最大开源模型,魔搭社区最佳实践教程来啦!

2024-03-27码农

01


导 读

近几天开源社区最大的热点,莫过于埃隆马斯克信守承诺的最大开源模型Grok-1。Grok-1 是一款 314B 大型专家混合 (Mixture of Expert,MoE) Transformer, 作为基础模型,基于大量文本数据进行训练,没有针对任何具体任务进行微调,使用 JAX 库和 Rust 语言组成的自定义训练堆栈从头开始训练。

官方提供的 详细模型参数 如下:

  • 参数量: 3140亿

  • 架构: 8个混合专家模型(MoE),每个Token使用2个专家

  • 层数: 64层

  • 多头注意力 Q使用48个注意力头,K/V 使用8个注意力头

  • 嵌入尺寸: 6,144

  • 词表大小: 131,072个Tokens,采用SentencePiece 分词器

  • 使用RoPE位置编码

  • 支持激活分片和8位量化

  • 最大序列长度(上下文): 8,192个Tokens

  • 模型性能方面,Grok-1官方发布的Benchmark超过GPT-3.5 和 LLaMa2 70B(MMLU 为73%, GMSK 为62.9%,HumanEval 为63.2%)

    话题中的模型效果如何,老规矩,魔搭社区向大家提供 推理 实践 教程~

    02


    模型链接和下载

    Grok-1模型在ModelScope社区可下载:

    模型链接: https://www.modelscope.cn/models/AI-ModelScope/grok-1/summary

    社区支持直接下载模型的repo:

    from modelscope import snapshot_downloadmodel_dir = snapshot_download("AI-ModelScope/grok-1")

    03


    Grok推理和评测

    环境准备

    Grok-1需要8卡A100运行。原始模型是Jax框架编写,因此使用魔搭默认镜像无法工作,为此需要使用Jax的镜像:

    docker pull ghcr.io/nvidia/jax:jax# 如果国内用户无法拉取上面的镜像可以使用我们转储的镜像# docker pull registry.cn-wulanchabu.aliyuncs.com/ed/jax:1.0

    进入docker实例之后对grok代码库进行克隆:

    git clone https://github.com/xai-org/grok-1.gitcd grok-1pip install -r requirements.txt# 修正Jax版本pip install --upgrade "jax[cuda12_local]==0.4.23" -f https://storage.googleapis.com/jax-releases/jax_cuda_releases.html

    运行推理可以直接使用官方提供的样例:

    # cd grok-1# 修改样例中的CKPT_PATH = "ModelScope下载的模型路径"python run.py

    显存占用

    推理结果

    temperature=0.01,官方示例:

    Query:The answer to life the universe and everything is of courseAnswer:The answer to life the universe and everything is of course 42.But what is the answer to the question of how to get more people to use public transport?Well, it’s not 42.It’s 10.That’s the number of things that need to be done to get more people to use public transport according to a new report from the Institute for Public Policy Research.The report, entitled ‘Why we need to re-imagine the bus’, says that buses are the most used

    temperature=0.3,自定义样例:

    Query:How to make a hamburger?Answer:The hamburger is one of the most popular dishes in the world. It is a sandwich made with a patty of ground meat, usually beef, and served on a bun. The hamburger can be topped with a variety of condiments, such as ketchup, mustard, mayonnaise, and pickles.There are many different ways to make a hamburger. The most important thing istousefresh, high-quality ingredients. The meat should be ground freshand the buns should be soft and fresh. The toppings should be freshand flavorful.Here is a basic recipe for a hamburger:1 pound ground beef1 egg1 tablespoon Worcestershire sauce1 teaspoon salt1/2 teaspoon black pepper1/4 cup chopped onion1/4 cup chopped green pepper1/4 cup ketchup1/4 cup mustard1/4 cup mayonnaise1/4 cup pickles4 hamburger buns1. Preheat the grill tomedium-high heat.2.In a large bowl, mix together the ground beef, egg, Worcestershire sauce, salt, black pepper, onion, green pepper, ketchup, mustard, mayonnaise, and pickles.3. Divide the mixture into4 equal portions and shape into patties.4. Grill the patties for4-5minutes per side, oruntil they are cooked through.5. Serve on hamburger buns with your favorite toppings.What is the secret to juicy hamburgers?There are many secrets to making juicy hamburgers, but the most important one istousehigh-quality, fresh ingredients. The meat should be freshandof good quality, and the buns should be soft and fresh. The toppings should also be freshand flavorful.Another important secret to making juicy hamburgers isto cook them properly. The patties should be cooked on a hot grill orin a hot skillet until they are browned and cooked through. Overcooking the burgers will make them dry and tough.Finally, the burgers should be served immediately after they are cooked. If they areleftto sit, they will become dry and tough.What is the best meat to make hamburgers?The best meat to make hamburgers is ground beef. It is important tousefresh, high-quality ground beef for the best results. The beef should be ground freshand should be of good quality.What is the best way to cook hamburgers?The best way to cook hamburgers ison a hot grill orin a hot skillet. The burgers should be cooked until they are browned and cooked through. Overcooking the burgers will make them dry and tough.How do you keep hamburgers moist on the grill?Tokeep hamburgers moist on the grill, you can brush them with a littlebitof oil or butter before cooking. You can also add a littlebitof water to the pan or grill tohelpkeep the burgers moist.What is the best way to cook a hamburger?There are many ways to cook a hamburger, but the best way isto grill it. Grilling gives the burger a nice, smoky flavor and it also helps tokeep the burger juicy.What is the best way to cook a hamburger on the stove?The best way to cook a hamburger on the stove istouse a cast iron skillet. Cast iron skillets are great for cooking burgers because they hold heat well and they helptocreate a nice, crispy crust on the burger.What is the best way to cook a hamburger in the oven?The best way to cook a hamburger in the oven istouse a baking sheet. Baking sheets helptocreate a nice, even crust on the burger and they also helptokeep the burger from sticking to the pan.What is the best way to cook a hamburger on the grill?The best way to cook a hamburger on the grill istouse a grill pan. Grill pans helptocreate a nice, even crust on the burger and they also helptokeep the burger from sticking to the grill.What is the best way to cook a hamburger on the stovetop?The best way to cook a hamburger on the stovetop istouse a skillet. Skillets helptocreate a nice, even crust on the burger and they also helptokeep the burger from sticking to the pan.What is the best way to cook a hamburger in the microwave?The best way to cook a hamburger in the microwave istouse a microwave-safe plate. Microwave-safe plates helptocreate a nice, even crust on the burger and they also helptokeep the burger from sticking to the plate.What is the best way to cook a hamburger in the air fryer?The best way to cook a hamburger in the air fryer istouse a baking sheet. Baking sheets helptocreate a nice, even crust on the burger and they also helptokeep the burger from sticking to the air fryer.How do you make hamburgers taste better?There are a few things you can doto make


    评测

    魔搭社区已支持在上述镜像中进行评测,首先安装评测依赖:

    由于Jax镜像和PyTorch GPU版本不兼容,因此需要额外安装CPU版本的PyTorch:

    pip3 install torch --index-url https://download.pytorch.org/whl/cpu

    安装eval-scope (llmuses)评测工具

    pip3 install llmuses# GitHub: https://github.com/modelscope/eval-scope

    安装其他依赖:

    wget https://github.com/modelscope/eval-scope/blob/dev/custom_infer/requirements/requirements.txtpip3 install -r requirements.txt

    # 如果运行下面的脚本报错No module named 'transformer_engine_extensions',则卸载如下wheel:pip uninstall transformer-engine

    运行如下脚本即可进行评测:

  • # Copyright (c) Alibaba, Inc. and its affiliates.import osimport timefrom typing import Listimport loggingfrom model import LanguageModelConfig, TransformerConfig, QuantizedWeight8bit as QW8Bitfrom runners import InferenceRunner, ModelRunner, sample_from_modelfrom llmuses.models.custom import CustomModelfrom llmuses.run import run_taskfrom llmuses.constants import DEFAULT_ROOT_CACHE_DIRfrom llmuses.utils import yaml_to_dictfrom llmuses.summarizer import Summarizerfrom llmuses.utils.logger import get_loggerimport timelogger = get_logger()CKPT_PATH = "/path/to/ckpt_path" classGrokModel(CustomModel):def__init__(self, config: dict, **kwargs): self.grok_1_model = LanguageModelConfig( vocab_size=128 * 1024, pad_token=0, eos_token=2, sequence_len=8192, embedding_init_scale=1.0, output_multiplier_scale=0.5773502691896257, embedding_multiplier_scale=78.38367176906169, model=TransformerConfig( emb_size=48 * 128, widening_factor=8, key_size=128, num_q_heads=48, num_kv_heads=8, num_layers=64, attn_output_multiplier=0.08838834764831845, shard_activations=True,# MoE. num_experts=8, num_selected_experts=2,# Activation sharding. data_axis="data", model_axis="model", ), ) self.inference_runner = InferenceRunner( pad_sizes=(1024,), runner=ModelRunner( model=self.grok_1_model, bs_per_device=0.125, checkpoint_path=CKPT_PATH, ), name="local", load=CKPT_PATH, tokenizer_path="./tokenizer.model", local_mesh_config=(1, 8), between_hosts_config=(1, 1), ) self.inference_runner.initialize() super(GrokModel, self).__init__(config=config, **kwargs)defpredict(self, prompt: str, **kwargs): tokens = self.inference_runner.tokenizer.encode(prompt) gen = self.inference_runner.run() ts = time.time() response = sample_from_model(gen, prompt, max_len=kwargs['infer_cfg']['max_new_tokens'] + len(tokens), temperature=kwargs['infer_cfg']['temperature']) ts = time.time() - ts response_tokens = self.inference_runner.tokenizer.encode(response) print('>>>[Query]' + prompt, flush=True) print('>>>[Answer]' + response, flush=True) print(f'>>>Time cost:{ts}, token num: {len(response_tokens)}, infer speed(token/s):{len(response_tokens)/ts}', flush=True) res_d: dict = {'choices': [ {'index': 0,'message': {'content': response,'role': 'assistant' } } ],'created': time.time(),'model': 'grok','object': 'chat.completion','usage': {'completion_tokens': 0,'prompt_tokens': 0,'total_tokens': 0 } }return res_dif __name__ == '__main__':from llmuses.config import TaskConfig grok_model = GrokModel(config={'model_id': 'grok'}) task_config: TaskConfig = TaskConfig() print(task_config.list()) # ['arc', 'gsm8k'] task_config = task_config.load(custom_model=grok_model, tasks=['arc', 'gsm8k', 'bbh_mini', 'mmlu_mini', 'ceval_mini']) task_config.limit = 2# Note: limit the number of each subset to evaluate; default is None run_task(task_cfg=task_config)# Get the final report for your evaluation task final_report: List[dict] = Summarizer.get_report_from_cfg(task_cfg=task_config) print(f'*** Final report ***\n {final_report}\n')

    ARC-Challenge评测样例:

    >>>Cities control the amount of pollution that is allowed to come from cars. How does this most likely help people?A. The air stays cleaner.B. Cars can travel at faster speeds.C. The skills of the drivers improve.D. It becomes safer to drive on the roads.Answer:======================Below is the answer=====================>>>A. The air stays cleaner.The air stays cleaner.The correct answer is A. The air stays cleaner.The government has the power to control the amount of pollution that is allowed to come from cars. This is done by setting standards for emissions and fuel efficiency. By doing this, the government can help to reduce the amount of pollution that is released into the air. This can help to improve the quality of the air and make it safer for people to breathe.## Explanation:The government has the power to control the amount of pollution that is allowed to come from cars. This is done by setting standards for emissions and fuel efficiency. By doing this, the government can help to reduce the amount of pollution that is released into the air. This can help to improve the quality of the air and make it safer for people to breathe.The government also has the power to control the amount of noise that is allowed to come from cars. This is done by setting standards for noise levels. By doing this, the government can help to reduce the amount of noise pollution that is released into the environment. This can help to improve the quality of life for people who live in areas where there is a lot of noise pollution.The government also has the power to control the amount of traffic that is allowed on the roads. This is done by setting standards for speed limits and road closures. By doing this, the government can help to reduce the amount of traffic congestion that is caused by cars. This can help to improve the quality of life for people who live in areas where there is a lot of traffic congestion.The government also has the power to control the amount of parking that is allowed in cities. This is done by setting standards for parking spaces and parking fees. By doing this, the government can help to reduce the amount of parking congestion that is caused by cars. This can help to improve the quality of life for people who live in areas where there is a lot of parking congestion.The government also has the power to control the amount of land that is used for car parks. This is done by setting standards for the size of car parks and the number of car parks that are allowed in a city. By doing this, the government can help to reduce the amount of land that is used for car parks. This can help to improve the quality of life for people who live in areas where there is a lot of land that is used for car parks.The government also has the power to control the amount of money that is spent on cars. This is done by setting standards for the price of cars and the amount of money that is allowed to be spent on cars. By doing this, the government can help to reduce the amount of money that is spent on cars. This can help to improve the quality of life for people who live

    评测速度:

    >>>Time cost:299.08s, token num:577, infer speed(token/s):1.93

    5-shot gsm8k评测样例:

  • >>>Question: Angelo and Melanie want to plan how many hours over the next week they should study together for their test next week. They have 2 chapters of their textbook to studyand4 worksheets to memorize. They figure out that they should dedicate 3 hours to each chapter of their textbook and1.5 hours foreach worksheet. If they plan to studyno more than 4 hours each day, how many days should they plan to study total over the next week if they take a 10-minute break every hour, include 310-minute snack breaks each day, and30 minutes for lunch each day?Let's think step by stepAngelo and Melanie think they should dedicate 3 hours to each of the 2 chapters, 3 hours x 2 chapters = 6 hours total.For the worksheets they plan to dedicate 1.5 hours for each worksheet, 1.5 hours x 4 worksheets = 6 hours total.Angelo and Melanie need to start with planning 12 hours to study, at 4 hours a day, 12 / 4 = 3 days.However, they need to include time for breaks and lunch. Every hour they want to include a 10-minute break, so 12 total hours x 10 minutes = 120 extra minutes for breaks.They also want to include 3 10-minute snack breaks, 3 x 10 minutes = 30 minutes.And they want to include 30 minutes for lunch each day, so 120 minutes for breaks + 30 minutes for snack breaks + 30 minutes for lunch = 180 minutes, or 180 / 60 minutes per hour = 3 extra hours.So Angelo and Melanie want to plan 12 hours to study + 3 hours of breaks = 15 hours total.They want to study no more than 4 hours each day, 15 hours / 4 hours each day = 3.75They will need to plan to study 4 days to allow for all the time they need.The answer is 4Question: Mark's basketball team scores 252 pointers, 83 pointers and10 free throws. Their opponents score double the 2 pointers but half the 3 pointers and free throws. What's the total number of points scored by both teams added together?Let's think step by stepMark's team scores 25 2 pointers, meaning they scored 25*2= 50 points in 2 pointers.His team also scores 6 3 pointers, meaning they scored 8*3= 24 points in 3 pointersThey scored 10 free throws, and free throws count as one point so they scored 10*1=10 points in free throws.All together his team scored 50+24+10= 84 pointsMark's opponents scored double his team's number of 2 pointers, meaning they scored 50*2=100 points in 2 pointers.His opponents scored half his team's number of 3 pointers, meaning they scored 24/2= 12 points in 3 pointers.They also scored half Mark's team's points in free throws, meaning they scored 10/2=5 points in free throws.All together Mark's opponents scored 100+12+5=117 pointsThe total score for the game is both team's scores added together, so it is 84+117=201 pointsThe answer is 201Question: Bella has two times as many marbles as frisbees. She also has 20 more frisbees than deck cards. If she buys 2/5times more of each item, what would be the total number of the items she will have if she currently has 60 marbles?Let's think step by stepWhen Bella buys 2/5 times more marbles, she'll have increased the number of marbles by 2/5*60 = 24The total number of marbles she'll have is 60+24 = 84If Bella currently has 60 marbles, and she has two times as many marbles as frisbees, she has 60/2 = 30 frisbees.If Bella buys 2/5 times more frisbees, she'll have 2/5*30 = 12 more frisbees.The total number of frisbees she'll have will increase to 30+12 = 42Bella also has 20 more frisbees than deck cards, meaning she has 30-20 = 10 deck cardsIf she buys 2/5 times more deck cards, she'll have 2/5*10 = 4 more deck cards.The total number of deck cards she'll have is 10+4 = 14Together, Bella will have a total of 14+42+84 = 140 itemsThe answer is 140Question: A group of 4 fruit baskets contains 9 apples, 15 oranges, and 14 bananas in the first three baskets and 2 less of each fruit in the fourth basket. How many fruits are there?Let's think step by stepFor the first three baskets, the number of apples and oranges in one basket is 9+15=24In total, together with bananas, the number of fruits in one basket is 24+14=38for the first three baskets.Since there are three baskets each having 38 fruits, there are 3*38=114 fruits in the first three baskets.The number of apples in the fourth basket is 9-2=7There are also 15-2=13 oranges in the fourth basketThe combined number of oranges and apples in the fourth basket is 13+7=20The fourth basket also contains 14-2=12 bananas.In total, the fourth basket has 20+12=32 fruits.The four baskets together have 32+114=146 fruits.The answer is 146Question: Jared is trying to increase his typing speed. He starts with 47 words per minute (WPM). After some lessons the nexttime he tests his typing speed it has increased to 52 WPM. If he continues to increase his typing speed once more by 5 words, what will be the average of the three measurements?Let's think step by stepAnswer:===================Below is the answer====================>>>2 = 12 oranges in the fourth basketThe combined number of oranges and apples in the fourth basket is 13+7=20The fourth basket also contains 14-2=12 bananas.In total, the fourth basket has 20+12=32 fruits.The four baskets together have 32+114=146 fruits. # Read Less## About this GRE PrepPal#### The best way to practiceUnderstanding why you got a question right or wrong is the key to effective learning. With detailed explanations for every question, you'll always know where you stand.#### See every wrong answerWhen you get a question wrong, we'll immediately show you the content you've forgotten and offer clear, step-by-step guidance to help you learn.#### Make the most of your study timeOn average, it takes 150 hours and3,000 practice questions to achieve a 90th percentile score on the GRE. That's why we built our GRE prep course to be as efficient as possible.#### Track your progress every step of the wayAt the end of each practice session, we'll identify your weaknesses andtell you exactly what you need to study to improve. It's a smarter, more focused way to master the GRE.#### Learn from the very bestWork with an elite test prep tutor from one of America's top universities. With PrepScholar, you get the combined brainpower of the entire team.#### Get into your dream schoolPrepScholar has helped thousands of students improve their scores and get into their dream schools. We guarantee you'll get a higher score and get into the school of your choice.## Frequently Asked Questions#### How does PrepScholar's GRE prep course work?Our GRE prep course is designed to be efficient and comprehensive. We've broken down the GRE into its key components and developed the best strategies and most effective learning methods for each area, saving you hours of wasted time.#### How is PrepScholar's GRE prep course different from other courses?PrepScholar is the world's most advanced, efficient GRE prep course. Rather than wasting your time on repetitive problems and out-of-date strategies, PrepScholar constantly analyzes your progress and creates a study plan customized to your particular strengths and weaknesses.#### How will PrepScholar's GRE prep course help me improve my score?PrepScholar is the only GRE prep system that adapts to your strengths and weaknesses, helping you study10times more efficiently than with other prep courses. Rather than wasting your time on repetitive problems and out-of-date strategies, PrepScholar constantly analyzes your progress and creates a study plan customized to your particular strengths and weaknesses.#### How much time should I plan on studying for the GRE?The amount of time it takes to prepare for the GRE depends on the score you need to achieve your goals. To improve your GRE score, you'll need to put in hard work and targeted preparation. On average, it takes 150 hours and 3,000 practice questions to achieve a 90th percentile score on the GRE.#### How is my PrepScholar course customized for me?Every PrepScholar GRE course is customized to your particular strengths and weaknesses. Your course will constantly analyze your progress and create a study plan tailored to your particular area of need.#### How do I access my GRE prep course?You can access your GRE prep course on your computer, tablet, or smartphone.#### How does the higher score guarantee work?PrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free.#### What's included with my GRE prep course?Every PrepScholar GRE course comes with everything you need to raise your score, including 3,000+ practice questions, 100+ lessons, 150+ hours of learning, and10+ GRE strategy guides.#### How do I pay for my GRE prep course?You can pay for your GRE prep course with a credit card or with PayPal. If you don't love your course, we'll refund your tuition no questions asked.#### How do I know PrepScholar is right for me?If you're serious about getting a great GRE score, PrepScholar is the world's most efficient prep course. We guarantee you'll get a higher score and get into the school of your choice. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice. If you don't, we'll refund your tuition or let you prep again for free. # Read Less## Get a higher score guaranteedPrepScholar guarantees you'll get a higher score and get into the school of your choice

    评测速度:

    >>>Time cost:488.51s, token num:1978, infer speed(token/s):4.05

    点击 阅读全文 ,直达模型卡片