Competition Rules
- Submissions must be reproducible from the initial model through merging and fine-tuning. Winning models, along with all associated code and data, must be open-sourced and made public after the competition.
- Submissions must NOT use any copyrighted or proprietary data, code, or closed-source content. Using data or content that violates the service contracts or trade secrets of any entity is not allowed.
- Submissions must take less than 1 hour to merge/fine-tune and evaluate on a single Nvidia A6000 (48 GB) or equivalent resource.
- Each team can make unlimited submissions during the competition; however, submissions will be evaluated once every week. We will further update the instructions for model submissions as the competition progresses.
- This competition will be run under the honor system. Teams that submit very similar results or copy another team’s solution will be disqualified. Violating the spirit of the honor system or taking unfair advantage of the community, even when not against an explicit rule, may result in disqualification and ineligibility for prizes.
Allowed Models
Participants in the LLM-Merging competition may use any publicly available model weights that can be downloaded and that fulfill the conditions below. Specifically, a model must satisfy the following criteria:
- The model is publicly available on Hugging Face.
- The model was uploaded before May 31st, 2024.
- The model has no more than 8 billion parameters.
This flexibility aims to encourage creativity and innovation in model merging techniques. To help participants get started, we have provided a list of recommended models.
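As a rough illustration, the three criteria under Allowed Models can be expressed as a small pre-submission check. This is only a sketch under stated assumptions: the `ModelRecord` type, its field names, and the example repository id are hypothetical, the cutoff is treated as exclusive ("before" May 31st), and the parameter limit is compared against the raw parameter count. The rules do not say how the organizers measure either value, so their interpretation takes precedence.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Assumed thresholds, read directly from the criteria in the rules.
# How the organizers measure them (exact counts, time zones) is not specified.
UPLOAD_CUTOFF = date(2024, 5, 31)   # model must be uploaded before this date
MAX_PARAMETERS = 8_000_000_000      # "not larger than 8 billion"


@dataclass
class ModelRecord:
    """Hypothetical metadata record; the field names are illustrative only."""
    repo_id: str
    is_public_on_hf: bool
    upload_date: date
    num_parameters: int


def eligibility_problems(model: ModelRecord) -> List[str]:
    """Return the criteria a model fails; an empty list means it passes all three."""
    problems = []
    if not model.is_public_on_hf:
        problems.append("not publicly available on Hugging Face")
    if model.upload_date >= UPLOAD_CUTOFF:
        problems.append("uploaded on or after May 31st 2024")
    if model.num_parameters > MAX_PARAMETERS:
        problems.append("more than 8 billion parameters")
    return problems


# Usage with a made-up repository id and made-up metadata.
candidate = ModelRecord("example-org/example-7b", True, date(2024, 3, 1), 7_000_000_000)
print(eligibility_problems(candidate))  # → []
```

Returning a list of failed criteria, rather than a single boolean, makes it easier to see why a candidate model would be rejected. Note that a model's nominal size (for example "8B") may differ from its exact parameter count, so borderline cases should be confirmed with the organizers.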
1. Base Model suggestions
- Llama 2 Family
- Llama 3 Family
- Mistral Family
- FLAN T5 Family
- Gemma Family
2. Finetuned Model example
All adapters under
- predibase
- magicoder
- conllpp
- dbpedia
- cnn
- agnews_explained
- gsm8k
- customer_support
- glue_qnli
- glue_mnli
- glue_sst2
- glue_cola
- glue_stsb
- glue_mrpc
- glue_qqp
- tldr_headline_gen
- tldr_content_gen
- e2e_nlg
- wikisql
- hellaswag
- hellaswag_processed
- legal
- jigsaw
- bc5cdr
- covid
- drop
- drop_explained
- viggo