continual-pretraining

Here are 2 public repositories matching this topic...

hugoabonizio / clinical-protocols-br

Adapting Qwen2.5-14B to Brazilian SUS clinical guidelines. Includes 2 open benchmarks (HealthBench-BR, PCDT-QA) and 8 model checkpoints from the paper's ablations.

benchmark domain-adaptation brazilian-portuguese clinical-nlp medical-ai llm qwen grpo continual-pretraining

Updated May 7, 2026
Python

haolpku / Awesome-LLM-Data-Preparation

Data Preparation for Large Language Models: a curated companion to our JCST 2026 survey. Covers pre-training, continual pre-training, and post-training (SFT/RLHF/RLAIF) across collection, filtering, deduplication, generation, and evaluation.

nlp awesome deep-learning survey awesome-list data-preparation sft pretraining data-centric-ai large-language-models llm rlhf instruction-tuning rlaif continual-pretraining

Updated Apr 28, 2026
Shell
