zxmengde Skills

zxmengde / ai-research-01-model-architecture-torchtitan

Facilitates distributed LLM pretraining in PyTorch with torchtitan, using advanced parallelism to train efficiently across multiple GPUs.

openclaw

100

96

zxmengde / ai-research-02-tokenization-sentencepiece

Provides a language-independent tokenizer for multilingual models, supporting BPE and Unigram algorithms for efficient text processing.

claude-code, cursor

100

99

zxmengde / ai-research-04-mechanistic-interpretability-nnsight

Guides users in interpreting neural network internals using nnsight, enabling experiments on large models without local GPU resources.

openclaw

100

88

zxmengde / ai-research-04-mechanistic-interpretability-pyvene

Guides users in performing causal interventions on PyTorch models using pyvene's framework for reproducible experiments.

openclaw

100

90

zxmengde / ai-research-04-mechanistic-interpretability-saelens

Guides training and analysis of Sparse Autoencoders for interpretable feature extraction in neural networks.

openclaw

100

93

zxmengde / ai-research-04-mechanistic-interpretability-transformer-lens

Guides mechanistic interpretability research using TransformerLens for inspecting transformer internals and studying attention patterns.

openclaw

100

91

zxmengde / ai-research-06-post-training-miles

Guides enterprise-grade RL training using miles for large MoE models, optimizing performance and stability with low-precision techniques.

openclaw

100

99

zxmengde / ai-research-06-post-training-slime

Guides LLM post-training with RL using the slime framework, integrating Megatron-LM for efficient model training and data generation.

openclaw

100

68

zxmengde / ai-research-08-distributed-training-megatron-core

Facilitates training of large language models using NVIDIA Megatron-Core with advanced parallelism for optimal GPU efficiency.

openclaw

100

zxmengde / ai-research-08-distributed-training-pytorch-fsdp2

Adds FSDP2 to PyTorch training scripts for efficient distributed training of models too large for a single GPU.

openclaw

100

zxmengde / ai-research-08-distributed-training-pytorch-lightning

Facilitates distributed training in PyTorch with minimal boilerplate, enabling seamless scaling from laptops to supercomputers.

openclaw

100

98

zxmengde / ai-research-08-distributed-training-ray-train

Facilitates distributed training of machine learning models across clusters, optimizing performance with Ray's orchestration capabilities.

openclaw

100

99

zxmengde / ai-research-09-infrastructure-modal

Enables seamless deployment of ML models on a serverless GPU cloud platform, optimizing for performance and cost efficiency.

openclaw

100

92

zxmengde / ai-research-09-infrastructure-skypilot

Facilitates multi-cloud orchestration for ML workloads with automatic cost optimization and efficient resource management.

openclaw

100

88

zxmengde / ai-research-10-optimization-awq

Optimizes large language models with activation-aware weight quantization for faster inference and minimal accuracy loss.

openclaw

100

99

zxmengde / ai-research-10-optimization-bitsandbytes

Quantizes large language models to 8-bit or 4-bit with bitsandbytes, significantly reducing memory usage while largely preserving accuracy.

openclaw

100

99

zxmengde / ai-research-10-optimization-flash-attention

Optimizes transformer attention in PyTorch models with Flash Attention, improving speed and reducing memory usage.

openclaw

100

98

zxmengde / ai-research-10-optimization-hqq

Enables fast, calibration-free quantization of LLMs to low-bit precision, enhancing deployment efficiency with HuggingFace Transformers.

openclaw

100

99

zxmengde / ai-research-10-optimization-ml-training-recipes

Provides optimized PyTorch training recipes for various domains, enhancing model training efficiency and performance.

openclaw

100

zxmengde / ai-research-11-evaluation-bigcode-evaluation-harness

Evaluates code generation models using multiple benchmarks to assess coding abilities and quality across various programming languages.

openclaw

100

93