{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Ch5 配套 notebook：預訓練模型的能與不能（mock-first）\n", "\n", "**不安裝任何 MLIP 套件**，用 numpy 親手重現 Ch5 的三個核心論點：\n", "\n", "1. **誤差量級對照**：分布內 vs 分布外 vs 你要分辨的訊號\n", "2. **Softening 的成因**：只用「近平衡快照」訓練 → 勢能面的牆壁塌給你看\n", "3. **Zero-shot vs fine-tune**：為什麼少量資料微調贏過從頭訓練\n", "\n", "最後附 Colab 真跑開關（CHGNet / MatterSim）。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 1. 誤差量級對照表：一張圖就是結論\n", "\n", "數字來源（皆為官方文件實測值，見 `docs/repo-analysis/`）：MatterSim MODEL_CARD 的分布內 MAE ≈ 0.03 eV/atom、分布外（Random-TP 測試集）≈ 0.2 eV/atom；HER 火山圖上你要分辨的活性差 ≈ 0.1–0.2 eV。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "rng = np.random.default_rng(20260611)\n", "\n", "labels = [\"in-distribution\\nMAE\", \"signal to resolve\\n(volcano)\", \"out-of-distribution\\nMAE\"]\n", "vals = [0.03, 0.10, 0.20]\n", "colors = [\"#1565C0\", \"#2e9e6b\", \"#E07B39\"]\n", "\n", "plt.figure(figsize=(6, 3.4))\n", "bars = plt.bar(labels, vals, color=colors, width=0.55)\n", "for b, v in zip(bars, vals):\n", " plt.text(b.get_x()+b.get_width()/2, v+0.005, f\"{v} eV\", ha=\"center\", fontweight=\"bold\")\n", "plt.ylabel(\"energy scale (eV)\")\n", "plt.title(\"Why zero-shot adsorption energies can't read a volcano plot\")\n", "plt.tight_layout(); plt.show()\n", "\n", "print(\"結論：分布外誤差(0.2) > 要分辨的訊號(0.1) → zero-shot 吸附能讀不出火山圖的結構。\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 2. Softening：親手把勢能面的牆壁訓塌\n", "\n", "**機制**（Deng et al., npj Comput. Mater. 2025 的玩具版）：MPtrj 這類訓練資料是「鬆弛軌跡的快照」——絕大多數樣本擠在能量谷底附近。模型沒看過勢能面的牆壁（短距離斥力、解離極限），就把牆壁「畫軟」。\n", "\n", "實驗設計：真理是一條 1D Morse 勢能；訓練樣本只從 **r ∈ [0.95, 1.30] r₀（近平衡）** 抽；然後要求模型預測整條曲線。" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": "De, a, r0 = 1.0, 1.7, 1.1\nV = lambda r: De * (1 - np.exp(-a * (r - r0)))**2 # 真理：Morse\n\n# RBF 特徵 + ridge（跟 ch4 notebook 同一套迷你 MLIP）\ncenters = np.linspace(0.7, 2.6, 24); width = 0.12\nfeat = lambda r: np.exp(-((np.atleast_1d(r)[:, None] - centers[None, :]) / width)**2)\n\ndef fit_ridge(r_tr, y_tr, lam=1e-6, w_prior=None):\n A = np.hstack([feat(r_tr), np.ones((len(r_tr), 1))])\n n = A.shape[1]\n w0 = np.zeros(n) if w_prior is None else w_prior\n return np.linalg.solve(A.T@A + lam*np.eye(n), A.T@y_tr + lam*w0)\n\npredict = lambda w, r: np.hstack([feat(r), np.ones((len(np.atleast_1d(r)), 1))]) @ w\n\n# 只用近平衡樣本訓練（模擬 MPtrj 的取樣偏差）\nr_train = rng.uniform(0.95*r0, 1.30*r0, 60)\nw_soft = fit_ridge(r_train, V(r_train))\n\nr_all = np.linspace(0.75, 2.5, 300)\nplt.figure(figsize=(6.6, 3.8))\nplt.plot(r_all, V(r_all), \"k-\", lw=2, label=\"true PES (Morse)\")\nplt.plot(r_all, predict(w_soft, r_all), \"--\", color=\"#E07B39\", lw=2.2, label=\"model (near-eq. training)\")\nplt.axvspan(0.95*r0, 1.30*r0, color=\"#1565C0\", alpha=0.12, label=\"training window\")\nplt.ylim(-0.3, 2.2); plt.xlabel(\"r (Å)\"); plt.ylabel(\"V (eV)\")\nplt.title(\"Softening: walls collapse outside the training window\")\nplt.legend(fontsize=9); plt.tight_layout(); plt.show()\n\nfor r_q, name in [(1.05, \"近平衡（分布內）\"), (0.85, \"斥力牆（分布外）\"), (2.2, \"解離區（分布外）\")]:\n err = abs(predict(w_soft, [r_q])[0] - V(r_q))\n print(f\" r={r_q:>4} Å {name:14s} |誤差| = {err:.3f} eV\")" }, { "cell_type": "markdown", "metadata": {}, "source": [ "牆壁與解離區的誤差比谷底大 1–2 個數量級——而且模型**畫得很平滑、看起來毫無異狀**。把「r 軸」換成「從 bulk 到表面/缺陷/過渡態的構型座標」，這就是 universal MLIP 的 softening：表面能、能障、吸附能被系統性低估。" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 3. Zero-shot vs fine-tune：補習為什麼贏過重讀\n", "\n", "設定：「預訓練模型」在通用體系（原本的 Morse）的大量資料上學好；「你的體系」是參數不同的另一條 Morse（更深、平衡距離更長——想像換了一種金屬）。比較兩種策略在只有 N 筆 DFT 資料時的表現：\n", "\n", "- **scratch**：只用 N 筆資料從頭擬合\n", "- **fine-tune**：同樣 N 筆資料，但把解「錨定」在預訓練權重附近（ridge toward w_pre——multihead replay 的數學雛形）" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# 預訓練：在通用體系上充分學習\n", "r_big = rng.uniform(0.8, 2.4, 600)\n", "w_pre = fit_ridge(r_big, V(r_big), lam=1e-6)\n", "\n", "# 你的體系：參數不同的 Morse\n", "De2, a2, r02 = 1.4, 1.5, 1.25\n", "V2 = lambda r: De2 * (1 - np.exp(-a2 * (r - r02)))**2\n", "r_test = np.linspace(0.85, 2.3, 200)\n", "\n", "Ns, n_trial = [3, 5, 10, 20, 40, 80], 30\n", "err_scratch, err_ft = [], []\n", "for N in Ns:\n", " es, ef = [], []\n", " for _ in range(n_trial):\n", " r_n = rng.uniform(0.9, 2.0, N)\n", " w_s = fit_ridge(r_n, V2(r_n), lam=1e-4) # 從頭\n", " w_f = fit_ridge(r_n, V2(r_n), lam=0.5, w_prior=w_pre) # 微調（錨定預訓練權重）\n", " es.append(np.sqrt(np.mean((predict(w_s, r_test) - V2(r_test))**2)))\n", " ef.append(np.sqrt(np.mean((predict(w_f, r_test) - V2(r_test))**2)))\n", " err_scratch.append(np.mean(es)); err_ft.append(np.mean(ef))\n", "\n", "zero_shot = np.sqrt(np.mean((predict(w_pre, r_test) - V2(r_test))**2))\n", "\n", "plt.figure(figsize=(6.2, 3.8))\n", "plt.loglog(Ns, err_scratch, \"o-\", color=\"#9aa7c0\", label=\"train from scratch\")\n", "plt.loglog(Ns, err_ft, \"s-\", color=\"#1565C0\", label=\"fine-tune (anchored to pretrained)\")\n", "plt.axhline(zero_shot, color=\"#E07B39\", ls=\"--\", label=f\"zero-shot (N=0): {zero_shot:.2f}\")\n", "plt.xlabel(\"N training points on YOUR system\"); plt.ylabel(\"test RMSE (eV)\")\n", "plt.title(\"Fine-tuning wins at small N\"); plt.legend(fontsize=9)\n", "plt.grid(True, which=\"both\", alpha=0.3); plt.tight_layout(); plt.show()\n", "\n", "print(f\"zero-shot RMSE = {zero_shot:.3f} eV（不可接受）\")\n", "for N, es, ef in zip(Ns, err_scratch, err_ft):\n", " print(f\" N={N:>3}: scratch {es:.3f} vs fine-tune {ef:.3f} eV\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "三個檔位在同一張圖上：**zero-shot**（橙虛線）誤差大但成本零；**fine-tune** 用幾筆資料就大幅改善；**scratch** 要很多資料才追上。這就是 Ch5 的三檔位、也是 Ch6 工作流選 fine-tune 的數學理由。\n", "\n", "（真實 MLIP 的 fine-tune 比「ridge 錨定」精巧——MACE 用 multihead replay 防遺忘——但「把解拉在預訓練權重附近」這個核心直覺是同一個。）" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## 4.（Colab 真跑開關）CHGNet / MatterSim\n", "\n", "```python\n", "# === CHGNet：五行得到 e/f/s/magmom（CPU 可跑，權重內建）===\n", "# !pip install chgnet\n", "# from chgnet.model import CHGNet\n", "# from pymatgen.core import Structure, Lattice\n", "# model = CHGNet.load()\n", "# s = Structure(Lattice.cubic(4.21), [\"Mg\", \"O\"], [[0,0,0],[0.5,0.5,0.5]])\n", "# print(model.predict_structure(s)) # 注意輸出裡的 magmom——CHGNet 的獨家\n", "\n", "# === MatterSim：ASE calculator（Colab 需 Python>=3.12，先 !python --version）===\n", "# !pip install mattersim\n", "# from mattersim.forcefield import MatterSimCalculator\n", "# from ase.build import bulk\n", "# atoms = bulk(\"Si\"); atoms.calc = MatterSimCalculator()\n", "# print(atoms.get_potential_energy())\n", "```\n", "\n", "**動手任務**：用 CHGNet relax 一個你算過的結構，與你的 VASP 結果比鍵長/體積——這就是 Ch5 練習 2。\n", "\n", "## 我學完這份 notebook 應該能說出什麼\n", "\n", "1. 分布外誤差（~0.2 eV/atom）大於火山圖要分辨的訊號（~0.1 eV）→ zero-shot 吸附能不能拿來做定量結論。\n", "2. softening 的成因是訓練資料擠在谷底：牆沒人教，模型就把牆畫軟——而且輸出看起來毫無異狀。\n", "3. zero-shot / fine-tune / scratch 三檔位的誤差-成本曲線，以及為什麼研究工作流選 fine-tune。" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "name": "python", "version": "3.x" } }, "nbformat": 4, "nbformat_minor": 5 }