{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Ch5 配套 notebook：預訓練模型的能與不能（mock-first）\n",
    "\n",
    "**不安裝任何 MLIP 套件**，用 numpy 親手重現 Ch5 的三個核心論點：\n",
    "\n",
    "1. **誤差量級對照**：分布內 vs 分布外 vs 你要分辨的訊號\n",
    "2. **Softening 的成因**：只用「近平衡快照」訓練 → 勢能面的牆壁塌給你看\n",
    "3. **Zero-shot vs fine-tune**：為什麼少量資料微調贏過從頭訓練\n",
    "\n",
    "最後附 Colab 真跑開關（CHGNet / MatterSim）。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 1. 誤差量級對照表：一張圖就是結論\n",
    "\n",
    "數字來源（皆為官方文件實測值，見 `docs/repo-analysis/`）：MatterSim MODEL_CARD 的分布內 MAE ≈ 0.03 eV/atom、分布外（Random-TP 測試集）≈ 0.2 eV/atom；HER 火山圖上你要分辨的活性差 ≈ 0.1–0.2 eV。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "rng = np.random.default_rng(20260611)\n",
    "\n",
    "labels = [\"in-distribution\\nMAE\", \"signal to resolve\\n(volcano)\", \"out-of-distribution\\nMAE\"]\n",
    "vals   = [0.03, 0.10, 0.20]\n",
    "colors = [\"#1565C0\", \"#2e9e6b\", \"#E07B39\"]\n",
    "\n",
    "plt.figure(figsize=(6, 3.4))\n",
    "bars = plt.bar(labels, vals, color=colors, width=0.55)\n",
    "for b, v in zip(bars, vals):\n",
    "    plt.text(b.get_x()+b.get_width()/2, v+0.005, f\"{v} eV\", ha=\"center\", fontweight=\"bold\")\n",
    "plt.ylabel(\"energy scale (eV)\")\n",
    "plt.title(\"Why zero-shot adsorption energies can't read a volcano plot\")\n",
    "plt.tight_layout(); plt.show()\n",
    "\n",
    "print(\"結論：分布外誤差(0.2) > 要分辨的訊號(0.1) → zero-shot 吸附能讀不出火山圖的結構。\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2. Softening：親手把勢能面的牆壁訓塌\n",
    "\n",
    "**機制**（Deng et al., npj Comput. Mater. 2025 的玩具版）：MPtrj 這類訓練資料是「鬆弛軌跡的快照」——絕大多數樣本擠在能量谷底附近。模型沒看過勢能面的牆壁（短距離斥力、解離極限），就把牆壁「畫軟」。\n",
    "\n",
    "實驗設計：真理是一條 1D Morse 勢能；訓練樣本只從 **r ∈ [0.95, 1.30] r₀（近平衡）** 抽；然後要求模型預測整條曲線。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": "De, a, r0 = 1.0, 1.7, 1.1\nV = lambda r: De * (1 - np.exp(-a * (r - r0)))**2   # 真理：Morse\n\n# RBF 特徵 + ridge（跟 ch4 notebook 同一套迷你 MLIP）\ncenters = np.linspace(0.7, 2.6, 24); width = 0.12\nfeat = lambda r: np.exp(-((np.atleast_1d(r)[:, None] - centers[None, :]) / width)**2)\n\ndef fit_ridge(r_tr, y_tr, lam=1e-6, w_prior=None):\n    A = np.hstack([feat(r_tr), np.ones((len(r_tr), 1))])\n    n = A.shape[1]\n    w0 = np.zeros(n) if w_prior is None else w_prior\n    return np.linalg.solve(A.T@A + lam*np.eye(n), A.T@y_tr + lam*w0)\n\npredict = lambda w, r: np.hstack([feat(r), np.ones((len(np.atleast_1d(r)), 1))]) @ w\n\n# 只用近平衡樣本訓練（模擬 MPtrj 的取樣偏差）\nr_train = rng.uniform(0.95*r0, 1.30*r0, 60)\nw_soft  = fit_ridge(r_train, V(r_train))\n\nr_all = np.linspace(0.75, 2.5, 300)\nplt.figure(figsize=(6.6, 3.8))\nplt.plot(r_all, V(r_all), \"k-\", lw=2, label=\"true PES (Morse)\")\nplt.plot(r_all, predict(w_soft, r_all), \"--\", color=\"#E07B39\", lw=2.2, label=\"model (near-eq. training)\")\nplt.axvspan(0.95*r0, 1.30*r0, color=\"#1565C0\", alpha=0.12, label=\"training window\")\nplt.ylim(-0.3, 2.2); plt.xlabel(\"r (Å)\"); plt.ylabel(\"V (eV)\")\nplt.title(\"Softening: walls collapse outside the training window\")\nplt.legend(fontsize=9); plt.tight_layout(); plt.show()\n\nfor r_q, name in [(1.05, \"近平衡（分布內）\"), (0.85, \"斥力牆（分布外）\"), (2.2, \"解離區（分布外）\")]:\n    err = abs(predict(w_soft, [r_q])[0] - V(r_q))\n    print(f\"  r={r_q:>4} Å {name:14s} |誤差| = {err:.3f} eV\")"
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "牆壁與解離區的誤差比谷底大 1–2 個數量級——而且模型**畫得很平滑、看起來毫無異狀**。把「r 軸」換成「從 bulk 到表面/缺陷/過渡態的構型座標」，這就是 universal MLIP 的 softening：表面能、能障、吸附能被系統性低估。"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 3. Zero-shot vs fine-tune：補習為什麼贏過重讀\n",
    "\n",
    "設定：「預訓練模型」在通用體系（原本的 Morse）的大量資料上學好；「你的體系」是參數不同的另一條 Morse（更深、平衡距離更長——想像換了一種金屬）。比較兩種策略在只有 N 筆 DFT 資料時的表現：\n",
    "\n",
    "- **scratch**：只用 N 筆資料從頭擬合\n",
    "- **fine-tune**：同樣 N 筆資料，但把解「錨定」在預訓練權重附近（ridge toward w_pre——multihead replay 的數學雛形）"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# 預訓練：在通用體系上充分學習\n",
    "r_big = rng.uniform(0.8, 2.4, 600)\n",
    "w_pre = fit_ridge(r_big, V(r_big), lam=1e-6)\n",
    "\n",
    "# 你的體系：參數不同的 Morse\n",
    "De2, a2, r02 = 1.4, 1.5, 1.25\n",
    "V2 = lambda r: De2 * (1 - np.exp(-a2 * (r - r02)))**2\n",
    "r_test = np.linspace(0.85, 2.3, 200)\n",
    "\n",
    "Ns, n_trial = [3, 5, 10, 20, 40, 80], 30\n",
    "err_scratch, err_ft = [], []\n",
    "for N in Ns:\n",
    "    es, ef = [], []\n",
    "    for _ in range(n_trial):\n",
    "        r_n = rng.uniform(0.9, 2.0, N)\n",
    "        w_s = fit_ridge(r_n, V2(r_n), lam=1e-4)                 # 從頭\n",
    "        w_f = fit_ridge(r_n, V2(r_n), lam=0.5, w_prior=w_pre)   # 微調（錨定預訓練權重）\n",
    "        es.append(np.sqrt(np.mean((predict(w_s, r_test) - V2(r_test))**2)))\n",
    "        ef.append(np.sqrt(np.mean((predict(w_f, r_test) - V2(r_test))**2)))\n",
    "    err_scratch.append(np.mean(es)); err_ft.append(np.mean(ef))\n",
    "\n",
    "zero_shot = np.sqrt(np.mean((predict(w_pre, r_test) - V2(r_test))**2))\n",
    "\n",
    "plt.figure(figsize=(6.2, 3.8))\n",
    "plt.loglog(Ns, err_scratch, \"o-\", color=\"#9aa7c0\", label=\"train from scratch\")\n",
    "plt.loglog(Ns, err_ft, \"s-\", color=\"#1565C0\", label=\"fine-tune (anchored to pretrained)\")\n",
    "plt.axhline(zero_shot, color=\"#E07B39\", ls=\"--\", label=f\"zero-shot (N=0): {zero_shot:.2f}\")\n",
    "plt.xlabel(\"N training points on YOUR system\"); plt.ylabel(\"test RMSE (eV)\")\n",
    "plt.title(\"Fine-tuning wins at small N\"); plt.legend(fontsize=9)\n",
    "plt.grid(True, which=\"both\", alpha=0.3); plt.tight_layout(); plt.show()\n",
    "\n",
    "print(f\"zero-shot RMSE = {zero_shot:.3f} eV（不可接受）\")\n",
    "for N, es, ef in zip(Ns, err_scratch, err_ft):\n",
    "    print(f\"  N={N:>3}: scratch {es:.3f}  vs  fine-tune {ef:.3f} eV\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "三個檔位在同一張圖上：**zero-shot**（橙虛線）誤差大但成本零；**fine-tune** 用幾筆資料就大幅改善；**scratch** 要很多資料才追上。這就是 Ch5 的三檔位、也是 Ch6 工作流選 fine-tune 的數學理由。\n",
    "\n",
    "（真實 MLIP 的 fine-tune 比「ridge 錨定」精巧——MACE 用 multihead replay 防遺忘——但「把解拉在預訓練權重附近」這個核心直覺是同一個。）"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 4.（Colab 真跑開關）CHGNet / MatterSim\n",
    "\n",
    "```python\n",
    "# === CHGNet：五行得到 e/f/s/magmom（CPU 可跑，權重內建）===\n",
    "# !pip install chgnet\n",
    "# from chgnet.model import CHGNet\n",
    "# from pymatgen.core import Structure, Lattice\n",
    "# model = CHGNet.load()\n",
    "# s = Structure(Lattice.cubic(4.21), [\"Mg\", \"O\"], [[0,0,0],[0.5,0.5,0.5]])\n",
    "# print(model.predict_structure(s))   # 注意輸出裡的 magmom——CHGNet 的獨家\n",
    "\n",
    "# === MatterSim：ASE calculator（Colab 需 Python>=3.12，先 !python --version）===\n",
    "# !pip install mattersim\n",
    "# from mattersim.forcefield import MatterSimCalculator\n",
    "# from ase.build import bulk\n",
    "# atoms = bulk(\"Si\"); atoms.calc = MatterSimCalculator()\n",
    "# print(atoms.get_potential_energy())\n",
    "```\n",
    "\n",
    "**動手任務**：用 CHGNet relax 一個你算過的結構，與你的 VASP 結果比鍵長/體積——這就是 Ch5 練習 2。\n",
    "\n",
    "## 我學完這份 notebook 應該能說出什麼\n",
    "\n",
    "1. 分布外誤差（~0.2 eV/atom）大於火山圖要分辨的訊號（~0.1 eV）→ zero-shot 吸附能不能拿來做定量結論。\n",
    "2. softening 的成因是訓練資料擠在谷底：牆沒人教，模型就把牆畫軟——而且輸出看起來毫無異狀。\n",
    "3. zero-shot / fine-tune / scratch 三檔位的誤差-成本曲線，以及為什麼研究工作流選 fine-tune。"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "name": "python",
   "version": "3.x"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}