{ "cells": [ { "cell_type": "markdown", "id": "473e4b0b", "metadata": { "collapsed": true, "jupyter": { "outputs_hidden": true } }, "source": [ "# Conformers and conformational fingerprints" ] }, { "cell_type": "markdown", "id": "9a96a2ea", "metadata": {}, "source": [ "Most molecular fingerprints operate on the topological molecular graph. It is a \"flat\" structure, i.e. it only takes atoms and bonds into consideration, without any spatial information. **Conformers** are 3D structures that approximate the arrangement of atoms in space, optimizing for stability and physical feasibility. Generating them is computationally hard and not always possible, but they carry more information than the graph alone.\n", "\n", "With scikit-fingerprints, generating conformers is very easy with the `ConformerGenerator` class ([docs](https://scikit-fingerprints.github.io/scikit-fingerprints/modules/generated/skfp.preprocessing.ConformerGenerator.html)). It uses the [ETKDGv3 algorithm](https://pubs.acs.org/doi/10.1021/acs.jcim.0c00025), a method based on distance geometry, stochastic optimization, and experimental torsion-angle preferences. It is relatively fast and produces good-quality conformers, and our implementation provides reasonable defaults for its parameters.\n", "\n", "`ConformerGenerator` takes `Mol` objects as inputs and outputs new `PropertyMol` objects with conformer information attached. The identifier of the selected conformer is saved in the `conf_id` property. All hydrogens are also explicitly added, as this is required for proper conformer generation. For safety, the input molecules are not modified in place.\n", "\n", "Let's generate conformers for molecules from the [beta-secretase 1 (BACE) dataset](https://doi.org/10.1021/acs.jcim.6b00290) from the MoleculeNet benchmark. Because conformer generation is computationally expensive, using `n_jobs=-1` is highly encouraged." 
] }, { "cell_type": "code", "execution_count": 1, "id": "bb06ca57-32ee-4c21-86dd-c69f65188acd", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:06:55.581197Z", "iopub.status.busy": "2025-01-19T18:06:55.581005Z", "iopub.status.idle": "2025-01-19T18:07:15.536629Z", "shell.execute_reply": "2025-01-19T18:07:15.536131Z", "shell.execute_reply.started": "2025-01-19T18:06:55.581182Z" } }, "outputs": [], "source": [ "from skfp.datasets.moleculenet import load_bace\n", "from skfp.preprocessing import ConformerGenerator, MolFromSmilesTransformer\n", "\n", "# load SMILES strings and labels of the BACE dataset\n", "smiles_list, y = load_bace()\n", "\n", "# parse SMILES into RDKit Mol objects\n", "mol_from_smiles = MolFromSmilesTransformer()\n", "mols = mol_from_smiles.transform(smiles_list)\n", "\n", "# generate conformers in parallel, using all available cores\n", "conf_gen = ConformerGenerator(n_jobs=-1)\n", "mols = conf_gen.transform(mols)" ] }, { "cell_type": "markdown", "id": "649cbcf4-42ee-43a0-875a-393eccc64928", "metadata": {}, "source": [ "Let's visualize an example molecule and its conformer, using [py3Dmol](https://pypi.org/project/py3Dmol/).\n", "\n", "Since the returned molecules are regular RDKit `Mol` objects, they are fully interoperable with other frameworks. scikit-fingerprints adds convenience and the speed of multiprocessing." 
] }, { "cell_type": "code", "execution_count": 2, "id": "a6c9c364-c721-4711-9663-d2cd89d39388", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:07:15.552095Z", "iopub.status.busy": "2025-01-19T18:07:15.551810Z", "iopub.status.idle": "2025-01-19T18:07:18.044913Z", "shell.execute_reply": "2025-01-19T18:07:18.044397Z", "shell.execute_reply.started": "2025-01-19T18:07:15.552074Z" } }, "outputs": [], "source": [ "!pip install --quiet py3Dmol" ] }, { "cell_type": "code", "execution_count": 3, "id": "9985ab66-5f1d-4d27-807a-5c44b5200456", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:07:18.045429Z", "iopub.status.busy": "2025-01-19T18:07:18.045319Z", "iopub.status.idle": "2025-01-19T18:07:18.056265Z", "shell.execute_reply": "2025-01-19T18:07:18.055998Z", "shell.execute_reply.started": "2025-01-19T18:07:18.045417Z" } }, "outputs": [ { "data": { "image/png": "iVBORw0KGgoAAAANSUhEUgAAAcIAAACWCAIAAADCEh9HAAAABmJLR0QA/wD/AP+gvaeTAAAgAElEQVR4nO3dd1xTV/sA8CcJSdiCosgSUBkCIqKoLFEcFPcoWAeiuKsFa51FRa1a1J+KWuvCgQhFZEjrwo2IggUEZFkUWYIgU0ASSHJ/f5yaNwVkJDcJyvl+3s/7OSS55x6qPjn3jOdQCIIADMMwTFhUaTcAwzDsy4bDKIZhmEhwGMUwDBMJDqMYhmEiwWEUwzBMJDiMYhiGiQSHUaxr4XK5Z86cQeXa2tqgoCDptgfD2oXDKNa1cDic06dPo/KHDx8uXbok3fZgWLtwGMUwDBOJjLQbgGHNvXr1aubMmQDAYrGk3RYMax8Oo1iXM3DgwMjISAB4+/bt0qVLpd0cDGsHfqjHMAwTCQ6jmNQkJSUdP34cle/fv49mk2RkZObOnYteVFRURE/3GNaV4Yd6TGqqqqpev36NymVlZQUFBQBAo9HWrVuXmJjo4eExeLBFUNBFqbYRw9qHwygmTeXl5WlpaQCAYigfnT74w4enDQ0KUmoXhnUCDqOYNOXk5ISHhwNARkaGpaXlo0ePTE1NGYxeYWHMly+ZTCZkZcHbtzB+vLQbimGfh8MoJk3W1tY7d+4EgJCQkJycHBcXl0GDDjs5zTt2DBgM2LYNcnIgORmHUaxLw1NMWFdRX19vZzfm779nGRmBgwPExMCrV9JuE4Z1AO6NYlJjYWGhoaGByqNHjx42bJiBgUFhIdBocPEiHDgAXl6wYoV024hh7cNhFJMaNTU1NTU1VNbU1EQFHR0oKQEAGDoUBgyAqCjQ0ZFWAzGsQ/BDPdZ1/fIL3Lol7UZgWHtwbxTrchQVYdYsAIAePSAgAADg2TMYMUK6jcKwz8K9UazLUVKChQv/LY8fDzExMGoUnDol1TZh2OfhMIp1debmQBCwdi2kpEi7KRjWGhxGsa7OxQWWLwcWC1xdobZW2q3BsBZwGMWEQRBEY2MjKvN4PH5ZTI4cAQsLyMmB5cvFeh8MEwYOo5gwXr9+PWfOHFR+9uzZ6tWrxXo7WVkIDQUlJQgJgQsXxHorkrHZb
H66gPr6+uLiYum2BxMHHEaxL4OBARw7BgCwZg1kZvKk3ZyOevXq1aZNm1A5MTHx119/lW57MHHAC54wIeXk5Ozfvx9aJGcSH3d3ePAAXr3KdXf3iIm5IS8vL1w9ubmgrw8UCrDZ8OED9O5NbjOxbgf3RjEh9erVy97e3t7e3sLCQmI3PX6cVVExOTExxsvLS+hKjIzg/HkAgMxM2LKFtLZ9zrNnz5YuXbp06dIDBw6I/WaYNOAwigmpZ8+e1tbW1tbWZmZmErupgoLslStX5OTk/P39hT57efBgOHcOysvJbdpnWVhY+Pr6+vr6LscTZF8pHEaxL4yZmdmhQ4cAYNWqVdnZ2R25pKYGEhLg/HnYtAkSE4FOh61bYeNGMTf0EwaDgbIH9OjRQ0K3xCQLj41iwhg4cOCVK1dQeeTIkW/fvvXw8Dh37pxk7r5y5crY2Njg4GBXV9eEhAQ5OTnBd4uLi7OysvLyTJOT+2ZnQ1bWv7lOkL59AQC++QbOnoWnT8XeVAaD0fvT4KusrGzPnj3FfktM4igEQUi7DdiXraqqasCAAVVVVYGBgQsWLJDMTevq6oYPH/7y5cuFCxfOnj07Ozs7Ozs7MzMzOzu7pqYGAOzsLjx+7I4+LC8PRkZgbAwmJuDsDN9/DwkJUFgIDg7g6Aj+/mJs57Nnzw4dOrRixYqxY8eK8TaYVOEwipHg/PnzHh4evXr1ysjIUFdXl8xNk5KSrK2tFRUVq6qqBF/v1avXoEGDxoz5UVV1lpERKCkBmw2ZmZCZCRkZoKwMsrIQEQEAcOQIlJTAhg0gJwfCTvu34/vvvz9x4sSGDRvQqgbsq4TDKEaOb775Jjo62tXV9fLlyx28xNLSMjk5uWW5g+rr6/v27ctisWxsbCwtLY2NjXV1dWVkZN69e5eVlZWdnU2jTbt61b2p6T9XqanB+/f/+zE7G6ZMgaFDITQUKJRO3b99jY2NmpqaFRUVqamp5ubmJNeOdR0EhpEhLy9PSUkJAMLDwzt4iZmZWavlDvr9998BwM7OjiAIDw8PXV1dyn8DoZ3dEgBCQ4MYP55Yvpzw8yPu3CFKSv5TSVYWoaJCABBbt3b2/u1Dw8eWlpbkV411JXiKCSOHrq7unj17PD09V69ePXbsWFVV1c99ksfj5eTkKCkpsdns06dP81/s7B1PnDgBAGvWrAGAsrKy/Px8JpNpaGhobGw86F+Whobw3/mn5oyNITQUJk2CPXvA0BDc3DrbirYEBAQAgLu7O5mVYl0PfqjHSMPj8caMGRMbG+vh4XH27FnBt4qLi5M+efr0aUVFxe7duwMDAw8fPow+sGHDhvT09I7f6/79++PGjdPQ0MjPz6fT6ZmZmQwGQ19fn0ajCdFyf39YtgwYDLh9GxwchKigFWVlZdra2gRBFBUVSWy8GJMK3BvFSEOlUv39/S0sLM6dOzdu3DgFBYXExMTExMS///67oqJC8JPa2toyMjJ0Ot3Z2Rm9srGTyzh/++03AFi5ciWdTgcAExMTUVq+dCmkpsJvv8G330J8PAwYIEpl/woKCmpqapo+fTqOoV8/aY8qYF+V0tLSSZMmAUCzYUoVFRVbW1tPT8/Q0NCST8OTQo+N5ufny8jIMBiMkmYjnSJoaiKcnAh5eeLbb8/W1NSIXiHaI9vxkWLsy4XDKCYMHx+ft2/fovLatWs/fPhw69atb7/9lsFgoLhJpVLHjRu3ZcuW8PDw/Pz8ViupqKhotdyuzZs3A8D8+fNF+RVaqq4mJk3yBoBJkyZxOBxRqnrx4gUA9OzZk8VikdU8rMvCYRQTxsyZM3NyclDZ3t5+7dq1KHrSaDQHBwcA6NOnT3x8/Pv370m/NYvF6tOnDwA8ffqU9MrfvHmDKv/xxx9FqWfdunUAsHr1arIahnVleGwUE9KrV69Q0vuPHz9+++23YWFh8+fPr62tNTIyiomJ0dDQ8Pf3X7JkCf8kerIEBweXlZVZWlqOGjWK3JoBQE9PLzw8f
Pz48YcPHzYwMFi1alW7l3A4nOLi4oKCgvz8/IKCgsLCwoKCgpiYGABYyD+ZD/uq4TCKCSkqKgqtaiorKzM1NS0oKKBQKFu2bKmsrAQADQ0NMd0XLRf19PQUU/12dnanTp1atGiRl5eXoaHhuHHj0OsNDQ0lJSW5ubm5ubnFxcX8ckFBAYfDaVYJg8GgUChJSUkj8MHQ3QAOo5iQfvrpp4EDBwLA48ePQWBOCeWvy8zMzMrKWrJkCbk3ffLkSWJiopqaGv8IE3Fwd3fPysrat2/f9OnTR40a9e7du4KCgtrPHKdHpVK1tLT09PR0dHR0dHT69eunq6v7zz//rF+/ft26dTY2NkOGDBFfU7GuAIdRjGTW1tavX79esGBBWVkZ6ZWjdU7Lly+XlZUlvXJBe/fujYqKqq+vv3fvHnpFVlZWU1Ozf//+GhoaqKCjo0OlUjkcTkVFBeqcPn/+/Pr165MmTTpw4MDLly/PnDkzb968v//+W+hE/dgXAYdRTBgbN27kP7bv3r1bQUGB/xb/oZ70MFpSUhIeHk6j0ZYtW0ZuzS1xOJzy8vLy8vIjR45YWFiwWKyqqio07pmfn//ixYvCwsJmi2H59PT0AODo0aPx8fEvXrxYv349GojAvlY4jGLCEJzeGT16NL88aNCg+Ph4+DQ2Ki8v39DQINf2fswOO3XqVGNj4+zZs1GcEquoqKjy8nJLS0tPT08vL6+jR4+2/AyTydTS0uJ3TvkdVUNDQwCQlZUNDg4eMWLEiRMnHBwcxDoKgUkXDqPdA5cLPj6QmgoUCujrw/79wGSK4z4LFy5EPS8NDQ09Pb05c+Y4OTn5+fmJXnNTU5O/vz982kQvbufPnweAxYsXA4CZmdmIESP69euno6Ojq6uLCjo6Ou1uTzIzM9u3b5+np+fKlStHjRqlq6srgZZjUiDtFVeYRJw7R6xd+2951y5i/37x3apfv34AkJubm5KSQqfTqVTqo0ePRK82ODgYAExNTXk8nui1ta2oqIhGozEYDNHXvfJ4vOnTpwOAvb29iEv6sS4Ln8XUPURHw6JF/5Y9PCA6Wkz3IQiitLQUANTV1YcMGbJp0yYej7d06dKGhgZRqmWxWOj8pR9++IFCobx584ac5n5GQEAAl8udMWOG6IteKRTK+fPn+/XrFxsbu3v3blKah3U1OIx2DyzW/57imUxgscR0n8rKSjabraKiguamt23bZmZm9s8///j4+HSqnrq6usePH58+fdrLy8vOzk5FRaWgoEBWVnb69OlBQUEmJiYnT54Uz28AABAYGAifnuhFp6qqGhgYSKPRdu3a9eDBA1LqxLoWaXeHMYnYtYs4derfcng44eVFEASRkUEkJpJ7H7SXfNCgQfxXkpOTZWRkqFRqXFxcGxdWVlbevXt3//79c+fONTIyolL/8wUvIyODlvpPnjz5woULAECn0+/evUtu45FHjx4BgJaWFrnP4N7e3gCgra1dXl5OYrVYV4DDaPdQXU04OhI//URs2ULY2xMlJURxMaGjQygpEbdvk3if27dvA4Cjo6Pgixs2bAAAY2PjhoYG/ouVlZWxsbF+fn5ubm4mJibNMkLR6XQTExM3Nzc/P7/Y2Nj6+vo3b96gIzbXr1+Psur17Nnzn3/+IbHxCOqEent7k1ttU1OTjY0NAMyaNYvcmjGpw2G02+BwiMxMIi2NaGoiKiuJv/4iliwhAAgGgwgKIusmFy9ehBa5l1gsFsoH6urqun379ilTpmhqajZ7KlJQULCxsVm9evXZs2efP3/e2NjYsvLY2FiUQerUqVNo3sbIyKiqqoqsxhMEUVdXh45Cyc7OJrFapKCgAB2wfPLkyY5fdf369dTUVFQODQ19/fo16Q3DRITDaPfDZhO2tgSNRhw/Tvj4EAAEhULs20dK3fv27UMdRvTj27dvQ0NDPT09jYyMAEBwAamSkhLKQBoQEJCent7BJ+hz586hvuqNGzfQIXETJ05samoipfH8+kePH
k1Whc2g05lkZWX5kbFd27Zti4iIQOVVq1aRsuwBIxcOo92Sry9BoRAAxKZNxKFDBJVKANRv2yb6WiKUMc/FxcXNzQ2tfOJDj+0oc/OrV6+EvpeXlxcA9OrVKyYmBiW180JDvWSwt7cHgPPnz5NVYUsoz4CpqenHjx/b+FhjY2NhYWFGRgYOo10fPotJQkpLSxcuXBgdHQ0AGRkZe/fuDQoKkmaDAgJg2TJoaoJFi8DZmfjhhzmqqkwrq3PnzqFjOTpO8Jyle/fusdls/vl0SkpKI0eOHD9+/PDhw8ePH89gMFgsFkW0g4zRUqRr166ZmJgcPnx42rRpbDb7xIkTK1euFKVaAMjJyTEyMlJQUCguLkaP9uJQX19vZWWVlZW1cuXKbdu2lZSUoHxR/P+vqqoqKSnJz8/ncrl6enpubm7R0dFoV1haWlpAQACK9VgXIu043l2UlJRMmDABlV+8eDF37lzptocgCOLPPwl5eQKA4+Ly8Pp1RUVFAHBycqqtrW37utevX4eGhm7atGn8+PEtTwBVU1NzcnI6duxYWloal8tFl+Tk5ACAnp4eKQ3/8OGDmZkZADg7O6PtRnQ6/d69eyJW+/PPPwPAkiVLSGlkG54/f85gMNB/8M+h0Wiampq2tra4N9r14c2gklNYWHj8+HEAKCkpkXZbAABg6lR48ACmTTtUWBjxyy9Xr16dN29edHT02LFjr1+/jp6XEcH+ZkJCwvv37wWrUVFRMTU1HfZJXFzcL7/8EhYWJhgmiouLAUBLS4uUhispKf35558jR468efPm4MGDN2zYcODAARcXl/j4eAMDA+Hq5PF45C4XbdXq1asVFBTWrVunra2NTgzV09NTV1fX0tLq27evpqZm3759UVldXR2t+tq+fbv42oORQ9pxvLsoKSkZNmzYnTt37ty5c+bMmS7RGyUIgiAKMzPRIKaZmdmTJ09QGNLV1T116pSPj8+UKVPQMiNB/PPp0OxQswrZbHb//v0nTpwoOPqJtnK6urqS2PJHjx6hifvTp09PmzYNAIyNjYWeuL9x4wYAGBoaim+zaWVlJZPJpNFoaGmqqqpqR05qevjwIX/ZwI0bNz53sBUmRTiMSkhXfKj/pLi4GKUW1tTU/PHHH1uOXfbt23fy5Mnbt2+PiooqKipqt8L58+czGAz+fD1BEAcOHACAtfx9/SThT9zfvHlTxIl7FxcXAPj111/JbaGg06dPoxZu2bIFAFasWCG+e2GShMOohHTlMEoQRGVlJZq4oFAoKCtHG/3NdmVmZqqpqampqYWEhKBX0Az+gQMHyG44gU4TUVNTe/ToERqI6Gywrq2tvXv3LoPBoNFoBQUFPB4vLS2N9HYSBIEO+7tw4YK+vj4AxMTEiOMumOThMNppqamp/HPM//77b8GdOZ9z4sSJPXv2VFdXox+5XG5dXZ0YmygUFouFDg4aPHiw6LXZ2dkBgL6+/suXL4lPfb3g4GDRa26Gw+FMnjwZAExMTG7dusVgMIyMjNo4aL6xsTE9PT00NNTHx8fFxcXExAQNQerq6uro6DQ0NEybNk1WVjYlJYXcdhYVFVGpVFlZ2Vu3bgGAjo4Of/4N+9LhMNpp7u7uiZ+2ok+ePDkvL6/tz6enp6Nl56JPJYvbyJEjAeDs2bOiV3Xs2DEajQYAlpaW9fX1tra24ut/1dTUmJqaAoCzs/P169crKyv5b7HZ7LS0tJCQEG9v75kzZxoYGKBWCWIymebm5srKygCwdetWlFp/4MCBbcRiIaCNCXPmzEGnjW7atInEyjHpwmG00zoVRlksFhp2XLZsmURaJ7yUlBQ0fURKT/njx4+GhoaKiooODg4VFRU2NjZ0Ol0cW+CR3NxclNRu+fLlV65c2bFjh4uLy6BBg1qugaXT6aampq6urjt37gwLC8vOzkZjqQ8fPqTRaFQq9a+//ho6dCjpE2Lor0FERARqZ8fHDfz9/fkzUb/99pvQDZg7dy5/wf+MGTOErgdrC
YfRTnN3dx8zZsz06dOnT5+urq7edhhFY4IDBw788OGDxFoonOXLl5M7C7Rjx46goCB0GgdBEDweLyQkpL6+nqz6m7l3756MjAzatC5IQ0NjypQpmzZtCggISExMbGPv0M6dOwGgT58+jx8/Rp3TU/y0WKLJzMxEU/MRERFo/KHj19rZ2fGHg0QZbxk5ciT/C9LMzEzoerCWcBjttJa90ZKSklY/efv2bQqFIiMjEx8fT3ozduzYwU+5tm7dulZzeXTchw8f0L6djIwMMlr3r4qKCn01teIdO4hTp0r271fv3bu0tJTE+gWhPKc0Gs3R0XHz5s2XLl1KTk7uyMg1H5fLnThxIgA4ODigPWaysrLJycmit40/Nf/dd98BgK+vb8evJTGM3rhx4/79+/fv3x8wYIDQ9WAt4eX3onr27NmCBQtWrly5d+9ewQMyy8vL3d3dCYL45Zdf0JgjuRISEvinwMfGxnK53M5u4hQUHEwbOnSfpmYaSsVEFkpd3ZKGBo0dOwCgL4APSWfbterq1avV1dXm5ub8I5E7i0qlBgYGDh06NCYmxt7eftmyZWfOnHF1dU1KSkKdU+EQBBESEgIAs2fPnjlzJoVC6ezxdgsWLJCRkQGAoqIioZsBABkZGWilbWNjoyj1YM1JOYx/gXbv3s1fDr1+/frNmzejqd7+/fvfuHGD/zG0IFz0E3iOHj2KCvX19YKTP87OzvHx8Xl5eXl5eUOGDOlUt6ulwYMJAOLKFVHqaEXV8+cEQB2VGqOkRADcpVLF1xsdP348ABw/flzEeviDpH/++Scpg6SxsbEA0K9fv4CAAOh8+qhmvVEul1tQUNCpGj58+PDx40f8UC8+OIySIDEx0crKCn0tTZkyJT8/H52OqaKiIuKeEx6Px3+OKysrGzt2LP8tZ2fnJUuWeHl5eXl5qaurixJGY2IIAKJvX0K0gYFWoDDKlpN7N2YMAfCQwRBTGM3NzaVSqXJycoLT9ELbtWsXAPTu3Zs/SNqpDKHNoJwpmzdv/uabb4SoqlkYPXr0qJycnK+vbwfXS2VnZ5uams6fPx+HUfHBYZQcTU1Nhw4dQsOLcnJy6NHpjz/+ELHatsNoYWEhKltZWYkSRr/7jgAgtm0TpaWtq8zNLaNSCQD0v7NycmIKo+iIjoULF5JSG3+QdPTo0Wgbq9CDpI2NjWhqPiYmRkZGhk6nd/YQkd9++43/h/t///d/aD0WAIwdOzY3N7fta//44w+U2cDc3DwuLo4feZ8/fy7E74J9Dg6jnVZRUcHfuF1aWsqfgi8oKNi1axfaSNOjR48RI0aIfi8ej6esrDxjxowZM2ZMmjRJHGG0rIxgMgkajWhv/aswKioqvtHV/eDtTfj6lm7damFmJo4VCxwOR0dHBwBIzH5UWlqKUvR7e3ujNQzCrST966+/AMDExOTo0aMAMG3aNNHbdu3aNZTkRV5e/nPd0qampk2bNqGAO3/+/C643eNrgsNop/n5+fn7+6Py1q1bL168eOTIEWtra/5WdHl5eQqFwmQyRd8J00Zv9M6dO/xZ9Rs3bgi9JWbvXgKAENM6Qh6Px/8HzOVyxbTa6c8//wQAIyMjcrOKPHz4EB3GJ8ogKX9qHk0zXr58mZS2VVVVoeAOALa2ts0W5BYVFaH9Dkwm08/Pj5Q7Ym3AYbTTmoVRlHQDPctPmTIlNDSUzWZ///33qA/SdobzdrURRtPT0wHA0tJSlPoJgvDxIRQViZs3RaxGmtBsnjg27P/yyy/NBklPnDjRqRrc3d3l5eUfPXpEoVCUlJTI/SK5du0a6jILdktjYmL69u0LANra2k+fPiXxdtjn4DDaaX5+fg4ODitWrFixYoWlpeXVq1dXrFjRbGF5fX29sbExkLGa/eDBg6hQV1eHZieOHTsWHx+Ppn1dXFyE/S0I/s7MbduIL3d7d0lJCZ1OZ4hn8orL5To5OQkOkjKZzKSkpE5VUl9fjxb2owVw5BLsltrZ2Xl7e6N1b
2PHjn337h3pt8NahcNop/n5+R06dKikpKSkpOTHH3+8evVqqx9LSkpiMBgUCkVwFZQgLpfLHyjkcDgdHL3atm1br169hg8f/sMPPwDA3r17hfstJk4krKwI1FcWuUcrTbt37xbl66RdZWVlaCDy559/XrFihaamZlxcXLtX1dbWpqenX79+/eTJkz///DPaW3Wb1LOsBYWHh6NBeRkZGQqFsm3bNhGX2WGdgpffC0NZWRk9Nwmut2/G0tJy+/btW7duXbp0aVpaWq9evZp9oKio6Pvvv7927RoApKamHj169MKFC23fNycnx9/fv6Kiorq6GqXQt7S0FPq3WLAA9uyB3buFrqBLmDp1amFhYWcXtHdc7969g4ODx40b5+vrGxkZyZ9FRKqqqnJzc9ExSrmfFBcXo54g/2NMJpPBYFRVVYmpkbNmzbKxsdHV1W1sbJSVlXVycmqZgQUTHxxGO01RUZEfPZWVleU+vzNny5Ytd+7ciYmJWbZsGdpMLSIDA4PIyMjFixdnZWWhMGphYdHBawkCMjPhyROIi4MhQwAA3Nxg9mzIyhK9XVKwcePG2bNnjxw50tzcvKmpSVtbW3z3Gj169I4dO7Zu3bpo0aLFixdXVlYWFBQUFBQUFhay2exWL5GTk9PV1e33yT///HPp0qUlS5YMHjx40KBB4mhkenp6Y2OjqqpqVVXVpEmT7ty5g9IeYhKAw2in8bdgAsCGDRva+CSVSj1//ryFhUVkZOTFixcXLlzY7AM5OTlo1KzjpzONHDnyzp07kydPTk1NZTAYaE3i53z8CMnJkJQEcXHw4AGUl//7up0dyMsDhQJ+frBxYwfv3LXU1dU1NTWhcm1tLZfLFevttmzZcvny5YqKikOHDgm+rqqqqqGhoamp2f8T9KOenh7a28ZHoVACAwNdXV0TEhLk5eVJb2FkZCQAfP/997m5uWFhYZmZmfr6+ugAmLKyMjqd3vLwQYw00h5V+PqhoysVFRVzcnIEX8/Pz7e3t09OTk5OTg4ODu7U/MOlS5cAgEajtUxbWVRUFBoa6uXlZWVlpa3N+bTynQAg+vUj5s0jjh0jUlOJiRMJtPh17VpCX5+IjiaWLiU6cDJQV7Fq1SofH5/g4ODg4OBRo0ZlZWWJ9XZcLhelrP/uu+/8/f1v376dnZ3dqWUYtbW1KF/BvHnzSG8ej8dD/fGkpKSmpqa4uDh/f//Dhw+jd319fQMDA0m/KcaHe6Nit2jRops3b4aGhi5atCgmJkZw0EpZWRktSCQIAh1h30FotdOoUaNOnTqFtu4kJSXFxcU9fvwY5WRDbG2LlZV17OzA1hbs7UFf/381LFkCaDRi504YMACWLIGiIsjIgIgI6NtX1F9ZMqhUarMen/hERUW9efNm4MCBQUFBwt1UUVExIiLCysoqODh4zJgx/M1IpIiPjy8qKtLV1R06dCiFQrGxscn6Qgdrvkw4jErCyZMnnz59GhcXt3//fpQzDT6desQvd+ofZ3JyMgA4OzvTaDR7e/vKykr+W8rKyjY2NtbW1ra2tqNG9fzcHJirK//zsGYN2NnBjBmQkMDz8Fi9a9eS4cOHd/qXlDhHR0d0VEl4eLi47+Xn5wcAXl5eogRuIyOj06dPz50719PT09LSctiwYWQ1Dz3Rz5o1S/A4wpCQkBcvXgBAWlqal5cXWffCWiHt7nB3wc89mpCQQBBEU1OTi4uLECtg8vPz//jjj2aDaz169FiwYMHvv/+elpYm9HamsjLCw+M0AMjJyYnj0CRyrVq1KjY2FpVdXFzE+lCPvrSUlZVJOVYEZSrR1dWtqKgQvTbE0NAQ/rsX1t/ff8+ePRUVFRUVFT4+PvihXqxwGJUcdIalsbFxfX09SgFlYGDQ7mnATU1N6enpp06dcnNz06WK2BIAABGJSURBVP/0WE6n0ykUSq9evYyMjACAn0xPRGw2G63lNjc3/+mnn/ivb9y4kZT6SVRWVsZPI/Du3TsRs1a3zc3ND
QAEz4sWBYvFQp39KVOmkLJ7NTU1FQD69OkjuFYUj41KEg6jksNisQYPHgwAy5cvV1dXB4CIiIhWP/nu3bvIyMgNGzbY2trKysoKdjx79uw5efLkGTNmoGCKlqOSm7L+5MmTubm5grnUSDkr9AtVWlrKZDJpNFq76ZQ6Lj8/H/3B7d+/X/TaduzYgf5SCb54+fLlc+fOofLJkycjIyNFvxH2OTiMStTz589RDj0AsLa25ndGOBxOenp6QEDA8uXLTUxMBEe4AKB///5ubm5+fn6JiYn8Z/Yff/wRvaumpkZuSg7E1NT0/SfdOYxu27YNxLBL6tq1a2iQR/TTUtFheTe/6LQIXzgKIbDXApMAb2/vvXv3AkBoaKiqqurjx4/RJLvgFhcFBQULCws7OztbW1sbG5uWO6AAgMfjWVhYvHjxQlFRMTc3F60QJJG+vj7KJw8AT548ycjIILf+LwKbzdbV1S0tLX38+DHKmUSiLVu2+Pr6qqurP3/+XENDQ7hK3rx5079//x49epSVlfG/oTFJk3Yc73YWLFiA/ss363IOGDDAzc3t999/T0lJ6eCG6MmTJ6Nr7ezsWGSv+cQP9QRBnDlzBgCGDRsmjso5HA76ohozZozQW+BRgjFxrEXFOg6HUYl6/vw5f8UMjUYbNmyYp6dnaGioEMl4OByOiooKAKCOzJw5c8h9tMdhlPg0q37p0iUx1f/u3Tt+cmjharCxsQGAK6SfooV1Bg6jQiotLQ0LC0Pl3Nzcz6VxagZNMSErVqwQpQHx8fEAYGBgkJ6e3qNHDwDYuXOnKBU2Exoa2mq5OwgJCeGX9+3bJ9ZlACg5NIVC+VyqsDa8e/cOnUCFk9tLl4Q2gXx9SktL+dlG8vLybt26hcpsNru4uDgpKenKlStHjhzZvHnzwoULJ0yYYGpqqqSkhPaWoGl6dMCZ0NAxwuPGjTM1NUUrSfUFdymJzMXFpdVyd7Bnzx5++dKlS6KcXN0uBweH3bt3EwSxePHiN2/edOraiIgIHo83ceLENjKNYRKAdzGRZuzYsWlpaYIbilpiMpk8Hq+mpoZCodjb24tyO34YBQBnZ+c3b968evXK29sbhYDr169nZ2f/9NNPotwCk4yNGzcmJCRERkbOmTMnNjaWyWQCAIfDqa2tBQCUlxYACIKorq5Gl6AJyXPnzgHAzJkzpdZ0DABwGBXFo0eP0PrN8vLyYcOG1dTUVFZWMhgMJSUldXX1/v37a2pq1tbWmpmZjRgxQkNDQ1tb29vb+/jx4ywWa8iQIa3Ov3cQi8V6+vQplUodM2YMeqVPnz6ZmZno3xv6QF1dnci/YjdVU1PDj008Hk/ct6NQKGfPnk1OTk5KSlJVVW1oaOjghUwmU1FRUXCkCJMKHEaFN3r06KCgIAB48ODB1atXw8PDFRQU+vTps2vXLhMTk2+//RYAPD09bWxs+MFu7969gYGBHz58aDvBXbvi4uIaGhqGDh3arJ5Xr16FhoYCQEJCAjo7CBNCjx490C51ADA3N5fAHVVVVefNm3fgwAF+DKXRaOhPkEqlorFvCoWCJhUBQEVFhUKh5OTk5OfnL1q06MmTJ+ggZUwqcBglTUeGJpWVlacOGRIUG5uQkFBWViaYR71TBJ/oBfF4PA6HAxLpQ2HkysrK4nA4v/322+rVqzt4SVlZmaOj44sXL9zc3CIiIpotocMkBk8xCUlNTc3Z2RmVdXR0xo4dK/jugQMHZs6cOXPmzOvXr//nssbGS0lJU2i0urq69evXC333z4VRQ0PDefPmzZs3D6U+woTDX5DbrCw+HA7nwYMHADBs2LCAgICOpPE+ffr08OHDz5w506tXr6tXr+7+0k+D+aJJe6nAV2jnzp38dXw//PDDgwcP/vdebCwBkGdsjKZW7969K0T9L1++pNFoDAbjw4cPggtOHzx4sGbNGlQOCwvbsWOH0L8CJmGxsbEAMGjQoCNHjgDA/Pnz2/48h8NBX6IjRoy4du0ajUajU
Ch49ai04N6oZD18CAC6EyZ4e3sDwKpVq1gsVmfrmDhxIpfLHTVqFJVKRXNcyMiRI9EGcABwcnJas2YNOW3GxO/27dsAMHHiRFSYMGFC25+n0WihoaEDBgx49uzZ5cuX0QGxHh4e3XPPrtThMEo+Ozs7dEg9AEyYMEFGRsbe3j47OxsAICYGAMDBYcOGDUOGDMnJydm3b19n60dT8C2f6OXk5PiDrYqKiqKsBMAkDEVPR0fHmJgYCoXSbhgFgJ49e0ZERCgoKAQGBjKZzLlz59bW1k6dOrWiokL87cX+S9rd4a/fokWLAEBXV7cwN5dQUCAoFKK0lCCI2NhYCoXCZDLfvHnTiermzMmlUF4DXDAxWbt27ahRo8TUbExiKisr0SgNGkk3Nzfv+LVoZolGo0VFRaE0phMmTMCH1EsYDqNiV19fj1baG2hrvwMgBPaqb9q06fz58/zcd2w2u+198eyZMzl0Ov+Muhf37+Mw+hW4fPkyADg6Om7evBk6nx8aDRD17Nnz0aNH6HGk5UGHmFjhMCoJ1dXVlpaWAGCuoFC5apXgWwsXLkxKSkLlSZMm5efnC7779u3bP//808fHx8XFxcTE5BEAATCXSn0PQADU5+biMPoVWLp0KQD4+vqivySdPVqGy+VOnToVdWPv3r2L0uV1/WNgviZ43agk9JCXj3F1vZ6b+7y62iUpKaq+vtVN0FwuNzU1NTo6OiUlJTU1NS0tDW0H5KPRaMDlBn9aE9rU1IQ3U38F0MDoiBEjfv75Z1lZ2c4uVqNSqUFBQdbW1mlpacePHz948OAPP/zg4eFhYGDwRRxN+DWQdhzvHubMIX7/vTg5eUXv3rEAE8ePR+lBKysrv/nmGwcHB0tLS21t7ZbLp1VVVW1tbT09PQMCAtLT07kPHhAKCuiJ/jCAq6urOPLeY5KEDsRWV1cPDAwEACcnJ+HqefnyJdrjtGfPHnR6c79+/UpLS8ltLdYqHEbF7/174tOjd1ZWVhiTOQpAR0cH5XkSRKFQjIyMFixYcODAgTt37pSVlX2uSn5yPB8fHwn9Fph4oKObFyxY4O7uDgAHDx4Uuqro6GgajUalUiMjI0ePHg0Atra2bDa77auqqqr4pzYVFRVdvnxZ6AZ0WziMil9aGjF7Nv+nd4sXz5eTQzuglZWV+/TpM2vWLD8/v9jYWCcnp2Zjo224efMmylN58eJF8bQbE9L79+9bLbdq0qRJAHDhwgWUv/nFixei3BqdT6OkpBQTE6OtrQ0Aq1evbvuSvLy8yZMno3JiYqK7u7soDeiecBgVv8pKQuAUiqrx4y0BtDQ10UmTbU8xte306dMAQKfT79+/T26TMVEIHhwgWG6JzWYrKipSKBS0u1dLS0vEURoej/fdd98BgKGh4cOHD+Xk5ADg9OnT6N2Ghoby8vLXr1+npKQ8fvw4Ojo6NDQ0OTkZh1ER4Skm8VNVBUtL2LsXZs+GuLiaoqJkgMVOTiiVydmzZ2k0GvpgVFSUjEwn/kSWLVuWkZFx5MgRFxeXJ0+eGBoaiqX9mGjevXv39u1b4lO20JqaGh6PV1dX19TUlJqaWldXN2TIkPr6+l69ek2YMEHE9CIUCuXcuXM5OTlJSUm//vrr8ePHPTw81q1bt3HjxtraWi6X2/KSixcvJiUlocSA1dXVurq6ojSge8JhVCJOnoQ//oCzZ2HAgI1GRpCdjYauAEAwbnYqhiIHDx7Mzc3966+/5s6dm5iYiHP8dAUNDQ1oLScAcLlcX19ftFO+VXp6emw229nZuaysjJ8uVhRycnJhYWFWVlby8vLffffdvn37SkpKUARnMBgKCgqqqqqKiooKCgqo3KNHj2HDhqHEgElJSceOHRO9Dd0NDqMSQaXC/Pkwfz5BEA+3bwcAfhgVEY1GCw4Onjt37vbt29PT01EG3/r6+sLCQv6GVEzCmEzm/PnzUTkqKkpHR2fYsGEAoKqqCgDKyso0Gk1BQYHBYBAEERYWlpeX5+3tv
W/fPn46URHp6eklJCTo6+uzWKyCggI2m/3y5Ut9ff1WT0PJz88n5abdmpQHFbqZ9PR0ANDS0iK95rdv3/LXyqSmprabIggTn46PjRIE8eTJEwaDQaFQBM/RI8uNGzcAYPjw4W18pqqqij9L+fbtW/5BjVjH4dQkEvXo0SMAcHBwkHZDMDGSlZVttdwqa2trtMhpyZIl6FuWRNHR0dDe4YkqKipubm6orKmpOXv2bHLb0B3gh3qJQmGUrCf6ZlJSUtARnjU1NULn1cdE9/fff7da/pw1a9Y8f/783LlzU6dOTUxMJDE1FwqjTk5OZFWItQr3RiVKrGF0yJAhwcHBwcHBvr6+4qgfE5/jx49bWVnl5eXNmzev1fl0IRQWFmZnZysrK48cOZKUCrHPwWFUcl69elVcXNy7d28xTf5QKBQ6nU6n04WY8cekS1ZWNiwsrHfv3rdv3965cycpdd68eRMAxo8f3+rMEkYiHEYlh98VFceyJDqd3r9/f1SWk5Pr168f6bfAxKpfv34hISEyMjK7d+8ODw8XvUL8RC8xFIIgpN2Grx+Px6NSqQUFBdevX9fX1297yB/rzg4ePLh+/XolJaX4+HgTExOh6+FwOGpqajU1Nbm5uR05sxYTBQ6jkmBubp6WltayjGEtubu7X7x40cjIKCEhASWgEcLjx4/t7e2NjY2zsrLIbR7WEn6ox7Cu5eTJk5aWli9fvkTb24WrBD/RSxKei5AQ/nbAjx8/SrclWBcnJycXHh5uZWUVFRW1d+9e/r7STrl16xbgMCopOIxKiLm5OSrgaXSsXXp6esHBwc7Oztu3bx86dChKpjdq1CgOh4M+UFtbyy8DQGNjY319PQBQKJSKiory8vLk5GRZWVm80UMy8D9pCRk7diwqoKNyMKxtEyZM2LVrl7e394IFC549ezZw4MCUlBQ2m93uhQRB3L59m8fj2dvby8vLS6CpGA6jGNZFbdmyJSUl5cqVKzNnzrx161Z0dDTKsEcQhIKCAnqsaWhoaGxspNPpaNcpl8u9d+/ehQsXAD/RSxCeqZeEjx8/8vsFgmUMa1ttbe3w4cMLCgpYLFanLlRWVvb390ebgzFxw2EUw7o0b2/vvXv3ysrKolT2Kioqgts35OXlmUwm/0cajaasrJyXl/fq1StdXd3ExEQ1NTUpNLqbwQ/1GNalobRPp06dWrhwYQcvaWpqGjduXGxs7Jw5c6Kjo/GsprjhdaMY1nXxeLzY2FjoZHJFOp0eGhqqpaV1//79jRs3iq112L9wGMWwrislJaWqqmrAgAGdPSKpb9++V65cYTKZhw8fRjNOmPjgMIphXdeDBw8AYMyYMUJca21t7efnBwCrVq1KTEwkt2GYIBxGMazrQmGUv+i4s1auXLls2TIWizV79uz379+T2jTsf/BMPYZ1UVwuV01Nrbq6uqioSEtLS7hKmpqaHB0dHz9+7OjoiKebxAT3RjGsi0pOTq6urjYwMBA6hgIAnU4PCQnp27fv/fv39+zZQ2LzMD4cRjGsi6qurjY2Nhb6iZ5PS0srLCxs3LhxK1as4L+IH0NJhMMohnU5Bw8e5PF4EyZMyMrKIiXpsq2t7d27d8+cOXPlyhX0ytq1a9HAKyY6HEYxrMsJCgriH2wXFBQk3cZg7cLjzRjWjdy4caO4uBgAUlJSZsyYIe3mfCVwGMWwrsjFxQXtnS8vLyexWj09vSFDhgAA2hyFkQKHUQzriq5cuYIORh48eDCJ1ZqYmKDF/JGRkSRW283hsVEMwzCR0Hbs2CHtNmAY1tzw4cPRQz2FQrGysiKlTmtraxMTEyqVCgCOjo76+vqojIkI72LCMAwTCf4uwjAMEwkOoxiGYSLBYRTDMEwkOIxiGIaJBIdRDMMwkeAwimEYJpL/BwBw2pG1sPKOAAAEgXpUWHRyZGtpdFBLTCByZGtpdCAyMDI0LjAzLjMAAHic5ZNrTJNnFMfP+7b0DpRCS6H0JgpVES+AINr3eVoZ0
6mpgDrjbc10pjI+oHPOaTZuybBeYkTAoHSxCA7xNrxkOuz7PCjiZgzDqUSFcJGMqAQhjikLzqyT8cEvy7KvO8k/v5yck5z8T84Z5E93gj+UfjEwFja/0vzKZyRg8pMVvYFwHKaJfzULxWDxUyAQjVEoGSM7Xh9vx2Pl/0ox/nvMeA5v5/+ab5v4BzBvJjDM/51y/0UwLLAC//ZAGGAKEFnYADGIJSCRglTmYmVyp1zhYhWBzsAgFxsU7AxWulhliDNE5WJVoabAMFOY2sWqNU5NuIsN1zq1ES42IhLEOhBEQZQe9AbQG0EPoJI6I8KcSrmTMQFjBnYCsNEgmChkhJNAHCNkJLEgsYBsMiimQNBUCI6D0GkQGg/q6aCZAeEzQTsLIhMgMhEik0A3G3TJoEuBqDkQlQqGuWCYBwYrGDkwIjBisAv99kSgjxKwjEgmV4aopCJFYFCwUi5Sa8K1EWHhGsHYN7z5CJu3vI2Ks2W0eXIO6TjRQh/EyqhE380dKzlK46TPyU00B+VpymjA/gDSv0nCeTZn0CTZUv6Jz8PVr0+g6Q9zEYnNRy04mxZsaEf9Eh41xdSRQsvX6P3HpaisJolc3VaPOhk7/9kX85DafQ/dTka+MvNPaHGTDuXfn+f7g1qxJXUF+k4h50QVn+CbqQd463sst91Qg+v4Ud8MSyXnOH8ee4+WokMZI1x6VwVetjEEVyS+5po61+DLyWK860UVZw6KwWs2LcCZB71cQVwVqn2YikO0Cs6x7i7HRxfilKar1h2djbilp4Z01Xi4/csu4aDVWqr+dRfyNd/BC7espS+7ilH7+hE8vXgNnVpRztWqGFt7oYaeXDnMC+L68OaLVaT/yVO+3fwcfzxw03d2dh25sOMw6cgx4vqhUZR22UNzH+jJ066zpGfgFlU+20KOP24mdvtd+vhcCamNtdOUzj56zZHFHY12k6yaNursWERpwyJ0ZFUv/bG1gaiLClC3dA+dkjOVdhpvcPNPVtKyDb3krugJwol1dPjyKq7co0SJpSaa91sWebnuW9+7D1eQvK+auWXXHqB6ZyXpz5fhre8sIZ5CBxlsbOO2jqSRwdFKlCW9wT9SzecbFStxZb+X7HxdfcXtaMOyJVUo5XMzmumrwQe2Z+D9+dHIUX2RO7ygHG/9PhsZwqL4Kpcbp1mtJD7mCC6aGEmHM++joVO38KWW3fSF4R7KSxDbFtxy0w9+OMAt2RZs09WG0tUle8mExgDbbUaIVvZ+RLxDr/GXB1niKg2lzxru4VfFrzj72lbCH+8msxYvx417r6PaSQt58ck4fMajwQMFv5DMc0rc45uLdUY3zc05S5h9feS05xCt3tHAFRf1kfQ7LyiYveTMqaW0SHSV/s4I6TcVy+nw2uuUafXwF4o30n2PRunOY6evnJiWxd9u+pla5+5Bn36YTnocr6gm3oAcvUNE8yecZ7Wyz5wZnQAABnp6VFh0TU9MIHJka2l0IDIwMjQuMDMuMwAAeJx9V0Fy5DYMvPsV+oBVBECAxCGHXXuzTqViVyWb/CH3/L/SAGUKk0PsHZWG26KaQKMBPx3x8/vrr3//c+wfeX16Oo72P//c/fhLWmtPvx1xc3z99v2X9+Plx5evnysvH3++//jjMD1s4hn8PmK//Pj47XOFjo/j2c7RdM5xPPNJxoPpeG6nMyvx/TAfLwHtOlk0oTptAnB2GfHMRkog9Ww0vLXjmc7pzZsCOVhMCqEeyH6aTneLlyrToA4kieAW5F9+/v4TfeL1eD+e5QRFVQ88NyELvLSJd9w7W+zM52zshuXTe5tCB52tEZPcwBERkNOsMSjSide3vOmzWx83cMaOdAo73oX/RxSMYkcykd7+S9UDjhM3Vw24TEOcYommdytUCU9GbMR4OpCGqPaMRuu47wVJQAJAkyMzp7XgeCEpIryRkS0+3bq3CM8UbKoZVx1eckWRLEGcjEkzooBqbE4c2SrISJYi60hVxp6YoZGMfXevN
DWRRmoyM0BEuSVC6l5CSpGljpdPBXlGXmkOCmAQr8cZybIrNTwP6YHcsHg1QjHrjhMK4VMRYYRAcFzqQglUHRXoGUo28ykRKXJ1ywDZpBIgbtgRjKgPiL2fDSkx6OdEYqlVYCTHTlQSpBZKYeyoS5sohBJJ5ozPsCa4hUpltiWSZu44wo2U3BMJb1AJZArhEGoqkAhprc7IzsCeDHklsuvQODl7Jykn58jOxAYo+qxjURpLmR11LF6glpuSI8aSRxoEg0goDIPqmUZC1QzVkIp0b9OuihlVHnxVEjebA1+RbQQylO8Q6Sy1yb78Ae+fluo0Mugvdx29PZhOu/yp9QbVhYF1FJoGVlV8FDkJLSyotpm5IgR2asStiSAyBZu+N04GhEMgoC2aFFp6RYHK8bbcdEBDsRlNlXxouE4pMcD9W+xKrU9Kj+w2pEUQGtGsBgWzfYsgSEdJZMJszB5ahm6mUz2XLeiE8lrGHlWibWDXCXd7CNcIqEYBzTHjWBB+HAYp7F6tBMF8i7cq/HXZ0uThKx3IMFERDMTzlghxjuYDhmP0tFXY7iingmG+pQjg8qiiaDw9toqHG81eqOJ+bQqHi1pHLXrvmeJhEHdRYWdAY1N4bQIQodFTj4RDtVIwPZIlkBFsYZUrItWWdLnRA4FIFtTigzNDbMPTfdDLhtbXR6pgkcMFt9mnPPwbQLTK2vciUWCnHSeAqZgb3D/aFBnKpgDHOjuKvcOJelRz42Q5EOOHs0eekHuYLPwLCYMNwbOiX6ANSz2653nEkTxKaaGBQ5HRehCD4hUaWZoodqxyIuE7mQPG3nVPpUQ6yn8uT0N1SFZf8/lAVDmhKKLVaBR9QvNI02erBaWRoxGbIos5HsD8YL65vboUPemVIzQ3Xx0E3pqvx0CDLlmQupQ3VBusEhKQbCawiS6iJfhqV/DRjSSaBBTgkHhYJkJWylnHtSdmA9gbOhnsBJlHX8FsYBU5V4kyWh6Sm0aG9pl3ED6N+n5fWLize1aTGaplQbF/KTxry08Qc/ZEMAyas2eAi/Y6GtGyKUQSLSqCieZvC4tjaSthNV5YsjFWdcIxGUeMkQl1UlNgsjjMsMVlD4yStjWnGaaFgu1rX4yQyuEPbRlF6Hp0f6Cr17ZQXDY0xAC2l1GY5LW1fXt/fZhu17z79eP99Z5345fvmRZfDrkHV8Kn39Mp4aP38En42D1hMj7jniMJX+c9LRI+fk+DFJ868/W40J4XKd4V8+QmhwGmx4U2P7SFHhfaFDGU9bjQZol+EF8xW5Upq8eFNleMUz0utOli6Ohxoc04GnJcuE5CFO/izZmDMy58BzQ448KbMwdnXHhz5uCMC2/OHJxx4c0ZTZPiwpszj4wob8o8Mz28GXMwxkXqTEBH9vvS+dfKZiy8Vm5IBrkmS5KwlYOLXgG8t7ElpVtLYy3sY8coDHXJPrX4WtiH7i1PhCcf/pLolMKTffLOGQo8/4iTFFmvfSsXbkRIOIS2afegHbLavHvwDhHdZRDEQ0Sbec9Ix9OlUayVnS/NQoMcNm3NasO7dsY0GEMJWm2cQgl6/8kXnJF83ZzVrpXNWce1sjnHNB0K0bt6/Vq5C7hdK5uz0VKQbs7G18rmbGkSqPXqZWtlcza9VjZnC86Qmm3ONq6VzdnmtbI5h59V94rvn3/p4/7pXyx/B51c8FvKAAADunpUWHRTTUlMRVMgcmRraXQgMjAyNC4wMy4zAAB4nF1Vu27lVgz8lZReQFfg+xEhQAA3W+1WqRau3KdNsx+fIXWTOIYBSffM4WvIoX98fXvn95cfX9++/PN83fc+7oPHu/zvhv75/vLtA/7R4PX3t/vX68tv3798+/fQ9uvDzQ+f318/x/z8+9X++/yc3Z3Su+yb8fzl50uc2sF8POxUzdSDzwgnvuIMLgKgpwXefFKwy5VnhnDvuXn6Qae0seZVpzlRrivc7UHKUgUIRZIfDznVOe140GmZog2oI7QWomRACNQlE
4hbKudAkjnWyKs4AHnAak6om8r3lkrnOFTvmLScPXwvVRdsEGjcDUA01hPSW6fU7M3XWkV8nIVkORDyMt4c2nu8IsumuPxk7153LNIiC1m3AQJRYWCMwezYWAHIidMpQ6UgYAEIePO47CwUwYdMpchxEglrnyjZaocCkKdFRV3g3nnuMyPPCc0Fri45vS0Y91nZdAH3zGtCBhzIiUID56xRzBcisWXJYWhR1CDUw08BUreMBhQdOlBzgO1ranaxMULrSYaE1BZ0Zj5CqneMaqiEQ8On8ZglnHsBRBW2Y8CFhK/hVzinoDBeh5i9p1ExmrCkcJvVRguM3Jqh6bqMqgcv5aq2VkgolprKNAeUBLIudL1IlgPkXmtMQBBKkTNtf2YQedoWPscg33f+hBQJbquLenw5NW2VJTljN/UW2l7IDsR578ChNrZlXfF5PXwaVpOd8DKLeYGfAQgdn3mrpplIiCR0XKlhJmZmI8vuuqrZByqM5s5oZvikX0JQ6iNGL7KDzl5LhkHjU2qAjcQJiuNynUnJ9hrqoDuyWn8WefODep5Gjt6OP+hj24TJd75jEf72yNDTFZG7YuyBoSm0VTF2yUYl1cgJlowWrsCgbRlIgtzGCiKVjUYhsRBWlA/CkXd49EFkHaJzpONQEGPqEZj7ep69MkjhzdshYaiA744G5Ds+wZDP+iPMy/YRY9O3HbbYyDg46yYaW2ggQ/l175Rg9HqjaUISg0k5+X1yv2t4HCR1GxhweJ9T9WqAZkMJ1M9QFWNjQdUDpGPOjql/5A5iDfStDrEn0D2ZzdJ2zH4FR2vSSgOAzVrdQxeBPYCtjhFHwoTptqc+0fGeHSFPVWK20nZ1MsYS5cAeE3qMrFbWM8qOpYlNVGDwphKYzx6EzDC1gNAtpDuFPZUhxAbdOARMKruBiyCt+efSLbMIbfctOKAZc+TbRPrl+OuPXwW9zUOHhZ9/A3hEjzPblTCrAAAAAElFTkSuQmCC", "text/html": [ "\n", "
conf_id0
" ], "text/plain": [ "" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "mols[0]" ] }, { "cell_type": "code", "execution_count": 4, "id": "67b966d8-2af7-4cef-985a-f9281ce01283", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:07:18.056701Z", "iopub.status.busy": "2025-01-19T18:07:18.056597Z", "iopub.status.idle": "2025-01-19T18:07:18.073502Z", "shell.execute_reply": "2025-01-19T18:07:18.073233Z", "shell.execute_reply.started": "2025-01-19T18:07:18.056691Z" } }, "outputs": [ { "data": { "application/3dmoljs_load.v0": "
\n

3Dmol.js failed to load for some reason. Please check your browser console for error messages.

\n
\n", "text/html": [ "
\n", "

3Dmol.js failed to load for some reason. Please check your browser console for error messages.

\n", "
\n", "" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "text/plain": [ "" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import py3Dmol\n", "from rdkit.Chem import MolToMolBlock\n", "\n", "# render the conformer of the first molecule from its MOL block\n", "view = py3Dmol.view(\n", "    data=MolToMolBlock(mols[0]), style={\"stick\": {}, \"sphere\": {\"scale\": 0.3}}\n", ")\n", "view.zoomTo()" ] }, { "cell_type": "markdown", "id": "92f6351f-25f5-4493-8404-a40e4255a99c", "metadata": {}, "source": [ "### Conformational fingerprints\n", "\n", "Many fingerprints utilize the 3-dimensional structure of conformers. Many of them calculate features from distributions of interatomic distances, e.g. [AutocorrFingerprint](https://scikit-fingerprints.github.io/scikit-fingerprints/modules/generated/skfp.fingerprints.AutocorrFingerprint.html#skfp.fingerprints.AutocorrFingerprint), [MORSEFingerprint](https://scikit-fingerprints.github.io/scikit-fingerprints/modules/generated/skfp.fingerprints.MORSEFingerprint.html) and [RDFFingerprint](https://scikit-fingerprints.github.io/scikit-fingerprints/modules/generated/skfp.fingerprints.RDFFingerprint.html). Others work differently, e.g. [E3FPFingerprint](https://scikit-fingerprints.github.io/scikit-fingerprints/modules/generated/skfp.fingerprints.E3FPFingerprint.html) uses an extension of the ECFP fingerprint to 3D \"balls\" around atoms, summarizing their circular, spatial neighborhoods. Most of these fingerprints are very cheap to compute; the conformer generation is the expensive part. Only E3FP has a noticeably larger cost.\n", "\n", "Let's use a few of them on the generated conformers. We will also use pipelines, which you can read more about in tutorial 3." 
] }, { "cell_type": "code", "execution_count": 5, "id": "a04d450a", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:07:18.074001Z", "iopub.status.busy": "2025-01-19T18:07:18.073875Z", "iopub.status.idle": "2025-01-19T18:07:32.316312Z", "shell.execute_reply": "2025-01-19T18:07:32.315798Z", "shell.execute_reply.started": "2025-01-19T18:07:18.073991Z" }, "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Autocorrelation\n", "AUROC: 67.75%\n", "\n", "E3FP\n", "AUROC: 76.20%\n", "\n", "MoRSE\n", "AUROC: 69.41%\n", "\n", "RDF\n", "AUROC: 68.65%\n", "\n" ] } ], "source": [ "from sklearn.ensemble import RandomForestClassifier\n", "from sklearn.metrics import roc_auc_score\n", "from sklearn.pipeline import make_pipeline\n", "\n", "from skfp.fingerprints import (\n", "    AutocorrFingerprint,\n", "    E3FPFingerprint,\n", "    MORSEFingerprint,\n", "    RDFFingerprint,\n", ")\n", "from skfp.model_selection import scaffold_train_test_split\n", "\n", "mols_train, mols_test, y_train, y_test = scaffold_train_test_split(\n", "    mols, y, test_size=0.2\n", ")\n", "\n", "for fp_name, fp_obj in [\n", "    (\"Autocorrelation\", AutocorrFingerprint(use_3D=True)),\n", "    (\"E3FP\", E3FPFingerprint(n_jobs=-1)),\n", "    (\"MoRSE\", MORSEFingerprint()),\n", "    (\"RDF\", RDFFingerprint()),\n", "]:\n", "    print(fp_name)\n", "\n", "    pipeline = make_pipeline(\n", "        fp_obj,\n", "        RandomForestClassifier(random_state=0),\n", "    )\n", "    pipeline.fit(mols_train, y_train)\n", "\n", "    # probability of the positive class, used as the ranking score for AUROC\n", "    y_pred = pipeline.predict_proba(mols_test)[:, 1]\n", "    auroc = roc_auc_score(y_test, y_pred)\n", "\n", "    print(f\"AUROC: {auroc:.2%}\")\n", "    print()" ] }, { "cell_type": "markdown", "id": "a6b6f2db-3d2b-4343-8a81-cf0df25d30db", "metadata": {}, "source": [ "### Parameters for conformer generation\n", "\n", "Generating conformers with `ConformerGenerator` is easy with the default parameters, which are designed as a reasonable tradeoff between quality, speed, and reliability. They are:\n", "1. Generate 1 conformer per molecule\n", "2. Max 1000 attempts to generate per molecule\n", "3. No force field optimization\n", "\n", "If conformer generation fails, as a fallback we try 10x as many iterations and randomized initial coordinates, as this often helps with harder molecules. As a last resort, we also stop enforcing chirality and ignore smoothing failures. If this also fails, an error is raised by default.\n", "\n", "For higher quality, at a considerably higher computational cost, we can generate more conformers and select the one with the lowest energy (the most stable one), and also enable optimization with a force field such as [UFF](https://doi.org/10.1021/ja00051a040) or [MMFF94](https://en.wikipedia.org/wiki/Merck_molecular_force_field).\n", "\n", "Let's compare the default and quality-optimized settings, measuring the time and classification quality of the resulting pipelines. The quality gain is not always large, but it may be worthwhile if the default setup already performs well for a given use case." 
] }, { "cell_type": "code", "execution_count": 9, "id": "a906baa0-b191-4859-9418-fcc1cb8fb740", "metadata": { "execution": { "iopub.execute_input": "2025-01-19T18:19:06.937236Z", "iopub.status.busy": "2025-01-19T18:19:06.936996Z", "iopub.status.idle": "2025-01-19T18:20:45.597853Z", "shell.execute_reply": "2025-01-19T18:20:45.597346Z", "shell.execute_reply.started": "2025-01-19T18:19:06.937221Z" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Default time: 12.41\n", "AUROC default: 76.20%\n", "\n", "Quality-optimized time: 44.14\n", "AUROC optimized: 78.69%\n" ] } ], "source": [ "from time import time\n", "\n", "start = time()\n", "pipeline_default = make_pipeline(\n", "    ConformerGenerator(n_jobs=-1),\n", "    E3FPFingerprint(n_jobs=-1),\n", "    RandomForestClassifier(random_state=0),\n", ")\n", "pipeline_default.fit(mols_train, y_train)\n", "end = time()\n", "y_pred_default = pipeline_default.predict_proba(mols_test)[:, 1]\n", "auroc_default = roc_auc_score(y_test, y_pred_default)\n", "print(f\"Default time: {end - start:.2f}\")\n", "print(f\"AUROC default: {auroc_default:.2%}\")\n", "\n", "print()\n", "\n", "start = time()\n", "pipeline_optimized = make_pipeline(\n", "    ConformerGenerator(num_conformers=3, optimize_force_field=\"UFF\", n_jobs=-1),\n", "    E3FPFingerprint(n_jobs=-1),\n", "    RandomForestClassifier(random_state=0),\n", ")\n", "pipeline_optimized.fit(mols_train, y_train)\n", "end = time()\n", "y_pred_optimized = pipeline_optimized.predict_proba(mols_test)[:, 1]\n", "auroc_optimized = roc_auc_score(y_test, y_pred_optimized)\n", "print(f\"Quality-optimized time: {end - start:.2f}\")\n", "print(f\"AUROC optimized: {auroc_optimized:.2%}\")" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", 
"pygments_lexer": "ipython3", "version": "3.9.21" } }, "nbformat": 4, "nbformat_minor": 5 }