{ "cells": [ { "cell_type": "markdown", "id": "20f329f1-2859-4eb4-b152-e3cc5c9150a5", "metadata": {}, "source": [ "# Simple example\n", "\n", "This section will present the main functionality of the `pymcdm-reidentify` library. In this example, we will use artificial data in the form of a decision matrix and other parameters needed for the models.\n", "\n", "Since the `pymcdm-reidentify` library is based on `pymcdm` and `mealpy`, these modules will also be utilized in this example." ] }, { "cell_type": "code", "execution_count": 1, "id": "e1c67279c110c2fe", "metadata": { "ExecuteTime": { "end_time": "2024-08-21T11:06:30.286762Z", "start_time": "2024-08-21T11:06:25.822974Z" } }, "source": [ "import numpy as np\n", "import matplotlib.pyplot as plt\n", "\n", "from mealpy.swarm_based.PSO import OriginalPSO\n", "\n", "from pymcdm.methods import TOPSIS, SPOTIS, COMET\n", "from pymcdm.methods.comet_tools import MethodExpert, get_local_weights\n", "from pymcdm.visuals import comet_contourf, ranking_scatter\n", "\n", "from pymcdm_reidentify.methods import SESP, SITCOM, SITW, SITWLocal, STFN, STRFN\n", "from pymcdm_reidentify.methods.comet_tools import MLExpert\n", "from pymcdm_reidentify.visuals import model_contourf, fitness_plot, tfn_plot, trfn_plot, weights_diff_plot\n", "from pymcdm_reidentify.normalizations import FuzzyNormalization" ], "outputs": [] }, { "cell_type": "markdown", "id": "63390e10-80fd-406d-a875-1ea7e167440b", "metadata": {}, "source": [ "Creation of an artificial decision matrix with 2 criteria and 1000 alternatives." ] }, { "cell_type": "code", "execution_count": 2, "id": "d65ea569-efa9-4d52-a7a8-addfe1e1bc14", "metadata": {}, "source": [ "matrix = np.random.random((1000, 2))" ], "outputs": [] }, { "cell_type": "markdown", "id": "2a6fc7c8-3020-43ba-8b2f-44ea800f7f09", "metadata": {}, "source": [ "Definition of expert parameters such as weights, criteria types or expected solution point. All according to the logic of the MCDA/MCDM methods presented [here](https://pymcdm.readthedocs.io/en/master/example/Example.html)." ] }, { "cell_type": "code", "execution_count": 3, "id": "2fb7d3d2-ebd5-4df9-821b-730bac15f640", "metadata": {}, "source": [ "types = np.array([-1, 1])\n", "weights = np.array([0.5, 0.5])\n", "esp = np.array([0.5, 0.5])" ], "outputs": [] }, { "cell_type": "markdown", "id": "8ddbed97-139d-4aa1-ae0a-37a06545778e", "metadata": {}, "source": [ "Below is a 3D visualisation of the unknown expert model. The SPOTIS method was used to create it." ] }, { "cell_type": "code", "execution_count": 4, "id": "89741287-b12f-4b81-8e9a-bb4d084d644b", "metadata": {}, "source": [ "bounds = np.array([[0, 1], [0, 1]])\n", "spotis = SPOTIS(bounds, esp)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 5, "id": "a9d6c27b-b4a1-4c97-a123-57ee5a34f3b2", "metadata": {}, "source": [ "preference_unknown_model = spotis(matrix, weights, types)\n", "rank_unknown_model = spotis.rank(preference_unknown_model)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 6, "id": "32721115-1d70-4341-9ab1-ad262a54dbb5", "metadata": {}, "source": [ "fig, ax = plt.subplots(figsize=(3, 3), dpi=150, tight_layout=True)\n", "model_contourf(spotis, bounds, esp=esp, model_kwargs={'weights': weights, 'types': types})\n", "ax.set_title('Unknown expert model')\n", "plt.show()" ], "outputs": [] }, { "cell_type": "markdown", "id": "057acec7-28d7-4e1d-8999-f2893721cc80", "metadata": {}, "source": "## Using re-identification methods to develop a similar expert model" }, { "cell_type": "markdown", "id": "24f3e229-cf38-492d-befc-8c3f8793ccfe", "metadata": {}, "source": "### Stochastic Expected Solution Point (SESP)" }, { "cell_type": "markdown", "id": "701bf8ef-401b-4478-ac7c-82e4af9ce83a", "metadata": {}, "source": [ "The first method that will be presented to re-identify the expert model will be the SESP method. It is characterised by the search for an Expected Solution Point, with which a decision model can be identified.\n", "\n", "A stochastic optimization method will be used to search for the Expected Solution Point in this case. The implementation will use the MealPy library, which is based on various stochastic methods, and it is this library that will be used in this example." ] }, { "cell_type": "markdown", "id": "242a5720-72d9-47a3-a695-327636d256b1", "metadata": {}, "source": [ "For this method, it is necessary to have the information related to the learning of such a model, i.e. the alternatives, weights, types and boundaries of the decision model.In this case, the data from the previous model will be used.\n", "\n", "In this example:\n", "- explicit data are: alternatives, rank of alternatives, weights, types and boundaries of the decision model\n", "- implicit data are: esp" ] }, { "cell_type": "code", "execution_count": 7, "id": "facbfb83-e89f-4616-bb44-cd338ad499e7", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyper parameters.\n", "stoch = OriginalPSO(epoch=1000, pop_size=100)\n", "\n", "# Creation of a SEPS model object\n", "sesp = SESP(stoch.solve, SPOTIS(bounds), types)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 8, "id": "46f10d01-d257-47cc-9039-1bfcbafd4188", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per ESP epoch.\n", "sesp.fit(matrix, rank_unknown_model, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 9, "id": "8effe7f2-bca6-4c42-99d8-c08622c88fcc", "metadata": {}, "source": [ "# Visualisation of the learning process \n", "fig, ax = plt.subplots(figsize=(5, 2), dpi=150, tight_layout=True)\n", "fitness_plot(stoch, ax=ax)\n", "plt.show()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 10, "id": "33609852-7b64-497c-8d57-d8e1b48f0221", "metadata": {}, "source": [ "# Reference ESP value\n", "print(esp)\n", "# Obtained ESP value\n", "print(sesp())\n", "# Mean absolute error\n", "print(np.sum(np.abs(esp - sesp())))" ], "outputs": [] }, { "cell_type": "markdown", "id": "18f16362-8852-4c56-a6b1-aebb1ee848b4", "metadata": {}, "source": "### Stochastic IdenTification Of Models (SITCOM)" }, { "cell_type": "markdown", "id": "0bedecb7-619a-4220-b544-4d4ec4384f6b", "metadata": {}, "source": [ "The second method related to the re-identification of decision models is SITCOM. This method is mainly based on the assumptions of the COMET approach, where, using stochastic optimization, the preference values of characteristic objects are searched with which a similar COMET model can be reconstructed. \n", "\n", "In this example:\n", "- explicit data are: alternatives, rank of alternatives, characteristics values\n", "- implicit data are: preference of characteristics objects" ] }, { "cell_type": "code", "execution_count": 11, "id": "57f51360-476f-488d-852c-4a544b0e668c", "metadata": {}, "source": [ "# Creation of characteristic values\n", "cvalues = np.vstack((\n", " [0, 0],\n", " [0.5, 0.5],\n", " [1, 1]\n", ")).T" ], "outputs": [] }, { "cell_type": "code", "execution_count": 12, "id": "18df1229-20a5-4570-9559-47aac625cd4a", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyper parameters.\n", "stoch = OriginalPSO(epoch=1000, pop_size=100)\n", "\n", "# Creation of a SITCOM model object\n", "sitcom = SITCOM(stoch.solve, cvalues)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 13, "id": "dcfcddcb-df1a-44ab-a3f1-bb03f1aa0f52", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per preference of characteristics objects epoch.\n", "sitcom.fit(matrix, rank_unknown_model, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 14, "id": "f6b168b7-adcf-4cb4-90ce-e8372bd2143b", "metadata": {}, "source": [ "# Visualization of the resulting SITCOM model\n", "fig, ax = plt.subplots(figsize=(3, 3), dpi=150, tight_layout=True)\n", "comet_contourf(sitcom.model, np.array([[0.5, 0.5]]), 10)\n", "ax.set_title('SITCOM model')\n", "plt.show()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 15, "id": "03147fad-0b2b-4fc8-9aa7-06e8a1d3ea69", "metadata": {}, "source": [ "# Visualisation of the learning process \n", "fig, ax = plt.subplots(figsize=(5, 2), dpi=150, tight_layout=True)\n", "fitness_plot(stoch, ax=ax)\n", "plt.show()" ], "outputs": [] }, { "cell_type": "markdown", "id": "793fcde2-f7c0-4ac6-95a6-75914f99a4a1", "metadata": {}, "source": "### Stochastic Identification of Weights (SITW)" }, { "cell_type": "markdown", "id": "beafda19-ff13-403c-84ed-5c43de628afc", "metadata": {}, "source": [ "The third SITW method related to the re-identification of decision models. For this method, re-identification is related to the search for criterion weights. This approach uses the MCDA/MCDM base model combined with a stochastic optimization method to search for weights.\n", "\n", "In this example:\n", "- explicit data are: alternatives, rank of alternatives, types of criteria\n", "- implicit data are: criteria weights" ] }, { "cell_type": "code", "execution_count": 16, "id": "6db2ed49-c81b-4818-bf26-cb6d944c56ec", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyperparameters.\n", "stoch = OriginalPSO(epoch=1000, pop_size=100)\n", "\n", "# Creation of a SITW model object\n", "sitw = SITW(stoch.solve, SPOTIS(bounds, esp), types)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 17, "id": "2e72a272-f644-4515-b8e6-11d9848036d3", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per criteria weights epoch.\n", "sitw.fit(matrix, rank_unknown_model, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 18, "id": "68d2b5cb-8f05-4b24-a1f4-9800e833e6c8", "metadata": {}, "source": [ "# Visualisation of the learning process \n", "fig, ax = plt.subplots(figsize=(5, 2), dpi=150, tight_layout=True)\n", "fitness_plot(stoch, ax=ax)\n", "plt.show()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 19, "id": "b3a4d8b8-8aaf-4890-a2b0-50969e2afa7d", "metadata": {}, "source": [ "fig, ax = plt.subplots(figsize=(5, 2), dpi=150, tight_layout=True)\n", "weights_diff_plot(weights, sitw(), ax=ax)\n", "plt.show()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 20, "id": "587b4368-0bb3-4ba2-9f07-995f00a1af3b", "metadata": {}, "source": [ "# Reference weights\n", "weights" ], "outputs": [] }, { "cell_type": "code", "execution_count": 21, "id": "92708399-5a50-4c9b-8cb6-beeb16d42192", "metadata": {}, "source": [ "# Obtained weights\n", "sitw()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 22, "id": "1f33bc00-f570-4367-8a59-f7600e0783ab", "metadata": {}, "source": [ "# Mean absolute error\n", "print(np.sum(np.abs(weights - sitw())))" ], "outputs": [] }, { "cell_type": "markdown", "id": "95b9e723-aa1b-4e9d-a150-d9c6a452b4ca", "metadata": {}, "source": "### Stochastic Identification of Weights based Local weights (SITWLocal)" }, { "cell_type": "markdown", "id": "497d9b71-f765-4934-9c4c-9659e860321c", "metadata": {}, "source": [ "The fourth method related to re-identification is the SITWLocal method. In this method, reidentification is based on the determined local weights using the COMET method, unlike the other methods, which are based on ranking-based reidentification. The SITWLocal method is used to determine global weights.\n", "\n", "In this example:\n", "- explicit data are: alternatives, local weights of alternatives, types of criteria\n", "- implicit data are: criteria weights" ] }, { "cell_type": "code", "execution_count": 23, "id": "ea92fcda-525e-4893-8e36-bacb558e7668", "metadata": {}, "source": [ "# Unknown COMET expert model\n", "model_comet = COMET(COMET.make_cvalues(matrix, 3), MethodExpert(TOPSIS(), weights, types))" ], "outputs": [] }, { "cell_type": "code", "execution_count": 24, "id": "26f6e127-dab2-4988-802d-2547bae66744", "metadata": {}, "source": [ "# Local weights from an unknown expert model\n", "local_weights = np.array([get_local_weights(model_comet, alt, percent_step=0.01) for alt in matrix])" ], "outputs": [] }, { "cell_type": "code", "execution_count": 25, "id": "368b1acb-03f2-4a62-870f-f6dc633d7fd5", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyperparameters.\n", "stoch = OriginalPSO(epoch=200, pop_size=20)\n", "\n", "# Creation of a SITWLocal model object\n", "sitw_local = SITWLocal(stoch.solve, TOPSIS(), types)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 26, "id": "28c322f2-ffe9-4566-a0a0-7e6d1a8e762a", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per criteria weights epoch.\n", "sitw_local.fit(matrix, local_weights, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 27, "id": "5cd008a1-dbba-49a9-80fc-11d583242212", "metadata": {}, "source": [ "sitw_local()" ], "outputs": [] }, { "cell_type": "markdown", "id": "bfb21d55-0476-4078-ba57-ddff327e371e", "metadata": {}, "source": [ "### Stochastic Triangular Fuzzy Normalization (STFN)\n", "\n", "The fifth approach to re-identifying decision models, using fuzzy normalization, involves re-identifying the cores for the triangular numbers used to normalize the decision matrix, i.e., for individual criteria. A stochastic optimization method is used to select appropriate cores for TFNs." ] }, { "cell_type": "code", "execution_count": 28, "id": "9c5929be-e8ff-46b5-ab7d-69bbc2bb6df2", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyperparameters.\n", "stoch = OriginalPSO(epoch=1000, pop_size=100)\n", "\n", "# Creation of a STFN model object\n", "stfn = STFN(stoch.solve, TOPSIS(), bounds, weights)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 29, "id": "ab2f2be8-3630-4253-b0f6-f820b5bdcfd2", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per cores epoch.\n", "stfn.fit(matrix, rank_unknown_model, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 30, "id": "f0f98839-6b79-4aa0-a6eb-361437153c38", "metadata": {}, "source": [ "# Identified cores values\n", "stfn.cores" ], "outputs": [] }, { "cell_type": "code", "execution_count": 31, "id": "0931a6b8-a50a-4edb-93a6-26f38d4767cd", "metadata": {}, "source": [ "# Plotting the identified triangular fuzzy numbers together with the values of the cores used for normalization\n", "for i, (fun, a, m, b) in enumerate(zip(stfn(), stfn.lb, stfn.cores, stfn.ub), 1):\n", " fig, ax = plt.subplots(figsize=(4.5, 2.2), dpi=150, tight_layout=True)\n", " tfn_plot(fun, a, m, b, crit=i, ax=ax)\n", " plt.show()" ], "outputs": [] }, { "cell_type": "code", "execution_count": 32, "id": "7a1a1c9a-e9e9-4cd9-b364-e2874f42b2d9", "metadata": {}, "source": [ "# Creating a TOPSIS model with identified triangular numbers for normalization\n", "topsis_fn = TOPSIS(FuzzyNormalization(stfn()))" ], "outputs": [] }, { "cell_type": "code", "execution_count": 33, "id": "00088734-2f2d-446a-8ad0-8ef292faefac", "metadata": {}, "source": [ "# Exemplary view of the STFN-TOPSIS model\n", "fig, ax = plt.subplots(figsize=(3, 3), dpi=150, tight_layout=True)\n", "model_contourf(topsis_fn, bounds, esp=stfn.cores, model_kwargs={'weights': weights, 'types': types}, text_kwargs={'text': '$TFNs_{Cores}$'})\n", "ax.set_title('STFN-TOPSIS')\n", "plt.show()" ], "outputs": [] }, { "cell_type": "markdown", "id": "6488a38e-9893-4753-b5f9-ccd1ef7b1edc", "metadata": {}, "source": [ "### Stochastic Trapezoidal Fuzzy Normalization (STRFN)\n", "\n", "The sixth approach to re-identifying decision models, using fuzzy normalization, involves re-identifying the cores for the trapezoidal numbers used to normalize the decision matrix, i.e. for individual criteria. A stochastic optimization method is used to select appropriate cores for TRFN." ] }, { "cell_type": "code", "execution_count": 34, "id": "6880a7b3-b3f9-47f2-a972-1dfed4643c1d", "metadata": {}, "source": [ "# Selection of a stochastic optimization method with its hyperparameters.\n", "stoch = OriginalPSO(epoch=1000, pop_size=100)\n", "\n", "# Creation of a STRFN model object\n", "strfn = STRFN(stoch.solve, TOPSIS(), bounds, weights)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 35, "id": "a6d78050-120f-4c6c-83b6-9ebb4a8bd5a0", "metadata": {}, "source": [ "# Learning the model. The log_to argument was used to avoid search messages per cores epoch.\n", "strfn.fit(matrix, rank_unknown_model, log_to=None)" ], "outputs": [] }, { "cell_type": "code", "execution_count": 36, "id": "4c677dd4-57a3-4d41-bd39-f0c392890eb1", "metadata": {}, "source": [ "# Plotting the identified trapezoidal fuzzy numbers together with the values of the cores used for normalization\n", "for i, (fun, a, b, c, d) in enumerate(zip(strfn(), strfn.lb[::2], strfn.cores[::2], strfn.cores[1::2], strfn.ub[1::2]), 1):\n", " fig, ax = plt.subplots(figsize=(4.5, 2.2), dpi=150, tight_layout=True)\n", " trfn_plot(fun, a, b, c, d, crit=i, ax=ax)\n", " plt.show()" ], "outputs": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.0" } }, "nbformat": 4, "nbformat_minor": 5 }