{ "cells": [ { "cell_type": "markdown", "metadata": { "id": "ADioeXA4e9DY" }, "source": [ "# Python for Data Analysis" ] }, { "cell_type": "code", "execution_count": 1, "metadata": { "id": "Qri5nDu-eSXQ" }, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "markdown", "metadata": { "id": "w5l3lSPkeSXN" }, "source": [ "# The `numpy` library\n", "\n", "The `numpy` package provides $n$-dimensional homogeneous arrays (all elements have the same type); an element cannot be inserted into or removed from an arbitrary position in place. `numpy` implements many operations on whole arrays. If a problem can be solved by a sequence of whole-array operations, the solution will be about as efficient as in `C` or `matlab`, because most of the time is spent in library functions written in `C`.\n", "\n", "\n", "## 1. One-dimensional arrays\n", "\n", "#### 1.1 Array types and attributes" ] }, { "cell_type": "markdown", "metadata": { "id": "4OYggcLMeSXY" }, "source": [ "A list can be converted to an array." ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "60DzfgyMeSXa", "outputId": "a03727c2-4295-43f2-d0ec-07ff0863bc6e" }, "outputs": [ { "data": { "text/plain": [ "(array([ 5, 7, -3, 4, 2, -4]), numpy.ndarray)" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([5, 7, -3, 4, 2, -4])\n", "a, type(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "G1DRYeu5eSX0" }, "source": [ "#### 1.2 Indexing\n", "\n", "Arrays are indexed in the usual way." 
] }, { "cell_type": "code", "execution_count": 3, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Y9tGXwDIeSX1", "outputId": "cb25adaf-3e7c-4ff2-afb3-e86186d9e949" }, "outputs": [ { "data": { "text/plain": [ "np.int64(7)" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a[1]" ] }, { "cell_type": "markdown", "metadata": { "id": "rmKD1RweeSX3" }, "source": [ "Arrays are mutable objects." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "QlkGiSjxeSX3", "outputId": "8bb4dfec-01c5-40f0-d378-852198c3c626" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 5 3 -3 4 2 -4]\n" ] } ], "source": [ "a[1] = 3\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "emMTo7dI3Ovd" }, "source": [ "Suppose we want to drop the negative values from it. A boolean mask selects the elements to keep." ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "gnEca4_-3Ove", "outputId": "b6cbc36c-d379-4a3f-bf3e-899b7089db66" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ True True False True True False]\n", "[5 3 4 2]\n" ] } ], "source": [ "print(a > 0)\n", "print(a[a > 0])" ] }, { "cell_type": "markdown", "metadata": { "id": "6n7n_2-n3Ove" }, "source": [ "Alternatively, the values selected by a mask can be overwritten, for example with zeros." ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "3gF4sDRQ3Ove", "outputId": "8060fa35-02b1-4b44-d3a6-1934691e1c01" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[5 3 0 4 2 0]\n" ] } ], "source": [ "a[a < 0] = 0\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "ViI6VqnXeSYE" }, "source": [ "#### 1.3 Creating arrays\n", "\n", "Arrays filled with zeros or ones. 
It is often better to create such an array first and then assign values to its elements." ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "V8yK0FLteSYF", "outputId": "278ad938-6cdd-428a-d9be-d808c509d120" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0. 0.]\n", "[1 1 1]\n" ] } ], "source": [ "a = np.zeros(3)\n", "b = np.ones(3, dtype=np.int64)\n", "print(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "-nnOn0gNeSYL" }, "source": [ "The `arange` function is similar to `range`. Its arguments may be floating-point numbers. With floating-point arguments, avoid cases where *(stop - start) / step* is an integer, because whether the last element is included then depends on rounding errors. It is better for the end of the range to fall somewhere in the middle of a step." ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "gqTdkX18eSYL", "outputId": "478de6cf-8ae3-4432-ffcc-37332b82a066" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 2 4 6 8]\n", "[0. 2. 4. 6. 8.]\n" ] } ], "source": [ "a = np.arange(0, 9, 2)\n", "b = np.arange(0., 9, 2)\n", "print(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "vgPAKPaqeSYP" }, "source": [ "Sequences of numbers with a constant step can also be created with the `linspace` function. Both the start and the end of the range are included; the last argument is the number of points." ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "l1ruNluSeSYP", "outputId": "00c64d37-44a6-405c-c235-92bde302958e", "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 2. 4. 6. 
8.]\n" ] } ], "source": [ "a = np.linspace(0, 8, 5)\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "7t8OsUzt3Ovf" }, "source": [ "The `np.random.random()` function.\n", "It creates an array of the given shape and fills it with random floating-point numbers drawn from the continuous uniform distribution on the interval [0, 1)." ] }, { "cell_type": "code", "execution_count": 10, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "K9nQT1HY3Ovg", "outputId": "a5293cce-2105-4c68-e495-bd22be6584b3" }, "outputs": [ { "data": { "text/plain": [ "array([0.28194415, 0.10660636, 0.3367799 , 0.66343749, 0.79696398])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.random(5)" ] }, { "cell_type": "markdown", "metadata": { "id": "QFhzbGhb3Ovg" }, "source": [ "The `np.random.choice` function.\n", "It draws a random sample from a given one-dimensional array; if an integer `n` is passed instead, the sample is drawn from `np.arange(n)`." ] }, { "cell_type": "code", "execution_count": 11, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ffjn77Hd3Ovg", "outputId": "dbe6b448-35ce-472a-f244-20e0504689f2" }, "outputs": [ { "data": { "text/plain": [ "array([6, 9, 1, 5])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.choice(10, 4)" ] }, { "cell_type": "code", "execution_count": 12, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "iVEbZETZ3Ovg", "outputId": "9029dbf0-b7d6-4c4d-d6fd-823866abc62c" }, "outputs": [ { "data": { "text/plain": [ "array([5, 0, 8, 3, 2, 2, 0])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.random.choice(10, 7, replace=True)" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "6OidJf3_3Ovh", "outputId": "1ab8ed83-bc27-4d3e-bd2c-d79882d9f426" }, "outputs": [ { "data": { "text/plain": [ "array(['foo', 'bar'], dtype=' b)\n", 
"print(a == b)" ] }, { "cell_type": "markdown", "metadata": { "id": "vQ05bHN4eSYo" }, "source": [ "Кванторы \"существует\" и \"для всех\"." ] }, { "cell_type": "code", "execution_count": 17, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "HTMkK7wYeSYo", "outputId": "fc93a0d1-47a9-4e21-dd2e-30c38550bb6c" }, "outputs": [ { "data": { "text/plain": [ "(np.True_, np.False_)" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.any(a == b), np.all(a == b)" ] }, { "cell_type": "markdown", "metadata": { "id": "YdXbXq--eSYp" }, "source": [ "Модификация на месте." ] }, { "cell_type": "code", "execution_count": 18, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "iNT_nYSveSYq", "outputId": "487b93fa-4c98-4226-ec2d-e8be873ce59c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0 2 1]\n", "[1 3 2]\n" ] } ], "source": [ "print(a)\n", "a += 1\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 19, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "CItC6JrLeSYr", "outputId": "61f4a55f-da5b-4164-f0b3-e55eb8c47389" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[3 2 5]\n", "[ 6 4 10]\n" ] } ], "source": [ "print(b)\n", "b *= 2\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "JtiaPtiGeSYu" }, "source": [ "При выполнении операций над массивами деление на 0 не возбуждает исключения, а даёт значения `np.nan` или `np.inf`." ] }, { "cell_type": "code", "execution_count": 20, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "uksiTX96eSYu", "outputId": "37ed7b87-2c42-4b43-d700-c579742a92c0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0. 
nan inf -inf]\n" ] }, { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_28824/3088186758.py:1: RuntimeWarning: divide by zero encountered in divide\n", " print(np.array([0.0, 0.0, 1.0, -1.0]) / np.array([1.0, 0.0, 0.0, 0.0]))\n", "/tmp/ipykernel_28824/3088186758.py:1: RuntimeWarning: invalid value encountered in divide\n", " print(np.array([0.0, 0.0, 1.0, -1.0]) / np.array([1.0, 0.0, 0.0, 0.0]))\n" ] } ], "source": [ "print(np.array([0.0, 0.0, 1.0, -1.0]) / np.array([1.0, 0.0, 0.0, 0.0]))" ] }, { "cell_type": "code", "execution_count": 21, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "sXwqQVDHeSYw", "outputId": "0ecb63d5-f780-4ac2-b826-50db4536b8bb" }, "outputs": [ { "data": { "text/plain": [ "(nan, inf, nan, 0.0)" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.nan + 1, np.inf + 1, np.inf * 0, 1. / np.inf" ] }, { "cell_type": "markdown", "metadata": { "id": "ukM-LTuAeSYx" }, "source": [ "Sum and product of all array elements; maximum and minimum elements; mean and standard deviation." ] }, { "cell_type": "code", "execution_count": 22, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ZIo9nEFCeSYx", "outputId": "4d1170a6-ca34-4ba3-9fec-cefb2f6930ba" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 6 4 10]\n" ] }, { "data": { "text/plain": [ "(np.int64(20),\n", " np.int64(240),\n", " np.int64(10),\n", " np.int64(4),\n", " np.float64(6.666666666666667),\n", " np.float64(2.494438257849294))" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "print(b)\n", "b.sum(), b.prod(), b.max(), b.min(), b.mean(), b.std()" ] }, { "cell_type": "markdown", "metadata": { "id": "rju3ggwJeSYy" }, "source": [ "Elementwise mathematical functions are built in." 
] }, { "cell_type": "code", "execution_count": 23, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "H3xDbf6PeSYz", "outputId": "baec065c-9065-42d7-cd91-e134f188ad3f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2.44948974 2. 3.16227766]\n", "[ 403.42879349 54.59815003 22026.46579481]\n", "[1.79175947 1.38629436 2.30258509]\n", "[-0.2794155 -0.7568025 -0.54402111]\n", "2.718281828459045 3.141592653589793\n" ] } ], "source": [ "print(np.sqrt(b))\n", "print(np.exp(b))\n", "print(np.log(b))\n", "print(np.sin(b))\n", "print(np.e, np.pi)" ] }, { "cell_type": "markdown", "metadata": { "id": "g6EBDaBWeSY0" }, "source": [ "Sometimes partial (cumulative) sums are needed. They may come in handy in our courses." ] }, { "cell_type": "code", "execution_count": 24, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "JzFVH5DIeSY0", "outputId": "0314f209-9510-4f7e-f65d-09119992fafd" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 6 10 20]\n" ] } ], "source": [ "print(b.cumsum())" ] }, { "cell_type": "markdown", "metadata": { "id": "dyaDa9YXeSY1" }, "source": [ "#### 2.2 Sorting and modifying arrays\n", "\n", "The `np.sort` function returns a sorted copy, while the `sort` method sorts in place." 
] }, { "cell_type": "code", "execution_count": 25, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "XS7IRW09eSY2", "outputId": "f6228618-b2a8-4041-a7ae-e10214852c84" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 6 4 10]\n", "[ 4 6 10]\n", "[ 6 4 10]\n" ] } ], "source": [ "print(b)\n", "print(np.sort(b))\n", "print(b)" ] }, { "cell_type": "code", "execution_count": 26, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "QENzFihJeSY3", "outputId": "20d00e78-04b9-4470-f638-2de205afc226" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 6 4 10]\n", "[ 4 6 10]\n" ] } ], "source": [ "print(b)\n", "b.sort()\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "ZPjBXM_OeSY4" }, "source": [ "Concatenating arrays." ] }, { "cell_type": "code", "execution_count": 27, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "_SYn0MdseSY4", "outputId": "e8a0d1ee-e7f3-4be2-e0e0-20c16fc72202" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1 3 2]\n", "[ 4 6 10]\n", "[ 1 3 2 4 6 10]\n" ] } ], "source": [ "print(a)\n", "print(b)\n", "a = np.hstack((a, b))\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "q669pBe0eSY_" }, "source": [ "#### 2.3 Ways to index arrays\n", "\n", "There are several ways to index an array. Here is an ordinary index." ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "oW8SgWFLeSZA", "outputId": "f070500a-badc-48d6-d7a7-f60e5760d22b" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. 
]\n" ] } ], "source": [ "a = np.linspace(0, 1, 11)\n", "print(a)" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "7eQ_xryueSZB", "outputId": "a03b4e1c-2b57-4816-a11c-c5eea54efa9d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.2\n" ] } ], "source": [ "print(a[2])" ] }, { "cell_type": "markdown", "metadata": { "id": "corcUOljeSZC" }, "source": [ "A range of indices (a slice). This creates a new array header that points to the same data. Changes made through such an array are visible in the original array as well." ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "g7sRF3GqeSZC", "outputId": "c0f0da24-fab0-45f6-f0a2-2a642a72249c" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.2 0.3 0.4 0.5]\n", "[-0.2 0.3 0.4 0.5]\n", "[ 0. 0.1 -0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]\n" ] } ], "source": [ "b = a[2:6]\n", "print(b)\n", "\n", "b[0] = -0.2\n", "print(b)\n", "\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "lE9gziSweSZG" }, "source": [ "A range with step 2." ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "wLEQPJsXeSZH", "outputId": "d9c1120d-97bf-472b-9720-a5e2d7c25b6b" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.1 0.3 0.5 0.7 0.9]\n" ] } ], "source": [ "b = a[1:10:2]\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "Nh9e4ZWNeSZK" }, "source": [ "The array in reverse order." ] }, { "cell_type": "code", "execution_count": 32, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "LKbKi90neSZK", "outputId": "aa8126fe-1a81-4dcb-dc57-1c83e9f983f2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 1. 0.9 0.8 0.7 0.6 0.5 0.4 0.3 -0.2 0.1 0. 
]\n" ] } ], "source": [ "b = a[::-1]\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "Zjq__bNXeSZP" }, "source": [ "To copy the array data as well, use the `copy` method." ] }, { "cell_type": "code", "execution_count": 33, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "VVds6EI5eSZP", "outputId": "480a6d30-e8fd-4c51-b7fc-c9f6b2e45068" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0.1 0. 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]\n", "[ 0. 0.1 -0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]\n" ] } ], "source": [ "b = a.copy()\n", "b[2] = 0\n", "print(b)\n", "print(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "2ynzvLTmeSZR" }, "source": [ "A list of indices can be given." ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "id": "fRF4VWZmeSZR" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[-0.2 0.3 0.5]\n" ] } ], "source": [ "print(a[[2, 3, 5]])" ] }, { "cell_type": "markdown", "metadata": { "id": "JF2kGGHDeSZT" }, "source": [ "A boolean array of the same size can be given." ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "id": "yeFzrHxfeSZT", "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[False True False True True True True True True True True]\n" ] } ], "source": [ "b = a > 0\n", "print(b)" ] }, { "cell_type": "code", "execution_count": 36, "metadata": { "id": "CS_Xass1eSZU" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0.1 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]\n", "[ 0. 0.1 -0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1. ]\n", "[False True False True True True True True True True True]\n" ] } ], "source": [ "print(a[b])\n", "print(a)\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": { "id": "DUBkecfkeSZW" }, "source": [ "## 3. 
Two-dimensional arrays\n", "\n", "#### 3.1 Creation and simple operations" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "kELaHm5YeSZW", "outputId": "d5644e82-f679-4878-e416-42df3b986f92" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0. 1.]\n", " [-1. 0.]]\n", "-1.0\n" ] } ], "source": [ "a = np.array([[0.0, 1.0], [-1.0, 0.0]])\n", "print(a)\n", "print(a[1, 0])" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "0TUKspWjeSZZ", "outputId": "673e25c7-a442-456d-da85-bc65b0f4957a" }, "outputs": [ { "data": { "text/plain": [ "(2, 2)" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.shape" ] }, { "cell_type": "code", "execution_count": 39, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "mH7fe1FkeSZa", "outputId": "56251c84-be12-4d7d-eeef-b8cca8bfae74" }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "jWVJLH9ZeSZh" }, "source": [ "It can be flattened into a one-dimensional array." ] }, { "cell_type": "code", "execution_count": 40, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "aJdF4hyreSZi", "outputId": "b6d87e4d-c6b5-4a82-b982-e73537165700" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 0. 1. -1. 0.]\n" ] } ], "source": [ "print(a.ravel())" ] }, { "cell_type": "markdown", "metadata": { "id": "ff31TTeHeSZj" }, "source": [ "Arithmetic operations are elementwise." ] }, { "cell_type": "code", "execution_count": 41, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "a6WXKEj4eSZj", "outputId": "ec1259f0-3491-44c1-f6be-cd0dea07fc15" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 2.]\n", " [0. 
1.]]\n", "[[ 0. 2.]\n", " [-2. 0.]]\n", "[[ 0. 2.]\n", " [-1. 1.]]\n", "[[0. 1.]\n", " [1. 2.]]\n", "[[ 1. -4.]\n", " [-4. 8.]]\n" ] } ], "source": [ "print(a + 1)\n", "print(a * 2)\n", "print(a + [0, 1]) # the second operand is expanded to a matrix by repeating the row\n", "print(a + np.array([[0, 2]]).T) # .T is the transpose\n", "\n", "b = np.array([[1.0, -5.0], [-3.0, 8.0]])\n", "print(a + b)" ] }, { "cell_type": "markdown", "metadata": { "id": "nLIlrz-BeSZl" }, "source": [ "#### 3.2 Working with matrices\n", "\n", "Elementwise and matrix multiplication." ] }, { "cell_type": "code", "execution_count": 42, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "xAUzdffSeSZl", "outputId": "2dbeb579-5105-48bb-9feb-12ce02f15d63" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0. -5.]\n", " [ 3. 0.]]\n" ] } ], "source": [ "print(a * b)" ] }, { "cell_type": "code", "execution_count": 43, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "GAI2uX5PeSZn", "outputId": "50ccf93c-20c0-457c-b5e9-ab3c00f0d802" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-3. 8.]\n", " [-1. 5.]]\n" ] } ], "source": [ "print(a @ b)" ] }, { "cell_type": "code", "execution_count": 44, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "6ERS9kXLeSZp", "outputId": "76a44c70-534a-4eb8-9271-cf950297709f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 5. 1.]\n", " [-8. -3.]]\n" ] } ], "source": [ "print(b @ a)" ] }, { "cell_type": "markdown", "metadata": { "id": "KwK4g9gZeSaA" }, "source": [ "The identity matrix." ] }, { "cell_type": "code", "execution_count": 45, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ST8KOHBWeSaB", "outputId": "b20ac614-d4d1-4e53-99d0-2e0b39a5d00d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 0. 0. 0.]\n", " [0. 1. 0. 0.]\n", " [0. 0. 1. 0.]\n", " [0. 0. 0. 
1.]]\n" ] } ], "source": [ "I = np.eye(4)\n", "print(I)" ] }, { "cell_type": "markdown", "metadata": { "id": "zDOuMhIzeSaC" }, "source": [ "The `reshape` method." ] }, { "cell_type": "code", "execution_count": 46, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "iV0R_x39eSaC", "outputId": "5eb58827-ccc1-44e9-a13a-533c604a036f" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1. 0. 0. 0. 0. 1.]\n" ] } ], "source": [ "print(I.reshape(16))" ] }, { "cell_type": "code", "execution_count": 47, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "uIEi4h1MeSaD", "outputId": "b2115a77-15f9-4f30-96c3-3bd8eb13e204" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 0. 0. 0. 0. 1. 0. 0.]\n", " [0. 0. 1. 0. 0. 0. 0. 1.]]\n" ] } ], "source": [ "print(I.reshape(2, 8))" ] }, { "cell_type": "markdown", "metadata": { "id": "HfQZjZZleSaE" }, "source": [ "A row." ] }, { "cell_type": "code", "execution_count": 48, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "dU-B5CsdeSaE", "outputId": "60595b08-120f-4fe0-dc1f-7dcc14e3b197" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 1. 0. 0.]\n" ] } ], "source": [ "print(I[1])" ] }, { "cell_type": "markdown", "metadata": { "id": "V6lshMkHeSaH" }, "source": [ "A column." ] }, { "cell_type": "code", "execution_count": 49, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "orbMt6WMeSaH", "outputId": "cbd660ee-cdba-4595-85e4-109706cde7e1" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[0. 0. 1. 0.]\n" ] } ], "source": [ "print(I[:, 2])" ] }, { "cell_type": "markdown", "metadata": { "id": "nqnUtI-SeSaI" }, "source": [ "A submatrix." 
] }, { "cell_type": "code", "execution_count": 50, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "p8ycTAw3eSaI", "outputId": "868c26e6-726e-4e79-9c74-efd0cfda2ba8", "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0. 0.]\n", " [1. 0.]]\n" ] } ], "source": [ "print(I[0:2, 1:3])" ] }, { "cell_type": "markdown", "metadata": { "id": "OFz0xwrreSaL" }, "source": [ "The transposed matrix." ] }, { "cell_type": "code", "execution_count": 51, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Z4J5zzvNeSaL", "outputId": "67475b6a-0bac-4452-fc0d-51722b42f6a2" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 1. -3.]\n", " [-5. 8.]]\n" ] } ], "source": [ "print(b.T)" ] }, { "cell_type": "markdown", "metadata": { "id": "yGeRknu3eSaM" }, "source": [ "Joining matrices horizontally and vertically." ] }, { "cell_type": "code", "execution_count": 52, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ObH0A6B3eSaM", "outputId": "9e711514-e2c5-4496-f0b2-f36a443be012" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0 1]\n", " [2 3]]\n", "[[4 5 6]\n", " [7 8 9]]\n", "[[4 5]\n", " [6 7]\n", " [8 9]]\n" ] } ], "source": [ "a = np.array([[0, 1], [2, 3]])\n", "b = np.array([[4, 5, 6], [7, 8, 9]])\n", "c = np.array([[4, 5], [6, 7], [8, 9]])\n", "print(a)\n", "print(b)\n", "print(c)" ] }, { "cell_type": "code", "execution_count": 53, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "0cpapPNveSaN", "outputId": "476034c1-18b5-410b-e2d8-7b2b9e715f71" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0 1 4 5 6]\n", " [2 3 7 8 9]]\n" ] } ], "source": [ "print(np.hstack((a, b)))" ] }, { "cell_type": "code", "execution_count": 54, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "EPxXHWTAeSaO", "outputId": "5f85e045-fabd-4b8c-ffe6-be7664468ede" }, "outputs": [ { 
"name": "stdout", "output_type": "stream", "text": [ "[[0 1]\n", " [2 3]\n", " [4 5]\n", " [6 7]\n", " [8 9]]\n" ] } ], "source": [ "print(np.vstack((a, c)))" ] }, { "cell_type": "markdown", "metadata": { "id": "i3SBcbGzeSaP" }, "source": [ "Сумма всех элементов; суммы столбцов; суммы строк." ] }, { "cell_type": "code", "execution_count": 55, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "V4Mgz9GIe0iL", "outputId": "88b3d22a-d71c-4fad-f134-c25a24471f23" }, "outputs": [ { "data": { "text/plain": [ "array([[4, 5, 6],\n", " [7, 8, 9]])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b" ] }, { "cell_type": "code", "execution_count": 56, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "kQLBs3mHeSaP", "outputId": "79299428-0996-4211-b731-79b77e1e8313" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "39\n", "[11 13 15]\n", "[15 24]\n" ] } ], "source": [ "print(b.sum())\n", "print(b.sum(axis=0))\n", "print(b.sum(axis=1))" ] }, { "cell_type": "markdown", "metadata": { "id": "rc8otaIBeSaQ" }, "source": [ "Аналогично работают `prod`, `max`, `min` и т.д." ] }, { "cell_type": "code", "execution_count": 57, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "822so2-9eSaQ", "outputId": "9ed50917-5ad7-4e84-8a8d-4d086e43d763" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "9\n", "[7 8 9]\n", "[4 7]\n" ] } ], "source": [ "print(b.max())\n", "print(b.max(axis=0))\n", "print(b.min(axis=1))" ] }, { "cell_type": "markdown", "metadata": { "id": "AlzAeJI-eSaR" }, "source": [ "След - сумма диагональных элементов." 
] }, { "cell_type": "code", "execution_count": 58, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "-BlaKobheSaR", "outputId": "41709f85-0265-4f22-b504-9b2e358fb6a4" }, "outputs": [ { "data": { "text/plain": [ "np.int64(3)" ] }, "execution_count": 58, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.trace(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "7VULr5jSeSaS" }, "source": [ "## 4. Tensors (multidimensional arrays)" ] }, { "cell_type": "markdown", "metadata": { "id": "3wwvC1UOQppZ" }, "source": [ "#### 4.1 Creation and simple operations" ] }, { "cell_type": "code", "execution_count": 59, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "_Vx6sZvjeSaT", "outputId": "270e56a2-15d3-40a7-bc12-27b56d508c93" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[[ 0 1 2 3]\n", " [ 4 5 6 7]\n", " [ 8 9 10 11]]\n", "\n", " [[12 13 14 15]\n", " [16 17 18 19]\n", " [20 21 22 23]]]\n" ] } ], "source": [ "X = np.arange(24).reshape(2, 3, 4)\n", "print(X)" ] }, { "cell_type": "markdown", "metadata": { "id": "6-a_IMz5eSaU" }, "source": [ "Summation (the other operations work similarly)." 
] }, { "cell_type": "code", "execution_count": 60, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "XnTt-275eSaU", "outputId": "14391cab-ea20-4806-ac30-f2ebaa37d0f7" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[12 14 16 18]\n", " [20 22 24 26]\n", " [28 30 32 34]]\n" ] } ], "source": [ "# sum over axis 0 only: for fixed j and k,\n", "# add up the elements with indices (*, j, k)\n", "print(X.sum(axis=0))" ] }, { "cell_type": "code", "execution_count": 61, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "O5m-SV4Ge0iN", "outputId": "e9001e81-f3b5-47af-f581-a8688deabc05" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[ 66 210]\n" ] } ], "source": [ "# sum over two axes at once: for fixed i,\n", "# add up the elements with indices (i, *, *)\n", "print(X.sum(axis=(1, 2)))" ] }, { "cell_type": "markdown", "metadata": { "id": "-FWPCnqzQppZ" }, "source": [ "#### 4.2. Broadcasting" ] }, { "cell_type": "markdown", "metadata": { "id": "uPjV3yjKQppZ" }, "source": [ "In most of the arithmetic operations above, such as addition and multiplication, the operands were arrays of the same shape. In the simplest case the operands were one-dimensional arrays of the same length." ] }, { "cell_type": "code", "execution_count": 62, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "vjFxZY6JQppa", "outputId": "ef8baac4-3a40-4554-f441-36f90d013e06" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 4 6]\n" ] } ], "source": [ "# The simplest case\n", "a = np.array([1, 2, 3])\n", "b = np.array([2, 2, 2])\n", "print(a * b)" ] }, { "cell_type": "markdown", "metadata": { "id": "CFUBi7CDQppa" }, "source": [ "The multiplication was elementwise: every element of the array $a$ was multiplied by $2$. But we know this can be done more simply, by multiplying the array by $2$." 
] }, { "cell_type": "code", "execution_count": 63, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "eSGmLzA5Qppa", "outputId": "a58ccfb2-5b71-4e9f-bef2-11b43fd01cde" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 4 6]\n" ] } ], "source": [ "# Multiplying an array by a number\n", "print(a * 2)" ] }, { "cell_type": "markdown", "metadata": { "id": "u7_r9j_9Qppb" }, "source": [ "In fact, the behavior is the same if a one-dimensional array is multiplied by an array of length $1$." ] }, { "cell_type": "code", "execution_count": 64, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "yg6006goQppb", "outputId": "6b80a310-14ec-47e6-b293-b835d55cc4c7" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[2 4 6]\n" ] } ], "source": [ "# Multiplying arrays of different lengths\n", "print(a * [2])" ] }, { "cell_type": "markdown", "metadata": { "id": "FoaMvoBmQppc" }, "source": [ "This is so-called *broadcasting*: one array is \"stretched\" to match the shape of the other.\n", "\n", "![theory.broadcast_1.gif]()" ] }, { "cell_type": "markdown", "metadata": { "id": "vzmLw9c3Qppc" }, "source": [ "The same mechanism works for multidimensional arrays. If along some dimension one array has size $1$ and the other has an arbitrary size, the first can be \"stretched\" along that dimension. Thus arrays can be multiplied together if, in every dimension where their sizes differ, at least one of them has size $1$. The rule is the same for other elementwise operations.\n", "\n", "Note that dimensions are matched from right to left. If the numbers of dimensions differ, the array with fewer dimensions is first padded on the left with dimensions of size 1. For example, when an array of shape $4 \\times 3$ is added to an array of shape $3$, the latter is first treated as an array of shape $1 \\times 3$." 
] }, { "cell_type": "code", "execution_count": 65, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "MPaw_M3iQppc", "outputId": "3111ab64-9c1d-45f9-c403-9957b06adae0" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1 2]\n", " [10 11 12]\n", " [20 21 22]\n", " [30 31 32]]\n" ] } ], "source": [ "\n", "a = np.array([[ 0, 0, 0],\n", "              [10, 10, 10],\n", "              [20, 20, 20],\n", "              [30, 30, 30]])\n", "\n", "b = np.array([0, 1, 2])\n", "\n", "print(a + b)" ] }, { "cell_type": "markdown", "metadata": { "id": "NVxT_nHjQppd" }, "source": [ "The operation can be visualized schematically as follows.\n", "\n", "![theory.broadcast_2.gif]()\n", "\n", "\n", "If the sizes in some dimension, aligned from the right, differ and neither of them is $1$, the operation cannot be performed, as in the diagram below.\n", "\n", "![theory.broadcast_3.gif]()\n" ] }, { "cell_type": "markdown", "metadata": { "id": "p-EYlh-03Ov4" }, "source": [ "If the shapes are incompatible, an error is raised."
] }, { "cell_type": "code", "execution_count": 66, "metadata": { "colab": { "base_uri": "https://localhost:8080/", "height": 182 }, "id": "jPo3OiBZ3Ov4", "outputId": "7e261607-3764-4e80-ffca-2dcd6e7e028b" }, "outputs": [ { "ename": "ValueError", "evalue": "operands could not be broadcast together with shapes (4,3) (4,) ", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[66], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m b \u001b[38;5;241m=\u001b[39m np\u001b[38;5;241m.\u001b[39marray([\u001b[38;5;241m1.0\u001b[39m, \u001b[38;5;241m2.0\u001b[39m, \u001b[38;5;241m3.0\u001b[39m, \u001b[38;5;241m4.0\u001b[39m])\n\u001b[0;32m----> 2\u001b[0m \u001b[43ma\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m+\u001b[39;49m\u001b[43m \u001b[49m\u001b[43mb\u001b[49m\n", "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (4,3) (4,) " ] } ], "source": [ "b = np.array([1.0, 2.0, 3.0, 4.0])\n", "a + b" ] }, { "cell_type": "markdown", "metadata": { "id": "Hn5BEycv3Ov4" }, "source": [ "If the arrays have incompatible shapes, they can first be reshaped so that the shapes become compatible." ] }, { "cell_type": "code", "execution_count": 67, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "BsLoImoW3Ov4", "outputId": "f1f38273-63b0-420c-b924-4b9bdc3a2c57" }, "outputs": [ { "data": { "text/plain": [ "array([[ 1., 2., 3.],\n", " [11., 12., 13.],\n", " [21., 22., 23.],\n", " [31., 32., 33.]])" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([0.0, 10.0, 20.0, 30.0])\n", "b = np.array([1.0, 2.0, 3.0])\n", "a.reshape((-1, 1)) + b" ] }, { "cell_type": "markdown", "metadata": { "id": "sWPEBlaKQppe" }, "source": [ "*Remark*\n", "\n", "Broadcasting is worth knowing, but it should be used with care. NumPy does not physically copy the \"stretched\" array, but the result of the operation has the full broadcast shape, so combining large arrays can allocate far more memory than the operands themselves occupy. This needs particular attention when working on a GPU." ] }, { "cell_type": "markdown", "metadata": { "id": "qpZqvsLu3Ov5" }, "source": [ "When working with NumPy arrays, it is often necessary to add new axes and remove existing ones. Adding a new axis is sometimes most convenient with the special object `np.newaxis`. For example, suppose we have a one-dimensional array:" ] }, { "cell_type": "code", "execution_count": 68, "metadata": { "id": "ptHIMxD73Ov5" }, "outputs": [], "source": [ "a = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])" ] }, { "cell_type": "markdown", "metadata": { "id": "wbMn6eA93Ov5" }, "source": [ "It has one axis, that is, one dimension. Let us add another axis, say, at the front. With `np.newaxis` this can be done as follows:" ] }, { "cell_type": "code", "execution_count": 69, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "SEVHK1rc3Ov5", "outputId": "06ea936c-92b4-4b7b-e0a3-5b550202936f" }, "outputs": [ { "data": { "text/plain": [ "(1, 10)" ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "b = a[np.newaxis, :] # add a new axis 0\n", "b.shape" ] }, { "cell_type": "markdown", "metadata": { "id": "osZG_dcW3Ov5" }, "source": [ "Alternatively, two axes can be added at once:" ] }, { "cell_type": "code", "execution_count": 70, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "00aotrkP3Ov5", "outputId": "18d662d3-6e0f-4d7c-e853-171350107a96" }, "outputs": [ { "data": { "text/plain": [ "(1, 10, 1)" ] }, "execution_count": 70, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c = a[np.newaxis, :, np.newaxis]\n", "c.shape" ] }, { "cell_type": "markdown", "metadata": { "id": "fQu4JWrkeSaV" }, "source": [ "## 5. 
Linear algebra" ] }, { "cell_type": "code", "execution_count": 71, "metadata": { "id": "KJ9qk8KBe0iO" }, "outputs": [], "source": [ "a = np.array([[0, 1], [2, 3]])" ] }, { "cell_type": "code", "execution_count": 72, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "QfoGKtHdeSaV", "outputId": "7931ab04-ea1b-4cf4-e6f5-0331bbf06cce" }, "outputs": [ { "data": { "text/plain": [ "np.float64(-2.0)" ] }, "execution_count": 72, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.linalg.det(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "fh6WDlobeSaW" }, "source": [ "The inverse matrix." ] }, { "cell_type": "code", "execution_count": 73, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "6_JMYc5NeSaX", "outputId": "17aa79fe-c804-4b1a-c6e6-836d4dfec4a7" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-1.5 0.5]\n", " [ 1. 0. ]]\n" ] } ], "source": [ "a1 = np.linalg.inv(a)\n", "print(a1)" ] }, { "cell_type": "code", "execution_count": 74, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "9RK1Y8tOeSaY", "outputId": "c89fefe9-326c-4847-a7ec-d71ac9b2aa9d" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[1. 0.]\n", " [0. 1.]]\n", "[[1. 0.]\n", " [0. 
1.]]\n" ] } ], "source": [ "print(a @ a1)\n", "print(a1 @ a)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "ename": "LinAlgError", "evalue": "Singular matrix", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mLinAlgError\u001b[0m Traceback (most recent call last)", "Cell \u001b[0;32mIn[4], line 2\u001b[0m\n\u001b[1;32m 1\u001b[0m a \u001b[38;5;241m=\u001b[39m np\u001b[38;5;241m.\u001b[39marray([[\u001b[38;5;241m1e-9\u001b[39m, \u001b[38;5;241m0\u001b[39m],[\u001b[38;5;241m0\u001b[39m, \u001b[38;5;241m0\u001b[39m]])\n\u001b[0;32m----> 2\u001b[0m \u001b[43mnp\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mlinalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43minv\u001b[49m\u001b[43m(\u001b[49m\u001b[43ma\u001b[49m\u001b[43m)\u001b[49m\n", "File \u001b[0;32m~/torch-env/lib/python3.12/site-packages/numpy/linalg/linalg.py:561\u001b[0m, in \u001b[0;36minv\u001b[0;34m(a)\u001b[0m\n\u001b[1;32m 559\u001b[0m signature \u001b[38;5;241m=\u001b[39m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124mD->D\u001b[39m\u001b[38;5;124m'\u001b[39m \u001b[38;5;28;01mif\u001b[39;00m isComplexType(t) \u001b[38;5;28;01melse\u001b[39;00m \u001b[38;5;124m'\u001b[39m\u001b[38;5;124md->d\u001b[39m\u001b[38;5;124m'\u001b[39m\n\u001b[1;32m 560\u001b[0m extobj \u001b[38;5;241m=\u001b[39m get_linalg_error_extobj(_raise_linalgerror_singular)\n\u001b[0;32m--> 561\u001b[0m ainv \u001b[38;5;241m=\u001b[39m \u001b[43m_umath_linalg\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43minv\u001b[49m\u001b[43m(\u001b[49m\u001b[43ma\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43msignature\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43msignature\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mextobj\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mextobj\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 562\u001b[0m 
\u001b[38;5;28;01mreturn\u001b[39;00m wrap(ainv\u001b[38;5;241m.\u001b[39mastype(result_t, copy\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m))\n", "File \u001b[0;32m~/torch-env/lib/python3.12/site-packages/numpy/linalg/linalg.py:112\u001b[0m, in \u001b[0;36m_raise_linalgerror_singular\u001b[0;34m(err, flag)\u001b[0m\n\u001b[1;32m 111\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m\u001b[38;5;250m \u001b[39m\u001b[38;5;21m_raise_linalgerror_singular\u001b[39m(err, flag):\n\u001b[0;32m--> 112\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m LinAlgError(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mSingular matrix\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n", "\u001b[0;31mLinAlgError\u001b[0m: Singular matrix" ] } ], "source": [ "a = np.array([[1e-9, 0],[0, 0]])\n", "np.linalg.inv(a)" ] }, { "cell_type": "markdown", "metadata": { "id": "Y4voLe-LeSau" }, "source": [ "## 6. NumPy performance\n", "\n", "Consider a simple example: the sum of the first $10^8$ non-negative integers, from $0$ to $10^8 - 1$." ] }, { "cell_type": "code", "execution_count": 75, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "nvCQ_4WReSav", "outputId": "206ff723-f29c-47c7-ce17-b061a9ce95f3" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4999999950000000\n", "CPU times: user 12.4 s, sys: 1.79 ms, total: 12.4 s\n", "Wall time: 12.4 s\n" ] } ], "source": [ "%%time\n", "\n", "sum_value = 0\n", "for i in range(10 ** 8):\n", "    sum_value += i\n", "print(sum_value)" ] }, { "cell_type": "markdown", "metadata": { "id": "v4SQIcy8eSaw" }, "source": [ "Slightly improved code."
] }, { "cell_type": "code", "execution_count": 76, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "ewr9BFMpeSaw", "outputId": "c476b8da-b480-4501-923b-0638f89e7c39" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4999999950000000\n", "CPU times: user 2.63 s, sys: 8.12 ms, total: 2.64 s\n", "Wall time: 2.62 s\n" ] } ], "source": [ "%%time\n", "\n", "sum_value = sum(range(10 ** 8))\n", "print(sum_value)" ] }, { "cell_type": "markdown", "metadata": { "id": "pUjiYs86eSax" }, "source": [ "Code that uses functions from the `numpy` library." ] }, { "cell_type": "code", "execution_count": 77, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "Jv6RFksoeSax", "outputId": "79144d35-18e1-477a-cda9-ea6866dfa371" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "4999999950000000\n", "CPU times: user 103 ms, sys: 78.1 ms, total: 181 ms\n", "Wall time: 192 ms\n" ] } ], "source": [ "%%time\n", "\n", "sum_value = np.arange(10 ** 8).sum()\n", "print(sum_value)" ] }, { "cell_type": "markdown", "metadata": { "id": "ya-V2P3veSay" }, "source": [ "This simple, readable code runs about $60$ times faster than the explicit loop!\n", "\n", "Consider another example. Let us generate a matrix of size $500\\times1000$ and compute the mean of the column minima.\n", "\n", "First, straightforward code, which nevertheless already uses some built-in Python functions.\n", "\n", "*Remark*. Below, `scipy.stats` is used to generate random numbers from the uniform distribution on the interval $[0, 1]$. This module is covered in the next notebook."
] }, { "cell_type": "code", "execution_count": 78, "metadata": { "id": "Ulfmy68keSaz" }, "outputs": [], "source": [ "import scipy.stats as sps" ] }, { "cell_type": "code", "execution_count": 79, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "bp_qnnwPeSaz", "outputId": "5c99eb42-2674-4c55-e43e-e36097ceacae" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.003937677101331695\n", "CPU times: user 13.2 s, sys: 86.1 ms, total: 13.2 s\n", "Wall time: 13.2 s\n" ] } ], "source": [ "%%time\n", "\n", "N, M = 500, 1000\n", "matrix = []\n", "for i in range(N):\n", "    matrix.append([sps.uniform.rvs() for j in range(M)])\n", "\n", "min_col = [min([matrix[i][j] for i in range(N)]) for j in range(M)]\n", "mean_min = sum(min_col) / M  # there are M column minima, so divide by M\n", "print(mean_min)" ] }, { "cell_type": "markdown", "metadata": { "id": "c47oHwHzeSa0" }, "source": [ "Readable code that uses functions from the `numpy` library." ] }, { "cell_type": "code", "execution_count": 80, "metadata": { "colab": { "base_uri": "https://localhost:8080/" }, "id": "h1WGN9hseSa0", "outputId": "926ee2fe-bb94-4ff5-d2e3-d75ae6ca9109" }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.0010190658049421483\n", "CPU times: user 14.3 ms, sys: 2.98 ms, total: 17.2 ms\n", "Wall time: 16.1 ms\n" ] } ], "source": [ "%%time\n", "\n", "N, M = 500, 1000\n", "matrix = sps.uniform.rvs(size=(N, M))\n", "mean_min = matrix.min(axis=0).mean()  # axis=0: minimum of each column\n", "print(mean_min)" ] }, { "cell_type": "markdown", "metadata": { "id": "ABVSxP6seSa1" }, "source": [ "This simple, readable code runs about $800$ times faster!"
] } ], "metadata": { "colab": { "provenance": [] }, "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.3" }, "vscode": { "interpreter": { "hash": "33e61429d47ea5072c304948017faf4b8066559ab931d76623e2d35f352f9359" } } }, "nbformat": 4, "nbformat_minor": 1 }