eulerbooks/notebooks/problem0007.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "87fa1305",
   "metadata": {},
   "source": [
    "# [100001st Prime](https://projecteuler.net/problem=7)\n",
    "\n",
    "Unlike integer factorization, we *can* efficiently determine whether a number is prime. One of the simplest and fastest methods is the [Miller-Rabin primality test](https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test), which is a probabilistic test.\n",
    "\n",
    "Of course, SageMath has tools to let us compute the 10001st prime without implementing a primality test ourselves! "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "3a27f454",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "104743"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "nth_prime(10001)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f644292",
   "metadata": {},
   "source": [
    "Let's look into how the Miller-Rabin test works anyway, though.\n",
    "\n",
    "## Fermat's little theorem\n",
    "First, a little bit of elementary number theory. [Fermat's little theorem](https://en.wikipedia.org/wiki/Fermat%27s_little_theorem) says that if $p$ is a prime number, then for any integer $a$,\n",
    "$$a^p \\equiv a \\pmod{p}$$\n",
    "It follows from the [contrapositive](https://en.wikipedia.org/wiki/Contraposition) that if there exists an integer $a$ such that $a^p \\not\\equiv a \\pmod{p}$, then $p$ is not prime. We then call $a$ a *witness* that proves $p$ is composite.\n",
    "\n",
    "Consider the fact that $3^{6} \\equiv 3 \\pmod{6}$, but 6 is composite. In cases like this, we say that $p$ is a *base $a$ Fermat pseudoprime*, e.g. 6 is a base 3 Fermat pseudoprime. 6 is also a base 4 Fermat pseudoprime, but 2 and 5 both serve as witnesses to 6 being composite (we need only consider bases modulo $p$).\n",
    "\n",
    "This may lead us to believe that we can just check every base $a$, and if there are no witnesses, then $p$ is prime. However, (along with this method being inefficient for large $p$), there are numbers that are Fermat pseudoprimes for *every* base, called [Carmichael numbers](https://en.wikipedia.org/wiki/Carmichael_number). Therefore, the [converse](https://w.wiki/BAyj) of Fermat's little theorem does not hold.\n",
    "\n",
    "Nevertheless, Carmichael numbers are relatively rare, so if you pick a random number to test and a few random bases, and find no witnesses, you can be pretty sure the number is in fact prime.\n",
    "\n",
    "## Miller-Rabin test\n",
    "\n",
    "The Miller-Rabin test employs another useful fact about prime numbers: if $p$ is prime, then the only values of $x$ such that \n",
    "$$x^2 \\equiv 1 \\pmod{p}$$\n",
    "are $x \\equiv \\pm 1 \\pmod{p}$. (*Proof: if $x^2 \\equiv 1 \\pmod{p}$, then $p \\vert (x+1)(x-1)$. Then by [Euclid's lemma](https://en.wikipedia.org/wiki/Euclid%27s_lemma), $p$ must divide one of $x+1$ or $x-1$, implying $x \\equiv \\pm 1 \\pmod{p}$.*)\n",
    "\n",
    "Similarly to Fermat's little theorem, the converse of this statement doesn't hold. For example, $x^2 \\equiv 1 \\pmod{9}$ only has $x=1$ and $x=-1$ as solutions, but 9 is composite.\n",
    "\n",
    "We now have two questions to lead an investigation into the primality of a number $n$:\n",
    "* Is there an integer $a$ such that $a^n \\not\\equiv a \\pmod{n}$?\n",
    "* Does 1 have a non-trivial square root modulo $n$?\n",
    "\n",
    "If the answer to either of these is yes, then $n$ is *definitely* composite. If both answers are no, this is strong evidence that $n$ is prime, *but not proof*, since we have already demonstrated that both of these conditions can have false positives. A composite number that is erroneously suggested to be prime by both of these conditions is called a *base $a$ strong pseudoprime*. Fortunately, no number (including Carmichael numbers) is a base $a$ strong pseudoprime for all bases $a$. In fact, Rabin (one of the namesakes of the algorithm) proved that if $n$ is composite, at least 3/4 of the numbers 1 to $n-1$ are witnesses.\n",
    "\n",
    "Let's finally look at the actual test. Suppose we want to test if an odd number $n$ is prime (testing whether an even number is prime is trivial). There must exist $t \\geq 1$ and odd $u$ such that $n - 1 = 2^t u$."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "720f7e8c",
   "metadata": {},
   "outputs": [],
   "source": [
    "def factor_twos(n):\n",
    "    t = 0\n",
    "    u = n - 1\n",
    "    while u % 2 == 0:\n",
    "        u //= 2\n",
    "        t += 1\n",
    "    \n",
    "    return t, u"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "980bd2cf",
   "metadata": {},
   "source": [
    "It follows that\n",
    "$$a^{n-1} \\equiv a^{2^t u} \\pmod{n}$$\n",
    "for any base $a$, so we'll randomly select a base $1 \\leq a \\leq n-1$.\n",
    "\n",
    "The test proceeds by repeatedly squaring $a^u$ modulo $n$ until we reach $a^{n-1}$. If at any point in this squaring, we find that $a^{2^{i-1} u}$ does not equal 1 or -1, but the following squaring gives $a^{2^i u} = 1$, then we have found a non-trivial square root of 1 modulo $n$. Therefore, $n$ must be composite.\n",
    "\n",
    "After $t$ squarings, we will reach $a^{n-1}$. If this equals anything other than 1, then by the contrapositive of Fermat's little theorem, this also tells us that $n$ is composite.\n",
    "\n",
    "If, after $t$ squarings, we have not found a non-trivial square root of 1 modulo $n$ and $a^{n-1} \\equiv 1 \\pmod{n}$, then either $n$ is prime, or is a base $a$ strong pseudoprime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "123d1068",
   "metadata": {},
   "outputs": [],
   "source": [
    "def witness(a, n):\n",
    "    t, u = factor_twos(n)\n",
    "    x = pow(a, u, n)\n",
    "    for _ in range(0, t):\n",
    "        x, y = pow(x, 2, n), x\n",
    "        if x == 1 and y != 1 and y != n - 1:\n",
    "            return True\n",
    "    \n",
    "    return x != 1"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8454603",
   "metadata": {},
   "source": [
    "You can reduce the chance of a false report of primality by repeating the procedure with different bases. If we test several bases and do not find any witnesses, it is exceedingly unlikely that $n$ is composite."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "e71ba6ec",
   "metadata": {},
   "outputs": [],
   "source": [
    "from random import randint\n",
    "\n",
    "def is_prime(n, k=10):\n",
    "    if n < 2:\n",
    "        return False\n",
    "    elif n == 2:\n",
    "        return True\n",
    "    elif n % 2 == 0:\n",
    "        return False\n",
    "    \n",
    "    for _ in range(0, k):\n",
    "        a = randint(1, n - 1)\n",
    "        if witness(a, n):\n",
    "            return False\n",
    "        \n",
    "    return True"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "45c9ca5f-97f7-4091-8a6b-6851a84b39a2",
   "metadata": {},
   "source": [
    "With this implementation, we can generate the primes by iterating over the positive integers and testing if each is prime."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "9ea628f0-79f0-447d-a444-6e66565a3620",
   "metadata": {},
   "outputs": [],
   "source": [
    "from itertools import count\n",
    "\n",
    "def primes(start=1):\n",
    "    for n in count(start):\n",
    "        if is_prime(n):\n",
    "            yield n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f8e9fad3-8ef0-469e-b381-43315b915559",
   "metadata": {},
   "source": [
    "Then to solve the problem, we can just count the primes until we reach 10001.\n",
    "\n",
    "## Rosser's theorem\n",
    "This strategy works... but we can go even deeper.\n",
    "\n",
    "Running Miller-Rabin on every integer until we reach the 10001st prime isn't super efficient. To avoid this, we can take advantage of [Rosser's theorem](https://en.wikipedia.org/wiki/Rosser%27s_theorem), which states that $p_n > n \\log n$. In other words, we can start our primality testing at $10001 \\log 10001$ instead of 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "d7f5255b-d21e-4cfa-837d-9289c44c156a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "92114"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "lower = ceil(10001 * log(10001))\n",
    "lower"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "48cbf855-fe03-4f7f-9c07-45a1c1c0295f",
   "metadata": {},
   "source": [
    "## Prime-counting function\n",
    "There's one issue preventing us from jumping ahead to 92114: we don't know how many primes are less than 92114, so when we find the next prime (spoiler: it's 92119), we won't know what number prime that is (another spoiler: it's the 8897th prime).\n",
    "\n",
    "To answer this question, we can use the [prime-counting function](https://en.wikipedia.org/wiki/Prime-counting_function). SageMath has the prime-counting function available directly as `prime_pi`, but let's implement it ourselves. As a first pass, we can use a sieve. Check out [problem 10](https://projecteuler.net/problem=10) for a sieve implementation - to avoid duplicating work, we'll use SageMath's sieve here."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "43b0b9df-d745-4890-a835-aa0c3e3bced8",
   "metadata": {},
   "outputs": [],
   "source": [
    "def prime_counting_function(n):\n",
    "    return len(prime_range(n+1))"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0c1e85af-74ba-48ee-9f5e-9aab35cc2a55",
   "metadata": {},
   "source": [
    "This will tell us where to start our tally if we start checking for primes at 92114."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "1636b854-6d24-4373-987a-8a66e3b91fe4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "8896"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tally = prime_counting_function(lower)\n",
    "tally"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13c91512-6a23-4569-bdea-2e88f4414922",
   "metadata": {},
   "source": [
    "This runs considerably quicker than starting from 1."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "1590869a-5d9d-4662-b07b-18511eb25b79",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "104743"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "for (i, p) in enumerate(primes(start=lower), start=tally+1):\n",
    "    if i == 10001:\n",
    "        break\n",
    "\n",
    "p"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ec384441",
   "metadata": {},
   "source": [
    "## Relevant sequences\n",
    "* Prime numbers: [A000040](https://oeis.org/A000040)\n",
    "* Carmichael numbers: [A002997](https://oeis.org/A002997)\n",
    "* Prime-counting function: [A000720](https://oeis.org/A000720)\n",
    "\n",
    "#### Copyright (C) 2025 filifa\n",
    "\n",
    "This work is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International license](https://creativecommons.org/licenses/by-sa/4.0/) and the [BSD Zero Clause license](https://spdx.org/licenses/0BSD.html)."
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "SageMath 10.7",
   "language": "sage",
   "name": "sagemath"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.12.5"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}