336 lines
12 KiB
Plaintext
336 lines
12 KiB
Plaintext
{
|
|
"cells": [
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "87fa1305",
|
|
"metadata": {},
|
|
"source": [
|
|
"# [100001st Prime](https://projecteuler.net/problem=7)\n",
|
|
"\n",
|
|
"Unlike integer factorization, we *can* efficiently determine whether a number is prime. One of the simplest and fastest methods is the [Miller-Rabin primality test](https://en.wikipedia.org/wiki/Miller%E2%80%93Rabin_primality_test), which is a probabilistic test.\n",
|
|
"\n",
|
|
"Of course, SageMath has tools to let us compute the 10001st prime without implementing a primality test ourselves! "
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 1,
|
|
"id": "3a27f454",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"104743"
|
|
]
|
|
},
|
|
"execution_count": 1,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"nth_prime(10001)"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "6f644292",
|
|
"metadata": {},
|
|
"source": [
|
|
"Let's look into how the Miller-Rabin test works anyway, though.\n",
|
|
"\n",
|
|
"## Fermat's little theorem\n",
|
|
"First, a little bit of elementary number theory. [Fermat's little theorem](https://en.wikipedia.org/wiki/Fermat%27s_little_theorem) says that if $p$ is a prime number, then for any integer $a$,\n",
|
|
"$$a^p \\equiv a \\pmod{p}$$\n",
|
|
"It follows from the [contrapositive](https://en.wikipedia.org/wiki/Contraposition) that if there exists an integer $a$ such that $a^p \\not\\equiv a \\pmod{p}$, then $p$ is not prime. We then call $a$ a *witness* that proves $p$ is composite.\n",
|
|
"\n",
|
|
"Consider the fact that $3^{6} \\equiv 3 \\pmod{6}$, but 6 is composite. In cases like this, we say that $p$ is a *base $a$ Fermat pseudoprime*, e.g. 6 is a base 3 Fermat pseudoprime. 6 is also a base 4 Fermat pseudoprime, but 2 and 5 both serve as witnesses to 6 being composite (we need only consider bases modulo $p$).\n",
|
|
"\n",
|
|
"This may lead us to believe that we can just check every base $a$, and if there are no witnesses, then $p$ is prime. However, (along with this method being inefficient for large $p$), there are numbers that are Fermat pseudoprimes for *every* base, called [Carmichael numbers](https://en.wikipedia.org/wiki/Carmichael_number). Therefore, the [converse](https://w.wiki/BAyj) of Fermat's little theorem does not hold.\n",
|
|
"\n",
|
|
"Nevertheless, Carmichael numbers are relatively rare, so if you pick a random number to test and a few random bases, and find no witnesses, you can be pretty sure the number is in fact prime.\n",
|
|
"\n",
|
|
"## Miller-Rabin test\n",
|
|
"\n",
|
|
"The Miller-Rabin test employs another useful fact about prime numbers: if $p$ is prime, then the only values of $x$ such that \n",
|
|
"$$x^2 \\equiv 1 \\pmod{p}$$\n",
|
|
"are $x \\equiv \\pm 1 \\pmod{p}$. (*Proof: if $x^2 \\equiv 1 \\pmod{p}$, then $p \\vert (x+1)(x-1)$. Then by [Euclid's lemma](https://en.wikipedia.org/wiki/Euclid%27s_lemma), $p$ must divide one of $x+1$ or $x-1$, implying $x \\equiv \\pm 1 \\pmod{p}$.*)\n",
|
|
"\n",
|
|
"Similarly to Fermat's little theorem, the converse of this statement doesn't hold. For example, $x^2 \\equiv 1 \\pmod{9}$ only has $x=1$ and $x=-1$ as solutions, but 9 is composite.\n",
|
|
"\n",
|
|
"We now have two questions to lead an investigation into the primality of a number $n$:\n",
|
|
"* Is there an integer $a$ such that $a^n \\not\\equiv a \\pmod{n}$?\n",
|
|
"* Does 1 have a non-trivial square root modulo $n$?\n",
|
|
"\n",
|
|
"If the answer to either of these is yes, then $n$ is *definitely* composite. If both answers are no, this is strong evidence that $n$ is prime, *but not proof*, since we have already demonstrated that both of these conditions can have false positives. A composite number that is erroneously suggested to be prime by both of these conditions is called a *base $a$ strong pseudoprime*. Fortunately, no number (including Carmichael numbers) is a base $a$ strong pseudoprime for all bases $a$. In fact, Rabin (one of the namesakes of the algorithm) proved that if $n$ is composite, at least 3/4 of the numbers 1 to $n-1$ are witnesses.\n",
|
|
"\n",
|
|
"Let's finally look at the actual test. Suppose we want to test if an odd number $n$ is prime (testing whether an even number is prime is trivial). There must exist $t \\geq 1$ and odd $u$ such that $n - 1 = 2^t u$."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 2,
|
|
"id": "720f7e8c",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def factor_twos(n):\n",
|
|
" t = 0\n",
|
|
" u = n - 1\n",
|
|
" while u % 2 == 0:\n",
|
|
" u //= 2\n",
|
|
" t += 1\n",
|
|
" \n",
|
|
" return t, u"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "980bd2cf",
|
|
"metadata": {},
|
|
"source": [
|
|
"It follows that\n",
|
|
"$$a^{n-1} \\equiv a^{2^t u} \\pmod{n}$$\n",
|
|
"for any base $a$, so we'll randomly select a base $1 \\leq a \\leq n-1$.\n",
|
|
"\n",
|
|
"The test proceeds by repeatedly squaring $a^u$ modulo $n$ until we reach $a^{n-1}$. If at any point in this squaring, we find that $a^{2^{i-1} u}$ does not equal 1 or -1, but the following squaring gives $a^{2^i u} = 1$, then we have found a non-trivial square root of 1 modulo $n$. Therefore, $n$ must be composite.\n",
|
|
"\n",
|
|
"After $t$ squarings, we will reach $a^{n-1}$. If this equals anything other than 1, then by the contrapositive of Fermat's little theorem, this also tells us that $n$ is composite.\n",
|
|
"\n",
|
|
"If, after $t$ squarings, we have not found a non-trivial square root of 1 modulo $n$ and $a^{n-1} \\equiv 1 \\pmod{n}$, then either $n$ is prime, or is a base $a$ strong pseudoprime."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 3,
|
|
"id": "123d1068",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def witness(a, n):\n",
|
|
" t, u = factor_twos(n)\n",
|
|
" x = pow(a, u, n)\n",
|
|
" for _ in range(0, t):\n",
|
|
" x, y = pow(x, 2, n), x\n",
|
|
" if x == 1 and y != 1 and y != n - 1:\n",
|
|
" return True\n",
|
|
" \n",
|
|
" return x != 1"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f8454603",
|
|
"metadata": {},
|
|
"source": [
|
|
"You can reduce the chance of a false report of primality by repeating the procedure with different bases. If we test several bases and do not find any witnesses, it is exceedingly unlikely that $n$ is composite."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 4,
|
|
"id": "e71ba6ec",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from random import randint\n",
|
|
"\n",
|
|
"def is_prime(n, k=10):\n",
|
|
" if n < 2:\n",
|
|
" return False\n",
|
|
" elif n == 2:\n",
|
|
" return True\n",
|
|
" elif n % 2 == 0:\n",
|
|
" return False\n",
|
|
" \n",
|
|
" for _ in range(0, k):\n",
|
|
" a = randint(1, n - 1)\n",
|
|
" if witness(a, n):\n",
|
|
" return False\n",
|
|
" \n",
|
|
" return True"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "45c9ca5f-97f7-4091-8a6b-6851a84b39a2",
|
|
"metadata": {},
|
|
"source": [
|
|
"With this implementation, we can generate the primes by iterating over the positive integers and testing if each is prime."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 5,
|
|
"id": "9ea628f0-79f0-447d-a444-6e66565a3620",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"from itertools import count\n",
|
|
"\n",
|
|
"def primes(start=1):\n",
|
|
" for n in count(start):\n",
|
|
" if is_prime(n):\n",
|
|
" yield n"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "f8e9fad3-8ef0-469e-b381-43315b915559",
|
|
"metadata": {},
|
|
"source": [
|
|
"Then to solve the problem, we can just count the primes until we reach 10001.\n",
|
|
"\n",
|
|
"## Rosser's theorem\n",
|
|
"This strategy works... but we can go even deeper.\n",
|
|
"\n",
|
|
"Running Miller-Rabin on every integer until we reach the 10001st prime isn't super efficient. To avoid this, we can take advantage of [Rosser's theorem](https://en.wikipedia.org/wiki/Rosser%27s_theorem), which states that $p_n > n \\log n$. In other words, we can start our primality testing at $10001 \\log 10001$ instead of 1."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 6,
|
|
"id": "d7f5255b-d21e-4cfa-837d-9289c44c156a",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"92114"
|
|
]
|
|
},
|
|
"execution_count": 6,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"lower = ceil(10001 * log(10001))\n",
|
|
"lower"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "48cbf855-fe03-4f7f-9c07-45a1c1c0295f",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Prime-counting function\n",
|
|
"There's one issue preventing us from jumping ahead to 92114: we don't know how many primes are less than 92114, so when we find the next prime (spoiler: it's 92119), we won't know what number prime that is (another spoiler: it's the 8897th prime).\n",
|
|
"\n",
|
|
"To answer this question, we can use the [prime-counting function](https://en.wikipedia.org/wiki/Prime-counting_function). SageMath has the prime-counting function available directly as `prime_pi`, but let's implement it ourselves. As a first pass, we can use a sieve. Check out [problem 10](https://projecteuler.net/problem=10) for a sieve implementation - to avoid duplicating work, we'll use SageMath's sieve here."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 7,
|
|
"id": "43b0b9df-d745-4890-a835-aa0c3e3bced8",
|
|
"metadata": {},
|
|
"outputs": [],
|
|
"source": [
|
|
"def prime_counting_function(n):\n",
|
|
" return len(prime_range(n+1))"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "0c1e85af-74ba-48ee-9f5e-9aab35cc2a55",
|
|
"metadata": {},
|
|
"source": [
|
|
"This will tell us where to start our tally if we start checking for primes at 92114."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 8,
|
|
"id": "1636b854-6d24-4373-987a-8a66e3b91fe4",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"8896"
|
|
]
|
|
},
|
|
"execution_count": 8,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"tally = prime_counting_function(lower)\n",
|
|
"tally"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "13c91512-6a23-4569-bdea-2e88f4414922",
|
|
"metadata": {},
|
|
"source": [
|
|
"This runs considerably quicker than starting from 1."
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "code",
|
|
"execution_count": 9,
|
|
"id": "1590869a-5d9d-4662-b07b-18511eb25b79",
|
|
"metadata": {},
|
|
"outputs": [
|
|
{
|
|
"data": {
|
|
"text/plain": [
|
|
"104743"
|
|
]
|
|
},
|
|
"execution_count": 9,
|
|
"metadata": {},
|
|
"output_type": "execute_result"
|
|
}
|
|
],
|
|
"source": [
|
|
"for (i, p) in enumerate(primes(start=lower), start=tally+1):\n",
|
|
" if i == 10001:\n",
|
|
" break\n",
|
|
"\n",
|
|
"p"
|
|
]
|
|
},
|
|
{
|
|
"cell_type": "markdown",
|
|
"id": "ec384441",
|
|
"metadata": {},
|
|
"source": [
|
|
"## Relevant sequences\n",
|
|
"* Prime numbers: [A000040](https://oeis.org/A000040)\n",
|
|
"* Carmichael numbers: [A002997](https://oeis.org/A002997)\n",
|
|
"* Prime-counting function: [A000720](https://oeis.org/A000720)\n",
|
|
"\n",
|
|
"#### Copyright (C) 2025 filifa\n",
|
|
"\n",
|
|
"This work is licensed under the [Creative Commons Attribution-ShareAlike 4.0 International license](https://creativecommons.org/licenses/by-sa/4.0/) and the [BSD Zero Clause license](https://spdx.org/licenses/0BSD.html)."
|
|
]
|
|
}
|
|
],
|
|
"metadata": {
|
|
"kernelspec": {
|
|
"display_name": "SageMath 10.7",
|
|
"language": "sage",
|
|
"name": "sagemath"
|
|
},
|
|
"language_info": {
|
|
"codemirror_mode": {
|
|
"name": "ipython",
|
|
"version": 3
|
|
},
|
|
"file_extension": ".py",
|
|
"mimetype": "text/x-python",
|
|
"name": "python",
|
|
"nbconvert_exporter": "python",
|
|
"pygments_lexer": "ipython3",
|
|
"version": "3.12.5"
|
|
}
|
|
},
|
|
"nbformat": 4,
|
|
"nbformat_minor": 5
|
|
}
|