The number of magic squares of order six counted up to rotations and reflections
 nms6.txt
17 753 889 197 660 635 632
This number has been confirmed by completing the enumeration twice, using hundreds of GPUs rented from cloud services. Although the number and models of the GPUs varied over time, the initial enumeration took about 80,000 hours of GeForce RTX4090 time, and the second enumeration, run with improved code, took about 54,000 hours.
As a result of the second enumeration, the initial result of 17 753 889 189 701 384 304, obtained in July 2023, was found to be incorrect. More details of the errors are given in the next section.
The result is consistent with the stochastic estimates 1.8(2)·10^{19} [1], 1.7745(16)·10^{19} [2], 1.775392(12)·10^{19} [3], and 1.77543(73)·10^{19} [4].
 1. Y. Ohishi, Estimation of number of solutions of 6th order magic square by random sampling (in Japanese), Sugei Puzzle No. 177, April 1992.
 2. K. Pinn and C. Wieczerkowski, Number of Magic Squares From Parallel Tempering Monte Carlo, International Journal of Modern Physics C, 9 April 1998.
 3. A. Kitajima and M. Kikuchi, Numerous but Rare: An Exploration of Magic Squares, PLOS ONE 10(5) e0125062, 14 May 2015.
Correction history
More than a thousand GPU server instances were used in the enumeration, and some of them were unfortunately faulty and produced wrong results. To find and correct such errors, every subtotal was calculated at least twice, and extra counting was done to determine the correct answer whenever a mismatch occurred. There were the following two cases in which the initial result was incorrect.
corrected on 2023.09.07
During the thorough double-check, it was discovered that a portion of the results generated by one instance was incorrect. The instance ran with two RTX4090s for 60 hours and generated 3,771 subsubtotals. Of those subsubtotals, 12 were incorrect, and all of the incorrect results were generated by only one of the two RTX4090s. It is unlikely that these errors are due to logical flaws or coding mistakes; a hardware failure is suspected.
As a result of the correction, the number increased by 960 (40 × 24).
corrected on 2024.02.17
Another erroneous instance was found. It ran with one RTX4090 for about one month and produced about 19,000 subsubtotals. Of those subsubtotals, 6 were incorrect. All of the incorrect results were produced in the last hour of the instance's lifetime. After the erroneous behavior, the GPU of the instance became unusable, reporting the error message “invalid memory access”.
As a result of the correction, the number increased by 7 959 250 368 (331 635 432 × 24).
Code corrected (2023.11.28)
The code used in the initial enumeration was discovered to contain a mistake related to GPU thread synchronization. A corrected version of the code was used in the second enumeration. No erroneous result caused by this mistake was found.
Subsets and subtotals
Since the number is too large to count in a single task, the entire task is divided into numerous small subtasks. The counts for the subtasks are available.

 (3 of the subtotals are known to be incorrect but are intentionally left uncorrected.)
 List of subsubtotals (234 MBytes, .gz)
 (18 of the subsubtotals are known to be incorrect but are intentionally left uncorrected.)
Strategies and Codes
Strategies in counting magic squares
CUDA code (corrected on 2023.11.28 and updated on 2024.05.04)
 Counts magic squares of an order from 3 to 6 up to M-transformations.
 Runs at a typical speed of 3.8G counts/sec on an Nvidia GeForce RTX4090 for order 6.
 An Nvidia GPU of the Pascal architecture (sm_60) or newer is assumed.
 Multi-GPU systems are supported. Be cautious, however, when you use a multi-4090 system.
 Compiling and linking:
nvcc -O3 -arch=sm_60 -maxrregcount=40 -Wno-deprecated-declarations ms.cu -lcrypto
 If you don't need md5 checksums, add -DnoMD5 and drop -Wno-deprecated-declarations and -lcrypto.
 For orders less than 6, specify the order with the compiler option -DN=order.
 The executable takes 0, 2, or 4 parameters. The 1st and the 3rd parameters are just placeholders.
./a.out
 counts all magic squares.
./a.out dummy representative_magic_series_in_hex
 counts magic squares whose representative magic series is equal to the given hex number.
./a.out dummy1 representative_magic_series_in_hex dummy2 2nd_largest_magic_series_in_hex
 counts magic squares whose representative magic series and the 2nd magic series parallel to the representative are as specified.
 The code doesn't check the validity of the parameters given by users. Invalid parameters will result in a wrong answer or a runtime error.
Non-CUDA code in C using pthread (updated on 2023.09.18)
 Compiling and linking:
gcc -O3 -DNTH=number_of_threads ms.c -lpthread -lcrypto
 The options -DnoMD5 and -DN=order have the same effects as in the CUDA code.
 Much slower than the CUDA code, but easier to read.
Semimagic squares
The latest code, used in the second enumeration, counted semimagic squares in addition to magic squares, and its count of semimagic squares matched the result obtained by a very different approach [5]. This match is considered a cross-verification of the counting method and of the computing resources used in the calculation. All subtotals of semimagic (including magic) squares are available.
 5. A. Ripatti, On the number of semimagic squares of order 6, arXiv:1807.02983, July 2018.
Acknowledgments
I would like to thank Walter Trump for his discussions, suggestions, and encouragement. He reviewed my method and verified a part of my result using his own code. He suggested that I count semimagic squares in the second enumeration and encouraged me to complete my work. Visit his website, Notes on Magic Squares and Cubes, which is a good summary of topics related to magic squares.