
## The number of magic squares of order six counted up to rotations and reflections.

- nms6.txt
17 753 889 189 701 385 264 (corrected, increased by 960, on 2023.09.07; still to be confirmed)

This result is consistent with stochastic estimates 1.7745(16)·10^{19}[1] and 1.775392(12)·10^{19}[2].
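The consistency claim can be checked with a few lines of arithmetic. The sketch below assumes that the parenthesized digits in each estimate denote one standard uncertainty in the last quoted digits (so 1.7745(16)·10^{19} means 1.7745·10^{19} ± 0.0016·10^{19}); the names `EXACT`, `ESTIMATES` and `deviation_in_sigma` are illustrative only.

```python
# Exact count reported above (up to rotations and reflections).
EXACT = 17_753_889_189_701_385_264

# Stochastic estimates as (value, uncertainty).
# Assumption: the digits in parentheses are one standard uncertainty
# applied to the last quoted digits of the value.
ESTIMATES = {
    "[1]": (1.7745e19, 0.0016e19),
    "[2]": (1.775392e19, 0.000012e19),
}


def deviation_in_sigma(exact, value, sigma):
    """Return how far the exact count lies from an estimate, in units of sigma."""
    return (exact - value) / sigma


for label, (value, sigma) in ESTIMATES.items():
    print(f"{label}: {deviation_in_sigma(EXACT, value, sigma):+.2f} sigma")
```

Under that reading, the exact count lies within one quoted uncertainty of both estimates (roughly +0.56 and -0.26).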

Using hundreds of GPUs rented from cloud services, the counting took about six months. Though the number and models of GPUs used varied over time, the total computation time amounts to about 80,000 hours of GeForce RTX-4090.
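As a rough plausibility check, the quoted GPU time can be compared with the speed of 2.5G counts/sec stated for the CUDA code in the Codes section. The sketch below rests on an assumption that the page does not state explicitly: that the final number is 24 times the number of squares the code actually counts (the factor 24 is taken from the "40×24" correction). All variable names are illustrative.

```python
GPU_HOURS = 80_000      # total RTX-4090 time quoted in the text
RATE = 2.5e9            # counts per second for order 6, from the Codes section
TOTAL = 17_753_889_189_701_385_264

# ASSUMPTION (not stated in the source): the code counts one square per
# equivalence class, and the final total is 24 times that count.
ASSUMED_FACTOR = 24

counts_from_time = GPU_HOURS * 3600 * RATE    # counts achievable in the quoted time
counts_from_total = TOTAL / ASSUMED_FACTOR    # counts implied by the final total

ratio = counts_from_time / counts_from_total
print(f"from time:  {counts_from_time:.3e}")
print(f"from total: {counts_from_total:.3e}")
print(f"ratio:      {ratio:.2f}")
```

If the assumption holds, the two figures agree to within a few percent, which is about as close as an approximate hour count and a "typical" speed can be expected to get.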

Because of the extraordinary volume of the calculation, it is hard to rule out the possibility that the result is contaminated by accidental errors. I have conducted a preliminary double-check to exclude such errors and am currently performing a thorough double-check. I would appreciate confirmations or disputes by others.

- K. Pinn and C. Wieczerkowski, Number of Magic Squares From Parallel Tempering Monte Carlo, International Journal of Modern Physics C, 9 April 1998.

## Errors found (updated on 2023.09.07)

During the thorough double-check, it was discovered that a portion of the results generated by one instance was incorrect. The instance ran with two RTX-4090s for 60 hours and generated 3,771 sub-subtotals. Of these, only 12 were incorrect, and all incorrect results were generated by only one of the two RTX-4090s. It is unlikely that these errors are due to logical flaws or coding mistakes. Hardware defects or instability are the most probable causes.

Correcting these errors increased the number by 960 (40×24).
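The size of the correction can be put in perspective with exact integer arithmetic. The sketch below uses only figures quoted on this page; the variable names are illustrative, and the pre-correction value is derived by subtraction rather than taken from a published record.

```python
CORRECTED = 17_753_889_189_701_385_264   # current value, after the 2023.09.07 correction
CORRECTION = 40 * 24                     # decomposition given in the text

# Value implied before the correction (derived here, not quoted in the source).
previous = CORRECTED - CORRECTION

# Share of sub-subtotals from the affected instance that were wrong.
bad_fraction = 12 / 3771

print(CORRECTION)                         # 960
print(previous)                           # 17753889189701384304
print(f"{bad_fraction:.2%}")              # 0.32%
print(f"{CORRECTION / CORRECTED:.1e}")    # relative size of the correction
```

The correction is tiny relative to the total, so a comparison against the stochastic estimates could not have revealed it; only an exact recomputation of the affected sub-subtotals can.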

While these errors have not damaged my confidence in the logic and the code used in the calculation, it is possible that errors of a similar nature are still contained in the result. Therefore, the results should be considered unconfirmed until the thorough double-check is completed.

## Subsets and subtotals

Since the number is too large to count in a single task, the entire task is divided into numerous small sub-tasks. Counts for the sub-tasks are available.

- List of subsubtotals (234MBytes .gz)

## Codes

Strategies in counting magic squares

- Counts magic squares of order 3 to 6 up to M-transformations.
- Runs at a typical speed of 2.5G counts/sec on Nvidia GeForce RTX-4090 for order 6.
- Nvidia GPU of Pascal architecture (sm_60) or newer is assumed.
- Multi GPU systems are supported. **Be cautious, however, when you use a multi-4090 system**.
- Compiling and linking: `nvcc -O3 -arch=sm_60 -maxrregcount=40 -Wno-deprecated-declarations ms.cu -lcrypto`
- If you don't need md5 checksums, add `-DnoMD5` and drop `-Wno-deprecated-declarations` and `-lcrypto`.
- For orders less than 6, specify the order by a compiler option `-DN=`*order*.
- The executable takes 0, 2, or 4 parameters. The 1st and the 3rd parameters are just place holders.
  - `./a.out` counts all magic squares.
  - `./a.out` *dummy representative_magic_series_in_hex* counts magic squares whose representative magic series is equal to the given hex number.
  - `./a.out` *dummy1 representative_magic_series_in_hex dummy2 2nd_largest_magic_series_in_hex* counts magic squares whose representative magic series and the 2nd magic series parallel to the representative are as specified.
- The code doesn't check the validity of parameters given by users. Invalid parameters will result in a wrong answer or a runtime error.

- new code updated on 2023.09.20 (8% faster than the 2023.01.31 version)
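The command-line parameters above refer to magic series given as hex numbers. A magic series of order n is a set of n distinct numbers from 1 to n² whose sum equals the magic constant n(n²+1)/2; every row, column and diagonal of a magic square is one. The sketch below enumerates them and prints each as a hex bitmask. The encoding used here (bit v-1 set when the number v belongs to the series) is an assumption made for illustration and may differ from the one `ms.cu` expects, and the function name `magic_series` is hypothetical.

```python
from itertools import combinations


def magic_series(order):
    """Yield every magic series of the given order as an integer bitmask.

    ASSUMED encoding (not confirmed by the source): bit v-1 is set
    when the number v is a member of the series.
    """
    n2 = order * order
    magic_sum = order * (n2 + 1) // 2
    for combo in combinations(range(1, n2 + 1), order):
        if sum(combo) == magic_sum:
            mask = 0
            for v in combo:
                mask |= 1 << (v - 1)
            yield mask


series4 = list(magic_series(4))
print(len(series4))                    # 86 magic series of order 4
print([hex(m) for m in series4[:3]])   # a few of them in hex
```

Fixing one such series and counting only the squares associated with it is one natural way to split the full count into independent sub-tasks whose results are summed afterwards.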

Non-CUDA code in C using pthread

- Compiling and linking: `gcc -O3 -DNTH=`*number_of_threads* `ms.c -lpthread -lcrypto`
- Options `-DnoMD5` and `-DN=`*order* have the same effects as in the CUDA code.
- Much slower than the CUDA code, but easier to read.

- new code updated on 2023.09.18 (30% faster than the 2023.01.31 version)