2.6. Steinhaus-Moser Notation


We now return to discussing large number notations with Steinhaus-Moser notation, a simple notation that grows at a similar rate to up-arrows. On this page I'll discuss the history of the notation, how it compares to Knuth's up-arrows, and the named numbers mega, megiston, and Moser.

Steinhaus's Original Notation

Originally this notation was defined by a mathematician named Hugo Steinhaus just for fun, as an example of how to easily make some very large numbers. It is not certain when it was created, but it was in 1950 or earlier.

One place where Steinhaus describes his notation is in his book, Mathematical Snapshots, a book about recreational mathematics. In the 1983 edition (published after his death), at the end of page 28 and the start of page 29, he briefly discusses a way to make large numbers. The passage sits between material on probability and combinatorics and puzzles about chessboards.

This is what Steinhaus says (note: I use /x\ here to mean x in a triangle, [x] to mean x in a square, and (x) to mean x in a circle):

"It is easy to write down very large numbers. Such giants can be defined very simply if we agree to write /a\ instead of a^a, [a] instead of 'a in a triangles,' and (a) instead of 'a in a squares.' Then the number 'Mega' = (2) is

MEGA = (2) = [[2]] = [//2\\] = [/2^2\] = [4^4] = [256] = ///...///256\\\...\\\

already too great to have any physical meaning (30)*, the last symbol being 256 in 256 triangles, and the reason why we have abandoned the ordinary system of writing numbers is clear. (The reader may try to explain the 'Megiston' given by (10))."

~ Steinhaus [with mistake corrected][1A]

* (30) refers to the diagram above (MEGA = (2) = [[2]]...) explaining how to compute Mega

Steinhaus made a mistake here: it should not be [[4]] after (2), but [[2]] by the definition of x in a circle. But in any case, this is clearly a simple way to make large numbers, as I'll discuss when examining the mega and its big brother the megiston.

1983 is not when the notation was first defined. The exact date is not known, though the notation has existed since 1950 or earlier, because the same passage appears in the 1950 edition around page 19[1B].

Moser's Extension

Another mathematician, Leo Moser, discovered Steinhaus's notation and extended it. It is not known when he extended the notation or where he published it, but the extension is well-known and described on Wikipedia[2] and on Wolfram MathWorld[3A]. It's defined like so:

x in a triangle = x^x
x in a square = x in x nested triangles
x in a pentagon = x in x nested squares
x in a hexagon = x in x nested pentagons
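
The recursive definition above can be sketched in a few lines of Python. This is only an illustration: the function name polygon and its parameters are my own, and it is usable only for the tiniest inputs, since anything past 2 in a square overflows any real computer.

```python
def polygon(x: int, sides: int) -> int:
    """Return x inside a polygon with `sides` sides (hypothetical helper)."""
    if sides < 3:
        raise ValueError("a polygon needs at least 3 sides")
    if sides == 3:
        return x ** x              # x in a triangle
    result = x
    for _ in range(x):             # x nested copies of the next-smaller polygon
        result = polygon(result, sides - 1)
    return result


print(polygon(2, 3))  # 2 in a triangle: 4
print(polygon(3, 3))  # 3 in a triangle: 27
print(polygon(2, 4))  # 2 in a square = 2 in 2 triangles: 256
```

Calling polygon(2, 5) would be the mega, and the loop would never finish: its second pass asks for 256 in a square.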

Now that is a pretty nice extension. He extended Steinhaus's notation with the idea of polygons and made it much more powerful: as powerful as Knuth's arrows, as we will see.

With Moser's system, the mega is instead 2 in a pentagon, and the megiston is 10 in a pentagon. But Moser defined a number far far larger than either of Steinhaus's numbers, defined as two in a polygon with mega (that's 2 in a pentagon) sides. How big are these numbers though? Let's find out and examine each of these numbers.

The Mega

The mega is defined by Steinhaus as equal to 2 in a circle in his notation. Let's use these text substitutes for numbers in shapes:

/x\ is x in a triangle
[x] is x in a square
(x) is x in a pentagon/circle

Then we can start calculating the mega:

Mega = (2) = [[2]] = [//2\\] = [/2^2\] = [/4\] = [4^4] = [256] = ///...(256 /s)...///256\\\...(256 \s)...\\\

At this point, calculating all the nested triangles becomes a hassle. The first triangle gives us approximately:

/256\ = 256^256 ~ 3.23*10^616

and the next is impossible to calculate exactly at all! What are we going to do??

We need to estimate things now. First we should estimate (3.23*10^616)^(3.23*10^616). That can be lower-bounded as:

> (10^616)^(3.23*10^616)
= 10^(3.23*10^616*616)
> 10^(1989*10^616)
> 10^(1000*10^616)
= 10^(10^3*10^616)
= 10^10^(3+616)
= 10^10^619
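
The first triangle is still small enough for exact integer arithmetic, so the starting point of this estimate can be checked directly. Here is a minimal sketch in Python (the variable names are my own); it confirms the size of 256^256 and the two integer inequalities behind the bounds 10^10^619 and 10^10^620 for the second triangle.

```python
first = 256 ** 256                       # /256\ as an exact integer
digits = str(first)
print(len(digits))                       # 617 digits, so 10^616 < first < 10^617
print(digits[:4])                        # '3231', i.e. first ~ 3.23*10^616

# Lower bound: first^first > (10^616)^first = 10^(616*first) > 10^(10^619)
lower_ok = first > 10 ** 616 and 616 * first > 10 ** 619
# Upper bound: first^first < (10^617)^(10^617) = 10^(617*10^617) < 10^(10^620)
upper_ok = first < 10 ** 617 and 617 * 10 ** 617 < 10 ** 620
print(lower_ok, upper_ok)
```

Everything here stays in whole numbers, so there is no rounding to worry about.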

With similar ideas we can lower bound (10^10^619)^(10^10^619):

= 10^(10^619*10^10^619)
= 10^10^(619+10^619)
~> 10^10^10^619

Here ~> means greater than and approximately equal to. Adding 619 to 10^619 is a negligible effect; therefore, the 619 can be cut off and not have much of an effect on the number's size.

With similar logic we can bound each next triangle to be 10^(the previous number), i.e.

/256\ ~ 10^616
//256\\ ~ 10^10^619
///256\\\ ~ 10^10^10^619
////256\\\\ ~ 10^10^10^10^619

Therefore ///...(256 /s)...///256\\\...(256 \s)...\\\ ~ 10^10^10.........^10^619 with 256 10's. That's E619#256 in Sbiis Saibian's Hyper-E notation.

If you're new to googology you may wonder, what is Hyper-E notation? I'm not going to describe it in detail until section 3, but I'm going to use it now for convenience's sake. For now, you just need to know that Ex#y = 10^10^10^....^10^x with y 10's. For example, a googolplex, 10^10^100, can be written E100#2 in Hyper-E notation.
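
For readers who think in code, the two-argument form Ex#y used on this page can be sketched as follows. The function name hyper_e is my own, and this covers only the simple form described here, not the rest of Hyper-E notation.

```python
def hyper_e(x: int, y: int) -> int:
    """Ex#y: start from x and apply n -> 10**n exactly y times."""
    value = x
    for _ in range(y):
        value = 10 ** value
    return value


print(hyper_e(2, 1))                   # E2#1 = 10^2 = 100
print(hyper_e(1, 2))                   # E1#2 = 10^10^1 = 10000000000
print(hyper_e(100, 1) == 10 ** 100)    # True: a googol is E100#1
```

A googolplex would be hyper_e(100, 2), which is already far too large to evaluate.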

How will we upper-bound the mega? I'll give a fairly liberal upper-bound for the sake of giving an easy bound, like so:

/256\ < 10^617

//256\\ < (10^617)^(10^617) = 10^(617*10^617) < 10^(1000*10^617) = 10^(10^3*10^617) = 10^10^620

///256\\\ < (10^10^620)^(10^10^620) = 10^(10^620*10^10^620) = 10^10^(620+10^620) < 10^10^10^621

////256\\\\ < (10^10^10^621)^(10^10^10^621) = 10^(10^10^621*10^10^10^621) = 10^10^(10^621+10^10^621) < 10^10^10^10^622

With a similar idea:

/////256\\\\\ < 10^10^10^10^10^623 = E623#5
//////256\\\\\\ < E624#6


Then ///...(256 /s)...///256\\\...(256 \s)...\\\ < E874#256


E619#256 < Mega < E874#256

Actually, with a bit more effort, the mega can be shown to be less than E620#256. How can you do that? First let's get back to the upper bound for the second triangle:

Using 256^256 < 10^616.6 and 256^256 < 3.24*10^616:

//256\\ = (256^256)^(256^256) < (10^616.6)^(3.24*10^616) = 10^(616.6*3.24*10^616) < 10^(2000*10^616) < 10^(3*10^619)

///256\\\ < (10^(3*10^619))^(10^(3*10^619)) = 10^(3*10^619*10^(3*10^619)) = 10^(3*10^(619+3*10^619)) < 10^(10*10^(619+3*10^619)) = 10^10^(620+3*10^619)

////256\\\\ < (10^10^(620+3*10^619))^(10^10^(620+3*10^619)) = 10^(10^(620+3*10^619)*10^10^(620+3*10^619)) = 10^10^(620+3*10^619+10^(620+3*10^619))

< 10^10^(10^620+10^(620+3*10^619)) < 10^10^(10*10^(620+3*10^619)) = 10^10^10^(621+3*10^619)

With a similar idea we can get /////256\\\\\ < 10^10^10^10^(622+3*10^619), etc, until we get ///...(256 /s)...///256\\\...(256 \s)...\\\ < E(873+3*10^619)#255

< E(10^619+3*10^619)#255

= E(4*10^619)#255

< E(10^620)#255

= E620#256


E619#256 < Mega < E620#256

If you want the bounds in tetration form, though they're less accurate it's easy to see that the lower bound E619#256 is larger than E10#256 = E1#257 = 10^^257, and the upper bound E620#256 is less than E10,000,000,000#256 = E10#257 = E1#258 = 10^^258.

But to get more accurate than that, you can just use computer calculations to show that:

Steinhaus's mega ~ E619.29937#256
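
Here is one way such a calculation might go, as a rough sketch in Python rather than the exact method used for the figure above. Writing n in two triangles as 10^10^t, we get t = log10(n^n) + log10(log10(n^n)). Each further triangle turns 10^10^...^10^t into a tower one level taller whose top is t + log10(1 + t*10^-t), and for t near 619 that correction is far below floating-point precision, so the top of the tower stops changing after the second triangle. The function name second_level_top is my own.

```python
import math


def second_level_top(n: int) -> float:
    """Return t such that n in two triangles equals 10^10^t."""
    log_first = n * math.log10(n)              # log10 of n^n
    return log_first + math.log10(log_first)   # log10(log10((n^n)^(n^n)))


print(round(second_level_top(256), 5))   # top of the tower for the mega
print(second_level_top(10))              # 11.0, matching //10\\ = 10^10^11
```

The second call is a sanity check against an exact value: 10 in two triangles is (10^10)^(10^10) = 10^10^11.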

In any case, the mega is a very huge number, bigger than a googolplex, or a googolduplex, or even Jonathan Bowers' giggol (10^^100) ... but it's much less than Steinhaus's next number, the megiston.

The Megiston

10 in a circle in Steinhaus's notation is known variously as megiston, megistron, and megaston. Megiston is the correct name because that's what Steinhaus calls it in his book[1A], as we saw in the section about the mega. Wolfram MathWorld[3B] misspells it as megistron, and because of that, Googology Wiki used to call it megistron. Because both websites used it, the name "megistron" has some recognition as well. The third name, "megaston", was a misspelling by Jonathan Bowers on his Infinity Scrapers page[4].

Unlike the mega, the megiston takes full advantage of the pentagon operator to produce an extremely huge number. It's so big that Steinhaus leaves the reader to imagine how big it is, and that's exactly what I'll do.

Megiston = (10) = [[[[[[[[[[10]]]]]]]]]]

A number surrounded by 10 squares isn't exactly a good sign, considering how much one square operator can do. Let's continue:

= [[[[[[[[[//////////10\\\\\\\\\\]]]]]]]]]
= [[[[[[[[[/////////10^10\\\\\\\\\]]]]]]]]]
= [[[[[[[[[////////(10^10)^(10^10)\\\\\\\\]]]]]]]]]
= [[[[[[[[[////////10^(10*10^10)\\\\\\\\]]]]]]]]]
= [[[[[[[[[////////10^10^11\\\\\\\\]]]]]]]]]
= [[[[[[[[[///////(10^10^11)^(10^10^11)\\\\\\\]]]]]]]]]

At this point things are ALREADY difficult to compute, and we haven't even finished the first square! However we can lower bound with what we did with the mega:

> [[[[[[[[[E11#10]]]]]]]]]
= [[[[[[[[//// ... ... ... (E11#10 /s) ... ... ... ////E11#10\\\\ ... ... ... (E11#10 \s) ... ... ... \\\\]]]]]]]]

?!?!??!?! We've barely started the second square and now the number seems SUPER IMPOSSIBLE to compute. The mega seemed really daunting at first, but it's nothing compared to this.

But we can still estimate this number. Remember that /x\ ~> 10^x, so we can estimate further:

[E11#10] ~> E11#(10+E11#10) ~> E11#(E11#10) ~> 10^^(E11#10)

And then we can estimate the third square, [10^^(E11#10)] as 10^^10^^(E11#10)...
and the fourth as 10^^10^^10^^(E11#10), lower-bounded by 10^^10^^10^^10^^10 = 10^^^5
and the fifth, then, as 10^^^6


and finally, the tenth as 10^^^11. So there we have it, 10^^^11 is a lower bound for the megiston. That makes it larger than a deka-taxis, a number we encountered in the page about up-arrows, and needless to say, mind-crushingly larger than the mega!

But naturally we want to upper-bound it as well. To do that, I'll make use of a theorem Sbiis Saibian proved about up-arrows called the Knuth Arrow Theorem[5], which states that for any positive integers a, b, c, and n > 1, (a ↑^n b) ↑^n c < a ↑^n (b+c), where ↑^n stands for n up-arrows (n ^s in a row).
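
To make the theorem concrete, here is a minimal sketch of up-arrows in Python together with a check of the inequality on the few double-arrow cases small enough to evaluate. The function name arrow is my own, and checking a handful of cases is of course an illustration, not a proof.

```python
def arrow(a: int, n: int, b: int) -> int:
    """a followed by n up-arrows followed by b (n >= 1, b >= 0)."""
    if n == 1:
        return a ** b
    if b == 0:
        return 1
    return arrow(a, n - 1, arrow(a, n, b - 1))


print(arrow(2, 2, 4))   # 2^^4 = 65536
print(arrow(2, 3, 3))   # 2^^^3 = 2^^4 = 65536

# (2^^b)^^c < 2^^(b+c) for a few small b and c
for b, c in [(2, 2), (2, 3), (3, 2)]:
    left = arrow(arrow(2, 2, b), 2, c)
    right = arrow(2, 2, b + c)
    assert left < right
```

Larger cases such as a = 3 already need 3^^4, a number with trillions of digits, so they are out of reach for direct evaluation.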

First off let's bound just [10] alone:

/10\ = 10^10
//10\\ = 10^10^11
///10\\\ = (10^10^11)^(10^10^11) = 10^(10^11*10^10^11) = 10^10^(11+10^11) < 10^10^10^12

with a similar idea:

////10\\\\ < 10^10^10^10^13


[10] = //////////10\\\\\\\\\\ < E19#10 < 10^^12

But what about [[10]]? That's harder now. First let's try one triangle on [10]:

/[10]\ < (10^^12)^(10^^12) = 10^(10^^11*10^^12) = 10^10^(10^^10+10^^11) < 10^10^10^^12 = 10^^14

Actually we can generalize this for any X to:

/10^^X\ = (10^^X)^(10^^X) = 10^(10^^(X-1)*10^^X) = 10^10^(10^^(X-2)+10^^(X-1)) < 10^10^(10^^X) = 10^(10^^(X+1)) = 10^^(X+2)

Therefore /10^^X\ < 10^^(X+2)

So now we can apply this technique itself to /[10]\:

//[10]\\ < /10^^14\ < 10^^16

///[10]\\\ < 10^^18

etc. In general, a triangles around any number below 10^^X give a number below 10^^(X+2a), so ///...(a /s)...///[10]\\\...(a \s)...\\\ < 10^^(12+2a)

So now we can say that [[10]] = ///...([10] /s)...///[10]\\\...([10] \s)...\\\ < 10^^(12+2*[10]) < 10^^(12+2*10^^12) = 10^^(12+2*10^10^^11)
< 10^^(12+10*10^10^^11) = 10^^(12+10^(1+10^^11)) < 10^^(10*10^(1+10^^11)) = 10^^(10^(2+10^^11)) < 10^^(10^10^^12)
= 10^^10^^13

Now what about [[[10]]]? We do the same idea, this time starting from [[10]] < 10^^X with X = 10^^13:

[[[10]]] < 10^^(10^^13+2*[[10]]) < 10^^(10^^13+2*10^^10^^13) < 10^^(10*10^^10^^13) < 10^^(10^(1+10^^10^^13))
< 10^^(10^(10^^10^^14)) = 10^^(10^^(1+10^^14)) < 10^^10^^10^^15

With similar ideas:

[[[[10]]]] < 10^^10^^10^^10^^17

[[[[[10]]]]] < 10^^10^^10^^10^^10^^19


And megiston = [[[[[[[[[[10]]]]]]]]]] < 10^^10^^10^^10^^10^^10^^10^^10^^10^^10^^29 (that's 10 10's)

< 10^^10^^10^^10^^10^^10^^10^^10^^10^^10^^10^^10 (that's 12 10's)

= 10^^^12

So now we have two fairly compact bounds on the megiston: 

10^^^11 < Megiston < 10^^^12

It's possible to do better, but I think these two bounds will do for the megiston.

Now on to the Moser, which was eponymously defined by Moser and not Steinhaus.

Intermission: Comparing Polygons to Up-arrows

We can almost examine the legendary Moser, which is known by Wikipedia[2] and many other pages as Moser's number, but it is also often just called the Moser. But before we can examine it we first need to examine the behavior of hexagons, heptagons, octagons, etc.

I suppose you could denote x in a hexagon as <x> with angle brackets, but starting with heptagons the characters get unwieldy. Therefore we need a new notation:

a in a triangle is a[3]
a in a square is a[4]
a in a pentagon is a[5]

First off, let's compare each polygon to up-arrows:

a[3] = a^a by definition

Now for squares:

a[4] = a[3][3]...[3][3] with a [3]s


a[3] = a^a
a[3][3] = (a^a)^(a^a) > a^a^a
a[3][3][3] > (a^a^a)^(a^a^a) > a^a^a^a


Then we can conclude: a[3][3]......[3][3] with x [3]s ≥ a^^(x+1)

And so:

a[4] ≥ a^^(a+1)

And now upper-bounding a in a square:

a[3] = a^a
a[3][3] = (a^a)^(a^a) = a^(a*a^a) = a^a^(a+1) < a^a^a^a =  a^^4
a[3][3][3] < (a^a^a^a)^(a^a^a^a) = a^(a^a^a*a^a^a^a) = a^a^(a^a+a^a^a) < a^a^a^a^a = a^^5

We can then conclude:

a[4] < a^^(a+2)

And so:

a^^(a+1) ≤ a[4] < a^^(a+2)
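
These bounds can be sanity-checked numerically for the two smallest interesting cases. The sketch below is my own check, not part of the proof: a = 2 fits in exact integers, while for a = 3 the numbers are too large, so it compares logarithms (and logarithms of logarithms) instead.

```python
import math

# a = 2: everything fits in exact integers.
square_2 = (2 ** 2) ** (2 ** 2)                      # 2[4] = 2[3][3] = 256
assert 2 ** 2 ** 2 <= square_2 < 2 ** 2 ** 2 ** 2    # 2^^3 <= 2[4] < 2^^4

# a = 3: compare logarithms instead of the numbers themselves.
t2 = 27 ** 27                                # 3[3][3]
log_square_3 = t2 * math.log10(t2)           # log10 of 3[4] = t2 ** t2
log_tet4 = 3 ** 27 * math.log10(3)           # log10 of 3^^4 = 3^(3^27)
# log10(log10(3^^5)) = log10(3^^4) + log10(log10(3))
loglog_tet5 = log_tet4 + math.log10(math.log10(3))
assert log_tet4 < log_square_3                       # 3^^4 < 3[4]
assert math.log10(log_square_3) < loglog_tet5        # 3[4] < 3^^5
print("bounds hold for a = 2 and a = 3")
```

Note how lopsided the a = 3 case is: 3[4] has about 10^40 digits, which is nowhere near the roughly 10^(3.6*10^12) digits of 3^^5.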

And now for lower-bounding a[5]:

a[4] ≥ a^^(a+1)
a[4][4] > (a^^(a+1))^^(a^^(a+1)+1) > (a^^a)^^(a^^a) > a^^a^^a
a[4][4][4] > (a^^a^^a)^^(a^^a^^a+1) > a^^a^^a^^a

We can conclude then that:

a[5] > a^^^(a+1)

and upper-bounding:

a[4] < a^^(a+2)

a[4][4] < (a^^(a+2))^^(a^^(a+2)+2) < a^^(a+2+a^^(a+2)+2) = a^^(a+4+a^^(a+2)) < a^^(2*a^^(a+2)) < a^^(a^(a^^(a+2)))
= a^^(a^^(a+3)) < a^^a^^a^^a = a^^^4

(The last step uses a+3 < a^^a, which holds for a ≥ 3. For a = 2, a[4][4] is the mega itself, and it is checked against 2^^^4 directly below.)

a[4][4][4] < (a^^^4)^^(a^^^4+2) < a^^(a^^^3+a^^^4+2) < a^^(2*a^^^4) < a^^a^^a^^^4 = a^^a^^^5 = a^^^6

and in general, a[4][4]...[4][4] with x [4]s < a^^^(2x)


a[5] < a^^^(2a)


a^^^(a+1) < a[5] < a^^^(2a)

To confirm that this is true let's try the mega and megiston on these bounds:

Mega = 2[5] and the bounds tell us:

2^^^(2+1) < 2[5] < 2^^^(2*2)

simplifies to:

2^^^3 < 2[5] < 2^^^4

simplifies to:

65,536 < 2[5] < 2^^65,536

To confirm this we know that mega >>> 65,536

2^^65,536 can be lower bounded like so:

2^^65,536 > (2^^3)^^65,533 = 16^^65,533 > 10^^65,533, and 10^^65,533 is much larger than 10^^258, a fairly generous upper bound to the mega.

Therefore, what we've shown above is now confirmed true.

And the megiston:

10^^^(10+1) < 10[5] < 10^^^(2*10)

10^^^11 < 10[5] < 10^^^20

which once again we know is true, since 10^^^20 is larger than 10^^^12 which I proved to be larger than the megiston.

Now let's try for hexagons, first with lower bounding a[6]:

a[5] > a^^^(a+1)
a[5][5] > (a^^^(a+1))^^^(a^^^(a+1)+1) > (a^^^a)^^^(a^^^a) > a^^^a^^^a = a^^^^3
a[5][5][5] > (a^^^^3)^^^(a^^^^3) > a^^^(a^^^^3) = a^^^^4

We can conclude then that a[6] > a^^^^(a+1).

And upper-bounding:

a[5] < a^^^(2a)

a[5][5] < (a^^^(2a))^^^(2*a^^^(2a)+2) < a^^^(2a+2*a^^^(2a)+2) = a^^^(2+2a+2*a^^^(2a)) < a^^^(3*a^^^(2a))
< a^^^(a^^(a^^^(2a+1))) = a^^^(a^^^(2a+2)) < a^^^a^^^a^^^a = a^^^^4

(The last step of this chain uses 2a+2 < a^^^a, which holds for a ≥ 3.)

a[5][5][5] < (a^^^^4)^^^(2*a^^^^4) = (a^^^(a^^^^3))^^^(2*a^^^^4) < a^^^(a^^^^3+2*a^^^^4)
< a^^^(3*a^^^^4) < a^^^(a^^^^5) = a^^^^6

We can then conclude that a[6] < a^^^^(2a).

By replacing all the ^^^s in the two proofs above with ^^^^s, and the ^^^^s with ^^^^^s, we can prove in the same way that a^^^^^(a+1) < a[7] < a^^^^^(2a), and similarly for a[8], a[9], a[10], etc.

That allows us to give all these bounds to compare Steinhaus's polygons to Knuth's arrows:

a[3] = a^a
a^^(a+1) ≤ a[4] < a^^(a+2)
a^^^(a+1) < a[5] < a^^^(2a)
and in general:
a ↑^(k-2) (a+1) < a[k] < a ↑^(k-2) (2a), where ↑^(k-2) stands for k-2 up-arrows (k-2 ^s in a row)

So now we're ready to look at the Moser.

The Moser

The Moser (also called Moser's number[6]) is equal to 2 in a mega-gon, a polygon with mega (2 in a pentagon) sides. With the ASCII notation it can be written 2[2[5]]. It was defined by Leo Moser as a mind-boggling number using his extension to Steinhaus's triangle/square/circle notation. With our system for bounding Moser's notation in terms of up-arrows we can easily bound it as:

2 ↑^(mega-2) 3 < Moser < 2 ↑^(mega-2) 4 (that is, mega-2 up-arrows between the 2 and the 3 or 4)

These bounds work because of what I did above. There's not much else to say about the Moser's size, other than its position among the googolisms. It holds an interesting spot because its number of up-arrows when approximating it in up-arrows is a tetration level number, indicating just the very beginning of something new: iterating the number of up-arrows, that is, feeding a number defined with up-arrows into the number of up-arrows in the number, or feeding THAT kind of number into the number of up-arrows. That kind of iteration is the key to expanding googolisms into a whole new level of hugeness! Therefore the Moser can be seen as sitting at the edge of the landscape of Knuth's arrows and gazing into an endless land of iteration upon iteration.

An example of this kind of iteration is Graham's number. It's defined like so:

G(1) = 3^^^^3
G(2) = 3^^^...(G(1) ^s)...^^^3
G(3) = 3^^^...(G(2) ^s)...^^^3


G(64) is Graham's number. Graham's number is the subject of the next article, but before we get ahead of ourselves let's wrap up what we've covered.


In conclusion, Steinhaus-Moser notation is a simple notation that is interesting because of the numbers it defined, and how just about anyone can understand it. The mega looks innocent but is really mind-blowingly huge. If you come to appreciate how huge the mega is, the megiston itself is a mind-blow because of its use of the full power of the pentagon operator. However the Moser just transcends both, and can be kind of seen as a sneak peek at the mind-boggling magnitude of even larger googolisms!

Up next I will discuss Graham's number, not just its magnitude but also its real history, and the common misconceptions regarding it.


[1A] Mathematical Snapshots, 1983 edition of book by Hugo Steinhaus (Google Books link) where he himself describes his notation, and the mega and megiston. Pages cited: 28-29

[1B] Mathematical Snapshots, 1950 edition by Steinhaus (Google Books link). The link only has a small amount of the information viewable, but it proves that the notation was defined 1950 or earlier.

[2] Wikipedia article on Steinhaus-Moser notation (link)

[3A] Wolfram MathWorld's page describing Steinhaus-Moser notation (link)

[3B] Wolfram MathWorld's page on the megiston, which erroneously names it megistron (link)

[4] Jonathan Bowers' Infinity Scrapers page, which erroneously calls megiston "megaston" (link)

[5] Sbiis Saibian's proof of the Knuth Arrow Theorem (link)

[6] Wikipedia's article on Steinhaus-Moser notation (link)