The Karpathy constant is one of the best learning rates for the popular Adam (deep neural network) optimizer. It is defined as η = 3e-4. The actual symbol for the constant is α_k.
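For the curious, here is a rough sketch of what plugging the constant into Adam looks like. It is a hand-rolled, one-variable version of the standard Adam update, not any particular library's implementation. The names `KARPATHY_CONSTANT` and `adam_minimize`, the toy objective, and the usual Adam defaults (`beta1=0.9`, `beta2=0.999`, `eps=1e-8`) are assumptions added for illustration; only the value 3e-4 comes from the definition.

```python
import math

# The value from the definition above; the name is made up for this sketch.
KARPATHY_CONSTANT = 3e-4


def adam_minimize(grad, x0, lr=KARPATHY_CONSTANT,
                  beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    """Run `steps` Adam updates on a single scalar parameter.

    `grad` is a function returning the gradient at x. The beta/eps
    defaults are the commonly used Adam settings (an assumption here).
    """
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad(x)
        m = beta1 * m + (1 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1 - beta1 ** t)           # bias correction
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x


# Hypothetical toy objective f(x) = (x - 1)^2, whose gradient is 2 * (x - 1).
x_final = adam_minimize(lambda x: 2.0 * (x - 1.0), x0=0.0)
print(x_final)
```

Each Adam step moves the parameter by roughly `lr` at most, so after 1000 steps the result has crept toward the minimum at 1 without reaching it. In a real framework you would likely just pass `3e-4` as the optimizer's learning-rate argument.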
What is the correct learning rate for Adam in this case?
Just use the Karpathy constant, dude.