You have already seen one variation of the LSTM, namely the GRU, and we discussed at length how the two architectures differ. Other noteworthy variations exist as well. One of them is the LSTM with peephole connections. These connections let information flow from the cell state back to the gates (forget, update, and output). In effect, the gates get to peek at the memory values from the previous timestep while they compute their values for the current timestep.
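To make the idea concrete, here is a minimal NumPy sketch of a single peephole LSTM step. It is an illustration under stated assumptions, not a reference implementation: the function names (`peephole_lstm_step`, `init_params`), the parameter layout, and the random initialization are all hypothetical. It also assumes one common formulation in which the peephole weights are applied elementwise, the forget and update gates peek at the previous cell state, and the output gate peeks at the freshly updated cell state.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def init_params(input_size, hidden_size, seed=0):
    """Hypothetical helper: small random weights for one peephole LSTM cell."""
    rng = np.random.default_rng(seed)
    gates = ("f", "i", "g", "o")  # forget, update (input), candidate, output
    return {
        # Input-to-hidden and hidden-to-hidden weight matrices.
        "W": {k: rng.normal(scale=0.1, size=(hidden_size, input_size)) for k in gates},
        "U": {k: rng.normal(scale=0.1, size=(hidden_size, hidden_size)) for k in gates},
        # Peephole weights: one vector per gate, applied elementwise (assumption).
        "p": {k: rng.normal(scale=0.1, size=hidden_size) for k in ("f", "i", "o")},
        "b": {k: np.zeros(hidden_size) for k in gates},
    }


def peephole_lstm_step(x_t, h_prev, c_prev, params):
    """One timestep of an LSTM whose gates also see the cell state."""
    W, U, p, b = params["W"], params["U"], params["p"], params["b"]

    # Forget and update gates peek at the previous cell state c_prev.
    f = sigmoid(W["f"] @ x_t + U["f"] @ h_prev + p["f"] * c_prev + b["f"])
    i = sigmoid(W["i"] @ x_t + U["i"] @ h_prev + p["i"] * c_prev + b["i"])

    # Candidate memory has no peephole term.
    g = np.tanh(W["g"] @ x_t + U["g"] @ h_prev + b["g"])
    c_t = f * c_prev + i * g

    # Output gate peeks at the updated cell state c_t (assumed formulation).
    o = sigmoid(W["o"] @ x_t + U["o"] @ h_prev + p["o"] * c_t + b["o"])
    h_t = o * np.tanh(c_t)
    return h_t, c_t


# Usage: run the cell over a short random sequence.
params = init_params(input_size=3, hidden_size=4)
h, c = np.zeros(4), np.zeros(4)
for x_t in np.random.default_rng(1).normal(size=(5, 3)):
    h, c = peephole_lstm_step(x_t, h, c, params)
```

Setting every vector in `params["p"]` to zero removes the peephole terms and reduces the step to a standard LSTM, which makes it easy to compare the two variants on the same inputs.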