As mentioned, the LSTM architecture relies on a series of gates that can independently influence the activation values (a(t-1)) and the memory (c(t-1)) from the previous timestep as information flows through an LSTM unit. At each iteration, the unit transforms these values and outputs the activation (a(t)) and memory (c(t)) vectors for the current timestep. Although the previous activations and memory enter the unit separately, they are allowed to interact with each other in two broad ways. In the following diagram, each gate (denoted with the capital Greek letter gamma, or Γ) represents a sigmoid activation function applied to the product of that gate's own separately initialized weight matrix with the previous activations and the current input:
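To make the gate computation concrete, here is a minimal, dependency-free sketch of a single gate as described above: a sigmoid applied to the product of a weight matrix with the previous activations and the current input. The names `gate`, `W_gate`, `n_a`, and `n_x` are hypothetical, the sizes and random initialization are arbitrary, and the bias term that most implementations add is left out so that the sketch follows the text.

```python
import math
import random


def sigmoid(z):
    """Logistic function, squashing z into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def gate(weights, a_prev, x_t):
    """Gate values: sigmoid of the weight matrix times [a_prev, x_t].

    weights is a list of rows, each of length len(a_prev) + len(x_t).
    """
    stacked = a_prev + x_t  # concatenate previous activations and current input
    return [sigmoid(sum(w * v for w, v in zip(row, stacked))) for row in weights]


# Hypothetical sizes: 4 hidden units, 3 input features.
random.seed(0)
n_a, n_x = 4, 3
a_prev = [random.gauss(0, 1) for _ in range(n_a)]  # stands in for a(t-1)
x_t = [random.gauss(0, 1) for _ in range(n_x)]     # stands in for the current input
W_gate = [[random.gauss(0, 0.1) for _ in range(n_a + n_x)] for _ in range(n_a)]

gamma = gate(W_gate, a_prev, x_t)
print(gamma)  # one value per hidden unit, each strictly between 0 and 1
```

Because every gate value lies between 0 and 1, multiplying it elementwise with another vector acts as a soft switch on how much of that vector passes through. Each gate in the unit would have its own weight matrix, but the computation has the same shape.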