![](https://crypto4nerd.com/wp-content/uploads/2023/10/1l4dn6unPF5z5-C3WLrjRwg-1024x432.png)
To make the explanation above intuitive, I will illustrate it with a concrete example. The code is a bit verbose, but I include it in full so it is easy to follow what we are about to do.
Note that the purpose of this experiment is to demonstrate the representational capability of cyclic encoding. The data are therefore not split into training and validation sets, and no noise is added to the target variable. Likewise, no regularization or early stopping is applied, even though that carries a risk of overfitting.
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.optimizers import Adam
import matplotlib.pyplot as plt

# NeuralNetwork class: the hidden layers and their node counts are passed as a list argument
class NeuralNetwork:
    def __init__(self, input_dim, hidden_units):
        self.input_dim = input_dim
        self.hidden_units = hidden_units
        self._build_model()

    def _build_model(self):
        input_layer = Input(shape=(self.input_dim,))
        x = input_layer
        for units in self.hidden_units:
            x = Dense(units, activation='relu')(x)
        output_layer = Dense(1)(x)
        self.model = Model(inputs=input_layer, outputs=output_layer)
        self.model.compile(loss='mean_squared_error', optimizer=Adam())

    def train(self, input_data, output_data, epochs=1000, batch_size=32, verbose=1):
        self.model.fit(input_data, output_data, epochs=epochs, batch_size=batch_size, verbose=verbose)

    def predict(self, input_data):
        return self.model.predict(input_data)

# The only input features are the cyclic encodings sin(x) and cos(x)
def create_input_data(x):
    sin_x = np.sin(x)
    cos_x = np.cos(x)
    return np.column_stack((sin_x, cos_x))

# Target variable we want the model to learn (shown later in the graphs)
def create_target_data(x, func):
    if func == 'sin(x)':
        return np.sin(x)
    elif func == 'sin(2x)':
        return np.sin(2*x)
    elif func == 'sin(5x)':
        return np.sin(5*x)
    elif func == 'sawtooth':
        return x % (2 * np.pi)
    elif func == 'step':
        return np.floor(x % (2 * np.pi))
    elif func == 'jigzag_step':
        return ((np.floor(x % (2 * np.pi))) * 2) % 5
    elif func == 'x':
        return x
    else:
        raise ValueError("Invalid function")

# Plot true values and predictions
def plot_results(test_x, true_values, predictions, func_name, hidden_units):
    plt.figure(figsize=(16, 6))
    plt.scatter(test_x, true_values, label='True Values', color='salmon')
    plt.plot(test_x, predictions, label='Predictions', linestyle='dashed', color='darkcyan')
    plt.xlabel('x')
    plt.ylabel(func_name)
    plt.title(f'Function: {func_name}, Hidden Units: {hidden_units}')
    plt.legend()
    plt.xticks([-2*np.pi, -1.5*np.pi, -np.pi, -0.5*np.pi, 0, 0.5*np.pi, np.pi, 1.5*np.pi, 2*np.pi],
               ['-2π', '-1.5π', '-π', '-0.5π', '0', '0.5π', 'π', '1.5π', '2π'])
    plt.show()

# Train the neural network and plot the result
def train_and_plot(func_name, hidden_units, epochs):
    # Data generation
    x = np.linspace(-2*np.pi, 2*np.pi, 100)
    # Instantiate the neural network
    input_dim = 2
    nn = NeuralNetwork(input_dim, hidden_units)
    # Choose the target variable
    output_data = create_target_data(x, func_name)
    # Training
    input_data = create_input_data(x)
    nn.train(input_data, output_data, epochs=epochs, verbose=0)
    # Predictions on the test data
    test_x = np.linspace(-2*np.pi, 2*np.pi, 100)
    test_input_data = create_input_data(test_x)
    predictions = nn.predict(test_input_data)
    # Plot the results
    plot_results(test_x, output_data, predictions, func_name, hidden_units)
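One practical note: Keras initializes weights randomly, so the plots below will vary slightly from run to run. If you want reproducible results, a minimal sketch is to fix the global seeds before building the model:

import numpy as np
import tensorflow as tf

np.random.seed(0)      # fix NumPy's global random seed
tf.random.set_seed(0)  # fix TensorFlow's global random seed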
Unit sine wave
Now we are ready for the experiments.
First, we will check the simplest case, where the target variable is a unit sine wave. Since the explanatory variables are the unit sine and cosine themselves, the model should be able to fit it easily, even without much of a neural network.
Here we assume one hidden layer with four nodes.
func_name = 'sin(x)'
hidden_units = [4]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
And here is the result.
Not surprisingly, the model fits the unit sine wave well.
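Indeed, the network hardly has to do anything here: the target sin(x) is literally the first input feature, so even a plain least-squares fit on the two features recovers it exactly. A quick sketch of that check (not part of the experiment above):

import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)
features = np.column_stack((np.sin(x), np.cos(x)))
# The least-squares weights should come out as approximately [1, 0]
weights, *_ = np.linalg.lstsq(features, np.sin(x), rcond=None)
print(weights)  # -> [1. 0.] up to floating-point error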
Sin(2x)
Next, let's see whether the model can learn sin(2x), i.e. with the frequency doubled. This time the hidden layer has 16 nodes.
func_name = 'sin(2x)'
hidden_units = [16]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
And here is the result.
It worked: the model learns sin(2x) as well. In other words, by repeatedly taking affine combinations of sin(x) and cos(x) and applying ReLU, sin(2x) can be reproduced.
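There is also a closed-form reason this is learnable: sin(2x) = 2·sin(x)·cos(x), so the target is a continuous function of the two inputs, which is exactly the setting the universal approximation theorem covers. A quick numerical check of the identity:

import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)
# Double-angle identity: sin(2x) = 2*sin(x)*cos(x)
assert np.allclose(np.sin(2*x), 2*np.sin(x)*np.cos(x))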
Sin(5x)
Just to be sure, let’s also look at sin(5x).
func_name = 'sin(5x)'
hidden_units = [16]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
And here is the result.
Oh, what a disappointing result…
Actually, this is because the assumption of the universal approximation theorem, “neural network with a sufficiently large hidden layer,” was not satisfied.
To make the network sufficiently large, we can in theory either add layers or add nodes per layer; here, adding depth is the more efficient route, so we retry with two hidden layers of 16 nodes each.
func_name = 'sin(5x)'
hidden_units = [16, 16]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
Here is the result this time.
Even high frequencies such as sin(5x) could be generated from sin(x) and cos(x).
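There is again an identity behind this: sin(5x) = 16·sin⁵(x) − 20·sin³(x) + 5·sin(x), so the target is a polynomial in sin(x) alone and hence a continuous function of the inputs. Verifying numerically:

import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)
s = np.sin(x)
# Quintuple-angle identity: sin(5x) = 16*sin^5(x) - 20*sin^3(x) + 5*sin(x)
assert np.allclose(np.sin(5*x), 16*s**5 - 20*s**3 + 5*s)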
Sawtooth wave
Now what if it is a sawtooth wave instead of a sine or cosine wave?
func_name = 'sawtooth'
hidden_units = [32, 32]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
This is the result.
The approximation worsens near the discontinuity. In fact, the same degradation at discontinuities is observed in Fourier decomposition, where it is known as the Gibbs phenomenon. In the Gibbs phenomenon, however, the approximation overshoots outward at the discontinuities, and this result differs on that point.
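To see the contrast, here is a minimal sketch (separate from the experiment above) of a truncated Fourier series of the same sawtooth; the partial sum visibly overshoots at the discontinuities:

import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(-2*np.pi, 2*np.pi, 2000)
# Fourier series of x mod 2*pi: pi - 2 * sum_{k=1..N} sin(k*x) / k
N = 25
partial_sum = np.pi - 2 * sum(np.sin(k*x) / k for k in range(1, N + 1))
plt.plot(x, x % (2*np.pi), color='salmon', label='sawtooth')
plt.plot(x, partial_sum, color='darkcyan', label=f'Fourier partial sum (N={N})')
plt.legend()
plt.show()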
Staircase wave
What would a staircase function with many discontinuities look like?
func_name = 'step'
hidden_units = [64, 64, 64]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
Here is the outcome.
Although the approximation is not very good in some places, the true values are generally reproduced.
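For reference, the 'step' target np.floor(x % (2*np.pi)) is a staircase with the seven integer levels 0 through 6 in each period:

import numpy as np

x = np.linspace(-2*np.pi, 2*np.pi, 100)
print(np.unique(np.floor(x % (2 * np.pi))))  # -> [0. 1. 2. 3. 4. 5. 6.]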
Zigzag Staircase
Finally, what if it were a more complex staircase shape?
func_name = 'jigzag_step'
hidden_units = [64, 64, 64]
epochs = 3000
train_and_plot(func_name, hidden_units, epochs)
Here is the result.
The model still reproduces the true values, even for this more complex discontinuous function.
Just to be sure: a non-periodic function
What we have checked so far is: "Can a neural network with sin(x) and cos(x) as inputs approximate an arbitrary function with period 2π?" Conversely, a function that is not periodic cannot be approximated by a neural network whose only inputs are sin(x) and cos(x).
func_name = 'x'
hidden_units = [64]
epochs = 1000
train_and_plot(func_name, hidden_units, epochs)
It looks like this.
Sure enough, it does not work at all.
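This is expected: the encoded features at x and at x + 2π are identical, so any function the network computes from them must itself repeat with period 2π, and y = x does not. A quick check using create_input_data from above:

import numpy as np

x = np.linspace(-2*np.pi, 0, 50)
# The cyclic features cannot distinguish x from x + 2*pi
assert np.allclose(create_input_data(x), create_input_data(x + 2*np.pi))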
As we have seen, two inputs, sin(x) and cos(x), together with a neural network of appropriate complexity are enough to approximate arbitrary periodic functions. This suggests that when the target variable is believed to be periodic, adding sin(x) and cos(x) to the model's features lets it reproduce periodic patterns of arbitrary shape. If the target variable is thought to have daily, weekly, and yearly periodicity, we can add one sin/cos pair for the time of day, another for the day of the week, and another for the day of the year.
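As a concrete illustration of that last point, here is a hypothetical sketch of adding such features to a pandas time series (the column names, periods, and date range are my own choices for illustration):

import numpy as np
import pandas as pd

# Hourly timestamps over two weeks (hypothetical data)
ts = pd.date_range('2023-01-01', periods=24 * 14, freq='h')
df = pd.DataFrame({'timestamp': ts})

# One sin/cos pair per suspected period: time of day, day of week, day of year
for name, value, period in [('hour', ts.hour, 24),
                            ('dayofweek', ts.dayofweek, 7),
                            ('dayofyear', ts.dayofyear, 365.25)]:
    angle = 2 * np.pi * value / period
    df[name + '_sin'] = np.sin(angle)
    df[name + '_cos'] = np.cos(angle)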