Announcing Neo-X: An All-New Ultra-Reliable Activation Function for Multi-Modality Models

date: Apr 27, 2023
slug: ultra-reliable-all-new-activation-function
status: Published
tags: Research
summary: Announcing Neo-X: An All-New Ultra-Reliable Activation Function for Multi-Modality Models
type: Post

Neo-X: An All-New Activation Function for Multi-Modality Models

 
In the world of deep learning, activation functions play a crucial role in introducing non-linearity to neural networks, enabling them to learn complex patterns and relationships in the data. Popular activation functions like ReLU, sigmoid, and tanh have been widely used in various deep learning architectures. However, as AI models become more complex and multi-modal, there is a growing need for stronger, more reliable, and more flexible foundational components to handle these challenges.
Today, we are excited to introduce Neo-X, a novel activation function designed to address these needs. Neo-X is based on the Caputo fractional derivative, providing a more flexible and potentially better-performing alternative to traditional activation functions.
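For reference, the Caputo fractional derivative of order α ∈ (0, 1) generalizes the ordinary derivative to non-integer orders:

$$
{}^{C}D^{\alpha} f(x) = \frac{1}{\Gamma(1-\alpha)} \int_{0}^{x} \frac{f'(t)}{(x - t)^{\alpha}} \, dt
$$

where Γ is the gamma function; in the limit α → 1 it recovers the ordinary derivative f′(x).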

The Problem with Popular Activation Functions

Traditional activation functions like ReLU, sigmoid, and tanh have been successful in many applications, but they may not be the most reliable choice for more complex and multi-modal AI models. These functions can suffer from issues like vanishing gradients, dead neurons, and limited expressiveness, which can hinder the learning capabilities of the model.

Neo-X: A New Hope for Multi-Modality AI Models

Neo-X is a custom activation function that combines a base activation function with its fractional derivative, allowing it to capture more complex and non-linear relationships in the data. This unique design makes Neo-X a promising candidate for multi-modality AI models, where traditional activation functions may fall short.
The architecture of Neo-X consists of four main components (a minimal code sketch follows the list):
  1. Base Activation Function: The base activation function can be any standard activation function like ReLU, sigmoid, or tanh. This function is responsible for introducing non-linearity into the model.
  2. Fractional Derivative: The Caputo fractional derivative is a generalization of the traditional derivative to non-integer orders. It allows the activation function to capture more complex relationships in the data by considering the fractional-order derivative of the base activation function.
  3. Adaptive Step Size: An adaptive step size is used to calculate the Caputo fractional derivative. It is based on the input tensor's statistics (mean and standard deviation) to ensure that the step size is appropriate for the input data.
  4. Combination: The base activation function and its fractional derivative are combined (e.g., by addition) to form the final output of the Neo-X function.
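To make these four components concrete, here is a minimal PyTorch sketch. The specific adaptive-step-size formula and the single-term fractional difference standing in for the Caputo derivative are illustrative assumptions, not the reference implementation in neox5.py:

```python
import torch
import torch.nn as nn

class NeoX(nn.Module):
    """Illustrative sketch of Neo-X: a base activation function combined
    with an approximation of its Caputo fractional derivative."""

    def __init__(self, base_activation=torch.tanh, alpha=0.5):
        super().__init__()
        self.base = base_activation  # (1) base activation function
        self.alpha = alpha           # fractional order, 0 < alpha < 1

    def forward(self, x):
        # (3) Adaptive step size derived from the input tensor's mean and
        # standard deviation (this exact formula is an assumption).
        h = (x.std() + 1e-8) / (x.abs().mean() + 1e-8)
        h = h.clamp(min=1e-4)
        # (2) Single-term fractional finite difference as a cheap stand-in
        # for the Caputo fractional derivative of the base activation.
        frac_deriv = (self.base(x + h) - self.base(x)) / h.pow(self.alpha)
        # (4) Combine the base output and its fractional derivative.
        return self.base(x) + frac_deriv
```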

Unlocking the Potential of Neo-X

Neo-X has the potential to revolutionize the way we design and train multi-modality AI models. By experimenting with different base activation functions, tuning the fractional derivative order, optimizing the adaptive step size calculation, and parallelizing the Caputo fractional derivative calculation, we can further enhance the performance of Neo-X and unlock its full potential.
In conclusion, Neo-X is a powerful new activation function that has the potential to address the limitations of popular activation functions and provide a more reliable and fluid foundation for multi-modality AI models. As we continue to push the boundaries of deep learning, Neo-X could play a pivotal role in shaping the future of AI.
 

Roadmap

As we continue to develop and refine Neo-X, our focus will be on the following key optimizations to further enhance its performance and applicability:
  • Polymorphic Base Activation Selector: Select a base activation function dynamically depending on the model’s purpose (see the sketch after this list).
  • Advanced Base Activation Functions: Investigate and experiment with more advanced base activation functions, such as Swish, Mish, and GELU, to improve the expressiveness and learning capabilities of Neo-X in various deep learning architectures.
  • Hyperparameter Optimization: Develop efficient techniques for tuning the fractional derivative order and other hyperparameters of Neo-X, such as Bayesian optimization or genetic algorithms, to find the optimal configuration for different tasks and datasets.
  • Efficient Implementation: Optimize the implementation of Neo-X to take full advantage of GPU acceleration and parallel computing, reducing the computational overhead and making it more suitable for large-scale multi-modality AI models.
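As a taste of the first roadmap item, the polymorphic selector could start as a simple mapping from a model's purpose (or input modality) to a base activation. The categories and choices below are hypothetical:

```python
import torch
import torch.nn.functional as F

# Hypothetical purpose-to-activation mapping for the polymorphic selector.
BASE_ACTIVATIONS = {
    "vision": F.gelu,
    "text": torch.tanh,
    "audio": F.silu,  # SiLU, a.k.a. Swish
}

def select_base_activation(purpose: str):
    """Pick a base activation suited to the model's purpose,
    falling back to ReLU for unknown purposes."""
    return BASE_ACTIVATIONS.get(purpose, torch.relu)

# e.g. NeoX(base_activation=select_base_activation("vision"))
```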
 

Code

neox5.py is the latest and most performant iteration; look back at older iterations to see the progress and changes.
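Until you dig into the repository itself, here is how a module like the Neo-X sketch above might be dropped into a model; the layer sizes are arbitrary:

```python
import torch
import torch.nn as nn

# Hypothetical usage of the NeoX module sketched earlier in this post,
# placed inside a small feed-forward block.
model = nn.Sequential(
    nn.Linear(16, 32),
    NeoX(base_activation=torch.tanh, alpha=0.5),
    nn.Linear(32, 4),
)
y = model(torch.randn(8, 16))  # output shape: (8, 4)
```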
 

Join Agora

Agora is the open-source research organization radically devoted to advancing Humanity with Multi-Modality, and it powers this cutting-edge research!
 
Join us and help liberate Humanity from starvation, food insecurity, disease, and death!
 

© APAC AI 2022 - 2024