[Weekend Read] KAN — Kolmogorov-Arnold Networks
Key takeaways from the KAN paper
KANs (Kolmogorov-Arnold Networks) may change the way we build neural networks. You can access the preprint of this work here.
They promise better accuracy and interpretability than the currently dominant MLP (Multi-Layer Perceptron) architecture.
While MLPs have fixed activation functions on nodes, KANs have learnable activation functions on edges.
KANs have no linear weights at all — every weight parameter is replaced by a univariate function parameterized as a spline.
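To make the edge-vs-node distinction concrete, here is a minimal, hypothetical sketch of a single KAN-style edge in PyTorch. It uses Gaussian basis functions with learnable coefficients as a simple stand-in for the B-spline parameterization used in the paper; `SplineEdge` and its parameters are illustrative names, not the paper's or pykan's actual implementation.

```python
import torch
import torch.nn as nn

class SplineEdge(nn.Module):
    """A single KAN-style edge: a learnable univariate function phi(x).

    Sketch only: Gaussian bumps with trainable coefficients stand in for
    the B-spline basis described in the paper.
    """

    def __init__(self, num_basis=8, x_min=-1.0, x_max=1.0):
        super().__init__()
        # Fixed basis centers on a uniform grid; only the mixing coefficients are trained.
        self.register_buffer("centers", torch.linspace(x_min, x_max, num_basis))
        self.width = (x_max - x_min) / (num_basis - 1)
        self.coeffs = nn.Parameter(torch.zeros(num_basis))

    def forward(self, x):
        # Evaluate phi(x) pointwise as a weighted sum of Gaussian bumps.
        basis = torch.exp(-(((x.unsqueeze(-1) - self.centers) / self.width) ** 2))
        return basis @ self.coeffs

# Where an MLP edge would compute w * x with a scalar weight w,
# a KAN edge applies the learned function phi(x) instead.
edge = SplineEdge()
print(edge(torch.linspace(-1.0, 1.0, 5)))
```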
Some fascinating aspects of the paper:
- KANs can be distilled into symbolic mathematical formulas
- KANs can be pruned drastically, yielding much smaller networks
- KANs exhibit faster neural scaling laws than MLPs (as shown in the paper's experiments)
pykan provides an easy-to-use library for building KAN models. Give it a try!
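A rough usage sketch, adapted from the examples in the pykan README. The exact names and arguments (e.g. `create_dataset`, `fit` vs. `train`, `auto_symbolic`) have changed across pykan versions, so treat this as an assumption and check the current documentation:

```python
import torch
from kan import KAN, create_dataset

# Toy symbolic-regression target: f(x1, x2) = exp(sin(pi*x1) + x2^2).
f = lambda x: torch.exp(torch.sin(torch.pi * x[:, [0]]) + x[:, [1]] ** 2)
dataset = create_dataset(f, n_var=2)

# A small KAN: 2 inputs -> 5 hidden nodes -> 1 output,
# with cubic splines (k=3) on a coarse grid.
model = KAN(width=[2, 5, 1], grid=5, k=3)
model.fit(dataset, opt="LBFGS", steps=50)

# Prune redundant edges, then try to recover a symbolic formula.
model = model.prune()
model.auto_symbolic()
print(model.symbolic_formula())
```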
Given that current AI models are running into data and compute limits, KANs may spur new innovation by achieving comparable results with much smaller networks.
KANs are designed for scientific tasks; it remains to be seen how well they apply to general tasks like pattern recognition or language modeling.