A usable state space model for modeling long sequences
Paper: Efficiently Modeling Long Sequences with Structured State Spaces
Motivation and current problem
• A central problem in sequence modeling is efficiently handling data that contains long-range dependencies (LRDs). Benchmarks typically demand tens of thousands of steps (e.g. 16k), while current models manage at best a few thousand.
• A latent state space model equipped with the special HiPPO matrix has long-range memory in principle, but is computationally infeasible: it costs O(N²L) operations and O(NL) space. Dimensionality-reduction algorithms based on classical linear algebra have been proposed, but they are numerically unstable because A has a large condition number.
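To make the O(N²L) cost concrete, here is a minimal sketch (my own function names; the matrix follows the HiPPO-LegS definition from the paper): materializing the recurrence directly means each of the L steps performs a dense N×N matrix-vector product.

```python
import numpy as np

def hippo_legs(N):
    """HiPPO-LegS matrix as defined in the S4 paper:
    A[n,k] = -sqrt((2n+1)(2k+1)) if n > k, -(n+1) if n == k, 0 if n < k."""
    A = np.zeros((N, N))
    for n in range(N):
        for k in range(N):
            if n > k:
                A[n, k] = -np.sqrt((2 * n + 1) * (2 * k + 1))
            elif n == k:
                A[n, k] = -(n + 1)
    return A

def naive_ssm_scan(Ad, Bd, C, u):
    """Naive discrete recurrence x_k = Ad x_{k-1} + Bd u_k, y_k = C x_k.
    Each step is a dense N x N matvec, so the total cost is O(N^2 L)."""
    N = Ad.shape[0]
    x = np.zeros(N)
    ys = []
    for uk in u:
        x = Ad @ x + Bd * uk   # O(N^2) per step
        ys.append(C @ x)
    return np.array(ys)
```

S4's contribution is precisely avoiding this dense scan by exploiting the structure of A.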
• We would like a general-purpose sequence model: current models each target a particular domain (images, audio, text, time-series) and handle only a narrow range of capabilities (efficient training, fast generation, or handling irregularly sampled data). The reason is that to be efficient, these models rely on domain-specific preprocessing, inductive biases, and architectures.
Contributions
1. S4 resolves the computational bottleneck of previous SSM models, matching efficient Transformers in both speed and memory overhead;
2. S4 is SOTA on LRD tasks; in particular, it is the first model to solve Path-X, a 16k-length task that requires spatial reasoning over images;
3. Beyond LRD tasks, S4 shows the potential to be a general-purpose sequence model:
it supports efficient training, fast generation, and irregularly sampled data (e.g. changing the sampling rate of speech);
without any architectural changes, it handles diverse domains: it surpasses Speech CNNs on speech classification, outperforms the specialized Informer model on time-series forecasting problems, and matches a 2-D ResNet on sequential CIFAR with over 90% accuracy.
Preliminaries
1. SSM Model
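The SSM in the paper is the continuous system x'(t) = A x(t) + B u(t), y(t) = C x(t), which must be discretized with a step size Δ before it can be applied to a sampled sequence; the paper uses the bilinear transform. A minimal sketch (the function name and shapes are my own):

```python
import numpy as np

def discretize_bilinear(A, B, dt):
    """Bilinear (Tustin) transform, as used in S4 to turn the continuous SSM
    x'(t) = A x(t) + B u(t) into the discrete recurrence
    x_k = Abar x_{k-1} + Bbar u_k, with
    Abar = (I - dt/2 A)^(-1) (I + dt/2 A),  Bbar = (I - dt/2 A)^(-1) dt B."""
    I = np.eye(A.shape[0])
    inv = np.linalg.inv(I - (dt / 2) * A)
    Abar = inv @ (I + (dt / 2) * A)
    Bbar = inv @ (dt * B)
    return Abar, Bbar
```

Because the sampling rate enters only through Δ, re-discretizing with a different step size is what lets the model handle resampled or irregularly sampled data without retraining.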